It has been a long time since I really had to think about how monads are implemented so I thought I should take a refresher course, in probably reinventing the wheel by implementing from scratch a sequential logging monad I need for a project. The point of it is that client code can append to it, but not do anything else with it if I hide the Log and LogT constructors. What I ended up coming up with is,
class Logger l e | l -> e where writeEntry :: e -> l ()
newtype Log e a = Log (Seq e -> (a, Seq e))
newtype Monad m => LogT e m a = LogT (Seq e -> m (a, Seq e))
runLog :: Log e a -> (a, [e])
runLog (Log f) = let (x, l) = f empty in (x, toList l)
runLogT :: Monad m => LogT e m a -> m (a, [e])
runLogT (LogT f) = do (x, l) <- f empty return $ (x, toList l)
instance Logger (Log e) e where writeEntry e = Log (\l -> ((), l |> e))
instance Monad m => Logger (LogT e m) e where writeEntry e = LogT (\l -> return ((), l |> e))
instance Monad (Log e) where Log f >>= g' = let fg l = let (fl, l') = f l Log g = g' fl in g l' in Log fg return x = Log ((,) x)
instance Monad m => Monad (LogT e m) where LogT f >>= g' = let fg l = do (fl, l') <- f l let LogT g = g' fl g l' in LogT fg return x = LogT (return . (,) x)
instance MonadTrans (LogT e) where lift x' = let f l = do x <- x' return (x, l) in LogT f
liftLog :: Monad m => Log e a -> LogT e m a
liftLog (Log f) = let fm l = let (fl, l') = f l in return (fl, l') in LogT fm
I wonder, in what ways have I not done this well, or violated typical conventions? I managed to get this code working, but, especially with my use of flexible instances of Logger, I wonder if I introduced unnecessary weirdness.
Presumably I could have instead somehow wrapped the Writer monad to do much of this work for me while preventing users from fiddling the log. I also wondered if it would make sense as an instance of MonadWriter.
Mark T. B. Carroll <Mark.Carr...@aetion.com> wrote:
> It has been a long time since I really had to think about how monads are > implemented so I thought I should take a refresher course, in probably > reinventing the wheel by implementing from scratch a sequential logging > monad I need for a project. The point of it is that client code can > append to it, but not do anything else with it if I hide the Log and > LogT constructors.
I'm not a particular fan of "encapsulation by force". Yes, it's desirable to cleanly seperate the distinct rules of client code (this client shouldn't mess with the internals, that client is allowed to do so), but usually it's enough to clearly mark "abstract" and "internal" operation. Enforcement beyond that principle leads to problems, in my experience.
> Presumably I could have instead somehow wrapped the Writer monad to do > much of this work for me while preventing users from fiddling the log. > I also wondered if it would make sense as an instance of MonadWriter.
That would have been my first thought: If you want Data.Sequence, something like
type WLog e = Writer (Seq e)
logEntry :: e -> WLog e () logEntry e = tell $ singleton $ e
Inside the Monad, the client really can't do a lot besides using tell/listen/pass, and if you restrict that to logEntry, even less.
Yes, he could deconstruct the type if he really wants to, but I suppose one could fix that with an additional newtype somewhere and hiding the constructors, if it's really necessary.
(And if you need to give this to, say, students, don't assume they won't find a way around it just because your hiding it...)
> class Logger l e | l -> e where > writeEntry :: e -> l ()
Any particular reason to overload writeEntry? Do you want to write generic code that can use multiple, different loggers, and will actually be instantianted for different ones in the same program?
> newtype Log e a = Log (Seq e -> (a, Seq e))
Why do you implement this as a function? To absolutely make sure the client can't access it?
And especially with a function (and also with the Writer monad), you should be very careful about strictness/lazyness, otherwise your logging might very well happening by constructing nested thunks instead of the actual log.
> I wonder, in what ways have I not done this well, or violated typical > conventions? I managed to get this code working, but, especially with > my use of flexible instances of Logger, I wonder if I introduced > unnecessary weirdness.
I think that's basically because of the overloading of writeEntry.
Dirk Thierbach <dthierb...@usenet.arcornews.de> writes: > Mark T. B. Carroll <Mark.Carr...@aetion.com> wrote: >> It has been a long time since I really had to think about how monads are >> implemented so I thought I should take a refresher course, in probably >> reinventing the wheel by implementing from scratch a sequential logging >> monad I need for a project. The point of it is that client code can >> append to it, but not do anything else with it if I hide the Log and >> LogT constructors.
> I'm not a particular fan of "encapsulation by force". Yes, it's > desirable to cleanly seperate the distinct rules of client code (this > client shouldn't mess with the internals, that client is allowed to do > so), but usually it's enough to clearly mark "abstract" and "internal" > operation. Enforcement beyond that principle leads to problems, in my > experience.
I like it partly for the same reason I like the way monads work in Haskell in general - for instance, knowing that I don't use unsafePerformIO, when I see that the code isn't in the IO monad I can know it doesn't do any IO actions. Given that I hardly ever run into bugs in the main libraries, I find that kind of guarantee very helpful so that I can just not think about the hypothesis and move on. In particular, in this case I used the monad to try to check for programmer (me) error, in that the way a piece of code was working was evidence that it was doing something it shouldn't. Indeed, it wasn't, but a look at my own pre-logging code still left me not quite as confident as I needed to be in this case. (The crucial part was the order in which things happened; parts of it in a simulation seemed perhaps to have information they shouldn't yet have.)
Shouldn't "encapsulation by force" be easy, though? Is the experience you have in mind particular to Haskell-like languages? I've loved it in languages like Modula-3, it seemed just a simple thing to want that I figured any awkwardness would be due to deficiencies I had as a working Haskell programmer rather than anything deeper, which is why I was determined to try it. Perhaps my sense of this is in error. Certainly, other things, like how instance definitions seem to have a universality to them, don't quite fit with Modula-3-like information-hiding intuitions.
>> Presumably I could have instead somehow wrapped the Writer monad to do >> much of this work for me while preventing users from fiddling the log. >> I also wondered if it would make sense as an instance of MonadWriter.
> That would have been my first thought: If you want Data.Sequence, > something like
> type WLog e = Writer (Seq e)
> logEntry :: e -> WLog e () > logEntry e = tell $ singleton $ e
> Inside the Monad, the client really can't do a lot besides using > tell/listen/pass, and if you restrict that to logEntry, even less.
Yes, it was exactly `pass' that I was avoiding. What would be nice is being able to arrange a compile- or run-time check that pass isn't being called. Trying to grep my source code felt ... low assurance! Especially as, given that Haskell's strong capacity for mixing and matching abstractions often leaves things unrecognizable to me, I wanted to be sure that it couldn't be called without even naming it.
I hadn't actually remembered that Writer works like that. (After all, Reader doesn't feed a sequence, element by element, does it?) There's a hint in the "Monoid w =>" in its signatures, but the Haddock documentation is ... not very revealing, instead being the usual reference to an academic paper that inspired the idea (and half the time I follow those links, there seems to have been a bit of a rethink and notation change along the way ): ). Clearly I need to do my standard libraries refresher more often.
> Yes, he could deconstruct the type if he really wants to, but I suppose > one could fix that with an additional newtype somewhere and hiding the > constructors, if it's really necessary.
Only at the expense of suspecting that if I try that, somehow you will sense that from far away and roll your eyes at me. (-:
> (And if you need to give this to, say, students, don't assume they > won't find a way around it just because your hiding it...)
True. Mostly I was going for being certain that I couldn't unwittingly work around it.
>> class Logger l e | l -> e where >> writeEntry :: e -> l ()
> Any particular reason to overload writeEntry? Do you want to write > generic code that can use multiple, different loggers, and will > actually be instantianted for different ones in the same program?
I wanted to be able to, at least. It was partly through this being an educational exercise: whatever I thought Haskell should easily let me do, I wanted to try to make it do that. This same codebase also has a "deriving instance" that perhaps arises from the same impulses.
>> newtype Log e a = Log (Seq e -> (a, Seq e))
> Why do you implement this as a function? To absolutely make sure > the client can't access it?
That was an important constraint, but perhaps mostly because I didn't notice an alternative, I was just happy it worked at all. My monads tend to be state-like ones and I have stuck in my head the whole pass-along-the-state idea. You're thinking that writeState should just pop the singleton in the tuple and then (>>=) should concatenate them?
> And especially with a function (and also with the Writer monad), > you should be very careful about strictness/lazyness, otherwise > your logging might very well happening by constructing nested thunks > instead of the actual log.
I do have problems with accidentally leaking space, and indeed this kind of thing tends to be the culprit. I have a similar problem now with other code and I am looking very suspiciously at part of it -
noteEvent newEvent actor = case history actor of Nothing -> actor Just oldEvents -> actor { history = Just (oldEvents Seq.|> newEvent) }
noteEvents newEvents actor = case history actor of Nothing -> actor Just oldEvents -> actor { history = Just (oldEvents Seq.>< Seq.fromList newEvents) }
Most of main laziness problems tend to be standard library ones, though. For instance, the -With functions of Map. I've ended up using my own tweaked versions of the source before to get the strictness I wanted. Laziness creeps in where I don't want it. Even in basic things like foldM I find myself coding, say, a variant that forces the accumulator argument each step.
>> I wonder, in what ways have I not done this well, or violated typical >> conventions? I managed to get this code working, but, especially with >> my use of flexible instances of Logger, I wonder if I introduced >> unnecessary weirdness.
> I think that's basically because of the overloading of writeEntry.
Ah, okay, it may be a necessary price of the genericity then. I do find myself using flexible contexts a bit too. Or, at least, I thought I did, but I can't presently find much local code that uses either.
I suppose I have an innate habit of, "overengineer now, make debugging easier later, and trust the computer more than myself".
Mark T. B. Carroll <Mark.Carr...@aetion.com> wrote:
> Shouldn't "encapsulation by force" be easy, though? Is the experience > you have in mind particular to Haskell-like languages?
It's from OOP, but I think it generalizes. I've often enough run into a situation where I wanted to access something from a library a bit off the mainline API, and couldn't, which led to unnecessarily complicated work-arounds or duplicated code.
As I said, clearly marking "normal" and "internal" operations is good practice -- makes it easier to keep invariants, for example. Enforcing them is bad, IMHO, because there may be situations when the client wants more, and the programmer actually knows what he's doing.
> Yes, it was exactly `pass' that I was avoiding. What would be nice is > being able to arrange a compile- or run-time check that pass isn't being > called.
Then make your own version of MonadWriter (compile-time), or make an instance that assigns "error" to "pass" (run-time). The price you pay is that you duplicate library code.
> There's a hint in the "Monoid w =>" in its signatures,
Yep, the Monoid part is important :-)
>> Yes, he could deconstruct the type if he really wants to, but I suppose >> one could fix that with an additional newtype somewhere and hiding the >> constructors, if it's really necessary. > Only at the expense of suspecting that if I try that, somehow you will > sense that from far away and roll your eyes at me. (-:
Don't worry, I won't telepathically sabotage your code :-)
>> (And if you need to give this to, say, students, don't assume they >> won't find a way around it just because your hiding it...) > True. Mostly I was going for being certain that I couldn't unwittingly > work around it.
I'm usally fine with just making a comment in the module ("only use safe methods here" or something similar). If you just want to keep yourself from shooting your own foot, I suppose you could also redefine "pass" module-locally. Though you might shoot yourself in the other foot with that, if you try to use "pass" elsewhere...
>>> class Logger l e | l -> e where >>> writeEntry :: e -> l ()
>> Any particular reason to overload writeEntry? Do you want to write >> generic code that can use multiple, different loggers, and will >> actually be instantianted for different ones in the same program?
> I wanted to be able to, at least.
My rule of thumb for that: Wait until you actually have two different instances (or more), then design the type class. It's much easier to generalize if you can see concrete differences and similarities.
I suppose you could also try type-constructors, something like
class Logger m where writeEntry :: e -> m e ()
That gets rid of the extensions.
> It was partly through this being an educational exercise: whatever I > thought Haskell should easily let me do, I wanted to try to make it > do that.
Type classes are not always easy :-)
> My monads tend to be state-like ones and I have stuck in my head the > whole pass-along-the-state idea. You're thinking that writeState > should just pop the singleton in the tuple and then (>>=) should > concatenate them?
If it helps, you can think of Reader and Writer as "one half" of the state monad: Writer is the output tuple (s,a) and Reader is the input function s -> a. Strictly speaking, this is of course wrong, but for getting intuition, it's not so far off.
> I do have problems with accidentally leaking space, and indeed this kind > of thing tends to be the culprit. I have a similar problem now with > other code and I am looking very suspiciously at part of it -
> noteEvent newEvent actor = > case history actor of > Nothing -> actor > Just oldEvents -> actor { history = Just (oldEvents Seq.|> newEvent) }
> noteEvents newEvents actor = > case history actor of > Nothing -> actor > Just oldEvents -> actor { history = Just (oldEvents Seq.>< Seq.fromList newEvents) }
I suppose a `seq` wouldn't hurt. Also, why don't you just use the empty sequence instead of Nothing? If the history is empty, that's as good as nothing :-)
Dirk Thierbach <dthierb...@usenet.arcornews.de> writes: > I'm not a particular fan of "encapsulation by force". Yes, it's > desirable to cleanly seperate the distinct rules of client code (this > client shouldn't mess with the internals, that client is allowed to do > so), but usually it's enough to clearly mark "abstract" and "internal" > operation. Enforcement beyond that principle leads to problems, in my > experience.
Isn't that enforcement the precise point of a type system? "A tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute" (Pierce, TaPL). If some type of route-around is desired then it should be included in the interface.
Dirk Thierbach <dthierb...@usenet.arcornews.de> writes: > Mark T. B. Carroll <Mark.Carr...@aetion.com> wrote: >> Shouldn't "encapsulation by force" be easy, though? Is the experience >> you have in mind particular to Haskell-like languages?
> It's from OOP, but I think it generalizes. I've often enough run into > a situation where I wanted to access something from a library a bit > off the mainline API, and couldn't, which led to unnecessarily > complicated work-arounds or duplicated code.
Ah, yes. I guess here we have a question of discouraging versus being sure they don't; on this occasion, I needed much assurance, enough that I didn't trust myself, but that's admittedly not the usual situation.
(snip)
> My rule of thumb for that: Wait until you actually have two different > instances (or more), then design the type class. It's much easier to > generalize if you can see concrete differences and similarities.
Certainly there are times when the obvious generalization to me is not the obvious one to someone else. This especially happens when I thread general monads through things and have choices as to quite which bits are in the monad.
> I suppose you could also try type-constructors, something like
> class Logger m where > writeEntry :: e -> m e ()
> That gets rid of the extensions.
Hmmm, thank you, I shall think about that.
>> noteEvent newEvent actor = >> case history actor of >> Nothing -> actor >> Just oldEvents -> actor { history = Just (oldEvents Seq.|> newEvent) }
>> noteEvents newEvents actor = >> case history actor of >> Nothing -> actor >> Just oldEvents -> actor { history = Just (oldEvents Seq.>< Seq.fromList newEvents) }
> I suppose a `seq` wouldn't hurt.
Mmmm. I may need to force down into the structure of the events too so they don't cause retention of big old things they mention uncomputed little parts of. I suppose I should look into the semantics of this record field assignment stuff in general actually.
> Also, why don't you just use the empty sequence instead of Nothing? If > the history is empty, that's as good as nothing :-)
`Nothing' means don't bother noting events. When one wants to start, one puts a Just Seq.empty there.
Paul Rubin <http://phr...@NOSPAM.invalid> writes: > Isn't that enforcement the precise point of a type system? "A > tractable syntactic method for proving the absence of certain program > behaviors by classifying phrases according to the kinds of values they > compute" (Pierce, TaPL). If some type of route-around is desired > then it should be included in the interface.
I thus audaciously put thoughts into Dirk's head: Paradoxically, perhaps what is wanted is lack of straitjacket _and_ lack of restriction. That is, let the user of the module/object/whatever grobble in its internals as much as they like, but only insofar as they somehow reflect that particular taint in the types of their functions that do so. I don't know if any languages are good at offering the capability for client code to bypass information hiding and suchlike while still offering strong guarantees that it's not happening when one doesn't want it to. Something that allows blocks of code to be marked as violating specific checked invariants, reminiscent of atomic locks, might be a good start.
Mark T. B. Carroll <Mark.Carr...@aetion.com> wrote:
> Paradoxically, perhaps what is wanted is lack of straitjacket _and_ > lack of restriction. That is, let the user of the > module/object/whatever grobble in its internals as much as they > like, but only insofar as they somehow reflect that particular taint > in the types of their functions that do so.
Yes, exactly. My idea of implementing this would be some kind of "module views": Each module can provide a number of different "export lists". For sake of an example, say the module M has three export lists called "public", "protected", and "private", then one could do something like
import M#public
to get the normal functionality, and stuff like
import M#public import qualified M#private as MPrivate
to mess around with the internals. Which is obvious in the code because you'd have to prefix it with MPrivate every time, so you cannot shoot yourself into the foot without being explicit about it. Which is the whole point.
But I don't know any language that actually does that.
Paul Rubin <http://phr...@nospam.invalid> wrote: > Isn't that enforcement the precise point of a type system? "A > tractable syntactic method for proving the absence of certain program > behaviors by classifying phrases according to the kinds of values they > compute" (Pierce, TaPL). If some type of route-around is desired > then it should be included in the interface.
Exactly. So what I'm complaining about is library developers who don't can imagine that a route-around is desired (to extend the functionality), and hence don't include it in the interface. Which is quite excuseable, of course one cannot imagine all the ways this code could possibly be used in the future. And that's the reason why it's a bad idea to *enforce* encapsulation, no matter what, without a route-around.
If the client says "I just want to use the functions are safe and guarantee the absence of some behaviour", fine. He should be able to do that, and the compiler should support him in doing so. If the client says OTOH "In this particular case, I want to use one of the unsafe functions, and I will provide the guarantee of absence myself", he should also be able to do that. If he can't, the penalty is more work, up to duplication of a substantial part of the library. And that's undesireable.
> I like it partly for the same reason I like the way monads work in > Haskell in general - for instance, knowing that I don't use > unsafePerformIO, when I see that the code isn't in the IO monad I can > know it doesn't do any IO actions.
But this is not true. Any pattern match can cause IO (well, "I") in Haskell.
Paul Rubin <http://phr...@NOSPAM.invalid> writes: > Florian Weimer <f...@deneb.enyo.de> writes: >> But this is not true. Any pattern match can cause IO (well, "I") in >> Haskell.
> Do you mean because of lazy i/o? That really seems like a wart in the > language.
hGetContents is certainly a horror, especially as it's not clear from the signature that there be dragons. At least unsafePerformIO has a hint in its name. I recently found myself writing a readFile' that seq's the length of the contents before returning them.
I noticed a "safe lazy IO" thing in Hackage but I haven't yet got around to investigating it so I don't know if it addresses these concerns. I often don't need general lazy IO, I just want to process the incoming data in steps, forcing the result of the previous step before slurping data for the next. I suppose I should write myself general map and fold functions for doing that safely with strict IO.
Of course, if one's using standard convenience functions to make things easy, then if one just puts the information into some datatype and does a liftM read $ readFile or suchlike, it's my experience that `read' likes to suck the whole thing in anyway, gobbling a lot of memory as it does so, before returning. (So don't do that then. (-:)
> Paul Rubin <http://phr...@NOSPAM.invalid> writes:
>> Florian Weimer <f...@deneb.enyo.de> writes: >>> But this is not true. Any pattern match can cause IO (well, "I") in >>> Haskell.
>> Do you mean because of lazy i/o?
Exactly.
> I noticed a "safe lazy IO" thing in Hackage but I haven't yet got around > to investigating it so I don't know if it addresses these concerns.
It still potentially performs imperative reads on pattern matches. It deals with two practical problems of lazy I/O: closing the input stream to free resources at the right time, and improved error handling. It does not deal with the fundamental safety issue.
> often don't need general lazy IO, I just want to process the incoming > data in steps, forcing the result of the previous step before slurping > data for the next. I suppose I should write myself general map and fold > functions for doing that safely with strict IO.
If it's safe, you can't use pattern matching on lists as a replacement for iterators.
Paul Rubin <http://phr...@NOSPAM.invalid> writes: > "Mark T. B. Carroll" <Mark.Carr...@Aetion.com> writes: >> I suppose I should write myself general map and fold >> functions for doing that safely with strict IO.
> Yeah, I think this is what Iteratee aims to do in a more general way. > I haven't been able to understand it well enough to use it, so far.
Huh, thank you, yes - I hadn't known about that. When I notice something would be worth writing, often someone has more than adequately beaten me to it!
> Paul Rubin<http://phr...@nospam.invalid> wrote: >> Isn't that enforcement the precise point of a type system? "A >> tractable syntactic method for proving the absence of certain program >> behaviors by classifying phrases according to the kinds of values they >> compute" (Pierce, TaPL). If some type of route-around is desired >> then it should be included in the interface.
> Exactly. So what I'm complaining about is library developers who don't > can imagine that a route-around is desired (to extend the functionality), > and hence don't include it in the interface. Which is quite excuseable, > of course one cannot imagine all the ways this code could possibly be > used in the future. And that's the reason why it's a bad idea to > *enforce* encapsulation, no matter what, without a route-around.
But if the client want to bypass the public interface (it knows what it does) it could change the interface of the library by modifying the sources ? usually I hide everything, and if one want to use internal state he looks at the code and changes what he wants. Isn't it better to force the user to really know what he is doing ?
> If the client says "I just want to use the functions are safe and > guarantee the absence of some behaviour", fine. He should be able to > do that, and the compiler should support him in doing so. If the > client says OTOH "In this particular case, I want to use one of the > unsafe functions, and I will provide the guarantee of absence myself", > he should also be able to do that. If he can't, the penalty is more > work, up to duplication of a substantial part of the library. And > that's undesireable.
Rose Howard <rose.howard-5o6lt...@yopmail.com> wrote: > But if the client want to bypass the public interface (it knows what it > does) it could change the interface of the library by modifying the > sources ?
If he has the sources in the first place. And sometimes it's still impractical to change the source, for example if the library is already installed in compiled form on the end-user's machines.
> usually I hide everything, and if one want to use internal > state he looks at the code and changes what he wants. > Isn't it better to force the user to really know what he is doing ?
No, because there's no need to treat users of your code like they are stupid people. Or if you do, they'll hate you :-)
Again: It's smart to make a distinction between a safe subset and all the code. It's stupid to *enforce* that distinction, under all circumstances.
Dirk Thierbach <dthierb...@usenet.arcornews.de> writes: > My idea of implementing this would be some kind of "module > views": Each module can provide a number of different "export lists". (snip) > so you cannot shoot yourself into the foot without being explicit > about it.
Sounds good.
> But I don't know any language that actually does that.
You can get partway there with some without too much wiggling. For instance, in Modula-3, in an interface you can declare multiple "partial revelations" and indeed a full revelation of objects inside the module; you could reveal the same ones to different extents.