Architecturally flawed

View: New views
12 Messages — Rating Filter:   Alert me  

Architecturally flawed

by Andrew Coppin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

OK, so I just spent an entire day trying to write some code, and after
hours of struggling I have something that's semi-working but very ugly
and rather unreliable. My usual guideline for Haskell programming is "if
it seems hard, you're doing it wrong"...

After many hours of effort, I came up with these:

  data Writer x
  instance Monad Writer
  run_writer :: Writer () -> ByteString
  write1 :: Bool -> Writer ()
  write8 :: Word8 -> Writer ()
  write16 :: Word16 -> Writer ()
  write32 :: Word32 -> Writer ()
  write64 :: Word64 -> Writer ()
  writeN :: Word8 -> Word32 -> Writer ()

  data Reader x
  instance Monad Reader
  run_reader :: Reader x -> ByteString -> x
  is_EOF :: Reader Bool
  read1 :: Reader Bool
  read8 :: Reader Word8
  read16 :: Reader Word16
  read32 :: Reader Word32
  read64 :: Reader Word64
  readN :: Word8 -> Reader Word32

Notice that, unlike the Binary package, all of these are bit-aligned
rather than byte-aligned. This is what took so much effort. I strived to
make each operation as efficient as possible - by performing as few
steps as possible. It would have been *far* easier to, say, implement
read8 by calling read1 eight times and then packing the bits back into a
byte. But that wouldn't be very fast. So actually read8 reads a pair of
bytes, bit-shifts them as required, and then ANDs them together. And so
on. (Note that the Writer monad uses Data.Binary.Builder (only) from the
Binary package. I assume this is more efficient...)

I even sat down and wrote a battery of QuickCheck properties because -
as you can imagine - there's a hell of a lot to go wrong here! After
many hours, I managed to get it to pass all tests...

Next I decided to write an LZW compressor. Here's what I came up with:

  data EncodeLZW symbol
  eLZW_start :: (Ord symbol) => EncodeLZW symbol
  eLZW_encode :: (Ord symbol) => EncodeLZW symbol -> symbol -> Writer
(EncodeLZW symbol)
  eLZW_end :: (Ord symbol) => EncodeLZW symbol -> Writer ()

  data DecodeLZW symbol
  dLZW_start :: DecodeLZW symbol
  dLZW_decode :: DecodeLZW symbol -> Reader (DecodeLZW symbol, symbol)

So each step of the encoder (possibly) writes some bits, and also
returns a new encoder state. The decoder is similar. (At least one
program bug was caused by the wrong version of the state being used
somewhere!)

Then I tried to run the code, and oh look, there's a bug somewhere.

Emphasis "somewhere". At this point I have several KB of code and no
idea how in hell to test it. What I *want* to do is single-step through
the encoder algorithm, watch it write bits out and update state, and
then single-step through the decoder watching it read bits in and update
its state, etc. But I can't. All the encoder does is return a giant
Writer object which you can't look inside, only run. And likewise for
the decoder.

I could try GHC's new debugger. But my experiences with it so far have
shown that for all but the most trivial programs possible, it becomes
intractably difficult to figure out what the debugger is actually
showing you. I actually tried to debug a "normal" LZW implementation
once - one that didn't involve two highly convoluted custom monads and a
stateful execution model with manually threaded state. This is *not* my
idea of a fun time...

In a "normal" programming language, at this point you would sprinkle a
few print statements around so you can see the intermediate steps the
program is taking. But I can't. I'm in the wrong monad. Curses!

In the end, the only solution I could come up with was to design a
second, "hacked" version of the two monads. Instead of importing one
module, you import another that provides the same interface. However,
now Reader and Writer are aliases to IO. All write requests cause
pretty-printed binary to be sent to stdout, and all read requests pop up
a prompt for input from stdin. It worked reasonably well in that I could
now add the vitally necessary print statements, and see intermediate
values and so forth... It wasn't very pretty though.

At this point I decided that lugging 5 individual functions around was
silly, and I should write a class:

  class Encoder e where
    encoder_start :: e x
    encode :: e x -> x -> Writer (e x)
    encoder_end :: e x -> Writer ()

  class Decoder d where
    decoder_start :: d x
    decode :: d x -> Reader (d x, x)

10 points to the first person to spot the fatal flaw in this plan...

Yep, that's right. The type system won't accept this. The reason? x
needs an Ord context.

If you turn on multi-parameter type classes *and* flexible instances,
apparently this works:

  class Encoder e x where...

  instance (Ord x) => Encoder EncodeLZW x where...

(I haven't tried but I *presume* it means what I think it does...)

Alternatively, you can avoid using dangerous type system extensions and
just write

  class Encoder e where
    encode :: (Symbol x) => e x -> x -> Writer (e x)
    ...

and make Symbol encompass everything any possible encoder could want.
(In reality, x is almost always Word8 or something. But I dislike loss
of generality just for the hell of it...)

So at least I got that part fixed. Heh. But now I find myself worrying
about yet *another* problem: is Writer lazy enough?

I mean, the idea is that you can write a program that lazily reads from
a file, pushes it through a Writer, and lazily writes the result back to
another file. The thing should chug along reasonably fast and use a
constant amount of RAM. But all this is only true of Writer is actually
lazy enough. I have a horrible feeling that all the complicated Origami
inside makes it too strict. And I have no way to check!

Actually, thinking about it, is Reader lazy enough? You call run_reader
and it hands over your data, but if that data is, say, a list, will
run_reader build the entire damn list before it'll hand it over? Or will
the monadic code by called as-needed to generate the list elements?
Obviously I desire the latter, but I'm really not sure what the actual
behaviour is...

In summary, I've spent several days coding, and I still don't have a
single algorithm working. I've spent all my time wrestling with the
mundane details of infrastructure rather than the interesting algorithms
I actually set out to implement. This makes me very sad. :-(

If anybody can think of a better set of abstractions or another way to
do debugging, I'd be interested to hear about it.

(This would all be so trivial in an OO language. Just make an Encoder
object that updates its own state internally and talks to a Source
object and a Destination object for its data...)

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Architecturally flawed

by Don Stewart-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

andrewcoppin:
> I could try GHC's new debugger. But my experiences with it so far have
> shown that for all but the most trivial programs possible, it becomes
> intractably difficult to figure out what the debugger is actually
> showing you. I actually tried to debug a "normal" LZW implementation
> once - one that didn't involve two highly convoluted custom monads and a
> stateful execution model with manually threaded state. This is *not* my
> idea of a fun time...

For what its worth, people are using this at work on large projects
happily.
 
> In a "normal" programming language, at this point you would sprinkle a
> few print statements around so you can see the intermediate steps the
> program is taking. But I can't. I'm in the wrong monad. Curses!

Use Debug.Trace.trace.
 
> In the end, the only solution I could come up with was to design a
> second, "hacked" version of the two monads. Instead of importing one
> module, you import another that provides the same interface. However,
> now Reader and Writer are aliases to IO. All write requests cause
> pretty-printed binary to be sent to stdout, and all read requests pop up
> a prompt for input from stdin. It worked reasonably well in that I could
> now add the vitally necessary print statements, and see intermediate
> values and so forth... It wasn't very pretty though.

You should have used Debug.Trace.
 
> So at least I got that part fixed. Heh. But now I find myself worrying
> about yet *another* problem: is Writer lazy enough?
>
> I mean, the idea is that you can write a program that lazily reads from
> a file, pushes it through a Writer, and lazily writes the result back to
> another file. The thing should chug along reasonably fast and use a
> constant amount of RAM. But all this is only true of Writer is actually
> lazy enough. I have a horrible feeling that all the complicated Origami
> inside makes it too strict. And I have no way to check!

Actually, you can use 'chasingBottoms' to write QuickCheck properties
that state your expected laziness or strictness behaviour.
 
> Actually, thinking about it, is Reader lazy enough? You call run_reader
> and it hands over your data, but if that data is, say, a list, will
> run_reader build the entire damn list before it'll hand it over? Or will
> the monadic code by called as-needed to generate the list elements?
> Obviously I desire the latter, but I'm really not sure what the actual
> behaviour is...

The mtl Reader class is lazy.
 
> In summary, I've spent several days coding, and I still don't have a
> single algorithm working. I've spent all my time wrestling with the
> mundane details of infrastructure rather than the interesting algorithms
> I actually set out to implement. This makes me very sad. :-(

You should have stopped by #haskell a few days ago.
 
> If anybody can think of a better set of abstractions or another way to
> do debugging, I'd be interested to hear about it.

There's already a bit layer for Data.Binary on hackage, for what its
worth. And an LZW encoder/decoder.
 
> (This would all be so trivial in an OO language. Just make an Encoder
> object that updates its own state internally and talks to a Source
> object and a Destination object for its data...)

Make an Encoder class that updates its own state internally and lazy
streams input and output. As Data.Binary does, as zlib does, et al.

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Architecturally flawed

by Jamie Brandon-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I have a similar piece of code at http://code.haskell.org/gmap/serial/
which is fairly well tested. It currently only outputs lists of words
but its based on Data.Binary so it should be fairly easy to get
bytestrings out it (bytestrings worked up till 2-3 weeks ago, I just
havent bothered to keep the byestring part up to date).I dont yet know
how fast it is but I will be tweaking it over the next month or so.

Jamie
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

GHCi Debugger (was Re: Architecturally flawed)

by Thomas M. DuBuisson :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


> I could try GHC's new debugger. But my experiences with it so far have
> shown that for all but the most trivial programs possible, it becomes
> intractably difficult to figure out what the debugger is actually
> showing you.

GDB is to C as
(a) GHCi debugger :: Haskell
(b) Pigs :: Farmers
(c) Food :: TomMD
(d) None of the above

Hint: Its not (a).  The GHCi debugger seems to catch extra flack because
people want to pour through their Haskell code as they do imperative
code.  I can sympathize - I would like to do that too - but it would be
an inaccurate picture of the programs execution.  so long as you regard
the GHCi as a new/useful tool and not try to pretend its like other
debuggers you'll probably be happier.

I've found ghcid to be useful when quickcheck + HPC + ChasingBottoms
fails to narrow down the problem any further.  Its gotten to the point
where I often know exactly which LOC/module will be the next step (based
on knowledge of the data dependencies).  A fair share of bugs have
fallen to the sword named vim as a result :-).  It really is useful, but
like all other haskellisms, one must learn to ride a bike all over
again.

At times I think of ghcid as the anti-gdb.  If there's a series of let
bindings, each mutating the predecessor, its enjoyable to see the
debugger start at the bottom and crawl its way back up.

Tom

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: HPC for debugging (Was Re: GHCi Debugger

by Don Stewart-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

thomas.dubuisson:
> > I could try GHC's new debugger. But my experiences with it so far have
> > shown that for all but the most trivial programs possible, it becomes
> > intractably difficult to figure out what the debugger is actually
> > showing you.
>
> At times I think of ghcid as the anti-gdb.  If there's a series of let
> bindings, each mutating the predecessor, its enjoyable to see the
> debugger start at the bottom and crawl its way back up.

I'd like to relate a debugging effort this week. A colleague had an
exception thrown from deep within a large body of code, we knew not where.
A few trace statements didn't yield much information. The debugger was
fired up on this >50 module program, and we got a few interesting hints
and pieces, but it seemed as if the exception was floating away from its
call site, to the point it was being demanded. Hmm. A puzzle.

So instead we compiled the application with -fhpc, ran it till the
exception occured, and then ran hpc markup on the result. (This marks up
the program source in colours showing what code was actually executed
during a given program run).

Loading the colourised trace into firefox, we saw at a glance all the
code that had been executed up to the point of the exception, and then
there was the exception itself, staring at us amongst a chunk of bright
yellow code, a lone streak of uncoloured code where it shouldn't have
been.

It took all of 5 seconds to find the bug with HPC.    

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re: HPC for debugging (Was Re: GHCi Debugger

by Henning Thielemann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Thu, 10 Jul 2008, Don Stewart wrote:

> thomas.dubuisson:
>>> I could try GHC's new debugger. But my experiences with it so far have
>>> shown that for all but the most trivial programs possible, it becomes
>>> intractably difficult to figure out what the debugger is actually
>>> showing you.
>>
>> At times I think of ghcid as the anti-gdb.  If there's a series of let
>> bindings, each mutating the predecessor, its enjoyable to see the
>> debugger start at the bottom and crawl its way back up.
>
> I'd like to relate a debugging effort this week. A colleague had an
> exception thrown from deep within a large body of code, we knew not where.

Let me relate this to the Extensible Exception thread of the
Haskell-Library list. Your exception - was it an 'error' or an IO
exception? If it would be an exception (like "file not found") and we
would use ErrorT monad for exceptions with specific types for the
exceptions, then it would be clearer where the exception can come from.
However if it was an error, then we cannot handle it terminally by some
exception-catching like mechanism.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: QuickCheck [Architecturally flawed]

by Andrew Coppin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Andrew Coppin wrote:

> After many hours of effort, I came up with these:
>
>  data Writer x
>  instance Monad Writer
>  run_writer :: Writer () -> ByteString
>  write1 :: Bool -> Writer ()
>  write8 :: Word8 -> Writer ()
>  write16 :: Word16 -> Writer ()
>  write32 :: Word32 -> Writer ()
>  write64 :: Word64 -> Writer ()
>  writeN :: Word8 -> Word32 -> Writer ()
>
>  data Reader x
>  instance Monad Reader
>  run_reader :: Reader x -> ByteString -> x
>  is_EOF :: Reader Bool
>  read1 :: Reader Bool
>  read8 :: Reader Word8
>  read16 :: Reader Word16
>  read32 :: Reader Word32
>  read64 :: Reader Word64
>  readN :: Word8 -> Reader Word32

How would you write QuickCheck properties for these?

For starters, what would be a good set of properties to confirm that any
monad is actually working correctly? More particularly, how about a
state monad? It's easy to screw up the implementation and pass the wrong
state around. How would you catch that?

Secondly, the monads themselves. I started writing things like "if X has
the lowest bit set then the lowest bit of the final byte of the output
should be set"... but then all I'm really doing is reimplementing the
algorithm as a property rather than a monad! If a property fails, is the
program wrong or is the property wrong?

In the end, what I opted to do was define various properties where I
take some arbitrary data, write it with the Writer monad, then read it
back with the Reader monad and confirm that the data stays identical.
(This actually fails for writeN, which writes the N least-significant
bits of the supplied data, so you need to apply some masking before
doing equity. Or, equivilently, reject some test values...)

Looking at the QuickCheck paper, it seems I should probably have done
some checking that the size of the output is correct. I didn't actually
bother because it's really easy to get right, whereas strickiness with
bit-shifts and indexing is all too easy to screw up.

What I finally did was try writing/reading with each primitive (to check
that actually works properly), and then again with a random number of
individual bits packed on each side to give random alignment (to check
that the index adjustments actually work right). It's easy to make the
code work correctly with a certain alignment, but fail spectacularly
otherwise. It's packed at *both* ends because it's also quite easy to
make it write out the correct bit pattern, but leave the bit pointer
with the wrong value, causing subsequent writes to screw up.

How would you approach this one? All hints welcomed.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Improvements [Architecturally flawed]

by Andrew Coppin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Andrew Coppin wrote:

> data Writer x
>  instance Monad Writer
>  run_writer :: Writer () -> ByteString
>  write1 :: Bool -> Writer ()
>  write8 :: Word8 -> Writer ()
>  write16 :: Word16 -> Writer ()
>  write32 :: Word32 -> Writer ()
>  write64 :: Word64 -> Writer ()
>  writeN :: Word8 -> Word32 -> Writer ()
>
>  data Reader x
>  instance Monad Reader
>  run_reader :: Reader x -> ByteString -> x
>  is_EOF :: Reader Bool
>  read1 :: Reader Bool
>  read8 :: Reader Word8
>  read16 :: Reader Word16
>  read32 :: Reader Word32
>  read64 :: Reader Word64
>  readN :: Word8 -> Reader Word32
>
> Next I decided to write an LZW compressor. Here's what I came up with:
>
>  data EncodeLZW symbol
>  eLZW_start :: (Ord symbol) => EncodeLZW symbol
>  eLZW_encode :: (Ord symbol) => EncodeLZW symbol -> symbol -> Writer
> (EncodeLZW symbol)
>  eLZW_end :: (Ord symbol) => EncodeLZW symbol -> Writer ()
>
>  data DecodeLZW symbol
>  dLZW_start :: DecodeLZW symbol
>  dLZW_decode :: DecodeLZW symbol -> Reader (DecodeLZW symbol, symbol)

Suddenly it seems very obvious to me... I built a monad to allow writing
bits to a ByteString, so why not make another monad for writing symbols
to an LZW-compressed ByteString?

So... run a monad on top of another monad. Can that be done? Notionally
it should be possible. I mean, thinking about it, what's the difference
between a Writer and an EncodeLZW? One writes bits using some internal
state, the other writes symbols using a more complex algorithm and
internal state. Heck, if I augment Writer slightly so that the user can
carry along some arbitrary state of their own, then I can just do
something like

  newtype EncodeLZW symbol x = Wrap (Writer (StateLZW symbol) x)
deriving Monad

Now I automatically have Wrap :: Writer (StateLZW symbol) x -> EncodeLZW
symbol x as my magical lifting operator. Now the caller can't use any
Writer functions (because they all have the wrong type) and can only use
the LZW writing functions I provide. I can even write a class -
something like

  class Encoder e s where
    encode :: symbol -> e symbol ()
    run_encoder :: e symbol () -> Writer s ()

where the "run_encoder" thing is actually just a typecast. (I'm doing is
this way so you could possibly run another, different, encoder feeding
to the same sink. If I returned an actual ByteString you'd be prevented
from doing that.)

So far, this only appears to require GeneralizedNewtypeDeriving and
MultiParamTypeClasses, so we should be golden...

...unless somebody else has a better idea?

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re: QuickCheck [Architecturally flawed]

by Ryan Ingram :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I think you can use the duality of Writer/Reader to help you here; you
have the law that, for suitable "dual" computations r and w,

run_reader r (run_writer (w x)) == x

Then you can build up a list of rules specifying which computations
are dual; read64 is dual to write64, for example.  You can then have
some laws like:

if r1 is dual to w1, and r2 is dual to w2,
then
    r1 >>= \x -> r2 >>= \y -> (x,y)
is dual to
    \(x,y) -> w1 x >> w2 y

if r1 is dual to w1, and r2 is dual to w2,
then
   read1 >>= \b -> case b of True -> liftM Left r1 ; False -> liftM Right r2
is dual to
   \x -> case x of Left l -> w1 l; Right r -> w2 r

You can then use these to build up more complicated reader/writer
duals and verify that the main "identity" law holds.

It's a little bit tricky; QuickCheck is not good at dealing with
polymorphic data, but you could generalize this to a simple term ADT:
   data SimpleTerm = Leaf Word8 Word32 | Pair SimpleTerm SimpleTerm |
Switch (Either SimpleTerm SimpleTerm) deriving Eq

and make a suitable "arbitrary" instance for SimpleTerm to test your
reader/writer.  Leaf would test readN/writeN, or you can make custom
leaves to test the other operations.

  -- ryan

On Fri, Jul 11, 2008 at 11:10 AM, Andrew Coppin
<andrewcoppin@...> wrote:

> Andrew Coppin wrote:
>>
>> After many hours of effort, I came up with these:
>>
>>  data Writer x
>>  instance Monad Writer
>>  run_writer :: Writer () -> ByteString
>>  write1 :: Bool -> Writer ()
>>  write8 :: Word8 -> Writer ()
>>  write16 :: Word16 -> Writer ()
>>  write32 :: Word32 -> Writer ()
>>  write64 :: Word64 -> Writer ()
>>  writeN :: Word8 -> Word32 -> Writer ()
>>
>>  data Reader x
>>  instance Monad Reader
>>  run_reader :: Reader x -> ByteString -> x
>>  is_EOF :: Reader Bool
>>  read1 :: Reader Bool
>>  read8 :: Reader Word8
>>  read16 :: Reader Word16
>>  read32 :: Reader Word32
>>  read64 :: Reader Word64
>>  readN :: Word8 -> Reader Word32
>
> How would you write QuickCheck properties for these?
>
> For starters, what would be a good set of properties to confirm that any
> monad is actually working correctly? More particularly, how about a state
> monad? It's easy to screw up the implementation and pass the wrong state
> around. How would you catch that?
>
> Secondly, the monads themselves. I started writing things like "if X has the
> lowest bit set then the lowest bit of the final byte of the output should be
> set"... but then all I'm really doing is reimplementing the algorithm as a
> property rather than a monad! If a property fails, is the program wrong or
> is the property wrong?
>
> In the end, what I opted to do was define various properties where I take
> some arbitrary data, write it with the Writer monad, then read it back with
> the Reader monad and confirm that the data stays identical. (This actually
> fails for writeN, which writes the N least-significant bits of the supplied
> data, so you need to apply some masking before doing equity. Or,
> equivilently, reject some test values...)
>
> Looking at the QuickCheck paper, it seems I should probably have done some
> checking that the size of the output is correct. I didn't actually bother
> because it's really easy to get right, whereas strickiness with bit-shifts
> and indexing is all too easy to screw up.
>
> What I finally did was try writing/reading with each primitive (to check
> that actually works properly), and then again with a random number of
> individual bits packed on each side to give random alignment (to check that
> the index adjustments actually work right). It's easy to make the code work
> correctly with a certain alignment, but fail spectacularly otherwise. It's
> packed at *both* ends because it's also quite easy to make it write out the
> correct bit pattern, but leave the bit pointer with the wrong value, causing
> subsequent writes to screw up.
>
> How would you approach this one? All hints welcomed.
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@...
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re: QuickCheck [Architecturally flawed]

by Paul Johnson-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ryan Ingram wrote:

> I think you can use the duality of Writer/Reader to help you here; you
> have the law that, for suitable "dual" computations r and w,
>
> run_reader r (run_writer (w x)) == x
>
> Then you can build up a list of rules specifying which computations
> are dual; read64 is dual to write64, for example.  You can then have
> some laws like:
>
> if r1 is dual to w1, and r2 is dual to w2,
> then
>     r1 >>= \x -> r2 >>= \y -> (x,y)
> is dual to
>     \(x,y) -> w1 x >> w2 y
>
> if r1 is dual to w1, and r2 is dual to w2,
> then
>    read1 >>= \b -> case b of True -> liftM Left r1 ; False -> liftM Right r2
> is dual to
>    \x -> case x of Left l -> w1 l; Right r -> w2 r
>
> You can then use these to build up more complicated reader/writer
> duals and verify that the main "identity" law holds.
>
> It's a little bit tricky; QuickCheck is not good at dealing with
> polymorphic data, but you could generalize this to a simple term ADT:
>    data SimpleTerm = Leaf Word8 Word32 | Pair SimpleTerm SimpleTerm |
> Switch (Either SimpleTerm SimpleTerm) deriving Eq
>
> and make a suitable "arbitrary" instance for SimpleTerm to test your
> reader/writer.  Leaf would test readN/writeN, or you can make custom
> leaves to test the other operations.
>
>   -- ryan
>
> On Fri, Jul 11, 2008 at 11:10 AM, Andrew Coppin
> <andrewcoppin@...> wrote:
>  
>> For starters, what would be a good set of properties to confirm that any
>> monad is actually working correctly? More particularly, how about a state
>> monad? It's easy to screw up the implementation and pass the wrong state
>> around. How would you catch that?
>>    
See http://www.cs.chalmers.se/~rjmh/Papers/QuickCheckST.ps on QuickCheck
tests for monadic properties.  The basic idea is to write a Gen action
to generate a list of actions in your target monad.
>> Secondly, the monads themselves. I started writing things like "if X has the
>> lowest bit set then the lowest bit of the final byte of the output should be
>> set"... but then all I'm really doing is reimplementing the algorithm as a
>> property rather than a monad! If a property fails, is the program wrong or
>> is the property wrong
This is a fundamental issue with the way that QuickCheck, or any other
automatic test generator, works.  QuickCheck tests are a formal
specification of the properties of your program, so they have the same
fundamental complexity as your program.  See
http://en.wikipedia.org/wiki/Kolmogorov_complexity for more on this.  
The only exception is when a complicated algorithm produces a simple
result, such as sorting or square roots.  There are two ways of dealing
with this:

1: Write abbreviated properties that only specify part of your program's
behaviour, and trust to single-case tests and code inspection for the
rest.  For instance a test for "reverse" done this way might test that
reverse "abcd" = "dcba" and otherwise just check that the input and
output strings were the same length.

2: Accept that your tests are going to be as long as the program
itself.  When a test fails, just figure out which is at fault (you have
to do this for any testing method anyway).  You still gain reliability
because you have implemented the algorithm in two different ways, so
hopefully a defect in one will not have a matching defect in the other.  
I say "hopefully" because the history of N-version programming suggests
that such errors are not independent, even when the versions are
developed by different teams.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re: QuickCheck [Architecturally flawed]

by Andrew Coppin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ryan Ingram wrote:
> I think you can use the duality of Writer/Reader to help you here; you
> have the law that, for suitable "dual" computations r and w,
>
> run_reader r (run_writer (w x)) == x
>
> Then you can build up a list of rules specifying which computations
> are dual; read64 is dual to write64, for example.

OK. This is more or less the approach I ended up taking. Otherwise the
QuickCheck properties just reimplement the functions themselves - which
doesn't tell you anything! ;-)

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Architecturally flawed

by Tillmann Rendel-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Andrew Coppin wrote:
> [design of a bitwise binary library]
>
> (This would all be so trivial in an OO language. Just make an Encoder
> object that updates its own state internally and talks to a Source
> object and a Destination object for its data...)

I guess it's on the same level of trivialness in Haskell, too, but to be
fair, I haven't tried it... I would proceed as follows:

(1) Try to not shadow names from the mtl or other standard packages.

     I choose BitSink and BitSource instead of Reader and Writer.

(2) Select a small number of primitive operations.

     I select the operations "read n bits" and "put n bits" as primitive
     operations. As interface format, I choose [Boolean], which is
     not exactly optimized, but easy to understand.

     It is easy to implement operations for single bits, bytes etc. on
     top of this operations. We will later include them into the
     typeclass, but we will first make the [Boolean]-based operations
     correct.

(2) Make BitSink and BitSource composable, e.g., as monad transformers.
     The type classes could look like:

     class MonadBitSource m where
       getBits :: Int -> m [Boolean]

     class MonadBitSink m where
       putBits :: [Boolean] -> m ()

     And we need a lot of trivial instances for the various mtl monad
     transformers in the style of:

     instance MonadBitSink m => MonadBitSink (ReaderT m) where
       putBits = lift . putBits


(3) Write a very simple implementation to (a) check that the typeclasses
     makes sense and is implementable and (b) have a test-implementation
     for later correctness tests.

     The easiest implementation I can think of consists of a state monad
     which handles a list of booleans. It could look like this:

     newtype BitListT m a = BitListT (StateT [Boolean] m a)
       deriving (Functor, Monad, MonadReader r, MonadWriter w, ...)

     You should be able to derive all mtl classes except
     MonadTransformer, MonadIO and MonadState. Instantiate these
     yourself:

     instance MonadTransformer BitListT where
       lift (BitListT p) = BitListT (lift p)

     instance MonadIO m => MonadIO (BitListT m) where
       liftIO = lift . liftIO

     We want to hide BitListT's state and expose a state in the nested
     monad to the user, if there is any.

     instance MonadState s m => MonadState s (BitListT m) where
       get = lift get
       put = lift . put

     Finally, the real stuff:

     instance MonadBitSink (BitListT m) where
       putBits x = BitListT $ modify (++ x)

     instance MonadBitSource (BitListT m) where
       getBits n = BitListT $ do
         result <- gets (take n)
         guard (length result == n)
         modify (drop n)
         return result

     runSinkBitListT :: BitListT m a -> m (BitListT ([Boolean], a))
     runSinkBitListT (BitListT p) = return $ runState p []

     runSourceBitListT :: BitListT m a -> [Boolean] -> m a
     runSourceBitListT (BitListT p) bits = return $ evalState p bits

(4) Check the simple implementation

     Now we can write quickcheck properties (if you believe in XP, you
     can write them before (3), of course) to check our simple
     implementation and document the specification. given simple function
     runSink = runIdentity . runSinkBitListT and runSource bits =
     runIdentity . runSourceBitListT bits, we have such properties as

       forall n . forall bits . length bits >= n ==>
         length (runSource (getBits n)) == n

       forall bits . runSource (getBits (length bits)) == bits

       forall a . forall b .
         (runSource (liftM2 (++) (getBits a) (getBits b)))
         == runSource (getBits (a + b))

       forall p . forall q .
         runSink p ++ runSink q == runSink (p ++ q)

     usw. Use the tricks already mentioned in this thread for the last
     property. Don't forget to write properties for the high-level
     interface putWord8 etc.

(5) Write a more realistic instance, e.g. by replacing [Boolean] through
     (Int, ByteString) and doing clever things in getBits / putBits. Test
     this instance both with the existing properties and against the
     simple instance, i.e., verify that getBits and setBits means the
     same in both monads.

(6) move the high-level functions getWord8 & Co. into the typeclass,
     keep the definition as defaults. they are fine for BitListT, but
     implement your own versions for the other instance. Quickcheck them
     against the properties, against the simple implementation in
     BitListT and against the default definitions.

(7) if you need tracing, use liftIO (print ...) and finally understand
     what's the point about monad transformer stacks and MonadIO and
     why you almost always want to define a monad transformer instead
     of a monad.

(8) Write your LZW stuff with

     newtype EncoderT s m a = EncoderT (StateT (LZW s) (BitSink m) a)
       deriving (a lot of stuff)

(9) Have fun!


   Tillmann
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe
LightInTheBox - Buy quality products at wholesale price