GHC predictability

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 - 4 | Next >

GHC predictability

by Jeff Polakow :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Hello,

One frequent criticism of Haskell (and by extension GHC) is that it has unpredictable performance and memory consumption. I personally do not find this to be the case. I suspect that most programmer confusion is rooted in shaky knowledge of lazy evaluation; and I have been able to fix, with relative ease, the various performance problems I've run into. However I am not doing any sort of performance critical computing (I care about minutes or seconds, but not about milliseconds).

I would like to know what others think about this. Is GHC predictable? Is a thorough knowledge of lazy evaluation good enough to write efficient (whatever that means to you) code? Or is intimate knowledge of GHC's innards necessary?

thanks,
  Jeff

PS I am conflating Haskell and GHC because I use GHC (with its extensions) and it produces (to my knowledge) the fastest code.

---

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Don Stewart-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

jeff.polakow:

>    Hello,
>
>    One frequent criticism of Haskell (and by extension GHC) is that it has
>    unpredictable performance and memory consumption. I personally do not find
>    this to be the case. I suspect that most programmer confusion is rooted in
>    shaky knowledge of lazy evaluation; and I have been able to fix, with
>    relative ease, the various performance problems I've run into. However I
>    am not doing any sort of performance critical computing (I care about
>    minutes or seconds, but not about milliseconds).
>
>    I would like to know what others think about this. Is GHC predictable? Is
>    a thorough knowledge of lazy evaluation good enough to write efficient
>    (whatever that means to you) code? Or is intimate knowledge of GHC's
>    innards necessary?
>
>    thanks,
>      Jeff
>
>    PS I am conflating Haskell and GHC because I use GHC (with its extensions)
>    and it produces (to my knowledge) the fastest code.

This has been my experience to. I'm not even sure where
"unpredicatiblity" would even come in, other than though not
understanding the demand patterns of the code.

It's relatively easy to look at the Core to get a precise understanding
of the runtime behaviour.

I've also not found the GC unpredicatble either.

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by David Roundy-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, May 09, 2008 at 02:24:12PM -0700, Don Stewart wrote:

> jeff.polakow:
> > Hello,
> >
> > One frequent criticism of Haskell (and by extension GHC) is that it has
> > unpredictable performance and memory consumption. I personally do not find
> > this to be the case. I suspect that most programmer confusion is rooted in
> > shaky knowledge of lazy evaluation; and I have been able to fix, with
> > relative ease, the various performance problems I've run into. However I
> > am not doing any sort of performance critical computing (I care about
> > minutes or seconds, but not about milliseconds).
> >
> > I would like to know what others think about this. Is GHC predictable? Is
> > a thorough knowledge of lazy evaluation good enough to write efficient
> > (whatever that means to you) code? Or is intimate knowledge of GHC's
> > innards necessary?
> >
> > thanks,
> >   Jeff
> >
> > PS I am conflating Haskell and GHC because I use GHC (with its extensions)
> > and it produces (to my knowledge) the fastest code.
>
> This has been my experience to. I'm not even sure where
> "unpredicatiblity" would even come in, other than though not
> understanding the demand patterns of the code.

I think the unpredictability comes in due to the difficulty of predicting
resource useage in the presence of lazy data (and particularly lazy IO).
It's not really that hard to predict the behavior of code that you write,
but it can certainly be hard to predict the effect of changes that you make
to someone else's code.  It's an effect of the possibility of constructing
and consuming large data objects without ever holding them in memory.  It's
beautiful, and it's wonderful, but it's also really easy for someone to add
a second consumer of the same object, and send performance through the
floor.

Of course, one can avoid this particular pattern, but then you lose some of
the very nice abstractions that laziness gives us.
--
David Roundy
Department of Physics
Oregon State University
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Abhay Parvate :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

As a beginner, I had found the behaviour quite unpredictable. But with time I found that I could reason out the behaviour with my slowly growing knowledge of laziness. I don't spot all the places in my program that will suck while writing a program, but post facto many things become clear. (And then there is the profiler!)

GHC's internal details had been never necessary to me! I aspire to write computationally heavy programs in haskell in future, and I have been successful in reaching factors of 3 to 5 with C programs (though I have not been upto factors of 1 for which I find claims here and there) without any knowledge of GHC internals. But the GHC user guide is immensely valuable.

I would like to note that beginners' codes are many times time/memory consuming even in slighly complicated cases, and it may be a big source of frustration and turn-away if they don't stick up and pursue. This is not a problem of GHC, or even Haskell; it generally applies to functional programming.

These are my opinions; I am only an advanced beginner :)

2008/5/10 Jeff Polakow <jeff.polakow@...>:

Hello,

One frequent criticism of Haskell (and by extension GHC) is that it has unpredictable performance and memory consumption. I personally do not find this to be the case. I suspect that most programmer confusion is rooted in shaky knowledge of lazy evaluation; and I have been able to fix, with relative ease, the various performance problems I've run into. However I am not doing any sort of performance critical computing (I care about minutes or seconds, but not about milliseconds).

I would like to know what others think about this. Is GHC predictable? Is a thorough knowledge of lazy evaluation good enough to write efficient (whatever that means to you) code? Or is intimate knowledge of GHC's innards necessary?

thanks,
  Jeff

PS I am conflating Haskell and GHC because I use GHC (with its extensions) and it produces (to my knowledge) the fastest code.

---

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe



_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Andrew Coppin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Don Stewart wrote:

> jeff.polakow:
>  
>>    Hello,
>>
>>    One frequent criticism of Haskell (and by extension GHC) is that it has
>>    unpredictable performance and memory consumption. I personally do not find
>>    this to be the case. I suspect that most programmer confusion is rooted in
>>    shaky knowledge of lazy evaluation; and I have been able to fix, with
>>    relative ease, the various performance problems I've run into. However I
>>    am not doing any sort of performance critical computing (I care about
>>    minutes or seconds, but not about milliseconds).
>>
>>    I would like to know what others think about this. Is GHC predictable? Is
>>    a thorough knowledge of lazy evaluation good enough to write efficient
>>    (whatever that means to you) code? Or is intimate knowledge of GHC's
>>    innards necessary?
>>
>>    thanks,
>>      Jeff
>>
>>    PS I am conflating Haskell and GHC because I use GHC (with its extensions)
>>    and it produces (to my knowledge) the fastest code.
>>    
>
> This has been my experience to. I'm not even sure where
> "unpredicatiblity" would even come in, other than though not
> understanding the demand patterns of the code.
>
> It's relatively easy to look at the Core to get a precise understanding
> of the runtime behaviour.
>
> I've also not found the GC unpredicatble either.
>  

I offer up the following example:

  mean xs = sum xs / length xs

Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)

If we now rearrange this to

  mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1
in s' `seq` n' `seq` (s', n')) (0,0)

and run the same example, and watch it run in constant space.

Of course, the first version is clearly readable, while the second one
is almost utterly incomprehensible, especially to a beginner. (It's even
more fun that you need all those seq calls in there to make it work
properly.)

The sad fact is that if you just write something in Haskell in a nice,
declarative style, then roughly 20% of the time you get good
performance, and 80% of the time you get laughably poor performance. For
example, I sat down and spent the best part of a day writing an MD5
implementation. Eventually I got it so that all the test vectors work
right. (Stupid little-endian nonsense... mutter mutter...) When I tried
it on a file containing more than 1 MB of data... ooooohhhh dear... I
gave up after waiting several minutes for an operation that the C
implementation can do in milliseconds. I'm sure there's some way of
fixing this, but... the source code is pretty damn large, and very messy
as it is. I shudder to think what you'd need to do to it to speed it up.

Of course, the first step in any serious attempt at performance
improvement is to actually profile the code to figure out where the time
is being spent. Laziness is *not* your friend here. I've more or less
given up trying to comprehend the numbers I get back from the GHC
profiles, because they apparently defy logic. I'm sure there's a reason
to the madness somewhere, but... for nontrivial programs, it's just too
hard to figure out what's going on.

Probably the best part is that almost any nontrivial program you write
spends 60% or more of its time doing GC rather than actual work. Good
luck with the heap profiler. It's even more mysterious than the time
profiles. ;-)

In short, as a fairly new Haskell programmer, I find it completely
impossibly to write code that doesn't crawl along at a snail's pace.
Even when I manage to make it faster, I usually have no clue why. (E.g.,
adding a seq to a mergesort made it 10x faster. Why? Changing from
strict ByteString to lazy ByteString made one program 100x faster. Why?)

Note that I'm not *blaming* GHC for this - I think it's just inherantly
very hard to predict performance in a lazy language. (Remember,
deterministic isn't the same as predictable - see Chaos Theory for why.)
I wish it wasn't - becuase I really *really* want to write all my
complex compute-bounded programs in Haskell, because it makes algorithms
so beautifully easy to express. But when you're trying to implement
something that takes hours to run even in C...

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Duncan Coutts :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Mon, 2008-05-12 at 20:01 +0100, Andrew Coppin wrote:
> In short, as a fairly new Haskell programmer, I find it completely
> impossibly to write code that doesn't crawl along at a snail's pace.
> Even when I manage to make it faster, I usually have no clue why. (E.g.,
> adding a seq to a mergesort made it 10x faster. Why? Changing from
> strict ByteString to lazy ByteString made one program 100x faster. Why?)

This isn't just a little language issue. You know nothing about the data
representations you're working with and then you're surprised that
switching data representations makes a big difference. Have you looked
up the time complexity of the operations you're using?

Duncan

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Yitzchak Gale :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Andrew Coppin wrote:

>  I offer up the following example:
>
>   mean xs = sum xs / length xs
>
>  Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)
>
>  If we now rearrange this to
>
>   mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1 in s'
> `seq` n' `seq` (s', n')) (0,0)
>
>  and run the same example, and watch it run in constant space.
>
>  Of course, the first version is clearly readable, while the second one is
> almost utterly incomprehensible, especially to a beginner. (It's even more
> fun that you need all those seq calls in there to make it work properly.)

You can write it like this:

mean = uncurry (/) . foldl' (\(s,n) x -> ((,) $! s+x) $! n+1) (0,0)

I don't think that's so bad. And for real-life examples, you almost
never need the ($!)'s or seq's - your function will do some kind
of pattern matching that will force the arguments. So really, all
you need to remember is: if you're repeating a fast calculation across
a big list, use foldl'. And insertWith', if you're storing the result in
a Data.Map. That's about it.

>  The sad fact is that if you just write something in Haskell in a nice,
> declarative style, then roughly 20% of the time you get good performance,
> and 80% of the time you get laughably poor performance.

I don't know why you think that. I've written a wide variety of functions
over the past few years. I find that when performance isn't good enough,
it's because of the algorithm, not because of laziness. Laziness
works for me, not against me.

Of course, it depends what you mean by "good performance". I have
never needed shootout-like performance. But to get that, you need
some messy optimization in any language.

Regards,
Yitz
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Jeff Polakow :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Hello,

> I offer up the following example:
>

This is an instructive example.

>   mean xs = sum xs / length xs
>

In order to type-check, I actually need to write something like:

    mean xs = sum xs / fromIntegral (length xs)

There are other ways of get the numeric types to match correctly, but this is fairly general.

Then, I immediately blow my stack if I try something like:

    mean [1..1000000000].

The culprit is actually sum which is defined in the base libraries as either a foldl or a direct recursion depending on a compiler flag. In either case, the code is not strict enough; just trying to compute:

     sum [1..10000000]

blows the stack. This can be easily fixed by defining a suitable strict sum:

    sum' = foldl' (+) 0

and now sum' has constant space. We could try to redefine mean using sum':

    mean1 xs = sum' xs / fromIntegral (length xs)

but this still gobbles up memory. The reason is that xs is used twice and cannot be discarded as it is generated. So we must move to a direct fold, as you did, to get a space efficient mean.

> If we now rearrange this to
>
>   mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1
> in s' `seq` n' `seq` (s', n')) (0,0)
>
> and run the same example, and watch it run in constant space.
>
This code actually blows the stack on my machine just like the first naive mean. Foldl is perhaps more intuitive  to use here, since we are summing the numbers as we encounter them while walking down the list, and there is a strict version, foldl', provided in the base libraries.


    mean2 = uncurry (/) . foldl' (\(s,n) x -> (s+x, n+1)) (0,0)

However, this still gobbles up memory... the reason is that pairs are lazy. So we need a way to force the (s+x) and (n+1). An easy, and unobtrusive way to do this is to use strict pattern matching:

    mean2 = uncurry (/) . foldl' (\(!s, !n) x -> (s+x, n+1)) (0,0)

Now we can run:

    mean2 [1..1000000000]

in constant space.

While using an explicit foldl is less readable than the direct version (especially to a beginner), it is a standard functional idiom. Furthermore, a good understanding of lazy evaluation should be enough to guide you to using the strict foldl' and then then strict patterns. Is this a reasonable analysis?

Also, we've made no attempt to address speed. However, I would argue that the code's performance time is predictable-- it grows linearly with the size of the list. Improving the performance time is another matter that requires knowing about the internal representation of the various types being used.

-Jeff


---

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Parent Message unknown Re: GHC predictability

by Don Stewart-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> I offer up the following example:
>
>  mean xs = sum xs / length xs
>
> Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)

But you know why, don't you?

>  sat down and spent the best part of a day writing an MD5
> implementation. Eventually I got it so that all the test vectors work
> right. (Stupid little-endian nonsense... mutter mutter...) When I tried
> it on a file containing more than 1 MB of data... ooooohhhh dear...

Did you use Data.Binary or the other libraries for efficient access to gigabytes of data?

> Of course, the first step in any serious attempt at performance improvement
> is to actually profile the code to figure out where the time
> is being spent.

Well, actually, for our 'mean()' case, it means just using the right structures.
Arrays for example:

    import Data.Array.Vector
    import Text.Printf

    mean :: UArr Double -> Double
    mean arr = b / fromIntegral a
      where
        k (n :*: s) a = n+1 :*: s+a
        a :*: b = foldlU k (0 :*: 0) arr :: (Int :*: Double)

    main = printf "%f\n" . mean $ enumFromToFracU 1 1e9

For example,

    $ time ./A
    5.00000000067109e8
    ./A  3.53s user 0.00s system 99% cpu 3.532 total

Try allocating an array of doubles in C, and getting similar results.
(The compiler is optimising this to:

    Main_zdszdwfold_info:
      leaq        32(%r12), %rax
      cmpq        %r15, %rax
      movq        %rax, %r12
      ja  .L10
      movsd       .LC0(%rip), %xmm0
      ucomisd     %xmm5, %xmm0
      jae .L12
      movq        %rsi, (%rax)
      movq        $base_GHCziFloat_Dzh_con_info, -24(%rax)
      movsd       %xmm6, -16(%rax)
      movq        $base_GHCziBase_Izh_con_info, -8(%rax)
      leaq        -7(%rax), %rbx
      leaq        -23(%rax), %rsi
      jmp *(%rbp)
    .L12:
      movapd      %xmm6, %xmm0
      addq        $1, %rsi
      subq        $32, %r12
      addsd       %xmm5, %xmm0
      addsd       .LC2(%rip), %xmm5
      movapd      %xmm0, %xmm6
      jmp Main_zdszdwfold_info

Note even any garbage collection. This should be pretty much the same
performance-wise as unoptimised C.

> almost any nontrivial program you write
> spends 60% or more of its time doing GC rather than actual work.

Ok, you're doing something very wrong. GC time is typically less than 15% of the running
time of typical work programs I hack on.

I bet you're using lists inappropriately?

> I find it completely impossibly to write code that doesn't crawl along at a
> snail's pace.  Even when I manage to make it faster, I usually have no clue
> why.

I think there is a problem that few people are taking the time to understand
the compilation model of Haskell, while they've had the C runtime behaviour
drilled into their brains since college.

Until you sit down and understand what your Haskell code means, it will be very
hard to reason about optimisations, unfortunately.

GHC does what it does well, and its predictable. As long as you understand
the operations your trying to make predictions about.

I suggest installing ghc-core, and looking at how your code is compiled.
Try many examples, and have a good knowledge of the STG paper.

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Albert Y. C. Lai :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Advanced technology ought to look like unpredictable magic.

My experience with lazy evaluation is such that every time a program is
slower or bulkier than I presumed, it is not arbitrariness, it is
something new to learn.

My experience with GHC is such that every surprise it gives me is a
pleasant surprise: it produces a program faster or leaner than lazy
evaluation would have it. "Where has the box gone?"
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Spencer Janssen-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, May 12, 2008 at 08:01:53PM +0100, Andrew Coppin wrote:
> I offer up the following example:
>
>  mean xs = sum xs / length xs
>
> Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)

I don't see why the performance implications of this program are surprising.
Just ask any programmer used to a strict language how much memory "[1 .. 1e9]"
will require.

> If we now rearrange this to
>
>  mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1 in
> s' `seq` n' `seq` (s', n')) (0,0)
>
> and run the same example, and watch it run in constant space.

This will use linear stack space.  You probably meant to use foldl'?

Better:

    mean = uncurry (/) . foldl' f (0, 0)
     where f (!s, !n) x = (s + x, n + 1)

           -- or, if you require Haskell '98:
           f (s, n) x = s `seq` n `seq` (s + x, n + 1)

This version is very legible in my opinion.  In fact, the algorithm is
identical to what I'd write in C.  Also, "mean [1 .. 1e9]" will actually work
in Haskell, while in C you'll just run out of memory.


Cheers,
Spencer Janssen
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Don Stewart-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

gale:

> Andrew Coppin wrote:
> >  I offer up the following example:
> >
> >   mean xs = sum xs / length xs
> >
> >  Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)
> >
> >  If we now rearrange this to
> >
> >   mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1 in s'
> > `seq` n' `seq` (s', n')) (0,0)
> >
> >  and run the same example, and watch it run in constant space.
> >
> >  Of course, the first version is clearly readable, while the second one is
> > almost utterly incomprehensible, especially to a beginner. (It's even more
> > fun that you need all those seq calls in there to make it work properly.)
>
> You can write it like this:
>
> mean = uncurry (/) . foldl' (\(s,n) x -> ((,) $! s+x) $! n+1) (0,0)
>
> I don't think that's so bad. And for real-life examples, you almost
> never need the ($!)'s or seq's - your function will do some kind
> of pattern matching that will force the arguments. So really, all
> you need to remember is: if you're repeating a fast calculation across
> a big list, use foldl'. And insertWith', if you're storing the result in
> a Data.Map. That's about it.
>
> >  The sad fact is that if you just write something in Haskell in a nice,
> > declarative style, then roughly 20% of the time you get good performance,
> > and 80% of the time you get laughably poor performance.
>
> I don't know why you think that. I've written a wide variety of functions
> over the past few years. I find that when performance isn't good enough,
> it's because of the algorithm, not because of laziness. Laziness
> works for me, not against me.
>
> Of course, it depends what you mean by "good performance". I have
> never needed shootout-like performance. But to get that, you need
> some messy optimization in any language.

We can actually get great performance here,

    {-# LANGUAGE TypeOperators #-}

    import Data.Array.Vector
    import Text.Printf

    mean :: UArr Double -> Double
    mean arr = b / fromIntegral a
      where
        k (n :*: s) a = n+1 :*: s+a
        a :*: b = foldlU k (0 :*: 0) arr :: (Int :*: Double)

    main = printf "%f\n" . mean $ enumFromToFracU 1 1e9

ghc -O2

    $ time ./A
    500000000.067109
    ./A  3.69s user 0.00s system 99% cpu 3.692 total

Versus on lists:

    import Data.List
    import Text.Printf
    import Data.Array.Vector

    mean :: [Double] -> Double
    mean arr = b / fromIntegral a
      where
        k (n :*: s) a = (n+1 :*: s+a)
        (a :*: b) = foldl' k (0 :*: 0) arr :: (Int :*: Double)

    main = printf "%f\n" . mean $ [1 .. 1e9]    

    $ time ./A    
    500000000.067109
    ./A  66.08s user 1.53s system 99% cpu 1:07.61 total

Note the use of strict pairs. Key to ensuring  the accumulators end up in
registers.    The performance difference here is due to fold (and all left
folds) not fusing in normal build/foldr fusion.

The vector version runs about the same speed as unoptimsed C.

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Abhay Parvate :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I don't know why, but perhaps beginners may expect too much from the laziness, almost to the level of magic (me too, in the beginning!). In an eager language, a function like

mean :: (Fractional a) => [a] -> a

expects the *whole* list before it can calculate the mean, and the question of the 'mean' function consuming memory does not arise. We look for other methods of finding the mean of very long lists. We do not expect such a function in C or Scheme to succeed when the number of numbers is more than that can fit the memory. (It will not even be called; the list creation itself will not succeed.) Lazy languages allow us to use the same abstraction while allowing doing more. But it is not magic, it is plain normal order evaluation. Just as every Scheme programmer or C programmer must understand the consequences of the fact that the arguments to a function will be evaluated first, a Haskell programmer must understand the consequences of the fact that the arguments to a function will be evaluated only when needed/forced. Perhaps an early emphasis on an understanding of normal order evaluation is needed while learning Haskell in order to stop expecting magic, especially when one comes prejudiced from eager languages.

Regards
Abhay


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Brandon S. Allbery KF8NH :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On 2008 May 12, at 22:18, Jeff Polakow wrote:

Then, I immediately blow my stack if I try something like:

    mean [1..1000000000].

The culprit is actually sum which is defined in the base libraries as either a foldl or a direct recursion depending on a compiler flag. In either case, the code is not strict enough; just trying to compute:

     sum [1..10000000]

There's also an insufficient-laziness issue with enumerations in at least some versions of the standard library, IIRC.  meaning that just saying [1..10000000] can introduce a space leak that can lead to a stack blowout.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@...
system administrator [openafs,heimdal,too many hats] allbery@...
electrical and computer engineering, carnegie mellon university    KF8NH



_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Darrin Thompson :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, May 13, 2008 at 2:20 AM, Don Stewart <dons@...> wrote:
>  Note the use of strict pairs. Key to ensuring  the accumulators end up in
>  registers.    The performance difference here is due to fold (and all left
>  folds) not fusing in normal build/foldr fusion.
>
>  The vector version runs about the same speed as unoptimsed C.
>

These "tricks" going into Real World Haskell? When you say someone
needs to get familiar with the "STG paper" it scares me (a beginner)
off a little, an I've been making an effort to approach the papers. I
could barely understand the Fusion one and getting familiar with
compiler internals sounds like something I'd not be ready for.
Probably if I really looked at ghc-core I'd be pleasantly surprised
but I'm totally biased against even looking. Gcc is hard to read, thus
ghc is also. So while you are right about all this when you say it, I
think your goal is to persuade. RWH has some of the best practical
prose I've read yet. Well done there. Hopefully chapter 26 will be
crammed full of this stuff?

--
Darrin
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: GHC predictability

by Achim Schneider :: Rate this Message: