Re: how is Haskell not robust?

Adrian Hey <ahey@xxxxxxxxxxxxxxxxxxx> wrote:

The powerset example was totally predictable to me. I knew how it
would use space by just looking at it.

The point I'm making is that nobody can predict it's space behaviour
without knowing the context in which it's used (which we don't).

That's a good point actually. But there are actually two variants:
lazy and strict. In general, lazy is assumed. If that's not the case,
the documentation should state it.

Also the difference between your two variants were also totally
clear. The first is memoizing, thus less lazy in favor of being
more efficient as a whole. The second is not memoizing, thus
preferring laziness.

The difference in *possible* space behaviours of the two variants
might well be clear, but you still can't know the actual space
behaviour of either in the absence of context information.

In the worst case the function uses O(2^n) memory.

There are thunks. Those are either evaluated or not. There is
demand. Demand makes thunks get evaluated. That's it. I'm not
sure in what words the H98 standard specifies this, though.

It doesn't AFAICS. I think you're describing ghc here, not Haskell.

AFAIK there is really a clear specification of what 'non-strict
semantics' are. GHC implements it in this way. Even if the actual
implementation is different, you can still use these terms.

It doesn't mention the term 'graph' anywhere, but that's also not

Only if we don't care about space behaviour.

I think the above definition lets you derive space behaviour as well.

isPrime :: Integral i => i -> Bool
isPrime n = all (\d -> n `rem` d /= 0) . takeWhile (\p -> p*p <= n) $ primes

primes :: Integral i => [i]
primes = 2 : filter isPrime [3..]

Uses O(1) memory, no matter which prime you pick by index (!!).

main :: IO ()
main = print (primes !! 1000000) >> print (primes !! 1000000)

Compiling (with -fno-cse just to be sure) and running gives..

<A long pause and about 23 GiB heap allocation later..>
<Then "immediately"..>

So it sure looks like the entire list of the first 1000001 primes has
been retained between the two calls (as one would expect even from
informal semantics you described above). Also if I comment out the
second print I still get 23 GiB heap allocation (not half this). I
also get about 16 MiB heap residency in each case.

Have I misunderstood what you mean by "Uses O(1) memory, no matter
which prime you pick"?

Okay, forget that example. It uses O(1) memory, if you enable
optimization (-O), otherwise the list fusion optimization doesn't take
place. Also memoizing keeps the entire list in memory, not just the
element calculated. But you could replace 'primes' by a recursive
function, though. In fact, the space leak is the correct behaviour.

The reason why it doesn't use twice 23 GiB heap is simple: 'primes' is
memoized. Remember that it's not a function, but a value. Your
indexing with 'primes !! 1000000' doesn't parameterize 'primes', but it
parameterizes (!!).

What I don't understand is that it uses this huge amount of memory. It
used only a few MiB for me.

In Haskell the second definition _is_ always the best. This is
because of semantics. It will also use O(2^n) memory.

Not according to my maths. Recalling the definition..

powerset [] = [[]]
powerset (x:xs) = powerset xs ++ map (x:) (powerset xs)

If C(N) is the number of "cons cells" needed to retain the entire
result then the recurrence relation we need to solve is:

C(0) = 1
C(N) = 2*C(N-1) + 2^(N-1)

The exact solution of which is: C(N) = (N+2)*(2^(N-1))
C(N) is O(N*(2^N)) IOW.

Well, I didn't count individual cells. Also there is only one sublist
that really takes 'n' cells. However, this is worst case behaviour,
which happens, when the entire powerset list is used. Otherwise you
have list fusion optimization, which can reduce memory usage to O(2^n)
with respect to individual cells.

This "retaining in memory" is called memoizing in Haskell.

This seems to be a rather non-standard use of the term "memoizing",
but I know what you mean, you know what I mean, so let's not argue
about it.

Values with names are always kept, as long as they are in scope and
there is still some reference to it.

Indeed. So how do reconcile this with your earlier assertion that
primes "uses O(1) memory, no matter which prime you pick by index"?

If primes is still referenced it uses (retains) O(N) memory where N is
in the number of primes less than or equal to the largest prime that
has been demanded so far.

Or maybe that should be O(N*LOG(N)) memory if we include the growth of
bits needed to actually represent ever increasing primes. Or something
like that..

Yes, that's expected behaviour. As noted earlier, I have picked a bad
example, because I had list fusion optimization in mind. Replacing
'primes' by a function would solve that problem.

There is no implicit sharing. Sharing is always done through
memoization. I repeat: Haskell's semantics are very simple, even
though for some reason you just don't want to believe it.

That isn't what I said.

(sub-graph) sharing semantics are not specified, but let's assume that
if they were specified they would indeed be very simple. It does not
follow that the space behaviour of real programs is at all simple to
understand. Newtons law of gravitation is very simple, but a solution
to the N-body problem is far from simple for N>2.

If you say that you understand the space behaviour of all your Haskell
programs, then fine. But I think the rest of us mere mortals will have
to continue living in ignorance or tinkering about with heap profilers
and wotnot.

At some point I learned it. Since then the behaviour of my programs
seems natural to me. I think everybody can learn it. Of course
Haskell's semantics are harder than C's semantics, but so what? If it
pays off, I learn it.


nightmare = unsafePerformIO (getWrongWife >>= sex)