# Re: how is Haskell not robust?

*From*: Ertugrul Söylemez <es@xxxxxxxx>

*Date*: Mon, 1 Jun 2009 13:06:53 +0200

Adrian Hey <ahey@xxxxxxxxxxxxxxxxxxx> wrote:

> > The powerset example was totally predictable to me. I knew how it
> > would use space by just looking at it.
>
> The point I'm making is that nobody can predict its space behaviour
> without knowing the context in which it's used (which we don't).

That's a good point, actually. But there are two variants: lazy and
strict. In general, the lazy one is assumed; if that's not the case,
the documentation should state it.
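(A standard example of such a pair, by way of my own illustration
rather than anything from this thread, is the lazy and strict left
fold in Data.List:)

import Data.List (foldl')

lazySum, strictSum :: [Integer] -> Integer
lazySum   = foldl  (+) 0   -- lazy accumulator: builds a chain of O(n) thunks
strictSum = foldl' (+) 0   -- strict accumulator, documented as such:
                           -- runs in constant space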

> > Also, the difference between your two variants was totally clear.
> > The first is memoizing, thus less lazy, in favour of being more
> > efficient as a whole. The second is not memoizing, thus preferring
> > laziness.

> The difference in *possible* space behaviours of the two variants
> might well be clear, but you still can't know the actual space
> behaviour of either in the absence of context information.

In the worst case the function uses O(2^n) memory.
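(For concreteness, the two variants under discussion were presumably
along these lines; the second appears verbatim later in this thread,
the memoizing first one is my reconstruction:)

powersetM :: [a] -> [[a]]
powersetM []     = [[]]
powersetM (x:xs) = ps ++ map (x:) ps
  where ps = powersetM xs   -- the recursive result is named, hence
                            -- computed once and shared (memoized)

powersetL :: [a] -> [[a]]
powersetL []     = [[]]
powersetL (x:xs) = powersetL xs ++ map (x:) (powersetL xs)
                            -- recomputed, nothing shared: lazier,
                            -- but does the recursive work twice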

> > There are thunks. Those are either evaluated or not. There is
> > demand. Demand makes thunks get evaluated. That's it. I'm not
> > sure in what words the H98 standard specifies this, though.
>
> It doesn't AFAICS. I think you're describing GHC here, not Haskell.

AFAIK there really is a clear specification of what 'non-strict
semantics' means, and GHC implements it in this way. Even if the
actual implementation is different, you can still use these terms.

> > It doesn't mention the term 'graph' anywhere, but that's also not
> > necessary.
>
> Only if we don't care about space behaviour.

I think the above definition lets you derive space behaviour as well.
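For instance (my example, not from the report; it just illustrates
thunks, demand, and retention of named results):

big :: Integer
big = sum [1 .. 10 ^ 7]      -- a thunk; nothing is computed yet

main :: IO ()
main = do
  let pair = (big, big + 1)  -- still no demand; both components are thunks
  print (fst pair)           -- demand reaches 'big', so it is evaluated here
  print (snd pair)           -- 'big' is already evaluated; only (+ 1) is new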

> > isPrime :: Integral i => i -> Bool
> > isPrime n = all (\d -> n `rem` d /= 0) . takeWhile (\p -> p*p <= n) $ primes
> >
> > primes :: Integral i => [i]
> > primes = 2 : filter isPrime [3..]
> >
> > Uses O(1) memory, no matter which prime you pick by index (!!).

> main :: IO ()
> main = print (primes !! 1000000) >> print (primes !! 1000000)
>
> Compiling (with -fno-cse just to be sure) and running gives..
>
> <A long pause and about 23 GiB heap allocation later..>
>
> 15485867
>
> <Then "immediately"..>
>
> 15485867
>
> So it sure looks like the entire list of the first 1000001 primes has
> been retained between the two calls (as one would expect even from the
> informal semantics you described above). Also, if I comment out the
> second print I still get 23 GiB heap allocation (not half this). I
> also get about 16 MiB heap residency in each case.
>
> Have I misunderstood what you mean by "Uses O(1) memory, no matter
> which prime you pick"?

Okay, forget that example. It uses O(1) memory if you enable
optimization (-O); otherwise the list fusion optimization doesn't take
place. Also, memoizing keeps the entire list in memory, not just the
element calculated. You could replace 'primes' by a recursive
function, though. In fact, the space leak is the correct behaviour.
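A sketch of what I mean (my code, just an illustration): a recursive
function computes each prime on demand and shares nothing, so nothing
is retained between calls.

-- Plain trial division; slower than the shared 'primes' list, but
-- nothing is memoized, so each call starts from scratch.
isPrimeTD :: Integer -> Bool
isPrimeTD n = n == 2 || (n > 2 && odd n && go 3)
  where
    go d | d * d > n      = True
         | n `rem` d == 0 = False
         | otherwise      = go (d + 2)

-- The k-th prime, counting from zero: nthPrime 0 == 2, nthPrime 1 == 3, ..
nthPrime :: Int -> Integer
nthPrime k = iterate next 2 !! k
  where next p = head (filter isPrimeTD [p + 1 ..])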

The reason why it doesn't use twice 23 GiB of heap is simple: 'primes'
is memoized. Remember that it's not a function, but a value. Your
indexing with 'primes !! 1000000' doesn't parameterize 'primes'; it
parameterizes (!!).

What I don't understand is the huge amount of memory it uses. It used
only a few MiB for me.

> > In Haskell the second definition _is_ always the best. This is
> > because of semantics. It will also use O(2^n) memory.

> Not according to my maths. Recalling the definition..
>
> powerset [] = [[]]
> powerset (x:xs) = powerset xs ++ map (x:) (powerset xs)
>
> If C(N) is the number of "cons cells" needed to retain the entire
> result, then the recurrence relation we need to solve is:
>
> C(0) = 1
> C(N) = 2*C(N-1) + 2^(N-1)
>
> The exact solution of which is: C(N) = (N+2)*(2^(N-1))
>
> C(N) is O(N*(2^N)), IOW.
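(A quick check of that closed form against the recurrence, for the
curious; the helper names here are mine, purely for verification:)

c :: Integer -> Integer
c 0 = 1
c n = 2 * c (n - 1) + 2 ^ (n - 1)

closedForm :: Integer -> Integer
closedForm n = (n + 2) * 2 ^ n `div` 2   -- (N+2)*2^(N-1), valid at n = 0 too

check :: Bool
check = all (\n -> c n == closedForm n) [0 .. 20]   -- evaluates to True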

Well, I didn't count individual cells. Also, there is only one sublist
that really takes 'n' cells. However, this is worst-case behaviour,
which happens when the entire powerset list is used. Otherwise you
have the list fusion optimization, which can reduce memory usage to
O(2^n) in terms of individual cells.

This "retaining in memory" is called memoizing in Haskell.

This seems to be a rather non-standard use of the term "memoizing",

but I know what you mean, you know what I mean, so let's not argue

about it.

> > Values with names are always kept, as long as they are in scope and
> > there is still some reference to them.
>
> Indeed. So how do you reconcile this with your earlier assertion that
> primes "uses O(1) memory, no matter which prime you pick by index"?
>
> If primes is still referenced, it uses (retains) O(N) memory, where N
> is the number of primes less than or equal to the largest prime that
> has been demanded so far.
>
> Or maybe that should be O(N*LOG(N)) memory if we include the growth of
> the bits needed to actually represent ever-increasing primes. Or
> something like that..

Yes, that's expected behaviour. As noted earlier, I picked a bad
example, because I had the list fusion optimization in mind. Replacing
'primes' by a function would solve that problem.
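To make the "values with names are kept" point concrete, here is a
small illustration (my example):

retained :: Int
retained = sum xs + length xs
  -- 'xs' is named and shared by both traversals, so the whole list
  -- stays in memory until 'length' has finished.
  where xs = [1 .. 1000000] :: [Int]

unretained :: Int
unretained = sum [1 .. 1000000 :: Int] + length [1 .. 1000000 :: Int]
  -- Two unnamed lists, each built and collected lazily (assuming the
  -- compiler does not common them up; cf. -fno-cse above).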

> > There is no implicit sharing. Sharing is always done through
> > memoization. I repeat: Haskell's semantics are very simple, even
> > though for some reason you just don't want to believe it.
>
> That isn't what I said.
>
> (Sub-graph) sharing semantics are not specified, but let's assume that
> if they were specified they would indeed be very simple. It does not
> follow that the space behaviour of real programs is at all simple to
> understand. Newton's law of gravitation is very simple, but a solution
> to the N-body problem is far from simple for N > 2.
>
> If you say that you understand the space behaviour of all your Haskell
> programs, then fine. But I think the rest of us mere mortals will have
> to continue living in ignorance, or tinkering about with heap profilers
> and wotnot.

At some point I learned it. Since then the behaviour of my programs
has seemed natural to me. I think everybody can learn it. Of course
Haskell's semantics are harder than C's, but so what? If it pays off,
I learn it.

Greets,

Ertugrul.

--

nightmare = unsafePerformIO (getWrongWife >>= sex)

http://blog.ertes.de/
