Re: RAD vs. performance




"Dr Jon D Harrop" <jon@xxxxxxxxxxxxxxxxx> wrote in message
news:1151595430.635899.320300@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
cr88192 wrote:
"Jon Harrop" <usenet@xxxxxxxxxxxxxx> wrote in message
When I converted a graphical application from C++ to OCaml the worst
case
performance got 5x faster because OCaml's GC spread the work across
several
collections where as the C++ STL amortised deallocations at the end of
scopes.

can't say I have experienced this personally.

note I am not using any external GC, I wrote the GC myself.

in my case, I am using ref counting, which isn't all that fast, but does
generally avoid sudden delays (as can be incured during mark/sweep), so
is
probably better for real-time stuff (interactive programs).

On the contrary, ref counting is very prone to stalls whereas OCaml's
incremental collector is not (except for arbitrary-size atoms,
including the stack and arrays of pointers). Mathematica uses reference
counting and, consequently, suffers from long stalls when you delete a
big notebook cell.


hmm, that one may be worth noting...

a possible hack here could be to bound the number of objects that can be
freed at once. should the limit be reached, whatever is left is added to a
"to be removed when reasonable" queue.

however, it can be noted that, even as such, such cases are likely to be
rare (typically in my experience abount the only really large objects are
the toplevel scopes, and presumably they are not likely to be freed so
often...).

other things would include, eg, parts of the scene graph, ...


note that some of these kinds of programs can be fairly sensitive, eg, to
a
level where one has to accumulate the time per frame, and step in fixed
amounts, as irregular delays can mess up stability (one steps, eg, in
fixed
10ms steps or similar, with an OS timer that may chaotically return 0,
16,
or 32ms).

Yes, my graphical application is soft real-time, so the worst-case
performance is very important.

yeah.

or, in the case of a user, for some kinds of apps, unexpected delays
(even
small ones) can be an annoyance.

Absolutely.

also, ref counting is capable of releasing memory as soon as it is no
longer
needed, wheras with a real gc one may have to wait, and thus a lot more
heap
overhead may be incurred during operation.

But reference counting requires references, which use up a lot of
memory.


errm, I don't keep inverse references or anything, only a count (gathering
inverse references would be possible, but is likely to be a very expensive
operation involving scanning over the roots and heap...).

this count costs 3 bits (4 actually, if one takes into account the 'protect'
flag, which is set in some situations to indicate that the object should not
be freed temporarily, with a 'locked' state used for long-term non-freeable
objects).

the other bits used (per-cell) include 2 bits for a cell type, and 2 more
used for holding the current mark state (white, black, grey, locked). this
state is used by the mark/sweep collector ('grey' is a conventional holdover
from when I wrote an incremental gc, and also because I use 2 bits).

since each cell is 8 bytes, this leads to about 1/8 heap overhead (12.5%).
the initial heap size is 512kB, so this is an overhead of about 64kB
(initial heap is 64k cells).

in my case, the recursive fib(28) operates without allocating any additional
cons cells (from the default cons pool), which is 256 cells (or about 2kB),
and completes in a few seconds (unsure exactly, wasn't using a timer).

algo: fib(x)if(x>2)fib(x-1)+fib(x-2) else 1;
so, overhead is good (prior to getting the ref-counting working, each call
would end up using up the whole heap, probably incurring several gc phases).


actually, since this time around I am doing ref counting, which requires
traping all assignments to any roots/object slots, there is little stopping
me from adding code for a write barrier (needed for incremental gc, at one
point I used mmap/mprotect, but similar does not work as well on
windows...). then again, I would have to decide between true multithreading,
or a time-bound loop (more often used for thread simulation in my apps, as I
tend to write frame-driven, single threaded apps).


for now though, I will try to keep garbage low, and fall back to the main gc
only when necessary (and tolerate it's imposed delay).

another possibility could be keeping hueristics, eg, for when a gc may make
sense, and an app can include some calls to a "gc may make sense about now"
type function (presumably when something not timing critical is going on,
eg, performing a file/image save, ...).


an incremental gc can avoid many of these problems, but then one has the
problem that these tend to be non-portable, and a little more work to
implement (thus far, I have only really written one of these).

Yes. OCaml's GC took a lot of work. There was once a concurrent
version...

How do you collect cycles?


luckily, they should be rare (and reserved mainly, eg, for things like the
environment graph or scene graph, and not so much inner data, which is more
likely to be short lists or atomic values).

actually, the scene graph may or may not be stored in the vm's heap anyways,
but may be kept in whatever apps' heap (likely to use plain conservative
mark/sweep, as ref-counting is imo a major pita for generic c code).


in my case, cycles will end up waiting around until the main gc (mark/sweep)
kicks on.


Cheers,
Jon.



.



Relevant Pages

  • Re: one thread is created for each object construction?
    ... the JVM creates one thread for each object construction? ... All Java objects live on the heap, all of the time, in any compliant ... JIT compiler statically proves never has references escape the local ... This includes the keys in WeakHashMap too -- the map's ...
    (comp.lang.java.programmer)
  • Re: to dispose or not ?
    ... The dotNet garbage collector is, at it's heart, a move to another heap style collector for the Gen0 and Gen1 heaps and a mark/sweep/compact collector for the Gen2 heap. ... Setting an object to Nothing simply releases the reference from the variable to the object in the heap. ...
    (microsoft.public.dotnet.languages.vb)
  • RE: Best Ref-counting algorithms?
    ... Operations on heap pointers are very rare, so even a slow GC doesn't ... even though there are many references (through ... the repeated storing and fetching of intermediate parse trees, ... In conjunction with segmented memory allocation, ...
    (comp.compilers)
  • Re: List mList;
    ... the references are the only thing on the stack. ... references (which are on the heap as well) to either structures or classes. ... //Each type has 2 strings and a getter property for each string. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: "The Little Lisper"
    ... > eventually sending your computer to the swapping hell (unless heap ... Some clients of our collector do that, ... Linked queues and lazily evaluated lists seem to be the canonical, ... black-listing algorithm in the collector is likely to handle the ...
    (comp.lang.lisp)