Re: forth and virtual memory



According to Anton Ertl <anton@xxxxxxxxxxxxxxxxxxxxxxxxxx>:
Writing a file with mmap() is more complex than reading a file with
mmap(); IIRC you have to specify the file length in advance with
truncate() or ftruncate(), so in some sense you do pre-allocate the
complete file (although on many OSs (f)truncate() alone creates a
sparse file, which does not need much space).

That's the problem, actually. On most Unix-like systems, ftruncate()
creates a sparse file, which means that the space is not really
allocated. When the application writes to the pages, the OS will
allocate space for those pages, but if the write are performed somewhat
randomly in the file, then this will lead to severe fragmentation on the
filesystem (filesystem allocation heuristics are tuned for low
fragmentation when new files are written linearly).

You avoid that effect by first filling the file with random data,
linearly, which forces the OS to allocate blocks in a proper way (i.e.
most sequential on the disk). But this means that the complete space is
used, mostly for junk data.


That's a nice dream, but in the real world I have yet to see a case
where that effect outweighs the overhead of copying garbage
collection.

The overhead of copying garbage collection can be very small, even
negative, in particular in systems with little memory. When the GC can
move objects, then allocation may become a simple matter of moving up a
pointer, whereas a non-copying GC needs to locate a hole of the suitable
size. The strong runtime typing also helps, because the size of an
object can then be deduced from the object alone, without need for extra
administrative data.

I did write a Java-to-C translator for an embedded system; I programmed
a "mark-compact" GC which was not the fastest GC ever, but it turned out
to be appropriate. It used only a few percents of the computing time,
allowed for very fast allocation (just bump up the pointer), and it
could handle low memory situations satisfactorily (i.e. more than 90% of
available RAM used for live objects, with no extra administrative
pointers between objects, and elementary allocation still worked).
Non-copying allocators (GC or not) usually have trouble coping with such
usage rates due to fragmentation (many small holes, but no "big enough"
hole).

Hence, if the overhead is small, it is not difficult to outweigh it.


This just states that garbage collection, when properly done, is not
expensive, and speeds up things through lower memory consumption (both
for fragmentation issues, and because GC encourages object sharing).
That it speeds up things through better cache coherency is another
matter. Jones and Lins dedicate a full chapter to that subject in their
book "Garbage Collection, Algorithms for Automatic Dynamic Memory
Management" (a good book which I recommend, by the way). Their measures
are from about ten years ago and should be actualized.


Yes, and it's not known as a champion of high performance.

What Java is known for, and what it actually does, are two distinct
things.

According to my tests, the Java JIT compiler is somewhat equivalent (on
x86 32-bit hardware) to what "gcc -O" provides. However, if the code
uses arrays, then performance drops, because Java checks bounds on array
accesses, whereas C does not; hence, it is common to observe
computationaly intensive algorithms which run 2 or 3 times slower in
Java than in C.

Garbage collection, however, is not an issue here. Still according to my
tests, the GC itself uses an almost negligible amount of CPU. Rewriting
the code to reuse already allocated objects instead of allocating new
ones often leads to degraded performance.

(Microsoft, for its C#/.NET/whatever system, added a way to include
some "unsafe" code which basically avoids the overhead of checking
array bounds.)


Yet, these advantages do not seem to outweigh the overheads of JIT
ompilation. AFAIK Andrew Haley is still working on static Java
compilation.

The main impediment to Java speed is that Sun engineers have figured out
that performance is overrated (especially the kind of performance which
is measured by microbenchmarks) and thus concentrate on other areas
(e.g. properly scaling to machines with dozens of processors). What Sun
engineers have NOT figured out is that microbenchmarks are an important
part of how people _feel_ about a language; not caring much about them,
though a rational stance, is thus Bad Public Relations.

Forth users should recognize a familiar pattern here.


--Thomas Pornin
.



Relevant Pages

  • Re: forth and virtual memory
    ... too, maybe even the same order, so ordering the blocks by allocation ... on systems with too little memory ... What Java is known for, and what it actually does, are two distinct ... My measurements indicate that some of the benchmarks (from SpecJVM98, ...
    (comp.lang.forth)
  • Re: C++ vs Java "new" (no flame war please!)
    ... of such objects that would fit in available memory. ... memory and much more time than would the JVM for the equivalent Java ... The "allocation" ... If you have a small object with a string, ...
    (comp.lang.java)
  • Re: D gets it right
    ... >releasing memory properly using good old free, ... Java and ... (define (make-adder k) ... stack-based allocation method like that espoused by propoents of RAII, ...
    (comp.programming)
  • Re: C++ vs Java "new" (no flame war please!)
    ... such objects that would fit in available memory. ... you have a small object with a string, ... so it would only do one allocation for the Foo object and none for the String. ... Even if one did copy the String, the time overhead of the Java allocation and copy would be much less than for the C++ code you showed. ...
    (comp.lang.java)
  • Re: Memory Allocation in ASM code
    ... I'm currently trying to implement a very crude compression algorithm. ... there is a need to create some temporary memory to ... if i start a garbage collection (using SysRPL GARBAGE ... I guess that all the problems lies in the temporary memory allocation ...
    (comp.sys.hp48)