Re: forth and virtual memory



On Apr 28, 1:21 pm, Thomas Pornin <por...@xxxxxxxxx> wrote:
According to gavino <gavcom...@xxxxxxxxx>:

http://varnish.projects.linpro.no/wiki/ArchitectNotes

This is an impressive read... the guy seems to do some smart
things by letting the OS kernel do a lot of the work it's designed for,
and he does some tricks to avoid system calls and keeping things in
filesystems that are expensive.

Another way to see this text is the following: this guy read the man
page for the mmap() system call and thought he was Enlightened. He's not
the first. Using a file-backed memory area and relying on the kernel and
the MMU to move things around between the RAM and the disk is not a new
idea. I even did it myself ten years ago when writing a C preprocessor
(I had weird hobbies at that time), and that was no invention of mine
either.
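
A minimal sketch of the file-backed mapping technique described above, written in Python rather than C for brevity. The file name `store.bin`, the 4096-byte size, and the helper names are assumptions made for illustration; they come from neither Varnish nor the preprocessor mentioned here.

```python
import mmap
import os
import tempfile


def write_through_mapping(path: str, payload: bytes, size: int = 4096) -> bytes:
    """Store payload in a file through a memory mapping, then read it back."""
    # The file must already be as large as the mapping.
    with open(path, "wb") as f:
        f.truncate(size)

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as view:
            # An ordinary memory write; the kernel and the MMU decide when
            # the dirty page actually travels to the disk.
            view[:len(payload)] = payload
            view.flush()  # request write-back now instead of waiting

    # Read back through the regular file API to confirm the data landed.
    with open(path, "rb") as f:
        return f.read(len(payload))


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        return write_through_mapping(os.path.join(tmp, "store.bin"), b"cached object")


if __name__ == "__main__":
    print(demo())
```

The application never issues an explicit write for the payload; it only touches memory, which is the whole appeal of the approach and also the reason it gives up control over paging decisions.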

That technique is no silver bullet. On a modern operating system, the
part which handles the MMU (often dubbed the "kernel") tries to do smart
things when it comes to deciding which chunk of data should reside in
RAM and which should not; but it does so "generically", without really
knowing what the application will do. Distinct applications have
distinct access patterns and distinct needs. Letting the kernel do the
job means that you believe in the "one size fits all" concept, which
often turns out to have only limited applicability to what is
colloquially known as the "real world".

Filling a file from a virtual memory view also often leads to high
filesystem fragmentation, unless you preallocate the complete file,
which is almost invariably a waste of space on a non-dedicated system.
Besides, relying entirely on the MMU means that you merge the address
space and the disk space, which implies that on a 32-bit system, you
will have trouble handling more than 2 gigabytes of data.
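
For reference, a 32-bit pointer can address 2**32 bytes (4 GiB), and the kernel reserves part of that range for itself, which is where a limit of this order comes from. One common workaround, not discussed in this post, is to map only a window of the file at a time. The sketch below assumes a hypothetical helper `read_byte_windowed` and a small sparse test file; it illustrates the idea and is not how any particular cache does it.

```python
import mmap
import os
import tempfile

# Mapping offsets must be a multiple of the allocation granularity.
WINDOW = mmap.ALLOCATIONGRANULARITY


def read_byte_windowed(path: str, position: int) -> int:
    """Read one byte at `position` by mapping only the window that holds it."""
    file_size = os.path.getsize(path)
    if not 0 <= position < file_size:
        raise IndexError("position outside file")
    base = (position // WINDOW) * WINDOW      # aligned start of the window
    length = min(WINDOW, file_size - base)    # last window may be shorter
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ,
                       offset=base) as view:
            return view[position - base]


def demo() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "big.bin")
        position = 2 * WINDOW + 5
        with open(path, "wb") as f:
            f.truncate(3 * WINDOW)   # reserve the full length up front
            f.seek(position)
            f.write(b"\x5a")         # marker byte to find again
        return read_byte_windowed(path, position)


if __name__ == "__main__":
    print(hex(demo()))
```

The price is that the application is back to managing which part of the file is visible, which is exactly the bookkeeping that a whole-file mapping was supposed to remove.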

Some research projects have gone much farther along those lines. For
instance, the EROS operating system was built around the idea of
"checkpointing", in which not only the RAM is used as a cache over the
disk, but the system arranges to save its complete state at regular
intervals, transactionally, so that power failures can be mostly ignored;
this can be achieved efficiently with the MMU. The EROS project was
launched in 1991.

That modern systems have layers of cache to cope with a slow RAM and
very slow disk is not a novel idea either. It is true -- and the guy is
right to mind cache issues -- but as far as I can see, the world of
programming did not wait for him to get the point.

Can Forth use this level of detail when it exists on Linux? How does
Forth interact with virtual memory?

The issue is mostly orthogonal. On a Linux system, the kernel handles
the MMU, and applications talk to the kernel through system calls
(traditionally implemented as software-triggered interrupts). Low-level
programming languages allow the application programmer to issue direct
system calls, and Forth is no different from C in that respect.
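
As a rough illustration that reaching the kernel is a matter of calling convention rather than of language, the sketch below calls the C library's `getpid` wrapper through a foreign-function interface. It is in Python, not Forth, and assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library; the helper name `pid_via_libc` is made up for the example.

```python
import ctypes
import os


def pid_via_libc() -> int:
    """Ask the kernel for our process ID via the C library's getpid wrapper."""
    # CDLL(None) exposes symbols already loaded into the process (POSIX only).
    libc = ctypes.CDLL(None)
    libc.getpid.restype = ctypes.c_int
    return libc.getpid()


if __name__ == "__main__":
    # Both paths end in the same system call, so the answers should agree.
    print(pid_via_libc() == os.getpid())
```

A Forth system with a foreign-function or system-call word can presumably reach mmap() in the same manner, although the exact words vary between implementations.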

Of course, most Forth usage is on much smaller systems which do
not have an MMU, and usually no disk either, which makes the whole point
quite moot.

There _are_ some cache-related optimizations which are not directly
usable in Forth -- and not in C either. As was pointed out before, cache
issues dominate the question of performance in most CPU-intensive tasks.
Best performance is achieved when memory accesses are performed in a way
which fits the behaviour of caches well. When the programmer uses a
strongly-typed programming language, it becomes possible for the
operating system (or virtual machine implementation) to actually _move_
objects in RAM dynamically to better fit the access patterns. This is a
feature of modern Garbage Collectors and is applicable only if the GC
precisely knows which data element is a pointer and which is not
(because moving an object also means adjusting the references to that
object). Standard Forth is somewhat typeless; a GC for Forth has been
published, but it is a conservative GC which does not move objects
around. There are some Forth-like systems with types, which may allow
for advanced GC techniques (e.g. StrongForth).
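
A toy sketch of why a moving collector needs precise type information. The heap is modelled as a Python list, "addresses" are list indices, and every field carries an explicit tag saying whether it is a pointer. The tag scheme and the `Obj` and `compact` names are assumptions for the example; they do not describe any real Forth or Java collector.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Each field is tagged: ("ptr", heap_index) or ("int", plain_value).
Field = Tuple[str, int]


@dataclass
class Obj:
    name: str
    fields: List[Field]


def compact(heap: List[Obj], roots: List[int]) -> Tuple[List[Obj], List[int]]:
    """Copy reachable objects next to each other and rewrite every pointer."""
    forward: Dict[int, int] = {}   # old index -> new index
    order: List[int] = []          # old indices, in their new layout order

    def visit(old: int) -> None:
        if old not in forward:
            forward[old] = len(order)
            order.append(old)

    for root in roots:
        visit(root)
    scan = 0
    while scan < len(order):       # breadth-first walk over pointer fields
        for tag, value in heap[order[scan]].fields:
            if tag == "ptr":
                visit(value)
        scan += 1

    # Only fields tagged "ptr" are adjusted; integers are left untouched.
    new_heap = [
        Obj(heap[old].name,
            [(tag, forward[value]) if tag == "ptr" else (tag, value)
             for tag, value in heap[old].fields])
        for old in order
    ]
    return new_heap, [forward[root] for root in roots]


def demo() -> Tuple[List[Obj], List[int]]:
    heap = [
        Obj("unused-a", []),
        Obj("head", [("ptr", 3), ("int", 3)]),
        Obj("unused-b", []),
        Obj("tail", [("int", 7)]),
    ]
    return compact(heap, roots=[1])


if __name__ == "__main__":
    print(demo())
```

In `demo`, the object `head` holds both a pointer to index 3 and the integer 3. After compaction the pointer becomes 1 while the integer stays 3. A conservative collector, which cannot tell the two apart, has no safe way to make that change and therefore must leave objects where they are.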

If simply using mmap() is 2006-programming, then improved cache
locality through an object-moving GC must be from year 2015, at least.
Strangely enough, Sun Microsystems has sold the idea at industrial
levels for the last decade (under the name "Java").

--Thomas Pornin

Wait, since this is a cache program, it HAS to be good with 2G+ ...


