Re: Code density and performance?



>>> On Tue, 02 Aug 2005 10:26:49 +0200, Jan Vorbrüggen
>>> <jvorbrueggen-not@xxxxxxxxxxx> said:

[ ... ]

>> (in olden times IBM, with their absurdly large 4KB pages, did
>> quite a bit of research on better locality and packing as
>> much temporally related stuff as possible into each of those
>> immense pages).

jvorbrueggen> That applies even to VAX/VMS with its small 512
jvorbrueggen> byte pages.

Indeed, because the long-forgotten statistics I had mentioned
show that programs that have not been specifically improved for
locality tend to have ''hotspots'' about 100-200 bytes long...
512 bytes is a decent compromise between too small and too
large for that sort of program.

jvorbrueggen> [ ... ] They noted that initialization of all of
jvorbrueggen> its components was calling little routines scattered
jvorbrueggen> all over the place, accessing small pieces of data
jvorbrueggen> scattered all over the place. [ ... ]

Precisely what I was hinting at.

Trouble is, most current ''unoptimized'' programs seem to be
like that, especially and tragically shared libraries.

But also, notoriously, things like the X11 reference server:
most of the time it just gets a request to blit some pixmap on
the screen, and the BLIT code is probably a few KB at most, yet
the server has an immense memory footprint because of the
extreme dispersal of the code and data on the path to that
code. Similarly for banal programs like 'xterm', which has a
memory footprint of several MB. And what to say of 'ls', with a
700KB memory footprint? On an x86 GNU/Linux system:

------------------------------------------------------------------------
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
18437 pts/0    S+     0:00     19   228  7411  2860  0.5 xterm
18597 pts/10   T      0:00      0    71  3736   740  0.1 ls
------------------------------------------------------------------------

Warning: the above numbers are difficult to interpret, but
they convey some order-of-magnitude horror.

I suspect that if the page size on that x86 went from 4KB to
512B the memory footprint of both 'xterm' and 'ls' above would
go down by a factor close to 8 (in particular because a large
part of the problem is that both depend on massive shared libs).
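To make the page-size argument concrete, here is a rough sketch in Python. It is a toy model, not a measurement: it assumes hotspots of 100-200 bytes scattered uniformly at random over a 64MB address space, and the names 'pages_touched', 'footprint' and 'scattered_hotspots' are invented for this illustration. It only counts the pages that the hotspots land on.

```python
import random

def pages_touched(hotspots, page_size):
    """Count the distinct pages covered by (start, length) byte ranges."""
    pages = set()
    for start, length in hotspots:
        first = start // page_size
        last = (start + length - 1) // page_size
        pages.update(range(first, last + 1))
    return len(pages)

def footprint(hotspots, page_size):
    """Resident bytes if every touched page must be held in memory."""
    return pages_touched(hotspots, page_size) * page_size

def scattered_hotspots(count, address_space, rng):
    """Assumed workload: 100-200 byte hotspots at random addresses."""
    return [(rng.randrange(address_space - 200), rng.randint(100, 200))
            for _ in range(count)]

rng = random.Random(1)  # fixed seed so runs are repeatable
spots = scattered_hotspots(500, 64 * 1024 * 1024, rng)
for size in (512, 4096):
    print(size, "byte pages:", footprint(spots, size), "bytes resident")
```

In this model the ratio between the two footprints can never exceed 8, and it falls somewhat short of 8 because a 100-200 byte hotspot straddles a 512B page boundary more often than a 4KB one. Real programs are not uniformly scattered, so this is only an order-of-magnitude illustration.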

jvorbrueggen> The non-locality is the problem, not the large
jvorbrueggen> page size.

Very funny :-). More explicitly: it is hard to do locality
improvement and clustering of data/code, and it hits diminishing
returns fairly fast.

For a similar and related issue, poor shared library
performance, a disconsolate and realistic note here:

http://WWW.LiveJournal.com/users/udrepper/2491.html

In any case as I wrote practically nobody can be bothered to do
it, so non-locality is a given, and then large page sizes just
exacerbate it.

jvorbrueggen> In any case, relative to memory sizes, page sizes
jvorbrueggen> have shrunk, not grown.

That's sort of irrelevant -- because memory footprint relates to
page size (for a given locality profile), not total memory size.

More memory simply means more memory wasting apps can fit into
it, which is an improvement of sorts (for RAM manufacturers :->).

jvorbrueggen> But performing a disk I/O still has an overhead
jvorbrueggen> that has grown only worse with time, again
jvorbrueggen> relative to the rest of the system.

So far so good, but the implications of this comment are quite
crazy: it implies that since the cost of an I/O operation is
high and mostly constant wrt page size, one might as well move
as much data as possible in or out per page operation, to
amortize that cost over as many bytes as possible, and large
pages do just that. This is a very popular argument indeed.

But what matters most is to reduce the number of such transfers,
not the cost per byte transferred. Bringing in data that is
going to be infrequently used, just because it is on the same
page as data that has just been referenced and is therefore
almost free in terms of I/O, costs many many times more than
the ''saving'': odds are that it will displace data that is
already in memory and that, since it has been kept there,
probably has a high frequency of use (but then many
contemporary systems like GNU/Linux use delirious page/buffer
replacement policies, so there).
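The displacement effect can be sketched with a toy simulation, again in Python. The assumptions are mine: a fixed 256KB of memory, plain LRU replacement (not what any particular kernel does), and 200 small objects of 128 bytes scattered at random and referenced uniformly. The names 'count_faults' and 'scattered_trace' are hypothetical. The quantity counted is the number of page transfers, not bytes moved.

```python
import random
from collections import OrderedDict

def count_faults(addresses, page_size, memory_bytes):
    """Count page-ins under LRU with memory_bytes // page_size frames."""
    frames = memory_bytes // page_size
    resident = OrderedDict()  # page number -> True, oldest first
    faults = 0
    for addr in addresses:
        page = addr // page_size
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            resident[page] = True
            if len(resident) > frames:
                resident.popitem(last=False)  # evict least recently used
    return faults

def scattered_trace(objects, references, address_space, rng):
    """Assumed workload: random references into scattered 128-byte objects."""
    starts = [rng.randrange(address_space - 128) for _ in range(objects)]
    return [rng.choice(starts) + rng.randrange(128) for _ in range(references)]

rng = random.Random(1)  # fixed seed so runs are repeatable
trace = scattered_trace(200, 20000, 64 * 1024 * 1024, rng)
for size in (512, 4096):
    print(size, "byte pages:", count_faults(trace, size, 256 * 1024), "transfers")
```

With 512B pages the whole hot set fits in the 256KB, so only the first touch of each page causes a transfer. With 4KB pages the same memory holds only 64 frames, fewer than the pages the hot objects are spread over, so useful pages keep getting displaced and the number of transfers is far higher even though each one moves 8 times as many bytes.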

jvorbrueggen> I have read that the only real advantage of WXP
jvorbrueggen> over W2K [ ... ] logs boot- and login-time disk
jvorbrueggen> accesses and reorders the files that are accessed
jvorbrueggen> such that head movements are reduced to a minimum.

That's about laying out things on disk, a bit different from
memory pressure minimization. Intel used to have for MS Windows
3.x or 9x some kind of EXE accelerator that worked much like
that.

[ ... ]