Re: Jonesforth and Hayes CORE tests



Anton Ertl <anton@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Andrew Haley <andrew29@xxxxxxxxxxxxxxxxxxxxxxx> writes:
[Traditional ITC implementation of CREATE...DOES> and cache
consistency issues:]
Actually, that's not quite true: there is a pathological case where it
could happen, something like:

: array create does> cells + ;
array foo 20 cells allot

Here, the data may end up in the same cache line as the defining word,

Yes. These kinds of problems have plagued various Forth systems since
the instruction cache was separated from the data cache in the
Pentium. E.g., such an issue is the reason why BigForth is about 30
times slower than iForth on cd16sim, and probably also why BigForth is
slower than Gforth on brew and lexex.

You may consider it pathological, but it still occurs in real-world
code, and pretty often. The main programs where it occurs rarely are
the small benchmarks that are often used to evaluate performance.

I would have expected that precisely the reverse was true, since small
benchmarks are where small defining words and their children are
likely to be close enough to be on the same cache line.

Andrew.