Re: RAFTS, compile on EXECUTE and the x86
- From: anton@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl)
- Date: Wed, 25 Aug 2010 12:07:20 GMT
Alex McDonald <blog@xxxxxxxxxxx> writes:
On 25 Aug, 11:44, an...@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl) wrote:
Also, the main point of doing the compilation at the latest
point-in-time possible is to be able to do optimizations that go
beyond single colon definitions, e.g., inlining or interprocedural
register allocation.
And the point of this paragraph is: If I am collecting code until the
last possible moment in order to see the maximum of code, so that I
can see optimization opportunities, it probably does not make much
sense to compile that big chunk in little pieces.
The code modification seems to upset the timings for several hundreds
if not thousands of iterations if I'm interpreting the output from
Code Analyst correctly.
That's strange. I would expect a big slowdown on the first time the
code is executed after it was changed, from the cache invalidation and
the need to reload the code through the memory hierarchy, but after
that the performance should be pretty stable. Maybe another slowdown
when the next piece of code is compiled, if it shares a cache line
with the current piece of code, but that should be it.
OTOH, who knows what goes on in the complex hierarchy of subsystems on
current CPUs. Or maybe the measurement infrastructure does not report
the numbers correctly.
The MPE benchmark may not be ideal for this
and might need lengthened;
Well, it rather seems like a best case: far fewer calls than normal
Forth code, and big basic blocks.
the execution times are measured in sub-seconds, generally around
100-200ms for most of the tests.
That sounds like plenty. That's more than enough time to load 1M
cache lines, and I very much doubt that your code is nearly as big.
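
A back-of-the-envelope check of that estimate, as a small Python sketch. The figures are assumptions for illustration only, not measurements: roughly 100 ns to fill one cache line from main memory, and 64-byte cache lines. The function name is made up for this example.

```python
# Assumed figures, for illustration only (not measured on any real machine).
FILL_NS = 100      # assumed time to load one cache line from main memory, in ns
LINE_BYTES = 64    # assumed cache line size in bytes

def lines_loadable(run_ms, fill_ns=FILL_NS):
    """Cache lines that could be loaded one after another in run_ms milliseconds."""
    return run_ms * 1_000_000 // fill_ns

# The benchmark run times quoted above: 100-200 ms.
for run_ms in (100, 200):
    n = lines_loadable(run_ms)
    mib = n * LINE_BYTES // (1024 * 1024)
    print(f"{run_ms} ms: about {n} cache lines, about {mib} MiB of code")
```

Under these assumptions a 100ms run leaves room for about 1M serial cache-line loads, i.e., tens of megabytes of code, so reloading the changed code should be a negligible part of the run time.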
For example, the first dhrystone iterated 500K times runs at 2.9M
dhrystones/second; the second and subsequent runs at 3.7M/sec.
The first run includes the compilation time, no? What if you try a
1-iteration run, then a 500k-iteration run? How does the 500k-run
fare then?
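
The experiment I have in mind can be sketched like this in Python. It is only an outline of the measurement protocol, not code for your system: `dummy` is a hypothetical stand-in for the real benchmark, and `timed` and `measure` are names made up for this sketch.

```python
import time

def timed(run, iterations):
    """Call run(iterations) once and return the iterations per second."""
    start = time.perf_counter()
    run(iterations)
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")

def measure(run, iterations=500_000):
    """Do a 1-iteration run first, then time the long run separately."""
    warmup = timed(run, 1)           # pays one-time costs, e.g. compile on first EXECUTE
    steady = timed(run, iterations)  # should now reflect only execution
    return warmup, steady

# Hypothetical stand-in workload; replace with the real benchmark call.
def dummy(iterations):
    total = 0
    for i in range(iterations):
        total += i
    return total

warmup_rate, steady_rate = measure(dummy)
print("500k run after warm-up:", round(steady_rate), "iterations/second")
```

If the 500k run after the 1-iteration run reaches the 3.7M/sec figure, the difference in the first run is compilation time; if it stays near 2.9M/sec, something else is going on.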
Compiling on compile is what I don't want to do
Generating native code when the Forth text interpreter compiles is not
what I was suggesting.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2010: http://www.euroforth.org/ef10/