Re: Gforth and gcc "progress"
- From: anton@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl)
- Date: Thu, 28 Jun 2007 12:38:25 GMT
Andrew Haley <andrew29@xxxxxxxxxxxxxxxxxxxxxxx> writes:
Anton Ertl <anton@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Andrew Haley <andrew29@xxxxxxxxxxxxxxxxxxxxxxx> writes:
I really would like to have a sensible discussion about how to fix
some of the problems you're having with gcc.
So do you have any ideas how they could be fixed?
That depends on the specific problem. Some things, for example
register allocation, are very hard, and in any case are being actively
worked on. Other things might be easier.
What would be interesting to me is to know the most important gcc
deficiencies from the point of view of GForth and your estimate of how
significant these deficiencies are.
1) Register allocation is one issue, but that is probably not easy to
solve. One thing that I can imagine something can be done about is
that, even on platforms with many registers, like Alpha, MIPS, or
AMD64, gcc allocates registers badly for the Gforth engine: we can
barely allocate the virtual machine registers into real registers, but
then we don't have any registers left for stack caching.
The reason for this seems to be that these machines have few
callee-saved registers, and gcc seems to allocate our virtual machine
registers only into these registers (probably because they survive a
number of calls). What's worse, even with explicit register
allocation we cannot get around that, because we can only use
callee-saved registers there, too (at least last time I tried). The
only architecture where I am happy about the registers is PPC, because
it has many callee-saved registers.
It would be great if gcc would make better use of the registers by
itself, but if not, I would at least like to do it myself using
explicit register allocation.
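
To illustrate what "virtual machine registers" and "stack caching" refer
to, here is a minimal sketch of a threaded-code engine in GNU C. It is
not Gforth's engine: the type Cell, the opcode names, and the function
run() are made up for this example. The instruction pointer ip, the
stack pointer sp, and the cached top-of-stack item tos are the variables
one would like the compiler to keep in real registers across the whole
dispatch loop.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;

/* Hypothetical opcodes for this sketch; not Gforth's primitive set. */
enum { OP_LIT, OP_ADD, OP_MUL, OP_DUP, OP_HALT };

#define STACK_CELLS 64

/* Runs a program of opcodes (OP_LIT is followed by its literal operand)
   and returns the top-of-stack value when OP_HALT is reached.
   Stack underflow is not checked in this sketch. */
Cell run(const Cell *code)
{
    /* &&label is the GNU C "labels as values" extension. */
    static void *const dispatch[] = {
        &&do_lit, &&do_add, &&do_mul, &&do_dup, &&do_halt
    };
    Cell stack[STACK_CELLS];
    const Cell *ip = code;  /* VM register: instruction pointer */
    Cell *sp = stack;       /* VM register: next free slot below the cached item */
    Cell tos = 0;           /* stack cache: top-of-stack item (starts as a dummy) */

#define NEXT goto *dispatch[*ip++]

    NEXT;
do_lit:
    assert(sp < stack + STACK_CELLS);
    *sp++ = tos;            /* spill the old top of stack to memory */
    tos = *ip++;            /* the literal becomes the new cached item */
    NEXT;
do_add:
    tos = *--sp + tos;      /* one memory read; the result stays cached */
    NEXT;
do_mul:
    tos = *--sp * tos;
    NEXT;
do_dup:
    assert(sp < stack + STACK_CELLS);
    *sp++ = tos;            /* the copy goes to memory, tos is unchanged */
    NEXT;
do_halt:
    return tos;

#undef NEXT
}
```

For the program { OP_LIT, 3, OP_LIT, 4, OP_ADD, OP_DUP, OP_MUL, OP_HALT }
this computes (3+4)*(3+4) = 49. Explicit register allocation would mean
declaring ip, sp, and tos as GNU C local register variables tied to a
named register; the register names are platform-specific and are left
out of this sketch.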
2) Fixing PR25285. This one strikes at unpredictable times, as the
example with gcc-4.1.2 without and with STACK_CACHE_DEFAULT_FAST=0
shows. We do have a workaround for that, but that workaround has a
negative performance impact even if the compiler does not exhibit
PR25285. Speed impacts:
Difference between raw PR25285 and workaround:
sieve bubble matrix   fib
0.476  0.748  0.280 0.476  PR25285 without workaround
0.364  0.524  0.288 0.472  with workaround enabled
The cost of the workaround on a compiler without PR25285:
sieve bubble matrix   fib
0.208  0.296  0.108 0.336  no PR25285 without workaround
0.248  0.340  0.120 0.384  with workaround enabled
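
The figures above can be turned into ratios with a few lines of C. This
is only a sketch for reading the tables: the function names are invented
here, and it assumes that the columns of the first table are in the same
order (sieve, bubble, matrix, fib) as those of the second. The two
tables are kept separate and are not compared with each other.

```c
#include <assert.h>

enum { SIEVE, BUBBLE, MATRIX, FIB, NBENCH };

/* Figures copied from the two tables above. */
static const double affected_plain[NBENCH]      = { 0.476, 0.748, 0.280, 0.476 };
static const double affected_workaround[NBENCH] = { 0.364, 0.524, 0.288, 0.472 };
static const double clean_plain[NBENCH]         = { 0.208, 0.296, 0.108, 0.336 };
static const double clean_workaround[NBENCH]    = { 0.248, 0.340, 0.120, 0.384 };

/* On a compiler that exhibits PR25285: a value above 1 means the
   workaround makes benchmark b faster. */
double workaround_gain(int b)
{
    assert(b >= 0 && b < NBENCH);
    return affected_plain[b] / affected_workaround[b];
}

/* On a compiler without PR25285: a value above 1 means the workaround
   makes benchmark b slower. */
double workaround_cost(int b)
{
    assert(b >= 0 && b < NBENCH);
    return clean_workaround[b] / clean_plain[b];
}
```

With these numbers, workaround_gain(SIEVE) is about 1.31 and
workaround_gain(BUBBLE) about 1.43, while matrix gets slightly slower
(about 0.97). On the compiler without PR25285, workaround_cost() lies
between about 1.11 (matrix) and about 1.19 (sieve).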
Other programs are affected even more by this. In particular, in our
work on using dynamic superinstructions in the Cacao JVM interpreter
<http://www.complang.tuwien.ac.at/papers/ertl+06dotnet.ps.gz>, we
found slowdowns by up to a factor of 2 by enabling or disabling the
"throw" feature that is affected by PR25285. I found a workaround for
that, but it is brittle; it worked if we did not add static
superinstructions, but when I did add static superinstructions, it no
longer worked.
3) Code arrangement. Not a problem with current gcc versions (apart
from PR25285), but it has been in the past, so maybe one should add a
test case or some other way to remind the maintainers of this issue.
We generate code dynamically by taking code fragments (between two
labels (as values)) that gcc has generated and copying the fragments
elsewhere. In order for this technique to work, there is one
requirement: If a piece of source code is between two labels, the
corresponding executable code must be between the addresses
corresponding to these labels. This property should be guaranteed, at
least through a compiler option like -fno-reorder-blocks (PR25285
breaks it).
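
The requirement can be made concrete with a small GNU C sketch. The
names fragment_length(), append_fragment(), and probe_add_primitive()
are invented for this illustration and are not Gforth's. The sketch only
measures and copies bytes; it does not execute the copy, which would
additionally need executable memory and code that still works at its
new address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length in bytes of the fragment [start, end), or -1 if the end
   address lies before the start address, i.e. the code was arranged so
   that the fragment cannot be copied as one piece.  Passing this check
   is necessary but not sufficient: it cannot tell whether the compiler
   moved part of the code somewhere else. */
long fragment_length(const void *start, const void *end)
{
    uintptr_t s = (uintptr_t)start;
    uintptr_t e = (uintptr_t)end;
    return e >= s ? (long)(e - s) : -1;
}

/* Appends the fragment to buf, which already holds `used` bytes out of
   `cap`.  Returns the new number of used bytes, or -1 on failure. */
long append_fragment(unsigned char *buf, size_t cap, size_t used,
                     const void *start, const void *end)
{
    long len = fragment_length(start, end);
    assert(buf != NULL);
    if (len < 0 || used > cap || (size_t)len > cap - used)
        return -1;
    memcpy(buf + used, start, (size_t)len);
    return (long)(used + (size_t)len);
}

/* A made-up primitive bracketed by two labels whose addresses are taken
   with the GNU C labels-as-values extension.  Returns the length of the
   code the compiler placed between the labels; -1 means the ordering
   requirement was violated for this fragment. */
long probe_add_primitive(void)
{
    volatile long a = 1, b = 2;
    const void *start = &&add_start;
    const void *end = &&add_end;
add_start:
    a = a + b;
add_end:
    return fragment_length(start, end);
}
```

Concatenating several fragments with append_fragment() is the copying
step described above; a result of -1 from probe_add_primitive() would
correspond to the kind of breakage that the guarantee is meant to rule
out.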
The performance impact of dynamic code generation is typically a
factor of 2, but sometimes much higher:
sieve bubble matrix   fib
0.212  0.292  0.108 0.336  dynamic code generation
0.420  0.540  0.704 0.696  no dynamic code generation
There might well be some push-back from gcc developers, with the claim
that GForth "isn't representative". But I'm not convinced of that, as I
suspect that some of the problems you've seen might have an impact on
other code bases.
1) Register allocation affects everyone. Most interpreters will have
register liveness characteristics similar to Gforth.
2,3) The code arrangement issue and PR25285 affect systems that use
similar techniques to Gforth. Apart from Gforth, SableVM, and the
Cacao interpreter, two other projects I know of that use similar
techniques are Qemu and the Tempo partial evaluator. I have read
somewhere that Qemu has big problems with recent gccs thanks to a code
generation issue that is somewhat similar to PR25285, except that it
involves returns instead of general indirect jumps.
I used to write bug reports for gcc. But the reaction to PR25285
convinced me that this is a waste of time.
Well, I agree that Andrew Pinski's Comment #3 seems to be very
unhelpful. However, it's not just a matter of writing bug reports,
but of finding a gcc maintainer who understands the problem and is
motivated to work on it.
Yes. So how do we find one?
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2007: http://www.complang.tuwien.ac.at/anton/euroforth2007/