Re: Forth Research Topics



rickman <gnuarm@xxxxxxxxx> wrote:
On Apr 2, 4:54 am, Andrew Haley <andre...@xxxxxxxxxxxxxxxxxxxxxxx>
wrote:
rickman <gnu...@xxxxxxxxx> wrote:
If I understood more about what is required for a CPU to execute C, I
could potentially turn my design in that direction and get something
that does a better job of C than my CPU does and executes code more
efficiently than the ZPU.

I can tell you what's required for a gcc target: the critcial things
are plenty of registers and frame pointer relative indexed addressing
for spills. There are other issues, but if you're missing both of
these the performance will be dreadful.

Ok, that is what I was looking for. I was aware of the frame pointer
relative addressing; that is what seems to make the ZPU a bit clunky
in some respects. At least, it makes the ZPU harder to implement
since instructions have to be multi-cycle.

The internal data stack would be used as register space. When I
tried to discuss this with the originator of the ZPU he seemed to
think it was not practical to attempt to produce instructions from
the C compiler to control a CPU stack instead of registers. I don't
know if this really would be a difficult task or if it is just
showing limitations in thought.

I think it is possible, but it'll require some work on combine. The
combine pass of gcc performs substitutions of instruction operands.
For example, a pair of instructions like

(set (reg:SI 63)
(mult:SI (reg:SI 61)
(reg:SI 60)))

(set (reg:SI 64)
(plus:SI (reg:SI 65)
(reg:SI 63)))

can be combined into a single instruction like

(set (reg:SI 64)
(plus:SI (reg:SI 65)
(mult:SI (reg:SI 61)
(reg:SI 60))))

i.e. r61 r60 + to r63 r65 r63 + to r64 -> r61 r60 * r65 + to r64

At the moment with conventional architectures these substitution
possibilities are, of course, very limited, but in principle a
compiler for a stack computer could generate "superinstructions" like
this with large numbers of operands. This would also require changes
to the instruction recognizer, and maybe other places as well.

But you have to be aware that even though the stack is being used for
evaluations, the compiler will still make heavy use of locals. Also,
there may be some reason I'm not aware of that would stop this from
working. As far as I'm aware no-one has ever tried.

The big problem you'll always have with automatic conversion of C to
Forth is that C programs are written in a very different style, with a
lot of long-lived variables and temporaries. Even the best possible
stack machine compiler is going to generate a lot of accesses to
locals. If you want decent C performance you're just going to have to
deal with this somehow.

I see the ZPU mailing list is discussing using an internal stack,
but they don't seem to get the idea that you don't need to duplicate
the entire stack frame. Instead the calculations need to be
reordered to optimize the stack usage. In essence, that is what is
done by the user when using Forth. A common programming goal is to
minimize stack manipulations and ultimately data stack size. I
don't know how well this can be done by the compiler. Is it
reasonable to ask the GCC backend to generate code to use an
internal stack as register space rather than using the stack frame
for all local variable storage?

I don't think so. To do this would require a special pass, and even
then it may not work at all well.

Andrew.
.



Relevant Pages

  • Re: Bison/Flex To ByteCode
    ... I would assume there are 2 stacks my compiler ... How would this instruction get handled and represented on a stack ... integers get pushed to the stack with a special opcode that simply ... -> jump instructions, call instructions, comparisons, conditonal jumps, ...
    (comp.lang.cpp)
  • Re: .Net applications, not optimized for CPU at all?
    ... Both Java VM and MSIL are logically stack machines. ... while the MSIL instructions aren't. ... This implies that all the usual compiler tricks can be applied. ...
    (borland.public.delphi.non-technical)
  • Re: Higher Order Instructions
    ... push or pop instructions on the evaluation stack. ... higher order instructions to a stack machine these can be trivially ... this scheme amounts to macro expansion since e.g. your compose operation ...
    (microsoft.public.dotnet.framework.clr)
  • Re: I want to learn forth but...
    ... words in the loop are executed. ... executing a word copies new instructions on the top of the stack. ... where to fetch the next instruction, or it will execute the loop over ...
    (comp.lang.forth)
  • Re: Seymour remembered
    ... > manage to get a loop into this buffer, ... I wasn't certain from the article which Fortran compiler they were ... Instructions were 15, 30, or 60 bits long so theoretically the stack ...
    (comp.lang.fortran)