Re: Code density and performance?





Nick Maclaren wrote:

In article <1121333201.729594.240150@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
"jon@xxxxxxxxxxxx" <jon@xxxxxxxxxxxx> writes:
|> > Heidi Pan's "High Performance, Variable-Length Instruction
|> > Encodings" implies that a 25% reduction in code size relative to
|> > MIPS should be possible without making a superscalar
|> > implementation excessively difficult.
|>
|> You can get a 40% reduction with just a mix of 16 and 32-bit
|> instructions, without harming performance or making superscalar
|> execution more difficult than it otherwise would be. And yes, for
|> embedded apps, 40% reduction in code size is well worth it, that's why
|> we have MIPS-16, Thumb, ARCcompact etc.
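
As a rough check on that figure, here is a back-of-the-envelope sketch. It assumes (my assumption, not anything from the quoted posts) that a fraction f of instructions fit in 16 bits, the rest take 32 bits, and that the mixed encoding needs `growth` times as many instructions as an all-32-bit baseline. The function name and parameters are made up for illustration.

```python
def size_reduction(f, growth=1.0):
    """Fractional code-size saving versus an all-32-bit encoding.

    f      -- fraction of instructions encoded in 16 bits (0..1)
    growth -- instruction-count ratio, mixed encoding / baseline (assumed)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    avg_bytes = 2 * f + 4 * (1 - f)      # average bytes per instruction
    return 1 - growth * avg_bytes / 4    # baseline is 4 bytes per instruction


for f in (0.5, 0.8, 1.0):
    print(f"{f:.0%} short instructions: {size_reduction(f):.0%} smaller")
```

Under these assumptions, and with no growth in instruction count, a 40% reduction needs about 80% of instructions in the short form; any growth in instruction count raises that requirement.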


In the past, I have estimated that you could get a 50-75% reduction
by a two-level ISA, where the top level was designed for the high
level language. Designing ISAs for ease of code generation was
first proposed in the late 1960s, as far as I know, but was never
done in a mainstream system, and hasn't been attempted at all in
recent decades.

Debaere & Campenhout's Interpretation and Instruction Path Coprocessing
http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=5164
is a fine variation on this theme and was done in the early 90's. Their idea was to speed up threaded code by tacking a "decode ROM" onto the instruction-fetch logic of a 680x0 processor. The decode ROM contained the native instruction sequence for each threaded-code opcode. Threaded code was fetched by external logic, looked up in the decode ROM, and the corresponding native instructions were fed to the processor.
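
To make the mechanism concrete, here is a minimal software sketch of the lookup step. The opcode numbers, the table contents and the 680x0-style mnemonics are all hypothetical, chosen only for illustration; they are not taken from the book, and the real design did this in hardware on the fetch path rather than in a loop.

```python
# Hypothetical decode ROM: threaded-code opcode -> native instruction sequence.
# Mnemonics are illustrative 680x0-style strings, not the book's actual table.
DECODE_ROM = {
    0x01: ["move.l (a0)+,-(sp)"],                 # PUSH: push next literal
    0x02: ["move.l (sp)+,d0", "add.l d0,(sp)"],   # ADD: pop, add into new top
    0x03: ["addq.l #4,sp"],                       # DROP: discard top of stack
}


def fetch_and_expand(threaded_program):
    """Return the native instruction stream the processor would be fed."""
    native_stream = []
    for opcode in threaded_program:
        # The external fetch logic indexes the ROM with the threaded opcode
        # and forwards the stored sequence to the core unchanged.
        native_stream.extend(DECODE_ROM[opcode])
    return native_stream


if __name__ == "__main__":
    for instruction in fetch_and_expand([0x01, 0x01, 0x02]):
        print(instruction)
```

The point of the sketch is that the core never sees the threaded opcodes: the usual dispatch (fetch token, index a table, jump) is replaced by a table lookup outside the processor.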


Using a predesigned processor and tacking on some support logic for a higher-level language makes good economic sense. One saves much of the effort of a ground-up design, and can hitch a ride on a commodity processor whose implementations follow Moore's law, instead of being left behind in a small market niche as the Lisp machines were.

I suspect this would have been a better way to implement e.g. picoJava.

It's also reminiscent of what one could do with configurable processors like Tensilica's, where you have a predefined RISC core (which incidentally has two instruction sets, either 16-bit or 24-bit) and the ability to define additional logic that interfaces to the core, using tools that compile C definitions of the logic.

[I don't know how Alpha PALcode is implemented, but it is perhaps a similar idea]

In order to do this, you need a microcode approach, possibly with
a programmable microcode.  In the heyday of the RISC dogma, this
was stated to be incompatible with performance, but the Pentium 4
and Opteron have shown that that is now false, if it ever were
true (which is very doubtful).

The above examples show you don't need to use microcode at all. You simply need to provide ways of interfacing additional logic to an existing core. Using FPGA technology, one can imagine, e.g., adding associative memories to speed up specific language features such as OO method lookup.
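
As a software analogy for such an associative memory, here is a minimal sketch of a method cache keyed on (class, selector), with a slow walk up the superclass chain on a miss. The class table, the names and the unbounded dict are all assumptions made for illustration; a hardware version would be a small fixed-size associative store with some replacement policy.

```python
# Hypothetical class table: class name -> (superclass name or None, methods).
CLASSES = {
    "Object":  (None,     {"printOn:": "Object>>printOn:"}),
    "Number":  ("Object", {"+": "Number>>+"}),
    "Integer": ("Number", {"factorial": "Integer>>factorial"}),
}


def slow_lookup(class_name, selector):
    """Walk the superclass chain until a class defines the selector."""
    while class_name is not None:
        superclass, methods = CLASSES[class_name]
        if selector in methods:
            return methods[selector]
        class_name = superclass
    raise LookupError(f"doesNotUnderstand: {selector}")


class MethodCache:
    """Stands in for the associative memory; a dict plays the role of the tags."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, class_name, selector):
        key = (class_name, selector)
        if key in self.entries:
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        method = slow_lookup(class_name, selector)
        self.entries[key] = method      # fill the cache on a miss
        return method


if __name__ == "__main__":
    cache = MethodCache()
    for _ in range(3):
        cache.lookup("Integer", "printOn:")
    print(cache.hits, cache.misses)
```

Only the first send pays for the chain walk; repeated sends of the same selector to the same class are answered from the cache.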


[And of course whether these kinds of approaches can beat compilation technology is a separate question. Because compilation technology looks at much more of the problem than a single bytecode, it typically nets a much greater performance benefit.]


-- 
_______________,,,^..^,,,____________________________
Eliot Miranda              Smalltalk - Scene not herd
