Re: Disadvantages and advantages of condition code register and general-purpose register



In article <7zacctkln9.fsf@xxxxxxxxxxxxx> torbenm@xxxxxxxxxxxxx
(Torben Ægidius Mogensen) writes:
Nudge <honeypot@xxxxxxxxxx> writes:

Torben Mogensen wrote:
I have on a couple of occasions thought that it might be
worthwhile for a microprocessor to specify when registers
(including condition codes) become dead. This will require one
extra instruction bit per register use, so it is by no means
free, but it would have a number of benefits, for example
allowing the register renamer to reuse the physical register even
if it can't see a write to the logical register in the lookahead
buffer.
Does anyone know if this has been proposed before?

IA-64 GPRs are 65 bits wide.
Could the additional bit be used as you propose?

My idea was to use an extra bit in the instruction word when you read
from a register (so if you have 32 registers, you would use 6 bits to
specify the register -- 5 for the register number and one to mark if
it is dead after the read). The registers themselves would need no
extra bits.
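
A minimal sketch of the operand encoding described above, for illustration only. The bit layout (dead flag in bit 5, register number in bits 4..0), the names encode_src, decode_src and Renamer, and the policy of releasing the physical register at the moment of the read are all assumptions, not part of the proposal. A real out-of-order core would likely defer the release until the reading instruction retires.

```python
# Hypothetical 6-bit source-operand field: bit 5 = dead-after-read, bits 4..0 = register.
DEAD_BIT = 1 << 5
REG_MASK = 0x1F


def encode_src(reg, dead):
    """Pack a register number (0..31) and a dead flag into one 6-bit field."""
    if not 0 <= reg < 32:
        raise ValueError("register number out of range")
    return (DEAD_BIT if dead else 0) | reg


def decode_src(field):
    """Return (register number, dead flag) for a 6-bit field."""
    return field & REG_MASK, bool(field & DEAD_BIT)


class Renamer:
    """Toy rename table: logical register -> physical register."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))
        self.map = {}

    def write(self, logical):
        # A new write to a logical register releases the old physical one.
        old = self.map.pop(logical, None)
        if old is not None:
            self.free.append(old)
        phys = self.free.pop(0)
        self.map[logical] = phys
        return phys

    def read(self, field):
        logical, dead = decode_src(field)
        phys = self.map[logical]
        if dead:
            # The dead bit lets the renamer reclaim the physical register
            # without having seen a later write to the logical register.
            del self.map[logical]
            self.free.append(phys)
        return phys
```

With four physical registers, a write to R3 followed by a read marked dead returns the free list to four entries even though no later write to R3 has been seen.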

An alternative to using an extra instruction bit per register read is
to make a subset of the registers automatically become dead after the
first read. You could even let _all_ registers become dead after the
first read, so you would explicitly have to copy them if you need the
value twice (like in data-flow architectures). The copy instruction
would not physically copy the value, but just map two new logical
registers to the same physical register (and unmap the logical
register that held the value, unless it is used as one of the two new
registers).
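
The read-once alternative with a non-copying copy instruction could be modelled roughly as follows. This is a sketch under stated assumptions: the class name ReadOnceRenamer, the reference count per physical register, and the requirement that the two destinations differ are illustrative choices, not something the proposal specifies.

```python
class ReadOnceRenamer:
    """Toy model: every read kills the logical register; copy only remaps."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))
        self.map = {}    # logical name -> physical register
        self.refs = {}   # physical register -> number of logical names on it

    def _release(self, logical):
        # Drop one logical name; free the physical register when none remain.
        phys = self.map.pop(logical)
        self.refs[phys] -= 1
        if self.refs[phys] == 0:
            del self.refs[phys]
            self.free.append(phys)
        return phys

    def write(self, logical):
        if logical in self.map:
            self._release(logical)
        phys = self.free.pop(0)
        self.map[logical] = phys
        self.refs[phys] = 1
        return phys

    def read(self, logical):
        # Reading consumes the value: the logical register is dead afterwards.
        return self._release(logical)

    def copy(self, src, dst1, dst2):
        # Map two new logical names to the physical register holding src.
        # src is unmapped unless it is reused as one of the destinations.
        if dst1 == dst2:
            raise ValueError("destinations must differ")
        phys = self.map[src]
        self.refs[phys] += 2          # count the two new names first
        for name in {src, dst1, dst2}:
            if name in self.map:
                self._release(name)   # drop whatever these names held before
        self.map[dst1] = phys
        self.map[dst2] = phys
        return phys
```

After write("a") and copy("a", "a", "b"), both names share one physical register, which returns to the free list only after both have been read.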

Torben

Sometimes the compiler won't know that a register is dead
at the time of last use. Consider the vector normalize code

    sum = 0.0
    for j from 0 to 2 do
        sum = sum + vec[j]^2
    end for

    mulby = (sum == 0 ? 0.0 : sqrt(1/sum))
    for j from 0 to 2 do
        vec[j] = mulby*vec[j]
    end for

The compiler can see that the register holding vec[j]
is dead after the squaring, and
the register holding vec[j]^2 is dead after the add.

The generated code might also have registers holding the
addresses &vec[j] and &vec[3] (the latter to compare against
&vec[j] for the end-of-loop test). The &vec[j]
register will be dead only after the final iteration of the first
loop, whereas the &vec[3] register can remain alive for the
second loop. A single bit in the compare instruction cannot describe this
behavior.
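
To make the problem concrete, here is a small simulation of the first loop, a sketch under stated assumptions. The register names "p" (standing in for &vec[j], modelled as an index) and "end" (standing in for &vec[3]), the helper read, and the exception DeadRegisterRead are all hypothetical. A static dead bit on the compare takes effect on every execution of that instruction, so it kills "p" after the first iteration rather than the last.

```python
class DeadRegisterRead(Exception):
    """Raised when the simulated code reads a register already marked dead."""


def read(regs, name):
    if name not in regs:
        raise DeadRegisterRead(name)
    return regs[name]


def run_first_loop(vec, dead_bit_on_compare):
    # "p" models the &vec[j] register, "end" models the &vec[3] register.
    regs = {"p": 0, "end": len(vec), "sum": 0.0}
    while True:
        x = vec[read(regs, "p")]
        regs["sum"] = read(regs, "sum") + x * x
        regs["p"] = read(regs, "p") + 1
        # End-of-loop compare: the same instruction runs on every iteration.
        taken = read(regs, "p") != read(regs, "end")
        if dead_bit_on_compare:
            # A static bit cannot tell the last iteration from the others.
            del regs["p"]
        if not taken:
            break
    return read(regs, "sum")
```

Without the bit the loop completes but "p" is never released; with the bit set, any vector longer than one element fails on the second iteration when "p" is read again.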

Likewise, in the definition of mulby, the register holding
sum remains alive if sum is nonzero, but dead if sum is zero
(I'll ignore the case where sum is NaN).

After the second loop, the compiler needs to flag
the mulby and &vec[3] registers as dead.

Assume the square root is implemented as a function call
(not inline). The square root code might need register R4 for scratch,
where the callee is responsible for saving R4. It generates

    save R4 on stack
    use R4 for other purposes
    restore old R4 from stack

If the caller had already flagged R4 as dead,
how does the "save R4 on stack" behave?
Can we save the dead bit along with the register?
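
One conceivable reading of that last question, sketched purely as an assumption: the save pushes the dead flag next to the (possibly meaningless) value, and the restore reinstates both. The class RegFile and its methods are hypothetical names, not a description of any existing architecture.

```python
class RegFile:
    """Toy register file in which each register carries a dead flag."""

    def __init__(self):
        self.value = {}
        self.dead = set()
        self.stack = []

    def write(self, reg, val):
        # Writing a register makes it live again.
        self.value[reg] = val
        self.dead.discard(reg)

    def kill(self, reg):
        # Models a last use flagged by the compiler.
        self.dead.add(reg)

    def save(self, reg):
        # Push the value together with its dead flag; the value may be
        # meaningless if the register was already dead.
        self.stack.append((self.value.get(reg), reg in self.dead))

    def restore(self, reg):
        val, was_dead = self.stack.pop()
        self.value[reg] = val
        if was_dead:
            self.dead.add(reg)
        else:
            self.dead.discard(reg)
```

Under this model the callee's save and restore of R4 leave the caller's view unchanged: a register that was dead before the call is dead again afterwards, and a live one gets its value back.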

--
The USA must invest in interstate passenger rail. Amtrak needs high-speed
public railways, not private lines where it waits for freight trains to pass.

pmontgom@xxxxxx Microsoft Research and CWI Home: Bellevue, WA