CMov implementation (Was Re: What will Microsoft use its ARM license for?)
- From: "nedbrek" <nedbrek@xxxxxxxxx>
- Date: Thu, 12 Aug 2010 06:55:03 -0500
Hello all,
"Brett Davis" <ggtgp@xxxxxxxxx> wrote in message
news:ggtgp-E3DCAF.00140812082010@xxxxxxxxxxxxxxxxxxxxxxxx
In article <i3u0jk$mi1$1@xxxxxxxxxxxxxxxxxxxxxxxxxx>,
"nedbrek" <nedbrek@xxxxxxxxx> wrote:
"Brett Davis" <ggtgp@xxxxxxxxx> wrote in message
news:ggtgp-00E59C.22370510082010@xxxxxxxxxxxxxxxxxxxxxxxx
In article <i3rcrv$44e$1@xxxxxxxxxxxxxxxxxxxxxxxxxx>,
"nedbrek" <nedbrek@xxxxxxxxx> wrote:
The biggest problem with CMOV is the renamer (so, it is easy to handle
for an in-order machine).
So CMOV does not issue to the integer unit, it just acts as a rename.
You end up with a nop, or two pointers to r4, r1 and r4 both the same.
This is only a win if all three integer pipes are full, because it
will still take a cycle to turn the flag into a rename.
(Will save power, no register reads, math or register write.)
Correct?
You're talking about executing the op in the renamer? It's possible, but
more complicated than executing a move - you need to bring the flags value
from the execution engine back to the renamer. Most execute-in-rename
proposals only execute movs and alu ops with immediates (e.g. add r1 +=
imm). That way, you can just store an offset value in the map (no need to
bring values in from execute).
Certainly possible, but a lot more effort.
Of course, I don't know of any shipping mainstream processors with
execute-in-rename... (not counting special handling for SP)
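To make the "offset in the map" idea concrete, here is a minimal sketch of a rename map that handles movs and add-immediates without sending anything to execute. It is a toy model, not a description of any shipping design; the class name RenameMap, its method names, and the "pN" naming are all made up for illustration, and physical register freeing is left out entirely.

```python
class RenameMap:
    """Toy rename map: arch register -> (physical register, pending offset).

    Hypothetical illustration only; no reference counting or freeing.
    """

    def __init__(self, num_arch_regs):
        # Initially arch register rN lives in physical register pN, offset 0.
        self.table = {f"r{i}": (f"p{i}", 0) for i in range(num_arch_regs)}
        self.next_phys = num_arch_regs

    def rename_mov(self, dst, src):
        # mov dst = src: copy the map entry, nothing goes to execute.
        self.table[dst] = self.table[src]

    def rename_add_imm(self, dst, imm):
        # add dst += imm: fold the immediate into the stored offset.
        phys, offset = self.table[dst]
        self.table[dst] = (phys, offset + imm)

    def rename_alu(self, dst, srcs):
        # Any other op: read the source mappings, then allocate a new
        # physical register for the destination. This one does execute.
        sources = [self.table[s] for s in srcs]
        phys = f"p{self.next_phys}"
        self.next_phys += 1
        self.table[dst] = (phys, 0)
        return phys, sources
```

A consumer of (p2, 8) has to add 8 to whatever it reads from p2, so the cost moves into the execute stage of the consumer. Note there is no way to do a CMOV here without knowing the flags value, which is the extra effort mentioned above.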
Ok, you confused me again. ;)
Most RISC chips implement a register move as an ALU binary OR with
immediate value zero. Even the NOP is actually "OR r0 = r0, #0"
which I remember from my Moto 68k days.
In x86, it is encoded as "mov dst = src". Internally, this can be converted
to an "(x)or/add/sub imm0" uop, or there might be a mov uop. Not sure what
the tradeoffs are...
An ALU is always going to read two values and write one, even x86.
Consider the "clear" operation (XOR AX ^= AX). There is one write, the
value 0. Or a "mov r = imm". These are usually executed at the ALU port.
The value you want was written by a previous opcode, that value
went to the re-namer, you need to decide if r4 or r1 holds the value
you need, for the micro-opcode after the first CMOV micro-opcode part.
(A CMOV to memory would be cracked into two micro-opcodes.)
Usually "values" (at least, register values) are not going anywhere near the
renamer. The renamer is a map from architecture register numbers (r0..15)
to physical register names (on the order of the ROB entries, so p0..128).
The bypass logic is matched using physical register numbers (p27 is on these
wires - there are huge banks of comparators which compare the incoming
register reads to the physical register numbers from the producers that have
scheduled recently). That drives the muxes which choose between the values
coming from the physical register file and the bypass logic.
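A rough software picture of that mux selection, as a sketch only: the function name select_operand and the data layout (a list of recent producers and a dict for the register file) are assumptions made for illustration, and real hardware does all the compares in parallel rather than in a loop.

```python
def select_operand(src_phys, bypass, regfile):
    """Pick the value for one source operand by physical register number.

    bypass:  list of (physical register, value) from recently scheduled
             producers, newest first (hypothetical layout)
    regfile: dict mapping physical register name -> value
    """
    # The "bank of comparators": match the source number against each
    # producer currently on the bypass wires.
    for producer_phys, value in bypass:
        if producer_phys == src_phys:
            return value, "bypass"
    # No match, so the value comes from the physical register file.
    return regfile[src_phys], "regfile"
```

The point is that only the number (p27) is compared; the renamer never sees the value itself.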
Of course 99% of the time you are actually grabbing from the bypass
as these would be sequential opcodes. So the mystery becomes which
bypass do you pick from, ALU2 which is writing r1 or ALU3 writing r4.
(And which stage of bypass, and do you need to do an actual register
read instead for one, or both values.)
Right, the mystery is resolved using physical register numbers. The renamer
provides the number for each source. You would like there to be one source
at this point, although you could make the bypass logic execute the cmov -
this would require the renamer to produce 3 numbers (remember the flags have
a producer!).
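As a sketch of why that is 3 numbers: a CMOV that executes normally reads the flags producer, the old destination, and the source, and writes a fresh physical register. The function names rename_cmov and execute_cmov, and passing the flags producer in as a plain argument, are assumptions for illustration.

```python
def rename_cmov(rename_map, flags_phys, dst, src, new_phys):
    """Rename 'cmov dst = src' into a uop with three physical sources.

    rename_map: dict arch register -> physical register (hypothetical)
    flags_phys: physical name of the last flags producer
    """
    # Three sources: the flags, the old value of dst, and src.
    sources = (flags_phys, rename_map[dst], rename_map[src])
    # The destination always gets a new physical register, whether or
    # not the condition turns out to be true.
    rename_map[dst] = new_phys
    return new_phys, sources


def execute_cmov(cond, old_dst_value, src_value):
    # At execute, the condition picks which of the two values is written.
    return src_value if cond else old_dst_value
```

Doing the select in the bypass logic instead would mean carrying all three numbers that far, rather than resolving to one source.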
Then after all this thinking you need to go back and fix the
re-namer so that it does not get confused and epic fail on an
interrupt.
Now am I right, or at least close? ;)
You've described how Itanium executes predicates. I'm not aware of any OOO
machine that goes to this much trouble. Elevating a CMOV from one or two
clocks to 0 or 1 clocks isn't a huge performance win...
Ned