Re: (64bit) push-pop puzzles



>> From the beginning of the 8086 until nowadays, the following
>> machine code is still used as a NOP:
>>
>> xchg (r)(e)ax,(r)(e)ax
>>
>> The problem with this kind of NOP is that the processor
>> can't optimize the NOP used (it is used as an advantage, too).

> I am not really sure what you mean by this, but AFAIK all the modern
> implementations of the IA-32 and AMD64 architecture recognize this as
> NOP and optimize this in some way; e.g., they do not treat this
> instruction as reading from and writing to EAX. This is not
> necessarily the case for the other instructions that are typically
> used as NOPs. Read the optimization manuals of the various
> manufacturers for the recommended NOP sequences for various numbers of
> bytes.

This is not true. The NOP is still treated the same old way;
the optimization was taken out later, since a lot of programmers
still use them to delay the processor or to insert serialization.
So there is a time penalty.
If you want to use a NOP for other testing reasons, use
one of the suggested NOPs in the optimization section.
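
As a rough illustration of the "suggested NOPs" idea, here is a small
sketch in C that pads a buffer with the longest multi-byte NOP forms
that fit. The byte sequences are the 1- to 9-byte forms as I understand
the Intel instruction reference lists them under NOP; the names NOP_SEQ
and fill_nops are made up for this example, and the table should be
checked against the manual for the CPU you target (the 0F 1F forms are
not available on very old processors).

```c
#include <stddef.h>
#include <string.h>

/* Assumed table: multi-byte NOP encodings, 1 to 9 bytes long.
   Row i holds the (i + 1)-byte form; unused trailing bytes are zero. */
static const unsigned char NOP_SEQ[9][9] = {
    {0x90},                                                 /* plain NOP */
    {0x66, 0x90},
    {0x0F, 0x1F, 0x00},
    {0x0F, 0x1F, 0x40, 0x00},
    {0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/* Fill buf[0..len) with NOP instructions, always taking the longest
   form that still fits. Returns the number of instructions written. */
size_t fill_nops(unsigned char *buf, size_t len)
{
    size_t count = 0;
    while (len > 0) {
        size_t n = len > 9 ? 9 : len;   /* longest form that fits */
        memcpy(buf, NOP_SEQ[n - 1], n);
        buf += n;
        len -= n;
        count++;
    }
    return count;
}
```

Padding 12 bytes this way gives two instructions (a 9-byte form followed
by a 3-byte form) instead of twelve single-byte NOPs, so the decoder has
fewer instructions to get through.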

>> So when the processor loads one NOP into the first pipeline,
>> the next NOP into the second pipeline, and so on, all
>> the pipelines have to wait for the result of that xchg.

> No, they don't.

See above

>> If you put something useful after a NOP, the delay
>> would be less, since the pipeline does not have to wait for
>> the previous one.

> If you have something useful to put there, why use a NOP at all?

Try testing the actual speed of your software.

>> ------=_NextPart_000_008C_01C7AD22.642FDC6
>> Content-Type: text/html;
>> charset="iso-8859-1"
>> Content-Transfer-Encoding: quoted-printable

> Please post plain text only.

I did, so I don't know where this came from.

- Tessa