Re: AMD vs Intel - Ghz & performance question



On 23 Jan 2006 02:40:50 -0800, "David Kanter" <dkanter@xxxxxxxxx> wrote:

>> It's been said that Barrett was to blame for P4 with his "They buy the
>> Megahertz" remark - not sure if that is an exact quote or not.
>
>Ultimately he is; however, folks like Louis Burns were also in the
>management chain that signed off on Prescott and Tejas. The P4P was a
>fine core, Prescott ran into problems (and is a very different core) as
>did Tejas. A bunch of folks said that Prescott and Tejas were
>basically nuts...and the management chose to ignore them.

Depends what you mean by "fine core" and P4P [I've seen Prescott called a
P4P xxx] - the original P4 was conceived with DRDRAM in mind and it showed.
It had several interesting innovations which didn't seem to pay off as
well as might have been expected. I was never strongly tempted to go
there.<shrug>

>> >> The only CPUs that really focus on doing the most work per cycle are
>> >> Itaniums. One might also consider the POWER5 in that category, but it
>> >> does have a rather long pipeline.
>> >
>> >Wonderful. Anyways, "brainiac" was your interpretation. All I ever said
>> >was "higher IPC".
>>
>> Looks like David hit the wrong target - I knew brainiac had appeared in
>> this thread... but it was Tony's characterisation.:-)
>
>Braniac is a well established term for MPUs that attain high
>performance by a low clock rate and high IPC. The two most prominent
>examples are PA-RISC and IPF. The problem is that over time, speed
>demon and braniac shift around.
>
>Once upon a time, the Alpha was considered a pure speed demon, with a 6
>(or 7) stage pipeline. Ironically, almost every Alpha outclocked the
>K7 except at the end of its life. However, with MPUs like the
>POWER4/5/6 and the P4, the Alpha seems more balanced.

Yes I know what brainiac [braniac ?] means but your use seemed to hint
rather strongly at someone else's (Yousuf's) claim of such in relation to
K8... and Tony had used the word.
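
For anyone following along, the speed demon/brainiac split quoted above comes down to the fact that throughput is roughly IPC times clock rate, so very different designs can land in the same place. A minimal sketch, assuming invented IPC and clock figures (these are not measurements of any real CPU, and the function name is just for illustration):

```python
def relative_performance(ipc, clock_ghz):
    """Instructions per nanosecond: IPC times clock rate in GHz."""
    return ipc * clock_ghz

# Hypothetical designs; the numbers are invented for illustration only.
designs = {
    "speed demon": {"ipc": 1.0, "clock_ghz": 3.0},
    "brainiac": {"ipc": 2.0, "clock_ghz": 1.5},
    "middle of the road": {"ipc": 1.5, "clock_ghz": 2.0},
}

for name, d in designs.items():
    perf = relative_performance(d["ipc"], d["clock_ghz"])
    print(f"{name}: {perf:.1f} instr/ns")
```

All three made-up designs come out equal, which is the point: the label describes how a core gets its performance, not how much it gets.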

Classification of IPF as brainiac seems dubious to me in that it abdicates
scheduling and parallelism to the compiler... a strategy I'm still baffled
by on several counts, the main one being: how the hell did they get
hardware folks to swallow delegating this power to software "jockeys"?:-)

>I just don't see the K7/8 as being branaics, they are very much middle
>of the road designs as I said.

The hoss is dead.

>> >> Hypertransport really is only worth 10% performance gains for single
>> >> socket systems...nobody has really offered proof otherwise. Especially
>> >> since for single socket systems the FSB is only used for memory and
>> >> I/O...just like HT.
>> >
>> >Hypertransport is not used for memory, just i/o; the integrated memory
>> >controller is not part of the HT system. Well, in multi-socket systems,
>> >the HT is kind of used for memory when two processors share cache
>> >contents with each other, but that's really just part of interprocessor
>> >communications, not memory per se.
>> >
>> >However, there are some well-known areas where HT has helped even in
>> >single processor systems. That would be the situation when you're using
>> >Nvidia's SLI dual-graphics. It's been shown that you gain more
>> >performance when going to SLI with AMD systems. I believe the percentage
>> >increases are between 20-40% in Intel systems, whereas it's between
>> >60-70% in AMD systems, comparing Nvidia's own Nforce chipsets against
>> >each other.
>>
>> Hmm, that's an odd one. Has nVidia lost the ability to do a memory
>> controller/graphics interface?... or did they just adopt the same one they
>> had in nForce2?... which was not that bad, as far as I'd noticed, but
>> things do move along.
>
>It was their first time working with Intel's vintage FSB. HT is a lot
>prettier (I suspect), so they probably were in rather uncharted waters.

I don't see how the FSB figures here at all. Any interface between a video
card and a system depends for its performance on the ability to transfer
large amounts of data directly from main memory to local video memory by
DMA; with video, since AGP, there's not even any snooping of cache.
Another *vague* possibility might be a poor GART implementation... maybe
insufficient or slow lookaside cache for page tables. Another might be to
do with buffering of memory bursts destined for the video memory.
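
To make the lookaside-cache guess concrete, here is a toy model only: it assumes a GART-style remapping table fronted by a small LRU cache of page translations and counts how often a translation would have to go back to the table in memory. The page size, cache sizes, access pattern and function name are all assumptions of mine, not details of any nVidia or Intel chipset.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes; assumed page size


def count_table_walks(addresses, cache_entries):
    """Count translations that miss a small LRU lookaside cache."""
    cache = OrderedDict()  # page number -> True, ordered by recency
    misses = 0
    for addr in addresses:
        page = addr // PAGE_SIZE
        if page in cache:
            cache.move_to_end(page)  # refresh recency on a hit
        else:
            misses += 1  # would need a remapping-table read from memory
            cache[page] = True
            if len(cache) > cache_entries:
                cache.popitem(last=False)  # evict the least recently used
    return misses


# Hypothetical access pattern: sweep 64 pages twice, one access per page.
sweep = [p * PAGE_SIZE for p in range(64)] * 2

for entries in (8, 64):
    print(entries, "entries:", count_table_walks(sweep, entries), "table walks")
```

With this pattern the 8-entry cache misses on every access, while the 64-entry cache misses only during the first sweep. That is the kind of cliff an undersized lookaside cache could produce, if that is what is going on.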

>> Or could it be that the AMD64 CPU/memory/HT cross-bar is so
>> much better?
>
>Doubt it. For single socket systems, it really matters very little.

Of course it matters, for the same DMA mentioned above: the crossbar is
used to switch between CPU<->memory and HT<->memory transfers. I'm not
sure how the clocks are distributed in the "northbridge" section of the
AMD64 CPUs but it's possible that things happen much faster than in the
memory transaction arbitration of an Intel-compatible MCH... or, as
previously suggested, nVidia didn't do enough new work there?

--
Rgds, George Macdonald