Re: Software Optimization Guide for AMD Family 10h Processors
- From: nmm1@xxxxxxxxxxxxx (Nick Maclaren)
- Date: 17 Jun 2007 10:16:39 GMT
In article <1182037409.884017.141920@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
already5chosen@xxxxxxxxx writes:
|>
|> [O.T.] Last time I looked in SPARC v9 architecture manual there were
|> only three models. And I am 100% sure that all SPARC SMP systems ever
|> shipped by Sun or Fujitsu (don't know about more exotic vendors)
|> adhere to only one model, specifically TSO.
I may have miscounted, but I vaguely recollect that it looks as if there
are three, but one has two variants. Whatever. Call it three.
|> > PowerPC uses a very different
|> > one from the Intel x86, and most other CPUs are different yet again.
|>
|> [O.T.] Yes, x86 memory ordering rules in SMP systems are not very well
|> defined. Nevertheless they are well understood.
I wish :-( Yes, the simple cases are well-understood, but the subtleties
aren't. And, when writing robust or portable code, the subtleties matter.
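
To make "subtleties" concrete, here is a minimal sketch of the textbook
store-buffering litmus test. It is a toy model, not a description of any
particular processor: it assumes two threads, two shared variables that
start at 0, and (optionally) a FIFO write buffer per thread. The names
PROGRAMS, outcomes and _set are made up for the illustration. It simply
enumerates every interleaving and collects the register values that can
result.

```python
# Toy model (an assumption for illustration, not any vendor's specification):
# two threads, shared variables x and y starting at 0, and optionally one
# FIFO store buffer per thread.
X, Y = 0, 1

# Thread 0: x = 1; r0 = y        Thread 1: y = 1; r1 = x
PROGRAMS = (
    (("store", X), ("load", Y)),
    (("store", Y), ("load", X)),
)


def _set(tup, i, value):
    """Return a copy of tuple `tup` with element i replaced by `value`."""
    return tup[:i] + (value,) + tup[i + 1:]


def outcomes(store_buffers):
    """Enumerate all interleavings and return the reachable (r0, r1) pairs."""
    results, seen = set(), set()
    # State: program counters, memory, per-thread store buffers, registers.
    stack = [((0, 0), (0, 0), ((), ()), (None, None))]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, mem, bufs, regs = state
        finished = all(pcs[t] == len(PROGRAMS[t]) for t in (0, 1))
        if finished and bufs == ((), ()):
            results.add(regs)
            continue
        for t in (0, 1):
            if bufs[t]:
                # Drain the oldest buffered store of thread t to memory.
                var = bufs[t][0]
                stack.append((pcs, _set(mem, var, 1),
                              _set(bufs, t, bufs[t][1:]), regs))
            if pcs[t] < len(PROGRAMS[t]):
                op, var = PROGRAMS[t][pcs[t]]
                next_pcs = _set(pcs, t, pcs[t] + 1)
                if op == "store":
                    if store_buffers:
                        # The store waits in the thread's own buffer.
                        stack.append((next_pcs, mem,
                                      _set(bufs, t, bufs[t] + (var,)), regs))
                    else:
                        # No buffer: the store is globally visible at once.
                        stack.append((next_pcs, _set(mem, var, 1), bufs, regs))
                else:
                    # A load sees the thread's own buffered store, else memory.
                    value = 1 if var in bufs[t] else mem[var]
                    stack.append((next_pcs, mem, bufs, _set(regs, t, value)))
    return results


if __name__ == "__main__":
    print("no buffers:  ", sorted(outcomes(False)))
    print("with buffers:", sorted(outcomes(True)))
```

Without buffers the result (0, 0) cannot occur; with them it can, because
each load may read memory before the other thread's store has drained.
That is the simple, well-understood case. The model says nothing about
what happens when the pipeline is disturbed in other ways.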
|> You try to make a simple matter complex.
|> All IBM and Unisys should assure in their 32-way boxen is as much
|> cache consistency as in two-way x86 box with shared bus. So they do.
Ah. The difference between theory and practice is less in theory than
it is in practice. I should put a lot more trust in a clear statement
by the relevant vendor than your claim that, because your opinion is
that they should do that, they do.
|> Otherwise how would these boxes run off-the-shelf Windows/Linux?
All of the information that I have received from people who have tried
to get high communication, medium-sized SMP applications to work on
those systems is that they don't. So far, I haven't found anyone who
has tracked down why, but the problem is that (a) such machines are
rare, (b) they are often used only for applications that don't stress
the cache coherence and most of all (c) there are DAMN few people with
the skill, obstinacy and time to track down the causes.
|> And BTW I don't understand how interrupts belong here.
Precisely. Drink deep or taste not the Pierian spring.
A large number of operations are completed by interrupt; this includes
many floating-point ones (except perhaps on POWER), but also includes
many memory access ones. Nowadays, TLB misses are almost always handled
'early' (because of physical caches), but ECC obviously can't be. Now,
most architectures rely on their pipeline constraints (e.g. write buffers)
to maintain their SMP invariants, but obviously this doesn't apply if the
pipeline is interrupted.
At a naive level, there is a single pipeline that is stopped, dead,
while the interrupt is handled. Well, that hasn't been true for years
(and wasn't true on all systems even in the 1960s). In theory, the FLIH
is supposed to ensure that invariants are preserved but, in practice, it
often doesn't or even can't do so. So you get a failure of the SMP model
if you get an inconvenient interrupt.
This can also happen for certain high-priority asynchronous interrupts,
too, and can lead to occasional ordering problems "that can't occur".
In particular, on a large SMP, one CPU often needs another to do something
URGENTLY, and so interrupts it at very high priority. I have seen an
impossible memory model failure on an SGI Origin due to that, with
positive identification.
|> You seem to agree that Intel does compete "in the medium to large SMP
|> arena" with Itanium. For your information, Intel-made Itanium chipsets
|> don't extend themselves beyond 8-ways in theory, and 4-ways in
|> practice. Even these chipsets are now mostly obsolete and soon EOLed.
|> All larger Itanium systems are based on HP, Fujitsu, SGI, NEC, Hitachi
|> and Unisys chipsets to none of which Intel has any rights.
|> I don't see how the situation with IBM and Unisys chipsets for XeonMP is
|> any different.
You clearly don't know anything about those Itanium agreements. I know
very little, but I can tell you from direct, CONTRACTUAL statements from
several Tier 1 vendors that they are VERY different from the x86 range
ones. In particular, Intel DOES have SOME rights and powers over the
chipsets developed for use with it - I know a little of what they are,
but that was under NDA, so I obviously can't say more.
If you take a look at the online comics' articles of the late 1990s,
you will see references to just such rights. I doubt that they knew any
more than me, but it is confirmation of my statement.
Regards,
Nick Maclaren.