Re: Question on scalability of multi-core Processors



Stephen Fuld <S.Fuld@xxxxxxxxxxxxxxxxxxxx> writes:
I take from this and your previous post that not only is the single core
performance not going to improve, it is actually going to decline as
time goes on. That is, in order to make room in the die and power
budget for more cores the performance of each core will be less than
that of the cores in the previous generation.

That holds if we have CPUs with a uniform set of cores, and go for
manycore.
OTOH, Intel could put one or two high-performance cores in the
package, to make legacy code run satisfactorily, and also put, say 16
Atom-style cores in the package for applications that use many threads
well. To see if this flies in the market, they could even do this as
an MCM: One Nehalem-style chip and one multi-Atom chip, connected via
QuickPath Interconnect (Intel's counterpart to HyperTransport), each
talking to some memory.

Of course with CPUs that integrate GPUs, we will see something similar
to this from Intel anyway, but the graphics cores are not quite
equivalent to CPU cores as seen by software, so that's an issue.

The problem with such a proposal is that OS schedulers are not very
good even now, in my experience. They will take their time to adapt
to heterogeneous-performance cores, and after that time they will
still get it wrong.

Also, I very much take your point about pin/memory limitations. AFAICT,
the big advantage of a separate graphics chip is that it gives you a
lot more pins to use for memory interface, or to put it another way, it
takes a lot of the video output memory traffic off the main memory.

Video output memory traffic is a relatively small part of memory
bandwidth for modern graphics chips: Even 2560x1600x32@60Hz only needs
1GB/s bandwidth. You don't need memory interfaces with 50GB/s and
more for that. AFAIK reading textures and geometry data, and reading
and writing the Z-buffer, are what generate a lot of the traffic.
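
For anyone who wants to check the 1GB/s figure, here is a quick
back-of-the-envelope sketch in Python. The function name is made up
for illustration, and it counts only the visible pixels (no blanking
intervals, no overlays), so treat the result as a rough estimate.

```python
def scanout_bandwidth_bytes_per_s(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second read from the frame buffer for display refresh.

    Assumption: only visible pixels are counted; blanking intervals
    and any overlay planes are ignored.
    """
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * refresh_hz


# The display mode mentioned above: 2560x1600, 32 bits per pixel, 60Hz.
bw = scanout_bandwidth_bytes_per_s(2560, 1600, 32, 60)
print(f"{bw / 1e9:.2f} GB/s")  # prints "0.98 GB/s"

# Share of a 50GB/s memory interface taken by scan-out alone.
print(f"{100 * bw / 50e9:.1f}% of 50 GB/s")
```

So scan-out is roughly 2% of a 50GB/s interface, which is why the
rest of the bandwidth has to be explained by other traffic.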

- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@xxxxxxxxxxxxxxxxxxxxxxxxxx Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html