Re: Tilera: how exciting?
- From: Quadibloc <jsavard@xxxxxxxxx>
- Date: Wed, 29 Aug 2007 06:06:19 -0700
I wrote:
> A recent news item about how a company has developed a 64-CPU chip
> that runs at 900 MHz, and is fabricated using a 90nm process, was
> interesting.
> To me, the threshold for deciding that the days of the conventional
> microprocessor are numbered would be if the component processors:
> - do 64-bit floating-point arithmetic directly in hardware,
> - using a Wallace Tree multiplier.
I should perhaps have explained more fully why I set such a stringent
condition on whether or not a multicore chip is really interesting.
For one thing, the economics of chip making are strongly against
larger die sizes. The larger the area of a chip, the greater the
chance that it contains a defect; in the simplest yield models, the
fraction of good dies falls off exponentially as the die area
increases.
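To put rough numbers on that, here is a minimal sketch using the simple Poisson yield model, in which the fraction of defect-free dies is exp(-area * defect_density). The defect density of 0.5 per cm^2 is an assumed figure chosen for illustration, not a number for any real process, and real fabs use more refined models.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under the Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

DEFECT_DENSITY = 0.5  # assumed value, defects per cm^2

for area in (0.5, 1.0, 2.0, 4.0):
    good = poisson_yield(area, DEFECT_DENSITY)
    print(f"{area:4.1f} cm^2 die: {good:6.1%} good")
```

Under this model, doubling the die area squares the yield fraction, so big dies lose twice: fewer fit on a wafer, and fewer of those work.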
For another, while an 8-bit processor can certainly do floating-point
arithmetic in software, this is very slow. Thus, as a single processor
becomes larger, until it reaches the point where it handles the data
types that will be used by a program directly, and does so using
faster hardware techniques, the benefits of making that processor
larger are greater than proportional to the size of the processor. In
the mainframe era, Grosch's law was a rule of thumb that said the
power of a computer was proportional to the square of its cost. This
law, even then, only worked up to a point; today, Grosch's law is
usually invisible, because that point now is reached with single chip
CPUs.
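As a toy illustration of what Grosch's law implies, assume performance = k * cost^2 with an arbitrary constant k = 1 (the units and the budget figure are made up). Spending a fixed budget on one big machine then beats splitting it across several small ones:

```python
def grosch_performance(cost, k=1.0):
    """Performance under Grosch's law: proportional to cost squared."""
    return k * cost ** 2

BUDGET = 4.0  # arbitrary cost units, assumed for illustration

one_big = grosch_performance(BUDGET)             # one machine, full budget
four_small = 4 * grosch_performance(BUDGET / 4)  # four machines, a quarter each

print("one big machine:    ", one_big)
print("four small machines:", four_small)
```

This only holds in the regime where the law applies; past the point mentioned above, the square-law advantage of the bigger processor disappears.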
And, of course, breaking down programs into multiple threads or
exploiting parallelism in other ways is difficult. So having twice as
many processors can give you twice the throughput, if you have several
people doing different things using the same computer, but it's hard
to use more processors to make one problem get finished twice as fast.
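One common way to quantify this is Amdahl's law, which is not named above but captures the same point: if only a fraction of a program can run in parallel, the serial remainder caps the speedup. A minimal sketch, with the 90% parallel fraction being an assumed example value:

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup of one job when only part of it can be spread over n processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

PARALLEL_FRACTION = 0.9  # assumed: 90% of the work parallelizes perfectly

for n in (2, 8, 64):
    print(f"{n:3d} processors: {amdahl_speedup(PARALLEL_FRACTION, n):5.2f}x")
```

With these assumptions, 64 processors give well under a tenfold speedup on a single problem, even though they could give 64 times the throughput on 64 independent jobs.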
So, on the one hand, the smallest chip that is useful is best; on the
other, making one processor bigger - up to a point - brings benefits
faster than the increase in the number of transistors used, while
having more processors brings benefits slower than the increase in the
number of transistors used.
So it only makes sense to put many processors on one chip if:
- the die size is small enough not to pose significant manufacturing
  problems, and
- the individual processors are as large as is reasonable: they make
  use of the best techniques to improve performance, and making them
  any larger would leave the realm where benefits accrue faster than
  the transistor count and enter the realm of diminishing returns, if
  not wretched excess.
The original Pentium had 3.1 million transistors, and an early Pentium
II chip had 7.5 million transistors. The original Pentium only had L1
cache on-chip, not L2 cache, but as it already had architectural
features comparable with a System/360 model 195, both cache and
pipelining, we can use that as a benchmark for a processor that is "as
large as reasonable"; perhaps adding a few extra transistors for more
cache.
Although the Pentium II, the Pentium III, and the Core Duo are noted
as belonging to the same design family, a late Pentium III had 28
million transistors, so growth still took place within the same basic
design.
The Pentium 4 started out with 42 million transistors.
The Itanium 2 started out with 221 million transistors, and dual-core
models are now up to 1.7 billion transistors; but unlike the Pentium
4, its die is large enough that it is not a mass-market product, but
instead one sold at a premium price.
Anyhow, as the transistor count of the earliest Pentium 4 chips would
accommodate about 5 early Pentium II chips, it is clear that it _is_
reasonable to consider a multi-core design nowadays. And since the
first Pentium 4 computers came out several years ago, even as many as
64 cores on one chip may well be a *practical* strategy today.
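The arithmetic behind that estimate can be sketched as below, using only the transistor counts quoted in this post. It is a crude budget: it ignores cache, interconnect and I/O, and treats transistor count as a stand-in for die area, which is an assumption.

```python
PENTIUM_II = 7_500_000  # early Pentium II, used here as the "reasonable" core

# Transistor budgets quoted in the post.
BUDGETS = {
    "earliest Pentium 4": 42_000_000,
    "first Itanium 2": 221_000_000,
    "dual-core Itanium 2": 1_700_000_000,
}

def cores_that_fit(budget, per_core=PENTIUM_II):
    """Whole cores of a given size that fit in a transistor budget."""
    return budget // per_core

for name, budget in BUDGETS.items():
    print(f"{name}: room for {cores_that_fit(budget)} Pentium II-sized cores")

# Transistors needed for 64 such cores, before any shared cache or network.
print("64 cores need", 64 * PENTIUM_II, "transistors")
```

On these figures, 64 Pentium II-sized cores would need 480 million transistors, which sits between the first Itanium 2 and the dual-core models.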
Also, some types of problem don't involve double-precision
floating-point, but instead just involve performing simple operations
on eight-bit values. Many small computers are not a practical
substitute when a big computer is needed; but if you have only big
computers, you have to waste a whole big computer on the parts of the
program that are just handling single bytes.
So, while I don't think a chip with *lots* of 8-bit or 16-bit CPUs on
it is a good general-purpose chip, it could well be a valuable part of
a *heterogeneous* computing strategy.
Imagine a chip with 16 "big" cores that do 64-bit IEEE 754 floats in
hardware with Wallace Tree multipliers and all that... and 4,096 cores
that are very basic 16-bit CPUs without even a hardware multiply
instruction, to handle things like text searching.
But memory bandwidth is one problem, and using multiple CPUs all at
once is difficult for many problems. Architectures that put smaller
processors in the memory help, but they conflict with having wider
buses, since that leads to the smaller processors not seeing
contiguous regions of main memory.
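A minimal sketch of why the two conflict, assuming simple low-order interleaving in which a bus NUM_BANKS words wide fetches one word from each memory bank per access. The bank count and memory size are made-up example values.

```python
NUM_BANKS = 4  # assumed: the bus is 4 words wide, one word per bank

def bank_of(word_address):
    """Bank holding a given word under low-order interleaving."""
    return word_address % NUM_BANKS

def addresses_in_bank(bank, total_words):
    """Word addresses physically stored in one bank."""
    return [a for a in range(total_words) if bank_of(a) == bank]

# What a small processor sitting inside bank 1 would see of a 16-word memory.
print(addresses_in_bank(1, 16))
```

A processor embedded in bank 1 holds only every fourth word, so a string or array laid out contiguously in main memory is scattered across all the in-memory processors rather than being local to one of them.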
John Savard