Re: New Xeon CPU's (Paxil) = Underpowered



Looks like someone set the Wayback Machine for June... note the similar
PS3 problems... (AnandTech has since removed the article, but it
remains archived on Google for all time.)

The problem, as usual, is comparing game system performance to PC
performance. Halo runs perfectly fine on a 733 MHz Xbox processor with,
what? 64 MB of RAM? Try running the PC version on hardware with equal
specs.

Newsgroups: alt.games.video.sony-playstation2, alt.games.video.xbox,
japan.videogames.playstation, microsoft.public.xbox,
microsoft.public.xbox.games, rec.games.video.advocacy,
rec.games.video.sony, uk.games.video.playstation, uk.games.video.xbox
From: <xenos>
Date: Wed, 29 Jun 2005 15:11:13 -0500
Local: Wed, Jun 29 2005 1:11 pm
Subject: Anandtech: both X360 and PS3 CPUs suck incredibly bad

http://www.anandtech.com/video/showdoc.aspx?i=2461

Microsoft's Xbox 360 & Sony's PlayStation 3 - Examples of Poor CPU
Performance

Date: June 29th, 2005
Author: Anand Lal Shimpi

"In our last article we had a fairly open-ended discussion about many
of the challenges facing both of the recently announced next-generation
game consoles. We discussed misconceptions about the Cell processor and
its ability to accelerate physics calculations, as well as touched on
the GPUs of both platforms. In the end, both the Xbox 360 and the
PlayStation 3 are much closer competitors than you would think based on
first impressions.

The Xbox 360's Xenon CPU features more general purpose cores than the
PlayStation 3 (3 vs. 1), however game developers will most likely only
be using one of those cores for the majority of their calculations,
leveling the playing field considerably.

The Cell processor derives much of its power from its array of 7 SPEs
(Synergistic Processing Elements), however as we discovered in our last
article, their purpose is far more specialized than we had thought.
Speaking with Epic Games' head developer, Tim Sweeney, he provided a
much
more balanced view of what sorts of tasks could take advantage of the
Cell's
SPE array.

The GPUs of the next-generation platforms also proved to be quite
interesting. In Part I we speculated as to the true nature of NVIDIA's
RSX
in the PS3, concluding that it's quite likely little more than a higher
clocked G70 GPU. We will expand on that discussion a bit more in this
article. We also looked at Xenos, the Xbox 360's GPU and characterized
it
as equivalent to a very flexible 24-pipe R420. Despite the inclusion
of the
10MB of embedded DRAM, Xenos and RSX ended up being quite similar in
our
expectations for performance; and that pretty much summarized all of
our
findings - the two consoles, although implementing very different
architectures, ended up being so very similar.

So we've concluded that the two platforms will probably end up
performing
very similarly, but there was one very important element excluded from
the
first article: a comparison to present-day PC architectures. The
reason a
comparison to PC architectures is important is because it provides an
evaluation point to gauge the expected performance of these
next-generation
consoles. We've heard countless times that these new consoles would
offer
better gaming performance than anything we've had on the PC, or
anything we
would have for a matter of years. Now it's time to actually put those
claims to the test, and that's exactly what we did.

Speaking under conditions of anonymity with real world game developers
who have had first hand experience writing code for both the Xbox 360
and PlayStation 3 hardware (and dev kits where applicable), we asked
them for nothing more than their brutal honesty. What did they think of
these new consoles? Are they really outfitted with the PC-eclipsing
performance we've been led to believe they have? The answer is actually
quite frequently found in history; as with anything, you get what you
pay for.

Learning from Generation X
The original Xbox console marked a very important step in the evolution
of
gaming consoles - it was the first console that was little more than a
Windows PC.

It featured a 733MHz Pentium III processor with a 128KB L2 cache,
paired up
with a modified version of NVIDIA's nForce chipset (modified to support
Intel's Pentium III bus instead of the Athlon XP it was designed for).
The
nForce chipset featured an integrated GPU, codenamed the NV2A, offering
performance very similar to that of a GeForce3. The system had a 5X PC
DVD
drive and an 8GB IDE hard drive, and all of the controllers interfaced
to
the console using USB cables with a proprietary connector.

For the most part, game developers were quite pleased with the original
Xbox. It offered them a much more powerful CPU, GPU and overall
platform
than anything had before. But as time went on, there were definitely
limitations that developers ran into with the first Xbox.

One of the biggest limitations ended up being the meager 64MB of memory
that
the system shipped with. Developers had asked for 128MB and the
motherboard
even had positions silk screened for an additional 64MB, but in an
attempt
to control costs the final console only shipped with 64MB of memory.

The next problem is that the NV2A GPU ended up not having the fill rate
and
memory bandwidth necessary to drive high resolutions, which kept the
Xbox
from being used as a HD console.

Although Intel outfitted the original Xbox with a Pentium III/Celeron
hybrid
in order to improve performance yet maintain its low cost, at 733MHz
that
quickly became a performance bottleneck for more complex games after
the
console's introduction.

The combination of GPU and CPU limitations made 30 fps a frame rate
target
for many games, while simpler titles were able to run at 60 fps. Split
screen play on Halo would even stutter below 30 fps depending on what
was
happening on screen, and that was just a first-generation title. More
experience with the Xbox brought creative solutions to the limitations
of
the console, but clearly most game developers had a wish list of things
they
would have liked to have seen in the Xbox successor. Similar
complaints
were levied against the PlayStation 2, but in some cases they were more
extreme (e.g. its 4MB frame buffer).

Given that consoles are generally evolutionary, taking lessons learned
in
previous generations and delivering what the game developers want in
order
to create the next-generation of titles, it isn't a surprise to see
that a
number of these problems are fixed in the Xbox 360 and PlayStation 3.

One of the most important changes with the new consoles is that system
memory has been bumped from 64MB on the original Xbox to a whopping
512MB on
both the Xbox 360 and the PlayStation 3. For the Xbox, that's a factor
of 8
increase, and over 12x the total memory present on the PlayStation 2.

The other important improvement with the next-generation of consoles is
that
the GPUs have been improved tremendously. With 6 - 12 month product
cycles,
it's no surprise that in the past 4 years GPUs have become much more
powerful. By far the biggest upgrade these new consoles will offer,
from a
graphics standpoint, is the ability to support HD resolutions.

There are obviously other, less performance-oriented improvements such
as wireless controllers and more ubiquitous multi-channel sound support.
And with Sony's PlayStation 3, disc capacity goes up thanks to their
embracing the Blu-ray standard.

But then we come to the issue of the CPUs in these next-generation
consoles, and the level of improvement they offer. Both the Xbox 360
and
the PlayStation 3 offer multi-core CPUs to supposedly usher in a new
era of
improved game physics and reality. Unfortunately, as we have found
out, the
desire to bring multi-core CPUs to these consoles was made a reality at
the
expense of performance in a very big way.

Problems with the Architecture
At the heart of both the Xenon and Cell processors is IBM's custom
PowerPC
based core. We've discussed this core in our previous articles, but it
is
best characterized as being quite simple. The core itself is a very
narrow
2-issue in-order execution core, featuring a 64KB L1 cache (32K
instruction/32K data) and either a 1MB or 512KB L2 cache (for Xenon or
Cell,
respectively). Supporting SMT, the core can execute two threads
simultaneously similar to a Hyper Threading enabled Pentium 4. The
Xenon
CPU is made up of three of these cores, while Cell features just one.

Each individual core is extremely small, making the 3-core Xenon CPU in
the
Xbox 360 smaller than a single core 90nm Pentium 4. While we don't
have
exact die sizes, we've heard that the number is around 1/2 the size of
the
90nm Prescott die.

IBM's pitch to Microsoft was based on the peak theoretical floating
point
performance-per-dollar that the Xenon CPU would offer, and given
Microsoft's
focus on cost savings with the Xbox 360, they took the bait.

While Microsoft and Sony have been childishly playing this flops-war,
comparing the 1 TFLOPs processing power of the Xenon CPU to the 2
TFLOPs
processing power of the Cell, the real-world performance war has
already
been lost.

Right now, from what we've heard, the real-world performance of the
Xenon
CPU is about twice that of the 733MHz processor in the first Xbox.
Considering that this CPU is supposed to power the Xbox 360 for the
next 4 -
5 years, it's nothing short of disappointing. To put it in
perspective,
floating point multiplies are apparently 1/3 as fast on Xenon as on a
Pentium 4.

The reason for the poor performance? The very narrow 2-issue in-order
core
also happens to be very deeply pipelined, apparently with a branch
predictor
that's not the best in the business. In the end, you get what you pay
for,
and with such a small core, it's no surprise that performance isn't
anywhere
near the Athlon 64 or Pentium 4 class.

The Cell processor doesn't get off the hook just because it only uses a
single one of these horribly slow cores; the SPE array ends up being
fairly
useless in the majority of situations, making it little more than a
waste of
die space.

We mentioned before that collision detection is able to be accelerated
on
the SPEs of Cell, despite being fairly branch heavy. The lack of a
branch
predictor in the SPEs apparently isn't that big of a deal, since most
collision detection branches are basically random and can't be
predicted
even with the best branch predictor. So not having a branch predictor
doesn't hurt; what does hurt, however, is the very small amount of local
memory available to each SPE. In order to access main memory, the SPE places
a DMA
request on the bus (or the PPE can initiate the DMA request) and waits
for
it to be fulfilled. From those that have had experience with the PS3
development kits, this access takes far too long to be used in many
real
world scenarios. It is the small amount of local memory that each SPE
has
access to that limits the SPEs from being able to work on more than a
handful of tasks. While physics acceleration is an important one,
there are
many more tasks that can't be accelerated by the SPEs because of the
memory
limitation.
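
The point about random branches can be illustrated with a small
simulation. This is only a sketch, assuming a textbook 2-bit saturating
counter predictor (the article does not say which predictor any of these
chips use); the function name and the branch patterns are made up for
illustration.

```python
import random


def two_bit_predictor_accuracy(outcomes):
    """Fraction of branches a 2-bit saturating counter predicts correctly."""
    state = 2  # 0-1 predict not taken, 2-3 predict taken; start weakly taken
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(3, state + 1) if taken else max(0, state - 1)
    return correct / len(outcomes)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumption: "basically random" branches modeled as fair coin flips.
random_branches = [rng.random() < 0.5 for _ in range(100_000)]
# Contrast case: a loop branch that is taken 9 times, then exits once.
loop_branches = ([True] * 9 + [False]) * 10_000

random_accuracy = two_bit_predictor_accuracy(random_branches)
loop_accuracy = two_bit_predictor_accuracy(loop_branches)
print(f"random branches: {random_accuracy:.2f}")
print(f"loop branches:   {loop_accuracy:.2f}")
```

On coin-flip branches the predictor is right only about half the time,
which is no better than guessing, while the regular loop pattern is
predicted correctly nine times out of ten. That is the sense in which a
missing predictor costs little on code like this.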

The other point that has been made is that even if you can offload some
of
the physics calculations to the SPE array, the Cell's PPE ends up being
a
pretty big bottleneck thanks to its overall lackluster performance.
It's
akin to having an extremely fast GPU but without a fast CPU to pair it
up
with.

What About Multithreading?
We of course asked the obvious question: would game developers rather
have 3 slow general purpose cores, or one of those cores paired with an
array of specialized SPEs? The response was unanimous: everyone we have
spoken to would rather take the general purpose core approach.

Citing everything from ease of programming to the limitations of the
SPEs we
mentioned previously, the Xbox 360 appears to be the more
developer-friendly
of the two platforms according to the cross-platform developers we've
spoken
to. Despite being more developer-friendly, the Xenon CPU is still not
what
developers wanted.

The most ironic bit of it all is that according to developers, if
either
manufacturer had decided to use an Athlon 64 or a Pentium D in their
next-gen console, they would be significantly ahead of the competition
in
terms of CPU performance.

While the developers we've spoken to agree that heavily multithreaded
game
engines are the future, that future won't really take form for another
3 - 5
years. Even Microsoft admitted to us that all developers are focusing
on
having, at most, one or two threads of execution for the game engine
itself - not the four or six threads that the Xbox 360 was designed
for.

Even when games become more aggressive with their multithreading,
targeting
2 - 4 threads, most of the work will still be done in a single thread.
It
won't be until the next step in multithreaded architectures where that
single thread gets broken down even further, and by that time we'll be
talking about Xbox 720 and PlayStation 4. In the end, the more
multithreaded nature of these new console CPUs doesn't help paint much
of a
brighter performance picture - multithreaded or not, game developers
are not
pleased with the performance of these CPUs.
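
Why extra hardware threads help so little when most of the work stays in
one thread can be sketched with Amdahl's law, which the article does not
cite but which describes exactly this situation. The 30% parallel
fraction below is an assumed figure for illustration, not a number from
any developer.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Amdahl's law: overall speedup when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Assumption: 30% of a frame's work can be spread across threads,
# the remaining 70% stays in the single main thread.
PARALLEL_FRACTION = 0.30

for threads in (1, 2, 4, 6):
    speedup = amdahl_speedup(PARALLEL_FRACTION, threads)
    print(f"{threads} thread(s): {speedup:.2f}x")
```

Under that assumption, six threads buy a speedup of roughly a third, and
no number of threads can push it past 1 / 0.7, or about 1.43x.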

What about all those Flops?
The one statement that we heard over and over again was that Microsoft
was
sold on the peak theoretical performance of the Xenon CPU. Ever since
the
announcement of the Xbox 360 and PS3 hardware, people have been set on
comparing Microsoft's figure of 1 trillion floating point operations
per
second to Sony's figure of 2 trillion floating point operations per
second
(TFLOPs). Any AnandTech reader should know for a fact that these
numbers
are meaningless, but just in case you need some reasoning for why,
let's
look at the facts.

First and foremost, a floating point operation can be anything; it can
be adding two floating point numbers together, or it can be performing a
dot product on two floating point numbers, it can even be just
calculating the complement of an FP number. Anything that is executed on
an FPU is fair game to be called a floating point operation.

Secondly, both floating point power numbers refer to the whole system,
CPU
and GPU. Obviously a GPU's floating point processing power doesn't mean
anything if you're trying to run general purpose code on it and vice
versa.
As we've seen from the graphics market, characterizing GPU performance
in
terms of generic floating point operations per second is far from the
full
performance story.

Third, when a manufacturer is talking about peak floating point
performance
there are a few things that they aren't taking into account. Being
able to
process billions of operations per second depends on actually being
able to
have that many floating point operations to work on. That means that
you
have to have enough bandwidth to keep the FPUs fed, no mispredicted
branches, no cache misses and the right structure of code to make sure
that
all of the FPUs can be fed at all times so they can execute at their
peak
rates. We already know that's not the case as game developers have
already
told us that the Xenon CPU isn't even in the same realm of performance
as
the Pentium 4 or Athlon 64. Not to mention that the requirements for
hitting peak theoretical performance are always ridiculous; caches are
only
so big and thus there will come a time where a request to main memory
is
needed, and you can expect that request to be fulfilled in a few
hundred
clock cycles, where no floating point operations will be happening at
all.
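
A back-of-the-envelope model shows how quickly memory stalls eat into a
peak figure. This is a deliberately crude sketch: it assumes every
operation takes one cycle at peak and that a fixed fraction of them
stalls for a fixed penalty. The 300-cycle penalty stands in for "a few
hundred clock cycles" and the miss rates are invented for illustration.

```python
def sustained_fraction_of_peak(miss_rate, miss_penalty_cycles):
    """Fraction of peak throughput left when some operations stall.

    Simplified model: each operation costs one cycle at peak, and a
    fraction `miss_rate` of them additionally waits
    `miss_penalty_cycles` for main memory.
    """
    cycles_per_op = 1.0 + miss_rate * miss_penalty_cycles
    return 1.0 / cycles_per_op


MISS_PENALTY = 300  # assumed main-memory latency in clock cycles

for miss_rate in (0.0, 0.001, 0.01, 0.05):
    fraction = sustained_fraction_of_peak(miss_rate, MISS_PENALTY)
    print(f"miss rate {miss_rate:.3f}: {fraction:.0%} of peak")
```

In this model, even one stalled operation in a hundred leaves only a
quarter of the peak rate.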

So while there may be some extreme cases where the Xenon CPU can hit
its
peak performance, it sure isn't happening in any real world code.

The Cell processor is no different; given that its PPE is identical to
one
of the PowerPC cores in Xenon, it must derive its floating point
performance
superiority from its array of SPEs. So what's the issue with the 218
GFLOPs number (2 TFLOPs for the whole system)? Well, from what we've
heard, game developers are finding that they can't use the SPEs for a
lot of tasks. So in the end, it doesn't matter what the peak theoretical
performance of Cell's SPE array is if those SPEs aren't being used all
the time.

Another way to look at this comparison of flops is to look at integer
add throughput on the Pentium 4 vs. the Athlon 64. The Pentium 4 has two
double
pumped ALUs, each capable of performing two add operations per clock,
that's
a total of 4 add operations per clock; so we could say that a 3.8GHz
Pentium
4 can perform 15.2 billion operations per second. The Athlon 64 has
three
ALUs each capable of executing an add every clock; so a 2.8GHz Athlon
64
can perform 8.4 billion operations per second. By this silly console
marketing logic, the Pentium 4 would be almost twice as fast as the
Athlon
64, and a multi-core Pentium 4 would be faster than a multi-core Athlon
64.
Any AnandTech reader should know that's hardly the case. No code is
composed entirely of add instructions, and even if it were, eventually
the
Pentium 4 and Athlon 64 will have to go out to main memory for data,
and
when they do, the Athlon 64 has a much lower latency access to memory
than
the P4. In the end, despite what these horribly concocted numbers may
lead
you to believe, they say absolutely nothing about performance. The
exact
same situation exists with the CPUs of the next-generation consoles;
don't
fall for it.
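
The peak-add arithmetic above is easy to reproduce. The sketch below
uses only the clock speeds and ALU counts quoted in the paragraph; the
function name is made up for illustration, and the result is exactly the
kind of marketing number the paragraph warns against.

```python
def peak_adds_per_second(clock_ghz, alus, adds_per_alu_per_clock):
    """Peak integer adds per second, in billions.

    Ignores memory latency, dependencies and instruction mix, which is
    why the figure says nothing about real performance.
    """
    return clock_ghz * alus * adds_per_alu_per_clock


# Pentium 4: two double-pumped ALUs at 3.8GHz.
pentium4 = peak_adds_per_second(3.8, alus=2, adds_per_alu_per_clock=2)
# Athlon 64: three ALUs, one add each per clock, at 2.8GHz.
athlon64 = peak_adds_per_second(2.8, alus=3, adds_per_alu_per_clock=1)

print(f"Pentium 4: {pentium4:.1f} billion adds/s")
print(f"Athlon 64: {athlon64:.1f} billion adds/s")
print(f"ratio:     {pentium4 / athlon64:.2f}x")
```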

Why did Sony/MS do it?
For Sony, it doesn't take much to see that the Cell processor is eerily
similar to the Emotion Engine in the PlayStation 2, at least
conceptually.
Sony clearly has an idea of what direction they would like to go in,
and it
doesn't happen to be one that's aligned with much of the rest of the
industry. Sony's past successes have really come, not because of the
hardware, but because of the developers and their PSX/PS2 exclusive
titles.
A single hot title can ship hundreds of millions of consoles, and by
our
count, Sony has had many more of those than Microsoft had with the
first
Xbox.

Sony shipped around 4 times as many PlayStation 2 consoles as Microsoft
did Xboxes. Regardless of the hardware platform, a game developer won't
turn down working with the PS2; the install base is just that
attractive. So for Sony, the Cell processor may be strange and even
undesirable for game developers, but the developers will come
regardless.

The real surprise was Microsoft; with the first Xbox, Microsoft
listened
very closely to the wants and desires of game developers. This time
around,
despite what has been said publicly, the Xbox 360's CPU architecture
wasn't
what game developers had asked for.

They wanted a multi-core CPU, but not such a significant step back in
single
threaded performance. When AMD and Intel moved to multi-core designs,
they
did so at the expense of a few hundred MHz in clock speed, not by
taking a
step back in architecture.

We suspect that a big part of Microsoft's decision to go with the Xenon
core
was because of its extremely small size. A smaller die means lower
system
costs, and if Microsoft indeed launches the Xbox 360 at $299 the Xenon
CPU
will be a big reason why that was made possible.

Another contributing factor may be the fact that Microsoft wanted to
own the
IP of the silicon that went into the Xbox 360. We seriously doubt that
either AMD or Intel would be willing to grant them the right to make
Pentium
4 or Athlon 64 CPUs, so it may have been that IBM was the only partner
willing to work with Microsoft's terms and only with this one specific
core.

Regardless of the reasoning, not a single developer we've spoken to
thinks
that it was the right decision.

The Saving Grace: The GPUs
Although both manufacturers royally screwed up their CPUs, all
developers
have agreed that they are quite pleased with the GPU power of the
next-generation consoles.

First, let's talk about NVIDIA's RSX in the PlayStation 3. We
discussed the
possibility of RSX offloading vertex processing onto the Cell
processor, but
more and more it seems that isn't the case. It looks like the RSX will
basically be a 90nm G70 with Turbo Cache running at 550MHz, and the
performance will be quite good.

One option we didn't discuss in the last article was that the G70 GPU
may already feature a number of disabled shader pipes to improve yield.
The move to 90nm may allow those pipes to be enabled, allowing for
another scenario where the RSX offers higher performance at the same
transistor count as the present-day G70. Sony may be hesitant to reveal
the actual number of pixel and vertex pipes in the RSX because honestly
they won't know until a few months before mass production what their
final yields will be.

Despite strong performance and support for 1080p, a large number of
developers are targeting 720p for their PS3 titles and won't support
1080p.
Those that are simply porting current-generation games over will have
no
problems running at 1080p, but anyone working on a truly
next-generation
title won't have the fill rate necessary to render at 1080p.

Another interesting point is that despite its lack of "free 4X AA" like
the
Xbox 360, in some cases it won't matter. Titles that use longer pixel
shader programs end up being bound by pixel shader performance rather
than
memory bandwidth, so the performance difference between no AA and 2X/4X
AA
may end up being quite small. Not all titles will push the RSX to its
limits, however, and those titles will definitely see a performance drop
with AA enabled. In the end, whether the RSX's lack of embedded DRAM
matters
will be entirely dependent on the game engine being developed for the
platform. Games that make more extensive use of long pixel shaders
will see
less of an impact with AA enabled than those that are more texture
bound.
Game developers are all over the map on this one, so it wouldn't be
fair to
characterize all of the games as falling into one category or another.
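
The shader-bound versus bandwidth-bound argument can be sketched as a
simple bottleneck model. Everything here is an assumption made for
illustration: the two stages are taken to overlap perfectly, AA is
modeled as nothing more than a multiplier on memory traffic, and the
per-frame costs are invented rather than measured on any hardware.

```python
def frame_time_ms(shader_ms, memory_ms, aa_bandwidth_factor=1.0):
    """Frame time when the slower of shading and memory traffic sets the pace.

    Simplified model: shading and memory traffic overlap perfectly, and
    enabling AA only scales the memory time by `aa_bandwidth_factor`.
    """
    return max(shader_ms, memory_ms * aa_bandwidth_factor)


# Hypothetical per-frame costs in milliseconds.
games = {
    "long pixel shaders": {"shader_ms": 14.0, "memory_ms": 6.0},
    "texture bound": {"shader_ms": 5.0, "memory_ms": 10.0},
}

for name, cost in games.items():
    no_aa = frame_time_ms(**cost)
    with_aa = frame_time_ms(**cost, aa_bandwidth_factor=2.0)
    print(f"{name}: {no_aa:.1f} ms without AA, {with_aa:.1f} ms with AA")
```

In this toy model the shader-heavy title pays nothing for AA because
memory traffic is hidden behind shading, while the texture-bound title
sees its frame time double.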

ATI's Xenos GPU is also looking pretty good and most are expecting
performance to be very similar to the RSX, but real world support for
this
won't be ready for another couple of months. Developers have just
recently
received more final Xbox 360 hardware, and gauging performance of the
actual
Xenos GPU compared to the R420 based solutions in the G5 development
kits
will take some time. Since the original dev kits offered significantly
lower performance, developers will need a bit of time to figure out
what
realistic limits the Xenos GPU will have.

Final Words
Just because these CPUs and GPUs are in a console doesn't mean that we
should throw away years of knowledge from the PC industry - performance
doesn't come out of thin air, and peak performance is almost never
achieved. Clever marketing, however, will always try to fool the
consumer.

And that's what we have here today, with the Xbox 360 and PlayStation
3.
Both consoles are marketed to be much more powerful than they actually
are,
and from talking to numerous game developers it seems that the real
world
performance of these platforms isn't anywhere near what it was supposed
to
be.

It looks like significant advancements in game physics won't happen on
consoles for another 4 or 5 years, although it may happen with PC games
much
before that.

It's not all bad news, however; the good news is that both GPUs are
quite possibly the most promising part of the new consoles. With the
performance that we have seen from NVIDIA's G70, we have very high
expectations for the 360 and PS3. The ability to finally run at HD
resolutions in all games will bring a much needed element to console
gaming.

And let's not forget all of the other improvements to these
next-generation
game consoles. The CPUs, despite being relatively lackluster, will
still be
faster than their predecessors and increased system memory will give
developers more breathing room. Then there are other improvements
such as
wireless controllers, better online play and updated game engines that
will
contribute to an overall better gaming experience.

In the end, performance could be better; the consoles aren't what they
could have been had the powers that be made some different decisions.
While they will bring better quality games to market and will be better
than their predecessors, it doesn't look like they will be the end of PC
gaming any more than the Xbox and PS2 were when they were launched. The
two markets will continue to coexist, with consoles being much easier to
deal with, and PCs offering some performance-derived advantages.

With much more powerful CPUs and, in the near future, more powerful
GPUs,
the PC paired with the right developers should be able to bring about
that
revolution in game physics and graphics we've been hoping for.
Consoles
will help accelerate the transition to multithreaded gaming, but it
looks
like it will take PC developers to bring about real change in things
like
game physics, AI and other non-visual elements of gaming. "
