Re: New Xeon CPU's (Paxil) = Underpowered
- From: "KillzoneBigNunts" <Killzone@xxxxxxxx>
- Date: Sat, 22 Oct 2005 23:23:33 -0400
Jordan wrote:
> Looks like someone set the Wayback machine for June... note the
> similar PS3 problems... (anandtech has since removed the article, but
> it remains archived on Google for all time.)
>
> The problem, as usual, is comparing game system performance to PC
> perfomance. Halo runs perfectly fine on a 733 mhz Xbox processor with,
> what? 64 MB of ram? Try running the PC version on hardware with equal
> spec.
>
> Newsgroups: alt.games.video.sony-playstation2, alt.games.video.xbox,
> japan.videogames.playstation, microsoft.public.xbox,
> microsoft.public.xbox.games, rec.games.video.advocacy,
> rec.games.video.sony, uk.games.video.playstation, uk.games.video.xbox
> From: <xenos> - Find messages by this author
> Date: Wed, 29 Jun 2005 15:11:13 -0500
> Local: Wed, Jun 29 2005 1:11 pm
> Subject: Anandtech: both X360 and PS3 CPUs suck incredibly bad
> Reply to Author | Forward | Print | View Thread | Show original |
> Report Abuse
>
> http://www.anandtech.com/video/showdoc.aspx?i=2461
>
> Microsoft's Xbox 360 & Sony's PlayStation 3 - Examples of Poor CPU
> Performance
>
> Date: June 29th, 2005
> Author: Anand Lal Shimpi
>
> "In our last article we had a fairly open-ended discussion about many
> of the
> challenges facing both of the recently announced next-generation game
> consoles. We discussed misconceptions about the Cell processor and
> its ability to accelerate physics calculations, as well as touched on
> the GPUs
> of both platforms. In the end, both the Xbox 360 and the PlayStation
> 3 are
> much closer competitors than you would think based on first
> impressions.
>
> The Xbox 360's Xenon CPU features more general purpose cores than the
> PlayStation 3 (3 vs. 1), however game developers will most likely only
> be
> using one of those cores for the majority of their calculations,
> leveling
> the playing field considerably.
>
> The Cell processor derives much of its power from its array of 7 SPEs
> (Synergistic Processing Elements), however as we discovered in our
> last article, their purpose is far more specialized than we had
> thought. Speaking with Epic Games' head developer, Tim Sweeney, he
> provided a much
> more balanced view of what sorts of tasks could take advantage of the
> Cell's
> SPE array.
>
> The GPUs of the next-generation platforms also proved to be quite
> interesting. In Part I we speculated as to the true nature of
> NVIDIA's RSX
> in the PS3, concluding that it's quite likely little more than a
> higher clocked G70 GPU. We will expand on that discussion a bit more
> in this article. We also looked at Xenos, the Xbox 360's GPU and
> characterized it
> as equivalent to a very flexible 24-pipe R420. Despite the inclusion
> of the
> 10MB of embedded DRAM, Xenos and RSX ended up being quite similar in
> our
> expectations for performance; and that pretty much summarized all of
> our
> findings - the two consoles, although implementing very different
> architectures, ended up being so very similar.
>
> So we've concluded that the two platforms will probably end up
> performing
> very similarly, but there was one very important element excluded from
> the
> first article: a comparison to present-day PC architectures. The
> reason a
> comparison to PC architectures is important is because it provides an
> evaluation point to gauge the expected performance of these
> next-generation
> consoles. We've heard countless times that these new consoles would
> offer
> better gaming performance than anything we've had on the PC, or
> anything we
> would have for a matter of years. Now it's time to actually put those
> claims to the test, and that's exactly what we did.
>
> Speaking under conditions of anonymity with real world game developers
> who
> have had first hand experience writing code for both the Xbox 360 and
> PlayStation 3 hardware (and dev kits where applicable), we asked them
> for
> nothing more than their brutal honesty. What did they think of these
> new
> consoles? Are they really outfitted with the PC-eclipsing performance
> we've
> been lead to believe they have? The answer is actually quite
> frequently
> found in history; as with anything, you get what you pay for.
>
> Learning from Generation X
> The original Xbox console marked a very important step in the
> evolution of
> gaming consoles - it was the first console that was little more than a
> Windows PC.
>
> It featured a 733MHz Pentium III processor with a 128KB L2 cache,
> paired up
> with a modified version of NVIDIA's nForce chipset (modified to
> support Intel's Pentium III bus instead of the Athlon XP it was
> designed for). The
> nForce chipset featured an integrated GPU, codenamed the NV2A,
> offering performance very similar to that of a GeForce3. The system
> had a 5X PC DVD
> drive and an 8GB IDE hard drive, and all of the controllers interfaced
> to
> the console using USB cables with a proprietary connector.
>
> For the most part, game developers were quite pleased with the
> original Xbox. It offered them a much more powerful CPU, GPU and
> overall platform
> than anything had before. But as time went on, there were definitely
> limitations that developers ran into with the first Xbox.
>
> One of the biggest limitations ended up being the meager 64MB of
> memory that
> the system shipped with. Developers had asked for 128MB and the
> motherboard
> even had positions silk screened for an additional 64MB, but in an
> attempt
> to control costs the final console only shipped with 64MB of memory.
>
> The next problem is that the NV2A GPU ended up not having the fill
> rate and
> memory bandwidth necessary to drive high resolutions, which kept the
> Xbox
> from being used as a HD console.
>
> Although Intel outfitted the original Xbox with a Pentium III/Celeron
> hybrid
> in order to improve performance yet maintain its low cost, at 733MHz
> that
> quickly became a performance bottleneck for more complex games after
> the
> console's introduction.
>
> The combination of GPU and CPU limitations made 30 fps a frame rate
> target
> for many games, while simpler titles were able to run at 60 fps.
> Split screen play on Halo would even stutter below 30 fps depending
> on what was
> happening on screen, and that was just a first-generation title. More
> experience with the Xbox brought creative solutions to the limitations
> of
> the console, but clearly most game developers had a wish list of
> things they
> would have liked to have seen in the Xbox successor. Similar
> complaints
> were levied against the PlayStation 2, but in some cases they were
> more extreme (e.g. its 4MB frame buffer).
>
> Given that consoles are generally evolutionary, taking lessons learned
> in
> previous generations and delivering what the game developers want in
> order
> to create the next-generation of titles, it isn't a surprise to see
> that a
> number of these problems are fixed in the Xbox 360 and PlayStation 3.
>
> One of the most important changes with the new consoles is that system
> memory has been bumped from 64MB on the original Xbox to a whopping
> 512MB on
> both the Xbox 360 and the PlayStation 3. For the Xbox, that's a
> factor of 8
> increase, and over 12x the total memory present on the PlayStation 2.
>
> The other important improvement with the next-generation of consoles
> is that
> the GPUs have been improved tremendously. With 6 - 12 month product
> cycles,
> it's no surprise that in the past 4 years GPUs have become much more
> powerful. By far the biggest upgrade these new consoles will offer,
> from a
> graphics standpoint, is the ability to support HD resolutions.
>
> There are obviously other, less-performance oriented improvements such
> as
> wireless controllers and more ubiquitous multi-channel sound support.
> And
> with Sony's PlayStation 3, disc capacity goes up thanks to their
> embracing
> the Blu-ray standard.
>
> But then we come to the issue of the CPUs in these next-generation
> consoles, and the level of improvement they offer. Both the Xbox 360
> and
> the PlayStation 3 offer multi-core CPUs to supposedly usher in a new
> era of
> improved game physics and reality. Unfortunately, as we have found
> out, the
> desire to bring multi-core CPUs to these consoles was made a reality
> at the
> expense of performance in a very big way.
>
> Problems with the Architecture
> At the heart of both the Xenon and Cell processors is IBM's custom
> PowerPC
> based core. We've discussed this core in our previous articles, but
> it is
> best characterized as being quite simple. The core itself is a very
> narrow
> 2-issue in-order execution core, featuring a 64KB L1 cache (32K
> instruction/32K data) and either a 1MB or 512KB L2 cache (for Xenon or
> Cell,
> respectively). Supporting SMT, the core can execute two threads
> simultaneously similar to a Hyper Threading enabled Pentium 4. The
> Xenon
> CPU is made up of three of these cores, while Cell features just one.
>
> Each individual core is extremely small, making the 3-core Xenon CPU
> in the
> Xbox 360 smaller than a single core 90nm Pentium 4. While we don't
> have
> exact die sizes, we've heard that the number is around 1/2 the size of
> the
> 90nm Prescott die.
>
> IBM's pitch to Microsoft was based on the peak theoretical floating
> point
> performance-per-dollar that the Xenon CPU would offer, and given
> Microsoft's
> focus on cost savings with the Xbox 360, they took the bait.
>
> While Microsoft and Sony have been childishly playing this flops-war,
> comparing the 1 TFLOPs processing power of the Xenon CPU to the 2
> TFLOPs
> processing power of the Cell, the real-world performance war has
> already
> been lost.
>
> Right now, from what we've heard, the real-world performance of the
> Xenon
> CPU is about twice that of the 733MHz processor in the first Xbox.
> Considering that this CPU is supposed to power the Xbox 360 for the
> next 4 -
> 5 years, it's nothing short of disappointing. To put it in
> perspective,
> floating point multiplies are apparently 1/3 as fast on Xenon as on a
> Pentium 4.
>
> The reason for the poor performance? The very narrow 2-issue in-order
> core
> also happens to be very deeply pipelined, apparently with a branch
> predictor
> that's not the best in the business. In the end, you get what you pay
> for,
> and with such a small core, it's no surprise that performance isn't
> anywhere
> near the Athlon 64 or Pentium 4 class.
>
> The Cell processor doesn't get off the hook just because it only uses
> a single one of these horribly slow cores; the SPE array ends up being
> fairly
> useless in the majority of situations, making it little more than a
> waste of
> die space.
>
> We mentioned before that collision detection is able to be accelerated
> on
> the SPEs of Cell, despite being fairly branch heavy. The lack of a
> branch
> predictor in the SPEs apparently isn't that big of a deal, since most
> collision detection branches are basically random and can't be
> predicted
> even with the best branch predictor. So not having a branch predictor
> doesn't
> hurt, what does hurt however is the very small amount of local memory
> available to each SPE. In order to access main memory, the SPE places
> a DMA
> request on the bus (or the PPE can initiate the DMA request) and waits
> for
> it to be fulfilled. From those that have had experience with the PS3
> development kits, this access takes far too long to be used in many
> real
> world scenarios. It is the small amount of local memory that each SPE
> has
> access to that limits the SPEs from being able to work on more than a
> handful of tasks. While physics acceleration is an important one,
> there are
> many more tasks that can't be accelerated by the SPEs because of the
> memory
> limitation.
>
> The other point that has been made is that even if you can offload
> some of
> the physics calculations to the SPE array, the Cell's PPE ends up
> being a
> pretty big bottleneck thanks to its overall lackluster performance.
> It's
> akin to having an extremely fast GPU but without a fast CPU to pair it
> up
> with.
>
> What About Multithreading?
> We of course asked the obvious question: would game developers rather
> have 3
> slow general purpose cores, or one of those cores paired with an array
> of
> specialized SPEs? The response was unanimous, everyone we have spoken
> to
> would rather take the general purpose core approach.
>
> Citing everything from ease of programming to the limitations of the
> SPEs we
> mentioned previously, the Xbox 360 appears to be the more
> developer-friendly
> of the two platforms according to the cross-platform developers we've
> spoken
> to. Despite being more developer-friendly, the Xenon CPU is still not
> what
> developers wanted.
>
> The most ironic bit of it all is that according to developers, if
> either
> manufacturer had decided to use an Athlon 64 or a Pentium D in their
> next-gen console, they would be significantly ahead of the competition
> in
> terms of CPU performance.
>
> While the developers we've spoken to agree that heavily multithreaded
> game
> engines are the future, that future won't really take form for another
> 3 - 5
> years. Even Microsoft admitted to us that all developers are focusing
> on
> having, at most, one or two threads of execution for the game engine
> itself - not the four or six threads that the Xbox 360 was designed
> for.
>
> Even when games become more aggressive with their multithreading,
> targeting
> 2 - 4 threads, most of the work will still be done in a single thread.
> It
> won't be until the next step in multithreaded architectures where that
> single thread gets broken down even further, and by that time we'll be
> talking about Xbox 720 and PlayStation 4. In the end, the more
> multithreaded nature of these new console CPUs doesn't help paint much
> of a
> brighter performance picture - multithreaded or not, game developers
> are not
> pleased with the performance of these CPUs.
>
> What about all those Flops?
> The one statement that we heard over and over again was that Microsoft
> was
> sold on the peak theoretical performance of the Xenon CPU. Ever since
> the
> announcement of the Xbox 360 and PS3 hardware, people have been set on
> comparing Microsoft's figure of 1 trillion floating point operations
> per
> second to Sony's figure of 2 trillion floating point operations per
> second
> (TFLOPs). Any AnandTech reader should know for a fact that these
> numbers
> are meaningless, but just in case you need some reasoning for why,
> let's
> look at the facts.
>
> First and foremost, a floating point operation can be anything; it can
> be
> adding two floating point numbers together, or it can be performing a
> dot
> product on two floating point numbers, it can even be just calculating
> the
> complement of a fp number. Anything that is executed on a FPU is fair
> game
> to be called a floating point operation.
>
> Secondly, both floating point power numbers refer to the whole system,
> CPU
> and GPU. Obviously a GPU's floating point processing power doesn't
> mean anything if you're trying to run general purpose code on it and
> vice versa.
> As we've seen from the graphics market, characterizing GPU performance
> in
> terms of generic floating point operations per second is far from the
> full
> performance story.
>
> Third, when a manufacturer is talking about peak floating point
> performance
> there are a few things that they aren't taking into account. Being
> able to
> process billions of operations per second depends on actually being
> able to
> have that many floating point operations to work on. That means that
> you
> have to have enough bandwidth to keep the FPUs fed, no mispredicted
> branches, no cache misses and the right structure of code to make sure
> that
> all of the FPUs can be fed at all times so they can execute at their
> peak
> rates. We already know that's not the case as game developers have
> already
> told us that the Xenon CPU isn't even in the same realm of performance
> as
> the Pentium 4 or Athlon 64. Not to mention that the requirements for
> hitting peak theoretical performance are always ridiculous; caches are
> only
> so big and thus there will come a time where a request to main memory
> is
> needed, and you can expect that request to be fulfilled in a few
> hundred
> clock cycles, where no floating point operations will be happening at
> all.
>
> So while there may be some extreme cases where the Xenon CPU can hit
> its
> peak performance, it sure isn't happening in any real world code.
>
> The Cell processor is no different; given that its PPE is identical to
> one
> of the PowerPC cores in Xenon, it must derive its floating point
> performance
> superiority from its array of SPEs. So what's the issue with 218
> GFLOPs
> number (2 TFLOPs for the whole system)? Well, from what we've heard,
> game
> developers are finding that they can't use the SPEs for a lot of
> tasks. So
> in the end, it doesn't matter what peak theoretical performance of
> Cell's
> SPE array is, if those SPEs aren't being used all the time.
>
> Another way to look at this comparison of flops is to look at integer
> add
> latencies on the Pentium 4 vs. the Athlon 64. The Pentium 4 has two
> double
> pumped ALUs, each capable of performing two add operations per clock,
> that's
> a total of 4 add operations per clock; so we could say that a 3.8GHz
> Pentium
> 4 can perform 15.2 billion operations per second. The Athlon 64 has
> three
> ALUs each capable of executing an add every clock; so a 2.8GHz Athlon
> 64
> can perform 8.4 billion operations per second. By this silly console
> marketing logic, the Pentium 4 would be almost twice as fast as the
> Athlon
> 64, and a multi-core Pentium 4 would be faster than a multi-core
> Athlon 64.
> Any AnandTech reader should know that's hardly the case. No code is
> composed entirely of add instructions, and even if it were, eventually
> the
> Pentium 4 and Athlon 64 will have to go out to main memory for data,
> and
> when they do, the Athlon 64 has a much lower latency access to memory
> than
> the P4. In the end, despite what these horribly concocted numbers may
> lead
> you to believe, they say absolutely nothing about performance. The
> exact
> same situation exists with the CPUs of the next-generation consoles;
> don't
> fall for it.
>
> Why did Sony/MS do it?
> For Sony, it doesn't take much to see that the Cell processor is
> eerily similar to the Emotion Engine in the PlayStation 2, at least
> conceptually.
> Sony clearly has an idea of what direction they would like to go in,
> and it
> doesn't happen to be one that's aligned with much of the rest of the
> industry. Sony's past successes have really come, not because of the
> hardware, but because of the developers and their PSX/PS2 exclusive
> titles.
> A single hot title can ship hundreds of millions of consoles, and by
> our
> count, Sony has had many more of those than Microsoft had with the
> first
> Xbox.
>
> Sony shipped around 4 times as many PlayStation 2 consoles as
> Microsoft did
> Xboxes, regardless of the hardware platform, a game developer won't
> turn
> down working with the PS2 - the install base is just that attractive.
> So
> for Sony, the Cell processor may be strange and even undesirable for
> game
> developers, but the developers will come regardless.
>
> The real surprise was Microsoft; with the first Xbox, Microsoft
> listened
> very closely to the wants and desires of game developers. This time
> around,
> despite what has been said publicly, the Xbox 360's CPU architecture
> wasn't
> what game developers had asked for.
>
> They wanted a multi-core CPU, but not such a significant step back in
> single
> threaded performance. When AMD and Intel moved to multi-core designs,
> they
> did so at the expense of a few hundred MHz in clock speed, not by
> taking a
> step back in architecture.
>
> We suspect that a big part of Microsoft's decision to go with the
> Xenon core
> was because of its extremely small size. A smaller die means lower
> system
> costs, and if Microsoft indeed launches the Xbox 360 at $299 the Xenon
> CPU
> will be a big reason why that was made possible.
>
> Another contributing factor may be the fact that Microsoft wanted to
> own the
> IP of the silicon that went into the Xbox 360. We seriously doubt
> that either AMD or Intel would be willing to grant them the right to
> make Pentium
> 4 or Athlon 64 CPUs, so it may have been that IBM was the only partner
> willing to work with Microsoft's terms and only with this one specific
> core.
>
> Regardless of the reasoning, not a single developer we've spoken to
> thinks
> that it was the right decision.
>
> The Saving Grace: The GPUs
> Although both manufacturers royally screwed up their CPUs, all
> developers
> have agreed that they are quite pleased with the GPU power of the
> next-generation consoles.
>
> First, let's talk about NVIDIA's RSX in the PlayStation 3. We
> discussed the
> possibility of RSX offloading vertex processing onto the Cell
> processor, but
> more and more it seems that isn't the case. It looks like the RSX
> will basically be a 90nm G70 with Turbo Cache running at 550MHz, and
> the performance will be quite good.
>
> One option we didn't discuss in the last article, was that the G70 GPU
> may
> feature a number of disabled shader pipes already to improve yield.
> The
> move to 90nm may allow for those pipes to be enabled and thus allowing
> for
> another scenario where the RSX offers higher performance at the same
> transistor count as the present-day G70. Sony may be hesitant to
> reveal the
> actual number of pixel and vertex pipes in the RSX because honestly
> they
> won't know until a few months before mass production what their final
> yields
> will be.
>
> Despite strong performance and support for 1080p, a large number of
> developers are targeting 720p for their PS3 titles and won't support
> 1080p.
> Those that are simply porting current-generation games over will have
> no
> problems running at 1080p, but anyone working on a truly
> next-generation
> title won't have the fill rate necessary to render at 1080p.
>
> Another interesting point is that despite its lack of "free 4X AA"
> like the
> Xbox 360, in some cases it won't matter. Titles that use longer pixel
> shader programs end up being bound by pixel shader performance rather
> than
> memory bandwidth, so the performance difference between no AA and
> 2X/4X AA
> may end up being quite small. Not all titles will push the RSX to the
> limits however, and those titles will definitely see a performance
> drop with
> AA enabled. In the end, whether the RSX's lack of embedded DRAM
> matters
> will be entirely dependent on the game engine being developed for the
> platform. Games that make more extensive use of long pixel shaders
> will see
> less of an impact with AA enabled than those that are more texture
> bound.
> Game developers are all over the map on this one, so it wouldn't be
> fair to
> characterize all of the games as falling into one category or another.
>
> ATI's Xenos GPU is also looking pretty good and most are expecting
> performance to be very similar to the RSX, but real world support for
> this
> won't be ready for another couple of months. Developers have just
> recently
> received more final Xbox 360 hardware, and gauging performance of the
> actual
> Xenos GPU compared to the R420 based solutions in the G5 development
> kits
> will take some time. Since the original dev kits offered
> significantly lower performance, developers will need a bit of time
> to figure out what
> realistic limits the Xenos GPU will have.
>
> Final Words
> Just because these CPUs and GPUs are in a console doesn't mean that we
> should throw away years of knowledge from the PC industry -
> performance doesn't come out of thin air, and peak performance is
> almost never achieved.
> Clever marketing however, will always try to fool the consumer.
>
> And that's what we have here today, with the Xbox 360 and PlayStation
> 3.
> Both consoles are marketed to be much more powerful than they actually
> are,
> and from talking to numerous game developers it seems that the real
> world
> performance of these platforms isn't anywhere near what it was
> supposed to
> be.
>
> It looks like significant advancements in game physics won't happen on
> consoles for another 4 or 5 years, although it may happen with PC
> games much
> before that.
>
> It's not all bad news however; the good news is that both GPUs are
> quite
> possibly the most promising part of the new consoles. With the
> performance
> that we have seen from NVIDIA's G70, we have very high expectations
> for the
> 360 and PS3. The ability to finally run at HD resolutions in all
> games will
> bring a much needed element to console gaming.
>
> And let's not forget all of the other improvements to these
> next-generation
> game consoles. The CPUs, despite being relatively lackluster, will
> still be
> faster than their predecessors and increased system memory will give
> developers more breathing room. Then there are other improvements
> such as
> wireless controllers, better online play and updated game engines that
> will
> contribute to an overall better gaming experience.
>
> In the end, performance could be better, the consoles aren't what they
> could
> have been had the powers at be made some different decisions. While
> they
> will bring better quality games to market and will be better than
> their predecessors, it doesn't look like they will be the end of PC
> gaming any
> more than the Xbox and PS2 were when they were launched. The two
> markets
> will continue to coexist, with consoles being much easier to deal
> with, and
> PCs offering some performance-derived advantages.
>
> With much more powerful CPUs and, in the near future, more powerful
> GPUs,
> the PC paired with the right developers should be able to bring about
> that
> revolution in game physics and graphics we've been hoping for.
> Consoles
> will help accelerate the transition to multithreaded gaming, but it
> looks
> like it will take PC developers to bring about real change in things
> like
> game physics, AI and other non-visual elements of gaming. "
This is way different dude.
Back then there was nothing to test or benchmark.
Now, at least for Paxil, there is.
And it's ***.
--
.
- References:
- Re: New Xeon CPU's (Paxil) = Underpowered
- From: Jordan
- Re: New Xeon CPU's (Paxil) = Underpowered
- Prev by Date: Re: Castlevania Curse of Darkness Demo
- Next by Date: Re: New Xeon CPU's (Paxil) = Underpowered
- Previous by thread: Re: New Xeon CPU's (Paxil) = Underpowered
- Next by thread: Re: New Xeon CPU's (Paxil) = Underpowered
- Index(es):