Re: Fermi
- From: Bernd Paysan <bernd.paysan@xxxxxx>
- Date: Wed, 14 Oct 2009 11:31:13 +0200
Terje Mathisen wrote:
> One key point:
> There's only room for half as many DP values in the same vector
> register, so you already have a pair of SP multipliers available for
> each DP result.
Still, NVidia claims that it can do half as many DP operations as SP
operations in the same time. If they said "we can do half as many DP
operations as SP operations per unit, and it takes twice the time", I would
agree - as Andy explains, a primitive of the form AX+BY is useful to have in
a GPU, and you really can use two such units to perform one DP
multiplication - but you can also use them to perform four SP
multiplications.
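
To make the counting concrete, here is a rough Python sketch of the schoolbook split, where one wide multiplication becomes four half-width ones. The names (split, mul_four_partials) and the 27-bit split point are my own assumptions for illustration; a real SP multiplier is only 24x24, so this shows the arithmetic, not any actual hardware.

```python
HALF = 27  # assumed split point: a 53-bit significand fits in two 27-bit halves


def split(x, half=HALF):
    """Return (high, low) so that x == (high << half) + low."""
    return x >> half, x & ((1 << half) - 1)


def mul_four_partials(a, b, half=HALF):
    """Full product of a and b built from four half-width multiplications."""
    a1, a2 = split(a, half)  # A1 is the high half, A2 the low half
    b1, b2 = split(b, half)
    # One AX+BY style unit could form the two cross terms in one go.
    cross = a1 * b2 + a2 * b1
    # The other two multipliers produce the high and low outer terms.
    high = a1 * b1
    low = a2 * b2
    return (high << (2 * half)) + (cross << half) + low
```

The same four multipliers could instead deliver four independent SP products, which is the point of the paragraph above.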
There's a trick to reduce the number of multiplications to three, but it
costs you adder resources and latency. You can also drop the A2*B2 partial
result if you aren't interested in the last few bits, and if your multiplier
array is built from smaller components (e.g. if the building block is
an 8 by 8 multiplier, and you have 9 of those), you can distribute the more
interesting portions of A1*B2+A2*B1 into a single SP multiplier and just
lose a few more bits. Note that two SP units give you only a 48-bit result,
while for exact DP you really need 53 bits.
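
A small Python sketch of both ideas, assuming the three-multiplication trick meant here is the usual Karatsuba identity (the post does not name it). The function names and the 27-bit split are made up for illustration only.

```python
def mul_three_partials(a, b, half=27):
    """Product of a and b using three multiplications (Karatsuba identity)."""
    mask = (1 << half) - 1
    a1, a2 = a >> half, a & mask  # high and low halves of a
    b1, b2 = b >> half, b & mask  # high and low halves of b
    high = a1 * b1
    low = a2 * b2
    # (a1 + a2) * (b1 + b2) - high - low == a1*b2 + a2*b1.
    # The extra additions and subtractions are the adder and latency cost,
    # and the third multiplier has to be one bit wider than the other two.
    cross = (a1 + a2) * (b1 + b2) - high - low
    return (high << (2 * half)) + (cross << half) + low


def mul_drop_low(a, b, half=27):
    """Approximate product that leaves out the A2*B2 partial result."""
    mask = (1 << half) - 1
    a1, a2 = a >> half, a & mask
    b1, b2 = b >> half, b & mask
    # The result is too small by exactly a2 * b2, i.e. by less than 2**(2*half).
    return ((a1 * b1) << (2 * half)) + ((a1 * b2 + a2 * b1) << half)
```

With a 27-bit split the dropped term stays below 2**54, while the full product of two 53-bit significands is around 2**105, so only the low end of the rounded result can be affected.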
I can imagine that such a structure for the multiplier cells is useful for a
GPU, where you can also use the 8x8 multiplier components for alpha
blending. Maybe there are enough resources of that kind to perform a
sufficiently accurate DP multiplication with just two SP multiplier cells.
But note that the SP multipliers probably already use this performance
trick, so when they do (A1,A2,A3)*(B1,B2,B3), they already drop at least
three partial results, and use just the 6 8x8 blocks that are necessary for
alpha blending. This won't be enough for DP.
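
For illustration, a Python sketch of a 24x24 multiplication assembled from 8x8 blocks, with an option to skip the three lowest-weight blocks. Which three blocks real hardware would drop is my assumption (A2*B3, A3*B2 and A3*B3), and the names bytes3 and mul24_blocks are hypothetical.

```python
def bytes3(x):
    """Split a 24-bit value into (x1, x2, x3), most significant byte first."""
    return (x >> 16) & 0xFF, (x >> 8) & 0xFF, x & 0xFF


def mul24_blocks(a, b, keep_all=True):
    """Multiply two 24-bit values from 8x8 blocks.

    Returns (product, number_of_blocks_used). With keep_all=False the three
    lowest-weight blocks are skipped, so the product comes out slightly low.
    """
    total = 0
    used = 0
    for i, ai in enumerate(bytes3(a)):
        for j, bj in enumerate(bytes3(b)):
            # i + j >= 3 selects A2*B3, A3*B2 and A3*B3 (1-based as in the post).
            if not keep_all and i + j >= 3:
                continue
            total += (ai * bj) << (8 * (4 - i - j))
            used += 1
    return total, used
```

Skipping those three blocks leaves 6 of the 9, and the result is low by less than 2**25 out of a 48-bit product.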
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/