Re: Software vs hardware floating-point [was Re: What happened ...]
- From: nmm1@xxxxxxxxx
- Date: Sun, 20 Sep 2009 09:16:18 +0100 (BST)
In article <4AB580FE.404@xxxxxxxxxxxxxxx>,
Andy "Krazy" Glew <ag-news@xxxxxxxxxxxxxxx> wrote:
>> I believe you could indeed make a
>> 'multiply_setup/mul_core1/mul_core2/mul_normalize' perform close to
>> dedicated hw, but you would have to make sure that _nobody_ except the
>> compiler writers ever needed to be exposed to it.
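To make the four quoted primitive names concrete, here is a minimal sketch of a single-precision multiply split into those steps. Only the step names come from the thread; what each step does is an assumption made for illustration (in particular, splitting the significand multiply into two 12-bit halves for mul_core1 and mul_core2 is a guess, and soft_mul, f32_bits and bits_f32 are hypothetical helpers). It handles normal, finite IEEE 754 binary32 values only, with round-to-nearest-even, and ignores zeros, denormals, infinities, NaNs and overflow.

```python
import struct

def f32_bits(x):
    """Round a Python float to binary32 and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """Interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def multiply_setup(a_bits, b_bits):
    """Unpack both operands; combine signs and exponents (assumed role)."""
    def unpack(bits):
        exp = (bits >> 23) & 0xFF
        # This sketch deliberately rejects zero, denormal, inf and NaN.
        assert 0 < exp < 255, "normal finite inputs only"
        # Restore the implicit leading 1 to get a 24-bit significand.
        return bits >> 31, exp - 127, (bits & 0x7FFFFF) | 0x800000
    sa, ea, ma = unpack(a_bits)
    sb, eb, mb = unpack(b_bits)
    return sa ^ sb, ea + eb, ma, mb

def mul_core1(ma, mb):
    """First half of the integer multiply: low 12 bits of mb (assumed split)."""
    return ma * (mb & 0xFFF)

def mul_core2(partial, ma, mb):
    """Second half: add the high 12 bits of mb, shifted into place."""
    return partial + ((ma * (mb >> 12)) << 12)

def mul_normalize(sign, exp, prod):
    """Normalize the 47- or 48-bit product, round to nearest even, repack."""
    if prod >> 47:            # product significand in [2, 4): shift one more
        exp += 1
        shift = 24
    else:                     # product significand in [1, 2)
        shift = 23
    mant = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant >> 24:        # rounding carried out of the significand
            mant >>= 1
            exp += 1
    assert -126 <= exp <= 127, "overflow/underflow not handled in this sketch"
    return (sign << 31) | ((exp + 127) << 23) | (mant & 0x7FFFFF)

def soft_mul(a, b):
    """Chain the four primitives the way a compiler might emit them."""
    sign, exp, ma, mb = multiply_setup(f32_bits(a), f32_bits(b))
    prod = mul_core2(mul_core1(ma, mb), ma, mb)
    return bits_f32(mul_normalize(sign, exp, prod))
```

Every step here is plain integer shifting, masking, adding and multiplying, which is why such primitives could in principle run in the same pipeline as integer code. A quick check: soft_mul(1.5, 2.25) returns 3.375.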
> Trouble is, you need about 3x the instruction fetch/decode/scheduling
> bandwidth. Since that is comparable to the actual instruction execution
> in terms of power, depending on your machine, it is by no means a clear win.
Nobody claims that it is a clear win - certainly neither I nor Terje
would. My assertion is that it would be better, overall, NOT solely
for performance reasons - but no more than that.
And you wouldn't need three times the instruction throughput, except
for highly tuned HPC and benchmarketing. Few 'floating-point' codes
have more than about 10% of their instructions actually executing
floating-point operations. Remember that load and store don't count,
and I said that I would also have a 'direct' comparison operation,
too. When I last measured this (decades ago), it would have needed
very little more instruction throughput, and RISC codes have more
integer operations than the ones I looked at.
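As a back-of-envelope check on the figures above, the sketch below estimates how much the total instruction count grows when each floating-point operation is replaced by several primitives. It assumes a simple model that is not from the thread: every instruction costs the same fetch/decode bandwidth, each FP operation expands to a fixed number of primitives, and all other instructions are unchanged. The function name and parameters are hypothetical.

```python
def relative_instruction_count(fp_fraction, primitives_per_fp_op):
    """Instruction count relative to the original code under the simple model.

    fp_fraction: share of executed instructions that are FP operations
                 (loads, stores and comparisons excluded, as in the post).
    primitives_per_fp_op: assumed number of primitives replacing one FP op.
    """
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    if primitives_per_fp_op < 1:
        raise ValueError("primitives_per_fp_op must be at least 1")
    non_fp = 1.0 - fp_fraction
    return non_fp + fp_fraction * primitives_per_fp_op

# 10% FP operations, each split into 3 primitives: roughly 1.2x
typical = relative_instruction_count(0.10, 3)

# Pure FP inner loop (the tuned HPC extreme): the full 3x
worst_case = relative_instruction_count(1.0, 3)
```

Under these assumptions the 3x figure applies only when nearly every instruction is a floating-point operation; at a 10% share the increase is about 20%.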
> You would need to be working on a code that allowed nearly all of the FP
> "primitive operations" to be optimized away for it to be a win on scalar
> code.
Not so. That would be true for a very few codes, but others would
gain with little or no optimisation. For example, some codes spend
half their time switching between the pipelines (yes, really), and
others are dominated by calls to mathematical functions. By merging
the pipelines, the overheads for the latter could be reduced very
considerably.
Now, working out the winners and losers, and by how much, would be
part of the research project that this proposal would involve.
Nobody is saying that it could be done by waving a magic wand.
> Anyway, this is nothing new. I investigated this with a mind to
> exposing the primitives to the compiler in the P6 era. Trouble is, the
> compiler had bigger fish to fry.
Yup. I never said that it was new - it predates my involvement in
computing, and the reason you give is the reason it has never been
restarted.
Regards,
Nick Maclaren.