Re: Superstitious learning in Computer Architecture



John,

I don't know of any highly
computationally intensive applications requiring long precision.

This may be a slight exaggeration. Certainly I know of nothing at the very high end that would greatly benefit from teraflops, or better yet petaflops. The usual gigaflop-scale task that suffers from short precision is numerical differentiation, and that can usually be eliminated by applying better mathematical methods before programming. However, most people would rather do a LOT of programming than clean up their math.
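To illustrate why numerical differentiation is the usual casualty of short precision, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper names `to_f32` and `forward_difference` are made up for this example, single precision is emulated by rounding every intermediate result to a 32-bit float, and the test function (sin at x = 1 with step 1e-6) is an arbitrary choice.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def forward_difference(f, x, h, rnd=lambda v: v):
    """Estimate f'(x) as (f(x + h) - f(x)) / h.

    Every intermediate result is passed through rnd, so rnd=to_f32
    emulates single-precision arithmetic (an assumption of this sketch).
    """
    x, h = rnd(x), rnd(h)
    f0 = rnd(f(x))
    f1 = rnd(f(rnd(x + h)))
    return rnd(rnd(f1 - f0) / h)

exact = math.cos(1.0)  # true derivative of sin at x = 1
err64 = abs(forward_difference(math.sin, 1.0, 1e-6) - exact)
err32 = abs(forward_difference(math.sin, 1.0, 1e-6, rnd=to_f32) - exact)

print(f"double-precision error: {err64:.1e}")
print(f"single-precision error: {err32:.1e}")
```

The subtraction of two nearly equal function values leaves only a handful of significant bits in single precision, and dividing by the small step magnifies what is left of the rounding error.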

If that is indeed the case, then the design of supercomputers - the
Stretch, the Control Data 6600, the Cray-1 - has indeed been badly
misguided.

A point that I have been making for ~20 years. I had a long discussion about this with a Cray salesman. His primary arguments for their long-precision-only approach were:
1. There ARE some computationally-intensive programs that need long precision for everything, that run on existing computers. They didn't want to sell something that was worse than the status quo for ANYTHING.
2. Some things must be done somewhat differently in short precision, which makes a compatibility issue that they didn't want to deal with.
3. Back then, people viewed computers as "FORTRAN engines" which either worked and produced a benchmark result, or didn't. To run in short precision, the benchmarks themselves would likely have to be altered, and that would invalidate any claimed performance gains.
4. To get around the above problems they could provide both short and long precision, as many other machines have done. However, since the benchmarks would have to be run in long precision, this would just up the manufacturing cost without improving the benchmark results.

Against this, my argument that most (though not all) programs could run orders of magnitude faster with some architectural improvements that would NOT affect current benchmark programs just didn't prevail.

Doing a Google search on "Computational Fluid Dynamics" and precision
brought up a few mentions of how double-precision (64-bit
floating-point) was associated with that type of problem.

Fluid dynamics utilizes numerical differentiation. As mentioned above, twisting the math around to represent the differentiated quantity and then integrating that could overcome this problem. For example, by representing the fluid acceleration and integrating for velocity, and integrating again for position, you can circumvent the short precision problems. Of course, this would involve recreating (rather than just rewriting) the entire program that has been refined over decades of use.
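As a rough sketch of the integrate-instead-of-differentiate idea, the Python below starts from a known acceleration and integrates twice using emulated single-precision arithmetic. This is a toy, not a fluid solver: the names `to_f32` and `integrate_position`, the trapezoidal rule, and the test motion x(t) = sin(t) are all assumptions chosen for illustration, and time itself is still computed in double precision.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def integrate_position(accel, v0, x0, h, steps, rnd=to_f32):
    """Trapezoidal integration: acceleration -> velocity -> position.

    Every stored or combined quantity is rounded with rnd, so the state
    is carried in (emulated) single precision throughout.
    """
    h = rnd(h)
    half_h = rnd(h / 2)
    v, x = rnd(v0), rnd(x0)
    a = rnd(accel(0.0))
    for n in range(1, steps + 1):
        a_next = rnd(accel(n * h))  # the time argument is formed in double
        v_next = rnd(v + rnd(half_h * rnd(a + a_next)))
        x = rnd(x + rnd(half_h * rnd(v + v_next)))
        v, a = v_next, a_next
    return x

# Test motion: x(t) = sin(t), so a(t) = -sin(t), v(0) = 1, x(0) = 0.
x_end = integrate_position(lambda t: -math.sin(t), 1.0, 0.0, 1e-3, 1000)
err = abs(x_end - math.sin(1.0))
print(f"position error after 1000 single-precision steps: {err:.1e}")
```

Each step only adds a small increment to a running total, so rounding errors accumulate slowly instead of being divided by a tiny step size.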

It may be that since the raw input measurements are only to slide-rule
precision, the problem is a failure to use proper mathematical
techniques to handle things like ill-conditioned matrices - but it's
hard to believe that an entire community of scientists and engineers
would have a sudden attack of incompetency in numerical analysis
techniques.

Two comments:

If you already have double precision on your computer, then why go to a lot of work to avoid using it?

Having been the numerical analysis consultant for the University of Washington Physics department for 2 years, I have absolutely no problem believing how much incompetency there is! In my time there, I don't think I ever ran into anyone who was smart enough to put the "redundant" parentheses into expressions to evaluate them in the numerically best order, other than the people that I showed how to do this. Nowadays, most compilers are "smart enough" to ignore those parentheses!
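A small Python sketch of what those "redundant" parentheses buy you. The values are arbitrary illustrations; Python evaluates floating-point expressions in the order written, so the grouping is honored here.

```python
big, small = 1.0e16, 1.0

# Evaluated as written: small is absorbed when added to big
# (doubles near 1e16 are spaced 2 apart), then big cancels.
as_written = (big + small) - big

# Regrouped with "redundant" parentheses: the large terms cancel
# first, so small survives intact.
regrouped = (big - big) + small

print(as_written, regrouped)  # prints 0.0 1.0
```

Algebraically the two expressions are identical; only the order of evaluation differs.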

I *do* remember coming across plaintive mutterings that the
single-precision on the 7094 was useful for many purposes, while
single-precision on the 360, with its poor treatment of the least
significant bits and its shorter precision, just wasn't good enough.

Yes, I ran into that at Boeing. Going to double precision didn't work either because it doubled the memory needed and nearly quadrupled the execution time. Finally, they gave up on using 360s for CAD work.

That, and the design of most early pocket calculators (10 digits of
precision in the display, one to three extra digits kept internally)
has made me think that it's a pity that a 48-bit floating-point data
type isn't more commonly available, since it seems that such a type
would be "just right" for many problems.

Yes. Burroughs used 48-bit, as did Remote Time Sharing that I wrote about.

Current microprocessors seem to be optimized around C's floating-point
blunder, and perform 32-bit floating-point arithmetic at the same speed
as 64-bit floating-point arithmetic. But if you're using Wallace Tree
multiplication, instead of a serial microprogrammed method in a 16-bit
ALU (remember that nice little knob on the front of the 360/44, that
could cause double-precision floats to be handled with reduced
precision, ignoring the last 3, 2, 1, or 0 bytes of the number?) then
you have to build a second Wallace Tree to process 32-bit floats any
faster.

I *will* agree, heartily, that if you only need 32-bit or 48-bit
floats, you shouldn't have to pay for the speed of 64-bit floats - or
80-bit floats. And, of course, IEEE-754 requirements do put a big
monkey wrench into using Goldschmidt division - because if you have to
do an extra adjustment step for even a tiny fraction of cases, that
lengthens every division if one has to maintain a nice, orderly
pipeline.

You appear to be suggesting some things that my FP proposal (usable on current PC hardware and using current compilers) at <http://www.smart-life.net/FP/> is missing - even after the ~3K postings about it!!! You sound like you are deeper into the subtleties of tradeoffs between speed and accuracy than I am, so I would welcome your suggestions about incorporating this into the proposal.

Perhaps a "Q&D" configuration bit that grants permission to trash the last few bits of the result?
Any thoughts?

In my hypothetical architecture I use as an example of how an
excruciatingly baroque computer, including almost every possible
feature from the history of computing, in an attempt to maximize single
processor performance before breaking down and allowing more cores

I am partial to tiny main cores with a multitude of fancy ALUs attached, which just run a little slower if some of the fancy ALUs are dead.

- due to a surfeit of transistors - I actually illustrated an operating
mode in which, thanks to incompletely-filled cache lines, the machine
could be put into a mode in which it could simulate 36-bit word memory
for its single-precision floating-point numbers, and 48-bit word memory
for its medium-precision floating-point numbers, and kept conventional
byte-oriented memory for double-precision at 64 bits.

http://www.quadibloc.com/arch/ar0502.htm

mixed alignment mode.

If you do that, though, different floating-point precisions can't be
combined in the same data structure or EQUIVALENCE statement.

Memory is essentially free these days. If you want 48 bits, just use 48 out of 64 bits.
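One way to picture "48 out of 64 bits" is to zero the low 16 fraction bits of an ordinary double. The sketch below does that in Python. The helper name `keep_48_bits` and the particular split (1 sign, 11 exponent, 36 fraction bits) are assumptions made for illustration, not the format of any historical 48-bit machine, and the low bits are truncated rather than rounded.

```python
import math
import struct

def keep_48_bits(x):
    """Zero the low 16 fraction bits of a 64-bit double.

    What remains is 1 sign + 11 exponent + 36 fraction bits, i.e. 48
    bits of information stored in a 64-bit slot (truncation, not
    rounding, which is a simplification of this sketch).
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= 0xFFFFFFFFFFFF0000  # clear the 16 least significant bits
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

y = keep_48_bits(math.pi)
rel_err = abs(y - math.pi) / math.pi
print(f"pi kept to 48 bits: {y!r}, relative error {rel_err:.1e}")
```

The relative error stays below 2^-36 (about 1.5e-11), which is roughly the 10 to 11 decimal digits mentioned above for pocket calculators.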

Steve Richfield