Re: sin(10^300), ignorance, collusion. was Re: TILE64 embedded multicore processors - Guy Macon



And Dr. Fateman writes:
And, lastly, exactly WHAT are you claiming that the success
is with LAPACK? It runs perfectly well on other
architectures, and I know of no major extra function that it
delivers with IEEE 754.

Sure, it runs on other architectures, but one would like to say
what it computes. The extra functionality that you get is
mostly that you get a predictably reliable answer, [...]

Bit-level predictability often is broken by optimized BLAS...
And LAPACK gives reliable answers so long as the arithmetic isn't
completely absurd. But there are parts of LAPACK that benefit
from 754 arithmetic.

The currently released LAPACK uses 754 Inf and NaN propagation in
optimal (accuracy and performance, in practice *and* in theory)
algorithms for tridiagonal eigenvalue problems, optimized Sturm
count routines for bisection, and some improved error handling
(contributed by a conscientious Mathworks programmer). There are
slower fall-backs for crap arithmetics. And I do have to note
that saturation arithmetic would be even better for the Sturm
count routines. (I'm not an opponent of having other
arithmetics, just having to support them in every piece of code I
write.)
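
To make the Sturm count point concrete, here is a minimal sketch of a branch-free count for a symmetric tridiagonal matrix. It is not the LAPACK routine; the names `sturm_count` and `ieee_div` are made up for illustration. It assumes the off-diagonal entries are nonzero and that their squares neither overflow nor underflow. Python raises an exception on float division by zero, so `ieee_div` emulates the 754 default result (a signed Inf); in C or Fortran on 754 hardware the plain division would do the same thing with no helper.

```python
import math


def ieee_div(num: float, den: float) -> float:
    """Emulate 754 default division for a finite, nonzero numerator.

    Python raises ZeroDivisionError for x / 0.0; 754 arithmetic returns
    a signed infinity instead, which is what the loop below relies on.
    """
    if den == 0.0:
        sign = math.copysign(1.0, num) * math.copysign(1.0, den)
        return math.copysign(math.inf, sign)
    return num / den


def sturm_count(diag, offdiag, x):
    """Count eigenvalues strictly below x of a symmetric tridiagonal matrix.

    diag holds the n diagonal entries, offdiag the n-1 off-diagonal ones
    (assumed nonzero). The pivots of the LDL^T factorization of T - x*I
    are formed one by one and the negative ones are counted.
    """
    count = 0
    d = diag[0] - x
    if math.copysign(1.0, d) < 0:
        count += 1
    for i in range(1, len(diag)):
        # No test for d == 0: a zero pivot turns into +/-Inf here, and the
        # Inf turns the following quotient into a zero, so the recurrence
        # recovers by itself.
        d = (diag[i] - x) - ieee_div(offdiag[i - 1] ** 2, d)
        if math.copysign(1.0, d) < 0:
            count += 1
    return count
```

For the 3x3 matrix with 2 on the diagonal and 1 off it (eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2)), `sturm_count([2.0, 2.0, 2.0], [1.0, 1.0], 1.0)` hits an exact zero pivot on the second step and still returns 1. A fall-back for arithmetic without Inf would need a compare-and-patch on every pivot inside the loop.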

The next major release will include extra-precise refinement for
square, general Ax=b (and possibly other forms) and for
overdetermined least squares. These routines rely on faithful
rounding. That's
not 754 specific, but I guarantee that we never would have tested
the routines sufficiently well if we had to cope with many
different arithmetics. And there may be some optimizations
possible for common/smallish problems with
round-to-nearest-even... But that would make someone's head
explode, so I'll refrain.
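
For readers who haven't seen the idea, a rough sketch of refinement with an extra-precise residual follows. It is not the upcoming LAPACK code: the function names are invented, the dense LU is a textbook one, and exact rational arithmetic from Python's `fractions` module stands in for whatever extra precision the real routines use. The only point is the shape of the loop: factor once in working precision, compute b - Ax more accurately than working precision, solve for a correction with the same factors, and update.

```python
from fractions import Fraction


def lu_factor(a):
    """Gaussian elimination with partial pivoting on a list of float rows."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular to working precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]          # multiplier, stored in place of L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve using the factors from lu_factor (unit lower L, upper U)."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                    # forward substitution with L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):          # back substitution with U
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def exact_residual(a, x, b):
    """b - A x evaluated exactly, then rounded once to a float.

    Stand-in for an extra-precise residual; an assumption of this sketch.
    """
    return [
        float(Fraction(bi) - sum(Fraction(aij) * Fraction(xj)
                                 for aij, xj in zip(row, x)))
        for row, bi in zip(a, b)
    ]


def solve_refined(a, b, max_iter=5):
    """Working-precision solve followed by a few refinement steps."""
    lu, perm = lu_factor(a)
    x = lu_solve(lu, perm, b)
    for _ in range(max_iter):
        r = exact_residual(a, x, b)
        if all(ri == 0.0 for ri in r):
            break
        dx = lu_solve(lu, perm, r)
        x = [xi + di for xi, di in zip(x, dx)]
    return x
```

A real implementation would also need a stopping test and an error estimate, which is exactly where the behaviour of the underlying arithmetic starts to matter.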

If we had ready access to flags and rounding modes (which can be
scoped appropriately for optimization, but languages haven't
bothered, even those with dynamic scope already *cough*), we
could use them. Round-to-inf would be perfect for computing
error estimates as a kind of half interval. Flags (esp.
underflow) would let us run fast & sloppy code more often.
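
As an illustration of the half-interval idea: Python gives no access to the rounding mode, so the sketch below substitutes a cruder trick, bumping each round-to-nearest result up to the next float with `math.nextafter` (Python 3.9 or later). That is not round-to-inf, and it gives a slightly looser bound than a true upward-rounded sum would, but the result is still a guaranteed upper bound. The function name is made up for this example.

```python
import math


def abs_sum_upper_bound(values):
    """Return a float that is >= the exact sum of |v| over values."""
    total = 0.0
    for v in values:
        # Round-to-nearest may land below the exact sum, but never by as
        # much as the gap to the next float up, so stepping up one float
        # restores a guaranteed upper bound. With the rounding mode set
        # to upward, the plain addition alone would be enough.
        total = math.nextafter(total + abs(v), math.inf)
    return total
```

With scoped access to the rounding mode, the `nextafter` call goes away and the loop is an ordinary sum, which is the attraction for error estimates.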

And in debugging these routines, some of us certainly rely on 754
features. But hey, no one cares about debugging, just that the
result doesn't have bugs that they hit.

Jason

