Re: Top 500 is out



Jonathan Thornburg [remove -animal to reply] wrote:
In article <5ptcigFsqv7iU1@xxxxxxxxxxxxxxxxxx>, I wrote
* There are clever algorithms to compute each (hexadecimal) digit of
pi independently. See, for example,
http://oldweb.cecm.sfu.ca/projects/pihex/pihex.html

In article <otb*jPNZr@xxxxxxxxxxxxxxxxxxxxxxxxxxx>, Thomas Womack replied
Yes, this is true, but those have performance O(N^2) to compute the
first N digits, which is obviously crazy.

Computing pi by a more sensible algorithm will use a couple of dozen
FFTs on arrays the size of the supercomputer memory, which sounds to
me like a very reasonable interconnect test. OK, you probably use a
radix-sqrt(N) algorithm which requires about four enormous transposes
per FFT plus local work, but I thought machine-scale matrix transpose
was the standard benchmark for an interconnect.

And huge-scale FFTs look quite like spectral methods for solving
differential equations, which are absolutely classic supercomputer
problems.

There are alternative algorithms which are a lot faster, and which
don't need N-precision arithmetic. http://oldweb.cecm.sfu.ca/pi/pi.html
gives details, describes them as "almost linear time", and specifically
points out that they use only double-precision or at most quad-precision
floating-point, and do *not* need N-point FFTs. In particular, the
abstract of
http://oldweb.cecm.sfu.ca/~pborwein/PAPERS/P123.ps
is pretty clear on this:
We give algorithms for the computation of the d-th digit of certain
transcendental numbers in various bases. These algorithms can be
easily implemented (multiple precision arithmetic is not needed),
require virtually no memory, and feature run times that scale nearly
linearly with the order of the digit desired. They make it feasible
to compute, for example, the billionth binary digit of log(2) or
on a modest work station in a few hours run time.

That's the problem right there: "feature run times that scale nearly linearly with the order of the digit desired".

This means each digit requires O(n(+)) work and to get N digits, you have at least O(N*n) = O(n^2) work.

For all interesting values of N, O(N^2) is not an interesting algorithm.

Terje

--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"
.



Relevant Pages