Re: ML matrices multiplication accuracy
- From: "Tim Davis" <davis@xxxxxxxxxxxx>
- Date: Sun, 24 Jun 2007 03:07:35 -0400
For dense matrix multiplication, MATLAB uses the BLAS. Specifically,
it uses optimized BLAS depending on the architecture (the Intel MKL
for example). They are fairly accurate, but do not use the
algorithms stated above. They use block matrix algorithms that
decompose the matrix multiplication into a set of block submatrices.
The inner kernel might be something like A = A + B*C for example
where B and C are 2-by-2 matrices (I'm making this up, but that might
be a kernel they use; they can use many kernels). A kernel like that
would exploit the SSE2 instruction set. Larger blocks are used as
well for other levels of cache.
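The blocking idea described above can be sketched in a few lines. This is a toy NumPy illustration of decomposing C = A*B into small block submatrices, not the real BLAS (whose kernels are hand-tuned, often in assembly); the block size `bs` is arbitrary here.

```python
import numpy as np

def blocked_matmul(A, B, bs=2):
    """Dense matrix multiply via small block submatrices.
    Loosely mimics how an optimized BLAS decomposes DGEMM:
    the inner kernel is C_block += A_block * B_block."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must agree"
    C = np.zeros((m, n))
    for i in range(0, m, bs):
        for j in range(0, n, bs):
            for l in range(0, k, bs):
                # inner kernel: a small-by-small block product
                C[i:i+bs, j:j+bs] += A[i:i+bs, l:l+bs] @ B[l:l+bs, j:j+bs]
    return C
```

The point of the blocking is locality: each small block fits in registers or cache, and a real BLAS tunes the block sizes to each cache level.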
If you want to see source code for this, see the ATLAS BLAS, and the
DGEMM function in particular. The code is fairly lengthy, as any
optimized BLAS package would be.
The differences have nothing to do with the compiler. It's how the
BLAS are written ... and the innermost kernels of those are typically
written in assembly language.
If you want to see how sparse-times-sparse is done, see the SSMULT
package in the File Exchange. That's not the code that is in MATLAB,
but it gets the same results as MATLAB does (down to the last bit).
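The standard approach for sparse-times-sparse, which SSMULT-style codes use, builds C = A*B one column at a time: column C(:,j) is the sum of A(:,k)*B(k,j) over the nonzeros of B(:,j). Here is a hedged Python sketch of that method on compressed-sparse-column (CSC) arrays; it is a hypothetical re-implementation for illustration, not the SSMULT or MATLAB source.

```python
def sparse_multiply(Ap, Ai, Ax, Bp, Bi, Bx):
    """C = A*B for matrices in CSC form (column pointers Ap/Bp,
    row indices Ai/Bi, values Ax/Bx). Each column of C is
    accumulated as a sparse linear combination of columns of A."""
    Cp, Ci, Cx = [0], [], []
    n = len(Bp) - 1                    # number of columns of B (and C)
    for j in range(n):
        work = {}                      # accumulator for column j of C
        for p in range(Bp[j], Bp[j + 1]):
            k, bkj = Bi[p], Bx[p]
            # add bkj * A(:,k) into the accumulator
            for q in range(Ap[k], Ap[k + 1]):
                work[Ai[q]] = work.get(Ai[q], 0.0) + Ax[q] * bkj
        for i in sorted(work):         # emit column j in row order
            Ci.append(i)
            Cx.append(work[i])
        Cp.append(len(Ci))
    return Cp, Ci, Cx
```

Because each entry of C is summed in a fixed order determined by the sparsity pattern, two codes that traverse the pattern the same way produce bit-identical results, which is consistent with the bit-for-bit agreement described above.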
If you want to see how sparse-times-full is done, see the cs_gaxpy
function in CSparse, which is also *not* what's in MATLAB, but that
code gets the same bit results as MATLAB.
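For a feel of what a gaxpy-style sparse-times-full kernel looks like, here is a short Python sketch of y = A*x + y with A stored in compressed-sparse-column (CSC) form, in the spirit of CSparse's cs_gaxpy. This is an illustrative re-implementation, not the CSparse source.

```python
def cs_gaxpy(Ap, Ai, Ax, x, y):
    """y = A*x + y for A in CSC form.
    Ap: column pointers, Ai: row indices, Ax: numerical values.
    Walks each column j of A once, scattering Ax[p]*x[j] into y."""
    n = len(Ap) - 1                    # number of columns of A
    for j in range(n):
        for p in range(Ap[j], Ap[j + 1]):
            y[Ai[p]] += Ax[p] * x[j]
    return y
```

Note there is no small-block kernel here: the irregular sparsity pattern dictates the access order, which is one reason sparse kernels cannot use the SSE2-style register blocking that dense BLAS kernels exploit.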
To get a better handle on the error, you should divide your error by
the norm of the result. Also ... why do you say that MATLAB's error
is 1e-12? Are you assuming that yours is the correct answer (it
isn't) and comparing your result with MATLAB's? The norm of the
result will be around n, so the true relative error is about 1e-15.
When I do C=A*B and D=mult(A,B), then norm(C-D)/norm(C) is 5e-17.
That is, I get a relative "error" of around machine epsilon.
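The experiment above is easy to reproduce. Below is a NumPy version, where `mult` stands in for the poster's mult.m (assumed to be a plain triple-loop product; the original file is not shown), and `A @ B` plays the role of MATLAB's BLAS-backed A*B:

```python
import numpy as np

def mult(A, B):
    """Naive triple-loop product (assumed stand-in for mult.m)."""
    m, k = A.shape
    n = B.shape[1]
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            s = 0.0
            for l in range(k):
                s += A[i, l] * B[l, j]
            C[i, j] = s
    return C

rng = np.random.default_rng(0)
A = rng.random((50, 50))
B = rng.random((50, 50))
C = A @ B                  # BLAS-backed, like MATLAB's A*B
D = mult(A, B)             # same products, different summation order
rel = np.linalg.norm(C - D) / np.linalg.norm(C)
# rel is on the order of machine epsilon, because the two codes
# merely accumulate the same terms in different orders.
```

Dividing by norm(C) is the key step: the raw norm(C-D) grows with n, while the relative error stays near machine epsilon.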
If I do E=sparse(A)*sparse(B), I get norm(C-full(E))/norm(C) of zero.
This is because sparse-times-sparse can't use the same 2-by-2,
4-by-2, or whatever small-by-small blocking that the BLAS can use.
It doesn't use the mult.m method, though, which would be far too slow
in the sparse case.
Note that I put "error" in quotes, since *neither* C *nor* D is the
"true" result. However, if they agree to within a relative error of
machine epsilon, you should be content that MATLAB is giving you an
accurate answer.
So no, MATLAB is not sacrificing quality for speed. The MathWorks
would never do that.