Re: Can't perform PCA



Dave Krebs wrote:
My data set is too large for me to use the built-in Matlab functions such as "processpca" to compute principal components due to memory limitations.

Does anyone have an m-file that performs PCA iteratively, returning one component at a time, or any algorithm that allows PCA to be performed in some way without computing the entire covariance matrix?

Dave, do you have access to the Statistics Toolbox? There is no good reason to compute the cov matrix, and at least two reasons not to. The PRINCOMP function computes a PCA directly from the data. If your data are wide, you'll want to use the 'econ' flag:

>> help princomp
PRINCOMP Principal Components Analysis.
[snip]
[...] = PRINCOMP(X,'econ') returns only the elements of LATENT that are
not necessarily zero, i.e., when N <= P, only the first N-1, and the
corresponding columns of COEFF and SCORE. This can be significantly
faster when P >> N.

Since you mention computing the cov matrix as the problem, I'm guessing your data _are_ wide.
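
For illustration, here is roughly what that call looks like on wide data. This is only a sketch: the matrix X and its dimensions are made up, and the SVD lines at the end are my own toolbox-free equivalent (matching PRINCOMP only up to the sign of each component), not something taken from the PRINCOMP help.

```matlab
% Made-up wide data: N = 50 observations, P = 2000 variables.
X = randn(50, 2000);

% Economy-size PCA straight from the data; no P-by-P covariance matrix
% is ever formed. Per the help text above, when N <= P only the first
% N-1 components are returned.
[coeff, score, latent] = princomp(X, 'econ');

% Assumed equivalent without the Statistics Toolbox: economy-size SVD
% of the column-centered data.
n  = size(X, 1);
Xc = bsxfun(@minus, X, mean(X, 1));    % center each column
[U, S, V] = svd(Xc, 'econ');           % V has at most N columns
coeffSvd  = V;                         % loadings (sign may differ from coeff)
scoreSvd  = U * S;                     % component scores
latentSvd = diag(S).^2 / (n - 1);      % component variances
```

The economy-size SVD keeps one extra component whose variance is numerically zero, because centering removes one degree of freedom. In newer releases PRINCOMP has been superseded by PCA, so check which one your version provides.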

That being said, there are any number of implementations of the NIPALS algorithm in MATLAB out there; that may be what you're looking for, but I'd look at PRINCOMP first.
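
In case it helps to see what such an implementation involves, below is a minimal sketch of NIPALS that extracts one component at a time. It is not any particular published m-file; the function name nipals_pca and its arguments (k, tol, maxIter) are hypothetical, and it assumes X fits in memory and that k does not exceed the rank of the centered data.

```matlab
function [T, P] = nipals_pca(X, k, tol, maxIter)
% NIPALS_PCA  Sketch of NIPALS PCA (hypothetical helper, save as nipals_pca.m).
%   X: N-by-P data, k: number of components to extract,
%   tol: relative convergence tolerance, maxIter: iteration cap per component.
%   T: N-by-k scores, P: P-by-k unit-length loadings.
X = bsxfun(@minus, X, mean(X, 1));     % center each column
[n, p] = size(X);
T = zeros(n, k);
P = zeros(p, k);
for a = 1:k
    % Start from the column with the largest remaining sum of squares.
    [~, j] = max(sum(X.^2, 1));
    t = X(:, j);
    for it = 1:maxIter
        pvec = X' * t / (t' * t);      % regress columns of X on the score
        pvec = pvec / norm(pvec);      % normalize the loading vector
        tNew = X * pvec;               % updated score vector
        converged = norm(tNew - t) <= tol * norm(tNew);
        t = tNew;
        if converged
            break;
        end
    end
    T(:, a) = t;
    P(:, a) = pvec;
    X = X - t * pvec';                 % deflate: remove this component
end
end
```

A call such as [T, P] = nipals_pca(X, 3, 1e-8, 500) would return the first three components, with variances given by sum(T.^2, 1) / (size(X, 1) - 1). Only matrix-vector products with X are needed, so the covariance matrix is never built, but convergence can be slow when neighboring eigenvalues are close.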

Hope this helps.

- Peter Perkins
The MathWorks, Inc.