Re: race of truth ends tour.



LastToKnow wrote:
bjw@xxxxxxxxxxxxxxxxx wrote:

If you don't know the inputs but have multiple outputs, you can
do something like principal component analysis, or the aptitude-test
factor analysis outlined in the wikipedia page on factor analysis
that you helpfully pointed out. Here you have several measurements
for each subject and the goal (at least in PCA) is to reduce the
dimensionality of the data set. But you need more measurements
per subject than factors (hidden variables) otherwise the linear
model is underconstrained.

What you are saying about PCA shows a major misunderstanding of the
topic. PCA in the context of regression modeling is nothing more than
the following: 1) do an eigenanalysis of the matrix of explanatory variables,
2) rotate the equations to the new eigenspace and pick the n
eigenvectors with the largest eigenvalues as your new explanatory
variables (where n is lower than the original number of parameters to be
estimated), 3) do the statistical fitting as before in the new reduced
subspace. The point is that using eigenvectors ensures that all
parameter estimates are now statistically orthogonal, and using the ones
with the largest eigenvalues ensures the least addition of model bias in
the least squares sense.
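
For readers following along, the three steps quoted above can be sketched
roughly as below. This is only an illustration, not anyone's actual
analysis: the function name pcr_fit, the synthetic data, and the choice
to center the variables first are all assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Hypothetical helper: regression on the leading principal components."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # assumption: center the explanatory variables first

    # 1) eigenanalysis of the cross-product matrix of explanatory variables
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)

    # 2) keep the n eigenvectors with the largest eigenvalues and
    #    rotate the data into that reduced eigenspace
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order]
    Z = Xc @ V

    # 3) ordinary least squares in the reduced subspace
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)

    beta = V @ gamma  # coefficients mapped back to the original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept, Z

# Synthetic data (made up for illustration): the last two columns are
# nearly collinear, which is the situation where dropping a component helps.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 4] = X[:, 3] + 0.01 * rng.normal(size=200)
y = X @ np.array([1.0, -2.0, 0.5, 1.0, 1.0]) + 0.1 * rng.normal(size=200)

beta, intercept, Z = pcr_fit(X, y, n_components=4)
gram = Z.T @ Z  # diagonal up to rounding: the new regressors are orthogonal
```

Keeping all five components would just reproduce the ordinary least
squares fit; the reduction only happens when n is smaller than the
number of original variables.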

Maybe you thought I was saying something more profound
about PCA than I was? Here are a couple of examples.
Suppose each "subject" is a collection of a few hundred
or thousand measurements, such as rainfall in a given
month at several hundred positions on the globe, or
light intensity at several thousand wavelengths in the
spectrum of a galaxy:
http://trmm.jpl.nasa.gov/global/
http://arxiv.org/abs/astro-ph/0305587

You have a bunch of these collections, from different
months, or different galaxies. Each collection is a
vector in a few hundred or thousand dimensional space.
If you do a PCA on the ensemble of vectors, you can (in
either of these problems) find a small number of
eigenvectors that account for most of the variance, so
each vector can be approximated by a linear
combination of some small number n of eigenvectors
(similar to what you described), rather than hundreds
or thousands of numbers. This is what I meant by
"reduce the dimensionality." One of the utilities of
this is that the small number of important eigenvectors
often have a physical meaning which helps you
understand the problem, if you set it up right.
Not that I'm claiming to know how to set this up
for analyzing time trials.
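
A minimal sketch of that kind of dimensionality reduction, assuming
NumPy and using made-up data rather than the rainfall maps or galaxy
spectra linked above. The function name pca_reduce and the sizes (50
"subjects", 1000 measurements each, 3 hidden patterns) are hypothetical,
and the SVD is used here as one standard way to get the eigenvectors.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Hypothetical helper: approximate each row by a few eigenvectors."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of Vt are the eigenvectors of the covariance matrix;
    # the squared singular values are proportional to the variances.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:n_components]
    coeffs = centered @ components.T  # n numbers per subject
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return mean, components, coeffs, explained

# Synthetic ensemble (an assumption for illustration only): each vector
# is a mix of 3 hidden patterns plus a little noise.
rng = np.random.default_rng(1)
patterns = rng.normal(size=(3, 1000))
weights = rng.normal(size=(50, 3))
data = weights @ patterns + 0.05 * rng.normal(size=(50, 1000))

mean, components, coeffs, explained = pca_reduce(data, n_components=3)

# Each 1000-number vector is now approximated by 3 coefficients.
approx = mean + coeffs @ components
rel_error = np.linalg.norm(approx - data) / np.linalg.norm(data)
```

Here explained is the fraction of the total variance carried by the
kept eigenvectors, and rel_error shows how little is lost by storing
3 numbers per vector instead of 1000.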

But as you said, we are digressing, so I'll
shut up now.
