Re: race of truth ends tour.
- From: "bjw@xxxxxxxxxxxxxxxxx" <bjw@xxxxxxxxxxxxxxxxx>
- Date: 10 Jul 2006 22:31:06 -0700
LastToKnow wrote:
bjw@xxxxxxxxxxxxxxxxx wrote:
If you don't know the inputs but have multiple outputs, you can
do something like principal component analysis, or the aptitude-test
factor analysis outlined in the Wikipedia page on factor analysis
that you helpfully pointed out. Here you have several measurements
for each subject and the goal (at least in PCA) is to reduce the
dimensionality of the data set. But you need more measurements
per subject than factors (hidden variables); otherwise the linear
model is underconstrained.
What you are saying about PCA shows a major misunderstanding of the
topic. PCA in the context of regression modeling is nothing more than
the following: 1) do an eigenanalysis of the matrix of explanatory variables,
2) rotate the equations to the new eigenspace and pick the n
eigenvectors with the largest eigenvalues as your new explanatory
variables (where n is lower than the original number of parameters to be
estimated), 3) do the statistical fitting as before in the new reduced
subspace. The point is that using eigenvectors ensures that all
parameter estimates are now statistically orthogonal, and using the ones
with the largest eigenvalues ensures the least addition of model bias in
the least squares sense.
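The three steps just listed can be sketched in a few lines of NumPy. This is
only an illustration under stated assumptions: the function name pcr_fit and
its arguments are made up for the example, and the eigenanalysis is done on
the sample covariance matrix of the centered explanatory variables, which is
one common reading of "matrix of explanatory variables".

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression sketch (hypothetical helper).

    X: (n_samples, n_features) explanatory variables
    y: (n_samples,) response
    Returns coefficients in the original variables, the intercept,
    and the scores Z in the reduced eigenspace.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # 1) Eigenanalysis of the sample covariance of the explanatory variables.
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues

    # 2) Keep the n eigenvectors with the largest eigenvalues and rotate.
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order]
    Z = Xc @ V  # new explanatory variables, mutually uncorrelated

    # 3) Ordinary least squares in the reduced subspace.
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)

    # Map the fitted coefficients back to the original variables.
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept, Z
```

With n_components equal to the number of columns of X this reduces to an
ordinary least squares fit; choosing fewer components trades some bias for
lower variance in the estimates.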
Maybe you thought I was saying something more profound
about PCA than I was? Here are a couple of examples.
Suppose each "subject" is a collection of a few hundred
or thousand measurements, such as rainfall in a given
month at several hundred positions on the globe, or
light intensity at several thousand wavelengths in the
spectrum of a galaxy:
http://trmm.jpl.nasa.gov/global/
http://arxiv.org/abs/astro-ph/0305587
You have a bunch of these collections, from different
months, or different galaxies. Each collection is a
vector in a few hundred or thousand dimensional space.
If you do a PCA on the ensemble of vectors, you can (in
either of these problems) find a small number of
eigenvectors that account for most of the variance, so
each vector can be approximated by a linear
combination of a small number n of eigenvectors
(similar to what you described), rather than hundreds
or thousands of numbers. This is what I meant by
"reduce the dimensionality." One of the utilities of
this is that the small number of important eigenvectors
often have a physical meaning which helps you
understand the problem, if you set it up right.
Not that I'm claiming to know how to set this up
for analyzing time trials.
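For what it's worth, here is a minimal sketch of that kind of dimensionality
reduction. Everything in it is an assumption made for illustration: the
helper names pca_reduce and reconstruct are hypothetical, and the "spectra"
are synthetic vectors built from three hidden patterns plus a little noise,
not real rainfall or galaxy data.

```python
import numpy as np

def pca_reduce(data, n_components):
    """PCA of an ensemble of vectors (one vector per row), via the SVD."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the eigenvectors of the covariance matrix, and the
    # squared singular values are proportional to its eigenvalues.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    coeffs = centered @ components.T  # n numbers per vector
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return mean, components, coeffs, explained

def reconstruct(mean, components, coeffs):
    """Approximate each vector as mean + linear combination of eigenvectors."""
    return mean + coeffs @ components

# Synthetic ensemble: 200 "spectra", each with 1000 "wavelengths".
rng = np.random.default_rng(0)
patterns = rng.normal(size=(3, 1000))  # hidden patterns (made up)
weights = rng.normal(size=(200, 3))    # how much of each pattern per spectrum
data = weights @ patterns + 0.01 * rng.normal(size=(200, 1000))

mean, components, coeffs, explained = pca_reduce(data, 3)
approx = reconstruct(mean, components, coeffs)
print("fraction of variance in 3 components:", explained.sum())
print("largest reconstruction error:", np.abs(approx - data).max())
```

Each 1000-number vector is now summarized by 3 coefficients. Whether the
retained eigenvectors mean anything physically depends on how the problem
is set up.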
But as you said, we are digressing, so I'll
shut up now.