Re: PCA for looking at the impact from 1 variable



I think there is a problem, since one variable is strongly
interdependent with two of the others. It is this
interdependence that I would like to remove. It requires
some explanation: I have a regular (Manhattan-style) grid in
which the mesh size is varied, which gives two variables:
the pitch in the horizontal direction and the pitch in the
vertical direction. A third variable represents the ground
points in an electrical net. The spacing and placement of
these ground points depend strongly on the mesh size. Since
the ground-point variable is the dominant one, it affects
the apparent relation between the mesh sizes and the
measured output value. Is there anything I can do to remove
this interdependence?

Thanks
Daniel



"Scott " <millers@xxxxxxxxxxxx> wrote in message
<feri05$7cg$1@xxxxxxxxxxxxxxxxxx>...
"Daniel Andersson" <daan@xxxxxxxxxxxxxx> wrote in message
<feqqn3$1ub$1@xxxxxxxxxxxxxxxxxx>...
Hi

I have a data set (6 variables x 180 measured values) where
it is hard to sort out what impact a certain variable has
on the measured value. From what I understand, PCA gives me
(through the principal components) a value for the impact
that each variable has on the measured values.

This is not what PCA does. PCA computes the directions of
linear combinations of the variables that have the
greatest variability, and how large the variability is in
those directions. These linear combinations of the
variables are not the original variables, but could be
called "latent" variables. It is possible for a variable
to have an 'impact' on a measured value, but not to have
any effect on the PCA, if the relationship between the
variable and the measured value does not contain a linear
component.
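
To make the description above concrete, here is a minimal
sketch of PCA done by hand with base MATLAB functions. The
data are made up for illustration (3 variables, 180
observations), and the names X, Xc, V, latent and scores are
my own, not from the original post.

```matlab
% Illustrative data only: 180 observations of 3 made-up variables.
n = 180;
t = linspace(-1, 1, n)';
X = [t, 0.5*t + 0.1*sin(37*t), cos(11*t)];

% Remove the mean of each variable (column).
Xc = bsxfun(@minus, X, mean(X, 1));

% Covariance matrix, symmetrized to guard against round-off.
C = cov(Xc);
C = (C + C') / 2;

% Eigenvectors are the directions of the linear combinations,
% eigenvalues are the variances of the data along those directions.
[V, D] = eig(C);
[latent, order] = sort(diag(D), 'descend');
V = V(:, order);

% Scores: the data expressed in the latent variables.
scores = Xc * V;

disp(V(:, 1)');   % weights of the original variables in the first direction
disp(latent');    % variance along each direction, largest first
```

Note that the measured value never enters this calculation,
so nothing here says how any variable affects it.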

But how do I go from this value and then recreate the data
points without the influence of the other variables? In
other words, I am looking for a way to filter out the
information from the other variables and keep only the
relation between the variable of interest and the measured
values. Is this possible?

If what you are interested in is the linear relationship
of the original variables with the measured value, write a
linear regression equation relating the original variables
to the measured value. See "Programmatic Fitting" in the
MATLAB help, but use the method in the section "Linear
Model with Nonpolynomial Terms." Make sure you add in a
vector of ones to handle the constant offsets as described
in the help. The coefficients extracted are then the
individual influences, as you put it.
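
As a rough sketch of the regression approach described
above, the following fits a linear model with a column of
ones using the backslash operator. The data, the "true"
coefficients and the variable names are assumptions chosen
for illustration; substitute your own 180 x 6 matrix and
measured vector.

```matlab
% Illustrative data only: 180 observations of 3 made-up variables.
n = 180;
t = linspace(0, 1, n)';
X = [sin(7*t), cos(13*t), t];

% Made-up measured value with known coefficients, so the fit can be checked.
y = 2 + 3*X(:, 1) - 1*X(:, 2) + 0.5*X(:, 3);

% Design matrix: a column of ones for the constant offset, then the variables.
A = [ones(n, 1), X];

% Least-squares solution of A*b = y.
b = A \ y;

disp(b');   % b(1) is the offset, b(k+1) is the coefficient of variable k
```

Each coefficient is the change in the measured value per
unit change in that variable with the other variables in
the model held fixed. If two variables are strongly
interdependent, their coefficients can be poorly determined,
so it is worth checking how stable they are.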

It is possible to extract the same data using PCA. Make
sure you remove the means of the variables and the
measured value before taking the PCA. If the variable you
care about is variable k, such that it is the kth element,
then the kth element of each of the columns of the PCA
will be the projection of the kth variable onto that
latent variable. Adding these as Euclidean distances
(square root of the sum of the squares) over all the axes
can be related back to the linear relationship between the
kth variable and the measured output, but this is really
the long way around - you took the data apart with PCA,
then put it back together with Euclidean distances. It's
easier in my mind just to use regression.

Another approach is to use xcorr (with the 'coeff' option)
in the Signal Processing Toolbox to compute the
cross-correlation coefficient of the measured value with
the variable of interest, if by 'influence' you mean the
amount of linear variance the measured value shares with
the kth variable. Again, remove the means of the variables
and the measured value before calling xcorr, and read off
the value at zero lag. Recall that this coefficient is
signed.
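
A small sketch of what the xcorr approach computes at zero
lag, written out by hand so that it runs without the Signal
Processing Toolbox. The data and names are made up for
illustration. With the toolbox installed, I would expect
xcorr(xc, yc, 0, 'coeff') to return the same number, since
'coeff' is the normalization option that scales the
autocorrelations at zero lag to 1.

```matlab
% Illustrative data only.
n = 180;
t = linspace(0, 1, n)';
x = sin(7*t);                  % stand-in for variable k
y = -2*x + 0.3*cos(13*t);      % stand-in for the measured value

% Remove the means first.
xc = x - mean(x);
yc = y - mean(y);

% Zero-lag cross-correlation, normalized by the zero-lag autocorrelations.
r0 = sum(xc .* yc) / sqrt(sum(xc.^2) * sum(yc.^2));

disp(r0);   % negative here, because y decreases as x increases
```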

Finally, the standard built-in cov(kth variable, measured
value) will do this for you, removing the means
automatically. The (1,2) element of the output matrix (or
equivalently the (2,1) element, since the matrix is
symmetric) gives the amount of linear variance of the
measured value with the kth variable, once normalized by
sqrt((1,1)*(2,2)).
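
The normalization described above can be sketched as
follows. The data and names are made up for illustration. I
call cov on the two columns side by side, which for column
vectors gives the same 2-by-2 matrix as cov(x, y) in MATLAB.

```matlab
% Illustrative data only.
n = 180;
t = linspace(0, 1, n)';
x = sin(7*t);                  % stand-in for variable k
y = -2*x + 0.3*cos(13*t);      % stand-in for the measured value

% 2-by-2 covariance matrix; cov removes the means itself.
C = cov([x, y]);

% Normalize the off-diagonal element by the two variances.
r = C(1, 2) / sqrt(C(1, 1) * C(2, 2));

disp(C);
disp(r);    % signed, between -1 and 1
```

The result is the ordinary correlation coefficient, which
corrcoef(x, y) also returns as the off-diagonal element of
its output.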

Scott
