Re: Collinearity, confidence intervals and sampling




"Richard Ulrich" <Rich.Ulrich@xxxxxxxxxxx> wrote in message news:qghn24hqmfs20su188b7iaku2od617323d@xxxxxxxxxx
On Wed, 14 May 2008 21:43:06 +0100, "reflex" <sdfs@xxxxxxxxx> wrote:

OK, I've been doing some more research on this whole collinearity thing and
read that if you have collinear variables, the best-fitting plane of the
data points in a regression will be narrower and less anchored (because the
predictors are highly correlated, so the predictor values fall nearly in a
straight line). Consequently, if the response varied from sample to sample,
the coefficients could change substantially. Therefore the standard errors of
the coefficients are necessarily larger.
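
The claim above can be checked with a small simulation. This is only a sketch under assumptions that are not in the thread: two standard-normal predictors with correlation rho, a made-up model y = 1 + 2*x1 + 3*x2 + noise, and a hypothetical helper named coef_sd. It refits ordinary least squares on many fresh samples and reports how much the estimated x1 coefficient moves around.

```python
import numpy as np

def coef_sd(rho, n=100, reps=2000, seed=0):
    """Empirical SD of the OLS coefficient on x1 across repeated samples.

    rho is the assumed correlation between the two predictors.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    slopes = np.empty(reps)
    for i in range(reps):
        # Draw a fresh sample of two correlated predictors.
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        # Illustrative true model (an assumption): intercept 1, slopes 2 and 3.
        y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=n)
        design = np.column_stack([np.ones(n), X])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        slopes[i] = beta[1]  # coefficient on x1
    return slopes.std(ddof=1)

sd_low = coef_sd(rho=0.0)    # uncorrelated predictors
sd_high = coef_sd(rho=0.95)  # highly collinear predictors
print(sd_low, sd_high, sd_high / sd_low)
```

For unit-variance predictors the sampling variance of the coefficient is roughly sigma^2 / (n * (1 - rho^2)), so with rho = 0.95 the spread should come out around three times larger than with rho = 0.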

Does this mean that this is not a problem if you have population level data
(ie sample size doesn't matter because you have 'sampled' the entire
population you are interested in)?

That's a slightly-true observation, with no real application.

With true Population level data, you might have "measurement
error" but you have no "statistical error." This is like the
results of taking a vote, as compared to taking an opinion poll.
("Recounts" are used to reduce "measurement error" in votes.)
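
The vote-versus-poll contrast can be illustrated numerically. This is a rough sketch, assuming a made-up finite population of 500 units (the size, the data-generating model, and the function names are all hypothetical). The slope computed on the whole population is a single fixed number, while slopes computed on random subsets scatter around it, and the scatter shrinks to essentially nothing as the sample approaches a full census.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500  # hypothetical finite population size
x = rng.normal(size=N)
y = 2.0 * x + rng.normal(size=N)  # illustrative relationship, an assumption

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    return np.polyfit(xs, ys, 1)[0]

# The "vote": one fixed descriptive number for the whole population.
census_slope = slope(x, y)

def sample_slope_sd(n, reps=1000):
    """SD of the slope over repeated samples of size n drawn without replacement."""
    out = np.empty(reps)
    for i in range(reps):
        idx = rng.choice(N, size=n, replace=False)
        out[i] = slope(x[idx], y[idx])
    return out.std(ddof=1)

sd_small = sample_slope_sd(50)    # an "opinion poll" of 10% of the population
sd_large = sample_slope_sd(450)   # 90% of the population
sd_census = sample_slope_sd(N)    # every unit, every time
print(census_slope, sd_small, sd_large, sd_census)
```

The census case leaves only floating-point noise, which is the sense in which there is no "statistical error" once the population itself is the target.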

With true population data, or data treated as such, you have no
role for inference or generalization or the direct application
of science; you have an administrative tool.

Basically - If you are hoping to say anything interesting to
almost anybody else, you are treating some "population" as
a sample. So, unless there is special reason, you never will
treat a population as a "population."

If you want more discussion, you might Google-groups and
look at threads found by
< groups:sci.stat.* "finite population" author:ulrich >


Are there other effects of collinearity that do not matter if you have
population level data? What about other assumptions of regression, e.g.
normal distribution of variables, homoskedasticity?

The website I've been looking at is
http://www.stat.psu.edu/~jglenn/stat501/12multicollinearity/04multico_corr.html
which is an excellent source on collinearity.

As always, any replies much appreciated.

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html

Say you had data on every hospital in England, and you wanted to say something interesting about all hospitals in England. Then you wouldn't need to generalise to a wider population, because you already know the whole population. Surely that's a real application?

Cheers

