Re: multicollinearity in regression
- From: "Anon." <bob.ohara@xxxxxxxxxxxxxxxxx>
- Date: Sun, 26 Mar 2006 11:23:19 +0300
Paul wrote:
> Hi,
> After reading numerous articles on the web, I have a couple of
> questions.

As Reef Fish is being his normal helpful self, I'll try and make some sensible suggestions.

> 1. How do I enter 5 continuous control variables such as LogSize in
> regression?
> Note I also have 2 continuous independent variables which are not
> control variables but on which hypotheses are based. I also have 3
> categorical independents which have 2 categories on which hypotheses
> are based.
> I could use Analysis of Covariance but 2 of the independent variables
> are also continuous.
> Do I just enter all the variables as Independents into the regression
> using Enter method (which is what I have done)?

I don't know what package you're using to do the regression, but you can simply fit the model as a multiple regression with all of the variables in.
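
For anyone wanting to see what "all of the variables in" looks like in practice, here is a minimal sketch in Python with NumPy. The data are simulated and every name (controls, predictors, factors, true_beta) is hypothetical; the only point is that continuous controls, continuous predictors and two-level factors (coded 0/1) all go into one design matrix and are fitted together.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated stand-ins for the variables described in the question (assumed, not real data)
controls = rng.normal(size=(n, 5))                        # 5 continuous control variables
predictors = rng.normal(size=(n, 2))                      # 2 continuous hypothesis variables
factors = rng.integers(0, 2, size=(n, 3)).astype(float)   # 3 two-level factors, 0/1 coded

# One design matrix: intercept column plus every covariate, entered simultaneously
X = np.column_stack([np.ones(n), controls, predictors, factors])

# Arbitrary "true" coefficients, used only to generate a response for the demo
true_beta = np.arange(1, X.shape[1] + 1, dtype=float)
y = X @ true_beta + rng.normal(scale=0.5, size=n)

# Ordinary least squares fit of the full model
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))
```

A two-level factor needs only a single 0/1 column, so nothing special is required beyond the coding.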
> 2. The multicollinearity diagnostics indicate that the highest VIF is
> 2.3 which is less than the rule of thumb value of 4 or 2.5 that I have
> seen mentioned. However, the Condition Index is 67.899 and seems to be
> related to LOGSIZE variable which has a Variance Proportion of .99. The
> other variable with a high Variance Proportion (.99) is the Constant.

I must admit that I don't totally understand this, but I assume that this is suggesting that LOGSIZE is collinear with another variable (or a combination of variables). It may be that you can find out what's going on by making pairwise plots of the covariates, and this will guide you to seeing what to do.
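
One pattern that would be consistent with a low VIF alongside a high condition index shared between a covariate and the Constant is a covariate that sits a long way from zero with little spread. The sketch below illustrates that on simulated data; it is an assumption about what might be happening, not a diagnosis of the actual data set. The helper names (vifs, condition_indices) are hypothetical. VIFs are taken from the diagonal of the inverse correlation matrix of the covariates, and condition indices from the singular values of the design matrix (intercept included) after scaling each column to unit length.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated covariates: one far from zero with small spread, one centred near zero
log_size = rng.normal(loc=10.0, scale=0.3, size=n)
other = rng.normal(size=n)
Z = np.column_stack([log_size, other])

def vifs(Z):
    """VIF of each covariate: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(Z, rowvar=False)
    return np.diag(np.linalg.inv(R))

def condition_indices(X):
    """Largest singular value over each singular value, columns scaled to unit length."""
    Xs = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)
    return s.max() / s

X = np.column_stack([np.ones(n), Z])
vif = vifs(Z)
ci = condition_indices(X)

# Same diagnostics after centring the covariates on their means
Zc = Z - Z.mean(axis=0)
ci_centered = condition_indices(np.column_stack([np.ones(n), Zc]))

print("VIFs:", np.round(vif, 3))
print("max condition index, raw:", round(float(ci.max()), 1))
print("max condition index, centred:", round(float(ci_centered.max()), 2))
```

In this simulation the covariates are almost uncorrelated with each other, so the VIFs stay near 1, while the uncentred covariate is nearly proportional to the intercept column and drives the condition index up. Centring removes that.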
> If I remove the variable LOGSIZE, then the coefficient of the Constant
> is reduced from -33 to
> -.44. None of the signs of the other coefficients are changed although
> one Independent variable is now significant which wasn't previously.

The change in the coefficient of the Constant isn't surprising, especially if some of the covariates are distributed a long way from zero.

I'm guessing that in the model with LOGSIZE, the LOGSIZE coefficient is pretty small, true? Oh, and the independent variable that has become significant: how much did the coefficient change? It's possible that it only moved a bit, from being just non-significant to being just significant.
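
The remark about the Constant can be illustrated with a small simulation. This is only a sketch on made-up data (the coefficients -33 and 3.2 and the helper ols are hypothetical choices for the demo): when a covariate lies far from zero, the intercept is an extrapolation back to zero and can be large, whereas centring the covariate leaves the slope unchanged and makes the intercept equal to the mean response.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Simulated covariate far from zero, and a response generated from assumed coefficients
x = rng.normal(loc=10.0, scale=0.3, size=n)
y = -33.0 + 3.2 * x + rng.normal(scale=0.2, size=n)

def ols(x, y):
    """Least-squares fit of y on an intercept and a single covariate."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_raw = ols(x, y)                   # intercept is the prediction at x = 0
b_centered = ols(x - x.mean(), y)   # intercept is the prediction at the mean of x

print("raw      intercept, slope:", np.round(b_raw, 2))
print("centred  intercept, slope:", np.round(b_centered, 2))
```

The slope is the same in both fits; only the meaning (and size) of the intercept changes.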
> So should I just report the 2 regression models, one with LOGSIZE
> included and one with it excluded?

I think you should try and understand why there seems to be multicollinearity: it may be that you can then see a sensible approach (i.e. one based on the substantive problem, not just a set of numbers). When I get problems like these, I try to report one analysis, and make a comment along the lines of "...if we include factor X, we get similar results".
There are people on this list who have a much better understanding of multicollinearity than I do, so hopefully they'll chime in with some sensible advice as well.
Bob
--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland
Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org