Re: multiple linear regression



On May 13, 3:34 pm, Ray Koopman <koop...@xxxxxx> wrote:
On May 13, 2:32 pm, JW <wallis...@xxxxxxxxxxx> wrote:

[...]
Isn't "centering" the first step in standardizing anyway? My
understanding of standardizing is that we subtract the mean and then
divide by the standard deviation. So subtracting the mean gives me -2
-1 0 1 2 for IV1 and -0.5 0.5 for IV2, and then I divide both by the
standard deviation.
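
A minimal sketch of the two steps described above, assuming IV1 takes the raw values 1 to 5 and IV2 is coded 0/1 (the raw values are not given in the thread; these are chosen only because they reproduce the centered scores quoted). The helper name `standardize` and the use of the sample standard deviation (`ddof=1`) are illustrative assumptions.

```python
import numpy as np

# Assumed raw values: chosen so that centering reproduces the scores
# quoted in the post (-2 -1 0 1 2 and -0.5 0.5).
iv1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
iv2 = np.array([0.0, 1.0])

def standardize(x, ddof=1):
    """Return (centered, standardized) versions of x.

    ddof=1 uses the sample standard deviation; that choice is an assumption.
    """
    centered = x - x.mean()              # step 1: subtract the mean
    return centered, centered / x.std(ddof=ddof)  # step 2: divide by the SD

iv1_c, iv1_z = standardize(iv1)
iv2_c, iv2_z = standardize(iv2)

print(iv1_c)
print(iv2_c)
print(iv1_z)
print(iv2_z)
```

Centering only shifts the scores; the division by the standard deviation is what rescales them.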

If I then multiply these scores to form the product term and put it into
my regression equation, I get sensible regression coefficients. But if I
also standardize the product term (by subtracting its mean and dividing
by its standard deviation), the significance values I get for the
regression coefficients look off. So I think I'm still doing something
wrong that I'm not seeing.

If your original model is y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + e,
and you change it to y = a0 + a1*x1 + a2*x2 + a3*((x1*x2 - m)/s) + e,
where m and s are any arbitrary constants,
your new results should be
a0 = b0 + b3*m,
a1 = b1,
a2 = b2,
a3 = b3*s.
Are you getting something else?
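
One way to check these relations numerically is sketched below. The data are simulated, and the generating coefficients, the sample size, and the helper `ols` are all illustrative assumptions, not anything from the thread. Here m and s are set to the mean and standard deviation of the product term, but any constants with s nonzero should behave the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated predictors and response (made-up values for illustration only).
x1 = rng.normal(size=n)
x2 = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + 0.7 * x1 * x2 + rng.normal(size=n)

prod = x1 * x2
m, s = prod.mean(), prod.std(ddof=1)   # arbitrary constants, s != 0

def ols(columns, y):
    """Least-squares coefficients with an intercept prepended."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b = ols([x1, x2, prod], y)             # original model: b0, b1, b2, b3
a = ols([x1, x2, (prod - m) / s], y)   # rescaled product term: a0, a1, a2, a3

# Relations stated above: a0 = b0 + b3*m, a1 = b1, a2 = b2, a3 = b3*s
expected = np.array([b[0] + b[3] * m, b[1], b[2], b[3] * s])
print(a)
print(expected)
print(np.allclose(a, expected))
```

The two fits span the same column space, so the fitted values are identical and only the intercept and the product-term coefficient are re-expressed.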

OK, so I rechecked all my code. I think I was changing too many things
at once and had screwed up exactly which variables I was and was not
standardizing. So yes, I can now confirm that is exactly what I get.

I have another question though :(

I've looked round on the web and from what I can find it seems
everybody uses dummy variables of 0 and 1. Why is this? If centering
is important to interpreting the interaction term, why don't people
use dummy variables of -1 and 1?