Re: ANOVA QUESTION - Terminology




Bruce Weaver wrote:
Reef Fish wrote:
Bruce Weaver wrote:
jp wrote:
I have looked at an intro to biostatistics book.

I know two groups is t-test and more than two is ANOVA. I'm sorry this
ANOVA can also be done with two groups. The F-ratio from the ANOVA
equals the square of the t-ratio for the same data.
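
A minimal sketch of that equality for the two-group case, assuming made-up scores and a pooled-variance t-test. The data and the helper names `pooled_t` and `oneway_f` are illustrative assumptions, not anything from the thread.

```python
import numpy as np

# Made-up scores for two independent groups (illustrative only).
g1 = np.array([4.0, 6.0, 5.0, 7.0, 8.0])
g2 = np.array([7.0, 9.0, 8.0, 10.0, 12.0, 11.0])

def pooled_t(a, b):
    """Independent-groups t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

def oneway_f(*groups):
    """One-way ANOVA F statistic: MS_between / MS_within."""
    all_y = np.concatenate(groups)
    grand = all_y.mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_y) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

t = pooled_t(g1, g2)
F = oneway_f(g1, g2)
print(t ** 2, F)  # the two numbers agree up to rounding
```

The agreement depends on the t-test pooling the two variances; a Welch (unequal-variance) t-test would not reproduce the ANOVA F in general.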

I actually saw this reply while I was finishing up what I said to jp, and I
stand by what I said there about Richard Ulrich's deficiency in the
material to advise anyone on the subject of ANOVA along the line of
Linear Models because he doesn't KNOW IT.

So, here are just a couple of short comments on your suggestions.

The fact that ONE of the T-tests is a special case of ANOVA is
standard material in the Linear Models treatment of ANOVA.
What you mentioned is NOT a general result in ANOVA. The
t-sq equals F is true ONLY in a SIMPLE regression problem
(which is what a T-test can be set up as), and no other ANOVA
problem in comparing means.


I thought it was abundantly clear from the context of jp's immediately
preceding statement that I was talking about the F-ratio from a one-way
(independent groups) ANOVA and the t-ratio from an independent groups
t-test (with both tests performed on the same data). If that was not
clear, I apologize.

It was quite clear to ME, but for someone like jp who is NOT familiar
with either a T test or ANOVA and especially a Linear Model approach,
what you told him is unlikely to be illuminating. To those who already
knew what you're trying to say, it was an oversimplification.


Of course, if you square the t-ratio from a *paired* t-test, you get the
F-ratio from a (single-factor) repeated measures ANOVA on the same data.
So there are by my count two ANOVA problems for which t-squared = F. ;-)

But that's the SAME problem, only parametrized differently.


In that particular case, the equivalence is to combine the data
in BOTH groups into Y and regress it in a SIMPLE regression
with the Indicator variable (X = 1 if the data is in group 1;
X = 0 otherwise).
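
A rough sketch of that setup, assuming made-up data: stack both groups into Y, regress on the 0/1 indicator X, and compare the t statistic of the slope with the pooled two-sample t. Variable names are illustrative assumptions.

```python
import numpy as np

# Made-up scores for two independent groups (illustrative only).
g1 = np.array([4.0, 6.0, 5.0, 7.0, 8.0])
g2 = np.array([7.0, 9.0, 8.0, 10.0, 12.0, 11.0])

# Combine both groups into Y; X = 1 if the observation is in group 1, else 0.
y = np.concatenate([g1, g2])
x = np.concatenate([np.ones(len(g1)), np.zeros(len(g2))])

# Simple regression Y = b0 + b1*X fitted by least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
df_resid = len(y) - X.shape[1]
sigma2 = resid @ resid / df_resid              # residual mean square
cov_beta = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the estimates
t_slope = beta[1] / np.sqrt(cov_beta[1, 1])

# Pooled-variance two-sample t statistic for comparison.
n1, n2 = len(g1), len(g2)
sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
t_pooled = (g1.mean() - g2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

print(beta[1], g1.mean() - g2.mean())  # slope is the difference in group means
print(t_slope, t_pooled)               # same t statistic
```

With this coding the intercept estimates the mean of the group coded 0, and the slope estimates the difference between the two group means.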

sounds so basic, but I didn't find any example of this in the book.
Most examples of ANOVA are 3 groups of race (Chinese, White, Black)
with one dependent variable (IQ). I can see how this is a one-way as
there is one independent and one dependent. When repeated measures is
described in the book, the ANOVA now changes to something like
monitoring heart rate at several times during biking. A two-way is
described as having two independent variables, say gender and race in
which IQ is again, the dependent variable.

The question I have is what would the analysis be for what I was
suggesting above. To me you could treat it as a two-way in that Group
(A,B,C) and Test# (Pre Post) could serve as two independent variables,
but to me I thought it sounded like a repeated measures because the
Test Score was the dependent variable measured at two time points.
There is no example like this in my book, and perhaps I do not understand
it. I would imagine this newsgroup is for individuals of all learning
levels?

To me, this sounds like a two-way design, but I am confused because it
is more like a one-way repeated measures since the dependent variable
is measured twice and there is the independent variable Group. I would
really like to understand this distinction.
It is a two-way design. Nowadays, some people do call it a repeated
measures design, probably because the procedure they use in their stats
package is found under "GLM->Repeated Measures", or something similar.

I consider that not the "optimal" advice, a term I mimicked from Kevin
Thorpe, a very competent BIOstatistician in the sci.stat.math group.

It is NEVER a good idea to learn statistics from a computer manual,
ANY computer manual. The place to learn it is from a good textbook,
and THEN one would automatically understand what's RIGHT and
what's WRONG in the computer software manuals.

It was not meant as advice, Bob, nor was I condoning that reason for
calling split-plot designs "repeated measures" designs. It
was a guess as to why so many people *do* call them repeated measures
designs.



In textbooks, however, it would more likely be described as a split-plot
design, a between-within design, or a mixed design. Group is a
between-subjects factor, and Time (pre vs post) is a within-subjects (or
repeated measures) factor. (How you label it no doubt varies by area of
research.)

Now you're telling a baby how to FLY before he learns how to walk. :-)

I was just responding to jp's request for some terminology to describe
the design.

That's exactly what I meant when I said talking aerodynamics to a baby
who is still crawling on the floor. ;'^)

An analogy I've used on Jack Tomsky referring Afonso to Lehmann's
book when Afonso's statistical knowledge is BELOW that of an
average freshman in college, and Lehmann's book is at the graduate
level, completely beyond the comprehension of Afonso and most
people in that group. I would say YOUR mention of all these designs
are PROBABLY beyond the comprehension of almost everybody in
THIS group. However, those topics are systematically and
correctly treated in the Applied Linear Models book I recommended
to jp to learn them. They are elementary, and at the undergrad
level -- but NOT having been learned by most of the statisticians in
these groups, I am sorry to say.
The design has 3 F-tests: Main effect of Group, main effect of Time, and
the Group x Time interaction. The null hypothesis for the interaction
is that change is equivalent in all of the groups. The interaction
F-test is equivalent to an independent groups t-test on the change scores.

Really?


Yes, when there are two groups. (I was thinking of jp's original
example where the group variable was male vs female.) It is clearly not
so when there are 3 or more groups.

Again, I KNEW what you were thinking. But jp did talk about Asians,
Blacks, and whatnot in THREE Groups, and you were delving into
even much more complicated terminology and methodology than
having only three groups.


Even at this ambiguous global level, your statement is as
misleading to one (me) who knows the material as to anyone who
DOESN'T know the material YET.

What is ambiguous or misleading about it? Perhaps it was not clear that
I was referring to the two-group case given in the original example.

About the notion of GROUP (a set of indicators instead of 1 indicator),
and the notion of interaction -- I did elaborate on that in the
paragraph below yours.

For the two-group case, if you compute a change score for each person
(post-pre), perform an independent groups t-test on those change scores,
and then square the t-ratio, you get the F-ratio for the Group x Time
interaction term in a split-plot ANOVA on the same data. I thought this
was well-known.
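
A minimal sketch of this claim, assuming two groups of equal size and made-up pre/post scores. The function name `interaction_f` and the data are illustrative assumptions; the sums of squares are the usual balanced split-plot ones, computed by hand.

```python
import numpy as np

# Made-up data: rows are subjects, columns are (pre, post).
group_a = np.array([[10., 12.], [11., 15.], [9., 10.], [12., 15.], [10., 13.]])
group_b = np.array([[10., 11.], [12., 12.], [9., 11.], [11., 11.], [13., 14.]])
groups = [group_a, group_b]

def interaction_f(groups):
    """Group x Time F from a split-plot ANOVA (equal group sizes assumed)."""
    all_y = np.vstack(groups)
    n_total, n_times = all_y.shape
    grand = all_y.mean()
    time_means = all_y.mean(axis=0)
    ss_inter = 0.0
    ss_error = 0.0
    for g in groups:
        cell_means = g.mean(axis=0)                 # Group x Time cell means
        group_mean = g.mean()
        subj_means = g.mean(axis=1, keepdims=True)  # one mean per subject
        ss_inter += len(g) * ((cell_means - group_mean - time_means + grand) ** 2).sum()
        # Error term: Time x Subjects-within-Groups
        ss_error += ((g - subj_means - cell_means + group_mean) ** 2).sum()
    df_inter = (len(groups) - 1) * (n_times - 1)
    df_error = (n_total - len(groups)) * (n_times - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)

# Independent-groups (pooled) t-test on the change scores, post - pre.
change_a = group_a[:, 1] - group_a[:, 0]
change_b = group_b[:, 1] - group_b[:, 0]
na, nb = len(change_a), len(change_b)
sp2 = ((na - 1) * change_a.var(ddof=1) + (nb - 1) * change_b.var(ddof=1)) / (na + nb - 2)
t_change = (change_a.mean() - change_b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

F_inter = interaction_f(groups)
print(t_change ** 2, F_inter)  # the two numbers agree up to rounding
```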


What is GROUP? It is NOT a single variable. It is a collection of
k indicator variables for (k+1) groups. There are k interactions to
the MAIN effects, and there are second order interactions to
more than two effects.


In the ANOVA tradition, Group is considered a single variable with k
*levels*, and df = k-1. But of course, if you perform the analysis with
a regression program, then you do indeed need k-1 indicator variables.
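
As a rough illustration of the two views, assuming three made-up groups: the sketch below builds k-1 = 2 indicator variables, fits the regression, and checks that its overall F matches the one-way ANOVA F computed from group means. The labels "A", "B", "C" and the choice of the last group as the reference are assumptions.

```python
import numpy as np

# Made-up scores for k = 3 groups (illustrative only).
scores = {
    "A": np.array([5.0, 7.0, 6.0, 8.0]),
    "B": np.array([9.0, 8.0, 11.0, 10.0, 9.0]),
    "C": np.array([4.0, 6.0, 5.0, 5.0]),
}
labels = list(scores)
k = len(labels)
y = np.concatenate([scores[g] for g in labels])
group = np.repeat(labels, [len(scores[g]) for g in labels])

# k-1 indicator variables; the last group ("C") is the reference,
# identified by the constant term.
indicators = np.column_stack([(group == g).astype(float) for g in labels[:-1]])
X = np.column_stack([np.ones(len(y)), indicators])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
ss_model = ((fitted - y.mean()) ** 2).sum()
ss_resid = ((y - fitted) ** 2).sum()
f_regression = (ss_model / (k - 1)) / (ss_resid / (len(y) - k))

# Classical one-way ANOVA: Group as one factor with k levels, df = k-1.
ss_between = sum(len(v) * (v.mean() - y.mean()) ** 2 for v in scores.values())
ss_within = sum(((v - v.mean()) ** 2).sum() for v in scores.values())
f_anova = (ss_between / (k - 1)) / (ss_within / (len(y) - k))

print(f_regression, f_anova)  # same F, df = (k-1, N-k)
```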

Ah, but that's exactly what I meant! The two formulations are EQUIVALENT
and IDENTICAL, but when you are using the Linear Models approach
(which is by now the "standard", isn't it? as in SAS, GLM, etc.), then
the notion of a Group is no longer a Group with levels as in the
pre-linear-models formulation.


In short, what you said about "change is equivalent in all of the groups"
is wrong 97% of the time for those models that are covered in the
multi-group, multi-factor cases. Perhaps correct 3% of the time in
those special cases.

I don't understand what you're saying here, Bob.

In terms of the Linear Models formulation of ANOVA, MANOVA,
ANCOVA, MANCOVA, and many, many other univariate and
multivariate designs that are all Linear Model based.


Just as your mention of t-test being a special
case of ANOVA is correct in exactly ONE instance of ANOVA.

Only if you don't count repeated measures ANOVA as an instance of ANOVA.
;-)


But, as Rich Ulrich noted, another very common method of analysis for
data such as this is one-way ANCOVA. It is a linear regression model of
this form:

Post = b0 + b1*Pre + b2*Group

What is Group? Richard Ulrich had argued for the quackery concept of
a categorical-ordinal variable, CODING the NOMINAL groups
(such as Asian, Black, and American) as 1, 2, and 3 in the variable
called "group" and doing a regression on it. That is called a BLUNDER
of the worst kind by undergraduates.


Sorry, I was thinking about jp's original problem, which had two groups
(male/female). For the 3-group problem given later, yes, there would be
two indicators for group. E.g.,

Post = b0 + b1*Pre + b2*G1 + b3*G2
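
A minimal sketch of fitting this model by least squares, assuming simulated data. The sample size, the "true" coefficients used to generate the scores, and the choice of group 3 as the reference are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 3 groups of 8; the group labels 1, 2, 3 are names only
# and never enter the model as numbers.
n_per_group = 8
group = np.repeat([1, 2, 3], n_per_group)
pre = rng.normal(50, 10, size=group.size)

# Two indicators for three groups; group 3 is the reference.
g1 = (group == 1).astype(float)
g2 = (group == 2).astype(float)

# Generate Post from assumed coefficients plus noise.
post = 5 + 0.9 * pre + 4 * g1 + 2 * g2 + rng.normal(0, 3, size=group.size)

# Post = b0 + b1*Pre + b2*G1 + b3*G2
X = np.column_stack([np.ones(group.size), pre, g1, g2])
b, *_ = np.linalg.lstsq(X, post, rcond=None)
resid = post - X @ b

for name, value in zip(["b0", "b1 (Pre)", "b2 (G1)", "b3 (G2)"], b):
    print(f"{name}: {value:.3f}")
```

Here b2 and b3 estimate the Pre-adjusted differences of groups 1 and 2 from the reference group 3.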

Yes. I know you were. But I was addressing precisely the ambiguity.
Rich Ulrich was USING 1,2,..., 6 for the id of 5 groups and then treated
them as if those indicator NAMES (codes) were real data.



You cannot have a Group variable for more than 1 group, because

You mean for more than 2 groups, of course. ;-)

Well, a GROUP variable is for only ONE group. Such as I1 as the
Indicator variable for Group 1. For 2 groups, you NEED only ONE
indicator variable, because the other is identified by the constant
in the linear model. :-) So what I said is technically correct even
though it might sound like a slip.


you would need 2 indicators for jp's three groups, and you cannot
have only one b2 for the three groups. Your interactions would be
ALL the crossproducts between the indicator variables AND the
covariate, as well as interactions between the groups (the product
of the indicators).
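
A rough sketch of the covariate-by-group crossproducts, assuming simulated data for three groups: the full model adds Pre*G1 and Pre*G2 to the ANCOVA model, and an extra-sum-of-squares F compares the two fits. All names and generating values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: 3 groups of 10, one covariate (Pre), one response (Post).
group = np.repeat([1, 2, 3], 10)
pre = rng.normal(50, 10, size=group.size)
g1 = (group == 1).astype(float)
g2 = (group == 2).astype(float)
post = 5 + 0.9 * pre + 4 * g1 + 2 * g2 + rng.normal(0, 3, size=group.size)

def residual_ss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(group.size)
# Reduced model: common slope for Pre in all groups.
X_reduced = np.column_stack([ones, pre, g1, g2])
# Full model: adds the crossproducts of the covariate with each indicator.
X_full = np.column_stack([X_reduced, pre * g1, pre * g2])

ss_reduced = residual_ss(X_reduced, post)
ss_full = residual_ss(X_full, post)
df_extra = X_full.shape[1] - X_reduced.shape[1]
df_full = group.size - X_full.shape[1]

# F statistic for the two interaction terms taken together.
f_slopes = ((ss_reduced - ss_full) / df_extra) / (ss_full / df_full)
print(f_slopes)
```

A small F here is consistent with a common Pre slope across groups, which is what the simulated data were built to have.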

No argument here.



Here is a comment you might find useful.

www.angelfire.com/wv/bwhomedir/notes/krantz_ancova.txt

I don't think so. Too many half-baked ideas all jumbled into one
obscure short document for exposition. There ain't such a
thing as a "free lunch". You have to pay for the lunches in at
least several chapters of the Neter et al book.

-- Reef Fish Bob.




--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir

No problem with your discussion at all. I just threw in some
clarifying comments to make some concepts more precise. I think in the two
exchanges, we should have clarified those.

-- Reef Fish Bob.



