Re: ANOVA QUESTION - Terminology



Reef Fish wrote:
Bruce Weaver wrote:
jp wrote:
I have looked at an intro to biostatistics book.

I know two groups is t-test and more than two is ANOVA. I'm sorry this
ANOVA can also be done with two groups. The F-ratio from the ANOVA
equals the square of the t-ratio for the same data.

I actually saw this reply while I was finishing up what I said to jp, and
I stand by what I said there about Richard Ulrich's deficiency in the
material to advise anyone on the subject of ANOVA along the line of
Linear Models because he doesn't KNOW IT.

So, here are just a couple of short comments on your suggestions.

The fact that ONE of the T-tests is a special case of ANOVA is
standard material in the Linear Models treatment of ANOVA.
What you mentioned is NOT a general result in ANOVA. The
t-sq equals F is true ONLY in a SIMPLE regression problem
(which is what a T-test can be set up as), and no other ANOVA
problem in comparing means.


I thought it was abundantly clear from the context of jp's immediately preceding statement that I was talking about the F-ratio from a one-way (independent groups) ANOVA and the t-ratio from an independent groups t-test (with both tests performed on the same data). If that was not clear, I apologize.

Of course, if you square the t-ratio from a *paired* t-test, you get the F-ratio from a (single-factor) repeated measures ANOVA on the same data. So there are by my count two ANOVA problems for which t-squared = F. ;-)
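
Both identities can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the data are simulated, and the helper rm_anova_f is a hypothetical hand-rolled function written for this illustration, not part of any library.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(50, 10, size=12)  # simulated scores, condition/group 1
b = rng.normal(55, 10, size=12)  # simulated scores, condition/group 2

# Case 1: independent groups. Pooled-variance t-test vs one-way ANOVA.
t_ind = stats.ttest_ind(a, b).statistic
f_one = stats.f_oneway(a, b).statistic

# Case 2: paired data. Treat a and b as two measurements on the same 12 subjects.
t_pair = stats.ttest_rel(a, b).statistic

def rm_anova_f(data):
    """F for the within-subjects factor of a single-factor repeated
    measures ANOVA. data has shape (n_subjects, k_conditions)."""
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_total = ((data - grand) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj
    return (ss_cond / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

f_rm = rm_anova_f(np.column_stack([a, b]))

print(t_ind ** 2, f_one)   # these two should agree
print(t_pair ** 2, f_rm)   # and so should these two
```

With only two conditions the error term of the repeated measures ANOVA reduces to half the variance of the paired differences, which is why the paired t-ratio squares to that F.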


In that particular case, the equivalence is to combine the data
in BOTH groups into Y and regress it in a SIMPLE regression
with the Indicator variable (X = 1 if the data is in group 1;
X = 0 otherwise).
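
The regression formulation can be sketched as follows, assuming SciPy is available. The data are simulated and the names g1, g0, x, and y are illustrative only. The slope of the simple regression on the indicator equals the difference in group means, and its t-ratio matches the pooled-variance independent groups t-test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1 = rng.normal(10, 2, size=15)  # simulated data for group 1
g0 = rng.normal(12, 2, size=18)  # simulated data for the other group

# Combine both groups into Y; X = 1 if the observation is in group 1, else 0.
y = np.concatenate([g1, g0])
x = np.concatenate([np.ones(len(g1)), np.zeros(len(g0))])

fit = stats.linregress(x, y)
t_slope = fit.slope / fit.stderr            # t-ratio for the indicator's slope
t_test = stats.ttest_ind(g1, g0).statistic  # pooled-variance t-test

print(fit.slope, g1.mean() - g0.mean())  # slope is the mean difference
print(t_slope, t_test)                   # the two t-ratios should agree
```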

sounds so basic, but I didn't find any example of this in the book.
Most examples of ANOVA are 3 groups of race (Chinese, White, Black)
with one dependent variable (IQ). I can see how this is a one-way as
there is one independent and one dependent. When repeated measures is
described in the book, the ANOVA now changes to something like
monitoring heart rate at several times during biking. A two-way is
described as having two independent variables, say gender and race in
which IQ is again, the dependent variable.

The question I have is what would the analysis be for what I was
suggesting above. To me you could treat it as a two-way in that Group
(A,B,C) and Test# (Pre Post) could serve as two independent variables,
but to me I thought it sounded like a repeated measures because the
Test Score was the dependent variable measured at two time points.
There is no example like this in my book, and perhaps I do not understand
it. I would imagine this newsgroup is for individuals of all learning
levels?

To me, this sounds like a two-way design, but I am confused because it
is more like a one-way repeated measures since the dependent variable
is measured twice and there is the independent variable Group. I would
really like to understand this distinction.
It is a two-way design. Nowadays, some people do call it a repeated
measures design, probably because the procedure they use in their stats
package is found under "GLM->Repeated Measures", or something similar.

I consider that not the "optimal" advice, a term I mimicked from Kevin
Thorpe, a very competent BIOstatistician in the sci.stat.math group.

It is NEVER a good idea to learn statistics from a computer manual,
ANY computer manual. The place to learn it is from a good textbook,
and THEN one would automatically understand what's RIGHT and
what's WRONG in the computer software manuals.

It was not meant as advice, Bob, nor was I condoning that reason for calling split-plot designs "repeated measures" designs. It was a guess as to why so many people *do* call them repeated measures designs.



In textbooks, however, it would more likely be described as a split-plot
design, a between-within design, or a mixed design. Group is a
between-subjects factor, and Time (pre vs post) is a within-subjects (or
repeated measures) factor. (How you label it no doubt varies by area of
research.)

Now you're telling a baby how to FLY before he learns how to walk. :-)

I was just responding to jp's request for some terminology to describe the design.

An analogy I've used on Jack Tomsky referring Afonso to Lehmann's
book when Afonso's statistical knowledge is BELOW that of an
average freshman in college, and Lehmann's book is at the graduate
level, completely beyond the comprehension of Afonso and most
people in that group. I would say YOUR mention of all these designs
is PROBABLY beyond the comprehension of almost everybody in
THIS group. However, those topics are systematically and
correctly treated in the Applied Linear Models book I recommended
to jp to learn them. They are elementary, and at the undergrad
level -- but NOT having been learned by most of the statisticians in
these groups, I am sorry to say.
The design has 3 F-tests: Main effect of Group, main effect of Time, and
the Group x Time interaction. The null hypothesis for the interaction
is that change is equivalent in all of the groups. The interaction
F-test is equivalent to an independent groups t-test on the change scores.

Really?


Yes, when there are two groups. (I was thinking of jp's original example where the group variable was male vs female.) It is clearly not so when there are 3 or more groups.


Even at this ambiguous global level, your statement is as
misleading to one (me) who knows the material as to anyone who
DOESN'T know the material YET.

What is ambiguous or misleading about it? Perhaps it was not clear that I was referring to the two-group case given in the original example.

For the two-group case, if you compute a change score for each person (post-pre), perform an independent groups t-test on those change scores, and then square the t-ratio, you get the F-ratio for the Group x Time interaction term in a split-plot ANOVA on the same data. I thought this was well-known.
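
A numerical check of the two-group claim, as a minimal sketch: it assumes NumPy and SciPy, simulated data, and a balanced design (equal n per group) so that the split-plot sums of squares can be computed directly from cell means. The array layout and variable names are illustrative, not taken from any package.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 10                                     # subjects per group (balanced)
pre = rng.normal(100, 15, size=(2, n))     # rows = groups, columns = subjects
gain = np.array([[3.0], [8.0]])            # assumed mean change in each group
post = pre + gain + rng.normal(0, 5, size=(2, n))

data = np.stack([pre, post], axis=2)       # shape (group, subject, time)
g, t = 2, 2

grand = data.mean()
cell = data.mean(axis=1, keepdims=True)        # group x time cell means
grp = data.mean(axis=(1, 2), keepdims=True)    # group means
tim = data.mean(axis=(0, 1), keepdims=True)    # time means
subj = data.mean(axis=2, keepdims=True)        # each subject's own mean

# Group x Time interaction and the within-subjects error term.
ss_inter = n * ((cell - grp - tim + grand) ** 2).sum()
ss_error = ((data - subj - cell + grp) ** 2).sum()
f_inter = (ss_inter / ((g - 1) * (t - 1))) / (ss_error / (g * (n - 1) * (t - 1)))

# Independent groups t-test on the change scores (post - pre).
change = post - pre
t_change = stats.ttest_ind(change[0], change[1]).statistic

print(t_change ** 2, f_inter)  # should agree in the two-group case
```

With three or more groups the interaction has more than one numerator degree of freedom, so no single t-ratio can reproduce it.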


What is GROUP? It is NOT a single variable. It is a collection of
k indicator variables for (k+1) groups. There are k interactions to
the MAIN effects, and there are second order interactions to
more than two effects.


In the ANOVA tradition, Group is considered a single variable with k *levels*, and df = k-1. But of course, if you perform the analysis with a regression program, then you do indeed need k-1 indicator variables.


In short, what you said about "change is equivalent in all of the
groups"
is wrong 97 % of the time for those models that are covered in the
multi-group, multi-factor cases. Perhaps correct 3% of the time in
those special cases.

I don't understand what you're saying here, Bob.

Just as your mention of t-test being a special
case of ANOVA is correct in exactly ONE instance of ANOVA.

Only if you don't count repeated measures ANOVA as an instance of ANOVA. ;-)


But, as Rich Ulrich noted, another very common method of analysis for
data such as this is one-way ANCOVA. It is a linear regression model of
this form:

Post = b0 + b1*Pre + b2*Group

What is Group? Richard Ulrich had argued the quackery concept of
a categorical-ordinal variable and CODE the NOMINAL groups
(such as Asian, Black, and American) as 1, 2, and 3 in the variable
called "group" and do a regression on it. That is called a BLUNDER
of the worst kind by undergraduates.


Sorry, I was thinking about jp's original problem, which had two groups (male/female). For the 3-group problem given later, yes, there would be two indicators for group. E.g.,

Post = b0 + b1*Pre + b2*G1 + b3*G2
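
A minimal sketch of fitting this three-group model by least squares, assuming NumPy only. The data are simulated, the group labels "A", "B", "C" and the choice of "A" as the reference group are assumptions made for illustration, and the true coefficients used to generate Post are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
labels = np.repeat(["A", "B", "C"], 8)       # 3 nominal groups, 8 subjects each
pre = rng.normal(50, 10, size=labels.size)   # simulated Pre scores

# Two 0/1 indicators for three groups; group "A" is the reference (all zeros).
g1 = (labels == "B").astype(float)
g2 = (labels == "C").astype(float)

# Simulated Post scores; the coefficients here are arbitrary assumptions.
post = 5 + 0.8 * pre + 2 * g1 + 4 * g2 + rng.normal(0, 3, size=labels.size)

# Design matrix for Post = b0 + b1*Pre + b2*G1 + b3*G2.
X = np.column_stack([np.ones_like(pre), pre, g1, g2])
coef, *_ = np.linalg.lstsq(X, post, rcond=None)
b0, b1, b2, b3 = coef

print(b0, b1, b2, b3)
```

Here b2 and b3 estimate the covariate-adjusted differences of groups "B" and "C" from the reference group, which is why the nominal labels are never entered as the numbers 1, 2, 3 in a single column.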



You cannot have a Group variable for more than 1 group. because

You mean for more than 2 groups, of course. ;-)

you would need 2 indicators for jp's three groups, and you cannot
have only one b2 for the three groups. Your interactions would be
ALL the crossproducts between the indicator variables AND the
covariate, as well as interactions between the groups (the product
of the indicators).

No argument here.



Here is a comment you might find useful.

www.angelfire.com/wv/bwhomedir/notes/krantz_ancova.txt

I don't think so. Too many half-baked ideas all jumbled into one
obscure short document for exposition. There ain't such a
thing as a "free lunch". You have to pay for the lunches in at
least several chapters of the Neter et al book.

-- Reef Fish Bob.




--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir