Re: What statistical test to use?
- From: valter.sundh@xxxxxxxxxxxxxxx
- Date: 20 Jan 2006 08:06:25 -0800
When the question is asked in the following way:
"I am analyzing some data of a recent experiment that i did."
it is not possible to select the most suitable test without making
assumptions.
For example, which formulation of the underlying question is the
relevant one?
1. Do the two groups in my data differ in distribution in the variable
I have measured?
2. Do the two groups in my data differ in level in the variable I have
measured?
Question 2 is equivalent to:
Do the cases in one group tend to have higher value than the cases in
the other group?
Those two questions are of course related, but they lead to different
tests.
If you want a general test of difference, you should use a test with
high power for many different types of deviation from equal
distribution.
Two groups can, for example, differ in the shape of the distribution but
have the same mean and/or median. A certain test may have high power to
detect a difference in level but not in shape, and vice versa for
another test. A third test may have good power in both situations.
Do you think that a difference in shape is an important difference to
discover even when the means/medians are the same?
If you want to test for a general difference, you may use the
Kolmogorov-Smirnov two-sample test, or the Cramer-von Mises test (which
is similar to the K-S two-sample test, but often has higher power).
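
As a rough illustration of the K-S idea, the sketch below computes the
two-sample statistic D (the largest gap between the two empirical
distribution functions) using only the Python standard library. The
scores are invented for illustration and are not Ronald's data; the
function names are my own. Because scores on a 0-9 scale are heavily
tied, the p-value is taken from random relabellings of the pooled data
rather than from the usual asymptotic formula, which is my assumption
about what is appropriate here.

```python
import random
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest vertical gap between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in set(sa) | set(sb)
    )

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Share of random relabellings whose D is at least the observed D."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical error scores (0-9): same mean (4.5), different shape.
group_a = [3] * 4 + [4] * 6 + [5] * 6 + [6] * 4
group_b = [0] * 5 + [1] * 3 + [4] * 2 + [5] * 2 + [8] * 3 + [9] * 5

d, p = permutation_pvalue(group_a, group_b)
print(f"D = {d:.3f}, permutation p = {p:.4f}")
```

Both invented groups have the same mean, so any signal here comes from
the shape of the distributions, not from their level.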
Another test of general group difference is given by a binary logistic
regression model with group membership as the response and with
explanatory variables X + X*X, where X is the numerical variable you
want to test.
If that model is significant, it indicates that the two groups have
different distributions, either in level, in shape, or in both.
(This use of logistic regression was described by Peter O'Brien in an
article in JASA 1988, where he calls it an extension of the t-test.)
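
A minimal sketch of that idea, using only the Python standard library:
fit the logistic model by Newton-Raphson and compare it with the
intercept-only model through a likelihood-ratio test. The counts are
invented (both groups have mean 4.5), the helper names are mine, and
centring X on the pooled mean is my own choice for numerical stability;
it does not change the fitted likelihood. This is not claimed to be the
exact procedure in the JASA article.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def prob(beta, row):
    """Fitted probability of belonging to group 1."""
    return 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, row))))

def max_loglik(rows, y, n_iter=25):
    """Fit by Newton-Raphson and return the maximised log-likelihood."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(rows, y):
            mu = prob(beta, row)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += mu * (1 - mu) * row[j] * row[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return sum(math.log(prob(beta, r)) if yi else math.log(1 - prob(beta, r))
               for r, yi in zip(rows, y))

# Hypothetical counts of cases at each error score 0..9.
counts_a = [1, 1, 2, 4, 7, 7, 4, 2, 1, 1]
counts_b = [4, 3, 3, 3, 2, 2, 3, 3, 3, 4]
x = [s for s, c in enumerate(counts_a) for _ in range(c)]
x += [s for s, c in enumerate(counts_b) for _ in range(c)]
y = [0] * sum(counts_a) + [1] * sum(counts_b)
centre = sum(x) / len(x)
z = [v - centre for v in x]

n, n1 = len(y), sum(y)
ll_null = n1 * math.log(n1 / n) + (n - n1) * math.log(1 - n1 / n)
ll_lin = max_loglik([[1.0, v] for v in z], y)
ll_quad = max_loglik([[1.0, v, v * v] for v in z], y)

lr_lin = max(2 * (ll_lin - ll_null), 0.0)
lr_quad = max(2 * (ll_quad - ll_null), 0.0)
p_lin = math.erfc(math.sqrt(lr_lin / 2))   # chi-square tail, 1 df
p_quad = math.exp(-lr_quad / 2)            # chi-square tail, 2 df
print(f"X only:  LR = {lr_lin:.3f}, p = {p_lin:.4f}")
print(f"X + X*X: LR = {lr_quad:.3f}, p = {p_quad:.4f}")
```

With these invented counts the model with X alone has nothing to detect,
since the group means are equal, while the model with X + X*X picks up
the difference in spread.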
If the question primarily concerns the average or median level of each
group, it is sensible to choose a test having high power only for that
type of deviation from equal distribution.
A t-test is suitable in this situation. If you have at least 20-30
observations in each group (and there are no outliers in the sample),
the non-normality of the observations does not matter. (This conclusion
rests on the Central Limit Theorem.)
A rank test, like the Wilcoxon rank sum test, is not suitable for
discrete (categorical) data, like Ronald's count variable, as it will
introduce an artificial weighting of the data, and therefore also of
the test statistic.
From the point of view of statistical significance, a t-test, a test of
linear trend in a contingency table (the Cochran-Armitage test) and a
logistic regression model with X as the only explanatory variable are
equivalent except for very small samples.
Those three tests test the same null hypothesis and have the same
alternative hypothesis: that there is a linear relation between group
membership and the X-values.
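
One way to see the link between the t-test and the trend test is that
both are functions of the ordinary correlation r between the 0/1 group
indicator and X. The sketch below uses invented scores and my own
function names. It checks numerically that the squared pooled-variance
t statistic equals (N-2)*r^2/(1-r^2), and prints N*r^2, which is the
correlation form of the Cochran-Armitage trend chi-square as I
understand it.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal-variance form)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / math.sqrt(sp2 * (1 / na + 1 / nb))

def pearson_r(x, y):
    """Ordinary product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical error scores for two groups.
group_a = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]
group_b = [2, 3, 3, 4, 4, 5, 5, 6, 7, 9]

x = group_a + group_b
g = [0] * len(group_a) + [1] * len(group_b)   # group indicator
n = len(x)

t = pooled_t(group_a, group_b)
r = pearson_r(g, x)
t2_from_r = (n - 2) * r ** 2 / (1 - r ** 2)
trend_chi2 = n * r ** 2   # trend-test chi-square in correlation form

print(f"t^2 = {t ** 2:.4f}, (N-2) r^2 / (1 - r^2) = {t2_from_r:.4f}")
print(f"N r^2 = {trend_chi2:.4f}")
```

Both statistics grow with r^2, so they order data sets in the same way;
they differ only in the reference distribution, which matters mainly in
small samples.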
A rank test will test the same null hypothesis, but have a different
alternative hypothesis:
that there is a linear relation between group membership and the ranked
X-values.
If you have very few cases in any group, you can use the permutation
t-test.
That test does not rest on the Central Limit Theorem, so it gives
accurate results even for very small samples.
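
For very small groups the permutation distribution can be enumerated
completely. The sketch below does so for two invented groups of five
cases; the function name is mine. It uses the absolute difference in
means as the statistic, which, for a two-sided test on fixed pooled
data, orders the relabellings in the same way as the absolute
pooled-variance t statistic and so gives the same p-value.

```python
from itertools import combinations

def exact_permutation_test(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(a) + list(b)
    n_a, n = len(a), len(pooled)
    total = sum(pooled)

    def abs_diff(sum_a):
        # Absolute difference between the two group means for one relabelling.
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = abs_diff(sum(a))
    count = extreme = 0
    # Every way of choosing which cases are labelled as the first group.
    for idx in combinations(range(n), n_a):
        count += 1
        if abs_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return observed, extreme / count

# Hypothetical error scores, five cases per group.
group_a = [1, 2, 2, 3, 4]
group_b = [3, 5, 6, 7, 9]

diff, p = exact_permutation_test(group_a, group_b)
print(f"|mean difference| = {diff:.2f}, exact p = {p:.4f}")
```

With larger groups the number of relabellings grows quickly, and a
random sample of relabellings is the usual substitute for full
enumeration.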
Valter Sundh
Ronald van den Berg wrote:
> I am analyzing some data of a recent experiment that i did. Apart from
> ANOVA and t-tests i am not really familiar with applied statistics and
> i'm having some difficulties now in finding out what test to use for
> statistics on my data. Any help would be appreciated.
>
> To give you an impression of the kind of data that it concerns, two
> example plots can be found at:
> http://www.ronaldvdberg.nl/temp/stat.htm
>
> Roughly stated, the raw data consists of counts of (fixation) errors.
> Errors were discrete values, ranging from 0 to 9. The plots show the
> proportion of fixations associated to each error (I normalized all raw
> counts by dividing every count by the total count). So, the sum is 1
> and the mean is 0.1 in both plots/datasets. The distributions in these
> two data sets are clearly different and my question is how i can
> properly test/show this ... ?
>
> Thanks.
>
> Ronald