Re: Fisher's Exact Test and Chi-square
- From: harriscsuiucedu@xxxxxxxxx
- Date: 18 Jul 2006 18:12:19 -0700
Thom wrote:
harriscsuiucedu@xxxxxxxxx wrote:
I have a number of questions about these two...
(I'll presume for argument's sake that FET is always considered the
'best' value (the 'true' probability) to calculate but chi-square is
easier (faster), and so there is a trade-off)
They're different tests - Fisher's is conditional on the marginals and
the Pearson Chi-Square isn't.
I'm not sure what 'conditional' means (I am obviously not a
statistician). But I do know that in the computation of both, the
marginals are computed (in FET to help determine the other contingency
tables with the same marginals, for Chi^2 to help determine the
expected cell entries). So how is one 'conditional' while the other
not?
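One way to see what "conditional on the marginals" means is to look at the reference set that FET sums over. Here is a minimal Python sketch, assuming a 2x2 table and the usual two-sided rule (add up every same-margin table that is no more probable than the observed one). The function names and the example table are made up for illustration.

```python
from math import comb

def tables_with_same_margins(table):
    """Yield every 2x2 table that shares the row and column totals of `table`."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    # Once the margins are fixed, the top-left cell x determines the other three.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        yield ((x, row1 - x), (col1 - x, row2 - col1 + x))

def hypergeom_prob(table):
    """Probability of `table` given its margins, under independence."""
    (a, b), (c, d) = table
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

def fisher_two_sided(table):
    """Sum the probabilities of all same-margin tables no more likely than `table`."""
    p_obs = hypergeom_prob(table)
    return sum(p for p in map(hypergeom_prob, tables_with_same_margins(table))
               if p <= p_obs * (1 + 1e-9))  # small tolerance for float ties

observed = ((3, 1), (1, 3))  # hypothetical data
print(len(list(tables_with_same_margins(observed))))  # 5 tables share these margins
print(fisher_two_sided(observed))  # about 0.486, i.e. 34/70
```

In this sketch the marginals define the whole sample space: tables with other marginals are never considered. In the Pearson test the marginals are only used to estimate the expected counts.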
- the rule of thumb is to use FET when any cell is < 5 and the total
<, oh, some larger number, chi-square otherwise. Why this particular
cut off? Was there an empirical comparison of the two methods on random
contingency tables, and then 5 was taken as a reasonable cutoff point?
While this is commonly stated I don't think that this is correct.
What's not correct about it (assuming you're talking about the '<5'
statement)?
The
Chi-square approximation isn't great because the chi-square
distribution is continuous but the cell counts are discrete. This means
that for small cell counts the probabilities of the continuous
distribution will tend to be somewhat inaccurate.
Hm... I see the continuous vs discrete problem, but one could just see
contingency tables as limited to integers rather than reals (a
restriction of the data not the test), so I don't see how inaccuracy
is involved.
Also note that it is the expected counts not the observed counts that
are important.
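To make the discreteness point concrete, here is a rough Python sketch, assuming a 2x2 table with all margins positive. It computes the expected counts from the margins, the Pearson statistic, the tail probability from the continuous chi-square curve with 1 d.f., and the tail probability obtained by enumerating the tables with the same margins. The helper names and the table are illustrative, not from any package.

```python
from math import comb, erfc, sqrt

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return tuple(tuple(r * k / n for k in cols) for r in rows)

def pearson_x2(table):
    """Pearson statistic: sum over cells of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(table, exp)
               for o, e in zip(o_row, e_row))

def chi2_tail_1df(x):
    """P(chi-square with 1 d.f. >= x), via the normal tail."""
    return erfc(sqrt(x / 2))

def exact_conditional_tail(table):
    """P(X^2 >= observed X^2) over all tables with the same margins."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    x2_obs = pearson_x2(table)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        t = ((x, row1 - x), (col1 - x, row2 - col1 + x))
        if pearson_x2(t) >= x2_obs - 1e-9:
            total += comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
    return total

observed = ((3, 1), (1, 3))              # hypothetical data, all expected counts are 2
print(expected_counts(observed))
print(pearson_x2(observed))              # 2.0
print(chi2_tail_1df(2.0))                # about 0.157 from the continuous curve
print(exact_conditional_tail(observed))  # about 0.486 by enumeration
```

With these margins the statistic can only take the values 0, 2 and 8, so a smooth curve has little to work with. The gap shrinks as the expected counts grow.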
- for the chi-square statistic (sum (o-e)^2/e), its distribution is
'approximated very closely by the chi-square distribution' (a random
but representative quote from a statistics text). OK, so it is
-approximated- by the chi-square distribution. Well, then, is there a
(named? known?) function for that -exact- distribution? (I'm guessing
There can be no single exact distribution because the probabilities
depend on the marginals. This is why conditional tests can be exact.
How is that? Can you elaborate? This confuses me. Here it sounds like
you're saying that the probabilities depend on the marginals for
-chi^2-, but that seems to contradict your distinction at the beginning.
I'm really not following here.
that this is a similar situation to where the normal distribution is an
approximation of the t-distribution)
No. Neither is an approximation of the other - they are different
distributions. However, with infinite d.f. t converges on the Normal.
Doesn't that latter sentence seem to say that for large degrees of
freedom, the t distribution is approximated well by the normal?
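Both statements can hold at once: the two are distinct distributions, yet the t density gets closer to the normal density as the d.f. grow. A small Python sketch of that, using only the textbook density formulas (the grid from -5 to 5 and the d.f. values are arbitrary choices for illustration):

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Student t density; lgamma avoids overflow for large df."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def normal_pdf(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def max_gap(df):
    """Largest absolute difference between the two densities on a grid."""
    xs = [i / 10 for i in range(-50, 51)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in xs)

for df in (1, 5, 30, 1000):
    print(df, max_gap(df))  # the gap shrinks as df increases
```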
- FET takes a long time to compute if n is large (in comparison to
chi-square). Are there approximation algorithms of FET directly (that
is attempts to approximate the sum of hypergeometric values)?
Well, Yates' correction to the Pearson chi-square approximates
conditional tests such as Fisher's exact test, though it probably isn't
sensible to switch merely because expected cell counts are low.
I've seen at least one argument that one should nearly always use a
conditional test - but in practice, for most purposes, the results are
very similar.
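The Yates point can be checked on a small example. The following Python sketch assumes a 2x2 table, a two-sided Fisher test, and the common form of the correction (subtract 0.5 from each |o-e|, clipped at zero). The table is invented and the function names are only for illustration.

```python
from math import comb, erfc, sqrt

def chi2_p_1df(stat):
    """Upper tail of the chi-square distribution with 1 d.f."""
    return erfc(sqrt(stat / 2))

def chi2_stat(table, yates=False):
    """Pearson statistic for a 2x2 table, optionally with Yates' correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    stat = 0.0
    for i, o_row in enumerate(table):
        for j, o in enumerate(o_row):
            e = rows[i] * cols[j] / n       # expected count for this cell
            gap = abs(o - e)
            if yates:
                gap = max(0.0, gap - 0.5)   # continuity correction, clipped at 0
            stat += gap ** 2 / e
    return stat

def fisher_two_sided(table):
    """Two-sided Fisher p-value by summing hypergeometric probabilities."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    return sum(prob(x)
               for x in range(max(0, col1 - row2), min(row1, col1) + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

table = ((7, 2), (3, 8))  # hypothetical data
p_fisher = fisher_two_sided(table)
p_plain = chi2_p_1df(chi2_stat(table))
p_yates = chi2_p_1df(chi2_stat(table, yates=True))
print(p_fisher, p_plain, p_yates)
```

With this made-up table the uncorrected p-value falls below 0.05, while the Yates-corrected and Fisher values both sit near 0.07. That is one table, not a general result.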
Can you give me an idea of what such an argument would look like?
reference?
Thanks,
Mitch