Re: Fisher's Exact Test and Chi-square




Thom wrote:
harriscsuiucedu@xxxxxxxxx wrote:
I have a number of questions about these two...
(I'll presume for argument's sake that FET is always considered the
'best' value (the 'true' probability) to calculate but chi-square is
easier (faster), and so there is a trade-off)

They're different tests - Fisher's is conditional on the marginals and
the Pearson Chi-Square isn't.

I'm not sure what 'conditional' means (I am obviously not a
statistician). But I do know that in the computation of both, the
marginals are computed (in FET to help determine the other contingency
tables with the same marginals, for Chi^2 to help determine the
expected cell entries). So how is one 'conditional' while the other
not?
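
(An illustrative sketch, not from the thread: this is roughly what
"determine the other contingency tables with the same marginals" amounts
to for a 2x2 table. With both sets of margins held fixed, the top-left
cell determines the whole table and its probability is hypergeometric.
The function name and the example counts are assumptions made up for
illustration; only the Python standard library is used.)

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d      # row totals, held fixed
    c1 = a + c                 # first column total, held fixed
    n = r1 + r2
    denom = comb(n, c1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the fixed margins.
        return comb(r1, x) * comb(r2, c1 - x) / denom

    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible top-left values
    p_obs = prob(a)
    # Add up every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

For the made-up table [[3, 1], [1, 3]] the five possible tables have
probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the two-sided
p-value is 34/70, about 0.486. The "conditional" part is that only
tables with these exact margins are ever considered.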

- the rule of thumb is to use FET when any cell is < 5 and the total
<, oh, some larger number, chi-square otherwise. Why this particular
cut off? Was there an empirical comparison of the two methods on random
contingency tables, and then 5 was taken as a reasonable cutoff point?

While this is commonly stated I don't think that this is correct.

What's not correct about it (assuming you're talking about the '<5'
statement)?

The Chi-square approximation isn't great because the chi-square
distribution is continuous but the cell counts are discrete. This means
that for small cell counts the probabilities of the continuous
distribution will tend to be somewhat inaccurate.

Hm... I see the continuous vs discrete problem, but one could just see
contingency tables as limited to integers rather than reals (a
restriction of the data not the test), so I don't see how inaccuracy
is involved.

Also note that it is the expected counts not the observed counts that
are important.
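
(A small sketch of the expected-versus-observed distinction, with
invented counts: the expected count for a cell is row total times
column total divided by the grand total, so a table can have a small
observed cell while every expected count is comfortably above 5. The
function name and the table are assumptions for illustration.)

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[2, 18], [20, 20]]          # hypothetical counts
expected = expected_counts(observed)

smallest_observed = min(min(row) for row in observed)
smallest_expected = min(min(row) for row in expected)
```

Here the smallest observed count is 2, but the smallest expected count
is 20 * 22 / 60, about 7.3, so a rule based on expected counts would not
flag this table.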

- for the chi-square statistic (sum (o-e)^2/e), its distribution is
'approximated very closely by the chi-square distribution' (a random
but representative quote from a statistics text). OK, so it is
-approximated- by the chi-square distribution. Well, then, is there a
(named? known?) function for that -exact- distribution? (I'm guessing

There can be no single exact distribution because the probabilities
depend on the marginals. This is why conditional tests can be exact.

How is that? Can you elaborate? This confuses me. Here it sounds like
you're saying that the probabilities depend on the marginals for
-chi^2-, but that seems to contradict your distinction at the beginning.
I'm really not following here.
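
(One way to see the point about marginals, as a sketch under stated
assumptions: for a 2x2 table with fixed margins one can enumerate every
possible table, compute the Pearson statistic for each, and add up the
exact probability of reaching the usual 1 d.f. cutoff of 3.84. The
function name and the two sets of margins are invented for
illustration, and all margins are assumed to be positive.)

```python
from math import comb

def exact_chi2_tail(r1, r2, c1, cutoff):
    """Exact P(Pearson statistic >= cutoff) for 2x2 tables with fixed margins.

    r1, r2 are the row totals and c1 the first column total (all > 0).
    Returns (tail probability, total probability as a sanity check).
    """
    n = r1 + r2
    c2 = n - c1
    expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
    tail = total = 0.0
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        observed = [x, r1 - x, c1 - x, r2 - c1 + x]
        stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        p = comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)  # hypergeometric
        total += p
        if stat >= cutoff:
            tail += p
    return tail, total

tail_small, total_small = exact_chi2_tail(4, 4, 4, 3.84)
tail_large, total_large = exact_chi2_tail(10, 10, 10, 3.84)
```

With all margins equal to 4 the exact tail probability is 2/70, about
0.029; with all margins equal to 10 it is 4252/184756, about 0.023. The
chi-square distribution would give roughly 0.05 in both cases, so the
exact answer changes with the margins and neither matches the
approximation.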

that this is a similar situation to where the normal distribution is an
approximation of the t-distribution)

No. Neither is an approximation of the other - they are different
distributions. However, with infinite d.f. t converges on the Normal.

Doesn't that latter sentence seem to say that for large degrees of
freedom, the t distribution is approximated well by the normal?
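
(A quick numerical check of the convergence, as a sketch: compare the t
density with the standard normal density on a grid and look at the
largest gap for several degrees of freedom. The grid and the function
names are arbitrary choices for illustration; lgamma is used so that
large d.f. do not overflow.)

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def max_gap(df):
    """Largest absolute difference between the two densities on [-5, 5]."""
    grid = [i / 100 for i in range(-500, 501)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)

gaps = {df: max_gap(df) for df in (5, 30, 1000)}
```

The gap shrinks steadily as the d.f. grow, which fits both readings:
the two are different distributions at any finite d.f., yet for large
d.f. the normal is a good numerical stand-in for t.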

- FET takes a long time to compute if n is large (in comparison to
chi-square). Are there approximation algorithms of FET directly (that
is attempts to approximate the sum of hypergeometric values)?

Well, Yates' correction to the Pearson chi-square approximates
conditional tests such as Fisher's exact test, though it probably isn't
sensible to switch merely because expected cell counts are low.
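
(A sketch comparing the three p-values on one small made-up 2x2 table:
the plain Pearson chi-square, the Yates-corrected version, and the
two-sided Fisher exact test. For 1 d.f. the chi-square upper tail can
be written as erfc(sqrt(stat / 2)), so only the standard library is
needed. Function names and counts are assumptions for illustration.)

```python
from math import comb, erfc, sqrt

def pearson_p_2x2(a, b, c, d, yates=False):
    """Chi-square p-value (1 d.f.) for [[a, b], [c, d]], optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)     # continuity correction
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(stat / 2))           # upper tail of chi-square, 1 d.f.

def fisher_p_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value by summing hypergeometric terms."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)
    probs = [comb(r1, x) * comb(r2, c1 - x) / denom
             for x in range(max(0, c1 - r2), min(r1, c1) + 1)]
    p_obs = comb(r1, a) * comb(r2, c1 - a) / denom
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))

table = (3, 1, 1, 3)                      # hypothetical counts
p_plain = pearson_p_2x2(*table)
p_yates = pearson_p_2x2(*table, yates=True)
p_fisher = fisher_p_2x2(*table)
```

On this table the uncorrected p-value is about 0.157, the
Yates-corrected one about 0.480, and Fisher's about 0.486, so the
corrected chi-square lands much closer to the conditional test. One
table is only an illustration, not evidence about how close the two are
in general.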

I've seen at least one argument that one should nearly always use a
conditional test - but in practice, for most purposes, the results are
very similar.

Can you give me an idea of what such an argument would look like, or
point me to a reference?

Thanks,
Mitch
