Re: What does the p-value mean
- From: Peter Perkins <Peter.PerkinsRemoveThis@xxxxxxxxxxxxx>
- Date: Thu, 01 Dec 2005 15:55:42 -0500
Murphy O'Brien wrote:
If I'm using a test like say lillietest or kstest I can get a p value. When I google it I get quotes like this.
"If the p-value is less than some significance level, alpha, (typically practitioners use an alpha of 0.05) then we say that the result is statistically significant (at the 5% level) - i.e. the probability of incorrectly rejecting the null hypothesis is less than 5%."
This quote is either imprecisely worded, or wrong, depending on how harsh you want to be. The p-value is computed for particular data. The probability of incorrectly rejecting is a long-run probability, over many repetitions of the test procedure on different data. A better statement for the Neyman-Pearson interpretation of hypothesis testing would be something like (leaving out the preliminaries and ignoring things like discreteness),
"Given a fixed significance level alpha, chosen in advance (typically practitioners use an alpha of 0.05), you can define a test procedure that has a probability of only alpha of rejecting a true null hypothesis, by simply computing the p-value (tail area under the null) and rejecting the null if p<alpha. In that case, we say that a result is statistically significant (at the 5% level). You have no idea (and are not even allowed to care) whether you're correct in any particular case, but you know that the probability of rejecting a true null hypothesis is only 5%, so you're doing well in that sense in the long run."
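The long-run claim in that statement can be checked by simulation. What follows is only a rough sketch in Python, under assumptions that are not in the original question: the null hypothesis is "the mean is 0", the data are normal with known standard deviation 1, and the test is a two-sided z-test. The function names are made up for the illustration.

```python
import math
import random

def two_sided_z_pvalue(sample, mu0=0.0, sigma=1.0):
    """Tail area under the null for a z-test of mean == mu0 (sigma assumed known)."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2.0))

def rejection_rate(n_trials=20000, n=30, alpha=0.05, seed=1):
    """Fraction of repetitions that reject when the null is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # Every data set is drawn with the null true (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_z_pvalue(sample) < alpha:
            rejections += 1
    return rejections / n_trials

rate = rejection_rate()
print(f"long-run rejection rate under a true null: {rate:.3f}")
```

The printed rate should land close to alpha. Any single p-value in the loop says nothing about whether that particular rejection was a mistake; only the rate over many repetitions is controlled.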
A Fisherian might say,
"If the p-value is small, then the observed test statistic falls among a group of potential outcomes that are extreme and pretty unlikely under the null (you'd only see such outcomes 5% of the time, say), and so a more plausible explanation of the data is that the null hypothesis is not true. We then say that the result is statistically significant (at the 5% level), because you have some strong evidence against the null (if it were true, then a very unlikely event would have had to have occurred). If you use this procedure with a fixed cutoff of 5%, then the probability of incorrectly rejecting a true null hypothesis over the long run is 5%, but that's just gravy -- really you care about the evidence (or lack of it) that each particular set of data provides against the null."
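To make "tail area under the null" concrete for one particular data set, here is a small Python sketch. It assumes a null of normal data with mean 0 and standard deviation 1, uses made-up numbers for the observed sample, and estimates the p-value by counting how often simulated null data give a statistic at least as extreme as the observed one. None of these choices come from the original question.

```python
import math
import random

def statistic(sample):
    # |sqrt(n) * mean|: large values count as "extreme" under a N(0, 1) null
    return abs(sum(sample)) / math.sqrt(len(sample))

def monte_carlo_pvalue(observed, n_sims=20000, seed=2):
    """Share of simulated null data sets at least as extreme as the observed one."""
    rng = random.Random(seed)
    n = len(observed)
    t_obs = statistic(observed)
    as_extreme = sum(
        statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= t_obs
        for _ in range(n_sims)
    )
    return as_extreme / n_sims

observed = [0.9, 1.4, -0.2, 0.7, 1.1, 0.3, 1.8, 0.5]  # made-up numbers
p_simulated = monte_carlo_pvalue(observed)
# For this simple null the same tail area is available in closed form.
p_exact = math.erfc(statistic(observed) / math.sqrt(2.0))
print(f"simulated p = {p_simulated:.4f}, exact p = {p_exact:.4f}")
```

The two numbers should agree up to simulation noise. The p-value here is a statement about this one sample, which is the quantity the Fisherian reading cares about.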
These are two different ways of looking at it.
So this
"But I always thought that if P <0.05 then its a very good bet that your data is normal."
is backwards. You haven't said what the hypotheses are, but you probably mean, "the p-value for a test of the null hypothesis of normality is less than .05", and the appropriate conclusion in that case is to reject the null, i.e., to conclude that the data do not look normal.
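A quick way to see the direction is to run a normality test on data that are normal and on data that are clearly not. The Python sketch below is in the spirit of a Lilliefors-type test but is not the lillietest implementation: it measures the largest gap between the empirical CDF and a normal CDF with fitted mean and standard deviation, and gets the p-value by simulating that gap for normal samples. The sample size, seeds, and the exponential alternative are arbitrary choices for the illustration.

```python
import math
import random
import statistics

def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def ks_distance_to_fitted_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF with fitted mean/sd."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        f = normal_cdf((x - mean) / sd)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def normality_pvalue(sample, n_sims=2000, seed=3):
    """Simulated p-value for the null hypothesis that the data are normal."""
    rng = random.Random(seed)
    n = len(sample)
    d_obs = ks_distance_to_fitted_normal(sample)
    # The statistic is unchanged by shifting or rescaling the data,
    # so simulating from N(0, 1) covers every normal distribution.
    hits = sum(
        ks_distance_to_fitted_normal([rng.gauss(0.0, 1.0) for _ in range(n)]) >= d_obs
        for _ in range(n_sims)
    )
    return hits / n_sims

rng = random.Random(4)
normal_data = [rng.gauss(10.0, 2.0) for _ in range(200)]
skewed_data = [rng.expovariate(1.0) for _ in range(200)]
p_normal = normality_pvalue(normal_data)
p_skewed = normality_pvalue(skewed_data)
print(f"normal data: p = {p_normal:.3f}")
print(f"skewed data: p = {p_skewed:.3f}")
```

The skewed sample should give a very small p-value, which is evidence against normality. A large p-value for the normal sample only means the test found no evidence against normality, not that normality has been proved.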
Hope this helps. MATLAB, by the way, is a perfect platform for understanding all this stuff experimentally, by simulating data and computing rejection rates, and so forth. The book Computational Statistics Handbook with MATLAB by Martinez and Martinez is an excellent place to start.
- Peter Perkins, The MathWorks, Inc.