Re: What does the p-value mean



Murphy O'Brien wrote:
> If I'm using a test like say lillietest or kstest I can get a p
> value. When I google it I get quotes like this.
>
> "If the p-value is less than some significance level, alpha,
> (typically practitioners use an alpha of 0.05) then we say that the
> result is statistically significant (at the 5% level) - i.e. the
> probability of incorrectly rejecting the null hypothesis is less than
> 5%."

This quote is either imprecisely worded, or wrong, depending on how harsh you want to be. The p-value is computed for particular data. The probability of incorrectly rejecting is a long-run probability, over many repetitions of the test procedure on different data. A better statement for the Neyman-Pearson interpretation of hypothesis testing would be something like (leaving out the preliminaries and ignoring things like discreteness),


"Given a fixed significance level alpha, chosen in advance (typically practitioners use an alpha of 0.05), you can define a test procedure whose probability of incorrectly rejecting a true null hypothesis is only alpha, simply by computing the p-value (the tail area under the null) and rejecting the null if p<alpha. In that case, with alpha = 0.05, we say that a result is statistically significant (at the 5% level). You have no idea (and are not even allowed to care) whether you're correct in any particular case, but you know that the probability of rejecting a true null hypothesis is only 5%, so you're doing well in that sense in the long run."
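The long-run claim is easy to check by simulation. Here is a minimal sketch in Python (standard library only). It assumes a two-sided z-test of "mean = 0" with known standard deviation 1 as a stand-in for whatever test you are actually using, and the function names are made up for illustration.

```python
import random
from statistics import NormalDist

def z_test_pvalue(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, alpha=0.05, n=30, reps=20000, seed=1):
    """Fraction of simulated data sets for which p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_pvalue(sample) < alpha:
            rejections += 1
    return rejections / reps

# Null true (the mean really is 0): the rate should come out close to alpha.
print("null true :", rejection_rate(true_mean=0.0))
# Null false (the mean is 0.5): the rate is the power, well above alpha.
print("null false:", rejection_rate(true_mean=0.5))
```

The first number is the long-run error rate that the Neyman-Pearson statement is about. It says nothing about whether any single rejection was right or wrong.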

A Fisherian might say,

"If the p-value is small, then the observed test statistic falls among a group of potential outcomes that are extreme and pretty unlikely (you'd only see such outcomes 5% of the time, say), and so a more plausible explanation of the data is that the null hypothesis is not true. We then say that the result is statistically significant (at the 5% level), because you have some strong evidence against the null (if it were true, then a very unlikely event would have had to have occurred). If you use this procedure with a fixed cutoff of 5%, then the probability of incorrectly rejecting a true null hypothesis over the long run is 5%, but that's just gravy -- really you care about the evidence (or lack thereof) that each particular set of data provides against the null."
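To make "tail area under the null" concrete for one particular data set, here is a small Python sketch. It again assumes a z-test with known standard deviation 1, and the data values are invented purely for illustration.

```python
from statistics import NormalDist

# Hypothetical data; H0 says they come from a normal with mean 0 and sd 1.
data = [0.9, 1.4, -0.2, 0.8, 1.1, 0.3, 1.6, 0.5]
n = len(data)

# Observed test statistic: sample mean divided by its standard error under H0.
z = (sum(data) / n) / (1.0 / n ** 0.5)

# p-value: probability, under H0, of a statistic at least as extreme as z
# (both tails, since large negative values would count against H0 too).
p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
print(f"z = {z:.3f}, p = {p:.4f}")
```

Here p comes out below 0.05, so in the Fisherian reading these particular data are fairly strong evidence against the null.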

These are two different ways of looking at it.

So this

> But I always thought that if P <0.05 then its a very good bet that
> your data is normal.

is backwards. You haven't said what the hypotheses are, but you probably mean, "the p-value for a test of the null hypothesis of normality is less than .05", and the appropriate conclusion in that case is to reject the null, i.e. you have evidence that the data are not normal.
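Here is a rough Python sketch of which way the conclusion runs for a normality test. It uses the Jarque-Bera statistic with its asymptotic chi-square approximation only because that fits in a few lines of standard-library code. It is a stand-in for lillietest or kstest, not a reimplementation of either, and the helper name is made up.

```python
import math
import random

def jarque_bera_pvalue(x):
    """Asymptotic p-value of the Jarque-Bera test; H0: the data are normal."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under H0, jb is approximately chi-square with 2 degrees of freedom,
    # whose upper tail area is exp(-jb / 2).
    return math.exp(-jb / 2.0)

rng = random.Random(0)
normal_data = [rng.gauss(0.0, 1.0) for _ in range(500)]
skewed_data = [rng.expovariate(1.0) for _ in range(500)]

for name, x in [("normal", normal_data), ("exponential", skewed_data)]:
    p = jarque_bera_pvalue(x)
    verdict = "reject normality" if p < 0.05 else "no evidence against normality"
    print(f"{name:12s} p = {p:.3g} -> {verdict}")
```

A small p is what you get from the clearly non-normal (exponential) data. A large p only means the test found no evidence against normality; it does not prove the data are normal.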


Hope this helps. MATLAB, by the way, is a perfect platform for understanding all this stuff experimentally, by simulating data and computing rejection rates, and so forth. The book Computational Statistics Handbook with MATLAB by Martinez and Martinez is an excellent place to start.

- Peter Perkins
  The MathWorks, Inc.