Re: Anderson-Darling test for discrete distribution fitting



Samik R. wrote:
On 5/2/2007 3:47 AM, David Jones wrote:

My other thought on this topic is that you should check that you
have a reasonable formulation for the AD statistic, suited to
discrete distributions and to tied observations. David Jones


Thanks for all the comments. David, can you clarify some more on your
last comment? Currently I am using the conventional form of AD
statistic for both discrete and continuous distributions (the same
one available at NIST:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm). Do
you mean I should change this in some manner to accommodate discrete
distributions?
Regards,
-Samik

(i) The AD test, as usually done, is essentially a test of whether certain transformed values of the data have a
uniform distribution, where the transformed values are obtained as F(X) for observations X from a
distribution having F() as its CDF. But F(X) is only uniformly distributed if F is continuous, and the
usual asymptotic distribution (for no fitted parameters) is derived for that case. This may be one reason for
wanting to do simulations anyway (to overcome this problem).
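
A minimal sketch of the simulation idea in (i), assuming a fully specified Poisson(lam) null chosen purely for illustration. The helper names (poisson_cdf, poisson_draw, ad_statistic, simulated_critical_value) and the clipping constant are my own assumptions, not anything from the NIST page. It draws samples from the hypothesised discrete distribution, computes the conventional A^2 on each, and reads off an empirical quantile.

```python
import math
import random

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for j in range(1, k + 1):
        term *= lam / j
        total += term
    return min(total, 1.0)

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value by inverting the CDF."""
    u = rng.random()
    k = 0
    term = math.exp(-lam)
    total = term
    # The term > 0 guard stops the loop if rounding keeps total below u.
    while u > total and term > 0.0:
        k += 1
        term *= lam / k
        total += term
    return k

def ad_statistic(u):
    """Conventional A^2 computed from transformed values u_i = F(x_i)."""
    n = len(u)
    eps = 1e-12  # assumption: clip to avoid log(0) at the extremes
    u = sorted(min(max(v, eps), 1.0 - eps) for v in u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n

def simulated_critical_value(lam, n, reps=2000, level=0.95, seed=0):
    """Empirical quantile of A^2 when the data really are Poisson(lam)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        xs = [poisson_draw(lam, rng) for _ in range(n)]
        stats.append(ad_statistic([poisson_cdf(x, lam) for x in xs]))
    stats.sort()
    return stats[min(reps - 1, int(level * reps))]
```

Something like simulated_critical_value(3.0, 50) can then be set beside the usual continuous-case 5% point (about 2.49) to see how far the discrete case departs from it. If parameters are fitted from the data, the fitting step would have to be repeated inside the loop as well.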

(ii) There is a question of how F and (1-F) should be interpreted in the test statistic in
the case of a discrete distribution. I don't know what is usually done. Possibly
F = Prob(X<=Xobs) and 1-F=Prob(X>=Xobs)
or
F = Prob(X<=Xobs) and 1-F = Prob(X>Xobs) = 1-Prob(X<=Xobs).
If you do simulations you get results for your particular choice.
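
To see how much the choice in (ii) matters, here is a rough sketch that evaluates the statistic under both readings of the upper tail, Prob(X>=Xobs) and Prob(X>Xobs). The Poisson model, the function names and the sample values are all hypothetical and only there to make the example run.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k); the empty sum gives 0 for k < 0."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))

def ad_discrete(xs, lam, inclusive_upper):
    """A^2 with F = P(X <= x) and a selectable reading of '1 - F'."""
    xs = sorted(xs)
    n = len(xs)
    lower = [poisson_cdf(x, lam) for x in xs]            # P(X <= x)
    if inclusive_upper:
        upper = [1.0 - poisson_cdf(x - 1, lam) for x in xs]  # P(X >= x)
    else:
        upper = [1.0 - poisson_cdf(x, lam) for x in xs]      # P(X > x)
    s = sum((2 * i + 1) * (math.log(lower[i]) + math.log(upper[n - 1 - i]))
            for i in range(n))
    return -n - s / n

# Made-up sample, for illustration only.
sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 7]
a_inclusive = ad_discrete(sample, 2.5, inclusive_upper=True)
a_exclusive = ad_discrete(sample, 2.5, inclusive_upper=False)
```

The inclusive reading always gives the smaller value, because Prob(X>=Xobs) is at least Prob(X>Xobs) for every observation. Whichever version is used on the data has to be the one used in the simulations too.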

(iii) There is also the possibility of revising the weights being used for the separate terms, which are related
to the idea of "plotting positions". If you rewrite the usual formulation by separating it into two parts,
reversing the summation on one and then recombining so that each observation appears in only one term, you can
get a better idea of what is going on. Each term is then a function like
w log(y) + (1-w) log(1-y)
where w is a weight and y is the probability-point associated with an observation. Allowing y to be free, this
is maximised at y=w (so the corresponding contribution to the statistic, which enters with a minus sign, is
minimised there), so that effectively w is the target for what y should be in the test statistic. For a
discrete distribution you might want the target to be 1/N for the lowest observation, rather than 1/(2N).
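
The recombination in (iii) can be checked numerically. Reversing the second sum gives A^2 = -N - 2 * sum_i [ w_i log(u_i) + (1-w_i) log(1-u_i) ] with w_i = (2i-1)/(2N), so the lowest observation has target 1/(2N). The sketch below compares this with the usual form; the function names and the optional weights argument are my own additions for experimenting, not a standard API.

```python
import math

def ad_conventional(u):
    """Usual computing form: -N - (1/N) sum (2i-1)[ln u_i + ln(1-u_(N+1-i))]."""
    u = sorted(u)
    n = len(u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n

def ad_weighted(u, weights=None):
    """Recombined form: one term w*ln(y) + (1-w)*ln(1-y) per observation."""
    u = sorted(u)
    n = len(u)
    if weights is None:
        # Conventional plotting positions (2i-1)/(2N) for i = 1..N.
        weights = [(2 * i + 1) / (2 * n) for i in range(n)]
    total = sum(w * math.log(y) + (1.0 - w) * math.log(1.0 - y)
                for w, y in zip(weights, u))
    return -n - 2.0 * total

def term(w, y):
    """A single term of the recombined sum."""
    return w * math.log(y) + (1.0 - w) * math.log(1.0 - y)

# Made-up probability points, including a tie.
u = [0.12, 0.30, 0.30, 0.55, 0.81, 0.97]
```

With the default weights, ad_weighted(u) reproduces ad_conventional(u). Each term is largest at y = w, so its negated contribution to A^2 is smallest there. Passing a different weights list is one hypothetical way to try other targets, such as 1/N for the lowest observation; the null distribution would then need to be simulated afresh for that choice.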

Hope this helps

David Jones

