Re: gamfit and zero values



"Ben" <junk@xxxxxxxxxxx> wrote in message
news:ef5d399.-1@xxxxxxxxxxxxxxxxxxxxxxxxxx
Can anyone explain why Matlab's gamfit blows up when trying to fit a
data set containing zero values?

Ben, gamfit does maximum likelihood estimation. The likelihood function is
unbounded when there are zeros in the data.

One recommended route is to switch to the method of moments in this case,
and that's what the gamfit function does. Maximum likelihood is often a
good method, but by no means the only one. An advantage of maximum
likelihood is that there are confidence interval calculations based on the
likelihood surface.

In your example, you start with something having a gamma distribution, but
then take the floor of it. You conclude that the results are no longer
accurate. Of course, the truncated values no longer have a gamma
distribution much less the same distribution as the original continuous
data. If you use round() in place of floor(), then the result still
wouldn't have a gamma distribution but you wouldn't be systematically
shifting things down, and the estimates would be a bit closer to the true
values for the original data.

Here's what I might do if I were stuck with integer versions of data that
might inherently have a gamma distribution. First I'd define a function
that took the rounding into account. The function f below computes the
probability of each x value. Then it computes minus their sum on the log
scale to avoid numerical problems and for convenience:
x = round(exprnd(ones(1000,1)));
f = @(p) -sum(log(gamcdf(x+.5,p(1),p(2)) - gamcdf(x-.5,p(1),p(2))));

Then I'd use fminsearch or something better from the Optimization Toolbox to
find the maximum likelihood estimates:

p = fminsearch(f,[.5 2])
p =
0.9343 1.1010

This still doesn't give me confidence intervals. To do that I'd have to
examine the likelihood function more carefully. Here's some code to plot
it.

p1 = linspace(.7,1.2,21); p2 = linspace(.8,1.4,19);
for j=1:21, for k=1:19, c(j,k) = f([p1(j),p2(k)]); end, end
contour(p2,p1,c)

And here's something do draw an approximate boundary of a 95% confidence
region computed based on a likelihood ratio statistic:
fmin = f(p)
flr = fmin + chi2inv(.95,2)/2
hold on
[ignore,h] = contour(p2,p1,c,[flr flr]);
set(h,'linestyle',':')
hold off

Some of the details here may be need to be adapted for your real situation,
for example you may need an interval other than [x-0.5,x+0.5] if you're not
rounding to integers. I hope this gives you some ideas, though.

-- Tom


.



Relevant Pages

  • Re: Maximum likelihood estimator and multiple maxima
    ... > Maximum likelihood is NOT the same as minimum chi-squared. ... > prior can separate them. ... need to give prior probability distributions on the parameters of the fit. ...
    (sci.stat.math)
  • MLE - 1st and 2nd derivative of the function
    ... it seems I haven't been doing the maximum likelihood ... This involves millions of calculations. ... maximum likelihood estimation involves the following four ...
    (sci.stat.math)
  • How to find the likelihood functions?
    ... I am writing a winBUGS code and I need to specify the likelihood ... for the gamma distribution and I am specifying it as the pdf: ...
    (sci.stat.math)
  • Re: How to find the likelihood functions?
    ... for the gamma distribution and I am specifying it as ... could use the pdf in ... is the real likelihood function, ...
    (sci.stat.math)
  • Problem about likelihood ratio hypothesis test
    ... I am not sure whether the following programs performed for the likelihood ratio test are correct or not. ... % returns maximum likelihood estimates for the parameters of gamma distribution and exponential distribution. ...
    (comp.soft-sys.matlab)