Re: Anyone have a statistics background?
- From: iandjmsmith@xxxxxxx
- Date: 25 Jan 2006 04:32:20 -0800
Schizoid Man wrote:
> Bill Riel wrote:
> > In article <dr41bc$s5b$1@xxxxxxxxxxxxxxxxxxxxx>, schiz@xxxxxx says...
> >
> >>I have a quick question...
> >>
> >>I need a numerical approximation to Student's t distribution that is
> >>accurate to double precision.
> >>
> >>Unless I am completely mistaken, the TDIST() function in Excel behaves
> >>very strangely.
> >>
> >>It cannot calculate a value for x < 0 and the result of this function is
> >>the pdf of the distribution, not the cdf unlike NORMSDIST().
> >>
> >>I realize that I could construct a better function using the BETADIST()
> >>function, but I'm not sure whether this will alleviate or compound
> >>whatever inaccuracies are already present in Excel.
> >>
> >>Currently, the one I am leaning towards is Numerical Recipes.
> >
> >
> > I don't know about Excel - I do recall that in the past that some of the
> > statistical functions were questionable.
> >
> > However, I've used the code from Numerical Recipes and it's worked well
> > - it should meet your precision requirements.
>
> Thanks, Bill. That's exactly what I did - a bit convoluted given that
> you needed a Log Gamma and Incomplete Beta integral in order to get the
> t, but probably worth it given the precision I am striving for.
Watch out with NR algorithms. They have a nasty habit of omitting to
mention where they don't work. I know you are only using it for the
t-distribution but betacf is slow and inaccurate for both large a&b and
for small a&b.
The implementation of gammln only guarantees an absolute error of less
than 2e-10. That means that when you exponentiate it the relative error
in the answer will be 2e-10. This is the best you can claim for your
overall answer for the cdf of the t distribution. I believe the GAMMALN
function in EXCEL2003 uses the same implementation or something very
similar.
While we are on similarities of code, I think EXCEL2003 uses equation
6.4.9 in NR, or similar, to calculate the t-distribution figures in
terms of the beta distribution.
Note, however, that as df gets large, df/(df+t^2) tends to 1 and the
BETADIST calculation in EXCEL terms suffers from the fact that
cancellation errors occur when evaluating 1-(df/(df+t^2)). For large df
and -ve t, it is better to calculate cdf_tdist(t,df) as
comp_cdf_beta(t^2/(t^2+df),1/2,df/2) instead of
cdf_beta(df/(t^2+df),df/2,1/2).
Of course EXCEL does not offer a right tail evaluation for the beta
distribution and so cannot take advantage of this obvious method of
making the calculations more accurate.
Mind you, the NR code does not offer a method of evaluating the right
tail for the beta distribution. So the NR algorithm does not appear to
have any advantages over the EXCEL2003 version.
With the example you gave for NORMDIST, you must be using an earlier
version of EXCEL. In that case, I would strongly recommend you use the
VBA code in http://members.aol.com/iandjmsmith/Examples.txt This does
have code which evaluates the beta distribution for large and small
values of the parameters. Additionally it has separate code for the t
distribution which is more accurate and does not suffer from problems
such as df/(t^2+df) overflowing in the calculation of t^2 or
underflowing if calculated as df/t/(t+df/t). In other words it works.
Ian Smith
.
- Follow-Ups:
- Re: Anyone have a statistics background?
- From: Schizoid Man
- Re: Anyone have a statistics background?
- References:
- Anyone have a statistics background?
- From: Schizoid Man
- Re: Anyone have a statistics background?
- From: Bill Riel
- Re: Anyone have a statistics background?
- From: Schizoid Man
- Anyone have a statistics background?
- Prev by Date: Re: George's Abusive Posts
- Next by Date: Re: George's Abusive Posts
- Previous by thread: Re: Anyone have a statistics background?
- Next by thread: Re: Anyone have a statistics background?
- Index(es):
Relevant Pages
|