Re: Measuring Turquoise Underwear
- From: John Griffin <thathillbilly@xxxxxxxxxxx>
- Date: 14 May 2007 17:21:03 GMT
Evil Nigel <useweb@xxxxxxxxxx> wrote:
John Griffin wrote:
Stig Holmquist <stigfjorden@xxxxxxxxxxx> wrote:
On Thu, 10 May 2007 12:45:07 +0100, Evil Nigel
<useweb@xxxxxxxxxx> wrote:
Evil Nigel wrote:
Stig Holmquist wrote:
On Sun, 06 May 2007 16:49:12 -0400, Stig Holmquist
<stigfjorden@xxxxxxxxxxx> wrote:
On Sun, 06 May 2007 11:44:12 +0100, Evil Nigel
<useweb@xxxxxxxxxx> wrote:
Stig Holmquist wrote:
On Sat, 05 May 2007 18:42:08 +0100, Evil Nigel
<useweb@xxxxxxxxxx> wrote:
Stig Holmquist wrote:
Please explain the formula used for std.dev. and what
book you used. The std.dev. for sums in the 6/49 game
is 32.8, and the std.dev. for 49 integers is 14.14.
Where does 12 come from?
Stig Holmquist
Do you have Excel?
   A   B   C   D   E   F   G
1  1   2   3   4   5   6   =STDEVPA(A1:F1)
2  1   2   3   47  48  49  =STDEVPA(A2:F2)
3                          =AVERAGE(G1:G2)
The value in G3 = 12.36, a little higher than the average
population standard deviation I calculated for the 14M
combinations.
Make that 'a little lower' - my calculated average is
approx. 12.72.
Your calculation of std.dev. for rows 1 and 2 treats the
data as a population, but they are just samples from a
population of 14 million. Thus if you treat each set
as a sample you'll get 1.871 for row 1 and 25.211 for row 2,
with a mean of 13.541.
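For anyone without Excel, the two sets of figures above can be reproduced with a short Python sketch. It assumes STDEVPA on purely numeric cells is the ordinary population standard deviation (divide by n) and that the sample form divides by n-1; the variable names are only illustrative.

```python
from statistics import pstdev, stdev

# The two example combinations from the spreadsheet rows
rows = {
    "1 2 3 4 5 6": [1, 2, 3, 4, 5, 6],
    "1 2 3 47 48 49": [1, 2, 3, 47, 48, 49],
}

for label, combo in rows.items():
    # pstdev divides by n (population), stdev divides by n-1 (sample)
    print(f"{label}: population {pstdev(combo):.3f}, sample {stdev(combo):.3f}")

pop_mean = sum(pstdev(c) for c in rows.values()) / len(rows)
samp_mean = sum(stdev(c) for c in rows.values()) / len(rows)
print(f"mean of population s.d.: {pop_mean:.3f}")
print(f"mean of sample s.d.:     {samp_mean:.3f}")
```

The population figures average to about 12.36 and the sample figures to about 13.541, matching the numbers quoted by both posters.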
Since I didn't really know what I was doing and I needed
a measure of diverseness, I assumed I could use either
population standard deviation or sample standard
deviation provided I was consistent throughout. I leaned
towards population rather than sample because the combo
1-2-3-4-5-6 is a complete population - there are no more
members, the values of which are unknown.
But there are 43 sets of samples with std.dev. of 1.871
and only one with 25.211. Also, there is a
distribution curve for the std.dev. of all possible
combinations. The shape of that curve is not known.
That's what I said. However the stats book didn't
specify that the distribution had to be normal. Apart
from the end bits, which are rather small in comparison
to 14 million, it probably is very bell-curve-ish.
Thus it seems to me that any calculation based on 12 is
pointless.
I'm open to suggestions for better methods of analysis.
BTW, I owe you substantial thanks. You're the first
person to have a good read of what I'm trying to
calculate and make intelligent feedback.
Evil Nigel
My memory is not what it used to be so I just recalled
that I corresponded some time ago with Harry Schneider,
who wrote the book "Lottery Numbers". It is about the UK
6/49 game and has stats for about 52 draws. He was the
first to calculate the std.dev. for the numbers and came
up with 14.1. I then modified a formula by J.E. Freund
from his book "Mathematical Statistics" (1962), p. 184,
where he claims the variance for the mean is
(N+1)(N-n)/12n. I suspect this is a misprint and should
not include the n in the denominator. The revised formula
would yield 50x43/12 = 179.17, which gives 13.4 as
std.dev. Keep in mind that the formula for the variance
of 49 integers taken one at a time is (N^2-1)/12 and thus
close to the above formula.
His book is now available with a co-author.
Stig Holmquist
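One way to test the suspected misprint is to enumerate a small game exactly and compare it with the formula as printed. The sketch below is not taken from Freund's book; it assumes "the mean" is the mean of n numbers drawn without replacement from 1..N, and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def variance_of_mean_enumerated(N, n):
    """Exact variance of the mean of n numbers drawn without replacement from 1..N."""
    means = [Fraction(sum(c), n) for c in combinations(range(1, N + 1), n)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)


def variance_of_mean_formula(N, n):
    """The formula as quoted from the book: (N+1)(N-n)/(12n)."""
    return Fraction((N + 1) * (N - n), 12 * n)


# Small pools are cheap to enumerate in full
for N, n in [(10, 3), (12, 6)]:
    assert variance_of_mean_enumerated(N, n) == variance_of_mean_formula(N, n)

# Apply the same formula to the 6/49 game
N, n = 49, 6
var_mean = variance_of_mean_formula(N, n)
var_sum = n * n * var_mean  # the sum is n times the mean
print(f"s.d. of the mean: {sqrt(float(var_mean)):.3f}")
print(f"s.d. of the sum:  {sqrt(float(var_sum)):.3f}")
```

On these small cases the enumeration agrees with the formula with the n left in the denominator, which suggests it is not a misprint if the quantity really is the variance of the mean. Multiplying by n^2 gives a variance of 1075 for the sum, i.e. the 32.8 quoted earlier in the thread for sums in the 6/49 game.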
I've just carried out a std.dev. calculation for the old PB /53
with 302 draws. It yields numbers from 3.21 to 23.70. This range
can then be divided up into 20 classes of width 1 and the
frequencies within each class can then be graphed. The
result is a broad triple peak from 10 to 20 with a small
shoulder peak on each side. I suspect there are too few
data to yield a smooth curve. Can you generate a graph
based on your simulation?
Stig Holmquist
Cruncher running at the moment. I've arbitrarily chosen
integer ranges for the bins. I don't know about a graph -
I've only ever tried the graphing facility a couple of
times and each time they've looked like sh*t.
I'll post the bincounts here if/when it finishes.
Evil Nigel
Wow. I admit my spread*** is slow - 2 hours just to
generate and cycle through 14 million combos, but adding
processing to calculate the population SDev's and putting
them into bins, albeit using crap quality programming, the
macro got just over 5% of the way through in 21 hours.
1 3
2 68
3 417
4 1333
5 3052
6 5744
7 9743
8 15116
9 22278
10 31202
11 42496
12 55908
13 72210
14 90973
15 112116
16 111481
17 90214
18 62232
19 29725
20 8053
21 1208
22 98
23 1
The combo being processed when the macro was terminated was:
1 7 9 18 36 48
The distribution is somewhat skewed. The average SDev so far
is 14.84. Since that drops to 12.72 over the full range of
combos, that would suggest the 5% sample is not
representative of the whole distribution.
Evil Nigel
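The combination being processed when the macro stopped still began with 1, so the 5% covered so far is the start of an ordered walk and not a random sample. A rough alternative, sketched here under the assumption that random draws are an acceptable stand-in for the full enumeration, is to sample combinations at random and average their population s.d. The function name, seed and sample size are arbitrary choices for illustration.

```python
import random
from statistics import pstdev


def mean_population_sd(num_draws, pool=49, picks=6, seed=0):
    """Average population s.d. over randomly drawn combinations (no repeats within a draw)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    numbers = range(1, pool + 1)
    total = 0.0
    for _ in range(num_draws):
        total += pstdev(rng.sample(numbers, picks))
    return total / num_draws


estimate = mean_population_sd(20_000)
print(f"estimated average population s.d.: {estimate:.2f}")
```

With a few thousand random draws the estimate should land close to the 12.72 figure for the full range, without having to walk the combinations in order.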
If you tested the last 500 draws of UK649 I'm sure you would
get a cumulative mean that approaches the theoretical limit.
But you must use the sample s.d., not the population form.
Stig
s.d. frequency
2: 385
3: 4008
4: 17658
5: 51401
6: 112400
7: 208995
8: 351353
9: 539383
10: 764322
11:1015862
12:1284690
13:1530424
14:1701592
15:1755043
16:1621714
17:1289115
18: 865629
19: 499579
20: 245578
21: 93947
22: 25502
23: 4649
24: 555
25: 32
Maximum variance at: 1 2 3 43 48 49
Average s.d.: 13.9376251501268
Since you say the average should be 12.72, I wonder if I got
this right. Where did that number come from?
Are you using sample standard deviation? As that has more
degrees of freedom, it will be larger.
Evil Nigel
Yes.
This oughta do it:
Sample standard deviation: frequency, cumulative proportion;
population s.d.: frequency, cumulative proportion.

      Sample s.d.             Population s.d.
s.d.  frequency   cum.prop.   frequency   cum.prop.
  1:          0   .                   0   .
  2:        385   .00003            719   .00005
  3:       4008   .00031           7431   .00058
  4:      17658   .00158          30557   .00277
  5:      51401   .00525          83875   .00877
  6:     112400   .01329         179586   .02161
  7:     208995   .02824         328996   .04514
  8:     351353   .05336         536645   .08351
  9:     539383   .09193         800448   .14075
 10:     764322   .14659        1104192   .21971
 11:    1015862   .21924        1420396   .32129
 12:    1284690   .31111        1705899   .44328
 13:    1530424   .42055        1894023   .57872
 14:    1701592   .54223        1899713   .71457
 15:    1755043   .66774        1637695   .83169
 16:    1621714   .78371        1166599   .91511
 17:    1289115   .87589         686199   .96418
 18:     865629   .9378          336256   .98823
 19:     499579   .97352         126475   .99727
 20:     245578   .99108          32436   .99959
 21:      93947   .9978            5168   .99996
 22:      25502   .99963            490   1.
 23:       4649   .99996             18   1.
 24:        555   1.                  0   1.
 25:         32   1.                  0   1.
Maximum variance at: 1 2 3 43 48 49
Average sample s.d.: 13.9376251501268
Average population s.d.: 12.7232528212385
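The program behind these tables wasn't posted, so the following is only a sketch of how such a tabulation could be done in Python. It assumes each s.d. is binned by rounding to the nearest integer, which is a guess at the binning used above; the function name and the returned keys are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def sd_histograms(pool=49, picks=6):
    """Bin the sample and population s.d. of every combination of `picks` numbers from 1..pool."""
    sample_bins, pop_bins = Counter(), Counter()
    sample_total = pop_total = 0.0
    count = 0
    for combo in combinations(range(1, pool + 1), picks):
        total = sum(combo)
        total_sq = sum(x * x for x in combo)
        # picks * (sum of squared deviations from the mean), kept as an exact integer
        scaled = picks * total_sq - total * total
        pop_sd = sqrt(scaled) / picks                      # divide by n
        sample_sd = sqrt(scaled / (picks * (picks - 1)))   # divide by n-1
        # Assumption: bins are the s.d. rounded to the nearest integer
        pop_bins[int(pop_sd + 0.5)] += 1
        sample_bins[int(sample_sd + 0.5)] += 1
        pop_total += pop_sd
        sample_total += sample_sd
        count += 1
    return {
        "count": count,
        "sample_bins": sample_bins,
        "pop_bins": pop_bins,
        "mean_sample_sd": sample_total / count,
        "mean_pop_sd": pop_total / count,
    }


# Small demonstration pool; sd_histograms() with the defaults walks all
# 13,983,816 combinations of the 6/49 game and takes correspondingly longer.
demo = sd_histograms(pool=12, picks=6)
print(demo["count"], round(demo["mean_sample_sd"], 4), round(demo["mean_pop_sd"], 4))
for sd_bin in sorted(demo["sample_bins"]):
    print(f"{sd_bin:3d}: {demo['sample_bins'][sd_bin]:8d} {demo['pop_bins'][sd_bin]:8d}")
```

Using the integer sums and sums of squares avoids calling a general-purpose statistics routine 14 million times, which is likely where much of a spreadsheet macro's time goes.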