Re: Measuring Turquoise Underwear



Stig Holmquist wrote:

On Thu, 10 May 2007 12:45:07 +0100, Evil Nigel <useweb@xxxxxxxxxx>
wrote:


Evil Nigel wrote:


Stig Holmquist wrote:


On Sun, 06 May 2007 16:49:12 -0400, Stig Holmquist
<stigfjorden@xxxxxxxxxxx> wrote:



On Sun, 06 May 2007 11:44:12 +0100, Evil Nigel <useweb@xxxxxxxxxx>
wrote:



Stig Holmquist wrote:


On Sat, 05 May 2007 18:42:08 +0100, Evil Nigel <useweb@xxxxxxxxxx>
wrote:




Stig Holmquist wrote:




Please explain the formula used for std.dev. and what book you used.
The std.dev. for sums in the 6/49 game is 32.8, and the std.dev. for
49 integers is 14.14. Where does 12 come from?

Stig Holmquist

Do you have Excel?
     A   B   C   D   E   F   G
1    1   2   3   4   5   6   =STDEVPA(A1:F1)
2    1   2   3  47  48  49   =STDEVPA(A2:F2)
3                            =AVERAGE(G1:G2)

The value in G3 = 12.36, a little higher than the average population standard deviation I calculated for the 14M combinations.


Make that 'a little lower' - my calculated average is approx. 12.72.
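
For anyone without Excel, the same two-row check can be sketched in Python. This assumes STDEVPA on purely numeric cells behaves like the ordinary population standard deviation, which `statistics.pstdev` computes; the variable names are only for illustration.

```python
from statistics import mean, pstdev

low = [1, 2, 3, 4, 5, 6]        # row 1 of the spreadsheet
split = [1, 2, 3, 47, 48, 49]   # row 2 of the spreadsheet

g1 = pstdev(low)      # population SD, stand-in for =STDEVPA(A1:F1)
g2 = pstdev(split)    # population SD, stand-in for =STDEVPA(A2:F2)
g3 = mean([g1, g2])   # stand-in for =AVERAGE(G1:G2)

print(round(g1, 2), round(g2, 2), round(g3, 2))  # 1.71 23.01 12.36
```

The 12.36 is only the average of the two extreme combos, not the average over all 14M.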



Your calculation of the std.dev. for rows 1 and 2 is based on treating
the data as a population, but they are just samples from a population
of 14 million. Thus if you treat each set as a sample you'll get 1.871 for row 1 and 25.211 for row 2, with a mean of 13.541.
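
The sample-based figures quoted here can be sketched in Python with `statistics.stdev`, which divides by n-1 rather than n. The list names are illustrative only.

```python
from statistics import mean, stdev

low = [1, 2, 3, 4, 5, 6]
split = [1, 2, 3, 47, 48, 49]

s1 = stdev(low)     # sample SD (divides by n - 1)
s2 = stdev(split)
print(round(s1, 3), round(s2, 3), round(mean([s1, s2]), 3))  # 1.871 25.211 13.541
```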


Since I didn't really know what I was doing and I needed a measure of diverseness, I assumed I could use either population standard deviation or sample standard deviation provided I was consistent throughout. I leaned towards population rather than sample because the combo 1-2-3-4-5-6 is a complete population - there are no more members, the values of which are unknown.



But there are 44 sets of samples with std.dev. of 1.871 and only one
with 25.211. Also, there is a distribution curve for the std.dev. of
all possible combinations. The shape of that curve is not known.


That's what I said. However the stats book didn't specify that the distribution had to be normal. Apart from the end bits, which are rather small in comparison to 14 million, it probably is very bell-curve-ish.



Thus it seems to me that any calculation based on 12 is pointless.


I'm open to suggestions for better methods of analysis.

BTW, I owe you substantial thanks. You're the first person to have a good read of what I'm trying to calculate and give intelligent feedback.

Evil Nigel



My memory is not what it used to be, so I have only just recalled that I
corresponded some time ago with Harry Schneider, who wrote the book
"Lottery Numbers". It is about the UK 6/49 game and has stats for about
52 draws. He was the first to calculate the std.dev. for the numbers and
came up with 14.1.
I then modified a formula by J.E. Freund from his book "Mathematical
Statistics" (1962), p. 184, where he gives the variance of the
mean as (N+1)(N-n)/(12n). I suspect this is a misprint and should not
include the n in the denominator. The revised formula would yield
50 x 43 / 12 = 179.17, which gives 13.4 as the std.dev. Keep in mind that
the formula for the variance of 49 integers taken one at a time is
(N^2-1)/12 and thus close to the above formula.

His book is now available with a co-author.
Stig Holmquist
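
A brute-force check may be useful here. The sketch below (function names are made up for illustration) compares (N+1)(N-n)/(12n) with the exact variance of the mean over every n-subset of a small range, then applies the formula to 6/49. If the check holds, the n in the denominator looks right for the variance of the mean rather than a misprint, and multiplying the resulting std.dev. by n gives the std.dev. of the sum.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

def var_of_mean_formula(N, n):
    """(N+1)(N-n)/(12n) for the mean of n distinct numbers drawn from 1..N."""
    return Fraction((N + 1) * (N - n), 12 * n)

def var_of_mean_brute(N, n):
    """Exact variance of the mean, taken over every n-subset of 1..N."""
    means = [Fraction(sum(c), n) for c in combinations(range(1, N + 1), n)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)

# Small case where full enumeration is cheap (120 subsets).
assert var_of_mean_formula(10, 3) == var_of_mean_brute(10, 3)

N, n = 49, 6
sd_single = sqrt((N * N - 1) / 12)                 # one number from 1..49
sd_mean = sqrt(float(var_of_mean_formula(N, n)))   # mean of six numbers
sd_sum = n * sd_mean                               # sum of six numbers
print(round(sd_single, 2), round(sd_mean, 2), round(sd_sum, 1))  # 14.14 5.46 32.8
```

The 14.14 and 32.8 are the same figures quoted at the start of the thread for single numbers and for sums.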



I've just calculated the std.dev. for the old PB /53 with 302 draws.
It yields values from 3.21 to 23.70. This range can then be divided up
into 20 classes of width 1, and the frequencies within each class can
then be graphed. The result is a broad triple peak from 10 to 20
with a small shoulder peak on each side.
I suspect there are too few data to yield a smooth curve. Can you
generate a graph based on your simulation?

Stig Holmquist
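
To get a feel for how ragged 302 draws look when split into unit-wide classes, here is a rough sketch using simulated draws. The real PB results are not used; five numbers from 1..53 is an assumption (the post only says "/53"), and the seed is arbitrary.

```python
import random
from collections import Counter
from statistics import stdev

def unit_bins(values):
    """Count values per unit-wide class [b, b+1)."""
    return Counter(int(v) for v in values)

rng = random.Random(0)  # fixed seed so the run is repeatable; simulated draws only
draws = [rng.sample(range(1, 54), 5) for _ in range(302)]  # assumed 5-of-53 format
hist = unit_bins(stdev(d) for d in draws)

# Crude text histogram: one '#' per draw in each class.
for b in sorted(hist):
    print(f"{b:2d} {'#' * hist[b]}")
```

With about 20 classes and only 302 draws, neighbouring classes can differ by chance alone, so several apparent peaks would not be surprising.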


Cruncher running at the moment. I've arbitrarily chosen integer ranges for the bins. I don't know about a graph - I've only ever tried the graphing facility a couple of times and each time they've looked like sh*t.

I'll post the bincounts here if/when it finishes.

Evil Nigel


Wow. I admit my spread*** is slow - 2 hours just to generate and cycle through 14 million combos, but adding processing to calculate the population SDev's and putting them into bins, albeit using crap quality programming, the macro got just over 5% of the way through in 21 hours.

1 3
2 68
3 417
4 1333
5 3052
6 5744
7 9743
8 15116
9 22278
10 31202
11 42496
12 55908
13 72210
14 90973
15 112116
16 111481
17 90214
18 62232
19 29725
20 8053
21 1208
22 98
23 1
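
The tally above can be approximated with a short Python sketch. Two assumptions: combos are generated in lexicographic order, and each population SDev is binned by its integer part (the post only says "integer ranges"). The posted counts add up to 765,671, so the sketch stops after that many combos. `sd_bin` and `tally` are illustrative names, not the original macro.

```python
from collections import Counter
from itertools import combinations, islice
from math import isqrt

def sd_bin(combo):
    """Integer part of the population SD of a combo, using exact integer arithmetic."""
    k = len(combo)
    s = sum(combo)
    sq = sum(x * x for x in combo)
    # population variance = (k*sq - s*s) / k**2, so floor(SD) = isqrt(k*sq - s*s) // k
    return isqrt(k * sq - s * s) // k

def tally(N, k, limit=None):
    """Bin counts for the first `limit` k-combos of 1..N in lexicographic order (all if None)."""
    return Counter(sd_bin(c) for c in islice(combinations(range(1, N + 1), k), limit))

PROCESSED = 765_671  # sum of the posted bin counts
partial = tally(49, 6, PROCESSED)
for b in sorted(partial):
    print(b, partial[b])
```

Under these assumptions the two end bins should come out as 3 and 1, as in the table. Calling `tally(49, 6)` with no limit walks all 14M combos and takes correspondingly longer.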

The combo being processed when the macro was terminated was:
1 7 9 18 36 48

The distribution is somewhat skewed. The average SDev so far is 14.84. Since that drops to 12.72 over the full range of combos, that would suggest the 5% sample is not representative of the whole distribution.
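
One way to see why the partial run is unrepresentative: assuming the combos are generated in lexicographic order (an assumption, though the last combo processed fits it), the position of 1 7 9 18 36 48 can be computed directly. `lex_rank` is an illustrative helper.

```python
from math import comb

def lex_rank(combo, N):
    """Number of k-combos of 1..N that come before `combo` in lexicographic order."""
    k = len(combo)
    rank = 0
    prev = 0
    for i, c in enumerate(combo):
        # Every smaller choice j at position i leaves k-i-1 numbers to pick from j+1..N.
        for j in range(prev + 1, c):
            rank += comb(N - j, k - i - 1)
        prev = c
    return rank

done = lex_rank((1, 7, 9, 18, 36, 48), 49)
total = comb(49, 6)
print(done, total, round(100 * done / total, 2))  # 765671 13983816 5.48
print(comb(48, 5))  # 1712304 combos start with 1
```

The 765,671 matches the sum of the bin counts, and it is well short of the 1,712,304 combos that start with 1. So every combo processed so far contains the number 1, which would be consistent with a higher-than-average spread.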

Evil Nigel



If you tested the last 500 draws of UK649 I'm sure you would get a cumulative mean that approaches the theoretical limit. But you must
use the sample s.d., not the population form.

Stig

I'm happy that I already know the mean; what I'm not certain about is the shape of the distribution.

Since I'm more or less arbitrarily using the population standard deviation throughout, I don't understand why you think I should use the sample standard deviation of each combo in the last 500 draws. The mean of the sample standard deviations over a sample of draws and the mean of the population standard deviations over the whole population of combos will almost certainly be different and non-convergent.

Evil Nigel
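
On the population-versus-sample question, a small sketch may help. For any fixed set of six numbers the sample SD is the population SD times sqrt(6/5), so the two conventions differ by a constant scale factor and the shape of the distribution is the same either way. The combos below are simulated with an arbitrary seed; 12.72 is the average quoted earlier in the thread.

```python
import random
from math import isclose, sqrt
from statistics import pstdev, stdev

n = 6
factor = sqrt(n / (n - 1))  # about 1.0954 for six numbers

rng = random.Random(1)  # arbitrary seed; simulated combos, not real draws
combos = [rng.sample(range(1, 50), n) for _ in range(1000)]

# The ratio sample SD / population SD is the same for every combo.
assert all(isclose(stdev(c), factor * pstdev(c)) for c in combos)

print(round(factor, 4), round(12.72 * factor, 2))  # 1.0954 13.93
```

So a mean of 12.72 on the population basis corresponds to roughly 13.93 on the sample basis. The two means do differ, but only by that fixed factor.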
