Re: Question for the statisticians



On Jun 28, 8:52 pm, henrysun...@xxxxxxxxx wrote:
So I'm reading "Statistics Demystified" and it states the following
about the Central Limit Theorem:

"According to the first part of the Central Limit Theorem, the
sampling distribution of means is a normal distribution if the
distribution for P is normal.  If the distribution for P is not
normal, then the sample distribution of means approaches a normal
distribution as the sample size N increases.  Even if the distribution
for P is highly skewed (asymmetrical), any sampling distribution of
means is more nearly normal than the distribution of P.  It turns out
that if N>=30, then even if the distribution for P is highly skewed
and P is gigantic, for all practical purposes the sampling
distribution of means is a normal distribution."

I'm curious as to how this might apply (or, alternatively, if it does
not apply) to the issue of bridge simulations.

If one were to run 30 separate 50-hand simulations of the same
problem, would the cumulative result of the simulations tend to be
closer to the theoretical expectation than running 1 simulation of
1500 trials?

Or am I misunderstanding the CLT and its possible application to
simulations?

Henrysun909

No, there would be no difference between running 30*50 vs 1*1500
simulations, and you are misunderstanding the Central Limit Theorem.
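
If you'd rather see that than take my word for it, here is a minimal
sketch in Python. It does not deal bridge hands; it assumes each
simulated hand is an independent trial that makes with probability
p = 0.40 (a made-up figure), and the function names are mine. It
repeats both procedures many times and compares how much each pooled
estimate wanders with the theoretical value sqrt(p(1-p)/1500).

```python
import random
from statistics import stdev

def simulate(n_trials, p, rng):
    """Count successes in n_trials independent trials with success chance p."""
    return sum(rng.random() < p for _ in range(n_trials))

def pooled_from_batches(n_batches, batch_size, p, rng):
    """Run n_batches separate simulations and pool them into one proportion."""
    successes = sum(simulate(batch_size, p, rng) for _ in range(n_batches))
    return successes / (n_batches * batch_size)

def spread(estimator, repeats=1000):
    """Empirical standard deviation of an estimator over many repeats."""
    return stdev(estimator() for _ in range(repeats))

rng = random.Random(12345)   # fixed seed so the run is repeatable
p = 0.40                     # assumed true make rate (illustrative only)

sd_batches = spread(lambda: pooled_from_batches(30, 50, p, rng))
sd_single = spread(lambda: simulate(1500, p, rng) / 1500)
theory = (p * (1 - p) / 1500) ** 0.5

print(f"30 x 50 pooled : {sd_batches:.4f}")
print(f"1 x 1500       : {sd_single:.4f}")
print(f"theory         : {theory:.4f}")
```

Both empirical spreads should come out close to the theoretical one,
because pooling 30 batches of 50 is just 1500 independent trials
counted in a different order.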

Suppose you have a particular hand, say Jxx xx KQ10xxx Qx opposite a
15-17 notrump opening. You'd like to know whether the odds favor
blasting 3NT. You enter the parameters into Dealmaster Pro, and
simulate 100 or 250 or 1000 hands. What conclusion can you draw from
the results?

If you state the problem as "what proportion of the umpteen zillions
of possible hands which fit these parameters will make 3NT double-
dummy", you realize this is exactly the same as if you had umpteen
zillion colored balls, some red and some green, one red ball for each
time 3NT fails and one green for each time 3NT makes. You draw a
sample of 100 or 250 or 1000 colored balls and compute the proportion
of greens. The CLT allows us to draw inferences about the proportion
of green balls in the entire population, and to make probability
statements based on the known probabilities of normal populations.
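
The colored-ball picture can be sketched in a few lines of Python.
Everything specific here is an assumption for illustration: an urn of
one million balls standing in for the "umpteen zillions", with exactly
40% of them green. The function name is hypothetical, not part of any
bridge software.

```python
import random

def sample_green_proportion(population_size, green_count, sample_size, rng):
    """Draw sample_size balls without replacement; return the fraction green.

    Balls are numbered 0..population_size-1 and the first green_count
    of them are treated as green.
    """
    drawn = rng.sample(range(population_size), sample_size)
    return sum(ball < green_count for ball in drawn) / sample_size

rng = random.Random(2024)    # fixed seed so the run is repeatable
POPULATION = 1_000_000       # hypothetical stand-in for all possible deals
GREENS = 400_000             # assumed: 3NT makes on 40% of them

proportions = {
    n: sample_green_proportion(POPULATION, GREENS, n, rng)
    for n in (100, 250, 1000)
}
for n, prop in proportions.items():
    print(f"sample of {n:4d}: {prop:.3f} green")
```

Each run gives a somewhat different proportion, and the larger samples
tend to land closer to the true 40%. That sample-to-sample wobble is
what the CLT lets us quantify.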

Each time you run a simulation of, say, 1000 hands, you compute a
proportion of successes (green balls, or 3NT making double-dummy).
That proportion will vary from sample to sample. However, the
distribution of those sample proportions will be approximately normal
-- that's the CLT. When we know something is normally distributed, we
can make probabilistic statements about an individual selected from an
overall population -- and the CLT says we can make probabilistic
statements about the means of samples, regardless of how skewed or
abnormal the underlying population is.

In particular, the type of statement we can make is of the form: if
the underlying population has 40% green balls (3NT makes 40% of the
time), what's the probability a random sample will show 37% greens or
fewer? For p = 40%, the population variance is (.4)*(1-.4) = .24, and
the standard deviation of the sample proportion for a sample of size
1000 (its standard error) is the square root of (.4)*(1-.4)/1000,
= .01549. Now, 37% is a result below the expected value of 40%. We can
calculate how many standard deviations away from 40% it is:
(.37-.40)/(.01549) = -1.94, i.e. 1.94 standard deviations below the
assumed mean of 40%. Now look up 1.94 on a table of standard normal
probabilities, and you will see a probability of .4738: 47.38% of all
outcomes of a standard normal variable lie between 0 and 1.94. Because
the normal distribution is symmetric, 47.38% also lie between 0 and
-1.94. That means that 50% + 47.38% = 97.38% of sample proportions
from a population with 40% green balls should lie above 37%.

In bridge terms, if you'd bid a game with a 40% chance of making
double-dummy, it is fairly unlikely (100% - 97.38% = 2.62%) but not
impossible that your 37% came from a hand that really has chances of
at least 40%. Such a result on a sample size of 1000 would be
sufficient to say "double-dummy analysis rejects bidding game". It is
the Central Limit Theorem combined with the known probabilities of the
normal distribution which allows us to draw such inferences.
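
For anyone who would rather compute than look things up in a table,
the same calculation can be sketched in Python using the standard
library's NormalDist (Python 3.8 or later). The variable names are
mine, and the 40% and 37% figures are the illustrative ones used
above.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.40        # assumed true make rate
n = 1000         # hands in the simulation
observed = 0.37  # proportion of makes the simulation reported

# Standard error of a sample proportion when the true rate is p0.
se = sqrt(p0 * (1 - p0) / n)

# How many standard errors the observed proportion sits from p0.
z = (observed - p0) / se

# Normal-approximation chance of seeing a proportion this low or lower.
lower_tail = NormalDist().cdf(z)

print(f"standard error = {se:.5f}")
print(f"z              = {z:.2f}")
print(f"P(prop <= 37%) = {lower_tail:.4f}")
```

The tail probability should come out at roughly 2.6%, in line with the
table lookup; any small difference is from rounding z to 1.94 before
consulting the table.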
