Re: Predicting the Future and Kolmogorov Complexity
- From: "Seanpit" <seanpitnospam@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: 28 Mar 2007 08:11:03 -0700
On Mar 27, 10:49 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:
"Seanpit" <seanpitnos...@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:1175053380.922091.9850@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Mar 26, 8:00 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:
I don't have much time today, but this is the crux of the whole
problem. It is my argument that the basis of predictability has
nothing to do with knowing anything about the origin of the pattern in
question. Predictability is based on the pattern itself.
Nonsense. In statistics we look for things like lurking variables. Data
processing systems are infamous for supplying them. If you are going to
implement pattern analysis based on Kolmogorov Complexity, how do you
know that you haven't introduced one through your choice of reference
computer?
That is what can happen.
It can happen, but it becomes less and less likely over time as KC,
relative to the chosen UTM, remains constant with each addition to
the string. Again, we aren't shooting for perfection
here. We are shooting for a useful degree of predictive value for the
non-random hypothesis that provides better than even odds of
predicting what will come next.
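As a rough illustration of the claim above, the following Python sketch uses zlib's compressed size as a stand-in for description length. That substitution is an assumption: zlib gives only a crude upper bound tied to one particular compressor, not true Kolmogorov complexity relative to a UTM, and the helper names are invented for this example.

```python
import random
import zlib


def compressed_size(s: str) -> int:
    """Bytes needed by zlib (level 9) to encode the string."""
    return len(zlib.compress(s.encode("ascii"), 9))


def alternating(n: int) -> str:
    """Blue/red marbles in strict alternation: BRBR..."""
    return ("BR" * (n // 2 + 1))[:n]


def random_marbles(n: int, seed: int = 0) -> str:
    """Blue/red marbles chosen by a seeded pseudo-random generator."""
    rng = random.Random(seed)
    return "".join(rng.choice("BR") for _ in range(n))


# Compare how the compressed size changes as each string gets longer.
for n in (1_000, 10_000, 100_000):
    print(n, compressed_size(alternating(n)), compressed_size(random_marbles(n)))
```

With this compressor the alternating string's size creeps up only slowly as the string grows (it does not stay exactly constant), while the pseudo-random string's size grows roughly in proportion to its length.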
The trouble is as follows:
1. For any prediction scheme, there are computable strings that show no
periodicity within the amount of "useful" time available to the prediction
scheme. They fail the non-random hypothesis on a martingale for a
traditional computer, and "look" random, yet are non-random.
That's right. However, there are potential strings for which a
prediction scheme will work. Such strings, if and when they are
actually found, are unlikely to be the result of random production.
The fact that a prediction scheme may not work on all strings is
irrelevant to the fact that when it works, the string is unlikely to
have been randomly produced, and this becomes more and more reliable as
the program has continued success over time.
2. For reference computers optimized to run hash table functions or constant
functions, all description numbers available for programs that produce
strings of what we conventionally see as regular are long with respect to
those strings, because all available short description numbers are used up.
They fail the non-random hypothesis if the martingale correctly matches the
reference computer, pass the non-random hypothesis if the martingale is
mismatched to the reference computer, "look" non-random, yet are random with
respect to the reference computer. The human notion of regularity and the
reference computer's notion of regularity don't have to agree.
That's also true. There doesn't have to be an agreement between my
notion of regularity and that of a particular reference computer.
However, if the reference computer actually finds a string that meets
its own definition of regularity, that string is unlikely to be the
result of random production and this hypothesis becomes more and more
solid over time.
3. No predictor scheme will work that doesn't pay proper fealty to the
Nyquist Theorem. Once you have aliasing going on, and this is a function
strictly of your numerical processing system and not the source data, all
bets are off. Regularity cannot be reconstructed in discrete time data
unless the data is sufficiently oversampled. Undersampled data can show
false regularity or fail to show regularity that is there.
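The aliasing point can be checked numerically. The Python sketch below is only an illustration with arbitrarily chosen numbers (a 9 Hz cosine, a 1 Hz cosine, and a 10 Hz sampling rate are all assumptions, as is the function name): once the sampling rate is below twice the signal frequency, the samples of the fast signal coincide with those of a much slower one.

```python
import math


def sample_cosine(freq_hz: float, rate_hz: float, count: int) -> list[float]:
    """Return `count` samples of cos(2*pi*f*t) taken at t = n / rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


# 9 Hz sampled at 10 Hz is undersampled (Nyquist would require more than 18 Hz).
fast = sample_cosine(9.0, 10.0, 20)
slow = sample_cosine(1.0, 10.0, 20)

# Largest difference between the two sample sequences.
max_gap = max(abs(a - b) for a, b in zip(fast, slow))
print(max_gap)
```

The two sample sequences agree up to floating-point rounding, so from the samples alone the 9 Hz source cannot be told apart from a 1 Hz source.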
That's true. That is why the hypothesis doesn't have a lot of
predictive value at first - with just a small sample. However, as the
sample size grows larger and larger, the hypothesis gains more and
more predictive value or reliability.
4. Because of the superposition principle, you cannot take a string and
impute it to a source. Testing a string for randomness does not tell you a
source is random, because you don't know all the sources for the string.
Testing a string for a pattern does not tell you a source is non-random for
the same reason - especially since your data processing system may have
injected that pattern.
So no, predictability in the real world is never based solely on the
observed pattern.
Testing a string for randomness can never tell you if the source of
the string is actually random with absolute certainty. Likewise,
testing a string for non-randomness cannot tell you if the source of
the string is non-random with absolute certainty. However, the
finding of a pattern that has so far produced much better than even
odds of prediction does indeed indicate, more and more strongly over
time, that the source of the pattern is indeed non-random.
You are mistaken in your notion that the pattern itself says nothing at
all about its likely origin. Try getting that one across to SETI
scientists or forensic scientists or those who manage Las Vegas
casinos.
Sometimes the method of data collection itself introduces a pattern.
Which could be detected and then used for better than even odds of
successful prediction.
How on earth would that be useful?
If the data collection method itself introduces a pattern, that would
be a big problem for a casino manager. The detection of such a
pattern, if distinct enough, would make it much more difficult to
analyze the appropriate "randomness" of something like a roulette
wheel. Cheaters could take advantage of such a blind spot and the
casino owners may suffer a significant financial loss unless they
figure out the cause of the aberrant pattern.
Let's say you are in a room that has a 2 cm diameter round hole in the
wall. Out of this hole comes a ~2 cm blue marble followed by a red
marble that is identical except for color followed by a blue
marble . . . etc. You have no idea as to the origin of the marbles or
why they come out of the hole in such a pattern. After this pattern
has been repeated for a few hundred marbles, with the last marble out
of the wall being a blue marble, the guy next to you asks you if you
would like to bet him on what the next marble will be - blue or red.
Are you telling me you wouldn't bet any money at all in such a
situation? You've got to be kiddin me! How about if the pattern were
unbroken for a million or a billion marbles? - or a trillion? You'd
still not place a bet based only upon your prior knowledge of the
pattern of red and blue marbles?
Especially not in this case. I would immediately suspect I was set up.
You could test this "setup" hypothesis of yours by starting out with a
very small bet based on your overall reserves. I just don't believe
you wouldn't try any bet whatsoever in such a situation.
Testing for a sucker bet generally requires looking beyond the pattern of a
single string.
Not really. The pattern itself can give very good evidence of a
sucker bet. For example, if, all of a sudden, whenever a bet is made
by the "sucker" the pattern changes, that is a very good clue, and one
that gains more and more predictive value as evidence of a sucker bet.
While it is always possible that you could be wrong, that the pattern
could suddenly change, the odds against this possibility grow as the
finite pattern increases. Even though 100% predictability is never
attainable, better than even odds are attainable and these odds get
better and better with each successful prediction of the program.
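One way to make "the odds get better with each successful prediction" concrete is a simple Bayesian update. The Python sketch below rests on strong assumptions that are not in the thread: it compares only two hypotheses (a source the predictor always gets right versus a fair coin), the prior is an arbitrary number picked for illustration, and the function name is hypothetical. In particular it does not model a "setup" in which the source reacts to the bet.

```python
from fractions import Fraction


def posterior_patterned(prior: Fraction, successes: int) -> Fraction:
    """Posterior probability of the 'patterned source' hypothesis.

    Under the patterned hypothesis every prediction succeeds (likelihood 1).
    Under the fair-coin hypothesis each prediction succeeds with
    probability 1/2, so `successes` in a row has likelihood (1/2)**successes.
    """
    likelihood_random = Fraction(1, 2) ** successes
    return prior / (prior + (1 - prior) * likelihood_random)


# Start out very sceptical: one chance in a million that the source is patterned.
sceptical_prior = Fraction(1, 10**6)
for n in (0, 10, 20, 40):
    print(n, float(posterior_patterned(sceptical_prior, n)))
```

Under these assumptions even a very sceptical prior is overwhelmed after a few dozen consecutive successes, though the posterior never reaches exactly 1.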
Knowing nothing about the process dropping marbles, I disagree. How do I
know there is not a hidden microphone just waiting for me to place a bet,
on which the marble choice will suddenly change?
You don't know *for sure*. That's the point. Just because you can
*never* know with 100% certainty does NOT mean that you cannot be
reasonably confident of success with better than even odds.
I would not be reasonably confident in the situation you describe. Not
enough is known about the process.
Oh please . . .
As another illustration, imagine that the blue and red marbles come
out of the hole in the wall without a detectable pattern as far as you
can tell. A guy walks up to you and hands you a piece of paper with
an apparently random series of Rs and Bs written down on it. You
notice that the sequence of Rs and Bs on the paper exactly matches the
order of appearance of red and blue marbles coming out of the wall. You
notice over time that the paper has successfully predicted the color
of the next marble 1,000 times in a row - and the sequence on your
paper extends for another 1,000 "predictions". Then, another man walks
up to you and asks you if you want to bet on the color of the marble.
You have $1,000 in your pocket. Would you bet any of your money on
the color of the next marble based on the past success of the program
written down on your paper?
Nope.
I just don't believe you. This is just ridiculous.
Human betting schemes are based on perceptions about the fairness of the
process. It is not just the pattern observed from the process, it is that
there is at least some transparency about the process that gives it the
appearance of fairness with respect to the odds. For example, you can see
the dealer shuffle the cards. You can see that there is a fair deck.
Regardless of how fairly the dealer appears to shuffle the cards or
the roulette wheel appears to spin, you can be tricked - that's the
basis of sleight-of-hand deception. However, if you are clued into the
actual pattern produced by the dealer or the roulette wheel, you can
pick up on the fact that you are being taken for a sucker. If you
cannot pick up on the aberrancies in the pattern that strongly go
against what a random hypothesis would predict, I know a lot of people
who would just love to play a good game of cards or craps or roulette
with you!
Experimental bias, fraud, lurking variables, and correlations could
ALWAYS be unknowns regardless of how much information you analyze
regarding a sequence or what you think is the likely source of the
sequence (which itself can be described by a sequence). You can NEVER
eliminate such possibilities completely. They are always going to be
there for those who have anything less than absolute knowledge.
However, despite the fact that 100% certainty is not possible, better
than even odds of success are possible - based on sequence analysis
alone.
There are methods for detecting these things that involve looking beyond the
measured string of numbers. Especially correlations.
Many useful correlations are entirely based on the string of numbers.
The odds that you can "compress" or "predict" what will come next in a
sequence increase with each additional successful prediction. You
simply do not have to know certain types of strings completely before
they can be reliably compressed, with better than even odds of success
(not 100%, but better than 50%), via analysis of a finite subsection.
The pigeonhole principle only proves that the vast majority of all
potential strings in sequence space are non-compressible relative to a
given UTM. This doesn't mean that a small minority of strings cannot
be detected to have a useful pattern, useful to predict the rest of the
string with better than even odds of success, before actually seeing
the rest of the string.
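The pigeonhole count behind that statement is short enough to write out. The Python sketch below is an illustration only, assuming binary strings and binary descriptions; the function name is made up. It bounds the fraction of n-bit strings that can have any description of at most n - k bits, whatever the reference machine.

```python
from fractions import Fraction


def max_fraction_compressible(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of n-bit strings describable in <= n - k bits.

    There are 2**(n - k + 1) - 1 binary descriptions of length 0..n-k,
    and each description can produce at most one string, while there
    are 2**n strings of length n.
    """
    descriptions = 2 ** (n - k + 1) - 1
    return Fraction(descriptions, 2**n)


# Fraction of 100-bit strings that could be shortened by at least 10 bits.
bound = max_fraction_compressible(100, 10)
print(float(bound))
```

The bound is always below 2**(1 - k), so it says most strings are incompressible but leaves room for a small compressible minority, which is the minority under discussion here.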
The word "useful" is undefined here.
Did you miss the part about "better than even odds of success"?
Any given UTM compresses the set of strings it is optimized to compress. Not
all UTMs are optimized the same way. Some are optimized for functions that
humans would generally not expect to be "useful."
It doesn't matter what you or I would "expect". If any UTM can
compress a growing string without increasing KC, that is useful for
better than even odds of prediction of what will come next. This
usefulness grows with each success.
Sean Pitman
www.DetectingDesign.com