Re: Probability, compressible sequences: Backgrounds
- From: Thomas Richter <thor@xxxxxxxxxxxxxxxxx>
- Date: Mon, 10 Mar 2008 08:51:24 +0100
a.denis1@xxxxxxxxx schrieb:
A "specific sequence" cannot be random. And as such, as long as you know
your sequence, you can compress it. Actually, to zero bit, as you know
what you're talking about, so there's no point in transmitting it in
first place.
In the compressed sequence you need to insert also the decompressor
otherwise it is trivial.
How can you compress a specific sequence wthin the decompressor to
zero?
Within the given framework (random processes) including the decompressor
is not "necessary". This is because all this theory talks about is the
limiting case of infinitely long sequences generated by a random
process. This means that the "overhead" of including the decompressor -
a limited sized program - is zero. You can do it, but it doesn't change
the entropy. You're probably talking about Kolmogorov-complexity, i.e.
the shortest sized program generating a given sequence, which is quite a
different view.
"Entropy" and "compressibility" are statements that are not
defined on *specific* sequences, or on "subsequences of specific
sequences", but only on *random processes*. A *process* is something
different than a *sequence*.
I think it is possible ( not only possible but I think it is the only
way ) to speak about "Entropy" and "compressibility" about sequence
indipendently from processes.
As long as we're talking about finite sequences, no - it is an
ill-defined concept. For infinite sequences, one can take a
probabilistic view-point - talking about "random" sequences generated by
a random process - or a complexity viewpoint - considering the shortest
program (running on a Turing machine) generating the sequence. In both
cases, the choice of the "programming language" and the "machine" does
not matter - it is an additional constant that diminishes in the limit.
Otherwise, I can always construct a suitable finite state that generates
the given finite sequence with "no program" whatsoever. Or I can
consider the size of the machine as "size of the program", which then
also depends on the representation. No matter how you put it, it is
ill-defined.
I try to reassume your point of view let me know if I misunderstand.
You say that speaking about random sequence is nonsense, you need to
define the process and if the process is random the sequence produced
by the process will be random indipendently from the compressibility
of its serquences.
No. What I want to say is that *processes* can be random or not.
"Randomness" is nothing that can be applied to *sequences*.
How can you define a random process?
It is an abstract model - a thing that generates symbols "on demand"
where, given all symbols you know, the next symbol can be only predicted
with limited success - or rather, the next symbol appears with a
probability depending on the internal state of the machine. For some
random processes, you know this state ("Markov processes"), there are
also others where you don't ("Hidden Markov") but still assume certain
characteristics.
A coin is a random process? Why?
Within a suitable abstraction, the outcome of a coin experiment can be
described *as if* it comes from a random process.
You use a random process as hypothesis and then using this hypothesis
you define the sequence as random .
I'm not using the coin experiment to define what a random process is. It
is an example of an experiment that can be successfully described by a
random process - this is a difference. One can never prove that a coin
*is* a random process, it is just a very reasonable assumption. For
that, an infinite number of experiments has to be made, and we need to
show convergence, etc.. which cannot be done by a physical process for
obvious reasons. A random process is as much a model for reality as "an
electron" is a model for a certain aspect of reality. It allows you to
derive certain results, always hoping that the implicit assumptions you
put into the model fits well to reality, and thus obtaining results that
fit to reality.
You can say ok this sequence is the result of a coin so we know this
is a random process and the sequences will be random but you need to
know that the process is random and to know a process is random you
need to analize its sequences ( yes there are other ways but ...) !
No - exactly not. See above - what I say is that a random process (an
abstract thing!) is a good model for the coin experiment.
We know the coin flipping is a random process becouse in the past we
analize its sequences and we understand is it about impossible to
predict its result and the sequence is incompressbile.
I don't think "we know". As always in physics, it's an incomplete
induction. One can *assume* it is suitably precisely modeled by a random
process and *hope* that results derived this way are fine. Actually,
experience shows that it is, yes.
For this reason I go in the opposite direction I claim that the only
way to define a random process is to identify its sequences as
random , not compressible.
Unfortunately, this cannot be put in a way that makes sense, see above.
Using the coin as device you can create a random process becouse this
process produce not compressible sequences.
Now you can try to flip a coin and you can give me a sequence
absolutely compressible and claim that it is random.
If I use a biased coin, the sequence generated by a coin experiment will
very well be compressible, but it is still random. Even a loaded dice
generates random sequences. (More precise: "A zero-order random process
modeling the biased coin does not have maximum entropy"). Just that the
probabilities aren't all equal, but that doesn't change the theory.
Actually, the only reason why we use compressors is that we model data
by random processes whose probability metric is *not* uniform.
Using the compressibility principle I must say that process is not
random and this is a mistake but if you know only that sequence the
best claim possible is that the process is not random !
To say that the sequence is random you need to identify its process as
known random process and then you can give the correct answer , but
you need to know a lot of data , more than the amount of the sequence.
Of course "knowing a process" is a lot more than "knowing one of its
outcomes". That's exactly the point why "processes" are compressible or
not, not "outcomes". And that's exactly the reason why all working
compressors can only compress a *tiny* amount of sequences, namely those
that are outcomes of random processes used for defining the compressor.
For these reason I say we can speak about random sequence, becouse the
only way to identify a process as random is to watch at its sequences.
Nope. It's ill-defined - you can only watch finite limited sequences.
From that, one cannot conclude (logically) anything about an infiniteamount of sequences. It's an incomplete induction and doesn't make sense
logically.
For these reason I say we don't need to know the process to speak
about random.
And for these reason I claim
Random data is incompressible by definition!
Nope. It's easy to construct random sequences that are compressible. See
above, the "typical" outcome of a loaded dice will very well be
compressible, yet it's random.
Again, please get your concepts cleaned up. Your interpretation of
"random" cannot be put into a strict definition - it doesn't make sense.
So long,
Thomas
.
- Follow-Ups:
- Re: Probability, compressible sequences: Backgrounds
- From: a . denis1
- Re: Probability, compressible sequences: Backgrounds
- References:
- Probability, compressible sequences: Backgrounds
- From: Thomas Richter
- Re: Probability, compressible sequences: Backgrounds
- From: a . denis1
- Probability, compressible sequences: Backgrounds
- Prev by Date: Re: Query in Gzip file format.
- Next by Date: Re: Probability, compressible sequences: Backgrounds
- Previous by thread: Re: Probability, compressible sequences: Backgrounds
- Next by thread: Re: Probability, compressible sequences: Backgrounds
- Index(es):
Relevant Pages
|
|