Re: Entropy



Thomas Richter wrote:
Mok-Kong Shen wrote:

OK. Let me aks your favour to do one thing: In all the books
I have looked at, Komolgorov complexity is defined for finite
strings and no where is any inappropriateness of finite
input mentioned.

None of the above stuff is new, it's just conclusions from the obvious. While I generally don't appreciate Wikipedia for things like this, just go there. You'll find there exactly the same elementary result I stated above, namely that the complexity is language dependent, and that for any two languages the complexity of a given string differs by a constant (depending on the languages). The specific "lena" example is this invariance theorem, just turned around. By making the machine pretty smart, one can make the constant c as large as the language, and then the complexity of *a specific* string very short.

For any *finite* string the complexity isn't a function of the input string itself. However, if you approach the limit of longer and longer strings, this constant becomes irrelevant, but only then.

There is no big science behind that, really.

So your claim that for finite input Komolgorov
complexity makes little sense must be something new in the
field. Could you please give a reference to a book or a journal
that supports your claim, or is it a yet unpublished scientific
result of your own?

Definitely not.

For example, go into Ray Solomonoff's work "A Preliminary Report on a General Theory of Inductive Inference," Report V-131, Zator Co., Cambridge, Ma. Feb 4, 1960." cited there. Go to section 4, and read it. It states exactly what I say - the constant in the above equation becomes irrelevant for long strings, and only then - and *then* the definition is machine-independent. Otherwise, it depends on the machine, clearly. That I can set the complexity to one for a *specific* message is a trivial corollary from the invariance theorem. Specific means here:

For any message A there exists a Turing machine U such that C_U(A) = 1.

quite contrary to:

There is a Turing machine U such that for any message A C_U(A) = 1.

which is a different statement, and clearly wrong. As I might have said already, it depends on the position of the quantors. As the description of a Turing machine is by definition finite, the second statement is clearly wrong since I can always set the message just considerably longer than the description of the machine to be on the safe side. This does not hold for the first statement.

Really, it's just about reading the statement carefully.

I regret that I continue to have difficulties in understanding your
point. Please have patience with me, whose knowledge is very humble.

1. In the Wiki page there is no mention of "infinite string". This,
together with the use of computer programs (in the common style) to
illustrate, seems to indicate that there the term "string" is employed
in the sense 'normally' understood, i.e. a finite string.

2. You seemed to mean that Sec. 4 of the the paper of Solomonoff
(referenced by Wiki) would easily clear up the issue of "infinite
string" vs. "finite string". However, firstly, I don't see there
the word "infinite" and secondly, I regret that that paper is at
a too high level for me to understand.

3. I spent today several hours in two libraries to search for a
book where "infinite string" is mentioned in connection with
Kolmogorov complexity but unfortunately failed. On the other hand,
I found the word "finite" being explicitly mentioned in several
places. Below are a few examples:

(a) Kolmogorov [41], and independently Chaitin [18], established
a mathematical notation of the complexity of finite sequences.
([41] refers to a paper of Kolmogorov of 1965.)

(b) The Kolmogorov complexity of a finite object x is the length
of the shortest effective binary description of x.

(c) Apart from showing that complexity is an attribute of the finite
object alone, the Invariance Theorem has also another most important
consequence: it gives an upper bound on the complexity.

4. In the case of a finite string, each P of the expression | P |
that enters into min { .... } of the formula for complexity is,
if I don't err, a program that 'successfully' ran on the Turing
machine with respect to that given string. That is, the machine
finishes its work and halts. Now, in the case of an infinite string,
the machine either must continue to work, because there always
remain symbols to be processed, and hence not halt, or stops before
everything has been processed, which would mean that that particular
P is not correct for the given string and hence the corresponding
| P | could not be used in min { ..... }. So, either way, I don't
see how one could deal with infinite strings at all similar to the
way one does with finite strings, when min { ......} is to be
evaluated in the formula for complexity.

5. In view of the above I would like to ask your favour to give a
reference to a book or a well-known journal where the use of infinite
strings in Kolmogorov Complexity is 'explicit' and thus could convince
laymen like me, who may not even be able to capture a small fraction
of the deep theoretical materials contained therein.

Thanks,

M. K. Shen

.



Relevant Pages