Re: Entropy



Niels Fröhling wrote:
Mok-Kong Shen wrote:

From all what I have read, I don't have the impression that KC is
intended (or, equivalently, can be) applied to 'real world' problems
in the sense that given a string one 'actually' switches on a PC and
obtains a concrete number.

I think paq's development over to zpaq is a good example of how it can "materialize" in reality. Matt defined an "instruction set" which controls the context-selection mechanisms of the paq-NNs. With that you have the possibility of actually searching through all context-spaces for the "program" with the shortest output.
It's not completly free, as the "machine", the interpreter still has fixed size, but you can select the optimal parameter-set for the given fixed length "machine".
I think one can relate it to LZ, which is a fixed size "machine", who's "instruction set" are length/offset pairs, you may search for the optimal parameter-set giving the smallest output.
Even though I think not much people think of LZ that way (as a "machine" with an "instruction set"). I believe Matt was pretty much inspired by KC when he made the step towards zpaq.

Sounds nice, but I somehow doubt that this algorithm comes up with the *smallest possible* program required to regenerate the output. Because simply if it would, one could take the size of this program and would then have KC(input). However, KC is provably incomputable, so such a program does not exist.

In practical applications, one can at best consider approximations or upper bounds of KC(s).


Despite the above mentioned fact that KC is sort of gedanken-
experiment and never actually evaluated on a PC, these difficulties
would render the concept as such questionable. For a gedanken-
experiment must be such that it is 'reasonably' doable (have

KC isn't reasonably doable. Worm-holes aren't reasonably doable, still it doesn't stop a serious physicist to make the gedankenspiel. Strings probably aren't reasonably provable by real-world devices, still they are part of serious physical theories with real-world implications.

Exactly. It is a theoretical concept, a tool, and by thinking about it
you can come up with reasonable results that are of interest of the field, but you shouldn't expect to find a program that provides you with the KC of a given string. That is, provably, impossible.

similarity to one's procedures in real world experiments) at least
in one's thought and cause no 'essential' problems in the imagined
world. On the other hand, I have, for lack of knowledge, till now
not been able to counter Richter's critique on the application of
the concept of KC on finite strings (even though all textbooks I

Thomas just thinks it's not very interesting to think about machine dependent KC. It contradicts sort of the universality of KC. And as he said, if you look to eliminate the machine dependency, you marginalize it (you don't want to consider all machines, because then you have to look infinite time on infinite machines for a finite string, that's even worst). You can marginalize it by setting finite algorithm size vs. infinite string size.
Maybe you have to understand that the instruction set of a turing machine is free to be defined. If you focus on finite strings you sort of fall into a local minimum, you start to pay attention to the machine, which you shouldn't as you're looking or thinking about universal KC.
Well and Thomas didn't say you can't do it, it's forbidden, or pervert. He just stated as his personal opinion, that it's not very "interesting" in comparison.

I hope I got this right. :)

I would say so. I'm saying it is not very interesting because you cannot derive anything interesting about such a finite string. Every finite string is "special", and you can design machines that are good at generating such special strings. This "loop hole" is closed by taking the limit and applying the idea to longer and longer strings, and the reason why this loop hole is closed is that the Turing machine can only have a *finite* internal state machine. Thus, "if the input is considerably longer than the internal state machine", your machine can no longer be tuned to the specific input.

It is exactly the same situation we find in compression: For any *specific* output ("lena") we can tweak the compression scheme to be good at this output by making the decompressor larger. But by requiring a finite size of the decompression algorithm, any such attempt is impossible by looking at longer and longer inputs. As soon as the input grows beyond the size of the decompressor (roughly), the loop hole breaks down.

have seen define KC on finite strings, some explicity, others
apparently implicitly). I believe that I have now found that his
issue is highly probably a non-issue. Let me argue in the following:

Richter uses "Lena compress" to support his claim that KC makes
little sense for finite strings.

Yes, his machine got 1 instruction, is still universally usable for any other string than "lena", but sucks.

Definitely so. (-:

Now, while Richter's argument is based on a special machine T that
has a special state favouring "Lena", this is by definition not
the case with the reference machine U above. Therefore I think
that Richter's claim that KC makes little sense for finite strings
is apparently invalid.

Uhm, as I understood Thomas, his instruction set is the crucial point, not the state (you can replace "lena" by any finite string, that's really boring to define a special instruction set for every finite string, right?).

Right. Whether that's "lena" or "barbara" doesn't matter at all. The important fact is that tuning a Turing machine for a specific string is possible while keeping it universal.

Thus, asking for the KC of a specific (finite) string doesn't make too much sense in first place. It is pretty machine dependent, and I can easily design a machine for which the KC of this finite string is one.

What is *not* machine dependent, however, is what happens if I grow the string size longer and longer. That is a universal property, unlike the former. Not in the sense that I can get definite values for KC, but in the sense that it either diverges or not, and when it diverges, in which rate it does diverge. From that, I can draw conclusions on the string. I cannot draw any conclusions from the KC of a finite string.

Same problem of course with entropy: You cannot ask for "the entropy" of a finite string. This is an ill-defined problem. You can ask for the entropy of a random source (which generates infinite strings!). Unlike KC, however, entropy gives me a good hint in how to approximate it under the assumption that my input string is "typical", namely by designing a model from the observed relative frequencies in the string.

So long,
Thomas




.



Relevant Pages

  • Re: Entropy
    ... I think one can relate it to LZ, which is a fixed size "machine", who's "instruction set" are length/offset pairs, you may search for the optimal parameter-set giving the smallest output. ... 'all' possible programs P that corresponds to the given string x ... It contradicts sort of the universality of KC. ... If you focus on finite strings you sort of fall into a local minimum, you start to pay attention to the machine, which you shouldn't as you're looking or thinking about universal KC. ...
    (comp.compression)
  • Re: Entropy
    ... I think one can relate it to LZ, which is a fixed size "machine", who's "instruction set" are length/offset pairs, you may search for the optimal parameter-set giving the smallest output. ... 'all' possible programs P that corresponds to the given string x ... It contradicts sort of the universality of KC. ... If you focus on finite strings you sort of fall into a local minimum, you start to pay attention to the machine, which you shouldn't as you're looking or thinking about universal KC. ...
    (comp.compression)
  • Re: An uncountable countable set
    ... repeating decimals, ... that is every digit in that string is in a finite position with respect ... finite strings you call naturals. ... mind that the set of such finite string representations of infinite ...
    (sci.math)
  • Re: Entropy
    ... Thomas just thinks it's not very interesting to think about machine dependent KC. ... And as he said, if you look to eliminate the machine dependency, you marginalize it (you don't want to consider all machines, because then you have to look infinite time on infinite machines for a finite string, that's even worst). ...
    (comp.compression)
  • Re: every is not all
    ... is not an infinite string that is defined by its digits, ... infinite string as such. ... For every finite string we can determine a difference to the infinite ...
    (sci.logic)