Re: Attention Sean - question about CSI



On 2007-08-03, R. Baldwin <res0k7yx@xxxxxxxxxxxxxxxxxxxx> wrote:
"Ari H" <email@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:f8ssa8$2g6$9@xxxxxxxxxxxxxxxxx
On 2007-08-02, R. Baldwin <res0k7yx@xxxxxxxxxxxxxxxxxxxx> wrote:
"Ari H" <email@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:f8s40v$2g6$3@xxxxxxxxxxxxxxxxx
On 2007-08-02, R. Baldwin <res0k7yx@xxxxxxxxxxxxxxxxxxxx> wrote:
"Ari H" <email@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:f8q7rv$v7d$1@xxxxxxxxxxxxxxxxx
On 2007-08-01, TomS <TomS_member@xxxxxxxxxxx> wrote:
"On Tue, 31 Jul 2007 19:12:04 -0600, in article
<O4Kdnbhd6Ix7QDLbnZ2dnUVZ_g6dnZ2d@xxxxxxxxxxx>, dkomo stated..."

Bobby Bryant wrote:
If I have a string of 1,000,000 random binary digits -- genuinely
random, not created by a deterministic RNG -- does it have more,
less, or the same CSI as a string of the same length where the
bits have deliberately been set, e.g. by copying a pattern?


Consider the 1,000,000 random binary digits as a file and run it
through
the best compression algorithm you have. It should compress very
little. On the other hand, a file which has strong patterns will
compress a lot. Maybe the degree of compression can serve as a
measure
of the "pattern-ness" of the file. What relation this would have to
CSI
I'm not sure because I haven't been following the raging CSI debates.

If I am allowed to choose the compression algorithm *after*
seeing the million random digits, then I can choose one
which compresses that string to a single bit.

A side note: generally, in this kind of context you need to include
the compression algorithm in your size estimate, ie you need to give
the size of the decompression program plus the compressed data.

Good luck doing that with 1 bit... :-)


Compressing arbitrary strings to 1 bit depends on being able to choose
the
computer. If you are stuck with an 8086 or 68030 derivative computer,
that
choice limits which strings are compressible. Across the infinite set
of
all
possible computers, however, there are an infinite number on which it
is
possible to compress any given string to 1 bit.

In that case, selecting the right computer to decompress is part of
the algorithm. How are you going to tell what computer to use with
just 1 bit?

That depends entirely on context. We could conceive, for example, of a
special cryptography computer being designed and produced in a limited
number, having a built-in set of large random numbers. These computers
would
compress algebraic combinations of large random numbers to readable text,
and decompress readable text to algebraic combinations of large random
numbers. The algorithms could be represented on the computer by very
short
programs, such as "execute #1", "execute #2", "execute #3", etc. The
shortest algorithm (execute #1) could be a 1-bit program. The shortest
text
string can be a null string. There will be one really long string on this
machine that compresses to 1 bit, and the design of the computer
determines
what that string is. The choice of computer is made simply by handing
copies
of the computer to the appropriate spies with a knowing wink.

You misunderstand me. You are not allowed to transmit anything,
except the compressed data and the description of the algorithm,
so "handling copies of computers" is not allowed here. If you
do that, you're just splitting the data transmission in two
parts, and you have to include both when comparing data sizes.

You are free to use any language to describe the algorithm. For
simplicity, let's say you are transmitting the algorithm and the
compressed data to an intelligent extra terrestial alien with
whom you have never communicated before. You can assume that
the alien is intelligent, and knows that the data you sent has
an algorithm description and compressed data.

Now, if the alien can *theoretically* figure out the algorithm
from the binary code, and use it to decompress the data, then
you sent enough to meet the requirement I meant above.

If, on the other hand, the alien is missing some crucial data,
and can't even theoretically decipher the data, you did not send
enough information.

Well, that is a bit silly. If the sender and receiver do not have
compatible computers, the description of the algorithm will be
meaningless for one of them, won't it?

All it takes is that they have compatible enough universes, that
"laws" of maths and logic are the same... You can carry out any
algorithm by hand, no automated computer is even needed... So
obiviously you do not get what I'm saying. I don't think I can
explain it any beter, so I have to give up.

We never include the description of the computer itself as part of a
complexity description of a string, because there is no arbitrary
reference for describing the computer. If we try, we need another
computer to describe it, and get into an infinite regress.

The thing is, there is no need to describe the "computer",
there's only need to describe the *algorithm*. A set of
logical instructions, that can be translated into any
language, or a mathematical notation, or whatever. You
are free to use any notation that is translatable to any
other notation.

If I don't need to describe the computer, and sender and receiver have
selected computers at random, how do they communicate? The description of
the algorithm is different on different computers. There is no universal
means for representing an algorithm.

It's enough that there is universal way of representing binary
code, and universal "laws" of maths and logic. Intelligence can
figure out the rest, especially if the algorithm is converted to
symbols and symbols are converted to binary code while keeping in
mind that somebody would have to decipher it from scratch.

This concept of compressing arbitrary strings to 1 bit with a
computer selected post hoc is well known in algorithmic information
theory.

But this has nothing to do with my requirement for testing if a
piece of data is compressible...

All finite data is compressible. No computer can compress a larger set of
strings than the pigeonhole principle allows, but the set varies from
computer to computer.

If all data was compressible, you could do this:

1. You give me a computer of any kind.

2. I give you a DVD *full* of bits.

3. You compress that data, and give it back to me on a DVD of same
size, that also *with free space*, ie the compressed version takes
less space.

4. I pop the disk into the computer from step 1, and can decompress
the data, and get data identical to step 2.


Do you *really* think you can do this, even if the DVD of step
2 containts bits generated from eg. thermal noise, with equal
probability for 0 and 1?


--
"Ja pilkut huusivat tuskissaan: 'Lisää vaseliinia, lisää vaseliinia!'"

.



Relevant Pages

  • Re: Attention Sean - question about CSI
    ... It should compress very ... If I am allowed to choose the compression algorithm *after* ... which compresses that string to a single bit. ... from the binary code, and use it to decompress the data, then ...
    (talk.origins)
  • Re: Attention Sean - question about CSI
    ... the best compression algorithm you have. ... It should compress very ... which compresses that string to a single bit. ... from the binary code, and use it to decompress the data, then ...
    (talk.origins)
  • Re: Attention Sean - question about CSI
    ... the best compression algorithm you have. ... It should compress very ... which compresses that string to a single bit. ... already there uncompressed, why would you want to uncompress it ...
    (talk.origins)
  • Re: Attention Sean - question about CSI
    ... or the same CSI as a string of the same length where the ... the best compression algorithm you have. ... It should compress very ... which compresses that string to a single bit. ...
    (talk.origins)
  • fastest decompression of PowerPC object code?
    ... I'm working on a product where we would like to compress gigabytes of ... PowerPC object code and then use the PowerPC to decompress it into RAM ... Which algorithm would be the best? ...
    (comp.compression)

Loading