Re: Attention Sean - question about CSI



On 2007-08-02, R. Baldwin <res0k7yx@xxxxxxxxxxxxxxxxxxxx> wrote:
"Ari H" <email@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:f8s40v$2g6$3@xxxxxxxxxxxxxxxxx
On 2007-08-02, R. Baldwin <res0k7yx@xxxxxxxxxxxxxxxxxxxx> wrote:
"Ari H" <email@xxxxxxxxxxxxxxxxxxxxx> wrote in message
news:f8q7rv$v7d$1@xxxxxxxxxxxxxxxxx
On 2007-08-01, TomS <TomS_member@xxxxxxxxxxx> wrote:
"On Tue, 31 Jul 2007 19:12:04 -0600, in article
<O4Kdnbhd6Ix7QDLbnZ2dnUVZ_g6dnZ2d@xxxxxxxxxxx>, dkomo stated..."

Bobby Bryant wrote:
If I have a string of 1,000,000 random binary digits -- genuinely
random, not created by a deterministic RNG -- does it have more,
less, or the same CSI as a string of the same length where the
bits have deliberately been set, e.g. by copying a pattern?


Consider the 1,000,000 random binary digits as a file and run it
through
the best compression algorithm you have. It should compress very
little. On the other hand, a file which has strong patterns will
compress a lot. Maybe the degree of compression can serve as a measure
of the "pattern-ness" of the file. What relation this would have to
CSI
I'm not sure because I haven't been following the raging CSI debates.

If I am allowed to choose the compression algorithm *after*
seeing the million random digits, then I can choose one
which compresses that string to a single bit.

A side note: generally, in this kind of context you need to include
the compression algorithm in your size estimate, ie you need to give
the size of the decompression program plus the compressed data.

Good luck doing that with 1 bit... :-)


Compressing arbitrary strings to 1 bit depends on being able to choose
the
computer. If you are stuck with an 8086 or 68030 derivative computer,
that
choice limits which strings are compressible. Across the infinite set of
all
possible computers, however, there are an infinite number on which it is
possible to compress any given string to 1 bit.

In that case, selecting the right computer to decompress is part of
the algorithm. How are you going to tell what computer to use with
just 1 bit?

That depends entirely on context. We could conceive, for example, of a
special cryptography computer being designed and produced in a limited
number, having a built-in set of large random numbers. These computers would
compress algebraic combinations of large random numbers to readable text,
and decompress readable text to algebraic combinations of large random
numbers. The algorithms could be represented on the computer by very short
programs, such as "execute #1", "execute #2", "execute #3", etc. The
shortest algorithm (execute #1) could be a 1-bit program. The shortest text
string can be a null string. There will be one really long string on this
machine that compresses to 1 bit, and the design of the computer determines
what that string is. The choice of computer is made simply by handing copies
of the computer to the appropriate spies with a knowing wink.

You misunderstand me. You are not allowed to transmit anything,
except the compressed data and the description of the algorithm,
so "handling copies of computers" is not allowed here. If you
do that, you're just splitting the data transmission in two
parts, and you have to include both when comparing data sizes.

You are free to use any language to describe the algorithm. For
simplicity, let's say you are transmitting the algorithm and the
compressed data to an intelligent extra terrestial alien with
whom you have never communicated before. You can assume that
the alien is intelligent, and knows that the data you sent has
an algorithm description and compressed data.

Now, if the alien can *theoretically* figure out the algorithm
from the binary code, and use it to decompress the data, then
you sent enough to meet the requirement I meant above.

If, on the other hand, the alien is missing some crucial data,
and can't even theoretically decipher the data, you did not send
enough information.

We never include the description of the computer itself as part of a
complexity description of a string, because there is no arbitrary
reference for describing the computer. If we try, we need another
computer to describe it, and get into an infinite regress.

The thing is, there is no need to describe the "computer",
there's only need to describe the *algorithm*. A set of
logical instructions, that can be translated into any
language, or a mathematical notation, or whatever. You
are free to use any notation that is translatable to any
other notation.

This concept of compressing arbitrary strings to 1 bit with a
computer selected post hoc is well known in algorithmic information
theory.

But this has nothing to do with my requirement for testing if a
piece of data is compressible...

If you are allowed to transmit the data separately, and have
a reliable communication channel, you can transmit the data
over the channel with 0 bits... Just agree that if nothing
is transmitted, then some pre-agreed data is the one that is
"sent".

An infinite compression, nifty eh? Feel free to patent it...

--
"Ja pilkut huusivat tuskissaan: 'Lisää vaseliinia, lisää vaseliinia!'"

.



Relevant Pages

  • Re: Use of Pseudo Random Generators for One Time Pad?
    ... By having that string in your algorithm? ... well fail on longer strings. ... amount of compression. ...
    (sci.crypt)
  • Re: Compression leads to encryption NEW COMPRESSION METHOD!
    ... Lossless compression by Michael Harrington. ... given a string of N characters ... Your algorithm then MUST be flawed. ... theorem are proven in mathematics, ...
    (sci.crypt)
  • Re: Compression by Use of Info String
    ... >from the data string itself; and that the information string is small ... (If we didn't have to transmit the information ... Many compression schemes gather data ...
    (comp.programming)
  • Re: Compressing sparse strings
    ... string of length 2^100. ... The most of the bits are zeros and there are ... which algorithm would have the best compression ratio? ... how the prefix compression would perform if the dimension is 10 (i.e. ...
    (comp.compression)
  • Re: The Helsinki Code, Part 3
    ... >> the algorithm described will still get the Bible across the data channel ... > from the encoded text is encryption, not compression. ... map x to the string formed by prepending "1" to x. ...
    (sci.math)

Loading