Shannon Entropy equals Kolmogorov Complexity



On Jul 31, 10:05 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:
"Seanpit" <seanpitnos...@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message

So finding a match to any number at all is significant? Why?

What do you think SETI scientists would do with a radiosignal match to
the first 100 digits of pi repeated a million times in a row? - 10
million times?

First of all, if SETI scientists detected a radio signal with 10 million
digits of anything, pi or not, they would likely conclude terrestrial
origin, because you could not get that much data through the narrow
bandwidth required for vast interstellar distances within the time limits
imposed by orbital mechanics. SETI scientists seem to be pretty
knowledgeable about signal processing, bandwith, Nyquist's Theorem, signal
to noise ratio, etc., so I am pretty confident they would not make such a
silly mistake to assume the first 10 million digits of pi are a possibility
here.

This is a hypothetical, but the assumption here is that if such a
signal where indeed proven to be coming from outer space, not Earth,
what would SETI scientists conclude?

Seth Shostak, a senior astronomer at the SETI Institute, point out
some very interesting concepts regarding the detection of artifact:

"An obvious suggestion . . . is to search for a signal that is
branded with a mathematical label. For example, maybe the aliens will
tag their transmission with the value of pi. That would clearly
bespeak a middle school education, and would prove that the signal
comes from thinking beings, rather than witless neutron stars or some
other cosmic oddity. . . Perhaps the extraterrestrials will preface
their message with a string of prime numbers, or maybe the first fifty
terms of the ever-popular Fibonacci series.
Well, there's no doubt that such tags would convey
intelligence . . . "

Seth Shostak, How to Sort Signs of Artificial Life from the Real
Thing, SETI Thursday, Space.com, January 30, 2003

As for finding 100 digits, it depends on how well the SETI scientists in
question understand mathematics and anthropology. If their understanding is
poor, they might get awfully excited. If their understanding is good, they
might conclude that either the transmission is of terrestrial origin, or
that the digits are an artifact of crosstalk or signal processing. It would
be a mistake to assume that extraterrestrials will use a positional numeral
system comparable to that of modern Western culture, just as it would be a
mistake to assume any particular mathematical relationship between signal
modulation and number.

Mathematics is a symbolic language. Expecting that we can directly
communicate a symbolic language with an alien culture we have never met is
akin to assuming they speak English.

That's right . . . but if they did happen to speak a similar
mathematical language, that would be very good evidence of deliberate
artifact. The fact that their mathematical language may not be
detectable to us at this point it irrelevant. Again, it is impossible
to rule out the possibility of non-detected artifact as responsible
for a given signal. This does not mean that a signal that clearly
shows certain kinds of bias is worthless. It isn't.

Pretending or assuming or modeling the information source as Markov is
a comparison to a specific type of reference - a Markov-type reference
or "source".

Still not getting it, are you?

Nope . . .

Let's back up to your statement "Shannon
information is determined by reference to a known "random" source of string
production - a source that produces maximum Shannon information."

Here are the reasons this is wrong:
1. It takes more than a source to determine Shannon information. It takes a
communication system with source, channel, and receiver. Absent any of these
components and there is no information, as Shannon defines it. Hence,
stating that an information source (regardless of its properties) determines
information is wrong.

The channel and receiver are already givens. The only real variable
is the source. Therefore, the primary determination of Shannon
information is the receiver's assumptions about the nature of the
source.

2. Information sources are generally not random. Shannon's theory only
models them that way to make math easy. Hence, calling information sources
"known random" is wrong.

It isn't wrong because the "model" is "known" or "determined" or
"assumed" by the receiver to be of a certain nature. The assumption
is the basis for receiver "surprise".

3. Information sources do not produce maximum Shannon information. There is
no maximum information because you can always get more by watching the
information source longer. They might produce maximum entropy, if they have
the right symbol distribution, but they do not have to. In general they do
not.

A source that produces a uniform distribution will in fact tend to
produce maximum Shannon entropy over time as well as maximum Shannon
information over a set finite period of time.

4. You are introducing the word "reference" as if that is part of the
language of the Shannon model. It isn't. It is part of the language of the
Algorithmic Information Theory model. It gives your statement the appearance
of improperly mixing two different mathematical theories.

I introduce the word "reference" with regard to Shannon's model on
purpose. I know it isn't "standard", but I am drawing a comparison
that is in fact valid. Shannon's model does use a type of reference,
without with Shannon entropy and information could not be measured. It
matters not if this "reference" is an assumption about a "pretend"
source or not. It still has the function of a reference.

If you still don't understand, tell me what part is unclear, and I'll be
happy to clarify.

Likewise . . .

Further, the information source is not a "reference." You are improperly
crossing over different theories.

I'm saying that assuming the information source is a Markov-type
source is the same thing as using a reference.

Every field of math and science has its own terminology. Information Theory
is no exception. If you want to discuss it, you should try to learn the
terminology so that you don't appear incomprehensible or obtuse. This
statement of yours does not use the conventional terminology.

I know. The fact is though that my use of the term "reference" is a
valid description of what is going on with Shannon's model. Call it
whatever you want, but it's nothing more than a rose by another
name.

If a binary source has a history of producing the digit 1 with a
frequency of 90% the receiver forms a hypothesis about the likelihood
that the next digit will be a 1. If the next digit is a zero instead
of a one, the receiver is surprised, but not nearly as surprised
compared to a frequency history of 99.99999%.

Sorry, this last sentence does not properly parse.

In short, the receiver assumes something about the nature of the
source of the string based on past history. The assumed nature of the
source becomes the "reference" to which the receiver compares future
digits and makes predictions.

Well, you could run a receiver that way. You don't have to. There are any
number of ways you could estimate the symbol probabilities. Running average
is just one of them. You could just as easily design a receiver to assume
Uniform symbol distribution and let it remain surprised. Or you could run
statistical analysis on lots of information sources and assume that your new
source tends toward the average.

That's right. And depending upon how your receiver is set up with its
assumptions about the source, it would assume different amounts of
Shannon entropy. That's the whole point. It is all "relative" to the
receiver's assumptions about the source - i.e., about the receiver's
"reference".

If the string's history happened to be a uniform distribution, the
receiver would not be surprised by either a 1 or a 0 coming next in
the sequence. However, a receiver would be surprised, given the
assumption that the source produces a uniform distribution, by a
series of a million 1's in a row . . .

Quite correct. The receiver had estimated the symbol distribution as
Uniform, and the burst is not Uniform, which has a high surprisal value. If
you use a running average for estimating the symbol probabilities, a long
burst gets less surprising as time goes on. For many other estimating
schemes, the long burst remains surprising.

Correct. Again, it is all relative to the receiver's assumptions
about the nature of the source. The receiver's assumptions are the
"reference" by which the receiver is "surprised" - - or not.

Also, an assumption of a source that produces a uniform distribution
is an assumption of the highest possible entropy (for an alphabet with
n symbols) of the source alphabet with n symbols. In the same line, a
source that has a history of producing a non-uniform distribution
(like 90% 1s), is assumed to have less than maximum possible
entropy.

This is a bit garbled. It sounds like you might be grasping the idea, but it
is hard to tell for sure. It would be simpler to state that "an information
source with a uniform symbol distribution has maximum entropy. An
information source with non-uniform symbol distribution has less than
maximum entropy." That would be much more straightforward.

This is correct, but it is all based on the assumed nature of the
source.

The only *reference*, which is not
accepted terminology for this field,
is the set of individual symbol
properties for the information source. There
is no reference string, and no
reference computer, contrary to the
iimplications of your version.

I'm not implying any reference computer or string at all. Where did
you get that? What I'm saying is that the assumptions of the nature
of the source, such as the assumption that the source has a certain
"set of individual symbol properties", is in fact a use of a
"reference". It is relative to this reference that a particular
string is evaluated.

Oh, statements like "This means that Shannon information is more about the
type of source it will take to transmit a particular type of string rather
than the string itself," for instance.

Exactly . . . That statement clearly shows that I wasn't talking
about the string, but about the source. Where in this statement did
you get the idea that I was suggesting the string itself or any
specific computer as the "reference"? It is all about the assumed
nature of the source. What "type" of source does the receiver
assume? Is the source assumed to produce a uniform or non-uniform
distribution? That's what really matters here.

Or "So, you see, if a system must be able to operate regardless of if pi was
chosen or some other number with equal probability, it must be set up to
handle all possibilities that could be chosen, at random. Therefore, from
the perspective of a receiver who does not yet know what sequence is going
to be sent, the receiver has to be able to receive not only pi, but all
other sequences that are equally probable."

That's true, because, in order to receive pi, the receiver has to
assume that the source is able to produce a uniform distribution.

Or "That's true, but the estimate is based on the type of source needed to
produce such a string."

Yeah, like a producer of uniform or non-uniform distribution.

Or "Actually, a string with maximum Shannon information is a "random"
string. And, how does one define randomness? By comparison to a reference
computer - a UTM."

Shannon Entropy equals or is at least close to the expected Kolmogorov
Complexity (with a constant).

http://homepages.cwi.nl/~paulv/papers/info.pdf
http://www.stanford.edu/~cover/papers/transIT/0331leun.pdf

Since KC is measured relative to a specific UTM, it follows that in
some sense SE in also related to this reference UTM as well.

If you would critically review your own writing, you might see how readers
could quite easily construe that you are casually mixing theories in an
inappropriate way.

Only if one were actually looking for any possible way to misconstrue
what I'm saying.

You are all hung up on semantics here and use this to misconstrue what
I actually said into something completely different.

It would be much harder to misconstrue what you mean, if you wouldn't use
non-standard terminology and throw smoke screens up to cover gross errors.

I've been very deliberate in my use of terminology in such a way that
the meaning of the words used should be obvious to you - even if a bit
non-standard. Also, as far as I am aware, I've not thrown up any
"smoke screens" to cover up any so-called "gross errors" - most of
which have been nothing more than your own invention or the result of
your own ability to distort what is otherwise obvious.

Sean Pitman
www.DetectingDesign.com

.



Relevant Pages

  • Re: Plus Net Binary News Servers SUCK bigtime.
    ... And downloading is undoubtedly copying. ... According to the CDPA, unauthorized distribution is an offense, ... not the receiver so the receiver is not liable. ...
    (uk.telecom.broadband)
  • Re: Pioneer Anomoly
    ... >>> George, take another look at Turyshev,s formula. ... >>I am talking of the 12k/s speed at which the craft ... the reference against which the received ... receiver. ...
    (sci.astro)
  • Re: Auto selection
    ... On Mon, 27 Nov 2006 17:47:19 GMT, H. S. Lahman wrote: ... reference. ... generated and when it is consumed by the receiver in asynchronous ... Asynchronous nature of events is not an advantage. ...
    (comp.object)
  • Re: SNR Question
    ... Some hint, you might look at the signal in the same way the actual ... because the receiver can reject it. ... I usually use one signal as reference, ... possibilities for defining SNR. ...
    (comp.dsp)
  • Re: Androcles and Draper resume Einstein 1905
    ... >> one end, along with a clock, and a mirror at the other end, along with ... >> A. The rod is stationary in the frame of reference. ... >> the time of receipt at the receiver. ... > of the mirror, and x2 is the place of the receiver. ...
    (sci.physics)