Re: Complex Specified Information - Pitman Formula



On Jul 25, 7:40 pm, Seanpit <seanpitnos...@naturalselection.0catch.com> wrote:
On Jul 25, 4:22 pm, hersheyh <hershe...@xxxxxxxxx> wrote:

The Hamming Distance "number" is determined by the number of mismatched
character positions between the reference string and the test string.
It is an absolute number, not an arbitrary or subjective choice.
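
For anyone following along, the count described above can be sketched in a few lines of Python. The function name and the example strings are only illustrations, not anything taken from the thread; the sketch assumes the two strings have equal length, which is the only case where Hamming distance is defined.

```python
def hamming_distance(reference: str, test: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(reference) != len(test):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(r != t for r, t in zip(reference, test))


if __name__ == "__main__":
    # Identical strings: no mismatched positions.
    print(hamming_distance("01010101", "01010101"))
    # Positions 3 and 8 differ, so two mismatches.
    print(hamming_distance("01010101", "01110100"))
```

Once the two strings are fixed the number is mechanical. The code says nothing about how the reference string gets picked in the first place.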

And, if what I have is a 'target' sequence *or* a 'reference' sequence
but not both, how do I determine what the other sequence is? Do you
*intentionally* choose and rig the choice of 'target' or 'reference'
to get the hd you want and the CSI number you want? *That* is the
problem. I *know* what sequences actually exist or possibly could
exist. They don't come with labels of 'target' or 'reference'. How
do you determine, out of all the sequences that currently exist or
could have existed in the past, what the 'target' and 'reference'
sequences are? That is rather crucial to determining hd.

The reference sequences are determined, by you, ahead of time before
you go out to analyze any other sequences. The reference sequences
are based on non-random strings that are known to be produced by
simple algorithms - like pi or like 0101010 . . .

IOW, you would know it if the SETI signal were repeated digits of pi
in base 10, but would not be able to recognize pi in base 2 or the
other reference you give. Using *your* idea, you would declare any
other signal as "random" and unrelated to the 'reference'. Is *that*
what you claim that SETI is doing?

After you have your set of reference strings, you can compare incoming
sequences to your set of reference sequences to see if an incoming
sequence is likely to be non-random in origin.
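
A rough sketch of that comparison step, under stated assumptions: the reference set below (a repeated 01 pattern and the first 20 base-10 digits of pi) and the names REFERENCES and closest_reference are hypothetical, chosen only to mirror the examples mentioned in this thread. The sketch stops at the Hamming distance, since the step from hd to a CSI value is not spelled out in this post.

```python
def hamming_distance(reference: str, test: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(reference) != len(test):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(r != t for r, t in zip(reference, test))


# Hypothetical reference set, fixed before any test string is seen.
REFERENCES = {
    "alternating 01": "01" * 10,
    "pi digits (base 10)": "31415926535897932384",
}


def closest_reference(test: str, references=REFERENCES):
    """Return (name, hd) for the nearest same-length reference, or None."""
    candidates = [
        (hamming_distance(ref, test), name)
        for name, ref in references.items()
        if len(ref) == len(test)
    ]
    if not candidates:
        return None  # nothing in the set is even comparable
    hd, name = min(candidates)
    return name, hd


if __name__ == "__main__":
    # Pi digits with the last digit altered: one mismatch from the pi reference.
    print(closest_reference("31415926535897932385"))
```

Whatever this returns depends entirely on what was put into REFERENCES beforehand; a string that matches nothing in the set simply comes back with a large hd.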

Again, you would only be able to detect 'targets' that were near
enough to your *biased* selection of 'reference' sequences to register
as 'sufficiently close'.

Also, the selection of the reference string must be done without any
knowledge ahead of time of the test string. The choice must be
completely independent.

IOW, the *reference* string must be a *randomly* chosen sequence out
of total sequence space.

No. The reference string must be chosen based on knowledge that it is
not random - i.e., the reproducible product of a simple algorithm.

Fractals are generated by simple algorithms. So, for that matter, is
a pathway in which you have occasional random mutation and fixation of
the result as a second rarer event. But those are not
*reproducible*. Does that mean you are *specifically* ruling out
evolutionary algorithms *arbitrarily* by requiring a determinative
result rather than a probabilistic one? What would be the 'reference'
sequence for proteins? Because the same functional protein in
different organisms has different sequences, and sometimes
dramatically different sequences, you cannot claim that any particular
available sequence is a reproducibly determined product of a simple
algorithm. If your CSI calculation is going to have meaning for
evolution, you do have to tell us *which* sequence for, say, beta
globin of hemoglobin is the "reproducible result of a simple
algorithm", don't you? Is it the human gamma-G? gamma-A? Embryonic?
Adult?

< snip >

Useful choices would include strings that are
known to be the result of non-random simple algorithms, like pi or a
repeat of a simple pattern - like 01010101 . . ., etc.

But that would be an *arbitrary* choice of 'reference'. You would be
*selectively* and rather *arbitrarily* choosing some modern functional
sequence as the 'reference'.

That's right . . .

What then would be the 'target'?

There is no "target". There are only test strings that you compare to
your reference strings. If the test strings match one of your
reference strings, to a high level of CSI, the hypothesis of non-
random origin is supported.

Then the result you get is entirely dependent on which sequences you
*arbitrarily* chose as your 'reference' strings. How can you be sure
that your choice of all the 'reference' strings you *arbitrarily*
chose to look at will catch the 'intelligently designed' sequence you
'test'? And how will you, if you are too broad in your *arbitrary*
choices, prevent false positives? And if you are too narrow in your
*arbitrary* choices of 'references' aren't you going to ensure many
false negatives?
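
To put rough numbers on the false-positive side of that trade-off, here is a small sketch. It assumes things the thread does not state: that "sufficiently close" means hd at or below some cutoff k, and that the null case is a uniformly random string of length n over q symbols. The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb


def prob_within(n: int, k: int, q: int = 2) -> Fraction:
    """Exact chance that a uniformly random length-n string over q symbols
    lies within Hamming distance k of one fixed reference string."""
    hits = sum(comb(n, i) * (q - 1) ** i for i in range(k + 1))
    return Fraction(hits, q ** n)


def false_positive_bound(n: int, k: int, m: int, q: int = 2) -> Fraction:
    """Union bound on a random string landing within k of any of m references."""
    return min(Fraction(1), m * prob_within(n, k, q))


if __name__ == "__main__":
    # One reference, 10 bits, cutoff hd <= 1: (1 + 10) / 2**10.
    print(prob_within(10, 1))
    # Five references under the same cutoff: at most five times that.
    print(false_positive_bound(10, 1, 5))
```

The bound grows with the number of references m and with the cutoff k, which is the false-positive worry; shrinking either one narrows what can be detected at all, which is the false-negative worry.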

How would you choose the 'target' *after* you have "independently"
chosen the 'reference'?

You don't choose the test string. Any string could be tested by the
reference strings - any string at all.

You must *really* be brilliant if you can think of all the 'reference'
strings that not only a non-human ET might send as a signal, but also
all the protein 'reference' sequences that have *ever* existed.
Otherwise I cannot think of any way that your test would not wind up
being hit-or-miss and not much better than dumb luck.

< snip >



And isn't the *minimum* difference between a 'reference' sequence and
a 'target' sequence still going to be hd=1 no matter what you say and
no matter how large n is?

The minimum HD is actually zero - or complete identity.

I said minimum *difference*. Zero or identity is a state of 'no
difference'.

Of course - but making the point for a minimum HD difference being 1
is irrelevant in this particular discussion.

I agree. But *if* one is talking about the evolution of some new
sequence, the minimum HD difference is 1. But that one step can be
the generation of a second copy of the entire initial sequence.
Duplication and divergence (or specialization) is a common
evolutionary mechanism where the first step (duplication) is both
common and often selectively neutral.

We aren't talking about finding target sequences in this discussion,
Howard. That's a different topic altogether. Your notion that the
minimum possible HD (i.e., 1) is always the likely distance is your
stumbling block when it comes to your ability to grasp the fundamental
problem with finding unknown target sequences that exist in sequence/
structure space with different average distances between them -
average distances that are directly related to minimum structural
threshold requirements.

Again though, that is a different topic from the one being discussed
here - for the umpteenth time.

Sean Pitman
www.DetectingDesign.com

