Re: Experimental basis for the Non-Beneficial Gap Problem



On Jul 12, 7:30 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:

This is what I originally wrote in a discussion with Harshman:

     "What I'm saying is that if you compare a collection of larger
proteins the *absolute number* of sequence differences will be
greater, on average, compared to a collection of smaller proteins."

John Harshman responded with:

     "Which makes perfect sense. If two proteins are 5% different,
they will have 5 differences if they're 100 residues long, and 50 if
they're 1000 residues long. Ten times as many!"
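
The arithmetic in that reply can be checked with a minimal Python sketch. The function name is made up for illustration, and the 5% figure is simply the one quoted above:

```python
def absolute_differences(length, percent_different):
    """Number of differing residues at a given percent divergence."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return length * percent_different // 100

short_protein = absolute_differences(100, 5)
long_protein = absolute_differences(1000, 5)

print(short_protein, long_protein, long_protein // short_protein)
```

Same fractional divergence, ten times the length, ten times the absolute number of differences.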

You also agreed with this statement.  Now you are trying to go back on
it?  How is that?

I, for one, apparently misunderstood the vague term "collection" here. I
was talking about homologous proteins in pairs of taxa. In retrospect
it's not clear what you were talking about. What were you talking about?
What pairs of proteins were you comparing?

Protein pairs that have uniquely different types of functions.

The "maximum gap size" is what you sometimes call "average gap size".

The maximum gap size for a 100aa system is 100aa differences.  That's
the maximum - obviously.  The average distance is always smaller than
this maximum when it comes to living things - always.  And, the
minimum likely distance is smaller still - always.

How many times do I have to explain it to you before you will give up
on this constant strawman mischaracterization of yours? - this
deliberate lie?

Now that's silly. If an amino acid is entirely free to vary, it doesn't
contribute to any gap size. So if you have a 100aa system in which 30
residues are constrained and 70 are free to vary, the maximum gap size
is 30. Since you always talk in terms of "fairly specified residues",
this must be what you meant. If you're allowing the number of
constrained residues to vary within your "fairly specified" count, then
nobody has any idea any more what you're trying to say.
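
The two readings of "maximum gap size" in dispute here can be put side by side in a minimal sketch. Everything in it is hypothetical: the toy sequences, the choice of which 30 positions are constrained, and the function names. It only shows that counting all sites gives 100 while counting constrained sites alone gives 30:

```python
def full_gap(seq, target):
    """Differences counted over every position (plain Hamming distance)."""
    return sum(1 for a, b in zip(seq, target) if a != b)

def constrained_gap(seq, target, constrained_sites):
    """Differences counted only at positions that are constrained."""
    return sum(1 for i in constrained_sites if seq[i] != target[i])

# Toy 100aa system: the first 30 positions are (arbitrarily) treated as
# constrained, the remaining 70 as free to vary.
target = "C" * 100
start = "A" * 100
constrained_sites = range(30)

print(full_gap(start, target))                            # all sites
print(constrained_gap(start, target, constrained_sites))  # constrained only
```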

No. The maximum gap size is defined by the minimum size of sequence
space needed to contain the protein in question. If the minimum size
of a protein is 100aa, the minimum size of sequence space is 20^100
sequences. That means that the maximum gap size in that sequence
space is 100aa differences.
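
For scale, here is a minimal sketch of the numbers in this paragraph. It assumes a 20-letter amino-acid alphabet and takes "aa differences" to mean Hamming distance; that reading, and the names below, are assumptions made for illustration:

```python
ALPHABET_SIZE = 20  # the 20 standard amino acids

def sequence_space_size(length):
    """Number of distinct sequences of the given length."""
    return ALPHABET_SIZE ** length

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)

size = sequence_space_size(100)
print(f"20^100 is a {len(str(size))}-digit number")

# Two 100aa sequences that differ at every position are 100 apart,
# and no pair of 100aa sequences can be further apart than that.
print(hamming("A" * 100, "C" * 100))
```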

The average gap size is based on the ratio of sequences that would be
able to produce the function in question - or, more relevantly, the
total number of all potentially beneficial sequences vs.
non-beneficial sequences.

The only protein for which you actually calculate any kind of "gap
size" from actual data is cytochrome c.  All other proteins are merely
cytochrome c writ larger, in your bizarre world.  And that calculated
number amounts to nothing but, if you simplify your model protein to
containing only completely invariant and completely free to vary
sites, the effective number of completely invariant sites.  The same
degree of invariance would hold for model proteins that had fewer
absolutely invariant sites and more partially variable sites.  That
is, you assume that the "gap size" is the distance between a protein
that has some non-functional aa at every possible invariant site
(completely variable sites, of course, don't matter).  That number
(about 30 for cytochrome c) is the *maximal gap size*, not any kind of
"average gap size".

You don't understand.  The average gap size is a function of the ratio
of potentially beneficial vs. non-beneficial.  This ratio is
calculated by the same means used by Yockey and supported by other
more direct experiments like those done by Sauer, Olsen, and the
others listed.  This ratio is not, let me repeat NOT, a measure of the
maximum gap size.  It is, obviously, a measure of the average gap size
for the function in question in sequence space.

No, it's the maximum effective gap size, because it counts constrained
sites and ignores unconstrained sites. Unconstrained sites shouldn't
count, since it doesn't matter what's in them.

It does matter what's in them because of the overall minimum size
requirement.

An *actual* "average gap size" has never been actually calculated by
you for any protein.  Not even cytochrome c.  Never.  Not once.  

You don't understand statistics then.  The average gap size is a
function of the ratio of potentially beneficial vs. non-beneficial -
which has indeed been estimated for specific functions like CytoC as
well as several other types of unique protein-based functions by
direct experimentation.

All
you have done is pull a number out of yer arse.  "Minimum likely gap
sizes", likewise, has never been calculated for any protein.  Not even
cytochrome c.  It is simply pulled out of yer arse as well, typically
after waving the term Poisson ratio as if you actually knew that there
was a Poisson distribution from actual data.

The minimum likely distance between target sequences where the ratio
of targets to non-targets is known, but their specific location in
sequence space is unknown, falls along a Poisson distribution.  There
is no Poisson ratio.  The ratio calculated by those like Yockey is
what is used to calculate the Poisson distribution to estimate the
likelihood of a minimum gap distance.

This doesn't seem to have anything to do with the actual numbers you
keep quoting. Why is 100 the maximum gap size of a 100-site sequence?

Because of the minimum size requirement for the function in question.

If you're talking about Poisson-distributed changes, there should be no
actual maximum, since a Poisson process could continue infinitely long
without attaining a perfect sequence match. You really aren't making
sense here.

The Poisson distribution can calculate the odds of one of a limited
set of objects being in a particular location given the overall ratio
and distribution of objects in a limited space.

For example, say you have a circle with a 100m radius. Say you have
10 quarters in that circle, but you don't know the location of the
quarters. For all you know they could be randomly distributed. You
are standing in the middle of the circle. What are the odds that you
will also be standing on one of the quarters?

You see, the odds that the minimum distance between you and one of the
quarters is zero is calculated using a Poisson distribution.
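
One way to work the quarters example through is the minimal sketch below. It rests on assumptions not stated in the post: each quarter is a disc of radius 12.13 mm (a US quarter is 24.26 mm across), "you" are a single point at the center, and the quarters' centers are placed uniformly and independently in the circle. Under those assumptions the number of quarters covering your point is approximately Poisson, and the odds of standing on one are 1 minus the probability of a zero count:

```python
import math

CIRCLE_RADIUS_M = 100.0     # from the example above
NUM_QUARTERS = 10           # from the example above
QUARTER_RADIUS_M = 0.01213  # assumption: US quarter, 24.26 mm diameter

def p_on_quarter_poisson(n, coin_r, circle_r):
    """Poisson approximation to P(at least one coin covers the center)."""
    # A coin covers the point if its center lies within coin_r of it,
    # so each coin does so with probability (coin area) / (circle area).
    lam = n * (coin_r / circle_r) ** 2  # expected number of covering coins
    return -math.expm1(-lam)            # 1 - exp(-lam), computed stably

def p_on_quarter_exact(n, coin_r, circle_r):
    """Exact value for n independent, uniformly placed coins."""
    p_one = (coin_r / circle_r) ** 2
    return -math.expm1(n * math.log1p(-p_one))  # 1 - (1 - p_one)**n

p_poisson = p_on_quarter_poisson(NUM_QUARTERS, QUARTER_RADIUS_M, CIRCLE_RADIUS_M)
p_exact = p_on_quarter_exact(NUM_QUARTERS, QUARTER_RADIUS_M, CIRCLE_RADIUS_M)

print(f"Poisson approximation: {p_poisson:.3e}")
print(f"Exact (binomial):      {p_exact:.3e}")
```

With these inputs both figures come out at roughly 1.5 in ten million. The Poisson form is used because it depends only on the overall density of targets, not on where any particular one sits.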

Sean Pitman
www.DetectingDesign.com
