Re: Question for Sean Pitman



On Jan 10, 5:21 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:
Seanpit <sean...@xxxxxxxxx> wrote in news:3a7c6693-aebe-4fcd-92d0-e9341c5416a9@xxxxxxxxxxxxxxxxxxxxxxxxxxxx:

On Jan 9, 11:48 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:
Seanpit <sean...@xxxxxxxxx> wrote in
news:fec336c9-0217-4f99-898a-4a9772aab...@xxxxxxxxxxxxxxxxxxxxxxxxxxxx:

On Jan 8, 11:32 pm, "R. Baldwin" <res0k...@xxxxxxxxxxxxxxxxxxxx>
wrote:

You are completely missing the point here, Sean. I offered the
hill climbing algorithm as an *example* of an algorithm that
does better than random search against a class of data. There
are other algorithms for other classes of data.

You need to present algorithms that are actually relevant to the
problem at hand.  Your hill-climbing suggestion will not work
until at least the edge of an island is found.

Macromutations that combine motifs will find proteins coded for by
combined motifs.

Macromutations, produced randomly where the combination is beyond
1000 fsaars, are very very unlikely to land on a viable much less
beneficial sequence combination - even when starting with
beneficial sequences.  That's the problem.

And you know this how?
What does 1000 have to do with it?

Because of the extremely low ratio of targets vs. non-targets at and
beyond the 1000 fsaar level.  The number of potential combinations of
viable lower-level sequences vastly outnumbers the number of viable,
much less beneficial, 1000aa sequences.

The ratio of targets to non-targets is irrelevant. The issue is whether a
macromutation that combines two motifs preserves properties that the motifs
have. If it does, then you have a high likelihood of landing on a target no
matter how rare they are.

The process is not one of searching large spaces - it is a process of
combining things from smaller spaces, and finding that the combination
happens to be in a larger space.

All combinations will be in larger spaces. The problem is that there
are vastly more ways two motifs can be combined vs. the number of viable
resulting combinations - not to mention beneficial ones. The ratio of
targets is therefore not "irrelevant". The likelihood of landing on a
target is not "high" like you claim. It is very low.

That's not true.  They aren't likely to exist where motifs can be
joined together.  

Are you serious? Protein-coding sequences are not likely to exist
where chunks of protein-coding sequences are joined? Why not?

That's the whole problem.  The vast majority of ways
that motifs could be joined together wouldn't be stable much less
beneficial.

And you know this how?

Simple math.  Take the number of possible combinations of stable
sequences and compare that number to the number of viable 1000aa
sequences.  The number of possible combinations vastly outnumbers the
number of viable, not to mention beneficial, 1000aa sequences.

Instead, let's consider getting there in a hierarchical fashion. Say I have
a set of small motifs with lengths of 10 to 20. If they start getting fused
together, I add to my set motifs with lengths of 20 to 100. Now if these
start fusing together, I add to my set motifs with lengths of 30 to 200.
Continue this process. It won't take long to reach 1000.

Each of your fusions encounters problems because there are vastly more
ways to combine smaller motifs vs. the number of viable larger
combinations.

Produce a direct quote from any one of these references where any
such macromutation has been observed to produce any qualitatively
novel beneficial system beyond the 1000 fsaar threshold level.  
There are assumptions that sequence similarities were produced by
this mechanism, but this assumption is not backed up by either
direct observation or by statistical analysis of the odds that such
a scenario is remotely likely.  It is just an assertion at this
point - bald, plain and simple.  Nothing more.  Not science in any
sense of the word.

Nandhagopal et al. "The structure and evolution of the major capsid
protein of a large, lipid-containing DNA virus" PNAS, Vol. 99 No. 23,
Nov. 2002. http://www.pnas.org/content/99/23/14758.full.pdf?ck=nck

[snip quote]

You can also google for the combination of "motif", "large protein",
and "evolution" and start reading articles for yourself.

Two problems here. The first one is that you didn't produce what I
actually asked for - an example of evolution in action.  Did you
notice the word "probable" in the above paragraph?  The authors assume
that the mechanism of RM/NS did the job based simply on sequence
similarities - not on actual demonstration of the mechanism in action
or even on any statistical analysis supporting the assumption that
this particular mechanism could actually do the job in a reasonable
amount of time.

Reasonable amount of time? What is that?

A few billion years . . . as per your own ToE.

The second problem is that capsid protein structure is largely
repetitive and does not require a significant amount of genetic real
estate to code for it.  It is a simplistic code that simply repeats a
small sequence over and over again - below the 1000 fsaar threshold.
This is unlike the code for the structure of something like a
flagellar motility system that has much greater minimum size and
specificity requirements well beyond the 1000 fsaar threshold.

A flagellar motility system is not a single sequence, though, is it?

Who is asking for a single sequence? I'm asking for a specified
system where all the parts work together at the same time in a
specific arrangement with each other.

You're simply stuck with a random search algorithm.  Bummer, I
know, but that's the mechanism you're stuck with.

No Sean, that is the mechanism your invalid model is stuck with.
Nature gets to use all the mechanisms that really do occur in
addition to point mutation.

All mutations, including indel-type mutations, are random, Bob.
Just because motifs can be put together by Nature doesn't mean
that they aren't put together randomly.  They are.

Who is Bob?

What is it? Richard Baldwin?

Yes. I go by Rich.

Fine - Rich it is.

Reality has regulatory processes, not just
fixed-length open reading frames.

What is "fixed" is the minimum size and specificity needed to
produce a given type of function to a useful level of functionality
in a given organism in a given environment.  That size and
specificity or "minimum structural threshold" is in fact "fixed".

Sean, when you include smaller sequences in the mix, and you make
sure that a good percentage of them are proteins, and you start
watching them interact, then larger proteins can be built.

Larger proteins can be built.  It is possible.  It just gets
exponentially less and less likely when you start to consider higher
and higher levels of functional complexity.

I'm glad to see you admit that larger proteins can be built. But you are
looking at it backwards, Sean. Try thinking about it this way - given that
you've started with a set of motifs, and that you can put them together in
arbitrary ways, how much can be done with them?

It isn't a matter of what can be done. It is a matter of what is
likely to be done via a random process of concatenation. Again, there
are many more possible vs. viable combinations. That's the problem.

You set up a straw man and argue while it just sits there.

My model represents reality very well. If anything, it gives all
benefits of the doubt to the opposing position.  Your own
suggestions to more closely model reality would actually favor my
position, not yours.  Try again.

It does not give all the benefits of the doubt to the opposing
position. It is a caricature of the opposing position. You don't seem
to trouble yourself to find out what the opposing position even is.

I've carefully considered all of your suggestions. None of them
remotely challenge my position - they actually support it.

If you have carefully considered it, why don't you try to elucidate what my
position is, here and now? I don't think you can do it, because I don't
think you've tried to understand it.

You think you can take short viable sequences and easily combine them
to produce larger and still larger viable sequences. While this is
possible, even via a random process, it becomes exponentially less and
less likely that any random concatenation process will actually be
successful.

[snip] http://www.dbb.su.se/@api/deki/files/67/=domtree.pdf

Again, nice story - and very common.  However, it isn't backed up by
either demonstration or relevant statistical analysis when it comes to
the mechanism of RM/NS.  It is simply assumed that given enough time
that RM/NS could do the job.  This assumption is just that - a bald
assumption.  There is no supporting science behind it.  Any real
statistical analysis will show that it is essentially impossible for
this assumption to be true beyond very very low levels of functional
complexity.  That's a fact if you actually care to do the math
yourself.

There are all kinds of supporting science behind it, with relevant statistical
analysis. You've not bothered to look.

Please provide it if you actually come across it. There is no
statistical analysis whatsoever when it comes to supporting the notion
that the mechanism of RM/NS did the job. The only statistical
analysis that I ever see in print is based on sequence comparisons and
the assumption that a certain degree of sequence similarity must have
been the result of RM/NS. That particular assumption that RM/NS must
have done the job is what is not backed up by any sort of statistical
analysis or predictive value.

Your odds analysis never addressed having populations of identical
sequences, so you did not answer the question.

Take the entire population of all the bacteria on Earth and put it on
a single location.  In a trillion years, how far into sequence space
is it likely to completely explore?  Only a Hamming distance of about
30.  If you want to get more realistic here, the actual Hamming
distance to the nearest target is likely to be far greater than 50 -
more like 350.  Putting the entire population on a single starting
point hardly solves the problem.   But, because you brought it up, I
guess I'll have to add this explanation to the calculation as
well . . .

http://www.detectingdesign.com/flagellum.html#Calculation

Your latest revision only considers having one potential target within that
space. If you packed all the 1e34 estimated targets in the area within a
distance of 50 from a centroid, as per your paragraph above the new section
on larger populations, and applied the (overly simplistic) math in the
revised section, your probability per generation would become
p = 1e29/(1e65/1e34) = 1e-2 per generation.

So you see, Sean, it does matter.
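
For anyone who wants to check the arithmetic in the estimate above, here is a minimal sketch. It only reproduces the division using the three figures quoted in the post. The variable names are illustrative, and reading 1e65 as the number of sequences in the region and 1e29 as a per-generation count is an assumption about what the figures stand for, not something the post states.

```python
# Figures taken from the post above; names are hypothetical labels.
numerator = 1e29        # the 1e29 in the quoted formula (meaning assumed)
region_size = 1e65      # the 1e65 in the quoted formula (assumed: sequences in the region)
targets = 1e34          # estimated targets packed into that region

# Average number of sequences per target inside the region.
sequences_per_target = region_size / targets

# Probability per generation as written: p = 1e29 / (1e65 / 1e34).
p_per_generation = numerator / sequences_per_target

print(f"sequences per target: {sequences_per_target:.0e}")
print(f"p per generation:     {p_per_generation:.0e}")
```

The result depends entirely on the assumed number of targets in the region, which is the point under dispute in the replies that follow.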

1e34 isn't the number of targets in 1000 fsaar sequence space.
Rather, it is the number of starting point sequences within the
population of all the bacteria on Earth. The total number of targets
within that space is around 1e652 and the ratio of beneficial vs. non-
beneficial is 1e-649.
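
The two figures above can be checked against each other with a short calculation. This is a sketch that assumes a sequence space of 1000 positions over a 20-letter amino acid alphabet. That size is an assumption made here to see whether 1e652 targets and a ratio of 1e-649 are mutually consistent. It says nothing about whether the 1e652 estimate itself is right.

```python
import math

# Assumed parameters (not stated in this form in the post).
ALPHABET_SIZE = 20      # standard amino acids
SEQUENCE_LENGTH = 1000  # positions in the sequence

# Work in log10 because 20**1000 overflows a float.
log10_space = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)

# Target count quoted in the post: 1e652.
log10_targets = 652

# log10 of targets divided by total sequences.
log10_ratio = log10_targets - log10_space

print(f"log10(space size) = {log10_space:.2f}")
print(f"log10(ratio)      = {log10_ratio:.2f}")
```

Under these assumptions the space holds roughly 1e1301 sequences, so the quoted ratio follows directly from the quoted target count.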

Given such a low ratio, how on Earth do you propose to get such an
increase in target density around your starting point island when the
overall density in sequence space suggests that such a high local
density is extremely unlikely? - to the point of essential
impossibility. Upon what basis do you propose such an extreme
stacking of the deck?

You see, the only way you can get your theory to work is if you assume
that the available targets in sequence space are extremely clustered
in one tiny corner of it. There simply is no basis for this
assumption besides wishful thinking as far as I can tell. That's not
science. Sorry.

I haven't got any particular notion about how tightly clustered the
protein-coding sequences are, so naturally, I don't have evidence.
Since evolution is not supposed to find protein-coding sequences by
random walk, it doesn't matter.

Evolution is supposed to find protein-coding sequences by either
random walks of small or large steps or large single leaps into
sequence space that are also taken randomly.  That is why the
mechanism is said to be based on random mutations.  They are all
random.

The before and after conditions for sequences taking large leaps are
different than the before and after conditions for sequences undergoing
random walk by point mutation.

Not statistically - there really is no significant difference relative
to the problem at hand. The odds of both types of mutations hitting
upon a rare target still decrease, exponentially, with increasing
functional complexities under consideration.

Granted, there is a bit of clustering going on in sequence space,
but this clustering gets less and less pronounced at higher and
higher levels of functional complexity.  Known higher-level systems
that have
qualitatively unique functions are *all* very far apart in sequence
space. They simply are not significantly clustered like they would
have to be for your argument to be remotely tenable.

You don't appear to understand my argument. IT DOESN'T MATTER HOW FAR
APART THEY ARE!

Yes, it does.  If you actually would spend at least some time with the
math involved, you'd know this.

Sean, if you arrive at a point in N-space from a combination of i-space and
(N-i)-space, there IS no Hamming distance. It isn't defined unless
sequences have the same length.
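
The point about Hamming distance requiring equal lengths can be made concrete with a small sketch. This uses the standard definition (number of positions at which two equal-length strings differ). The function name and the example sequences are hypothetical, chosen only for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        # The metric is not defined when lengths differ.
        raise ValueError(
            f"Hamming distance undefined for lengths {len(a)} and {len(b)}"
        )
    return sum(x != y for x, y in zip(a, b))


# Hypothetical sequences, for illustration only.
print(hamming_distance("ACDEFG", "ACDQFG"))  # same length, one substitution

try:
    hamming_distance("ACD", "ACDEFG")        # different lengths
except ValueError as err:
    print(err)
```

Comparing sequences of different lengths needs a different measure, such as an edit distance that counts insertions and deletions. Which measure suits the argument is a separate question from the definition shown here.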

The combination that produces N-space does have a Hamming distance
relative to that space since they are both the same size. The odds
that an equivalent combination of smaller sequences, assembled
randomly, will be within a Hamming distance of 50 are extremely small
given combo requirements beyond 1000 fsaars.

And, as I've explained to you before, the odds of combining
smaller sequences to form larger ones do not overcome the
problem to any significant degree.

http://www.detectingdesign.com/flagellum.html#starting

You are not convincing.

That's not a helpful comment.  You need to produce at least a
few detailed counters here that actually hold up - unlike your
original "challenges" that you used to start off this thread.

The reason it is not convincing is that you make an implicit
assumption, without presenting any evidence, that combining
smaller sequences starts a new random walk. You only devote about
two sentences to this entire subject. Meanwhile, I can easily find
lots of articles indicating that combining smaller sequences is
fundamental to protein evolution, and that genetic engineering
makes use of this fact.

Hmhmmm . . .   find just one paper that actually presents something
besides the simple assertion that RM/NS is the mechanism that did
the job.  

It ISN'T the sole mechanism!

Oh really?  What other mechanism do you know of to find novel targets
in sequence space besides RM/NS?  This is a new argument for
sure!  ; )

gene flow and genetic drift.

Genetic drift and gene flow? Tell me, how does genetic drift produce
novel sequences without the use of either random mutations or natural
selection?

Find one that actually applies some statistical analysis to
support this assumption.  Or, find one that shows this happening in
observable time.  It just doesn't happen.  It has never been observed
beyond the 1000 fsaar threshold.  And, there is no statistical
backing
behind the assertions that RM/NS did the job.  You'll find none of
this in literature.

Practically all of the articles I've recently read about protein
evolution are based on statistical analyses of protein databanks.

The "statistics" only concern themselves with degrees of sequence
similarities - not the odds of the mechanism of RM/NS actually being
successful in a given span of time.  It is simply assumed that this
mechanism did the job without any statistical basis for this
assumption.

Nobody but you appears interested in that particular statistical analysis,
so I wouldn't really expect to find scientific articles about it.

So, you admit it. The mechanism is simply assumed in literature
because no one bothers to actually consider the odds of it actually
doing what everyone assumes it has to be able to do?! LOL - too
good! ; )

Lots
of these articles are about finding the evolutionary history of
various proteins based on their motifs. This includes very large
proteins. So yes, it is in the literature.

Not true.  The evolutionary history is simply assumed based on the
notion that sequence similarities indicate a common ancestry that was
produced by the mindless mechanism of RM/NS.  While we all agree that
there was a common origin, the mechanism is what is in question here.
What mechanism was able to do the job?  This mechanism is what needs
to have statistical support.  That support is completely missing.  All
you will find in literature is a bald assertion that given a few
million or billion years RM/NS must certainly have been able to do the
job.  This assertion is made without any actual statistical support or
mathematical analysis of any kind.

Sean, I think you would agree that random mutation and natural selection
could find a sequence across a Hamming distance of 2, right? This
would be statistically likely?

Sure - even for a fairly small population.

So, consider two motifs, lengths 23 and 42. The combination of the motifs
has length 65. How many mutations away is the long motif from either short
one? One. Do you find a mutation of one step to be so unimaginably
improbable?

The answer to this question is actually a Hamming distance of zero.
The odds that one or the other or both motifs will still be functional
in their lower-level capacity even after combination are pretty good.
The real question here is if the combination is stable enough to be
maintained as functionally beneficial for a system of function that
requires at least 65 fsaars. The odds of that happening are
exponentially less likely than the odds of finding a lower-level
system that has a smaller minimum size and/or specificity
requirement.

Did you actually read the argument linked above?  It directly
answers your question.   Any real counters to the explanation
offered?  Seems
to me that you're not remotely considering the problem with any
seriousness at all.

Yes, I did read the link. No, it does not directly answer my
question. WHY does your model treat macromutations no differently
than any other mutation, but as just another permutation? What
evidence do you have that insertions, deletions, end to end
combos, and different length combos, do NOT end in an area on or
near a viable protein sequence?

Because, there are vastly more ways that stable sequences can be
concatenations vs. the total number of viable sequences in sequence
space.  

That does not parse.

Why not?

I couldn't at first tell whether "vs." applied to "concatenations" or
"ways." I suppose it applies to "ways"?

Yes - ways.

By definition then most concatenations will not be stable.

You can't prevent Nature from doing what it does through a
definition.

If the number of different combinations vastly outnumbers the number
of viable outcomes, explain to me how the odds of success are "likely"
to be good?

Because concatenating two sequences having stability and kinetic
accessibility is far more likely to result in a sequence having stability
and kinetic stability, as opposed to what you describe - a completely
random sequence undergoing point mutation.

You didn't answer the question. If the number of different possible
concatenations vastly outnumbers the number of viable outcomes, how
are the odds of success "likely" to be good? Are you seriously
trying to argue that all or nearly all combinations will be viable?
Based on what? This notion is demonstrably mistaken. It is an
impossible argument you're trying to make.

How much more evidence do you need?  Your position is untenable -
by definition.  If you think otherwise, produce some actual
evidence to this effect.  Have any actual data to support this
inane notion of yours?  Simply gluing together existing beneficial
sequences at random
is very unlikely to produce a stable much less beneficial sequence.

Except that I keep reading articles showing that it does exactly
that.

None of your articles contain any statistical analysis when it comes
to explaining the odds of success vs. failure of a random process
doing the job.  Your articles simply assert it without any
mathematical support.

Why would you expect scientists to even bother with that, Sean?

Because without this statistical support, their assertions are just
that - bald assertions. That's not science. That's wishful
thinking.

Sean Pitman
www.DetectingDesign.com
