Re: Evolution increases the computational ability of organisms.



Tim Tyler wrote:

John Harshman wrote:


I have said what I mean quite a few times now. I think evolution
is progressive in more than the Gouldian random walk way. [...]
Dawkins I attribute this to two main factors.

One is "technological" progress: the cumulative "inventions" by
organisms of survival tech: photosynthesis, DNA, etc, favoured by
natural selection.

The other is sexual selection, which tends to drive lineages in
arbitraryish directions, so many of their traits do not diffuse,
but rather are driven about.

But they are driven in arbitrary directions. Again, nobody is saying
that the mechanism of all evolution is drift, just that environments
vary in unpredictable ways so that a course driven by selection alone
can act quite like a random walk.

Well, I won't argue with that - except to say that it was not
Gould's point. Gould argued for neutrality in a *particular*
case.

Were we talking about Gould?

He does seem to get a mention at the top.


What particular case are you talking about here?

The one in Life's Grandeur/FH - the idea that complexity
is a neutral trait overall.

I don't have the book in front of me. Is he really arguing that it's
neutral, or that different degrees of it are advantageous in different
environments? "Neutral" is a highly specific term in biology.


Fluctuating selection would satisfy Gould equally well.

"The modal bacter: Why progress does not characterize the history of
life"... is his chapter title and theme.

Probably the major theme is that complexity follows a random walk
bounded at the low end by a wall - and that this mechanism explains the
origin of complex organisms - without invoking any kind of selective
mechanism favouring progress.
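A minimal sketch of the bounded random walk described above, under stated assumptions: "complexity" is a single integer, every step is an unbiased +1 or -1, and a lineage simply cannot fall below the wall. The function name and all parameters are hypothetical, chosen only for illustration.

```python
import random
import statistics

def bounded_walks(n_lineages=2000, n_steps=500, wall=1, start=1, seed=0):
    """Unbiased +/-1 random walks that are never allowed below `wall`."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_lineages):
        x = start
        for _ in range(n_steps):
            x += rng.choice((-1, 1))  # no built-in preference for increase
            if x < wall:
                x = wall              # a step below the wall is cancelled
        finals.append(x)
    return finals

finals = bounded_walks()
print("minimum:", min(finals))
print("median: ", statistics.median(finals))
print("mean:   ", statistics.mean(finals))
print("maximum:", max(finals))
```

Under these assumptions the mean and the maximum move away from the wall although no step favours increase, which is the pattern the wall argument relies on. It says nothing about whether real lineages behave this way.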

So...I'm vindicated here?

Most "technological progress" is a local adaptation to some environment.
If the environment changes, technological progress can involve reversal
of the previous progress, and that doesn't sound like progress to me.
Some adaptations are useful in a wide variety of environments, and so
are unlikely to be lost in most events. These are quite rare [...]

Pah!

DNA? RNA? Enzymes? Proteins? Sex? Lipid cell walls? Cellulose?

I don't think they justify a claim that evolution is in general
progressive.

Your view that the accumulation of technology like DNA, RNA, enzymes,
proteins, sex, photosynthesis etc does not make evolution progressive
is duly noted.

What exactly does "progressive" mean here, and why should 6 events in 4
billion years be considered the norm?

Roughly, progress is intended as a measure of evolutionary competence;
of the ability to extract resources from an environment before
competitors do - and so on. It is related to fitness.

It is not just represented by 6 events - those were examples.

How many events would you suppose?

Billions? I really don't know in much detail. Bacteria have
made an *awful* lot of useful biochemical inventions, and humans
have made one or two of their own discoveries by now.

Please leave human technology out of this. Now it looks as if nearly all
major bacterial inventions were finished a billion years ago or more,
since the trees of the major metabolic guilds go that deep. And there
are no more than a couple of dozen of those, however you slice it. I
don't know how you're going to come up with billions, and I don't know
how you will get a rate of progress out of it unless that rate is
slowing down.

It happens because ploidy-increasing events followed by divergence
which is then reinforced by selection are relatively common among
some lineages - while selection in the opposite direction is
obviously rather weak.

This is confused. Are you saying that gene number generally increases in
evolution?
I see no sign of that.

?!? Have you looked?

Yes, to the extent possible. What sort of evidence would you like to
present for your claim?

Ploidy-increasing events are rather common, as can be seen by
all the high-ploidy organisms out there.

Not sure what you mean. Most organisms are either haploid or diploid.
Polyploids are transient, because homologous chromosomes differentiate.

Huh? What is Paleopolyploidy, then?

I have no idea. Wait here. Ah, wikipedia to the rescue: "Paleopolyploidy
refers to ancient genome duplications which occurred at least several
million years ago (mya). The genome doubling event could either be an
autopolyploidy or an allopolyploidy. Due to functional redundancy, genes
are rapidly silenced and/or lost from the duplicated genomes. Most
paleopolyploids, through evolutionary time, have lost their "polyploid"
status through a process called diploidization, and are referred as
"diploids" nowadays (eg. baker's yeast, Arabidopsis and perhaps humans)."

Note that this says exactly what I did.

I think my most sensible response at this point is: it
doesn't make much difference if the organism is classified
as polyploid or not - the point is that its genome has
been duplicated, and is larger than it once was.

True, for a while. Note that genes are rapidly lost from the duplicated
genomes, until we're back where we started.

And we know selection
against junk DNA is remarkably weak - because related organisms with
lots of it do not appear to suffer in competition with those with
little - and because there is so much of it in so many non-bacteria.

I would say that's evidence that it's either weak, or fluctuating, or
both. But the fact that genome size fluctuates so much doesn't suggest a
consistent trend toward increase.

The bit of genomes which is junk is rather likely to fluctuate.

Here we come again to the ambiguity you refuse to resolve, whether we're
talking about number of genes, size of the functional genome, or size of
the genome in general.

Again? Refuse? You specified the topic here in the section quoted
above. You said "genome size". I haven't attempted to switch the
context since then.

Yes you have, right here. You try to dismiss fluctuating genome size by
saying that's just the junk that fluctuates. This would imply that the
size of the non-junk portion of the genome is what you are really
concerned with. Or did you misspeak?

All you're doing is stating a mechanism that increases number and
denying that any mechanism decreases number.

No: I'm suggesting the increase mechanism is powerful,
while the decrease mechanism is weak. This is true in
plants, and false in bacteria - but the plants are enough
to boost up the averages, so there is a net increase overall.

Ah, so this supposed progressive mechanism is limited to plants.

No. It is /commonest/ in plants, though.

So, if you were right, wouldn't plants consistently have larger genomes
than other groups?

Yes - if that were the main mechanism responsible for size increases.

So we have come down to the notion that your proposed mechanism is at
most a side issue.

My using the term "polyploidy" selected too narrow a target.

There are many ways of duplicating genes and then the copies
diverging and adopting independent functional roles that do
not qualify as polyploidy.

Another potential target for selection for large genome
size is mobile genetic elements. These typically benefit
from being as numerous as possible, and the bigger genomes
are, the more copies they can contain.

Here I think there's a confusion of level and direction of causation.
Mobile elements don't benefit from being numerous. It's just that those
that replicate fastest increase in frequency in the genome. But this is
not selection on the genome, it's selection (of a sort) on the mobile
elements themselves. This may be opposed to, in concert with, or
irrelevant to selection on the genome. Also, you should say that the
more copies a genome contains, the bigger it is. There is no such thing
as genome capacity.

I seem to recall previously defining genome complexity for you:

The Kolmogorov complexity of the information in the genome - considered
in base 4. Would you like me to specify the associated language?

Ah, so this is the minimum length of an algorithm that would reproduce
the sequence of a genome, yes?

Yes.

But wouldn't a completely random genome then be the most complex?

Yes.


In which case, you are claiming that the randomness of genomes is
increasing over time?

More their length.

If so, shouldn't your measure be genome length, not complexity? The
simplest way to increase complexity is to increase randomness, not length.
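A rough illustration of the exchange above, assuming that compressed size under zlib is an acceptable stand-in for Kolmogorov complexity (it is only a crude upper bound) and that each base is stored as one ASCII character. The sequences are synthetic, not real genomes.

```python
import random
import zlib

def compressed_size(seq: str) -> int:
    """Bytes after zlib compression; a crude proxy for Kolmogorov complexity."""
    return len(zlib.compress(seq.encode("ascii"), 9))

rng = random.Random(1)
n = 10_000
random_seq = "".join(rng.choice("ACGT") for _ in range(n))  # no structure
repeat_seq = "ACGT" * (n // 4)        # same length, fully regular
short_random = random_seq[: n // 2]   # half the length, still random

for name, seq in [("random", random_seq),
                  ("repeat", repeat_seq),
                  ("short random", short_random)]:
    print(f"{name:>12}: length {len(seq):6d}, compressed {compressed_size(seq):5d} bytes")
```

On this proxy a regular sequence scores far below a random one of the same length, and a shorter random sequence scores above a longer regular one, so length and randomness both feed into the measure.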

Now of course complexity would increase over time if length
increased over time, but only if the increases in length were
random additions of sequence.

No, the sequences just have to have a somewhat unpredictable
element. Mutations and redundancy in the genetic code make
this likely to be true after a while.

Surely you're not supposing that the genetic code has much influence on
genome randomness, given that such a small percentage of the typical
genome is coding sequence. You need to think this through a bit more.

Tetraploidization would in fact be only a minimal (a few bytes) increase in complexity,
the addition only of "do this twice" to the previous algorithm. So it's
puzzling that you would give it such importance.

Consider the effect of subsequent mutations.

One major effect would be to reduce genome size again. Now it might be
that point mutations would increase complexity faster than deletions
would reduce it, but I don't see how you can be confident in your assertion.
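The duplication point and the mutation point can both be sketched with the same compression proxy. Assumptions: zlib size stands in for Kolmogorov complexity, the "genome" is 8000 random bases so that the whole copy fits inside zlib's 32 KB window, and the 10% point-mutation rate is arbitrary. Deletions are not modelled.

```python
import random
import zlib

def compressed_size(seq: str) -> int:
    """Bytes after zlib compression; a crude proxy for Kolmogorov complexity."""
    return len(zlib.compress(seq.encode("ascii"), 9))

def point_mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Replace each base, with probability `rate`, by a different base."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(2)
genome = "".join(rng.choice("ACGT") for _ in range(8000))
doubled = genome + genome                           # fresh whole-genome duplication
diverged = genome + point_mutate(genome, 0.10, rng)  # second copy has since mutated

print("single  :", compressed_size(genome))
print("doubled :", compressed_size(doubled))
print("diverged:", compressed_size(diverged))
```

In this toy setting the fresh duplicate adds little to the estimate while the diverged copy adds a good deal more. Whether point mutations outpace deletions in real genomes is exactly the open question raised above, and the sketch does not address it.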

This also has nothing to do with function of the genome, which seems odd.

Kolmogorov complexity is the most standard metric I can think of -
though it is hard to compute if you use a proper programming
language, rather than, say, a compressor.

Getting into what part of the genome is functional and what
isn't soon turns into a rat's nest of ambiguity. I like
metrics that are a function of the genome only. It keeps
everything neat and digital.

But is it meaningful? Your proposed measure doesn't seem to have
anything to do with biology.

It's not so simple as all that. It's not clear just what controls genome
size. What we can see from the tree is that genome size has both
increased and decreased during evolution, and by amounts that can't
generally be explained by polyploidy.

Right. I accept your correction that other types of insertion
are more important.

...and deletion?

Deletions happen. I am not sure what you are asking here.

Just checking. Would you agree that deletions too are more important
than polyploidization?

Genomes of advanced organisms tend to swell up. That's part
of why there is so much junk DNA about.

But not for the reason you claim. The balance between insertion and
deletion mutations is not clear so far. We can't really rule out size
increases being due to mutation pressure. I don't think you know why
genomes are the sizes they are, which I will note can be radically
different between closely related species.

Beyond what is functionally necessary, chance is a substantial part of
it. Genome size is subject to changes via mutation - and the
excess junk is nearly neutral - and so it is free to grow and shrink
randomly.

*If* we were only talking about junk DNA, Gould's random walk
might not be such a bad model - though there are still those pesky
ploidy changes: where the drunkard suddenly mysteriously teleports to
twice the distance from the wall, rather than wandering back and forth
as drunkards are renowned for doing.

No, it's merely a step that's a bit larger than some. That need not
affect the directionality or lack thereof, and there really doesn't seem
to be any.



...and there are those cases where the sheer quantity of DNA
is actually selected, e.g. because it influences cell size.

Maybe. However, if that's true, then the optimum cell size, and thus
genome size, can fluctuate over time, and we're back to the random walk.

.../assuming/ that optimum cell size fluctuates randomly, that is.

Do you see a reason why it wouldn't? It would be like the size of
anything else. There is no universally optimum size.

Say Cavalier-Smith is right on:

Nuclear volume control by nucleoskeletal DNA, selection for cell volume
and cell growth rate, and the solution of the DNA C-value paradox
http://jcs.biologists.org/cgi/content/abstract/34/1/247

...and genome sizes influence growth rates, and K-selected and
r-selected organisms make use of this by having large or small
genomes respectively.

This wouldn't lead to a trend favoring large size - but it
would make a mess of the hypothesis that the distribution
of genome sizes is predictable from a random walk bounded
by a wall. There would be many abnormally small genomes,
from the r-selected organisms and many abnormally large
genomes, from the K-selected ones. You would have to
invoke selection to explain the results.

I don't see this. In the real biota we see no separate clump of
"k-selected" vs. "r-selected" species, but a continuum along which
species are arranged between extremes. Presumably cell/genome size
distributions caused by k vs. r selection would follow the k/r
distribution too and would be scattered across the continuum. And as
environments changed over time, the degree of k and r selection in a
lineage would be expected to change too.

Now we can in fact investigate this hypothesis using phylogenetic trees.
I don't think we have a count of
gene numbers that's good enough, but genome size may be an acceptable
proxy since your claimed mechanism would result not only in more genes
but in bigger genomes. I haven't done any formal analysis, but based on
what I know of genome sizes there is no apparent increase over time.

Well, we *know* that exists.

Do we? I don't. Can you present some evidence for your assertion?

Trivially: genomes started small. Now they are large. There /must/
have been a net increase in size over time. As I said, "the only
controversy is whether the drift mechanism explains it".

Sure, they started small. Most of them are still small. We have agreed
that most prokaryotes' low tolerance for junk keeps them fairly small,
and possibly there being a single origin of replication would be
important there. So forget prokaryotes. Eukaryotes have bigger genomes.
But have eukaryote genomes gotten bigger over time, on average?

One of the problems I see with this is that the ancestor was a single
cell. If it happened to be a particularly large-genome cell of its
type, that would seriously mess up any claim for progress from that
point. There have probably been efforts to reconstruct the genome of
this cell from its relatives - and I am not familiar with their results -
but this is one example of why I try to stick to discussing ecosystems:

Whoa, now you have moved from the Kolmogorov complexity of a genome to
the complexity of an ecosystem. Can't you keep your goalpost still for a
while?

I don't want to be forbidden from discussing other things.

If someone asks me "What is genome complexity?", as happened earlier on,
I should be able to answer.

Be aware that it makes you hard both to follow and to respond to.

But if in fact there is a drive to increase genome size (a goalpost move
from Kolmogorov complexity too, by the way), then it doesn't matter what
the size of the ancestral genome is. Size would only increase from
whatever starting point there was.

Say there is some optimal genome size for that type of
creature, though? If the ancestral cell is below this,
there will be a progressive increase as time passes.
If the ancestral cell is above it there will be a
decline. The size of the ancestral cell might really
affect the result of the question of whether there
is progress or not - so sampling errors in its selection
are a real concern.

Not at all. You were supposedly talking about the ancestor of all
eukaryotes, so "that type of creature" is not really a meaningful term.
If there is an optimum size for any particular situation, and situations
fluctuate, there's your random walk again. If instead, as you have
proposed, there is a general pressure for increase, it doesn't matter
where we start. If there is in fact a universal limit, then it depends
on how close to that limit we started and how fast increase happens. But
again the huge differences in genome size among species suggest that
limit is either not particularly universal or extremely huge. At any
rate, it seems to have no obvious influence.

If you ask: "has a lineage progressed since a particular ancestral
cell", the answer may depend a lot on what the properties of that
particular ancestral cell happened to be - and as one individual, its
measurements may be subject to considerable sampling error effects.

Perhaps. And that's why it's a good reason to look at the whole tree and
ask whether whatever property you think is increasing really is
increasing on most branches of the tree.

The problem I see with that is that inventions can be made - and
complexity can increase, by other mechanisms - in particular
by an increase in the number of branches.

This is the advantage of reductionism: we can check one proposed
mechanism at a time. Now if you want to modify your claims to be that
genome size is not increasing but numbers of species are, we can discuss
that too. But it will be a different discussion.

If at any given time it's not especially more likely
to go up than to go down, regardless of where we
started from, your theory is in trouble.

Not if the number of branches is increasing. That
might still represent progressive development.

An odd definition of progress, but never mind. Since you are defining
complexity roughly as genome size * number of genomes, sure, either
factor could make the product increase. But we can consider them
separately, as indeed seems wise. So if we've disposed of genome size
increase, we can consider increase in species numbers. In fact we
already have, in other posts. I don't see evidence for that either.

Genome size seems pretty randomly distributed over the tree of eukaryotes.
Would you consider that at all suggestive?

I would consider it to be totally inaccurate:

Taxon                 #Records  No. of species (%)  Genome size range (pg)  Mean genome size (pg)
Vertebrates
Jawless fishes        26        17 (16)             1.3–4.6                 2.3
Cartilaginous fishes  183       130 (13)            2.5–17.1                5.7
Lungfishes            14        4 (66)              50–133                  90.4
Chondrostean fishes   38        22 (42)             1.2–7.3                 3.5
Teleost fishes        1761      1354 (5)            0.4–4.4                 1.2
Amphibians            870       463 (9)             0.95–120.1              16.7
Reptiles              406       309 (4)             1.1–5.4                 2.3
Birds                 274       205 (2)             1.0–2.2                 1.5
Mammals               600       432 (9)             1.7–8.4                 3.5

"Eukaryotic genome size databases"
http://nar.oxfordjournals.org/cgi/screenpdf/35/suppl_1/D332.pdf

There is massive variation of genome sizes between different types of
animal. The distribution is not remotely random.

There is much more variation within groups than between them, except for
the lungfishes, and there we're talking about only 3 species; not much
room for variation. Amphibians are the weird ones, but note that their
variation covers the variation of all the other groups except teleosts.
Now phylogeny does make some patterns even if variation is random, as
long as it doesn't change too fast. But I don't see anything that needs
a serious causal explanation in that data, nor does it contradict my claim.

That "Genome size seems pretty randomly distributed over the tree of
eukaryotes?" 2.3, 5.7, 90.4, 3.5, 1.2, 16.7, 2.3, 1.5 and 3.5 look
like a suspiciously non-random bunch of numbers to me.

First, those aren't eukaryotes, but only vertebrates.

That was not a complete list of all eukaryotes ;-)

My point is more than that; it's a highly biased list, concentrating on
one small group, just because you happen to belong to that group.

Second, in what way do they look non-random? It would seem that
the distribution is skewed toward the small end; is that what you meant?

No. The outlying samples are too far out.

OK, I'll agree that simple "random walk" makes it unlikely for big
genomes to be concentrated in a couple of groups. But you also have to
consider the phylogenetic dependence here. We don't have thousands of
independent random walks from the starting point. Instead we have a
branching series of random walks with each new pair of branches
inheriting its ancestor's genome size. If some frog happens to have an
unusually large genome, its descendants start with unusually large
genomes, and some may get even larger, and so on. Anyway, the take-home
message is that you shouldn't look at these as if they're independent
samples.
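A small simulation of the phylogenetic dependence described above. Assumptions: a balanced binary tree, genome size on an arbitrary (say, log) scale, and each daughter lineage inheriting its parent's value plus an unbiased Gaussian step. All names and parameters are hypothetical.

```python
import random
import statistics

def simulate_tips(depth, rng, start=0.0, step_sd=1.0):
    """Tip values of a balanced binary tree; each child = parent + N(0, step_sd)."""
    values = [start]
    for _ in range(depth):
        # Two children per parent, kept adjacent so that relatives stay together.
        values = [v + rng.gauss(0.0, step_sd) for v in values for _ in range(2)]
    return values

def between_clade_share(tips, n_clades):
    """Fraction of total variance that lies between equal-sized blocks of tips."""
    size = len(tips) // n_clades
    clade_means = [statistics.mean(tips[i * size:(i + 1) * size])
                   for i in range(n_clades)]
    return statistics.pvariance(clade_means) / statistics.pvariance(tips)

rng = random.Random(3)
tips = simulate_tips(depth=10, rng=rng)   # 1024 tips; 8 clades of 128 relatives
shuffled = tips[:]
rng.shuffle(shuffled)                     # same values, ancestry scrambled

tree_share = between_clade_share(tips, 8)
shuffled_share = between_clade_share(shuffled, 8)
print("between-clade share, real clades:    ", round(tree_share, 3))
print("between-clade share, shuffled groups:", round(shuffled_share, 3))
```

With inheritance, whole clades end up sharing high or low values even though no step is biased. The shuffled comparison shows how little clustering the same numbers would give if they really were independent samples.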

The random walk hypothesis obviously needs to be amended into the
"genome size is neutral" hypothesis.

Of course that hypothesis is not very good either - because of all
the cases where genome size is directly selected via its influence
on things like cell size.

There are no such cases; what we have are theories that this is
happening. But remember that nobody says that the random walk is a
product of drift. Cell size can be subject to fluctuating selection just
like anything else, and so produce what looks like a random walk.

Maybe. You seem /awfully/ keen to wave away any of the
selection effects I raise as irrelevant and promote
random walks, on the basis of not very much at all.

No, I merely point out that if the selection effect is not constant over
all life, it will appear like a random walk.

Not really: only if all the selection effects cancel out,
or nearly do so, will the result appear like a random walk.

Right.

[snip Cavalier-Smith]

Again, the choice is not between selection and neutrality. You are
proposing a general rule that genome size will increase over time. That
requires that selection generally is in the direction of increase. Is
Cavalier-Smith proposing this? It seems that he isn't, but instead that
genome size goes up and down by selection. And so if selection
fluctuates we may approach a random walk.

A random walk /is/ a possible outcome from fluctuating selection -
but not a likely one. The selective forces would need to cancel out -
or nearly do so in order to avoid leaving a signature - and
I don't see why this would happen.

For example if Cavalier-Smith's K-selected organisms are common, there
will be a cluster of large genome organisms - and if his r-selected
organisms are common, there will be another cluster of small genome
ones. Even if the K-selected organisms and r-selected organisms are
somehow equal in number, the resulting genome size distribution from
these two types of selection won't look like the results obtained
from a random walk.

As I said before, you are manufacturing a dichotomy that does not exist
in nature; k/r is a continuum.

Genuine neutrality is the best way to produce a random walk.

Fluctuating selection can do it in theory, but there had
better be a good reason to expect the selective forces to
almost exactly cancel each other out, and rarely does this
happen.

I'm reminded here of the Central Limit Theorem. Are you acquainted? The
combination of many random variables, regardless of their individual
distributions, tends toward a normal distribution of effect.
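A quick numerical check of the Central Limit Theorem as invoked above, assuming the individual effects are independent and have finite variance (the theorem needs both). Exponential variables are used only because they are strongly skewed; the sample sizes are arbitrary.

```python
import random
import statistics

def skewness(xs):
    """Population skewness: about 0 for a normal distribution, 2 for an exponential."""
    m = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)

rng = random.Random(4)
# One strongly skewed effect per draw.
single = [rng.expovariate(1.0) for _ in range(5000)]
# Fifty independent skewed effects added together per draw.
summed = [sum(rng.expovariate(1.0) for _ in range(50)) for _ in range(5000)]

print("skewness of single effects:", round(skewness(single), 2))
print("skewness of summed effects:", round(skewness(summed), 2))
```

The summed effects are far closer to symmetric than the individual ones. Note that this concerns the shape of the combined distribution; whether its mean is zero, that is, whether the selective effects cancel, is a separate matter.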
