Re: Yet again, human evolution: huh?
- From: John Harshman <jharshman.diespamdie@xxxxxxxxxxx>
- Date: Sat, 15 Oct 2005 05:50:03 GMT
anon1@xxxxxxx wrote:
>>>What is the backbone? (Deoxyribose ring at each position?)
>>
>>The ribose + phosphodiester is usually considered the backbone.
>
>
> I'm confused: I thought RNA meant ribo-nucleic acid, which means ribose
> is the sugar part of the base, whereas DNA is called deoxyribo-nucleic
> acid, because instead of ribose it's using deoxyribose. But you say DNA
> uses ribose just the same as RNA??
No, I didn't mean to say that; it's deoxyribose. The bond is 5'
carbon-O-P-O-3' carbon.
>>>... Each backbone sugar is connected to the next, the 3' to the next
>>>5' or vice versa, I forget which, ...
>>
>>Traditionally it's 5' to 3', probably because that's the direction of
>>replication and transcription.
>
> OK, I think I'll skip trying to learn specifically whether the 3' or
> the 5' of one unit connects to the phosphoric acid of the next or
> previous unit, and I'll just dumb it down to "replication and
> transcription progress in the same direction in all known life on
> Earth, and by convention reading out the sequence for the purpose of
> publication etc. goes in that same direction".
When DNA is replicated, the triphosphate is attached to the 5' carbon of
the free nucleotide; two phosphates are removed in attaching this to the
3' carbon of the existing strand.
> Hmm, does replication and transcription go in the same direction along
> the two parallel strands, or one strand processed in one direction and
> the other strand processed in the opposite direction because the
> base+phos combo is physically oriented in the opposite direction?
The latter. It's always 5' to 3', whichever strand is being replicated
or transcribed.
> Does unzipping of the two strands, prior to replication of each strand
> separately, during mitosis, happen from one end or the other, or entire
> strand in parallel at once? If unzipping happens from one end, but one
> of the two strands is replicated or transcribed in the opposite
> direction, how can that possibly work??
Look up "Okazaki fragment". The leading strand is replicated
continuously, but the lagging strand is replicated in discontinuous
segments. But always 5' to 3' on that strand.
> Hmm, if the two strands are physically laid out in opposite
> directions, then whenever replication of a single strand happens, the
> newly-built strand is being built in its own backwards direction,
> right? (If the two strands run in true-parallel, same direction on both
> strands, then this issue doesn't apply.)
I don't know what you meant by that. The two strands are anti-parallel,
which means that the 5'-3' direction on one strand is the 3'-5'
direction on the other strand.
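To make the anti-parallel point concrete, here is a minimal Python sketch (an illustration, not something from the tutorial). The function name `reverse_complement` and the five-base sequence are made up for the example. Reading one strand 5' to 3' and then reading its partner 5' to 3' means complementing each base and reversing the order.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the partner strand of seq_5_to_3, also written 5' to 3'."""
    # Walk the input backwards (its 3' end first) and complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(seq_5_to_3))


if __name__ == "__main__":
    top = "AACGT"  # made-up example sequence, read 5' to 3'
    bottom = reverse_complement(top)
    print("top    5'-" + top + "-3'")
    print("bottom 5'-" + bottom + "-3'")
    # Applying the operation twice gives back the original strand.
    print(reverse_complement(bottom) == top)
```

Published sequences give only one strand; the other is implied by this operation.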
> Anyway, I'll trust that regardless of whether the two strands are
> replicated and transcribed in true-parallel (same direction) or
> anti-parallel (opposite directions), for each strand the replication
> and transcription go in the same direction, and readout of base
> sequence for purposes of these discussions and all genomic databases
> also goes in that same direction for that one strand. I hope at least I
> got that correct now.
Yep.
> Regarding anti-parallel layout:
>
>>Yes, that's correct. But I really don't want to get into that. I'm
>>giving just exactly enough information to understand the phylogenetic
>>discussion, and no more. Almost everything you say, here and below, is
>>beyond what I need.
>
> That's OK. You asked for feedback as to what parts were opaque or
> caused confusion, and I simply replied from my own point of view,
> bringing up questions that might confuse me if I didn't know the
> answer, and asking some other related questions while we were on the
> topic. I saw a later posting where you got rid of the five-species
> example, using a four-species example instead. As a first tutorial as
> to what's going on, I think that's a good idea, and don't know why I
> didn't think of it myself, but I'm glad you did.
>
> Still, the real hypothesis is that three (3) African apes, not just
> two of the three, are in a small clade, split off from the other apes.
> Maybe you can go into great detail with just two of them, such as
> Gorilla+Human ---*--- Orangutan+Gibbon
> then once that's all explained, present just the table of results
> for the other two possible 2/2 splits:
> Gorilla+Chimpanzee ---*--- Orangutan+Gibbon
> Chimpanzee+Human ---*--- Orangutan+Gibbon
> If all those 2/2 splits show very high confidence, then we can
> summarize the three results as a sure conclusion of the desired 3/2
> split.
>
> Note that the 3/2 split does *not* guarantee the African-ape
> hypothesis, that Gorilla+Chimpanzee+Human are very closely related. The
> opposite could be equally true, consistent with the 3/2 split: Gibbon
> and Orangutan could be very closely related, and all three Gorilla
> Chimpanzee and Human could be distant out-groups. It's only when you
> build a phylogenetic tree that includes hundreds or thousands of species,
> and see that the African apes remain together, separate from all other
> species, that the African-ape-clade hypothesis is demonstrated
> conclusively. Even then, in principle it's possible that some new
> completely unexpected species that looks totally unlike apes and hasn't
> been classified as apes might turn out to be very closely related to
> the African apes. But if that ever happened, it would call a lot of
> phylogenetics into question. I don't believe that would happen, that's my
> prediction. But the current classification *is* falsifiable by that
> means.
Or to put it more simply: a phylogenetic tree represents the
relationships among the included taxa only, and does not represent the
relationships of any taxa that are not included.
[snip]
> I have a side question: When there's no duplication event, just a
> simple insert or delete, I can understand a large segment of DNA
> getting lost, and I can understand a large insertion that is a simple
> pattern repeating over and over, and also horizontal gene transfer can
> insert a large segment that came from somewhere totally alien. But does
> it ever happen that an apparently random sequence is inserted, not from
> any source, but brand-new random sequence of DNA bases out of nowhere?
Yes, fairly often. Of course it may come from somewhere else in the
genome, but unless you have sequenced the entire genome, how would you tell?
> If not, then I would assume whenever we find two sequences like this:
> CACGAGCCATACGATATCAGT CCGTAGTGAGCACTATTAAACAGTTAGAGCGGTTT
> CACGACCCATACGATATCAGTTTGTTCATTAGCTCAATAATTCCGTTGTGAGCACTATTAAACAGTTAGAGCCGTTT
> if that middle part of the bottom sequence doesn't look like anything
> available via lateral gene transfer then we must assume the bottom
> sequence is ancestral and the top sequence derives from it as the
> result of a deletion event? It can't reasonably be top-ancestral
> bottom-large-random-insertion-event, right? (Note: I've thrown in a few
> point mutations also just to make the data more realistic.)
This is where having sequence from multiple species would be handy. You
can't assume the directionality of an indel based just on two species.
[snip]
>>I think it's very confusing to talk of this as two trees instead of
>>as one tree.
>
>
> There are two rooted trees, what is usually meant by a tree, attached
> back to back (trunk to trunk) to make one large unrooted tree (the
> unusual kind being discussed here). In general at any inner link if you
> cut there you get two rooted trees, and for these kinds of ternary
> unrooted trees if you cut out any single internal node you get three
> rooted trees. It all seems simple to me, don't know if it makes sense
> to you, or more important to the intended reader of your tutorial.
This is not what "unrooted" and "rooted" mean in systematics. Perhaps in
graph theory. In systematics a root gives a time direction to each
branch, while an unrooted tree has no such direction.
>>There are no patterns that partly support it. Either they support it,
>>contradict it, or are irrelevant to it.
>
>
> I disagree. The hypothesis is:
> Gor+Chi+Hum / Ora+Gib
> The following data is what you are looking for:
> Gor+Chi+Hum / Ora+Gib
> The following sets of data are also consistent with the hypothesis:
> Chi+Hum / Gor+Ora+Gib * Note this case
> Gor / Chi+Hum / Ora+Gib * Note this case
> Gor+Chi+Hum / Ora / Gib
True. But they are also consistent with trees that contradict the
hypothesis. Thus they do not decide between hypotheses. They are irrelevant.
> The following data is inconsistent with the hypothesis, except by
> arguing special circumstances such as duplicate mutations in distant
> branches or polymorphism in common ancestor or horizontal gene transfer:
> Gor+Gib / Ora+Chi+Hum
>
> * The two cases flagged above clearly show a close relation between
> chimp and human, separate from the relation between orangutan and
> gibbon, which is *part* of the close clustering of all three African
> apes, hence my claim these give some evidence of that hypothesis.
> I claim that is indeed weak support, rather than totally neutral.
And this claim is incorrect. If we are using a parsimony criterion it's
easy to demonstrate. Those sites are equally compatible with the
hypothesized tree and with contradictory trees. The site you
characterize as Gor+Chi+Hum/Ora/Gib is in fact compatible with any tree
whatsoever, i.e. it requires two changes no matter what tree you have.
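The parsimony argument can be checked by brute force. The sketch below is an illustration under stated assumptions: the names `all_trees`, `fitch` and `scores` are hypothetical, the bases assigned to each pattern are arbitrary placeholders, and the change counts come from Fitch's algorithm applied to each of the 15 labeled five-taxon trees.

```python
TAXA = ["Gor", "Chi", "Hum", "Ora", "Gib"]


def all_trees(taxa):
    """Yield the 15 labeled unrooted trees for five taxa.

    Every such tree is two pairs joined through one middle taxon. It is
    written here as the nested tuple ((a, b), (mid, (c, d))); where the
    tree is rooted does not affect the parsimony count.
    """
    for mid in taxa:
        rest = [t for t in taxa if t != mid]
        first = rest[0]
        for partner in rest[1:]:
            others = tuple(t for t in rest[1:] if t != partner)
            yield ((first, partner), (mid, others))


def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for a subtree."""
    if isinstance(tree, str):  # a leaf: its observed base, no changes yet
        return {states[tree]}, 0
    left_set, left_n = fitch(tree[0], states)
    right_set, right_n = fitch(tree[1], states)
    common = left_set & right_set
    if common:  # the two sides can agree, so no extra change is needed
        return common, left_n + right_n
    return left_set | right_set, left_n + right_n + 1


def scores(pattern):
    """Changes required on each of the 15 trees for a pattern in TAXA order."""
    states = dict(zip(TAXA, pattern))
    return [fitch(tree, states)[1] for tree in all_trees(TAXA)]


if __name__ == "__main__":
    patterns = {
        "Gor+Chi+Hum / Ora+Gib": "AAAGG",
        "Chi+Hum / Gor+Ora+Gib": "GAAGG",
        "Gor / Chi+Hum / Ora+Gib": "ACCGG",
        "Gor+Chi+Hum / Ora / Gib": "AAAGT",
    }
    for label, pattern in patterns.items():
        print(label, sorted(scores(pattern)))
```

A pattern helps choose between trees only if its count differs among them. In this sketch the Gor+Chi+Hum/Ora/Gib pattern needs two changes on every one of the 15 trees, so it cannot favor any of them.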
> Sometimes in mathematics, it's best to derive a stronger theorem than
> what we really want, and then generate what we want as a simple
> corollary to the main theorem. In this case, we might generate the
> stronger fully-resolved 5-species unrooted tree:
>        Gib  Gor  Hum
>         |    |    |
> Orang---*----*----*---Chimp
> and then the three-African-ape hypothesis would be one of the two
> possible corollaries (the other being Hum+Chi cousins hypothesis).
> (With the caveat that we haven't proven whether Ora+Gib are outgroups
> for tight Gor+Hum+Chi clade, or whether Gor+Hum+Chi are the outgroups
> with tight Ora+Gib.)
>
>>In the first case, the time between speciation events is such that
>>ancestral polymorphisms would be expected to have coalesced without any
>>retention.
>
> How is this a valid fact to assert, if the only information you have
> are these five short DNA sequences, no information about branch lengths
> as you claimed above?
I was importing additional information.
>>Remember, these are mitochondrial sequences.
>
>
> No, I don't remember seeing this in the OP. Let me check there ...
> indeed, the *only* mention in the entire article is way at the very end
> in the reference:
>
>>Molecular phylogeny and evolution of primate mitochondrial DNA.
>
> I have no access to a technical library, so I didn't read that part.
> You should have mentioned this was mitochondrial DNA way at the top.
It really doesn't matter for my purposes, only for discussing this with
you. My apologies for assuming you would know, but you are showing an
amazing and confusing mixture of knowledge and lack of knowledge, and I
don't know how to treat you in a discussion. I would have assumed that
you would know that ND4 and ND5 are mitochondrial genes. What is your
background?
> Indeed, I would expect mitochondrial DNA to have very very little
> polymorphism that lasts only a few generations. So with that
> revelation, I agree ancestral polymorphism is unlikely, leaving
> duplicate identical mutations in separate branches, and horizontal gene
> transfer (via some virus vector such as ape influenza or monkey pox)
Very, very unlikely for mitochondria. In fact I don't know of a single case.
> as
> the two remaining ways we can explain away the conflicting data.
> It would help to have some statistical analysis of the general mutation
> rate (before individuals are immediately killed off by fatal
> mutations), hence the chance that the same exact neutral or beneficial
> mutation would occur twice.
>
>
>>In fact homoplasy is quite common in DNA sequences. There are only 4
>>possible bases, after all.
>
>
> I've heard that there's about one mutation per generation. For a genome
> of several million bases, that's one mutation per several million bases
> per generation, or one mutation per base per several million
> generations. Are there several million generations separating African
> apes and other apes, whereby we'd expect mutations in the same location
> to recur all over the place, with one third of them resulting in the
> same result, hence several such within any segment of the size you were
> considering? I didn't think so, but was I mistaken?
Mitochondria have a higher mutation rate than nuclear genomes.
Divergence in mammals averages around 2%/million years. I would guess
that the mutation rate is roughly 3 times that (assuming that silent
sites are neutral, non-silent sites highly deleterious), or 3% per
million years per lineage. If we imagine that a generation is around 3
years for the average mammal, we can conveniently suppose a mutation
rate of 10^-7/site/generation, if I'm doing my math right.
These taxa are only separated by a few million years, but it's enough
that at least half the sites are showing clear homoplasy. Fortunately
the homoplasy is randomly distributed, but the homology is not.
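The arithmetic above can be written out as follows. This is a sketch of one reading of the figures: it assumes the 2% divergence is measured between two lineages (so 1% per lineage), and the function and parameter names are made up for the illustration.

```python
def mutation_rate_per_generation(divergence_per_myr=0.02,
                                 mutation_multiplier=3.0,
                                 generation_years=3.0):
    """Rough per-site, per-generation mutation rate from the figures above.

    divergence_per_myr  : observed divergence between two lineages per
                          million years (2% in the text)
    mutation_multiplier : guessed ratio of mutation rate to observed
                          substitution rate (3 in the text)
    generation_years    : assumed generation time (3 years in the text)
    """
    # Assumption: two diverging lineages each contribute half the divergence.
    per_lineage_per_myr = divergence_per_myr / 2
    mutation_per_myr = mutation_multiplier * per_lineage_per_myr  # 3% per Myr
    mutation_per_year = mutation_per_myr / 1e6
    return mutation_per_year * generation_years


if __name__ == "__main__":
    rate = mutation_rate_per_generation()
    print(f"{rate:.1e} mutations per site per generation")
```

With the defaults this comes to 9 x 10^-8, which rounds to the 10^-7 per site per generation quoted above.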
>>Please read just a little bit ahead before commenting.
>
> Do you expect your novice readers to read the whole tutorial before
> they understand any of it? Wouldn't it be more reasonable for it to be
> self-explanatory in a single-pass forward reading?
>>I don't think you understand how the chi square test works. There is
>>only one test performed on the entire distribution. It asks whether the
>>7 patterns (and there really are only 7 relevant patterns) occur with
>>equal frequency. They don't. That's all.
>
> I don't need the chi squared test to see from the raw totals that the
> African/nonAfrican split is way out ahead compared to all the rest.
Precisely, but you need it to tell whether this is likely to come about
by chance.
> Converting them to all chi-square scores doesn't look, on the face of it,
> just looking at the numbers, not knowing what to expect, whether it's
> really significant or not. On the other hand, 95% or better confidence
> on one hypothesis, and only 50% or less for all the others, shows me
> true and obvious significance. From later in your tutorial, I see
> something better than 99.999999% confidence, which is super super good.
> But that's too late. This reader is already saying "so what? why did
> you bother to show me these meaningless (to me) chi-sq results here?"
> at the point where the chi-sq has been shown but the P hasn't yet.
It's chronological. You need the sum of squares, to compare to the chi
square distribution, before you can compute the P value. Again, you
should understand that you do *not* get a P value for any one
hypothesis, only for the entire distribution of sites over hypotheses.
The P value represents how confident you can be in rejecting the null
hypothesis that all differences are random.
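The order of operations (statistic first, then P) can be sketched in Python. The site counts below are invented for illustration and are not the tutorial's data, and the function names are hypothetical. With 7 patterns there are 6 degrees of freedom, and for an even number of degrees of freedom the chi-square upper tail has a closed form, so no statistics library is needed.

```python
import math


def chi_square_equal_freq(counts):
    """Chi-square statistic for the null that all patterns are equally frequent."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def chi_square_p_even_df(stat, df):
    """Upper-tail P value of the chi-square distribution for even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = stat / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


if __name__ == "__main__":
    # Hypothetical counts of sites showing each of the 7 patterns.
    counts = [40, 5, 4, 6, 3, 5, 7]
    stat = chi_square_equal_freq(counts)
    p_value = chi_square_p_even_df(stat, df=len(counts) - 1)
    print(f"chi-square = {stat:.1f}, P = {p_value:.2e}")
```

Note that one statistic and one P value come out for the whole set of counts, not one per pattern.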
>>Here's what you can do. Go to Genbank and find a random sequence that
>>has entries for all 5 species above. ...
>
> I have no idea how to do that. Is there a simple tutorial for laypeople
> who want to do such simple tasks? One time a year or so ago I saw a URL
> for a set of provisional sequence data that was organized in a way
> where I could just browse the data at random and pick some sequence
> from somewhere and view/download. But I have no idea how to find
> matching sequences in five different genomes, and I'm sure it can't be
> done by manually browsing each of the five independently even if I had
> URLs for each of the five.
That's what GenBank is for. One URL. GenBank gives you various options
to search the database. And there are tutorials and all manner of other
help. Though in fact phylogenetic analysis is not what it's really set
up for. You might start by searching for sequences with taxon =
Hylobates, then BLAST those to find other primates with homologous
sequences, choosing those that give you all five species.
Here's the home page:
http://www.ncbi.nlm.nih.gov/
>>This is not an assumption of what I want to prove. It's a prediction
>>based on experience, mine and that of everyone who has ever sequenced
>>primate DNA.
>
> I think you should have presented it clearly as such a prediction from
> the model, a way of falsifying the model (hypothesis) if your
> prediction turns out to be wrong in more than just a tiny fraction of
> cases.
>
>
>>>I believe that among five species there are thirty possible unrooted trees,
>
>
>>15.
>
>
> Aha, there are indeed thirty unrooted trees if you distinguish the
> inner branches as major and minor, for example one of them supported to
> a 99.999999% confidence level and the other supported only to a 95% or
> even 80% confidence level. But if you ignore branch lengths or
> confidence levels, then indeed there are only 15 modulo the symmetry.
> I had the right thing in mind, but said it wrong, sorry.
>
>
>>>or considering just
>>>the toplevel 3/2 split there are ten possible, correct? So there's
>>>nothing unique about any particular one of them.
>>
>>No. However, if there is statistical support for any one tree as opposed
>>to all other trees, that itself is an expectation of common descent.
>>That is, common descent supposes that you will get one consistent tree
>>from different samples, though it does not a priori tell you what tree
>>to expect. Fiat creation has no such expectation.
>
> We're in agreement on the facts. I just thought your original wording
> was misleading, indicating that this one (of the ten possible) 3/2
> split was unique.
Yes, it's our expectation going in, because previous data, both
morphological and molecular, have given us that tree.
> When there are ten possibilities that are special, any
> one of the ten is possible, even with special creation you'll get one
> of those ten for any given sequence data, it's just that if you use
> several different sequences of DNA you'll get the same 3/2 split in all
> cases with common descent whereas you'll get a random sample of the ten
> possible splits with special creation.
>
> But that's not quite correct. If the correct fully-resolved unrooted
> tree is the one I showed above, repeated here again:
>        Gib  Gor  Hum
>         |    |    |
> Orang---*----*----*---Chimp
> then there are two different 3/2 splits that we expect to be supported
> by the data:
> Ora+Gib / Gor+Hum+Chi
> Ora+Gib+Gor / Hum+Chi
> For some sets of data, one or another might be strongly supported by
> the data, and in some *both* might be strongly supported. So, on that
> premise, looking at lots of sequences you'd expect a random mix of one
> or the other or both strongly supported.
This would be the case if branch lengths were uncorrelated among loci.
But they are in fact strongly correlated. The left internal branch is
much longer than the right one, both in time and in expected probability of site
change. So it would be very, very unusual to find the right branch
supported without strong support for the left one too.
> In the general case, for any five species, there's only one possible
> unrooted tree in the topological sense,
Presumably you mean one possible unlabeled, unrooted tree.
> fifteen possible ways that tree
> can fit in with the five species (15 different ways to assign labels).
Correct.
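The counts mentioned in this thread (15 labeled trees and 10 possible 3/2 splits for five species, 3 labeled trees for four species) can be reproduced with a short sketch. It relies on the standard result that n labeled taxa have (2n-5)!! fully resolved unrooted trees; the function names are made up for the example.

```python
from math import comb


def unrooted_tree_count(n_taxa):
    """Number of labeled, fully resolved unrooted trees: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def three_two_split_count(n_taxa=5):
    """Ways to pick the pair that sits on one side of a 3/2 split."""
    return comb(n_taxa, 2)


if __name__ == "__main__":
    print(unrooted_tree_count(4))    # labeled trees for four species
    print(unrooted_tree_count(5))    # labeled trees for five species
    print(three_two_split_count(5))  # possible 3/2 splits of five species
```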
> Given the general hypothesis of universal common descent, we have
> common descent for these five species, hence only one of these fifteen
> fully-resolved unrooted-trees is correct. But then we have two
> different 3/2 splits supported, random mix of one or other or both with
> different sets of sequence data.
>
> On the other hand, with only four species, again there's only one
> topological unrooted-tree, three possible labeled unrooted-trees, and
> for whichever such unrooted tree is correct there's only one 2/2 split,
> so there the split really is unique given the hypothesis.
>
> It's sad that you had to delete one of the four apes to make the
> tutorial manageable, but it seems to be a necessary decision.
I don't know. Several people have told me they like it better with all
five apes.