Re: Sean Pitman and nested hierarchy
- From: John Harshman <jharshman.diespamdie@xxxxxxxxxxx>
- Date: Wed, 05 Mar 2008 21:47:36 GMT
Charles Brenner wrote:
On Mar 5, 10:52 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 4, 6:18 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 4, 12:40 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 3, 6:59 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 3, 8:27 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 2, 8:21 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Feb 26, 9:36 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
[I thought I'd start a new thread since Sean isn't replying in the old
one. ...

Charles Brenner wrote:
[the following seems to me to justify a big snip]

John Harshman wrote:
Let's at least agree on what we're talking about.

Charles Brenner wrote:
We're talking about a likelihood ratio framework. Data of life or
fossils is observed and there are two hypotheses about how it came
about --
Hn = common descent from natural processes
Hg = God created life

John Harshman wrote:
No. That's not what the two hypotheses are, unless you're stating them
badly. The two hypotheses are 1) common descent and 2) separate
creation.

Charles Brenner wrote:
Ok. I accept the correction.

Charles Brenner wrote:
The salient and interesting fact about the data is that it indicates a
nested hierarchy (NS). NS is a more or less inevitable consequence of
Hn, but only one possible consequence of Hg. So the data supports Hn
over Hg, but by a lot or by a little? If the NS property of the data
is improbable under Hg, then the data is strong evidence supporting Hn
over Hg.
In order to get a handle on how improbable NS is as a consequence of
Hg, you posit the universe of all possible data states and imagine
them to be equally likely. That's quite unlike what one does in real
life (and most of my work is related to the theory or practice of
likelihoods of DNA data), but I've agreed to play along with your
point of view -- though of course I argue that the "equally likely"
provision doesn't make sense (mathematically, let alone
scientifically). However, if God created life then the patterns in the
data correspond to what God did -- the scheme that God chose. Are we
agreed so far?

John Harshman wrote:
No, unless by "god created life" you mean that there is no common
descent, which is a confusing way to say it.

Charles Brenner wrote:
That's what I should have meant. (Speaking of confusing, my editor
apologizes for introducing the abbreviation NS for "nested
hierarchy".)

John Harshman wrote:
And we're specifically talking about the patterns in character data,
i.e. similarities and differences among species. These in turn imply
connections among the species, or perhaps lack thereof. So I have
simplified by reducing the possible patterns to different assemblies of
connections among species.

Charles Brenner wrote:
There is an infinitude of possible data sets.

John Harshman wrote:
Is this true? Only if individual data sets can increase without bound.
Can they?

Charles Brenner wrote:
I don't see why not. Also, if you allow that the data can include
continuous measurements then there are infinitely many different
possible data sets even of bounded size.

John Harshman wrote:
I was thinking of the data sets as genomes.

Charles Brenner wrote:
It's natural to want to
bin the data to cope with it. In your case part of the motive for
doing so is to have a finite set, because for a finite set at least
the concept of "uniform probability distribution" is well-defined.
However, lifting this distribution from the finite set back to the
real data does not impose a uniform distribution on the real data --
just some arbitrary distribution that is tailored for the purpose of
your argument.
In short, what you call a "simplification" is a way of being sneaky.

John Harshman wrote:
Not intentionally so.

Charles Brenner wrote:
(It's also very bizarre. Unless I quite misunderstand, a few of the
graphs are trees representing natural hierarchies, which might
actually come about as a consequence of some God-scheme we could think
of.

John Harshman wrote:
I don't understand what you mean by saying it's bizarre. Can you explain?

Charles Brenner wrote:
My remarks in parentheses just above and below were supposedly the
explanation. To expand slightly: Since the vast majority of the graphs
are ones that have zero probability (in my opinion), they are just
stuffing. By positing equal probability (assuming Hg) for each graph
you thereby artificially exaggerate the apparent rareness of (the
handful of graphs representing) nested hierarchies as consequences of
Hg.

John Harshman wrote:
What are your criteria for considering some graphs to have zero probability?

Charles Brenner wrote:
I'll bet some graphs are literally impossible. The graphs are produced
by an algorithm (to be accurate one or another favored heuristic) that
tries to fit the genomic data into a pattern. Regardless of whether
there are any limitations or preferences on the model of creation --
on "God" -- can phylogenetic analysis really ever produce a graph
including a loop? (But I already gave a weaker form of this argument
in my previous response, below, so perhaps I have not understood your
question.)

John Harshman wrote:
Yes, it can, depending on the algorithm used. Now it's true that all the
most commonly used algorithms will always produce a tree (often a
single, fully bifurcating one) regardless of data input, but there are
others that will produce loops or unconnected points. See, for example,
John Alroy's program CTA. And there are of course loops in reality --
hybrid species -- for which there are programs. And there is also
SplitsTree, which produces a big set of loops. Trust me. Programs are
written to fit the data, not the other way around. If the data suggested
some weird graph, there would be a program to analyze it.

Charles Brenner wrote:
I think we've been a little imprecise about nomenclature. I assume the
graphs we have in mind are in fact directed graphs -- they don't
merely say that A and B seem to be descended one from the other, but
specifically in which direction.

John Harshman wrote:
No. That's not necessary. Phylogenies are directed graphs. But I was considering the more general case.

Charles Brenner wrote:
When I said "loop" I had in mind a
circle of inheritance, as opposed to a set of edges like A->B->Z and
A->C->Z (which I've heard called "articulation". Is that a usual word
for it?)

John Harshman wrote:
Is it possible you mean "reticulation"? But anyway, that's exactly what I thought you meant by "loop".

Charles Brenner wrote:
which could arise from hybridization. A devious creator could
I suppose create data that mildly suggests a circle: the junk DNA of B
looks like it descended from A in being substantially like a mutated
and partly broken-up and re-arranged version of A, C looks like a
descendant of B, ... and A looks like a descendant of Z.
A->B->C->...->A. Does John Alroy's program cater to this? Would it
ever produce the result A->B->A?

John Harshman wrote:
No. It produces, if I recall, undirected graphs (i.e. unrooted trees).
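
The two senses of "loop" in this exchange can be told apart mechanically. What follows is only a minimal sketch, assuming a phylogeny is given as a list of directed (ancestor, descendant) edges; the function names and the two example edge lists are hypothetical illustrations, not output from CTA, SplitsTree, or any other program mentioned here.

```python
from collections import defaultdict


def has_directed_cycle(edges):
    """Return True if the directed graph has a circle of inheritance.

    Uses Kahn's algorithm: repeatedly remove nodes with no remaining
    ancestors; any node that can never be removed lies on a cycle.
    """
    indegree = defaultdict(int)
    children = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed < len(nodes)


def reticulation_nodes(edges):
    """Return the nodes that have more than one distinct parent."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {node for node, ps in parents.items() if len(ps) > 1}


# Hypothetical example 1: hybridization, A->B->Z and A->C->Z.
hybrid = [("A", "B"), ("A", "C"), ("B", "Z"), ("C", "Z")]
# Hypothetical example 2: a circle of inheritance, A->B->C->A.
circle = [("A", "B"), ("B", "C"), ("C", "A")]

print(has_directed_cycle(hybrid), reticulation_nodes(hybrid))
print(has_directed_cycle(circle), reticulation_nodes(circle))
```

In this sketch the hybridization case has a reticulation node (Z) but no directed cycle, while the circle has a directed cycle but no node with two parents. An undirected graph, such as an unrooted tree, carries no edge directions, so neither test applies to it.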

Charles Brenner wrote:
I have an alternative (rather abstract, tedious to try to write down)
argument in mind involving consideration of a multiplicity of binning
schemes that we might devise, of which your graph idea would be only a
non-distinguished example. But it may not be of interest, especially
if the answer above is satisfactory.

John Harshman wrote:
Sadly, it isn't.

Charles Brenner wrote:
My abstract argument comes down in effect to saying, why would John
Astor's program be so special?

John Harshman wrote:
Alroy. I have no idea what you meant by that.

Charles Brenner wrote:
Let's imagine that my creationist adversary (C.A., whom you are ably
representing) won't buy any of my logical or intuitive attempts to
argue that some (directed) graphs (nearly) cannot occur as
representations of the genomic data.

John Harshman wrote:
Hey, how come I have to be the creationist?

Charles Brenner wrote:
Now, for any particular creation-scheme (G), some kinds of graphs are
quite likely and others are not.
However, when I try to say that certain graphs -- tangled loops for
example -- are unlikely from any G whatever, C.A. raises the objection
that I'm imposing a limitation on G. After all, for all we know G's
method involves intentionally designing the junk DNA just so as to
create a tangled loop in the graph. (We're assuming for the sake of
argument that such is mathematically possible.)

John Harshman wrote:
I don't understand what a tangled loop is, unless you mean to imply a directed graph with a circle in it, A>B>C>A, for example. It seems to me that would be possible, as long as different parts of the genome were compared for each piece of the loop.

Charles Brenner wrote:
Very well. From the C.A. perspective God or whatever chooses a
separate-creation-scheme G and there is some probability distribution
(unknown to you, me, or C.A.) across all possible choices for G.
Depending on which G is chosen, the various bins (=choices of graph)
that CTA can produce are more or less likely to occur. In other words,
the probability distribution on the bins of CTA is a projection from
the probability distribution of G's. Is that probability distribution
approximately uniform?

The algorithm CTA is man-made, e.g. by John Alroy. While it may be a
natural-seeming way to arrange the data from the perspective of
evolutionary biologists with their typical interests, in the sense
that that's an arbitrary interest, CTA is an arbitrary method of
arrangement. Instead of CTA, we could use some different binning
algorithm including algorithms which make no pretense of doing a
similar thing. Anything that computes some bin number based on the
entirety of the genomic data (perhaps providing that it tends to
compute about the same bin number from a substantial subset of the
genomic data) will do.

John Harshman wrote:
Let me stop you here. I would contend that there are methods of determining which binning methods fit the data better. Nested-hierarchy data demand to be dealt with by a nested-hierarchy algorithm, for example.

Charles Brenner wrote:
Each such algorithm defines a projection from
the probability distribution of G onto a set of numbered bins. Since
the bins are man-made and arbitrary, no matter what the probability
distribution on G may be, it can't tend to be uniform when projected
by an arbitrary algorithm. For one particular algorithm, by chance, the
probability distribution for the various bins may be uniform, but not
in general across algorithms. (For example two algorithms could differ
mainly in that one of them collapses half the bins of the other into a
single bin. The probability distribution of the bins cannot
simultaneously be uniform for both algorithms.) Therefore, it would be
far-fetched to suppose that the bins of some particular binning scheme
such as CTA would have a uniform distribution.

John Harshman wrote:
You have lost me entirely. Who are you arguing with again? About what?
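
A small numerical illustration may make the binning point concrete. This is a hedged sketch only: the eight equally likely "schemes" and the two binning functions `fine` and `coarse` are invented for the example and do not model CTA or any real phylogenetic program. It shows the parenthetical case from the argument, in which one binning scheme collapses half the bins of another into a single bin.

```python
from collections import defaultdict
from fractions import Fraction


def project(dist, binning):
    """Push a distribution over outcomes forward onto the bins of `binning`."""
    binned = defaultdict(Fraction)
    for outcome, prob in dist.items():
        binned[binning(outcome)] += prob
    return dict(binned)


def is_uniform(dist):
    """True if every bin carries the same probability."""
    return len(set(dist.values())) == 1


# Assumption for illustration: eight creation schemes, equally likely.
schemes = {g: Fraction(1, 8) for g in range(8)}


def fine(g):
    """Hypothetical binning A: four bins of two schemes each."""
    return g // 2


def coarse(g):
    """Hypothetical binning B: like A, but bins 2 and 3 merged into one."""
    return min(fine(g), 2)


print(project(schemes, fine), is_uniform(project(schemes, fine)))
print(project(schemes, coarse), is_uniform(project(schemes, coarse)))
```

With these made-up numbers the four bins of `fine` each get probability 1/4, while the three bins of `coarse` get 1/4, 1/4 and 1/2. The same underlying distribution cannot look uniform under both schemes, which is the sense in which uniformity over bins depends on the choice of binning.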

John Harshman wrote:
Only a few of these patterns would be trees,
which of course would correspond to nested hierarchies.

Charles Brenner wrote:
(The vast majority consist of a bunch of random edges here and there
including tangles and loops, and these correspond, to put it mildly,
to no obvious kind of God-scheme so my guess is that they have
probability about zero. On the other hand, a very obvious category of
God-schemes wherein God tosses completely different random DNA into
every different species, corresponds to the empty graph. Hence to
suppose equal probabilities for all the different graphs is pretty
odd. Recall my baseball analogy.)

John Harshman wrote:
The problem with this is that we have no basis on which to say that a
god-scheme must be obvious, or indeed fit any parameters we choose to
impose on it. The whole idea comes from Sean's demand that we place no
constraints on god.

Charles Brenner wrote:
I don't agree "no basis". I'm perfectly happy to reject such a demand
and so should be any thinking person. For example if God is
operationally defined as the "intelligent designer" then I have a
basis to prefer a God akin to intelligences with which I am familiar.
Sean or anyone is free to argue with my choices but to say no concept
of God is more reasonable than any other is tantamount to ignoring the
absurdity of having a uniform probability on the infinite and
irregular space of all possible schemes, motives, and methods a God
might have.

John Harshman wrote:
I personally have no problem with this. By constraining the
distribution, you are assuming some particular sort of god. But the only
way we can produce a well-characterized hypothesis, subject to
scientific testing, is to do something of this sort. You can't of course
test all possible sorts of gods, at least at one time.

John Harshman wrote:
The uniform distribution is often used to represent ignorance of the true
distribution. But it isn't really, is it?

Charles Brenner wrote:
I agree with that too. Even when the "uniform distribution" is
perfectly well-defined, it is often nonsensical to invoke it.

John Harshman wrote:
If so, then the take-home lesson is that we can't estimate the
likelihood of (in your terminology) NS given Hg. This prevents any sort
of science being done if the potential explanatory universe includes Hg.
Would you agree?

Charles Brenner wrote:
That may depend whether you mean Hg="God did it" as I originally wrote
or Hg="separate creation" as you corrected. The latter is the
interesting question so I'll assume that. Relating to our discussion,
my answer is that it depends on how we define science. According to
the myth that the standard of science is "falsification" in so black
and white a sense that no one could doubt or argue, then I would agree
that Hg isn't subject to it, but perhaps not much is. According to my
preferred view of science though, Hg is potential grist for the
science mill. As I said at the outset, I like your argument that the
observed data (indicating nested hierarchy) is characteristic of Hn
and hugely non-characteristic of Hg -- i.e. by the "likelihood
principle" it is enormous evidence against Hg. One could go into all
sorts of detailed arguments about how "non-characteristic". All I say
is that you can't make those arguments with mere mathematics; they
will have an element of subjectivity. That's life, and science.

John Harshman wrote:
Naive falsificationism aside, it seems to me the answer depends on how
we define Hg. "Separate creation" by itself is not a testable hypothesis
unless we have some kind of idea of what we would/would not observe if
the hypothesis were true.

Charles Brenner wrote:
or "unlikely to observe"?

John Harshman wrote:
It's so much easier to write in terms of yes/no. Please assume in the
future that there is always an implied confidence interval around
everything.
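
The likelihood-ratio argument running through this exchange can be stated compactly. The notation below is an assumption made for illustration: D stands for the observed data (the nested hierarchy), G ranges over separate-creation schemes as in the binning discussion, and N and k are hypothetical counts of all graphs and of tree-shaped graphs. Neither poster wrote it this way.

```latex
\[
\mathrm{LR} \;=\; \frac{P(D \mid H_n)}{P(D \mid H_g)},
\qquad
P(D \mid H_g) \;=\; \sum_{G} P(D \mid G)\, P(G \mid H_g).
\]
% If all N graphs were taken as equally likely under H_g and k of them
% are trees, the denominator would reduce to k/N.
\[
P(D \mid H_g) \;=\; \frac{k}{N}
\quad\text{(uniform assumption only)}.
\]
```

The disagreement in the thread is over the weights P(G | H_g). The uniform choice makes the denominator easy to compute, but if those weights are unknown and not uniform, the size of the ratio depends on a judgment about them that the data cannot supply.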

John Harshman wrote:
How would we get such an idea? If you suggest
analogies with human design, creationists will accept only those
features of human design they see in nature, taking that as proof of
separate creation, and reject any features they don't see, because "god
is not limited by human capabilities".

Charles Brenner wrote:
If you want to convict your adversary, you really have to switch to
mathematics.

John Harshman wrote:
I doubt anything will help, since he is his own sole judge, and the
verdict is fixed. But what did you have in mind?

Charles Brenner wrote:
Sorry -- I meant stop doing biology. Mathematicians, dealing with
mathematics, are more pliable than creationists or lawyers. If the
Riemann hypothesis happens to be wrong and you can provide a
legitimate counter-example, the world of mathematics will warmly
accept correction. It may be a less frustrating culture in that way
than is honing your debating skills here.

John Harshman wrote:
Too high a price. Biology is much more fun than mathematics, and Sean really doesn't intrude much into that. Failing a creationist takeover of the government, that is.

Charles Brenner wrote:
A little earlier I was listening to an interview of John Maynard Smith
(by Robert Wright). Smith remembered the experience of shedding his
religion. He explained that it was a liberating occurrence because
before it there were times when he felt he could not allow his
thinking to go too far in a direction.