Re: Sean Pitman and nested hierarchy



Charles Brenner wrote:
On Mar 5, 10:52 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 4, 6:18 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 4, 12:40 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 3, 6:59 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 3, 8:27 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Mar 2, 8:21 pm, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
Charles Brenner wrote:
On Feb 26, 9:36 am, John Harshman <jharshman.diespam...@xxxxxxxxxxx>
wrote:
[I thought I'd start a new thread since Sean isn't replying in the old
one. ...
[the following seems to me to justify a big snip]
Let's at least agree on what we're talking about.
We're talking about a likelihood ratio framework. Data of life or
fossils is observed and there are two hypotheses about how it came
about --
Hn = common descent from natural processes
Hg = God created life
No. That's not what the two hypotheses are, unless you're stating them
badly. The two hypotheses are 1) common descent and 2) separate
creation.
Ok. I accept the correction.
The salient and interesting fact about the data is that it indicates a
nested hierarchy (NS). NS is a more or less inevitable consequence of
Hn, but only one possible consequence of Hg. So the data supports Hn
over Hg, but by a lot or by a little? If the NS property of the data
is improbable under Hg, then the data is strong evidence supporting Hn
over Hg.
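
To make the likelihood-ratio framing above concrete, here is a minimal sketch. The probabilities fed in are hypothetical placeholders, not estimates from anyone in this thread. The only point is that the strength of the evidence for Hn over Hg is the ratio P(NS | Hn) / P(NS | Hg), so it depends entirely on how improbable a nested hierarchy is taken to be under Hg.

```python
from fractions import Fraction

def likelihood_ratio(p_data_given_hn, p_data_given_hg):
    """Return P(data | Hn) / P(data | Hg) as an exact fraction."""
    if p_data_given_hg == 0:
        raise ValueError("P(data | Hg) must be positive for a finite ratio")
    return Fraction(p_data_given_hn) / Fraction(p_data_given_hg)

# Assumption: a nested hierarchy is treated as (nearly) certain under Hn.
p_ns_given_hn = Fraction(1)

# Hypothetical values for P(NS | Hg); none of these come from the thread.
for p_ns_given_hg in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
    lr = likelihood_ratio(p_ns_given_hn, p_ns_given_hg)
    print(f"P(NS|Hg) = {p_ns_given_hg}: likelihood ratio = {lr}")
```

The smaller P(NS | Hg) is assumed to be, the larger the ratio. That is why the choice of distribution under Hg carries the whole argument.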
In order to get a handle on how improbable NS is as a consequence of
Hg, you posit the universe of all possible data states and imagine
them to be equally likely. That's quite unlike what one does in real
life (and most of my work is related to the theory or practice of
likelihoods of DNA data), but I've agreed to play along with your
point of view -- though of course I argue that the "equally likely"
provision doesn't make sense (mathematically, let alone
scientifically). However, if God created life then the patterns in the
data correspond to what God did -- the scheme that God chose. Are we
agreed so far?
No, unless by "god created life" you mean that there is no common
descent, which is a confusing way to say it.
That's what I should have meant. (Speaking of confusing, my editor
apologizes for introducing the abbreviation NS for "nested
hierarchy".)
And we're specifically talking about the patterns in character data,
i.e. similarities and differences among species. These in turn imply
connections among the species, or perhaps lack thereof. So I have
simplified by reducing the possible patterns to different assemblies of
connections among species.
There is an infinitude of possible data sets.
Is this true? Only if individual data sets can increase without bound.
Can they?
I don't see why not. Also, if you allow that the data can include
continuous measurements then there are infinitely many different
possible data sets even of bounded size.
I was thinking of the data sets as genomes.
It's natural to want to
bin the data to cope with it. In your case part of the motive for
doing so is to have a finite set, because for a finite set at least
the concept of "uniform probability distribution" is well-defined.
However, lifting this distribution from the finite set back to the
real data does not impose a uniform distribution on the real data --
just some arbitrary distribution that is tailored for the purpose of
your argument.
In short, what you call a "simplification" is a way of being sneaky.
Not intentionally so.
(It's also very bizarre. Unless I quite misunderstand, a few of the
graphs are trees representing natural hierarchies, which might
actually come about as a consequence of some God-scheme we could think
of.
I don't understand what you mean by saying it's bizarre. Can you explain?
My remarks in parentheses just above and below were supposedly the
explanation. To expand slightly: Since the vast majority of the graphs
are ones that have zero probability (in my opinion), they are just
stuffing. By positing equal probability (assuming Hg) for each graph
you thereby artificially exaggerate the apparent rareness of (the
handful of graphs representing) nested hierarchies as consequences of
Hg.
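
The "stuffing" point can be illustrated with a small count. The sketch below rests on a simplifying assumption that is not spelled out in the thread: species are labelled nodes, a "pattern" is any undirected graph on those nodes, and a nested hierarchy corresponds to a tree on them. Under that assumption there are 2^(n(n-1)/2) graphs, and by Cayley's formula n^(n-2) of them are trees. The function `p_tree_restricted` and its argument `plausible_non_trees` are hypothetical, added only to show how the answer moves once most graphs are given probability zero.

```python
from fractions import Fraction

def labelled_graphs(n):
    # Each of the n*(n-1)/2 possible undirected edges is present or absent.
    return 2 ** (n * (n - 1) // 2)

def labelled_trees(n):
    # Cayley's formula: n**(n-2) trees on n labelled nodes (1 tree for n < 2).
    return 1 if n < 2 else n ** (n - 2)

def p_tree_uniform(n):
    # Probability of a tree if every graph is taken to be equally likely.
    return Fraction(labelled_trees(n), labelled_graphs(n))

def p_tree_restricted(n, plausible_non_trees):
    # Alternative: uniform over the trees plus a chosen number of non-tree
    # graphs judged possible; every other graph gets probability zero.
    trees = labelled_trees(n)
    return Fraction(trees, trees + plausible_non_trees)

for n in (4, 6, 10):
    print(n, float(p_tree_uniform(n)), float(p_tree_restricted(n, 1)))
```

With equal weight on every graph the share of trees shrinks very quickly as n grows. With most graphs excluded it need not shrink at all, so the apparent rareness of trees depends on the prior and is not a fact about the data.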
What are your criteria for considering some graphs to have zero probability?
I'll bet some graphs are literally impossible. The graphs are produced
by an algorithm (to be accurate one or another favored heuristic) that
tries to fit the genomic data into a pattern. Regardless of whether
there are any limitations or preferences on the model of creation --
on "God" -- can phylogenetic analysis really ever produce a graph
including a loop? (But I already gave a weaker form of this argument
in my previous response, below, so perhaps I have not understood your
question.)
Yes, it can, depending on the algorithm used. Now it's true that all the
most commonly used algorithms will always produce a tree (often a
single, fully bifurcating one) regardless of data input, but there are
others that will produce loops or unconnected points. See, for example,
John Alroy's program CTA. And there are of course loops in reality --
hybrid species -- for which there are programs. And there is also
Splitstree, which produces a big set of loops. Trust me. Programs are
written to fit the data, not the other way around. If the data suggested
some weird graph, there would be a program to analyze it.

I think we've been a little imprecise about nomenclature. I assume the
graphs we have in mind are in fact directed graphs -- they don't
merely say that A and B seem to be descended one from the other, but
specifically in which direction.

No. That's not necessary. Phylogenies are directed graphs. But I was considering the more general case.

When I said "loop" I had in mind a
circle of inheritance, as opposed to a set of edges like A->B->Z and
A->C->Z (which I've heard called "articulation". Is that a usual word
for it?)

Is it possible you mean "reticulation"? But anyway, that's exactly what I thought you meant by "loop".

which could arise from hybridization. A devious creator could
I suppose create data that mildly suggests a circle: the junk DNA of B
looks like it descended from A in being substantially like a mutated
and partly broken-up and re-arranged version of A, C looks like a
descendant of B, ... and A looks like a descendant of Z. A->B->C->...->A.
Does John Alroy's program cater to this? Would it ever produce the
result A->B->A?

No. It produces, if I recall, undirected graphs (i.e. unrooted trees).
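
The two senses of "loop" in this exchange are easy to separate once the graph is treated as directed. The following is a generic depth-first-search sketch, not a description of how CTA or any other phylogenetics program works. The function name `has_directed_cycle` and the edge lists are made up for illustration. A reticulation such as A->B->Z plus A->C->Z has no directed cycle, whereas a circle of inheritance such as A->B->C->A does.

```python
def has_directed_cycle(edges):
    """Return True if the directed graph given as (parent, child) pairs
    contains a cycle, i.e. a circle of inheritance."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:
                return True  # edge back to an ancestor on the current path
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in graph)

reticulation = [("A", "B"), ("B", "Z"), ("A", "C"), ("C", "Z")]
circle = [("A", "B"), ("B", "C"), ("C", "A")]
print(has_directed_cycle(reticulation))
print(has_directed_cycle(circle))
```

An undirected graph cannot express this distinction at all, which is consistent with the reply that a program producing unrooted trees would never report A->B->A.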

I have an alternative (rather abstract, tedious to try to write down)
argument in mind involving consideration of a multiplicity of binning
schemes that we might devise, of which your graph idea would be only a
non-distinguished example. But it may not be of interest, especially
if the answer above is satisfactory.
Sadly, it isn't.

My abstract argument comes down in effect to saying, why would John
Astor's program be so special?

Alroy. I have no idea what you meant by that.

Let's imagine that my creationist adversary (C.A., whom you are ably
representing) won't buy any of my logical or intuitive attempts to
argue that some (directed) graphs (nearly) cannot occur as
representations of the genomic data.

Hey, how come I have to be the creationist?

Now, for any particular creation-scheme
(G), some kinds of graphs are quite likely and others are not.
However, when I try to say that certain graphs -- tangled loops for
example -- are unlikely from any G whatever, C.A. raises the objection
that I'm imposing a limitation on G. After all, for all we know G's
method involves intentionally designing the junk DNA just so as to
create a tangled loop in the graph. (We're assuming for the sake of
argument that such is mathematically possible.)

I don't understand what a tangled loop is, unless you mean to imply a directed graph with a circle in it, A>B>C>A, for example. It seems to me that would be possible, as long as different parts of the genome were compared for each piece of the loop.

Very well. From the C.A. perspective God or whatever chooses a
separate-creation-scheme G and there is some probability distribution
(unknown to you, me, or C.A.) across all possible choices for G.
Depending on which G is chosen, the various bins (=choices of graph)
that CTA can produce are more or less likely to occur. In other words,
the probability distribution on the bins of CTA is a projection from
the probability distribution of G's. Is that probability distribution
approximately uniform?

The algorithm CTA is man-made, e.g. by John Alroy. While it may be a
natural-seeming way to arrange the data from the perspective of
evolutionary biologists with their typical interests, in the sense
that that's an arbitrary interest, CTA is an arbitrary method of
arrangement. Instead of CTA, we could use some different binning
algorithm including algorithms which make no pretense of doing a
similar thing. Anything that computes some bin number based on the
entirety of the genomic data (perhaps providing that it tends to
compute about the same bin number from a substantial subset of the
genomic data) will do.

Let me stop you here. I would contend that there are methods of determining which binning methods fit the data better. Nested-hierarchy data demand to be dealt with by a nested-hierarchy algorithm, for example.

Each such algorithm defines a projection from
the probability distribution of G onto a set of numbered bins. Since
the bins are man-made and arbitrary, no matter what the probability
distribution on G may be, it can't tend to be uniform when projected
by an arbitrary algorithm. For one particular algorithm, by chance, the
probability distribution for the various bins may be uniform, but not
in general across algorithms. (For example two algorithms could differ
mainly in that one of them collapses half the bins of the other into a
single bin. The probability distribution of the bins cannot
simultaneously be uniform for both algorithms.) Therefore, it would be
far-fetched to suppose that the bins of some particular binning scheme
such as CTA would have a uniform distribution.
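
The parenthetical example about two binning algorithms can be checked directly. In this sketch the four schemes, their probabilities, and the two binnings `fine` and `coarse` are all hypothetical. The distribution over schemes is deliberately chosen so that the fine binning comes out uniform, and the coarse binning differs only in merging half of the fine bins into one.

```python
from collections import defaultdict
from fractions import Fraction

def project(dist, binning):
    """Push a distribution over schemes onto bins:
    P(bin) is the sum of P(scheme) over the schemes mapped to that bin."""
    out = defaultdict(Fraction)
    for scheme, p in dist.items():
        out[binning(scheme)] += p
    return dict(out)

def is_uniform(dist):
    # Uniform means every bin carries the same probability.
    return len(set(dist.values())) == 1

# Hypothetical distribution over four creation schemes.
schemes = {g: Fraction(1, 4) for g in ("g1", "g2", "g3", "g4")}

fine = {"g1": 0, "g2": 1, "g3": 2, "g4": 3}    # one bin per scheme
coarse = {"g1": 0, "g2": 1, "g3": 2, "g4": 2}  # merges two of the four bins

fine_dist = project(schemes, fine.get)
coarse_dist = project(schemes, coarse.get)
print(fine_dist, is_uniform(fine_dist))
print(coarse_dist, is_uniform(coarse_dist))
```

Whatever distribution is placed on the schemes, the merged bin always carries the sum of two fine-bin probabilities, so the two projections can only both be uniform if some fine bin has probability zero, which is itself non-uniform.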

You have lost me entirely. Who are you arguing with again? About what?

Only a few of these patterns would be trees,
which of course would correspond to nested hierarchies.
(The vast majority consist of a bunch of random edges here and there
including tangles and loops, and these correspond, to put it mildly,
to no obvious kind of God-scheme so my guess is that they have
probability about zero. On the other hand, a very obvious category of
God-schemes wherein God tosses completely different random DNA into
every different species, corresponds to the empty graph. Hence to
suppose equal probabilities for all the different graphs is pretty
odd. Recall my baseball analogy.)
The problem with this is that we have no basis on which to say that a
god-scheme must be obvious, or indeed fit any parameters we choose to
impose on it. The whole idea comes from Sean's demand that we place no
constraints on god.
I don't agree with "no basis". I'm perfectly happy to reject such a demand
and so should be any thinking person. For example if God is
operationally defined as the "intelligent designer" then I have a
basis to prefer a God akin to intelligences with which I am familiar.
Sean or anyone is free to argue with my choices but to say no concept
of God is more reasonable than any other is tantamount to ignoring the
absurdity of having a uniform probability on the infinite and
irregular space of all possible schemes, motives, and methods a God
might have.
I personally have no problem with this. By constraining the
distribution, you are assuming some particular sort of god. But the only
way we can produce a well-characterized hypothesis, subject to
scientific testing, is to do something of this sort. You can't of course
test all possible sorts of gods, at least at one time.
The uniform distribution is often used to represent ignorance of the true
distribution. But it isn't really, is it?
I agree with that too. Even when the "uniform distribution" is
perfectly well-defined, it is often nonsensical to invoke it.
If so, then the take-home lesson is that we can't estimate the
likelihood of (in your terminology) NS given Hg. This prevents any sort
of science being done if the potential explanatory universe includes Hg.
Would you agree?
That may depend on whether you mean Hg="God did it" as I originally wrote
or Hg="separate creation" as you corrected. The latter is the
interesting question so I'll assume that. Relating to our discussion,
my answer is that it depends on how we define science. According to
the myth that the standard of science is "falsification" in so black
and white a sense that no one could doubt or argue, then I would agree
that Hg isn't subject to it, but perhaps not much is. According to my
preferred view of science though, Hg is potential grist for the
science mill. As I said at the outset, I like your argument that the
observed data (indicating nested hierarchy) is characteristic of Hn
and hugely non-characteristic of Hg -- i.e. by the "likelihood
principle" it is enormous evidence against Hg. One could go into all
sorts of detailed arguments about how "non-characteristic". All I say
is that you can't make those arguments with mere mathematics; they
will have an element of subjectivity. That's life, and science.
Naive falsificationism aside, it seems to me the answer depends on how
we define Hg. "Separate creation" by itself is not a testable hypothesis
unless we have some kind of idea of what we would/would not observe if
the hypothesis were true.
or "unlikely to observe"?
It's so much easier to write in terms of yes/no. Please assume in the
future that there is always an implied confidence interval around
everything.

How would we get such an idea? If you suggest
analogies with human design, creationists will accept only those
features of human design they see in nature, taking that as proof of
separate creation, and reject any features they don't see, because "god
is not limited by human capabilities".
If you want to convict your adversary, you really have to switch to
mathematics.
I doubt anything will help, since he is his own sole judge, and the
verdict is fixed. But what did you have in mind?

Sorry -- I meant stop doing biology. Mathematicians, dealing with
mathematics, are more pliable than creationists or lawyers. If the
Riemann hypothesis happens to be wrong and you can provide a
legitimate counter-example, the world of mathematics will warmly
accept correction. It may be a less frustrating culture in that way
than is honing your debating skills here.

Too high a price. Biology is much more fun than mathematics, and Sean really doesn't intrude much into that. Failing a creationist takeover of the government, that is.

A little earlier I was listening to an interview of John Maynard Smith
(by Robert Wright). Smith remembered the experience of shedding his
religion. He explained that it was a liberating occurrence because
before it there were times when he felt he could not allow his
thinking to go too far in a direction.

