Re: Book-able view of ID as speculative science
- From: "topmind" <topmind@xxxxxxxxxxxxxxxx>
- Date: 26 Dec 2005 17:32:41 -0800
Lilith (Deanne Taylor) wrote:
> There are two good books on recognizing and classifying patterns --
> there are others, but I think two good ones are a place to start; one
> is called "Pattern Recognition" by Theodoridis and Koutroumbas,
> published by Academic Press. The other is "Pattern Classification" by
> Duda, Hart, and Stork published by Wiley Interscience. I'll call the
> former PR, the latter PC.
>
> PR's chapters include classifiers based on Bayes decision theory,
> linear classifiers, nonlinear classifiers, feature selection (a hot
> topic), feature generation, several chapters on clustering. PC goes
> into the same kinds of topics but does go a bit into genetic
> programming, machine learning, and a few exotic methods. If you're
> interested in how pattern classification or pattern recognition is
> done, there are plenty of these kinds of works out there in the
> engineering and image processing disciplines.
>
> That said, there are many different methods used to do searching for
> patterns. In biology, "function" (however you wish to define it here)
> is important in context. Many methods applied to the genome are
> supervised methods, which search for features with some kind of
> knowledge of the thing they are looking for. This knowledge can be scant,
> or it can be a calculated probability of how well a sequence
> matches a known profile. There are also unsupervised models that
> look for patterns in data without having a priori knowledge of the
> thing they are looking for, though there is knowledge inherent in the
> parameters of the model. You basically either have to know what you're
> looking for (albeit loosely) in supervised models, or have a strict
> method of searching for something and have a good definition of what
> constitutes a "pattern" so you can evaluate your results (unsupervised
> method).
This seems to imply that they are looking for something related to
biology, not just "patterns" per se. They have in mind up front what
they are looking for.
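
To make "how well a sequence matches a known profile" concrete, here is a minimal sketch of scoring against a profile. The profile numbers, the uniform background, and the function names are all made up for illustration; real gene finders use far richer models than this.

```python
import math

# Hypothetical 4-position profile: per-position base probabilities.
# The numbers are invented for illustration, not taken from any real motif.
PROFILE = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumption: all four bases equally likely by chance


def log_odds(window):
    """Log-odds score of one window against PROFILE (higher = better match)."""
    return sum(
        math.log2(column[base] / BACKGROUND)
        for column, base in zip(PROFILE, window)
    )


def best_match(seq):
    """Return (position, score) of the best-scoring window in seq."""
    width = len(PROFILE)
    scored = [
        (log_odds(seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    ]
    score, pos = max(scored)
    return pos, score


if __name__ == "__main__":
    print(best_match("CCCCATGACCCC"))
```

The point is that the profile has to exist before the search starts. The score only says how close a window is to something already described.
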
> The biggest sandtrap is when you throw any random sequence
> into a bunch of unsupervised methods. You'll always get something out
> that looks interesting, but interpreting it so that it means something
> is another matter.
>
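
A quick way to see the "you'll always get something out" problem, as a sketch only: generate a purely random sequence and ask for its most frequent 4-letter word. The seed, the length, and the function name are arbitrary choices for the example.

```python
import random
from collections import Counter


def top_kmer(seq, k=4):
    """Return (word, count) for the most frequent k-letter word in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common(1)[0]


rng = random.Random(2005)  # arbitrary seed so the run is repeatable
noise = "".join(rng.choice("ACGT") for _ in range(10_000))

word, count = top_kmer(noise)
# Average count per word if all 256 words were equally common.
expected = (len(noise) - 4 + 1) / 4 ** 4

print(word, count, round(expected, 1))
```

Some word always comes out on top and always beats the average, even though the input is noise. Deciding whether that "winner" means anything is the hard part.
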
> That said, I'm surprised "ID" people haven't gone into the human
> genome and thrown every pattern matching algorithm they could against
> it, tried to find some random signature that is as likely as any other,
> then been obscure about it and insisted it indicates special design for some
> theological reason unconnected to the actual genomics. It might sound
> like fruitcake on a plate, but it can't be any worse than the whole
> irreducible complexity argument.
>
> As far as supervised methods go, there are many successful ones.
> Programs like GenScan, for instance, apply a Bayesian method to gene
> prediction, using pre-existing knowledge of gene structure to predict
> whether or not a span of DNA is likely to contain a certain feature of
> a gene. There are programs like RepeatMasker, which give the likelihood
> that a gene sequence contains a signature of a retroelement (like
> viral sequences). There are many other algorithms that do pattern
> searching/matching on known characteristic signatures. There are
> several other supervised methods that try to find characteristics based
> on structures or features that are not well-defined sequence features,
> like helix searching, protein motif searching, etc.
>
> See for example papers in the list here:
>
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Display&dopt=pubmed_pubmed&from_uid=15852508
>
>
> In unsupervised methods, the model assumes no knowledge a priori and
> goes out "mining" for interesting results. Those are the most difficult
> because the information isn't very valuable without biological context.
> There are some papers that are successful in showing some of these
> methods; here are some:
>
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Display&dopt=pubmed_pubmed&from_uid=14571370
>
Look at the abstract from #3 here:
"Novel tools are needed for comprehensive comparisons of interspecies
characteristics of massive amounts of genomic sequences currently
available. An unsupervised neural network algorithm, Self-Organizing
Map (SOM), is an effective tool for clustering and visualizing
high-dimensional complex data on a single map. We modified the
conventional SOM, on the basis of batch-learning SOM, for genome
informatics making the learning process and resulting map independent
of the order of data input. We generated the SOMs for tri- and
tetranucleotide frequencies in 10- and 100-kb sequence fragments from
38 eukaryotes for which almost complete genome sequences are available.
SOM recognized species-specific characteristics (key combinations of
oligonucleotide frequencies) in the genomic sequences, permitting
species-specific classification of the sequences without any
information regarding the species. We also generated the SOM for
tetranucleotide frequencies in 1-kb sequence fragments from the human
genome and found sequences for four functional categories (5' and 3'
UTRs, CDSs and introns) were classified primarily according to the
categories. Because the classification and visualization power is very
high, SOM is an efficient and powerful tool for extracting a wide range
of genome information."
(end quote)
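
For reference, the "tetranucleotide frequencies" the abstract mentions are just counts of 4-letter words, normalized per fragment. Below is a rough sketch of that input step only; the batch-learning SOM itself is not reproduced here. The function names and the Euclidean distance are my own choices for illustration, not taken from the paper.

```python
import math
from itertools import product

BASES = "ACGT"


def kmer_frequencies(fragment, k=4):
    """Return the frequency of every possible k-letter word in fragment.

    The result is a list of 4**k numbers in a fixed (alphabetical) order,
    so vectors from different fragments can be compared directly.
    """
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(fragment) - k + 1):
        word = fragment[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


def distance(u, v):
    """Euclidean distance between two frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    a = kmer_frequencies("ACGTACGTACGTACGT")
    b = kmer_frequencies("AAAAAAAAAAAAAAAA")
    print(len(a), distance(a, a), distance(a, b))
```

Each fragment becomes a list of 256 numbers, and everything downstream works on how near or far those lists are from each other.
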
They are basically matching similarities and graphing the similarity as
a presentation. Something like this won't find, say, an image of the Mona
Lisa or a formula for geometric buildings hidden in there.
While it may identify some "patterns" per se, these tools are mostly tuned
for biological research purposes, not for finding intelligent encoding.
A good many of them seem devoted to pattern matching itself, not really
the nature of the patterns. For example, one may find 8 occurrences of a
given pattern, but say nothing about that pattern itself (other than
maybe matching it against a library of patterns). If the 8-repeat were an
image of the Mona Lisa, probably nobody would catch that because they are
not looking for such things. They are mostly looking for similarities within
sequences, among similar species, different species, etc.
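
A toy version of the kind of repeat counting I mean, as a sketch only. The function name and the test sequence are invented for the example.

```python
from collections import defaultdict


def repeated_kmers(seq, k, min_count=2):
    """Map each k-letter word seen at least min_count times to its positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {w: p for w, p in positions.items() if len(p) >= min_count}


if __name__ == "__main__":
    seq = "GATTACAC" * 8  # invented sequence with a built-in 8-fold repeat
    hits = repeated_kmers(seq, k=7)
    print(hits["GATTACA"])
```

It reports that "GATTACA" shows up 8 times and where, and that is all. Nothing in the output says what the repeated word is or whether it encodes anything.
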
They are essentially cross-reference engines. While such engines may be
useful in a search for intelligent patterns, they are a fairly narrow
technique and should not be considered the only or best approach.
>
> Enjoy --
> Deanne
-T-