Re: Spin calling John 100% identity (Human & Gekko)
- From: Greg Guarino <greg@xxxxxxxxxxxxx>
- Date: Fri, 19 Sep 2008 19:41:55 GMT
On Fri, 19 Sep 2008 07:06:55 -0700, John Harshman
<jharshman.diespamdie@xxxxxxxxxxx> wrote:
> spintronic wrote:
>> According to you this doesnt exist. 100% sequence identity in a human,
>> and a
>> First attempt. Enjoy.
> Fascinating. I don't believe it. The most likely explanation in my mind
> is that the gecko cDNA library was contaminated with some human
> sequences. Unfortunately there are no other sequences to use in checking
> the accuracy of that EST, i.e. no Gekko atpase sequences, and the study
> the sequence comes from is unpublished. But this is contrary to all
> results in all better-controlled studies.
> How did you find this? Did you just start Blasting random human
> sequences? However you did it, it was quite a feat.
I'd like someone to educate me a bit about what I'm reading here.
lcl|15957
Length=386
Score = 704 bits (381), Expect = 0.0
Identities = 381/381 (100%), Gaps = 0/381 (0%)
Strand=Plus/Plus
Query  4427  CACACCACTCAGTGCATGGCGGCCGATGGGCAGCCAACCCAAACCCGCGCCTTTCCTTGT  4486
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  6     CACACCACTCAGTGCATGGCGGCCGATGGGCAGCCAACCCAAACCCGCGCCTTTCCTTGT  65
I notice that the "Query" and "Sbjct" numbers differ from the number
at the end of the string by one less than the number of bases in the
string, i.e., 65-6=59, +1 (because we include both "end points") = 60,
which is how many bases I counted. I assume the numbers then denote
locations. Correct?
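For what it's worth, the arithmetic above can be checked in a few lines of Python. This is only a sketch: the function name aligned_length is made up for illustration, and it assumes the start and end numbers are 1-based, inclusive positions, which is what the numbers in the pasted alignment suggest.

```python
def aligned_length(start: int, end: int) -> int:
    """Count the positions from start to end, including both end points."""
    return abs(end - start) + 1


# One row of the alignment, copied from the output above.
row = "CACACCACTCAGTGCATGGCGGCCGATGGGCAGCCAACCCAAACCCGCGCCTTTCCTTGT"

print(len(row))                    # bases in the row as pasted
print(aligned_length(6, 65))       # span of the Sbjct coordinates
print(aligned_length(4427, 4486))  # span of the Query coordinates
```

If the three printed numbers agree, the coordinates are consistent with being locations within each sequence.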
Are the locations of these strings in the two organisms analogous in
some way? Or are they randomly selected from anywhere in the genomes?
I ask this because (and I was going to wait for the answer to the
question above before I let this wild speculation loose into the
world, but I've decided to be time-efficient and risk the
consequences) the question reminds me of a certain person's attempts
to filter out spam from his email.
The guy simply can't abide the idea that it is basically an impossible
task. So he is constantly fiddling with a number of filters for his
Thunderbird email, based on objectionable words: penis, Nigeria,
opportunity, etc. When he has the filters snag messages with those
words in the subject line, they work pretty well.
But of course, sometimes the objectionable words only appear in the
body of the message. So he has some more filters that look for the
words, and even some short phrases, in the body of the email.
Here's the problem: He gets a lot of emails with attachments, Word and
Excel files, pdfs, etc. As far as the Thunderbird filters are
concerned, the attachments are simply more ASCII characters. As it
turns out, when he implements his filters, nearly EVERY email with an
attachment gets sent to the trash bin. Even when he, at my suggestion,
made each "bad" word or phrase a separate filter, and turned all of
them off except one, the same thing happened. That one word or phrase
(and he tried different ones, one at a time) was contained in nearly
every attachment.
I then suggested that he further constrain his filters to act only on
small messages, under 40 KB if I remember correctly. That was a good
idea, as most of the attachments were larger than that. Yet
occasionally a smaller attachment would come in, and even some of
those got snagged.
Evidently the odds of finding the word penis in an essentially random
string of even a few tens of thousands of characters are very good. Now
I don't know the nuts and bolts of uuencoding, if that's still even
how it works. But I think there are 128 available characters. Even if
they only use (English) letters and numbers that's over 60. DNA uses
4.
So that's why I ask about the location. I'm in well over my head here,
but it doesn't seem difficult to imagine that you could find some
reasonably long identical strings of bases out of the billions that
make up a genome, if you weren't picky about where you find them and
what they are.
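A rough way to put numbers on this is to estimate how many identical strings of a given length two sequences would share purely by chance. The sketch below assumes every base is drawn independently and uniformly from the four letters, which real genomes certainly are not, so treat it as a back-of-the-envelope model only. The function name and the round figure of 3 billion bases per genome are illustrative assumptions, not values taken from the posts above.

```python
import math


def log10_expected_shared(alphabet_size: int, match_length: int,
                          len_a: float, len_b: float) -> float:
    """log10 of the expected number of (position in A, position in B) pairs
    at which two random sequences agree for match_length characters in a row.

    Assumes independent, uniformly distributed characters (a simplification).
    Logs are used so very small expectations do not underflow to zero.
    """
    pairs = math.log10(len_a) + math.log10(len_b)
    per_pair = -match_length * math.log10(alphabet_size)
    return pairs + per_pair


GENOME = 3e9  # assumed round figure for both genomes, for illustration only

for length in (16, 32, 381):
    exponent = log10_expected_shared(4, length, GENOME, GENOME)
    print(f"{length:4d} bases: about 10^{exponent:.1f} chance matches expected")
```

Under these assumptions, not being picky about location does buy shared strings of roughly 30 bases by chance, but the expected number of chance 381-base matches comes out vanishingly small.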
Any biologists out there with a few moments to disabuse me of my
foolish notions?
Greg Guarino