Re: correlating a sample = ordered vector with a database of such vectors, prioritization of results, statistics
- From: Ray Koopman <koopman@xxxxxx>
- Date: Wed, 16 Jan 2008 00:36:50 -0800 (PST)
On Jan 15, 9:32 am, karpatov <jirivol...@xxxxxxxxx> wrote:
[...]
What I did:
- converted activities into (- log10) scale (not normalised)
Fine. I was going to suggest logging the data. The base doesn't
matter -- whatever you're most comfortable with. I can only
speculate as to what "normalised" might mean in this context.
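To illustrate why the base doesn't matter: changing the base of the logarithm only multiplies every value by a constant, and the Pearson correlation is unaffected by that kind of rescaling. Here is a minimal sketch in plain Python. The activity values are made up for illustration, and the `pearson` helper is my own, not anything from R.

```python
import math

def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical raw activities for two drugs over five cell lines
# (invented numbers, not data from this thread).
drug_a = [1e-6, 3e-7, 5e-5, 2e-8, 7e-6]
drug_b = [2e-6, 1e-7, 9e-5, 5e-8, 3e-6]

# -log10 scale, as in the original post
neg_log10_a = [-math.log10(v) for v in drug_a]
neg_log10_b = [-math.log10(v) for v in drug_b]

# Same data on a natural-log scale
neg_ln_a = [-math.log(v) for v in drug_a]
neg_ln_b = [-math.log(v) for v in drug_b]

r_log10 = pearson(neg_log10_a, neg_log10_b)
r_ln = pearson(neg_ln_a, neg_ln_b)

# The two correlations agree up to floating-point rounding.
print(r_log10, r_ln)
```

The same holds for Spearman and Kendall, since they depend only on ranks, and any log transform with a base greater than 1 preserves the ordering.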
- calculated correlation coefficient of my profiles and profiles in
database (tried Pearson, Spearman, Kendall)
You might also try some kind of intraclass correlation (ICC),
but if you're not familiar with ICCs, put this suggestion on hold
until other problems are settled.
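For readers who want to see what an ICC looks like, here is a rough sketch of one variant, the one-way random-effects ICC for a single measurement, treating two profiles as two "ratings" of each cell line. This is only one of several ICC forms and is my choice for illustration; nothing above says which form would suit these data. The function name and the numbers are hypothetical.

```python
def icc_oneway(x, y):
    """One-way random-effects ICC for k = 2 measurements per target.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the
    between-target mean square and MSW the within-target mean square.
    """
    n = len(x)
    k = 2
    grand_mean = (sum(x) + sum(y)) / (n * k)
    target_means = [(a + b) / 2 for a, b in zip(x, y)]
    ms_between = k * sum((m - grand_mean) ** 2 for m in target_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(x, y, target_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Invented profiles: the second is the first shifted by a constant.
profile = [1.0, 2.0, 3.0, 4.0]
shifted = [v + 3.0 for v in profile]

icc_same = icc_oneway(profile, profile)   # identical profiles
icc_shift = icc_oneway(profile, shifted)  # same shape, different level
print(icc_same, icc_shift)
```

Unlike Pearson's r, this ICC penalizes a difference in level between the two profiles, so a shifted copy of a profile no longer scores as a perfect match. Whether that is desirable depends on what "similar" should mean for the profiles.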
- calculated p-values - null hypothesis: correlation is 0
(cor.test in R)
As I explained in my December 4, 2007, response to your previous
question, the usual p-value calculations do not apply, because the
60 cell lines are fixed, as opposed to being randomly sampled from
a population. The parameter you want to estimate is what the
correlation over those 60 cell lines would be if you could measure
activity errorlessly. The problem is that any correlation you get
will be affected by measurement error. You need to adjust the
correlation for measurement error, and to estimate how accurate
that adjustment has been.
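One classical way to adjust a correlation for measurement error is Spearman's correction for attenuation: divide the observed correlation by the square root of the product of the two reliabilities. The sketch below assumes that the correlation between two replicates of the same drug can serve as that drug's reliability, and that the errors are unrelated to the true values and to each other. Those are my assumptions for illustration, not necessarily the adjustment that fits this design. All numbers and names are hypothetical.

```python
import math

def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correct_for_attenuation(r_obs, rel_x, rel_y):
    """Spearman's correction: r_obs / sqrt(rel_x * rel_y)."""
    if rel_x <= 0 or rel_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_obs / math.sqrt(rel_x * rel_y)

# Invented -log10 activities: two replicates each for drugs X and Y
# over six cell lines (the real data would have 60).
x1 = [5.1, 6.3, 4.8, 7.2, 5.9, 6.6]
x2 = [5.3, 6.0, 4.9, 7.0, 6.2, 6.4]
y1 = [4.9, 6.8, 5.2, 7.5, 5.5, 6.1]
y2 = [5.2, 6.5, 5.0, 7.1, 5.8, 6.3]

r_obs = pearson(x1, y1)   # observed correlation, single replicates
rel_x = pearson(x1, x2)   # replicate-to-replicate correlation for X
rel_y = pearson(y1, y2)   # replicate-to-replicate correlation for Y

r_corrected = correct_for_attenuation(r_obs, rel_x, rel_y)
print(r_obs, r_corrected)
```

Note that the corrected value can exceed 1 in magnitude when the reliabilities are poorly estimated, and that this sketch says nothing about how accurate the corrected value is, which is the second half of the problem.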
You do not say how the two replicates were obtained or used. Were
there two batches of each drug (or something analogous to batches),
with each batch being split into 60 samples, and each sample being
tested on a cell line? Or was there only one batch of each drug,
with each batch being split into 120 samples, and 2 samples being
tested on a cell line? Statistically, the difference between those
two examples is that the first one gives matched data and the second
does not. In either case, the best simple estimate of the correlation
will probably be obtained by using the average of the two measures
of activity, rather than by combining the correlations, but the
matched-unmatched distinction will be relevant when estimating the
accuracy of the correlation.
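The two ways of using the replicates can be sketched as follows: correlate the averaged activities, or compute the correlation for every replicate pairing and average those. The data and helper names are invented for illustration, and the sketch does not attempt the accuracy estimate, where the matched-unmatched distinction comes in.

```python
import math

def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_replicates(rep1, rep2):
    """Cell-line-by-cell-line mean of two replicate profiles."""
    return [(a + b) / 2 for a, b in zip(rep1, rep2)]

# Invented -log10 activities: two replicates each for drugs X and Y
# over six cell lines (the real data would have 60).
x1 = [5.1, 6.3, 4.8, 7.2, 5.9, 6.6]
x2 = [5.3, 6.0, 4.9, 7.0, 6.2, 6.4]
y1 = [4.9, 6.8, 5.2, 7.5, 5.5, 6.1]
y2 = [5.2, 6.5, 5.0, 7.1, 5.8, 6.3]

# Option 1: average the activities first, then correlate once.
r_of_means = pearson(average_replicates(x1, x2), average_replicates(y1, y2))

# Option 2: correlate each X replicate with each Y replicate, then
# average the four coefficients.
pairwise = [pearson(a, b) for a in (x1, x2) for b in (y1, y2)]
mean_of_rs = sum(pairwise) / len(pairwise)

print(r_of_means, mean_of_rs)
```

Averaging the activities first reduces the measurement error in each profile before the correlation is computed, which is the usual argument for preferring option 1 as the simple estimate.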
[...].