Re: correlating a sample = ordered vector with a database of such vectors, prioritization of results, statistics



On Jan 15, 9:32 am, karpatov <jirivol...@xxxxxxxxx> wrote:
[...]
What I did:
- converted activities into (-log10) scale (not normalised)

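Fine. I was going to suggest log-transforming the data. The base
doesn't matter -- changing it only rescales every value by the same
constant, which leaves correlations unchanged, so use whatever you're
most comfortable with. I can only speculate as to what "normalised"
might mean in this context.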
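
To illustrate why the base is immaterial, here is a minimal Python sketch (not R, and not part of the original exchange). The activity values are made up for illustration, and `pearson` is a small helper written here rather than a library function. It computes the Pearson correlation on -log10 and on -ln transformed values; the two agree up to floating-point rounding.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical raw activities for two compounds over the same five
# cell lines (illustrative numbers only, not from the real data).
act_a = [1e-6, 3e-7, 5e-5, 2e-8, 4e-6]
act_b = [2e-6, 1e-7, 9e-5, 5e-8, 1e-6]

# Same data, two different log bases.
r_log10 = pearson([-math.log10(v) for v in act_a],
                  [-math.log10(v) for v in act_b])
r_ln = pearson([-math.log(v) for v in act_a],
               [-math.log(v) for v in act_b])

print(r_log10, r_ln)  # the two values differ only by rounding error
```

The same holds for rank-based coefficients such as Spearman and Kendall, since a change of base does not alter the ordering of the values.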

- calculated correlation coefficient of my profiles and profiles in
database (tried Pearson, Spearman, Kendall)

You might also try some kind of intraclass correlation (ICC),
but if you're not familiar with ICCs, put this suggestion on hold
until other problems are settled.

- calculating p-values - null hypothesis correlation is 0
(cor.test in R)

As I explained in my December 4, 2007, response to your previous
question, the usual p-value calculations do not apply, because the
60 cell lines are fixed, as opposed to being randomly sampled from
a population. The parameter you want to estimate is what the
correlation over those 60 cell lines would be if you could measure
activity errorlessly. The problem is that any correlation you get
will be affected by measurement error. You need to adjust the
correlation for measurement error, and to estimate how accurate
that adjustment has been.
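
One classical way to make such an adjustment is Spearman's correction for attenuation, sketched below in Python. This is an assumption about what kind of adjustment could be used, not a method named in this thread. It assumes that measurement errors are independent between replicates and unrelated to the true activities, that the replicate-to-replicate correlation is a fair estimate of single-measurement reliability, and that the observed correlation was computed on the averages of two replicates. The function names and all numbers are hypothetical.

```python
import math

def spearman_brown(r_rep, k=2):
    """Reliability of the mean of k replicates, given the correlation
    r_rep between single replicates (Spearman-Brown formula)."""
    return k * r_rep / (1 + (k - 1) * r_rep)

def disattenuate(r_obs, rel_x, rel_y):
    """Spearman's correction: estimate of the error-free correlation
    from the observed one and the reliabilities of both variables."""
    return r_obs / math.sqrt(rel_x * rel_y)

# Hypothetical inputs (illustrative only):
r_obs = 0.60      # correlation between the two averaged profiles
r_rep_x = 0.80    # replicate-to-replicate correlation, profile x
r_rep_y = 0.70    # replicate-to-replicate correlation, profile y

rel_x = spearman_brown(r_rep_x)   # reliability of the 2-replicate mean
rel_y = spearman_brown(r_rep_y)
r_adj = disattenuate(r_obs, rel_x, rel_y)

print(rel_x, rel_y, r_adj)
```

The corrected value can exceed 1 when the reliabilities are poorly estimated, and the sketch says nothing about how accurate the adjusted correlation is; that second question still needs separate treatment.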

You do not say how the two replicates were obtained or used. Were
there two batches of each drug (or something analogous to batches),
with each batch being split into 60 samples, and each sample being
tested on one cell line? Or was there only one batch of each drug,
with each batch being split into 120 samples, and 2 samples being
tested on each cell line? Statistically, the difference between those
two examples is that the first one gives matched data and the second
does not. In either case, the best simple estimate of the correlation
will probably be obtained by using the average of the two measures
of activity, rather than by combining the correlations, but the
matched-unmatched distinction will be relevant when estimating the
accuracy of the correlation.
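
As a rough Python sketch of the two options (averaging the replicates first versus combining per-replicate correlations), assuming for illustration that both the sample and the database compound have two replicates over the same cell lines. All values are invented -log10 activities and `pearson` is a helper defined here.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical -log10 activities over six cell lines (illustrative only).
sample_rep1 = [5.1, 6.3, 4.8, 7.0, 5.5, 6.1]
sample_rep2 = [5.4, 6.0, 5.0, 6.7, 5.9, 6.4]
db_rep1 = [4.9, 6.5, 4.6, 6.8, 5.2, 6.0]
db_rep2 = [5.2, 6.1, 4.9, 7.1, 5.6, 5.8]

# Option 1: average the replicates per cell line, then correlate once.
sample_mean = [(a + b) / 2 for a, b in zip(sample_rep1, sample_rep2)]
db_mean = [(a + b) / 2 for a, b in zip(db_rep1, db_rep2)]
r_of_means = pearson(sample_mean, db_mean)

# Option 2: correlate replicate by replicate, then average the results.
r_mean_of_rs = (pearson(sample_rep1, db_rep1)
                + pearson(sample_rep2, db_rep2)) / 2

print(r_of_means, r_mean_of_rs)
```

Averaging first reduces the measurement error in each profile before the correlation is computed, which is the reason for preferring option 1. Note that option 2 as written pairs replicate 1 with replicate 1, which is only meaningful if the data are matched.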

[...]