Re: Crosscorrelation problems
- From: dave@xxxxxxxxxxx
- Date: 14 Apr 2006 07:51:12 -0700
Gordon Sande wrote:
On 2006-04-13 14:51:42 -0300, Robert Lundqvist <Robert.Lundqvist@xxxxxx> said:
In an analysis of some time series data, the plan was to do some ordinary
cross-correlation calculations. However, when I tried to confirm that I
did the right thing, I wanted to compare the results from using a function
in the software used (R, MacAnova, Minitab) with what I get from "manual"
calculations. It didn't work...
A short example in R code:
*********
a<-rnorm(5);b<-rnorm(5)
a;b
[1] 1.4429135 0.8470067 1.2263730 -1.8159190 -0.6997260
[1] -0.4227674 0.8602645 -0.6810602 -1.4858726 -0.7008563
#Calculation of ccf for a lag of max 4
cc<-ccf(a,b,lag.max=4,type="correlation")
cc
Autocorrelations of series 'X', by lag
    -4     -3     -2     -1      0      1      2      3      4
-0.056 -0.289 -0.232  0.199  0.618  0.568 -0.517 -0.280 -0.012
#Calculation of correlation between a and lagged b (lag 1)
cor(a[2:5],b[1:4])
[1] 0.6759322
**********
Besides not getting the same results, I also don't understand the ccf for
a large lag. With one variable lagged 4 steps and vectors of length 5
there should as far as I can see only be 2 pairs of observations. The
correlation would then be 1. Guess I am missing something really simple
here. Anyone who could explain what the ccf really does and where I have
been wrong??
Robert
There is a question of what "spectral window" you expect, as it
maps into an "auto/cross-covariance divisor". Sometimes you want the
divisor to be "N" and sometimes you want it to be "N-t" at lag t.
You need to answer a fairly technical question: whether you want
an unbiased point estimator of the covariance or a different estimator
of the spectra. This is a nice example of where preserving a joint
property of a function is not the same as asking for some property
of the points of the function.
It sounds like you need to ask a time series specialist, as an "ordinary"
covariance can be different things depending on what question you ask.
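The divisor question can be made concrete with the numbers posted above. The following is a minimal sketch, not R's actual implementation: manual_ccf is a hypothetical helper that removes the full-sample means, normalises by the full-sample variances, and divides the lagged cross-product sum either by N or by N-|k|. It assumes the ccf convention that lag k pairs a[t+k] with b[t].

```r
# Data copied from the example in the original question
a <- c(1.4429135, 0.8470067, 1.2263730, -1.8159190, -0.6997260)
b <- c(-0.4227674, 0.8602645, -0.6810602, -1.4858726, -0.7008563)

# Hypothetical helper: cross-correlation at lag k using full-sample
# means and variances, with a choice of covariance divisor.
manual_ccf <- function(x, y, k, divisor = c("N", "N-k")) {
  divisor <- match.arg(divisor)
  n <- length(x)
  dx <- x - mean(x)
  dy <- y - mean(y)
  if (k >= 0) {
    cross <- sum(dx[(1 + k):n] * dy[1:(n - k)])   # pairs x[t+k] with y[t]
  } else {
    cross <- sum(dx[1:(n + k)] * dy[(1 - k):n])   # pairs x[t] with y[t+|k|]
  }
  d <- if (divisor == "N") n else n - abs(k)
  (cross / d) / sqrt((sum(dx^2) / n) * (sum(dy^2) / n))
}

lags <- -4:4
manual_N  <- sapply(lags, function(k) manual_ccf(a, b, k, "N"))
manual_Nk <- sapply(lags, function(k) manual_ccf(a, b, k, "N-k"))
r_ccf <- ccf(a, b, lag.max = 4, type = "correlation", plot = FALSE)

# Side-by-side comparison of the built-in result and both divisors
print(round(cbind(lag = lags,
                  ccf = as.vector(r_ccf$acf),
                  divisor_N = manual_N,
                  divisor_N_minus_k = manual_Nk), 3))

# Pairwise correlation on the overlapping observations only (lag 1)
print(cor(a[2:5], b[1:4]))
```

With divisor N this reproduces the posted ccf values (about 0.568 at lag 1 and -0.012 at lag 4), whereas cor(a[2:5], b[1:4]) re-estimates the means and variances from the four overlapping pairs only and gives 0.676. At lag 4 only one pair overlaps, so its single cross-product is scaled by the full-sample quantities rather than forced to +/-1.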
In an attempt to help readers of this NG who inadvertently try to
use statistical tests designed for independent observations on data
that are dependent over time:
1. You can compute anything you want in any way you desire.
But to test the statistical significance of what you compute you
must check certain distributional assumptions (or use
non-parametric methods).
2. Please refer to http://www.met.rdg.ac.uk/cag/stats/corr.html for a
nice summary, in particular:
Suppose that X and Y are independent normal random variables. Then, in
the absence of temporal autocorrelation, the correlation coefficient,
r, between random samples of size n from X and Y has a probability
density function f(r) = (1 - r^2)^(0.5(n-4)) / B(0.5, 0.5(n-2)). The
distribution has mean zero and a variance of (n-1)^-1. However, the
distribution is affected by the autocorrelation in X and Y, which
increases the variance of the distribution and so gives rise to
spurious large correlations. This problem was recognised for time
series as early as 1926 by Yule in his presidential address to the
Royal Statistical Society.
In other words, if you have autocorrelation then standard tests of
reliability (significance) are merely descriptive and not inferential.
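A small simulation can illustrate the point. This is only a sketch under assumed settings: the sample size n = 50, the 2000 replications, and the AR(1) coefficient 0.8 are chosen here for illustration and are not taken from the page cited above. It first checks numerically that the quoted density integrates to one with variance 1/(n-1), then compares the spread of the sample correlation between independent white-noise series with that between independent autocorrelated series.

```r
set.seed(1)
n    <- 50     # sample size (assumed for illustration)
reps <- 2000   # number of simulated pairs (assumed)
phi  <- 0.8    # AR(1) coefficient (assumed)

# Density of r for independent normal samples without autocorrelation,
# as quoted above
f <- function(r) (1 - r^2)^(0.5 * (n - 4)) / beta(0.5, 0.5 * (n - 2))
total_mass  <- integrate(f, -1, 1)$value
var_density <- integrate(function(r) r^2 * f(r), -1, 1)$value

# Independent white-noise pairs: no autocorrelation in either series
r_white <- replicate(reps, cor(rnorm(n), rnorm(n)))

# Independent AR(1) pairs: each series is autocorrelated, but the two
# series are still generated independently of each other
sim_ar1 <- function() as.numeric(arima.sim(model = list(ar = phi), n = n))
r_ar1 <- replicate(reps, cor(sim_ar1(), sim_ar1()))

var_white <- var(r_white)
var_ar1   <- var(r_ar1)

print(c(total_mass  = total_mass,
        var_density = var_density,
        theory      = 1 / (n - 1),
        var_white   = var_white,
        var_ar1     = var_ar1))
```

The variance for the white-noise pairs should land near 1/(n-1), while the AR(1) pairs should show a clearly larger spread even though the two series in each pair are unrelated. That larger spread is what produces the spurious large correlations.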