Re: Weighted standard deviation




G Robin Edwards wrote:
I am investigating quite a large (by my standards!) data matrix. Actual
dimensions are 113 columns and up to 593 rows. Many columns contain
much smaller numbers of rows, but usually have at least 150. The data
are actually time series, and column 1 is the time indicator.

The columns contain data that are all supposed to be measures of a
fundamental quantity, and have the property that they are consistent in
the way they map to the fundamental property. However, they vary
spectacularly in both location and variance, by orders of magnitude in
many cases.

I wish to generate a summary of the columns in the form of a mean, and
also variance (standard deviation rather, since it is in convenient
units).

Clearly, naive averaging across the columns is useless, so I have
attempted to homogenise the data by the simple and straightforward
method of standardising every column to mean zero, variance 1, before
averaging or other treatment.

This technique appears to work very well in that I obtain what seem to
be meaningful summaries for each row (year), and I am often able to make
some intriguing inferences.

However, the original data columns also have weights attached to them -
though how these have been chosen is not clear. The weights are either
1, 0.75, 0.5 or 0.33. I can readily calculate the weighted averages of
the (standardised) values, but at the moment I haven't worked out how to
estimate standard deviations for each row, which must also depend to
some degree on the given weights. The number of columns having data
values varies from one row to another. If I ignore the weights I can
estimate a "confidence interval" for the mean and the standard
deviation, based simply on the number of columns containing actual data,
but what should I do to include the given weights in my estimate?

Perhaps I'm being stupid and there's no problem. However, I'd like to
be reassured on that.

Robin
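
The row-wise calculation described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
Robin's actual procedure: the weights are treated as "reliability"
weights (one common convention among several), the standardised values
in a row are taken to be independent with a common variance, missing
entries are represented by None, and the function names are made up for
the example. The effective sample size is Kish's approximation.

```python
import math

def standardise(column):
    """Scale one column to mean 0, SD 1, skipping missing values (None)."""
    present = [x for x in column if x is not None]
    mean = sum(present) / len(present)
    var = sum((x - mean) ** 2 for x in present) / (len(present) - 1)
    sd = math.sqrt(var)
    return [None if x is None else (x - mean) / sd for x in column]

def weighted_row_summary(values, weights):
    """Weighted mean, weighted SD, effective n and SE of the mean for one row.

    Missing values (None) are dropped together with their weights.
    Needs at least two non-missing values in the row.
    """
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    sw = sum(w for _, w in pairs)        # sum of weights
    sw2 = sum(w * w for _, w in pairs)   # sum of squared weights
    mean = sum(w * v for v, w in pairs) / sw
    # Assumed convention: reliability weights, so the bias-corrected
    # denominator is sw - sw2/sw (equals n - 1 when all weights are 1).
    var = sum(w * (v - mean) ** 2 for v, w in pairs) / (sw - sw2 / sw)
    n_eff = sw * sw / sw2                # Kish effective sample size
    se = math.sqrt(var / n_eff)          # standard error of the weighted mean
    return mean, math.sqrt(var), n_eff, se

# One row of already standardised values; the third column has no data.
row = [1.0, 2.0, None, 4.0]
col_weights = [1.0, 1.0, 0.5, 1.0]
mean, sd, n_eff, se = weighted_row_summary(row, col_weights)
```

With all weights equal to 1 this reduces to the ordinary mean, the
usual n - 1 standard deviation, and n_eff equal to the number of columns
with data, so the unweighted "confidence interval" is recovered as a
special case. An approximate interval would then be mean +/- t * se,
with the degrees of freedom based on n_eff rather than the raw count.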

The weights might be sampling weights that reflect how the sample was
drawn. For example, a stratified random sample assigns a weight of N/n
to each observation, where N is the size of the stratum in the
population and n is the sample size for that stratum. Specialized
software such as SUDAAN, and some routines in SAS and Stata, takes
sample weights into account; I am not sure about SPSS.
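
To make the N/n idea concrete, here is a small sketch. The strata, their
population sizes, and the sampled values are all invented for
illustration, and the function names are hypothetical; nothing here is
taken from the data in the question, and it does not reproduce what
SUDAAN, SAS or Stata do internally.

```python
def stratum_weight(population_size, sample_size):
    """Sampling weight N/n for every observation drawn from one stratum."""
    return population_size / sample_size

def weighted_population_mean(strata):
    """Estimate the population mean from a stratified sample.

    `strata` maps a stratum label to (population size N, sampled values).
    """
    total_w = 0.0
    total_wx = 0.0
    for population_size, sample in strata.values():
        w = stratum_weight(population_size, len(sample))
        total_w += w * len(sample)   # each observation carries weight w
        total_wx += w * sum(sample)
    return total_wx / total_w

# Hypothetical strata: label -> (population size N, sampled values)
strata = {
    "A": (1000, [10.0, 12.0]),
    "B": (3000, [20.0, 22.0, 24.0]),
}
estimate = weighted_population_mean(strata)
```

Each stratum mean ends up weighted by its share of the population, so
the larger stratum B dominates the estimate even though it contributes
only three observations.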
