About rank invariant tests on ordinal data
- From: valter.sundh@xxxxxxxxxxxx
- Date: 15 Sep 2005 08:31:36 -0700
I have trouble understanding the relevance of the concept "rank
invariant test" in the situation of analyzing ordered categorical data.
Take as an example this table for analyzing the association between
two variables measured on a 7-point ordinal scale:
        +------------------------------------+-----+
        !    1    2    3    4    5    6    7 ! TOT !
--------+------------------------------------+-----+
      1 !    1    6   16    6    8    4    0 !  41 !
      2 !    9   18   35   10   21   10    0 ! 103 !
      3 !   20   27   54   17   51    9    5 ! 183 !
      4 !    6   10   14   11   21    6    2 !  70 !
      5 !    6    7   12    5   13    4    0 !  47 !
      6 !    0    4    9    1    6    4    1 !  25 !
      7 !    0    1    3    2    6    1    2 !  15 !
--------+------------------------------------+-----+
  Total !   42   73  143   52  126   38   10 ! 484 !
--------+------------------------------------+-----+
                        Value    Approx. T    Approx. Sig.
----------------------------------------------------------
Spearman Correlation    0.0858     1.890        0.0594
Pearson's R             0.1035     2.285        0.0228
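
As a rough check, both coefficients can be recomputed from the cell counts alone. The sketch below rests on three assumptions: that the table is transcribed correctly, that Pearson's R was computed on the codes 1 to 7, and that the approximate T is r*sqrt(n-2)/sqrt(1-r^2). The function names are illustrative and are not taken from any statistics package.

```python
from math import sqrt

# Cell counts transcribed from the table above; rows and columns are the
# ordered categories 1..7 of the two variables.
TABLE = [
    [1, 6, 16, 6, 8, 4, 0],
    [9, 18, 35, 10, 21, 10, 0],
    [20, 27, 54, 17, 51, 9, 5],
    [6, 10, 14, 11, 21, 6, 2],
    [6, 7, 12, 5, 13, 4, 0],
    [0, 4, 9, 1, 6, 4, 1],
    [0, 1, 3, 2, 6, 1, 2],
]


def pearson_from_table(table, row_scores, col_scores):
    """Pearson correlation of the scores, weighting each cell by its count."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    mx = sum(s * t for s, t in zip(row_scores, row_tot)) / n
    my = sum(s * t for s, t in zip(col_scores, col_tot)) / n
    sxx = sum(t * (s - mx) ** 2 for s, t in zip(row_scores, row_tot))
    syy = sum(t * (s - my) ** 2 for s, t in zip(col_scores, col_tot))
    sxy = sum(
        count * (rs - mx) * (cs - my)
        for rs, row in zip(row_scores, table)
        for cs, count in zip(col_scores, row)
    )
    return sxy / sqrt(sxx * syy)


def midranks(totals):
    """Mean rank of each category, given the category totals in order."""
    ranks, seen = [], 0
    for t in totals:
        ranks.append(seen + (t + 1) / 2)
        seen += t
    return ranks


def approx_t(r, n):
    """Assumed form of the approximate T: r * sqrt(n - 2) / sqrt(1 - r^2)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


n = sum(sum(row) for row in TABLE)
row_totals = [sum(row) for row in TABLE]
col_totals = [sum(col) for col in zip(*TABLE)]
codes = list(range(1, 8))

pearson_r = pearson_from_table(TABLE, codes, codes)
spearman_rho = pearson_from_table(TABLE, midranks(row_totals), midranks(col_totals))

print("Pearson  r   = %.4f, T = %.3f" % (pearson_r, approx_t(pearson_r, n)))
print("Spearman rho = %.4f, T = %.3f" % (spearman_rho, approx_t(spearman_rho, n)))
```

Under those assumptions the output should agree with the reported values to the printed precision. The only difference between the two calls is which scores are attached to the seven categories.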
The pressing problem now is: Do we dare to report that the association
is significant at the 5% level or not?
What makes us hesitate is that we have seen and heard arguments that the
most proper method to use in this situation is the Spearman test
because it is rank invariant.
But to me the Pearson test is equally rank invariant.
Maybe that is because I don't understand the proper definition of rank
invariance, but as I understand it, rank invariance means that no
matter what values are assigned to the categories, the outcome of a rank
invariant test is always the same because the test will only use the
information about how the data is ordered. For example, if we choose to
code the categories as
1,2,3,4,5,6,7 or 1,2,9,16,25,36,49 or 1,2,3,6,7,14,15 or anything else
that does not violate the ordered nature of the data, the outcome of a
rank invariant test will always be the same.
According to this definition the Spearman test is obviously rank
invariant. The reason is that the ordinal levels in the test are
recoded to the mean rank for each category. After this recoding the
test is performed exactly like the Pearson test. This ensures that the
result will always be the same regardless of what values have been assigned
to the categories beforehand, so it is rank invariant.
This may be described as: The Spearman test is rank invariant because it
makes up its own values and discards any information in previously
assigned values.
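
The mean-rank recoding described above can be sketched as follows. The ten paired observations are hypothetical toy data, not the table from this post, and the helper names are illustrative. The three codings are the ones listed earlier. The sketch recodes each variable to mean ranks and then applies the ordinary Pearson formula, alongside Pearson computed directly on the assigned codes for comparison.

```python
from math import sqrt


def pearson(x, y):
    """Ordinary Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def midranks(values):
    """Replace each value by the mean rank of its tie group (1-based)."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


# Three order-preserving codings of the same seven categories.
CODINGS = {
    "linear": [1, 2, 3, 4, 5, 6, 7],
    "squares": [1, 2, 9, 16, 25, 36, 49],
    "irregular": [1, 2, 3, 6, 7, 14, 15],
}

# Hypothetical toy data: category positions (0 = lowest) for ten subjects.
x_pos = [0, 1, 2, 2, 3, 4, 5, 6, 1, 3]
y_pos = [1, 0, 2, 3, 2, 6, 4, 5, 2, 4]

raw_r, rank_rho, recoded = {}, {}, {}
for name, codes in CODINGS.items():
    x = [codes[p] for p in x_pos]
    y = [codes[p] for p in y_pos]
    raw_r[name] = pearson(x, y)                 # uses the assigned codes
    recoded[name] = (midranks(x), midranks(y))  # discards the assigned codes
    rank_rho[name] = pearson(*recoded[name])    # Spearman = Pearson on mean ranks

for name in CODINGS:
    print(name, "Pearson on codes:", raw_r[name], "Spearman:", rank_rho[name])
```

The recoded mean ranks are identical for all three codings, so the Spearman value cannot differ between them. Pearson on the assigned codes generally does.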
On the other hand, the Pearson test does not make up any values; it
has to work on pre-assigned values defined by the user. But what I
can't understand is this: Why isn't it so that at the exact moment when
we have decided to use a certain recoding procedure, for example to
recode the first ordered category to the numerical value 1, the second
category to 2 and so on, we also have defined an invariant test? Or
maybe we have?
This new test does have the property of a rank invariant test as I
described it above - the result will always be the same regardless of what
values have been assigned to the categories beforehand, because the
recoding procedure is now as fixed as the one used in the Spearman
test.
Logically, a test procedure consisting of two parts, recoding_step +
Pearson_test, must be a rank invariant test if the recoding_step meets
the requirement that it is a fixed deterministic algorithm that retains
the order of the data, and that it works without any further knowledge
of what values, if any, have been preassigned to the categories.
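
A minimal sketch of such a two-part procedure, on hypothetical toy data, might look like this. The names recoding_step and recode_then_pearson are illustrative, and the recoding shown (first category to 1, second to 2, and so on) is only one possible fixed rule. The step needs the order of the categories but not their preassigned values, which is why a scale of letters works as well as a numeric one.

```python
from math import sqrt


def pearson(x, y):
    """Ordinary Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def recoding_step(values, scale):
    """Map each value to the 1-based position of its category on the scale."""
    position = {label: i for i, label in enumerate(scale, start=1)}
    return [position[v] for v in values]


def recode_then_pearson(x, y, scale):
    """The two-part procedure: fixed recoding followed by Pearson."""
    return pearson(recoding_step(x, scale), recoding_step(y, scale))


# The same seven ordered categories under four different labelings.
SCALES = [
    [1, 2, 3, 4, 5, 6, 7],
    [1, 2, 9, 16, 25, 36, 49],
    [1, 2, 3, 6, 7, 14, 15],
    list("ABCDEFG"),  # no numeric values preassigned at all
]

# Hypothetical toy data: category positions (0 = lowest) for ten subjects.
x_pos = [0, 1, 2, 2, 3, 4, 5, 6, 1, 3]
y_pos = [1, 0, 2, 3, 2, 6, 4, 5, 2, 4]

results = []
for scale in SCALES:
    x = [scale[p] for p in x_pos]
    y = [scale[p] for p in y_pos]
    results.append(recode_then_pearson(x, y, scale))

print(results)
```

All four entries of results come out equal, since the recoding step replaces whatever labels were supplied with the same positions 1 to 7 before Pearson is applied.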
But maybe I don't understand the proper definition of rank invariant
test.
So please could anyone clarify this matter!
Either that I have an incorrect definition of rank invariant test,
or that there is a logical flaw in the common use of the rank
invariance principle as an argument for putting the Spearman test above
the Pearson test when testing association in ordinal data,
or that there is a logical flaw somewhere in my reasoning above.
If there is no flaw in my reasoning, and my understanding of the
definition of rank invariant test is basically correct, the obvious
conclusion is that the concept "rank invariant test" does not give us a
relevant criterion for choosing between Pearson and Spearman in the
example above, and of course this applies to all situations when we
choose between rank tests and non-rank tests on categorical ordinal
data.
Valter Sundh