Re: Are graded clinical signs more reliable than dichotomized?



No. There will probably be *more* agreement with *fewer* categories.
The fewer the categories, the less opportunity for disagreement.

If you ask two persons to describe different colours I think you are
right, but what about shades of gray? In that situation I am confident
that three levels (black/gray/white) will give better agreement than
two (black/white). And I guess the agreement will be better the more
grades you have.
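
Part of the answer may depend on how agreement is scored. A minimal
sketch, assuming a 3-grade scale coded 0/1/2 and invented ratings from
two examiners (the data and the function name `kappa` are illustrative
only, not from any study), compares Cohen's kappa with a linearly
weighted kappa that gives partial credit to near misses:

```python
from collections import Counter

def kappa(r1, r2, k, weighted=False):
    """Cohen's kappa for two raters using grades 0..k-1.

    With weighted=True, linear weights give partial credit when the
    two grades differ by less than the full range of the scale.
    """
    n = len(r1)

    def w(i, j):
        if weighted:
            return 1.0 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    joint = Counter(zip(r1, r2))      # how often each pair of grades occurs
    m1, m2 = Counter(r1), Counter(r2) # each rater's own grade frequencies

    # Observed (weighted) agreement.
    po = sum(w(i, j) * c / n for (i, j), c in joint.items())
    # Agreement expected by chance from the two raters' marginals.
    pe = sum(w(i, j) * m1[i] * m2[j] / n**2
             for i in range(k) for j in range(k))
    return (po - pe) / (1 - pe)

# Invented ratings (0 = absent, 1 = slight, 2 = strong), for illustration only.
examiner_a = [0, 0, 1, 1, 2, 2, 1, 0, 2, 1]
examiner_b = [0, 1, 1, 2, 2, 2, 1, 0, 1, 1]

if __name__ == "__main__":
    print(round(kappa(examiner_a, examiner_b, 3), 3))
    print(round(kappa(examiner_a, examiner_b, 3, weighted=True), 3))
```

On these invented ratings the unweighted kappa is about 0.54 and the
linearly weighted kappa about 0.63, because every disagreement is only
one grade apart. This shows the scoring choice matters; it does not
settle which scale is more reliable in practice.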

I would like to have a reference to support this statement, if there is
one.

Another thing is that the intensity of rebound tenderness has a meaning:
strong intensity usually reflects a more serious stage of the disease
than slight intensity. If you only use two grades (present/absent) you
will lose this information.

With only two levels (absent/present) the surgeon also has to decide if
the rebound tenderness is of significant intensity or not, which is
dependent on his experience of previous cases.

If the surgeon is allowed only to describe the intensity, without
considering whether it has any significance, I assume that this
will give better agreement.

But apparently this or similar issues have not been studied?

Regards

Roland Andersson
Surgeon

John Uebersax wrote:
Roland wrote:

Instead of dichotomising these clinical signs I think that a grading
of the intensity of the sign must retain more of the diagnostic
information

Correct

One problem is of course that there is no definition for the grades of
the variable. However the dichotomised variable has the same problem

Correct

and I assume that the agreement between two examiners will be larger
if graded variables are used instead of dichotomised.

No. There will be *more* agreement with *fewer* categories. The fewer
the categories, the less opportunity for disagreement.

I tend to agree with those who suggest that having more categories is
better from a *research* or *statistical* standpoint. But clinical
practice evolves to optimize several factors. Some are optimized by
having more categories, some by fewer. There is a tradeoff, and the
solution to the tradeoff is different for each application.

I believe there is such a tradeoff between
reliability/agreement/reproducibility (better with fewer levels) and
accuracy/precision (better with more levels).
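
A rough simulation can illustrate this tradeoff. It is only a sketch
under assumptions that are not part of the discussion above: the true
intensity is uniform on [0, 1], each examiner perceives it with
independent Gaussian noise (`sigma`), and a k-level scale cuts the
range into k equal bins. The function name `simulate` and all
parameter values are hypothetical.

```python
import random

def simulate(k, n=20000, sigma=0.1, seed=0):
    """Return (exact agreement, mean absolute error) for a k-level scale."""
    rng = random.Random(seed)

    def grade(x):
        # Clip the noisy impression to [0, 1], then cut into k equal bins.
        x = min(max(x, 0.0), 1.0)
        return min(int(x * k), k - 1)

    agree = 0
    abs_err = 0.0
    for _ in range(n):
        truth = rng.random()                     # latent intensity (assumed uniform)
        g1 = grade(truth + rng.gauss(0, sigma))  # examiner 1
        g2 = grade(truth + rng.gauss(0, sigma))  # examiner 2
        agree += (g1 == g2)
        # Error of examiner 1's grade, read back as the midpoint of its bin.
        abs_err += abs((g1 + 0.5) / k - truth)
    return agree / n, abs_err / n

if __name__ == "__main__":
    for k in (2, 3, 5, 10):
        agreement, error = simulate(k)
        print(f"{k:2d} levels: exact agreement {agreement:.2f}, "
              f"mean error {error:.3f}")
```

In this toy setup exact agreement drops as levels are added, while the
error of the recorded grade relative to the underlying intensity
shrinks. Real clinical signs need not behave like this model.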

The solution to this tradeoff is different for each application. That
is why we see staging in some areas, but yes/no distinctions in others.

One example where dichotomous ratings might be better is a screening
test. It is logistically better to screen patients into two groups.
Then a more refined test can be given to those who screen positive.

Hope this helps.
--
John Uebersax PhD
