Re: describing logistic regression results for clinicians



David Winsemius wrote:
Jerry Dallal <gdallal@xxxxxxxxxxxxxxxx> wrote in
news:hct_f.1947$No6.42645@xxxxxxxxxxxxxx:

David Winsemius wrote:
Jerry Dallal <gdallal@xxxxxxxxxxxxxxxx> wrote in
news:IVQYf.1917$No6.42339@xxxxxxxxxxxxxx:

Bill Howells wrote:
This was a hallway conversation with a colleague. He is responding
to reviews of a paper submitted to a second-tier health research journal
and showed me a table with coefficients and confidence intervals
from a logistic regression model of a binary outcome with two
binary predictors, say x1 and x2, and their interaction. The two
binary predictors were combined into four categories of 0/0, 0/1,
1/0, 1/1 with the 0/0 as the reference category (ie. x1=0, x2=0). He said in another model where the interaction was parameterized as
the product of the two binary variables, the coefficient for the
interaction was not statistically different from zero. The odds
ratios in the model he showed me were something like OR1=7, OR2=9,
OR12=23 (95% CI: 11-77) where OR1 is the odds ratio for the "X1
only" group, OR2 is the odds ratio for the "X2 only" group, and
OR12 is the odds ratio for the "both" group (1/1), relative to the
0/0 group. OR1 and OR2 were statistically different from one.

Anyway, the question! He originally described the results using
the phrase "X1 and X2 are independent predictors" of outcome
because they are both statistically significant in the model and the
interaction is statistically non-significant. Reviewer #1 balked
at this description because the paper also contained a test of
association between X1 and X2 which was statistically significant,
and the point estimate of the "both" group, ie. OR12, is "far from"
what one would expect if the terms were truly independent and
therefore additive (on the log scale). So, really two questions:

1. how to respond to Reviewer #1?
2. how to describe the results in the paper whose audience is non-statistical clinicians?

Regarding question 2, another colleague had suggested using the
phrase "additive on the log-scale" but the author felt this wasn't
very meaningful for a non-statistical audience. I suggested
pointing out that although the point estimate is far from what one
would expect from additivity, the confidence interval for OR12
includes the null value, eg. 7*9 = 63 is within the confidence
interval 11-77, and commenting there is low power to detect the
interaction due to fewer subjects in the "both" group.
The departure would not have been from "additivity" but from multiplicativity. Additive on the log-odds scale is the same as multiplicative on the odds scale. Lack of power is beside the point
if the test for the interaction came in positive.
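
As a quick check of the arithmetic in this exchange, here is a minimal Python sketch that uses only the odds ratios quoted in the post (7, 9, and 23, with the 95% CI of 11-77). The variable names are illustrative and not from the thread. It shows that "no interaction" means the log odds ratios add, which is the same as the odds ratios multiplying, and how far the quoted OR12 falls from that product.

```python
import math

# Odds ratios quoted in the post, each relative to the 0/0 group.
or1, or2, or12 = 7.0, 9.0, 23.0
ci_low, ci_high = 11.0, 77.0  # 95% CI quoted for OR12

# With no interaction the log odds ratios add, so the odds ratios multiply.
expected_or12 = math.exp(math.log(or1) + math.log(or2))

# Observed joint OR relative to the multiplicative expectation. In the
# product parameterization this ratio is exp(interaction coefficient).
interaction_ratio = or12 / expected_or12

print(round(expected_or12, 1))             # 63.0
print(round(interaction_ratio, 3))         # 0.365
print(ci_low <= expected_or12 <= ci_high)  # True
```

A ratio below 1 means the joint effect is smaller than the product of the separate effects, which is what "sub-multiplicative" would refer to.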
As far as responding to the reviewer, not sure how to address his questions about using the term "independent" to describe the
results. Comments, suggestions appreciated. Bill H.

[I found the first paragraph of the post too dense to sort out in a casual reading, so I skipped to the second.]
I read it as having a "full" model with parameterisation:
intercept for 0/0, b1, b2, b(interact)
If his ORs are OR1=7, OR2=9, ORinteract=23 and the addition of the interaction is significant, it makes me wonder if the interaction is
"sub-multiplicative".

Welcome to the land of Murk, where things are murky.

I can understand why a reviewer might draw the line at using the
phrase "independent predictors" to describe variables that are
associated.
I would think the reviewer "might", but he would be wrong. The joint association of the two independent variables does not mean that they cannot be independent predictors of an outcome. Consider the
association of HDL cholesterol with gender. Does anyone think that
just because HDL levels are lower in men that you would not want both
variables in the model predicting coronary disease events? The
mechanics of the LR machinery should be able to determine whether two
variables have independent predictive capacity. You should also be
asking what the predictors are _in_reality_. What does prior science
say about these as predictors? Are there any causal connections of one
of these Xs with the other?
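
To make the point about associated predictors concrete, here is a small Python sketch built entirely on assumed numbers: a hypothetical population in which X1 and X2 are strongly associated, and a hypothetical outcome model with no interaction. None of the figures come from the thread. It suggests that association between two predictors does not stop each from carrying its own contribution, although ignoring one of them distorts the estimate for the other.

```python
import math


def risk(x1, x2):
    """Hypothetical outcome model with no interaction: the log odds add."""
    log_odds = -3.0 + math.log(2.0) * x1 + math.log(3.0) * x2
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    return p / (1.0 - p)


# Hypothetical population counts keyed by (x1, x2); X1 and X2 are associated.
counts = {(0, 0): 400, (1, 0): 100, (0, 1): 100, (1, 1): 400}

# Association between the two predictors (cross-product ratio of the counts).
predictor_or = (counts[0, 0] * counts[1, 1]) / (counts[1, 0] * counts[0, 1])

# Odds ratio for X1 within each level of X2: its contribution holding X2 fixed.
or_x1_given_x2 = {x2: odds(risk(1, x2)) / odds(risk(0, x2)) for x2 in (0, 1)}


def marginal_risk(x1):
    """Risk at a given X1 level, averaging over X2 as distributed in that level."""
    total = counts[x1, 0] + counts[x1, 1]
    return sum(counts[x1, x2] * risk(x1, x2) for x2 in (0, 1)) / total


# Crude odds ratio for X1 when X2 is ignored.
crude_or_x1 = odds(marginal_risk(1)) / odds(marginal_risk(0))

print(predictor_or)     # 16.0
print(or_x1_given_x2)   # both values are 2 up to floating-point error
print(crude_or_x1 > 2)  # True: the crude OR is inflated by the association with X2
```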
When reading these questions, it is important to keep in mind not only
how we ourselves see them, but how others might reasonably see them.

Which club do you imagine "we" belong to? My graduate degrees lead me to call myself a "physician" and "epidemiologist". You had already passed on interpreting his first paragraph so I thought a non-statistician should take a shot.

The problem here is the word "predictors". For many, "predictors" is synonymous with "predictor variables" with emphasis on "variables". If one reads it that way, then "independence" becomes "independent variables", which has a particular meaning in statistics. If one
reads "predictor" as "risk factor", then the argument I gave for "risk
factor" applies. Because "predictor" lends itself to this kind of
confusion, I can understand why a random reviewer might not like it. It is easily fixed, however, with a few simple edits.

For better or worse, the medical literature has adopted the phrase "independent risk factor" to describe the sort of thing your
colleague is seeing. That is, it's okay to call something an
independent risk factor if, after one has modified his/her risk by
all other factors, one can further modify it by attending to the
factor under consideration.
Modifiability has nothing to do with prediction. Gender and age are
not modifiable in the slightest, yet are potent predictors of many
outcomes.
I seem to be speaking more colloquially than you like. In place of "modify" insert "done all one can about".

Your suggested fix does not satisfy my logical objections in the slightest. It appears you did not read my comments for meaning. I gave two specific counter-examples. What "can be done about" age or gender?

You are correct that I read your comment too quickly. I've got no problem if you want to dot all the i's and cross all the t's. There seems to be NO disagreement over the fact that two things can properly be called independent risk factors without being statistically independent rvs, which, in fact, was the OP's question.

My point still ... modifiability has nothing to do with "independent" predictors. Granted, they would not be amenable to modification if you were designing a randomized intervention trial, but that was not the question.
Even then, you would want to consider the risk of an imbalance in important non-modifiable, independent predictors in your allocation. There are a host of modelling issues that may arise in dealing with these non-modifiable variables, but you cannot just define them away as not worthy of consideration as independent predictors.

Take a look at the history of the term "risk factor" in medicine. You will find age, gender, and family history in most of the early discussions of independent risk factors. Here is a current link to a page by the American Heart Association that might be considered "colloquial": http://www.americanheart.org/presenter.jhtml?identifier=4726

<quote> The American Heart Association has identified several risk factors. Some of them can be modified, treated or controlled, and some can't. The more risk factors you have, the greater your chance of developing coronary heart disease. </quote>

It is not necessary that the risk factors be
statistically independent, but that each makes an "independent"
contribution to the risk. One way investigators establish that
something is an independent risk factor is by showing that it makes
a statistically significant contribution to a model that contains
all other known or suspected risk factors.
There is another way?
Well, sure! But, again, I seem to be speaking more colloquially than you like and said "model" rather than "logistic regression model". This has been a discussion about logistic regression. There are other
techniques for handling such data.

There are, certainly. But even if you adopt another link function or embed the LR model in a more general family of models, you will still be determining whether adding the interaction term reduces the variance or deviance significantly relative to the smaller model.
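
The nested-model comparison described here can be sketched in plain Python. The counts below are hypothetical, chosen only so that the odds ratios land near 7, 9, and 23; they are not the colleague's data. The helper names are made up for illustration. The sketch fits the main-effects logistic model by Newton-Raphson, uses the fact that the model with the interaction is saturated for a 2x2 layout, and compares the two by the drop in deviance on 1 degree of freedom.

```python
import math

# Hypothetical grouped data, NOT taken from the thread: (x1, x2, events, total).
CELLS = [(0, 0, 10, 200), (1, 0, 27, 100), (0, 1, 32, 100), (1, 1, 22, 40)]


def log_lik(probs):
    """Binomial log-likelihood (constant terms dropped) for fitted cell probabilities."""
    return sum(y * math.log(p) + (n - y) * math.log(1 - p)
               for (_, _, y, n), p in zip(CELLS, probs))


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def fit_main_effects(n_iter=25):
    """Newton-Raphson fit of logit(p) = b0 + b1*x1 + b2*x2; returns fitted probabilities."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for x1, x2, y, n in CELLS:
            x = (1.0, x1, x2)
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(3):
                grad[i] += x[i] * (y - n * p)
                for j in range(3):
                    hess[i][j] += n * p * (1 - p) * x[i] * x[j]
        beta = [b + d for b, d in zip(beta, solve(hess, grad))]
    return [1 / (1 + math.exp(-(beta[0] + beta[1] * x1 + beta[2] * x2)))
            for x1, x2, _, _ in CELLS]


# With the interaction term the model is saturated: it reproduces each
# cell's observed proportion exactly.
ll_full = log_lik([y / n for _, _, y, n in CELLS])
ll_reduced = log_lik(fit_main_effects())

lr_stat = 2 * (ll_full - ll_reduced)         # drop in deviance, 1 degree of freedom
p_value = math.erfc(math.sqrt(lr_stat / 2))  # chi-square(1) upper-tail probability
print(f"LR statistic = {lr_stat:.3f}, p = {p_value:.3f}")
```

The same comparison applies under another link function; only the fitting step for the smaller model would change.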

A search on the phrase "independent risk factor" in any good search engine will give you a wealth of examples.

Maybe with the added terms "confounding" and "joint association".
Even without!

Of course. My suggested constraints on the search terms were intended to foster a more dense collection of hits. I got over half a million hits with your phrase. Adding "confounding" narrowed that down by a factor of one hundred. Adding "joint association" probably narrowed it too far. Sometimes it is useful for someone who lacks the proper statistical language to be given terms for a more focussed search.

No question that there are always ways to improve a search and ways to tighten a description until it's textbook ready. I think it's fine that you've tightened things up, but the essential fact hasn't changed--namely, that independent risk factors need not be "statistically independent" to be called independent.