Relative Measures of Association for Binary Outcomes: Challenges and Recommendations for the Global Health Researcher

Open Access
|Nov 2019

Figures & Tables

Table 1

Example of two relative measures of association, adapted from results of a randomized controlled trial in febrile individuals published in BMJ Global Health.^a

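| Exposure group | Outcome^b: tested for malaria | Outcome^b: did not test for malaria | Outcome^b: “risk” of malaria testing | Relative measure of association^c: risk ratio | Relative measure of association^c: odds ratio |
|---|---|---|---|---|---|
| Intervention | 76 | 27 | 76/103 = 73.8% | 73.8/51.0 = 1.45 | (76/27)/(51/49) = 2.7 |
| Control | 51 | 49 | 51/100 = 51.0% | | |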

a The RCT by O’Meara et al. (2016) [5] used a 2 × 2 factorial design of two interventions for febrile individuals. Here we have adapted the example to focus on one of those interventions, namely a subsidy for a rapid diagnostic test, where “intervention” denotes the group that received the subsidy and “control” denotes the group that did not. Specifically, we have extracted outcome data from Table 2 of O’Meara et al. (2016) [5] for the two groups that did not receive the second intervention.

b Row counts correspond to the number of participants with each level of the outcome within each exposure group.

c For the intervention group vs. the control group. O’Meara et al. (2016) [5] reported neither of these results in their Table 2; they reported absolute measures of effect instead.
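The arithmetic in Table 1 can be reproduced in a few lines. The sketch below is illustrative only: the function name `relative_measures` and its argument order are assumptions made for this example, not something taken from the article. It takes the four cell counts of a 2 × 2 table and returns the two risks, the risk ratio, and the odds ratio.

```python
def relative_measures(a, b, c, d):
    """Risk ratio and odds ratio from a 2 x 2 table.

    a, b: exposed group with / without the outcome
    c, d: unexposed group with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a / b) / (c / d)
    return risk_exposed, risk_unexposed, risk_ratio, odds_ratio


# Counts from Table 1: intervention 76 tested, 27 not; control 51 tested, 49 not.
r1, r0, rr, or_ = relative_measures(76, 27, 51, 49)
print(f"risks: {r1:.1%} vs {r0:.1%}; RR = {rr:.2f}; OR = {or_:.1f}")
# prints: risks: 73.8% vs 51.0%; RR = 1.45; OR = 2.7
```

Both measures come from the same four counts; they differ only in whether the denominator is everyone in the group (risk) or only those without the outcome (odds).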

Table 2

Unadjusted measures of relative association from three articles in the global health literature.

| No. | Exposure | Outcome | Unexposed group outcome proportion | Risk ratio^a | Odds ratio^b | Magnitude of odds ratio relative to risk ratio^c |
|---|---|---|---|---|---|---|
| 1 | Surviving Ebola virus [32] | Safe sexual behavior | 14% | 2.71 | 3.67 | 35% |
| 2 | Point-of-care testing [33] | Antibiotic use | 78% | 0.82 | 0.50 | 178% |
| 3 | Drinking [34] | Feelings of aggression | 20% | 3.1 | 6.7 | 116% |

a Risk ratio (for “exposed” vs. “unexposed”) computed directly from the outcome proportions reported in the article, as none of the three articles used the risk ratio as a measure of relative association.

b Odds ratio obtained from unadjusted logistic regression [32] or directly from the reported outcome proportions [33, 34].

c In these examples where the outcome is relatively common (i.e., >10%), if the odds ratio were to be incorrectly interpreted as a risk ratio, this is the magnitude of overstatement of relative association.

Figure 1

Relationship between the odds ratio and risk ratio at various levels of the reference risk.
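The relationship in Figure 1 follows directly from the definitions of risk and odds. With reference risk p0 (the risk in the unexposed group), the odds ratio implied by a given risk ratio is OR = RR × (1 − p0) / (1 − RR × p0), and the inverse is RR = OR / (1 − p0 + p0 × OR). The sketch below is an illustration under those definitions; the function names are hypothetical and the reference risks in the loop are arbitrary example values.

```python
def odds_ratio_from_risk_ratio(rr, p0):
    """Odds ratio implied by risk ratio `rr` at reference risk `p0`."""
    p1 = rr * p0  # risk in the exposed group
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


def risk_ratio_from_odds_ratio(odds_ratio, p0):
    """Risk ratio implied by `odds_ratio` at reference risk `p0`."""
    return odds_ratio / (1 - p0 + p0 * odds_ratio)


# The same risk ratio of 1.5 corresponds to an odds ratio that drifts
# further from 1.5 as the reference risk increases.
for p0 in (0.01, 0.10, 0.30, 0.50):
    print(f"p0 = {p0:.2f}: OR = {odds_ratio_from_risk_ratio(1.5, p0):.2f}")
```

When the reference risk is small the two measures nearly coincide, which is why the odds ratio is often described as approximating the risk ratio only for rare outcomes.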

Table 3

Brief summary of four methods of obtaining risk ratios for binary outcomes.

| Name of method | Type of method | Background literature | Some advantages | Some disadvantages | Example of use in the global health literature | Exposure (in example) | Binary outcome (in example) |
|---|---|---|---|---|---|---|---|
| Log-binomial | Direct | Wacholder (1986) [35] | Easy to implement. | May not converge; may estimate individual-level probabilities (and/or the upper bound of their 95% confidence intervals) above 1. | Gibson et al. (2017) [37] | Mobile phone based intervention to improve immunization rates, in a cluster-randomized trial | Full immunization by 12 months of age. |
| Modified log-Poisson | Direct | Zou (2004) [16] | Easy to implement; almost always converges. | May estimate individual-level probabilities (and/or the upper bound of their 95% confidence intervals) above 1. | Chan et al. (2017) [38] | AIDS-related stigma | Probable depression (PHQ-9 score ≥10 or recent suicidal thoughts). |
| Substitution | Indirect | Zhang and Yu (1998) [25] | Easy to implement. Uses output from logistic regression. | Generally produces biased estimates and 95% confidence intervals are expected to be too narrow, on average [18]. | Agweyu et al. (2018) [39] | Various demographics and health-related exposures | Mortality. |
| Marginal or Conditional Standardization | Indirect | Localio et al. (2007) [18] | Uses output from logistic regression. | May be more difficult to implement and interpret than other methods, especially in certain software packages. | Weobong et al. (2017) [40] | Psychological intervention for depression, in a randomized trial | Remission from depression as measured by the PHQ-9. |

Abbreviation: PHQ-9 – Patient Health Questionnaire 9-item [36], a screening tool for depression.
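Table 3 lists marginal standardization as an indirect method that reuses logistic regression output. As a rough sketch of the idea (not the authors' implementation): take the fitted logistic coefficients, predict each participant's outcome probability twice, once with the exposure set to 1 and once with it set to 0 while leaving the covariates as observed, average each set of predictions, and take the ratio of the two averages. The coefficients and covariate values below are made up purely for illustration, the function names are hypothetical, and the sketch omits the confidence interval, which in practice needs the delta method or bootstrapping.

```python
import math


def expit(x):
    """Inverse logit: convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_risk_ratio(intercept, b_exposure, b_covariate, covariate_values):
    """Marginally standardized risk ratio from logistic regression coefficients.

    Every participant is predicted twice, once as exposed and once as
    unexposed, keeping the covariate at its observed value.
    """
    risk_exposed = [expit(intercept + b_exposure + b_covariate * z)
                    for z in covariate_values]
    risk_unexposed = [expit(intercept + b_covariate * z)
                      for z in covariate_values]
    mean_exposed = sum(risk_exposed) / len(risk_exposed)
    mean_unexposed = sum(risk_unexposed) / len(risk_unexposed)
    return mean_exposed / mean_unexposed


# Hypothetical fitted coefficients (log-odds scale) and observed covariate values.
intercept, b_exposure, b_covariate = -1.0, 0.9, 0.5
covariate_values = [0, 0, 1, 1, 1, 0, 1, 0]

rr_marginal = marginal_risk_ratio(intercept, b_exposure, b_covariate,
                                  covariate_values)
print(f"odds ratio from the model: {math.exp(b_exposure):.2f}")
print(f"marginally standardized risk ratio: {rr_marginal:.2f}")
```

Because the predictions are averaged over the observed covariate distribution, the resulting risk ratio describes the study population as a whole rather than any single covariate stratum.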

Table 4

Code to fit the log-binomial and modified log-Poisson models in four commonly used statistical software packages, and to use the marginal standardization method in two of the packages.

Code is listed by software program and data structure (Ind or Clust^d) and, within each, by method: log-binomial^a, modified log-Poisson^b, and marginal standardization^c.

Stata^e, Ind

Log-binomial:

    glm binaryoutcome exposure, family(binomial) link(log) eform

Modified log-Poisson:

    glm binaryoutcome exposure, family(poisson) link(log) vce(robust) eform

Marginal standardization:

    logit binaryoutcome i.exposure, or
    margins exposure, coeflegend post
    nlcom (RR: _b[1.exposure]/_b[0bn.exposure]), post

Stata^e, Clust^d

Log-binomial:

    xtset cluster
    xtgee binaryoutcome exposure, family(binomial) link(log) corr(exchangeable) eform

Modified log-Poisson:

    xtset cluster
    xtgee binaryoutcome exposure, family(poisson) link(log) corr(exchangeable) eform

Marginal standardization:

    xtset cluster
    xtgee binaryoutcome i.exposure, family(binomial) link(logit) corr(exchangeable) eform
    margins exposure, post coeflegend
    nlcom (ratio1: _b[1.exposure]/_b[0bn.exposure]), post

SAS, Ind

Log-binomial:

    proc genmod data=temp;
          class binaryoutcome exposure / param=ref ref=first;
          model binaryoutcome = exposure / dist=bin link=log;
          estimate 'Risk Ratio' exposure 1 / exp;
    run;

Modified log-Poisson:

    proc genmod data=temp;
          class binaryoutcome exposure participantID / param=ref ref=first;
          model binaryoutcome = exposure / dist=poisson link=log;
          repeated subject=participantID / type=Ind;
          estimate 'Risk Ratio' exposure 1 / exp;
    run;

SAS, Clust^d

Log-binomial:

    proc genmod data=temp descending;
          class binaryoutcome exposure cluster / desc;
          model binaryoutcome = exposure / dist=binomial link=log;
          repeated subject=cluster / corr=exch;
          estimate 'Risk Ratio' exposure 1 -1 / exp;
    run;

Modified log-Poisson:

    proc genmod data=temp descending;
          class binaryoutcome exposure cluster / desc;
          model binaryoutcome = exposure / dist=poisson link=log;
          repeated subject=cluster / corr=exch;
          estimate 'Risk Ratio' exposure 1 -1 / exp;
    run;

R^f, Ind

Log-binomial:

    lglm <- glm(binaryoutcome~exposure,
          family=binomial(link="log"))
    exp(cbind(coef(lglm), confint(lglm)))

Modified log-Poisson:

    library(gee)
    pglm <- summary(gee(binaryoutcome~exposure,
          family=(poisson(link="log")), id=participantID,
          corstr="independence"))
    cbind(RiskRatio=exp(pglm$coefficients[2]),
          LCI=exp(pglm$coefficients[2]-1.96*pglm$coefficients[8]),
          UCI=exp(pglm$coefficients[2]+1.96*pglm$coefficients[8]))

Marginal standardization:

    library(epitools)
    lglm <- glm(binaryoutcome~exposure,
          family=binomial)
    probratio(lglm, method="ML")

R^f, Clust^d

Log-binomial:

    library(gee)
    lgee <- summary(gee(binaryoutcome~exposure,
          family=(binomial(link="log")), id=cluster,
          corstr="exchangeable"))
    cbind(RiskRatio=exp(lgee$coefficients[2]),
          LCI=exp(lgee$coefficients[2]-1.96*lgee$coefficients[8]),
          UCI=exp(lgee$coefficients[2]+1.96*lgee$coefficients[8]))

Modified log-Poisson:

    library(gee)
    pgee <- summary(gee(binaryoutcome~exposure,
          family=(poisson(link="log")), id=cluster,
          corstr="exchangeable"))
    cbind(RiskRatio=exp(pgee$coefficients[2]),
          LCI=exp(pgee$coefficients[2]-1.96*pgee$coefficients[8]),
          UCI=exp(pgee$coefficients[2]+1.96*pgee$coefficients[8]))

SPSS, Ind

Log-binomial:

    genlin binaryoutcome (reference=first)
          by exposure (order=descending)
    /model exposure intercept=yes
          distribution=binomial link=Log
    /Print summary
          solution(exponentiated)

Modified log-Poisson:

    genlin binaryoutcome (reference=first)
          by exposure (order=descending)
    /model exposure intercept=yes
          distribution=poisson link=log
    /repeated subject=participantID
          sort=yes corrtype=independent
          adjustcorr=yes covb=robust
    /Print summary solution(exponentiated)

SPSS, Clust^d

Log-binomial:

    genlin binaryoutcome (reference=first)
          by exposure (order=descending)
    /model exposure intercept=yes
          distribution=binomial link=Log
    /repeated subject=cluster sort=yes
          corrtype=exchangeable
          adjustcorr=yes covb=robust
    /Print summary
          solution(exponentiated)

Modified log-Poisson:

    genlin binaryoutcome (reference=first)
          by exposure (order=descending)
    /model exposure intercept=yes
          distribution=poisson link=Log
    /repeated subject=cluster sort=yes
          corrtype=exchangeable
          adjustcorr=yes covb=robust
    /Print summary solution(exponentiated)

Abbreviations: Ind = Independent (i.e., non-clustered); Clust = Clustered.

Variables: binaryoutcome = the binary outcome; exposure = exposure (e.g., treatment group indicator), assumed to be categorical; participantID = participant identifier; cluster = cluster identifier.

a The log-binomial code for direct estimation of the risk ratio in the clustered setting is only shown in the generalized estimating equations (GEE) framework. A generalized linear mixed model (GLMM) could also be used.

b For the log-Poisson approach, a robust standard error is needed to account for misspecification of the outcome distribution (i.e., Poisson instead of binomial); GEE is the natural approach to obtain this robust standard error, in both the non-clustered and clustered setting.

c To our knowledge, the marginal standardization method is not as straightforward to implement in SAS or SPSS, so no code is provided. In addition, we are unaware of an easy-to-implement function in R to perform marginal standardization in a clustered setting.

d In the context of GEE to analyze clustered outcome data, we have used an exchangeable working correlation matrix as an example. It is natural to use such a working correlation matrix when the outcome data are measured at a single point in time and the clustering arises through some natural grouping of individuals (e.g., in schools or hospitals). But, if the clustering arises from longitudinal data, other working correlation structures may be preferred.

e The standard errors from Stata may be slightly larger than those obtained from the other programs. This is because Stata multiplies the robust standard errors by K/(K–1), where K is the number of clusters, whereas the other programs do not.

f The cbind R code illustrated here works only for a single binary exposure variable. It will need to be modified for more complex scenarios. Additionally, the gee function requires that the outcome be set up as a numeric variable, rather than a factor variable, when specifying the modified log-Poisson model.
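Footnote b notes that the modified log-Poisson approach needs a robust standard error. In the special case of a single binary exposure with no covariates and independent observations, both the model-based Poisson variance and the robust (sandwich) variance of the log risk ratio reduce to simple closed forms, so the difference can be seen without fitting a model. The sketch below uses those closed forms with the Table 1 counts. It is an illustration under that single-exposure assumption only, it ignores any small-sample correction a package may apply (see footnote e), and the function name is hypothetical.

```python
import math


def log_rr_standard_errors(a, n1, c, n0):
    """Standard errors of log(RR) for one binary exposure, no covariates.

    a, n1: events and total in the exposed group
    c, n0: events and total in the unexposed group
    """
    # Poisson model-based variance: treats the binary outcome as a count.
    model_based = math.sqrt(1 / a + 1 / c)
    # Sandwich variance: uses the observed (binomial-like) variability.
    robust = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return model_based, robust


# Table 1 counts: 76 of 103 (intervention) and 51 of 100 (control) were tested.
a, n1, c, n0 = 76, 103, 51, 100
rr = (a / n1) / (c / n0)
se_model, se_robust = log_rr_standard_errors(a, n1, c, n0)

for label, se in (("model-based", se_model), ("robust", se_robust)):
    lower = rr * math.exp(-1.96 * se)
    upper = rr * math.exp(1.96 * se)
    print(f"{label}: SE(log RR) = {se:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The point estimate is the same under both; only the interval changes. The model-based Poisson interval is wider than necessary because a Poisson variance overstates the variability of a binary outcome.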

DOI: https://doi.org/10.5334/aogh.2581 | Journal eISSN: 2214-9996
Language: English
Published on: Nov 20, 2019
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2019 John A. Gallis, Elizabeth L. Turner, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.