Relative Measures of Association for Binary Outcomes: Challenges and Recommendations for the Global Health Researcher

John A. Gallis; Elizabeth L. Turner

doi:10.5334/aogh.2581

Figures & Tables

Table 1

Example of two relative measures of association, adapted from results of a randomized controlled trial in febrile individuals published in BMJ Global Health.^a

Exposure group	Outcome^b			Relative Measure of Association^c
	Tested for malaria	Did not test for malaria	“Risk” of malaria testing	Risk Ratio	Odds Ratio
Intervention	76	27	76/103 = 73.8%	73.8/51.0 = 1.45	$\frac{76}{27} / \frac{51}{49} = 2.7$
Control	51	49	51/100 = 51.0%

^a RCT by O’Meara et al. (2016) [5] was a 2 × 2 factorial design of two interventions for febrile individuals. Here we have adapted the example to focus on one of those interventions, namely a subsidy for a rapid diagnostic test, where “intervention” denotes the group that received the subsidy and “control” denotes the group that did not receive the subsidy. Specifically, we have extracted outcome data from Table 2 of O’Meara et al. (2016) [5] for the two groups which did not receive the second intervention.

^b Row counts correspond to the number of participants with each level of the outcome within each exposure group.

^c For the intervention group vs. the control group. O’Meara et al. (2016) [5] reported neither of these results in their Table 2 because they instead reported absolute measures of effect.
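
The arithmetic in Table 1 can be reproduced in a few lines. The following is a minimal Python sketch, using only the counts shown in the table; the variable and function names are illustrative and are not taken from the article.

```python
# Counts from Table 1: (tested for malaria, did not test for malaria)
intervention = (76, 27)
control = (51, 49)


def risk(events, non_events):
    """Proportion of the group that experienced the outcome."""
    return events / (events + non_events)


def odds(events, non_events):
    """Ratio of those with the outcome to those without it."""
    return events / non_events


# Both measures compare the intervention group with the control group.
risk_ratio = risk(*intervention) / risk(*control)
odds_ratio = odds(*intervention) / odds(*control)

print(f"Risk ratio: {risk_ratio:.2f}")  # 1.45
print(f"Odds ratio: {odds_ratio:.1f}")  # 2.7
```

With an outcome this common (51% in the control group), the odds ratio is visibly further from 1 than the risk ratio computed from the same counts.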

Table 2

Unadjusted measures of relative association from three articles in the global health literature.

	Exposure	Outcome	Unexposed group outcome proportion	Risk Ratio^a	Odds Ratio^b	Magnitude of odds ratio relative to risk ratio^c
1	Surviving Ebola virus [32]	Safe sexual behavior	14%	2.71	3.67	35%
2	Point-of-care testing [33]	Antibiotic use	78%	0.82	0.50	178%
3	Drinking [34]	Feelings of aggression	20%	3.1	6.7	116%

^a Risk ratio (for “exposed” vs. “unexposed”) computed directly from outcome proportions reported in the article as none of the three articles used the risk ratio as a measure of relative association.

^b Odds ratio obtained from unadjusted logistic regression [32] or computed directly from the reported outcome proportions [33, 34].

^c In these examples where the outcome is relatively common (i.e., >10%), if the odds ratio were to be incorrectly interpreted as a risk ratio, this is the magnitude of overstatement of relative association.

Figure 1

Relationship between the odds ratio and risk ratio at various levels of the reference risk.
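
The relationship described in this caption follows directly from the definitions of risk and odds in a two-group comparison. The sketch below is an illustration under that assumption, not the code used to draw the figure: it holds a hypothetical risk ratio of 1.5 fixed and shows how the corresponding odds ratio changes with the reference (unexposed) risk. The function name and the chosen reference risks are illustrative.

```python
def odds_ratio_from_risk_ratio(risk_ratio, reference_risk):
    """Odds ratio implied by a risk ratio at a given reference (unexposed) risk."""
    exposed_risk = risk_ratio * reference_risk
    if not (0 < reference_risk < 1 and 0 < exposed_risk < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    exposed_odds = exposed_risk / (1 - exposed_risk)
    reference_odds = reference_risk / (1 - reference_risk)
    return exposed_odds / reference_odds


# Hypothetical risk ratio of 1.5 evaluated at increasingly common outcomes.
for reference_risk in (0.01, 0.10, 0.30, 0.50):
    implied_or = odds_ratio_from_risk_ratio(1.5, reference_risk)
    print(f"reference risk {reference_risk:.2f}: odds ratio {implied_or:.2f}")
```

When the outcome is rare the two measures nearly coincide; as the reference risk rises, the odds ratio moves progressively further from 1 than the risk ratio.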

Table 3

Brief summary of four methods of obtaining risk ratios for binary outcomes.

Name of method	Type of method	Background literature	Some advantages	Some disadvantages	Example of use in the global health literature	Exposure in example	Binary outcome in example
Log-binomial	Direct	Wacholder (1986) [35]	Easy to implement.	May not converge; may estimate individual-level probabilities (and/or the upper bound of their 95% confidence intervals) above 1.	Gibson et al. (2017) [37]	Mobile phone based intervention to improve immunization rates, in a cluster-randomized trial	Full immunization by 12 months of age.
Modified log-Poisson	Direct	Zou (2004) [16]	Easy to implement; almost always converges.	May estimate individual-level probabilities (and/or the upper bound of their 95% confidence intervals) above 1.	Chan et al. (2017) [38]	AIDS-related stigma	Probable depression (PHQ-9 score ≥10 or recent suicidal thoughts).
Substitution	Indirect	Zhang and Yu (1998) [25]	Easy to implement. Uses output from logistic regression.	Generally produces biased estimates and 95% confidence intervals are expected to be too narrow, on average [18].	Agweyu et al. (2018) [39]	Various demographics and health-related exposures	Mortality.
Marginal or Conditional Standardization	Indirect	Localio et al. (2007) [18]	Uses output from logistic regression.	May be more difficult to implement and interpret than other methods, especially in certain software packages.	Weobong et al. (2017) [40]	Psychological intervention for depression, in a randomized trial	Remission from depression as measured by the PHQ-9.

Abbreviation: PHQ-9 – Patient Health Questionnaire 9-item [36], a screening tool for depression.
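
To make the substitution method in Table 3 concrete, the sketch below applies the conversion formula commonly attributed to Zhang and Yu (1998) [25], RR = OR / ((1 − P0) + P0 × OR), where P0 is the outcome risk in the unexposed group. The formula is stated here as background and does not appear in the table; the function name is illustrative, and the inputs are the unadjusted counts from Table 1.

```python
def substitution_risk_ratio(odds_ratio, reference_risk):
    """Convert an odds ratio to an approximate risk ratio.

    reference_risk is the outcome risk in the unexposed (reference) group.
    """
    return odds_ratio / ((1 - reference_risk) + reference_risk * odds_ratio)


# Unadjusted odds ratio and control-group risk from Table 1.
table1_odds_ratio = (76 / 27) / (51 / 49)
table1_reference_risk = 51 / 100

approx_risk_ratio = substitution_risk_ratio(table1_odds_ratio, table1_reference_risk)
print(f"Substitution risk ratio: {approx_risk_ratio:.2f}")  # 1.45
```

The conversion recovers the crude risk ratio here only because this is an unadjusted two-group comparison. As Table 3 notes, the method generally produces biased estimates, and its confidence intervals are expected to be too narrow [18].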

Table 4

Code to fit the log-binomial and modified log-Poisson models in four commonly used statistical software packages, and to use the marginal standardization method in two of the packages.

Software Program	Data Structure	Log-binomial code^a	Modified log-Poisson code^b	Marginal standardization code^c
Stata^e	Ind	`glm binaryoutcome exposure, family(binomial) link(log) eform`	`glm binaryoutcome exposure, family(poisson) link(log) vce(robust) eform`	`logit binaryoutcome i.exposure, or` `margins exposure, coeflegend post` `nlcom (RR: _b[1.exposure]/_b[0bn.exposure]), post`
	Clust^d	`xtset cluster` `xtgee binaryoutcome exposure, family(binomial) link(log) corr(exchangeable) eform`	`xtset cluster` `xtgee binaryoutcome exposure, family(poisson) link(log) corr(exchangeable) vce(robust) eform`	`xtset cluster` `xtgee binaryoutcome i.exposure, family(binomial) link(logit) corr(exchangeable) eform` `margins exposure, post coeflegend` `nlcom (ratio1: _b[1.exposure]/_b[0bn.exposure]), post`
SAS	Ind	`proc genmod data=temp;` `class binaryoutcome exposure / param=ref ref=first;` `model binaryoutcome = exposure / dist=bin link=log;` `estimate 'Risk Ratio' exposure 1 / exp;` `run;`	`proc genmod data=temp;` `class binaryoutcome exposure participantID / param=ref ref=first;` `model binaryoutcome = exposure / dist=poisson link=log;` `repeated subject=participantID / type=ind;` `estimate 'Risk Ratio' exposure 1 / exp;` `run;`
	Clust	`proc genmod data=temp descending;` `class binaryoutcome exposure cluster / desc;` `model binaryoutcome = exposure / dist=binomial link=log;` `repeated subject=cluster / corr=exch;` `estimate 'Risk Ratio' exposure 1 -1 / exp;` `run;`	`proc genmod data=temp descending;` `class binaryoutcome exposure cluster / desc;` `model binaryoutcome = exposure / dist=poisson link=log;` `repeated subject=cluster / corr=exch;` `estimate 'Risk Ratio' exposure 1 -1 / exp;` `run;`
R^f	Ind	`lglm <- glm(binaryoutcome ~ exposure, family = binomial(link = "log"))` `exp(cbind(coef(lglm), confint(lglm)))`	`library(gee)` `pglm <- summary(gee(binaryoutcome ~ exposure, family = poisson(link = "log"), id = participantID, corstr = "independence"))` `cbind(RiskRatio = exp(pglm$coefficients[2]), LCI = exp(pglm$coefficients[2] - 1.96 * pglm$coefficients[8]), UCI = exp(pglm$coefficients[2] + 1.96 * pglm$coefficients[8]))`	`library(epitools)` `lglm <- glm(binaryoutcome ~ exposure, family = binomial)` `probratio(lglm, method = "ML")`
	Clust	`library(gee)` `lgee <- summary(gee(binaryoutcome ~ exposure, family = binomial(link = "log"), id = cluster, corstr = "exchangeable"))` `cbind(RiskRatio = exp(lgee$coefficients[2]), LCI = exp(lgee$coefficients[2] - 1.96 * lgee$coefficients[8]), UCI = exp(lgee$coefficients[2] + 1.96 * lgee$coefficients[8]))`	`library(gee)` `pgee <- summary(gee(binaryoutcome ~ exposure, family = poisson(link = "log"), id = cluster, corstr = "exchangeable"))` `cbind(RiskRatio = exp(pgee$coefficients[2]), LCI = exp(pgee$coefficients[2] - 1.96 * pgee$coefficients[8]), UCI = exp(pgee$coefficients[2] + 1.96 * pgee$coefficients[8]))`
SPSS	Ind	`genlin binaryoutcome (reference=first) by exposure (order=descending)` `/model exposure intercept=yes distribution=binomial link=log` `/print summary solution(exponentiated).`	`genlin binaryoutcome (reference=first) by exposure (order=descending)` `/model exposure intercept=yes distribution=poisson link=log` `/repeated subject=participantID sort=yes corrtype=independent adjustcorr=yes covb=robust` `/print summary solution(exponentiated).`
	Clust	`genlin binaryoutcome (reference=first) by exposure (order=descending)` `/model exposure intercept=yes distribution=binomial link=log` `/repeated subject=cluster sort=yes corrtype=exchangeable adjustcorr=yes covb=robust` `/print summary solution(exponentiated).`	`genlin binaryoutcome (reference=first) by exposure (order=descending)` `/model exposure intercept=yes distribution=poisson link=log` `/repeated subject=cluster sort=yes corrtype=exchangeable adjustcorr=yes covb=robust` `/print summary solution(exponentiated).`

Abbreviations: Ind = Independent (i.e., non-clustered); Clust = Clustered.

Variables: binaryoutcome = the binary outcome; exposure = exposure (e.g., treatment group indicator), assumed to be categorical; participantID = participant identifier; cluster = cluster identifier.

^a The log-binomial code for direct estimation of the risk ratio in the clustered setting is only shown in the generalized estimating equations (GEE) framework. A generalized linear mixed model (GLMM) could also be used.

^b For the log-Poisson approach, a robust standard error is needed to account for misspecification of the outcome distribution (i.e., Poisson instead of binomial); GEE is the natural approach to obtain this robust standard error, in both the non-clustered and clustered setting.

^c To our knowledge, the marginal standardization method is not as straightforward to implement in SAS or SPSS, so no code is provided. In addition, we are unaware of an easy-to-implement function in R to perform marginal standardization in a clustered setting.

^d In the context of GEE to analyze clustered outcome data, we have used an exchangeable working correlation matrix as an example. It is natural to use such a working correlation matrix when the outcome data are measured at a single point in time and the clustering arises through some natural grouping of individuals (e.g., in schools or hospitals). If the clustering arises from longitudinal data, however, other working correlation structures may be preferred.

^e The standard errors from Stata may be slightly larger than those obtained from the other programs. This is because Stata multiplies the robust standard errors by K/(K–1), where K is the number of clusters, whereas the other programs do not.

^f The cbind R code illustrated here works only for a single binary exposure variable. It will need to be modified for more complex scenarios. Additionally, the gee function requires that the outcome be set up as a numeric variable, rather than a factor variable, when specifying the modified log-Poisson model.
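
The `cbind` lines in the R code build a Wald interval on the log scale, exp(estimate ± 1.96 × standard error). For a single binary exposure with independent observations, the same kind of interval can be sketched by hand from a 2 × 2 table. The Python sketch below uses the Table 1 counts and the standard large-sample (delta-method) standard error of the log risk ratio. That formula is an assumption introduced for illustration and does not appear in the table, so intervals from the fitted models may differ slightly. The function name is illustrative.

```python
import math


def risk_ratio_with_wald_ci(exposed_events, exposed_total,
                            unexposed_events, unexposed_total, z=1.96):
    """Risk ratio with a Wald confidence interval built on the log scale."""
    log_rr = math.log((exposed_events / exposed_total)
                      / (unexposed_events / unexposed_total))
    # Large-sample (delta-method) standard error of the log risk ratio.
    se_log_rr = math.sqrt(1 / exposed_events - 1 / exposed_total
                          + 1 / unexposed_events - 1 / unexposed_total)
    return (math.exp(log_rr),
            math.exp(log_rr - z * se_log_rr),
            math.exp(log_rr + z * se_log_rr))


# Table 1 counts: 76 of 103 tested (intervention), 51 of 100 tested (control).
rr, lower, upper = risk_ratio_with_wald_ci(76, 103, 51, 100)
print(f"Risk ratio {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

This hand calculation does not account for clustering or covariates; for those settings the model code in Table 4 is needed.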
