
Regression discontinuity design and its applications to Science of Science: A survey

By: Meiling Li, Yang Zhang and Yang Wang
Open Access | Jun 2023

Figures & Tables

Figure 1.

Illustrations of RDD. (a) The continuity framework, and (b) the local randomization framework. The figure depicts the expected outcomes conditional on the running variable X_i, denoted by E[Y_i(1) | X_i = x] and E[Y_i(0) | X_i = x]. τ_SRD and τ_SLR represent the causal effect under these two frameworks, at the cutoff c and in the window [c−Δ, c+Δ], respectively. This figure is adapted from (Cattaneo & Titiunik, 2022).
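The two estimands named in the caption can be written out explicitly. The following is a sketch using the standard textbook definitions, not a transcription from the paper; it assumes a sharp design in which units with X_i ≥ c are treated, and the window symbol W is introduced here for illustration.

```latex
% Continuity framework: the effect is defined at the cutoff itself.
\tau_{\mathrm{SRD}}
  = \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid X_i = c \right]
  = \lim_{x \downarrow c} \mathbb{E}\left[ Y_i \mid X_i = x \right]
  - \lim_{x \uparrow c} \mathbb{E}\left[ Y_i \mid X_i = x \right]

% Local randomization framework: the effect is an average over a window
% W = [c - \Delta, c + \Delta] in which assignment is treated as if random.
\tau_{\mathrm{SLR}}
  = \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid X_i \in W \right],
  \qquad W = [c - \Delta,\; c + \Delta]
```

The first quantity is identified by extrapolating both regression functions to the cutoff, while the second is identified by comparing treated and control units inside the window.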

Figure 2.

Data collection procedure. (a) Illustration of the data collection procedure. Specifically, we manually collect 3,387 RDD papers from Web of Science through keyword searching, and we obtain 2,061 RDD papers in the MAG by matching their DOIs with the Web of Science data. (b) The number of RDD papers in 19 MAG categories as a function of time. The main plot is smoothed using a three-year sliding window. The inset figure shows the total number of RDD papers from 1960 to 2021.

Figure 3.

The RDD keyword network and emergent words in WOS. (a) We illustrate the RDD keyword network, where nodes represent keywords and links indicate that two keywords appear in the same paper. The modularity Q is 0.37, indicating a strong community structure. Here, we display only the eight largest clusters and exclude the smaller ones. (b) Top 15 emergent words of RDD papers, which indicate research frontiers. Year indicates the year when the keyword first appeared, while Begin and End represent the first and last years in which the keyword was a research frontier. The rightmost graph displays the research frontiers in different time periods. For example, air pollution is the research frontier of RDD between 2021 and 2023.
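To make the network construction and the modularity score in this caption concrete, here is a minimal sketch in plain Python. It assumes an unweighted, undirected co-occurrence network and a partition that is already known; the toy keyword lists, the function names, and the cluster labels are hypothetical, and the caption does not say which software or weighting the authors used.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(papers):
    """Link two keywords whenever they appear in the same paper."""
    edges = set()
    for keywords in papers:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges.add((a, b))
    return edges


def modularity(edges, community):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ] for an
    unweighted, undirected graph, where L_c is the number of edges inside
    community c, d_c is its total degree, and m is the number of edges."""
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
        if community[a] == community[b]:
            internal[community[a]] += 1
    degree_sum = defaultdict(int)
    for node, k in degree.items():
        degree_sum[community[node]] += k
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy corpus: two tight keyword groups joined by one paper.
papers = [
    ["education", "schooling", "test scores"],
    ["elections", "incumbency", "voting"],
    ["test scores", "voting"],
]
community = {
    "education": 0, "schooling": 0, "test scores": 0,
    "elections": 1, "incumbency": 1, "voting": 1,
}
edges = cooccurrence_edges(papers)
q = modularity(edges, community)
```

Values of Q near zero mean the partition is no better than a random assignment of links, while larger values mean that links concentrate inside clusters.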

Figure 4.

The citation behaviors between RDD and other academic domains over time. (a) The fraction of references made by RDD papers to certain scientific domains. (b) The fraction of references made to RDD papers by papers in various scientific domains. (c) Reference strength from RDD papers to papers in other academic fields. (d) Reference strength from other academic fields to RDD papers. Black dashed lines in (c) and (d) represent φ = 1, and other dashed lines in (c) and (d) indicate that the strength of references from certain academic fields is lower than the average value across fields in 2016.

Figure 5.

The results of the analysis conducted in (Ludwig & Miller, 2007). (a) and (b) show the linear and quadratic fits, respectively, using rdplot for county mortality of children aged 5 to 9 in 1973-1983. (c) shows the quadratic fit using rdplot for county mortality of people aged 25 and older in 1973-1983. The data used in the analysis come from (Matias D. Cattaneo, 2021).

Regression discontinuity estimation of the effect of Head Start (HS) funding on mortality. Robust standard errors are in parentheses.

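| Variable | Mean | Nonparametric estimator (1) | Nonparametric estimator (2) | Nonparametric estimator (3) | Parametric, flexible linear (4) | Parametric, flexible quadratic (5) |
| --- | --- | --- | --- | --- | --- | --- |
| Bandwidth or poverty range | | 9 | 18 | 36 | 8 | 16 |
| Main results | | | | | | |
| Number of counties | | 524 | 954 | 2,161 | 482 | 858 |
| Mortality, Ages 5-9 (%) | 2.252 | −1.895* (0.984) | −1.198* (0.662) | −1.114** (0.501) | −2.201** (1.058) | −2.558** (1.096) |
| Mortality, Ages 25+ (%) | 132.626 | 2.204 (5.645) | 6.016 (4.025) | 5.872 (3.600) | 2.091 (5.872) | 2.574 (6.370) |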
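As an illustration of what a nonparametric estimate at a given bandwidth involves, the sketch below fits a weighted line on each side of the cutoff and takes the difference of the two fitted values at the cutoff. It is a simplified stand-in, not the estimator behind the table: the triangular kernel, the function names, and the noise-free synthetic data are assumptions made here, while the cutoff of 59.1984 and the bandwidth of 9 are borrowed from the tables on this page.

```python
def weighted_linear_fit(xs, ys, ws):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def local_linear_rd(x, y, cutoff, bandwidth):
    """Sharp RD estimate: difference between the right and left local
    linear intercepts at the cutoff, using triangular kernel weights."""
    def intercept_at_cutoff(on_this_side):
        xs, ys, ws = [], [], []
        for xi, yi in zip(x, y):
            dist = abs(xi - cutoff)
            if on_this_side(xi) and dist < bandwidth:
                xs.append(xi - cutoff)            # center at the cutoff
                ys.append(yi)
                ws.append(1.0 - dist / bandwidth)  # triangular kernel
        a, _ = weighted_linear_fit(xs, ys, ws)
        return a

    treated = intercept_at_cutoff(lambda v: v >= cutoff)
    control = intercept_at_cutoff(lambda v: v < cutoff)
    return treated - control


# Synthetic, noise-free data (an assumption for illustration only):
# a linear trend in the poverty rate with a jump of -1.5 at the cutoff.
cutoff = 59.1984
poverty = [40 + 40 * k / 1999 for k in range(2000)]
mortality = [3.0 + 0.02 * (p - cutoff) - 1.5 * (p >= cutoff) for p in poverty]

estimate = local_linear_rd(poverty, mortality, cutoff, bandwidth=9)
```

Because the synthetic outcome is exactly linear on each side, the estimate recovers the built-in jump of −1.5; with real data the estimate changes with the bandwidth, as the columns of the table show.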

The survey of studies that utilize RDD. Context describes the setting of the focal paper. Outcome(s) is the dependent variable of the focal paper. Treatment(s) is the treatment variable in the focal paper; in practice, the treatment variable is binary. Running variable(s) is the forcing variable that determines treatment assignment for individuals.

| Study | Context | Outcome(s) | Treatment(s) | Running variable(s) |
| --- | --- | --- | --- | --- |
| Economics | | | | |
| Yi et al. (Yi et al., 2022) | Great Famine in China | Risk tolerance and entrepreneurship in adulthood | Experiencing early-life hardship | Location |
| García-Jimeno et al. (García-Jimeno et al., 2022) | Women’s Temperance Crusade in America | Collective action decisions | Affective information networks | Location |
| Akhtari et al. (Akhtari et al., 2022) | The politically motivated replacement of personnel in the schools in Brazil | The quality of public education provision by the government | Political turnover | Share of votes |
| Van Der Klaauw (Van Der Klaauw, 2002) | East Coast college’s aid | College enrollment | Offering financial aid | Aid allocation decisions |
| Education | | | | |
| Davies et al. (Davies et al., 2018) | Reform of increasing the minimum school leaving age in England | Risk of diabetes and mortality | Remaining in school | Time |
| Huang et al. (Huang & Zhou, 2013) | Great Famine in China | Cognition estimated by episodic memory survey | Completion of primary school | Year of birth and entering primary schooling |
| Clark et al. (Clark & Royer, 2013) | Reform of increasing the minimum school leaving age in England | Adult mortality and health | Remaining in school | Time |
| Science of Science or Innovation Studies | | | | |
| Seeber et al. (Seeber et al., 2019) | Scientists’ promotion in Italian higher education system | Scientists’ number of self-citations | Undergoing the introduction of the habilitation procedure | Time |
| Wang et al. (Y. Wang et al., 2019) | Early-career setback, NIH R01 grant applications | Future career outcomes | Receiving the R01 grant | Priority score |
| Bol et al. (Bol et al., 2018) | Innovation Research Incentives Scheme for early career scientists, Netherlands | Winning a midcareer grant | Winning the early career award | Evaluation scores |
| Bronzini et al. (Bronzini & Iachini, 2014) | Firms’ R&D subsidy in northern Italy | Investment spending of firms | Receiving funding | Priority score |
| Jacob et al. (Jacob & Lefgren, 2011b) | NIH R01 grant applications | Subsequent publications and citations | Receiving an NIH research grant | Priority score |
| Jacob et al. (Jacob & Lefgren, 2011a) | NIH postdoctoral training grants | Subsequent publications and citations | Receiving an NIH postdoctoral training grant | Priority score |

County characteristics. Column 1 lists the county-level variables: the county poverty rate in 1960, and the mortality of children aged 5 to 9 and of people aged 25 and older in 1973-1983. Counties with a 1960 poverty rate of 49.198% to 59.198% are the control group, while counties with a 1960 poverty rate of 59.1984% to 69.1984% are the treatment group, i.e., the poorest counties funded by the HS funding program.

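| County-level data | Counties with 1960 poverty 49.198% to 59.198%: Mean | Counties with 1960 poverty 49.198% to 59.198%: Std. | Counties with 1960 poverty 59.1984% to 69.1984%: Mean | Counties with 1960 poverty 59.1984% to 69.1984%: Std. |
| --- | --- | --- | --- | --- |
| No. of observations (counties) | 347 | | 228 | |
| County Poverty Rate 1960 (%) | 54.08 | 2.861 | 63.40 | 2.644 |
| Mortality, Ages 5-9, 1973-1983 (%) | 3.044 | 5.897 | 2.316 | 4.566 |
| Mortality, Ages 25+, 1973-1983 (%) | 132.5 | 30.96 | 135.7 | 30.53 |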
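The summary statistics above allow a quick, naive comparison of the two groups in the spirit of the local randomization framework: a difference in mean child mortality within roughly ten percentage points of the cutoff. The sketch below only does arithmetic on the reported numbers; it assumes that "Std." is the standard deviation across counties and uses an unequal-variance standard error, and it is not the regression discontinuity estimate reported in the estimation table.

```python
import math

# Mortality, Ages 5-9, 1973-1983, copied from the county characteristics table.
control = {"n": 347, "mean": 3.044, "std": 5.897}  # poverty 49.198% to 59.198%
treated = {"n": 228, "mean": 2.316, "std": 4.566}  # poverty 59.1984% to 69.1984%

# Naive difference in means between treated and control counties.
diff = treated["mean"] - control["mean"]

# Standard error of the difference, assuming "Std." is the cross-county
# standard deviation and allowing unequal variances in the two groups.
se = math.sqrt(treated["std"] ** 2 / treated["n"]
               + control["std"] ** 2 / control["n"])

print(f"difference in means: {diff:.3f}, standard error: {se:.3f}")
```

A window this wide ignores any trend in mortality across the poverty rate, which is one reason the estimates fitted at the cutoff can differ from this raw gap.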
DOI: https://doi.org/10.2478/jdis-2023-0008 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 43 - 65
Submitted on: Mar 31, 2023
Accepted on: Apr 3, 2023
Published on: Jun 7, 2023
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2023 Meiling Li, Yang Zhang, Yang Wang, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.