Using the Bootstrap to Account for Linkage Errors when Analysing Probabilistically Linked Categorical Data

Open Access | Sep 2015

Abstract

Record linkage is the act of bringing together records that are believed to belong to the same unit (e.g., person or business) from two or more files. Record linkage is not an error-free process and can lead to linking a pair of records that do not belong to the same unit. This occurs because linking fields on the files, which ideally would uniquely identify each unit, are often imperfect. There has been an explosion of record linkage applications, particularly involving government agencies and in the field of health, yet there has been little work on making correct inferences using such linked files. Naively treating a linked file as if it were linked without errors can lead to biased inferences. This article develops a method of making inferences for cross-tabulated variables when record linkage is not an error-free process. In particular, it develops a parametric bootstrap approach to estimation that can accommodate the sophisticated probabilistic record linkage techniques that are widely used in practice (e.g., 1-1 linkage). The article demonstrates the effectiveness of this method in a simulation and in a real application.
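
To make the bias claim concrete, the following is a minimal sketch of how linkage errors can distort a cross tabulation. It is not the article's parametric bootstrap: it assumes a toy setting with two binary variables and a 1-1 linkage in which a fixed share of records have their partners permuted at random. The function names, sample size, agreement probability, and error rate are all illustrative assumptions.

```python
import random


def cross_tab(xs, ys):
    """Return a 2x2 table of counts, indexed as table[x][y]."""
    table = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        table[x][y] += 1
    return table


def simulate(n=10_000, p_agree=0.9, error_rate=0.3, seed=0):
    """Compare the table under true links with the table under faulty links.

    Assumed toy model (hypothetical, for illustration only):
    x sits on one file, y on the other, and y equals x with
    probability p_agree when the records are correctly linked.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    y = [xi if rng.random() < p_agree else 1 - xi for xi in x]

    # Pick the records whose links may be wrong and permute their partners.
    # A permutation keeps the linkage 1-1, so both margins are unchanged.
    wrong = rng.sample(range(n), int(error_rate * n))
    shuffled = wrong[:]
    rng.shuffle(shuffled)
    linked_y = y[:]
    for i, j in zip(wrong, shuffled):
        linked_y[i] = y[j]

    return cross_tab(x, y), cross_tab(x, linked_y)


true_table, linked_table = simulate()
print("true links:  ", true_table)
print("faulty links:", linked_table)
```

Under these assumptions the margins of the two tables are identical, but the diagonal cells of the faulty-link table are smaller, so the association between the two variables looks weaker than it really is. An analysis that ignores linkage error would report that attenuated association.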

Language: English
Page range: 397 - 414
Submitted on: Jan 1, 2014
Accepted on: Dec 1, 2014
Published on: Sep 1, 2015
Published by: Sciendo
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2015 James O. Chipperfield, Raymond L. Chambers, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.