Data Fusion for Joining Income and Consumption Information using Different Donor-Recipient Distance Metrics

Open Access | Jun 2022

References

  1. Albayrak, O., and T. Masterson. 2017. Quality of statistical match of household budget survey and SILC for Turkey, Levy Economics Institute, Working Paper (885). DOI: https://doi.org/10.2139/ssrn.2924849.
  2. Andridge, R.R., and R.J.A. Little. 2009. “The Use of Sample Weights in Hot Deck Imputation”. Journal of Official Statistics 25(1): 21–36. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/the-use-of-sample-weights-inhot-deck-imputation.pdf (accessed March 2022).
  3. Andridge, R.R., and R.J.A. Little. 2010. “A review of hot deck imputation for survey non-response”. International Statistical Review 78(1): 40–64. DOI: https://doi.org/10.1111/j.1751-5823.2010.00103.x.
  4. Beretta, L., and A. Santaniello. 2016. “Nearest neighbor imputation algorithms: a critical evaluation”. BMC Medical Informatics and Decision Making 16(3): 74. DOI: https://doi.org/10.1186/s12911-016-0318-z.
  5. Burgette, L.F., and J.P. Reiter. 2010. “Multiple imputation for missing data via sequential regression trees”. American Journal of Epidemiology 172(9): 1070–1076. DOI: https://doi.org/10.1093/aje/kwq260.
  6. Chen, J., and J. Shao. 2000. “Nearest Neighbor Imputation for Survey Data”. Journal of Official Statistics 16(2): 113–131. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/nearest-neighbor-imputation-for-survey-data.pdf.
  7. Conti, P.L., D. Marella, and M. Scanu. 2012. “Uncertainty analysis in statistical matching”. Journal of Official Statistics 28(1): 69–88. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/uncertainty-analysis-in-statistical-matching.pdf.
  8. Dalla Chiara, E., M. Menon, and F. Perali. 2019. “An Integrated Database to Measure Living Standards”. Journal of Official Statistics 35(3): 531–576. DOI: https://doi.org/10.2478/JOS-2019-0023.
  9. Donatiello, G., M. D’Orazio, D. Frattarola, A. Rizzi, M. Scanu, and M. Spaziani. 2014. “Statistical matching of income and consumption expenditures”. International Journal of Economic Sciences 3(3): 50–65.
  10. D’Orazio, M. 2020. Statmatch: Statistical matching or data fusion: R-package. Available at: https://cran.r-project.org/web/packages/StatMatch/StatMatch.pdf (accessed September 2021).
  11. D’Orazio, M., M. Di Zio, and M. Scanu. 2006a. Statistical matching: Theory and practice, John Wiley & Sons. DOI: https://doi.org/10.1002/0470023554.
  12. D’Orazio, M., M. Di Zio, and M. Scanu. 2006b. “Statistical Matching for Categorical Data: Displaying Uncertainty and Using Logical Constraints”. Journal of Official Statistics 22(1): 137–157. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/statistical-matching-for-categorical-data-displaying-uncertainty-and-using-logical-constraints.pdf.
  13. D’Orazio, M., D. Frattarola, A. Rizzi, M. Scanu, and M. Spaziani. 2018. The statistical matching of EU-SILC and HBS at ISTAT: where do we stand for the production of official statistics. Available at: https://www.istat.it/it/files//2018/11/Scanuoriginal-paper.pdf (accessed September 2021).
  14. Endres, E., P. Fink, and T. Augustin. 2019. “Imprecise Imputation: A Nonparametric Micro Approach Reflecting the Natural Uncertainty of Statistical Matching with Categorical Data”. Journal of Official Statistics 35(3): 599–624. DOI: https://doi.org/10.2478/JOS-2019-0025.
  15. EU-SILC SUF DE. 2015. European union statistics on income and living conditions. Scientific use file Germany. Available at: https://ec.europa.eu/eurostat/cros/EU-SILC-SUF_en.
  16. EU-SILC SUF FR. 2015. European union statistics. Scientific use file France. Available at: https://ec.europa.eu/eurostat/cros/EU-SILC-SUF_en.
  17. Eurostat. 2013. European household income by groups of households. Available at: https://ec.europa.eu/eurostat/documents/3888793/5858173/KS-RA-13-023-EN.PDF (accessed September 2021).
  18. Eurostat. 2016. Methodological Guidelines and Description of EU-SILC Target Variables: DocSILC065, 2015 Operation. Available at: https://circabc.europa.eu/sd/a/afb4601b-4e5c-4f40-86bb-0c3d0d94aa12/DOCSILC065operation2015VERSION08-08-2016.pdf.
  19. Eurostat. 2018. R code to match EU-SILC and HBS.
  20. Fosdick, B.K., M. DeYoreo, and J.P. Reiter. 2016. “Categorical data fusion using auxiliary information”. The Annals of Applied Statistics 10(4): 1907–1929. DOI: https://doi.org/10.1214/16-AOAS925.
  21. Gabler, S. 1997. “Datenfusion”. ZUMA-Nachrichten 21(40): 81–92.
  22. Gilula, Z., R.E. McCulloch, and P.E. Rossi. 2006. “A direct approach to data fusion”. Journal of Marketing Research 43(1): 73–83. DOI: https://doi.org/10.1509/jmkr.43.1.73.
  23. Gower, J.C. 1971. “A general coefficient of similarity and some of its properties”. Biometrics 27(4): 857–871. DOI: https://doi.org/10.2307/2528823.
  24. Kamakura, W.A., and M. Wedel. 1997. “Statistical data fusion for cross-tabulation”. Journal of Marketing Research 34(4): 485–498. DOI: https://doi.org/10.1177/002224379703400406.
  25. Kiesl, H., and S. Rässler. 2005. “Techniken und Einsatzgebiete von Datenintegration und Datenfusion”. In Datenfusion und Datenintegration: 6. Wissenschaftliche Tagung, Tagungsberichte, Bonn: 17–32.
  26. Kiesl, H., and S. Rässler. 2006. How valid can data fusion be? Available at: http://doku.iab.de/discussionpapers/2006/dp1506.pdf (accessed September 2021).
  27. Kim, J.K. 2002. “A note on approximate Bayesian bootstrap imputation”. Biometrika 89(2): 470–477. DOI: https://doi.org/10.1093/biomet/89.2.470.
  28. Kleinke, K. 2017. “Multiple imputation under violated distributional assumptions: A systematic evaluation of the assumed robustness of predictive mean matching”. Journal of Educational and Behavioral Statistics 42(4): 371–404. DOI: https://doi.org/10.3102/1076998616687084.
  29. Koller-Meinfelder, F. 2009. Analysis of Incomplete Survey Data – Multiple Imputation via Bayesian Bootstrap Predictive Mean Matching. PhD thesis, Bamberg. Available at: https://fis.uni-bamberg.de/bitstream/uniba/213/2/Dokument_1.pdf.
  30. Koschnick, W.J. 1995. Standard-Lexikon für Mediaplanung und Mediaforschung in Deutschland: Bd. 1.2, 2., überarb. aufl. edn, Saur, München.
  31. Lamarche, P. 2017. Measuring Income, Consumption and Wealth jointly at the Micro-Level. Eurostat. Available at: https://ec.europa.eu/eurostat/documents/7894008/8074103/income_methodological_note.pdf.
  32. Lamarche, P. 2018. Measuring Income, Consumption and Wealth jointly at the microlevel. Eurostat.
  33. Landerman, L.R., K.C. Land, and C.F. Pieper. 1997. “An empirical evaluation of the predictive mean matching method for imputing missing values”. Sociological Methods & Research 26(1): 3–33. DOI: https://doi.org/10.1177/0049124197026001001.
  34. Leulescu, A., and M. Agafitei. 2013. Statistical matching: A model based approach for data integration. Available at: https://ec.europa.eu/eurostat/documents/3888793/5855821/KS-RA-13-020-EN.PDF (accessed September 2021).
  35. Little, R.J.A. 1988. “Missing-data adjustments in large surveys”. Journal of Business & Economic Statistics 6(3): 287–296. DOI: https://doi.org/10.1080/07350015.1988.10509663.
  36. Little, R.J.A., and D.B. Rubin. 2020. Statistical analysis with missing data, third edition, John Wiley & Sons. DOI: https://doi.org/10.1002/9781119482260.
  37. Lumley, T., and A. Miller. 2020. leaps: Regression subset selection: R-package. Available at: https://cran.r-project.org/web/packages/leaps/leaps.pdf (accessed September 2021).
  38. Meinfelder, F. 2013. “Datenfusion: Theoretische Implikationen und praktische Umsetzung”. In Weiterentwicklung der amtlichen Haushaltsstatistiken, edited by T. Riede, N. Ott, S. Bechthold, T. Schmidt, M. Eisele, B. Schimpl-Neimanns, F. Meinfelder, R. Münnich, J.P. Burgard and T. Zimmermann: 83–98.
  39. Meinfelder, F., and T. Schnapp. 2015. Baboon: Bayesian bootstrap predictive mean matching – multiple and single imputation for discrete data: R-package. Available at: https://cran.r-project.org/web/packages/BaBooN/BaBooN.pdf (accessed September 2021).
  40. Meng, X.-L. 1994. “Multiple-imputation inferences with uncongenial sources of input”. Statistical Science 9(4): 538–558. DOI: https://doi.org/10.1214/ss/1177010269.
  41. Okner, B. 1972. “Constructing a new data base from existing microdata sets: The 1966 merge file”. In Annals of Economic and Social Measurement, 3(1): 325–362, National Bureau of Economic Research, Inc.
  42. Parzen, M., S.R. Lipsitz, and G.M. Fitzmaurice. 2005. “A note on reducing the bias of the approximate Bayesian bootstrap imputation variance estimator”. Biometrika 92(4): 971–974. DOI: https://doi.org/10.1093/biomet/92.4.971.
  43. Pfeffermann, D., and A. Sikov. 2011. “Imputation and Estimation under Nonignorable Nonresponse in Household Surveys with Missing Covariate Information”. Journal of Official Statistics 27(2): 181–209. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/imputation-and-estimation-under-nonignorablenonresponse-in-household-surveys-with-missing-covariate-information.pdf (accessed March 2022).
  44. Quartagno, M., J.R. Carpenter, and H. Goldstein. 2020. “Multiple imputation with survey weights: a multilevel approach”. Journal of Survey Statistics and Methodology 8(5): 965–989. DOI: https://doi.org/10.1093/jssam/smz036.
  45. R Core Team. 2021. R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. Available at: https://www.R-project.org/ (accessed September 2021).
  46. Rässler, S. 2002. “Statistical matching: A frequentist theory, practical applications, and alternative Bayesian approaches”. Vol. 168 of Lecture notes in statistics, Springer, New York. DOI: https://doi.org/10.1007/978-1-4613-0053-3_2.
  47. Rodgers, W.L. 1984. “An evaluation of statistical matching”. Journal of Business & Economic Statistics 2: 91–102. DOI: https://doi.org/10.2307/1391358.
  48. Rubin, D.B. 1978. “Multiple imputation in sample surveys – a phenomenological Bayesian approach to nonresponse”. In Proceedings of the Survey Research Method Section of the American Statistical Association: 20–40. Available at: http://www.asasrms.org/GGTSPU-f422b6f0b7825427-56279-110474-QWt4FYDtNN9fK3kX-LOD/Proceedings/papers/1978_004.pdf.
  49. Rubin, D.B. 1986. “Statistical matching using file concatenation with adjusted weights and multiple imputations”. Journal of Business & Economic Statistics 4(1): 87–94. DOI: https://doi.org/10.1080/07350015.1986.10509497.
  50. Rubin, D.B. 1987. Multiple Imputation for Nonresponse in Surveys, Wiley, New York. DOI: https://doi.org/10.1002/9780470316696.
  51. Rubin, D.B., and N. Schenker. 1986. “Multiple imputation for interval estimation from simple random samples with ignorable nonresponse”. Journal of the American Statistical Association 81(394): 366–374. DOI: https://doi.org/10.2307/1391390.
  52. Serafino, P., and R. Tonkin. 2017. Statistical Matching of European Union Statistics on Income and Living Conditions (EU-SILC) and the Household Budget Survey, Eurostat. Available at: https://ec.europa.eu/eurostat/documents/3888793/7882299/KS-TC-16-026-ENN.pdf (accessed September 2021).
  53. Sims, C.A. 1972. “Comments (on Okner 1972)”. Annals of Economic and Social Measurement 1: 343–345.
  54. Singh, A.C., H.J. Mantel, M.D. Kinack, and G. Rowe. 1993. “Statistical matching: Use of auxiliary information as an alternative to the conditional independence assumption”. Survey Methodology 19(1): 59–79.
  55. Stiglitz, J., A. Sen, and J. Fitoussi. 2009. Report of the Commission on the Measurement of Economic Performance and Social Progress (CMEPSP). Available at: https://ec.europa.eu/eurostat/documents/8131721/8131772/Stiglitz-Sen-Fitoussi-Commission-report.pdf.
  56. Uçar, B., and G. Betti. 2016. Longitudinal statistical matching: transferring consumption expenditure from hbs to silc panel survey, Technical report, Department of Economics, University of Siena. Available at: http://repec.deps.unisi.it/quaderni/739.pdf.
  57. Van Buuren, S. 2018. Flexible imputation of missing data, CRC Press. DOI: https://doi.org/10.1201/9780429492259.
  58. Van Buuren, S. 2021. Mice: Multivariate imputation by chained equations: R-package. Available at: https://cran.r-project.org/web/packages/mice/mice.pdf (accessed September 2021).
  59. Van Buuren, S., and K. Groothuis-Oudshoorn. 2011. “Mice: Multivariate imputation by chained equations in R”. Journal of Statistical Software 45(3): 1–67. DOI: https://doi.org/10.18637/jss.v045.i03.
  60. Van der Putten, P., J.N. Kok, and A. Gupta. 2002. Data fusion through statistical matching: Working paper 4342-02, MIT Sloan School of Management. DOI: https://doi.org/10.2139/ssrn.297501.
  61. Webber, D., and R. Tonkin. 2013. Statistical Matching of EU-SILC and the Household Budget Survey to Compare Poverty Estimates Using Income, Expenditures and Material Deprivation, Eurostat. Available at: https://ec.europa.eu/eurostat/documents/3888793/5857145/KS-RA-13-007-EN.PDF (accessed September 2021).
  62. Xie, X., and X.-L. Meng. 2017. “Dissecting multiple imputation from a multi-phase inference perspective: What happens when God’s, imputer’s and analyst’s models are uncongenial?”. Statistica Sinica: 1485–1545. DOI: https://doi.org/10.5705/ss.2014.067.
  63. Zhang, L.-C. 2015. “On Proxy Variables and Categorical Data Fusion”. Journal of Official Statistics 31(4): 783–807. DOI: https://doi.org/10.1515/JOS-2015-0045.
  64. Zhou, H. 2014. Accounting for Complex Sample Designs in Multiple Imputation Using the Finite Population Bayesian Bootstrap, PhD thesis, Michigan. Available at: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.911.6156&rep=rep1&type=pdf.
Language: English
Page range: 509–532
Submitted on: Nov 1, 2020
Accepted on: Sep 1, 2021
Published on: Jun 14, 2022
Published by: Sciendo
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2022 Florian Meinfelder, Jannik Schaller, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.