Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal

Open Access
Jun 2020

References

  1. Alajajian, S.E., J.R. Williams, A.J. Reagan, S.C. Alajajian, M.R. Frank, L. Mitchell, J. Lahne, C.M. Danforth, and P.S. Dodds. 2017. “The Lexicocalorimeter: Gauging public health through caloric input and output on social media.” PLOS ONE 12(2)(February): 1–25. DOI: https://doi.org/10.1371/journal.pone.0168893.
  2. Baker, R., J.M. Brick, N.A. Bates, M. Battaglia, M.P. Couper, J.A. Dever, K.J. Gile, and R. Tourangeau. 2013. “Summary Report of the AAPOR Task Force on Non-probability Sampling.” Journal of Survey Statistics and Methodology 1(2): 90. DOI: https://doi.org/10.1093/jssam/smt008.
  3. Bollen, J., B. Gonçalves, G. Ruan, and H. Mao. 2011. “Happiness is Assortative in Online Social Networks.” Artif. Life (Cambridge, MA, USA) 17(3)(August): 237–251. DOI: https://doi.org/10.1162/artl_a_00034.
  4. Braaksma, B. and K. Zeelenberg. 2015. “Re-make/Re-model: Should big data change the modelling paradigm in official statistics?” Statistical Journal of the IAOS 31(2): 193–202. DOI: https://doi.org/10.3233/sji-150892.
  5. Ceron, A., L. Curini, and S.M. Iacus. 2016. “iSA: A fast, scalable and accurate algorithm for sentiment analysis of social media content.” Information Sciences 367–368: 105–124. ISSN: 0020-0255. DOI: https://doi.org/10.1016/j.ins.2016.05.052.
  6. Clark, A.E. and A.J. Oswald. 1994. “Unhappiness and Unemployment.” Economic Journal 104(424): 648–659. DOI: https://doi.org/10.2307/2234639.
  7. Cooper, D. and M. Greenaway. 2015. Non-probability Survey Sampling in Official Statistics. Office for National Statistics – Methodology Working Paper Series N4. Available at: https://www.k/ons/guide-method/method-quality/specific/gss-methodology-series/ons-working-paper-series/mwp3-non-probability-survey-sampling-inofficial-statistics.pdf (accessed May 2020).
  8. Couper, M.P. 2013. “Is the Sky Falling? New Technology, Changing Media, and the Future of Surveys.” Survey Research Methods 7(3): 145–156. ISSN: 1864-3361. DOI: https://doi.org/10.18148/srm/2013.v7i3.5751.
  9. Culotta, A. 2014. “Estimating County Health Statistics with Twitter.” In Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems, 1335–1344. CHI ’14. Toronto, Ontario, Canada: ACM. ISBN: 978-1-4503-2473-1. DOI: https://doi.org/10.1145/2556288.2557139.
  10. Curini, L., S. Iacus, and L. Canova. 2015. “Measuring Idiosyncratic Happiness Through the Analysis of Twitter: An Application to the Italian Case.” Social Indicators Research 121(2): 525–542. ISSN: 1573-0921. DOI: https://doi.org/10.1007/s11205-014-0646-2.
  11. Daas, P.J.H., M.J. Puts, B. Buelens, and P.A.M. van den Hurk. 2015. “Big Data as a Source for Official Statistics.” Journal of Official Statistics 31(2): 249–262. DOI: https://doi.org/10.1515/jos-2015-0016.
  12. Deaton, A. 2011. “The Financial Crisis and the Well-Being of America.” In Investigations in the Economics of Aging, edited by David A. Wise, 343–368. University of Chicago Press, June. DOI: https://doi.org/10.7208/chicago/9780226903163.003.0011.
  13. Falorsi, S., A. Fasulo, A. Naccarato, and M. Pratesi. 2017. Small Area model for Italian regional monthly estimates of young unemployed using Google Trends Data. 61st World Congress of the International Statistical Institute 16–21 July 2017 – Marrakech, Morocco, October. Available at: https://www.researchgate.net/publication/320554956_Small_Area_model_for_Italian_regional_monthly_estimates_of_young_unemployed_using_Google_Trends_Data (accessed May 2020).
  14. Fay, R.E. and R.A. Herriot. 1979. “Estimates of Income for Small Places: An Application of James-Stein Procedures to Census Data.” Journal of the American Statistical Association 74(366): 269–277. ISSN: 01621459. DOI: https://doi.org/10.2307/2286322.
  15. Feddersen, J., R. Metcalfe, and M. Wooden. 2016. “Subjective wellbeing: why weather matters.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 179(1): 203–228. ISSN: 1467-985X. DOI: https://doi.org/10.1111/rssa.12118.
  16. Fleurbaey, M. 2009. “Beyond GDP: The Quest for a Measure of Social Welfare.” Journal of Economic Literature 47(4): 1029–1075. DOI: https://doi.org/10.1257/jel.47.4.1029.
  17. Ghosh, M., N. Nangia, and D.H. Kim. 1996. “Estimation of Median Income of Four-Person Families: A Bayesian Time Series Approach.” Journal of the American Statistical Association 91(436): 1423–1431. ISSN: 01621459. DOI: https://doi.org/10.2307/2291568.
  18. Heckman, J.J. 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47(1): 153–161. ISSN: 00129682, 14680262. DOI: https://doi.org/10.2307/1912352.
  19. Henderson, C.R. 1975. “Best Linear Unbiased Estimation and Prediction under a Selection Model.” Biometrics 31(2): 423–447. ISSN: 0006341X, 15410420. DOI: https://doi.org/10.2307/2529430.
  20. Hofacker, C.F., E.C. Malthouse, and F. Sultan. 2016. “Big Data and consumer behavior: imminent opportunities.” Journal of Consumer Marketing 33(2): 89–97. DOI: https://doi.org/10.1108/JCM-04-2015-1399.
  21. Iacus, S.M. 2014. “Big Data or Big Fail? The Good, the Bad and the Ugly and the missing role of Statistics.” Electronic Journal of Applied Statistical Analysis: Decision Support Systems and Services Evaluation 5(1): 4–11. DOI: https://doi.org/10.1285/i2037-3627v5n1p4.
  22. Iacus, S.M., G. Porro, S. Salini, and E. Siletti. 2015. “Social networks, happiness and health: from sentiment analysis to a multidimensional indicator of subjective well-being.” ArXiv e-prints Statistics – Applications (December): 1–26. Available at: 1512.01569 [stat.AP] (accessed December 2015).
  23. Iacus, S.M., G. Porro, S. Salini, and E. Siletti. 2017. “How to exploit big data from social networks: a subjective well-being indicator via Twitter.” In SIS 2017. Statistics and data science: new challenges, new generations. Proceedings of the Conference of the Italian Statistical Society, edited by Alessandra Petrucci and Rosanna Verde, 537–542. 28–30 June 2017, Firenze: Firenze University Press. ISBN: 978-88-6453-521-0.
  24. Iacus, S.M., G. Porro, S. Salini, and E. Siletti. 2019. “Social Networks Data and Subjective Well-Being. An Innovative Measurement for Italian Provinces.” Scienze Regionali, Italian Journal of Regional Science Speciale (2019): 667–678. ISSN: 1720-3929. DOI: https://doi.org/10.14650/94673.
  25. Kahneman, D. and A.B. Krueger. 2006. “Developments in the Measurement of Subjective Well-Being.” Journal of Economic Perspectives 20(1): 3–24. DOI: https://doi.org/10.1257/089533006776526030.
  26. King, G. 2011. “Ensuring the Data Rich Future of the Social Sciences.” Science 331(February): 719–721. DOI: https://doi.org/10.1126/science.1197872.
  27. King, G. 2016. “Preface: Big Data is Not About the Data!” Chap. 1 in Computational Social Science: Discovery and Prediction, edited by R. Michael Alvarez, 1–10. Cambridge: Cambridge University Press.
  28. King, G., J. Pan, and M.E. Roberts. 2013. “How Censorship in China Allows Government Criticism but Silences Collective Expression.” American Political Science Review 107(2): 326–343. DOI: https://doi.org/10.1017/S0003055413000014.
  29. King, G., J. Pan, and M.E. Roberts. 2014. “Reverse-engineering censorship in China: Randomized experimentation and participant observation.” Science 345(6199): 891–913. ISSN: 0036-8075. DOI: https://doi.org/10.1126/science.1251722.
  30. King, G., J. Pan, and M.E. Roberts. 2017. “How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument.” American Political Science Review 111(3): 484–501. DOI: https://doi.org/10.1017/S0003055417000144.
  31. Kitchin, R. 2015. “The opportunities, challenges and risks of big data for official statistics.” Statistical Journal of the IAOS 31(3): 471–481. DOI: https://doi.org/10.3233/SJI-150906.
  32. Kwong, B.M., S.M. McPherson, J.F.A. Shibata, and O.T. Zee. 2012. “Facebook: Data mining the world’s largest focus group.” Graziadio Business Review 15: 1–8. Available at: https://gbr.pepperdine.edu/2012/11/facebook-data-mining-the-worlds-largest-focus-group/ (accessed April 2020).
  33. Lazer, D., A. Pentland, L. Adamic, S. Aral, A.-L. Barabási, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, T. Jebara, G. King, M. Macy, D. Roy, and M. van Alstyne. 2009. “Computational Social Science.” Science 323(5915): 721–723. DOI: https://doi.org/10.1126/science.1167742.
  34. Marchetti, S., C. Giusti, and M. Pratesi. 2016. “The use of Twitter data to improve small area estimates of households’ share of food consumption expenditure in Italy.” AStA Wirtschafts- und Sozialstatistisches Archiv 10(2)(October): 79–93. ISSN: 1863-8163. DOI: https://doi.org/10.1007/s11943-016-0190-4.
  35. Marchetti, S., C. Giusti, M. Pratesi, N. Salvati, F. Giannotti, D. Pedreschi, S. Rinzivillo, L. Pappalardo, and L. Gabrielli. 2015. “Small Area Model-Based Estimators Using Big Data Sources.” Journal of Official Statistics 31(2): 263–281. DOI: https://doi.org/10.1515/jos-2015-0017.
  36. Marhuenda, Y., I. Molina, and D. Morales. 2013. “Small area estimation with spatio-temporal Fay-Herriot models.” The Third Special Issue on Statistical Signal Extraction and Filtering, Computational Statistics & Data Analysis 58: 308–325. ISSN: 0167-9473. DOI: https://doi.org/10.1016/j.csda.2012.09.002.
  37. Molina, I. and Y. Marhuenda. 2015. “sae: An R package for small area estimation.” The R Journal 7(1): 81–98. DOI: https://doi.org/10.32614/RJ-2015-007.
  38. Murphy, J., M.W. Link, J. Childs, C. Tesfaye, E. Dean, M. Stern, J. Pasek, J. Cohen, M. Callegaro, and P. Harwood. 2014. “Social Media in Public Opinion Research: Executive summary of the AAPOR task force on Emerging Technologies in Public Opinion Research.” Public Opinion Quarterly 78(4): 788–794. DOI: https://doi.org/10.1093/poq/nfu053.
  39. New Economics Foundation. 2012. The Happy Planet Index: 2012 Report. A global index of sustainable well-being. New Economics Foundation. Available at: https://neweconomics.org/uploads/files/d8879619b64bae461f_opm6ixqee.pdf (accessed August 2015).
  40. Pentland, A. 2014. Social Physics: how good ideas spread – the lessons from a new science. EBL-Schweitzer. Scribe Publications Pty Limited. ISBN: 978113143.
  41. Porter, A.T., S.H. Holan, C.K. Wikle, and N. Cressie. 2014. “Spatial Fay-Herriot models for small area estimation with functional covariates.” Spatial Statistics 10: 27–42. DOI: https://doi.org/10.1016/j.spasta.2014.07.001.
  42. Rao, J.N.K. and M. Yu. 1994. “Small-Area Estimation by Combining Time-Series and Cross-Sectional Data.” The Canadian Journal of Statistics 22(4): 511–528. ISSN: 03195724. DOI: https://doi.org/10.2307/3315407.
  43. Rao, J.N.K. 2005. Small Area Estimation. Wiley Series in Survey Methodology. John Wiley & Sons, January. ISBN: 9780471431626.
  44. Rosenbaum, P.R. and D.B. Rubin. 1983. “The central role of the propensity score in observational studies for causal effects.” Biometrika 70(1): 41–55. DOI: https://doi.org/10.2307/2335942.
  45. Schwarz, N. 1999. “Self-reports: how the questions shape the answers.” American psychologist 54(2): 93–105. DOI: https://doi.org/10.1037/0003-066X.54.2.93.
  46. Schwarz, N. and F. Strack. 1999. “Reports of subjective well-being: Judgmental processes and their methodological implications.” In Well-being: The foundations of hedonic psychology, edited by D. Kahneman, E. Diener, and N. Schwarz, 7: 61–84. New York: Russell Sage Foundation.
  47. Severo, M., A. Feredj, and A. Romele. 2016. “Soft Data and Public Policy: Can Social Media Offer Alternatives to Official Statistics in Urban Policymaking?” Policy & Internet 8(3)(September): 354–372. ISSN: 1944-2866. DOI: https://doi.org/10.1002/poi3.127.
  48. Singh, B.B., G.K. Shukla, and D. Kundu. 2005. “Spatio-temporal models in small area estimation.” Survey Methodology 31(2): 183–195. DOI: https://doi.org/10.1.1.617.1513.
  49. Stiglitz, J., A. Sen, and J.-P. Fitoussi. 2009. Report by the Commission on the Measurement of Economic Performance and Social Progress. INSEE. Available at: https://www.researchgate.net/publication/258260767_Report_of_the_Commission_on_the_Measurement_of_Economic_Performance_and_Social_Progress_CMEPSP (accessed April 2020).
  50. Struijs, P., B. Braaksma, and P.J.H. Daas. 2014. “Official statistics and Big Data.” Big Data & Society 1(1): 1–6. DOI: https://doi.org/10.1177/2053951714538417.
  51. Tam, S.-M. and F. Clarke. 2015. “Big Data, Official Statistics and Some Initiatives by the Australian Bureau of Statistics.” International Statistical Review 83(3)(December): 436–448. DOI: https://doi.org/10.1111/insr.12105.
  52. Van den Brakel, J., J. Söhler, P.J.H. Daas, and B. Buelens. 2017. “Social media as a data source for official statistics; the Dutch Consumer Confidence Index.” Survey Methodology 12-001-X (43): 183–210. DOI: https://doi.org/10.13140/RG.2.2.19294.64326.
  53. Winkelmann, R. 2014. “Unhappiness and Unemployment.” IZA World of Labor 94. DOI: https://doi.org/10.15185/izawol.94.
  54. Ybarra, L.M.R. and S.L. Lohr. 2008. “Small Area Estimation When Auxiliary Information Is Measured with Error.” Biometrika 95(4): 919–931. ISSN: 00063444. DOI: https://doi.org/10.1093/biomet/asn048.
  55. Zhao, Y., F. Yu, B. Jing, X. Hu, A. Luo, and K. Peng. 2018. “An Analysis of Well-Being Determinants at the City Level in China Using Big Data.” Social Indicators Research (October). ISSN: 1573-0921. DOI: https://doi.org/10.1007/s11205-018-2015-z.
Language: English
Page range: 315–338
Submitted on: Mar 1, 2019
Accepted on: Jan 1, 2020
Published on: Jun 15, 2020
Published by: Sciendo
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2020 Stefano M. Iacus, Giuseppe Porro, Silvia Salini, Elena Siletti, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.