Versioning Data Is About More than Revisions: A Conceptual Framework and Proposed Principles

Open Access | Mar 2021

References

  1. Albertoni, R, Browning, D, Cox, SJD, Gonzalez-Beltran, A, Perego, A, Winstanley, P, Maali, F and Erickson, JS. 2019. Data Catalog Vocabulary (DCAT) – Version 2 (W3C Proposed Recommendation). Cambridge, MA: World Wide Web Consortium (W3C). Available at https://www.w3.org/TR/2019/PR-vocab-dcat-2-20191119/.
  2. Allison, DB, Brown, AW, George, BJ and Kaiser, KA. 2016. Reproducibility: A tragedy of errors. Nature News, 530(7588): 27. DOI: 10.1038/530027a
  3. Asch, M, Moore, T, Badia, R, Beck, M, Beckman, P, Bidot, T, Bodin, F, Cappello, F, Choudhary, A, de Supinski, B, Deelman, E, Dongarra, J, Dubey, A, Fox, G, Fu, H, Girona, S, Gropp, W, Heroux, M, Ishikawa, Y, Keahey, K, Keyes, D, Kramer, W, Lavignon, J-F, Lu, Y, Matsuoka, S, Mohr, B, Reed, D, Requena, S, Saltz, J, Schulthess, T, Stevens, R, Swany, M, Szalay, A, Tang, W, Varoquaux, G, Vilotte, J-P, Wisniewski, R, Xu, Z and Zacharov, I. 2018. Big data and extreme-scale computing: Pathways to Convergence-Toward a shaping strategy for a future software and data ecosystem for scientific inquiry. The International Journal of High Performance Computing Applications, 32(4): 435–479. DOI: 10.1177/1094342018778123
  4. Bryan, J. 2018. Excuse Me, Do You Have a Moment to Talk About Version Control? The American Statistician, 72(1): 20–27. DOI: 10.1080/00031305.2017.1399928
  5. Ciccarese, P, Soiland-Reyes, S, Belhajjame, K, Gray, AJ, Goble, C and Clark, T. 2013. PAV ontology: provenance, authoring and versioning. Journal of Biomedical Semantics, 4(1): 37. DOI: 10.1186/2041-1480-4-37
  6. Cudahy, T. 2012. Satellite ASTER Geoscience Product Notes for Australia (No. EP125895). Canberra, Australia: Commonwealth Scientific and Industrial Research Organisation. [Last accessed 20 January 2020]. DOI: 10.4225/08/584d948f9bbd1
  7. DataCite Metadata Working Group. 2018. DataCite Metadata Schema Documentation for the Publication and Citation of Research Data (No. Version 4.2). Hannover, Germany: DataCite e.V. DOI: 10.5438/bmjt-bx77
  8. Dataset Exchange Working Group. 2017. Dataset Exchange Working Group. W3C Dataset Exchange Working Group. Available at https://www.w3.org/2017/dxwg/wiki/Main_Page [Last accessed 20 March 2019].
  9. Diepenbroek, M, Grobe, H, Reinke, M, Schindler, U, Schlitzer, R, Sieger, R and Wefer, G. 2002. PANGAEA – an information system for environmental sciences. Computers & Geosciences, 28(10): 1201–1210. DOI: 10.1016/S0098-3004(02)00039-0
  10. ESIP Data Preservation and Stewardship Committee. 2019. Data Citation Guidelines for Earth Science Data, Version 2. Earth Science Information Partners. DOI: 10.6084/m9.figshare.8441816.v1
  11. Fitzpatrick, B, Pilato, CM and Collins-Sussman, B. 2009. Version Control with Subversion. Sebastopol, CA: O’Reilly Media, Inc. Available at http://svnbook.red-bean.com/ [Last accessed 11 March 2019].
  12. Haller, A, Janowicz, K, Cox, SJD, Lefrançois, M, Phuoc, DL, Lieberman, J, García-Castro, R, Atkinson, RA and Stadler, C. 2018. The Modular SSN Ontology: A Joint W3C and OGC Standard Specifying the Semantics of Sensors, Observations, Sampling, and Actuation. Semantic Web Journal, online (1878). Available at http://www.semantic-web-journal.net/content/modular-ssn-ontology-joint-w3c-and-ogc-standard-specifying-semantics-sensors-observations [Last accessed 11 June 2018]. DOI: 10.3233/SW-180320
  13. Hourclé, JA. 2009. FRBR applied to scientific data. Proceedings of the American Society for Information Science and Technology, 45(1): 1–4. DOI: 10.1002/meet.2008.14504503102
  14. Klump, J, Huber, R and Diepenbroek, M. 2016. DOI for geoscience data – how early practices shape present perceptions. Earth Science Informatics, 9(1): 123–136. DOI: 10.1007/s12145-015-0231-5
  15. Klump, J, Wyborn, LAI, Downs, RR, Asmi, A, Wu, M, Ryder, G and Martin, J. 2020a. Compilation of Data Versioning Use cases from the RDA Data Versioning Working Group. Research Data Alliance. [Last accessed 24 January 2020]. DOI: 10.15497/RDA00041
  16. Klump, J, Wyborn, LAI, Wu, M, Downs, RR, Asmi, A, Ryder, G and Martin, J. 2020b. Final Report of the Research Data Alliance Data Versioning Working Group – Principles and best practices in data versioning for all data sets big and small (Working Group Final Report). Kensington WA, Australia: Research Data Alliance. DOI: 10.15497/RDA00042
  17. König-Langlo, G and Gernandt, H. 2008. 426 ozonesonde profiles from Georg-Forster-Station (Data). Bremerhaven, Germany: Alfred Wegener Institute for Polar and Marine Research, Bremerhaven. [Last accessed 9 November 2010]. DOI: http://doi.pangaea.de/10.1594/PANGAEA.547983
  18. Lebo, T, Sahoo, S and McGuinness, D. 2013. PROV-O: The PROV Ontology (W3C Recommendation). Cambridge, MA: World Wide Web Consortium (W3C). Available at http://www.w3.org/TR/2013/REC-prov-o-20130430/
  19. Ledford, H and van Noorden, R. 2020. High-profile coronavirus retractions raise concerns about data oversight. Nature, 582(7811): 160. DOI: 10.1038/d41586-020-01695-w
  20. Lin, D, Crabtree, J, Dillo, I, Downs, RR, Edmunds, R, Giaretta, D, De Giusti, M, L’Hours, H, Hugo, W, Jenkyns, R, Khodiyar, V, Martone, ME, Mokrane, M, Navale, V, Petters, J, Sierman, B, Sokolova, DV, Stockhause, M and Westbrook, J. 2020. The TRUST Principles for digital repositories. Scientific Data, 7(1): 144. DOI: 10.1038/s41597-020-0486-7
  21. Mehra, MR, Ruschitzka, F and Patel, AN. 2020. Retraction—Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. The Lancet, 395(10240): 1820. DOI: 10.1016/S0140-6736(20)31324-6
  22. National Aeronautics and Space Administration. 2019. Data Processing Levels. EARTHDATA. Available at https://earthdata.nasa.gov/collaborate/open-data-services-and-software/data-information-policy/data-levels/ [Last accessed 13 July 2020].
  23. Paskin, N. 2003. On Making and Identifying a “Copy.” D-Lib Magazine, 9(1). DOI: 10.1045/january2003-paskin
  24. Peng, RD. 2011. Reproducible Research in Computational Science. Science, 334(6060): 1226–1227. DOI: 10.1126/science.1213847
  25. Preston-Werner, T. 2013. Semantic Versioning 2.0.0. Semantic Versioning. Available at https://semver.org/spec/v2.0.0.html [Last accessed 7 March 2019].
  26. Rauber, A, Asmi, A, van Uitvanck, D and Pröll, S. 2016. Data Citation of Evolving Data: Recommendations of the Working Group on Data Citation (WGDC) (Technical Report). Denver, CO: Research Data Alliance. [Last accessed 21 September 2017]. DOI: 10.15497/RDA00016
  27. Razum, M, Schwichtenberg, F, Wagner, S and Hoppe, M. 2009. eSciDoc Infrastructure: A Fedora-Based e-Research Framework. In: Research and Advanced Technology for Digital Libraries. Heidelberg, Germany: Springer Verlag. pp. 227–238. DOI: 10.1007/978-3-642-04346-8_23
  28. Software versioning. 2019. Wikipedia. Available at https://en.wikipedia.org/w/index.php?title=Software_versioning&oldid=886437916 [Last accessed 11 March 2019].
  29. Study Group on the Functional Requirements for Bibliographic Records. 1998. Functional Requirements for Bibliographic Records (No. 19). Munich, Germany: International Federation of Library Associations and Institutions. Available at http://www.ifla.org/publications/functional-requirements-for-bibliographic-records. DOI: 10.1515/9783110962451
  30. Taylor, K, Woodcock, R, Cuddy, S, Thew, P and Lemon, D. 2015. A Provenance Maturity Model. In: Denzer, R, Argent, RM, Schimak, G and Hřebíček, J (eds.), Environmental Software Systems. Infrastructures, Services and Applications. Cham, Switzerland: Springer International Publishing. pp. 1–18. [Last accessed 17 July 2015]. DOI: 10.1007/978-3-319-15994-2_1
  31. Wilkinson, MD, Dumontier, M, Packer, AL, Gray, AJG, Mons, A, Gonzalez-Beltran, A, Waagmeester, A, Baak, A, Brookes, AJ, Evelo, CT, Mons, B, Persson, B, Goble, C, Schultes, E, van Mulligen, E, Aalbersberg, IjJ, Appleton, G, Boiten, J-W, Dillo, I, Grethe, JS, Heringa, J, Strawn, G, Velterop, J, Bouwman, J, van der Lei, J, Kok, J, Zhao, J, Wolstencroft, K, da Silva Santos, LB, Roos, M, Thompson, M, Martone, ME, Crosas, M, Swertz, MA, Axton, M, Blomberg, N, Dumon, O, Groth, P, ’t Hoen, PAC, Wittenburg, P, Bourne, PE, Rocca-Serra, P, van Schaik, R, Finkers, R, Hooft, R, Kok, R, Edmunds, S, Lusher, SJ, Sansone, S-A, Slater, T, Sengstag, T, Clark, T and Kuhn, T. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3: 160018. DOI: 10.1038/sdata.2016.18
  32. Wing, JM. 2019. The Data Life Cycle. Harvard Data Science Review, 1(1): 6. DOI: 10.1162/99608f92.e26845b4
Language: English
Submitted on: Jul 17, 2020
Accepted on: Jan 15, 2021
Published on: Mar 23, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Jens Klump, Lesley Wyborn, Mingfang Wu, Julia Martin, Robert R. Downs, Ari Asmi, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.