Scaling Identifiers and their Metadata to Gigascale: An Architecture to Tackle the Challenges of Volume and Variety

Open Access
Mar 2023

References

  1. Albertoni, R, Browning, D, Cox, SJD, Gonzalez-Beltran, A, Perego, A and Winstanley, P. 2021. Data Catalog Vocabulary (DCAT) - Version 3 (W3C Proposed Recommendation). Cambridge, MA: World Wide Web Consortium (W3C). Available at https://www.w3.org/TR/vocab-dcat-3/.
  2. Barkstrom, BR. 2010. When is it sensible not to use XML? Earth Science Informatics, 4: 45–53. DOI: 10.1007/s12145-010-0063-2
  3. Berners-Lee, T. 2009. Linked Data. W3C Design Issues. Available at https://www.w3.org/DesignIssues/LinkedData.html [Last accessed 29 October 2021].
  4. Buys, M and Lehnert, KA. 2021. Partnership between IGSN and DataCite. DataCite Blog. [Last accessed 3 November 2021]. DOI: 10.5438/7z70-1155
  5. Cousijn, H, Braukmann, R, Fenner, M, Ferguson, C, van Horik, R, Lammey, R, Meadows, A and Lambert, S. 2021. Connected research: The potential of the PID graph. Patterns, 2(1): 1–7. DOI: 10.1016/j.patter.2020.100180
  6. Damerow, JE, Varadharajan, C, Boye, K, Brodie, EL, Burrus, M, Chadwick, KD, Crystal-Ornelas, R, Elbashandy, H, Eloy Albes, RJ, Ely, KS, Goldman, AE, Habermann, T, Hendrix, V, Kakalia, Z, Kemner, KM, Kersting, AB, Merino, N, O’Brien, F, Perznan, Z, Robles, E, Sorensen, P, Stegen, JC, Walls, RL, Weisenhorn, P, Zavarin, M and Agarwal, D. 2021. Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences. Data Science Journal, 20(1): 11. DOI: 10.5334/dsj-2021-011
  7. Davies, N, Deck, J, Kansa, EC, Kansa, SW, Kunze, J, Meyer, C, Orrell, T, Ramdeen, S, Snyder, R, Vieglais, D, Walls, RL and Lehnert, K. 2021. Internet of samples (iSamples): Toward an interdisciplinary cyberinfrastructure for material samples. GigaScience, 10(giab028). DOI: 10.1093/gigascience/giab028
  8. De Smedt, K, Koureas, D and Wittenburg, P. 2020. FAIR digital objects for science: From data pieces to actionable knowledge units. Publications, 8(2): 21. DOI: 10.3390/publications8020021
  9. Devaraju, A and Huber, R. 2021. An automated solution for measuring the progress toward FAIR research data. Patterns, 2(11): 100370. DOI: 10.1016/j.patter.2021.100370
  10. Dinov, ID, Rubin, D, Lorensen, W, Dugan, J, Ma, J, Murphy, S, Kirschner, B, Bug, W, Sherman, M, Floratos, A, Kennedy, D, Jagadish, HV, Schmidt, J, Athey, B, Califano, A, Musen, M, Altman, R, Kikinis, R, Kohane, I, Delp, S, Parker, DS and Toga, AW. 2008. iTools: A framework for classification, categorization and integration of computational biology resources. PLoS ONE, 3(5): e2265. DOI: 10.1371/journal.pone.0002265
  11. Eco, U. 1997. The search for the perfect language. 2nd ed. Oxford, United Kingdom: Blackwell.
  12. Fils, D. 2021. gleanerio/gleaner-compose. Available at https://github.com/gleanerio/gleaner-compose [Last accessed 25 June 2021].
  13. Fils, D, Klump, J and Robertson, J. 2020. Connecting data to the physical world: IGSN 2040 sprint outcomes and recommendations (Technical Report). DOI: 10.5281/zenodo.3905364
  14. Ganske, A, Heydebreck, D, Höck, H, Kraft, A, Quaas, J and Kaiser, A. 2020. A short guide to increase FAIRness of atmospheric model data. Meteorologische Zeitschrift, 29(6): 483–491. DOI: 10.1127/metz/2020/1042
  15. Genova, F, Arviset, C, Almas, BM, Bartolo, L, Broeder, D, Law, E and McMahon, B. 2017. Building a disciplinary, world-wide data infrastructure. Data Science Journal, 16(16). DOI: 10.5334/dsj-2017-016
  16. Guha, R. 2011. Official Google blog: Introducing schema.org: Search engines come together for a richer Web. Google Blog. Available at https://googleblog.blogspot.com/2011/06/introducing-schemaorg-search-engines.html [Last accessed 3 July 2020].
  17. Haller, A, Janowicz, K, Cox, SJD, Lefrançois, M, Phuoc, DL, Lieberman, J, García-Castro, R, Atkinson, RA and Stadler, C. 2019. The modular SSN ontology: A joint W3C and OGC standard specifying the semantics of sensors, observations, sampling, and actuation. Semantic Web, 10(1): 9–32. DOI: 10.3233/SW-180320
  18. Hardisty, A, Addink, W, Glöckler, F, Güntsch, A, Islam, S and Weiland, C. 2021. A choice of persistent identifier schemes for the Distributed System of Scientific Collections (DiSSCo). Research Ideas and Outcomes, 7: e67379. DOI: 10.3897/rio.7.e67379
  19. Jones, M, Richard, SM, Vieglais, D, Shepherd, A, Duerr, RE, Fils, D and McGibbney, LJ. 2021. Science-on-Schema.org v1.2.0. DOI: 10.5281/zenodo.4477164
  20. Klump, J and Huber, RX. 2017. 20 years of persistent identifiers – Which systems are here to stay? Data Science Journal, 16(9): 1–7. DOI: 10.5334/dsj-2017-009
  21. Klump, J, Lehnert, KA, Ulbricht, D, Devaraju, A, Elger, K, Fleischer, D, Ramdeen, S and Wyborn, LAI. 2021. Towards globally unique identification of physical samples: Governance and technical implementation of the IGSN global sample number. Data Science Journal, 20(33): 1–16. DOI: 10.5334/dsj-2021-033
  22. Klump, J, Lehnert, K, Wyborn, L and Ramdeen, S. 2020. IGSN 2040 Technical Steering Committee Meeting Report. Potsdam, Germany: IGSN e.V. DOI: 10.5281/zenodo.3724683
  23. Laney, D. 2001. 3D Data Management (No. 949). Stamford, CT: META Group. Available at https://web.archive.org/web/20120806062002/http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf.
  24. Lannom, L, Koureas, D and Hardisty, AR. 2019. FAIR data and services in biodiversity science and geoscience. Data Intelligence, 2(1–2): 122–130. DOI: 10.1162/dint_a_00034
  25. Lehnert, KA, Goldstein, SL, Lenhardt, WC and Vinayagamoorthy, S. 2004. SESAR: Addressing the need for unique sample identification in the Solid Earth Sciences. In: AGU Fall Meeting 2004. Presented at the AGU Fall Meeting 2004. San Francisco, CA: American Geophysical Union. pp. SF32A-06. Available at http://adsabs.harvard.edu/abs/2004AGUFMSF32A..06L [Last accessed 10 May 2016].
  26. Lehnert, K, Klump, J, Ramdeen, S, Wyborn, L and Haak, L. 2021. IGSN 2040 Summary Report: Defining the Future of the IGSN as a Global Persistent Identifier for Material Samples. Zenodo. DOI: 10.5281/zenodo.5118289
  27. Lidwell, W, Holden, K and Butler, J. 2010. Universal Principles of Design, Revised and Updated. 2nd ed. Beverly, MA: Rockport Publishers. Available at https://learning.oreilly.com/library/view/universal-principles-of/9781592535873/.
  28. Lingerfelt, E, Fils, D and Shepherd, A. 2018. Project 418: A Funded Project of the EarthCube Science Support Office. Presented at the AGU Fall Meeting 2018. Washington, DC: American Geophysical Union. pp. IN31B-22. Available at https://agu.confex.com/agu/fm18/meetingapp.cgi/Paper/442533 [Last accessed 18 January 2022].
  29. Michel, F and The Bioschemas Community. 2018. Bioschemas & Schema.org: a Lightweight Semantic Layer for Life Sciences Websites. Biodiversity Information Science and Standards, 2: e25836. DOI: 10.3897/biss.2.25836
  30. Neumann, J and Brase, J. 2014. DataCite and DOI names for research data. Journal of Computer-Aided Molecular Design, 28(10): 1035–1041. DOI: 10.1007/s10822-014-9776-5
  31. Noy, N and Brickley, D. 2017. Facilitating the discovery of public datasets. Google AI Blog. Available at http://ai.googleblog.com/2017/01/facilitating-discovery-of-public.html [Last accessed 3 March 2020].
  32. Parsons, MA, Duerr, R and Godøy, Ø. 2022. The evolution of a geoscience standard: An instructive tale of science keyword development and adoption. Geoscience Frontiers, in press: 101400. DOI: 10.1016/j.gsf.2022.101400
  33. Plomp, E. 2020. Going digital: Persistent identifiers for research samples, resources and instruments. Data Science Journal, 19(46): 8. DOI: 10.5334/dsj-2020-046
  34. Robertson, JC, Fils, D, Devaraju, A, Song, L, Ramdeen, S and Klump, J. 2020. IGSN/igsn-json: Test schema repo for IGSN 2040 Architecture sprint. Available at https://github.com/IGSN/igsn-json [Last accessed 10 November 2022].
  35. Ross, S, Ballsun-Stanton, B, Cassidy, S, Cook, P, Sobotkova, A and Klump, J. 2020. FAIMS 3.0: Electronic Field Notebooks. In: CAAA Digital Archaeology Conference 2020. Presented at the CAA Australasia 2020. Online: Computer Applications and Quantitative Methods in Archaeology. Available at https://au.caa-international.org/2020-conference-abstracts/.
  36. Schindler, U and Devaraju, A. 2020. MARUM DIS IGSN landing page mockup implementation. Available at https://github.com/pangaea-data-publisher/marum-dis-igsn [Last accessed 10 November 2022].
  37. Schwardmann, U, Fenner, M, Hellström, M, Koers, H, L’Hours, H, Matthews, B, Ritz, R, Valle, M, van de Sanden, M and Zamani, T. 2021. PID architecture for the EOSC: report from the EOSC Executive Board Working Group (WG) Architecture PID Task Force (TF) (No. KI-03-20-757-EN-N). Luxembourg, L: Directorate-General for Research and Innovation (European Commission). [Last accessed 19 October 2021]. DOI: 10.2777/525581
  38. Servilla, MS, Brunt, J, Costa, D, Gries, C, Grossman-Clarke, S, Hanson, PC, O’Brien, M, Smith, C, Vanderbilt, K and Waide, R. 2018. Facilitating data discovery on the internet using sitemaps.org and schema.org dataset metadata through the Environmental Data Initiative Data Portal. In: AGU Fall Meeting 2018. Presented at the AGU Fall Meeting 2018. Washington, DC: AGU. pp. IN31B-20. Available at https://agu.confex.com/agu/fm18/meetingapp.cgi/Paper/445657 [Last accessed 6 May 2022].
  39. sitemaps.org. 2006. What are Sitemaps? Available at https://www.sitemaps.org/ [Last accessed 12 July 2021].
  40. Suresh, J. 2014. Bird’s eye view on “big data management.” In: 2014 Conference on IT in Business, Industry and Government (CSIBIG). Presented at the 2014 Conference on IT in Business, Industry and Government (CSIBIG). Indore, India: IEEE. pp. 1–5. DOI: 10.1109/CSIBIG.2014.7056930
  41. Thessen, AE, Poelen, JH, Collins, M and Hammock, J. 2018. 20 GB in 10 minutes: A case for linking major biodiversity databases using an open socio-technical infrastructure and a pragmatic, cross-institutional collaboration. PeerJ Computer Science, 4: e164. DOI: 10.7717/peerj-cs.164
  42. Thessen, AE, Woodburn, M, Koureas, D, Paul, D, Conlon, M, Shorthouse, DP and Ramdeen, S. 2019. Proper attribution for curation and maintenance of research collections: Metadata recommendations of the RDA/TDWG Working Group. Data Science Journal, 18(1): 54. DOI: 10.5334/dsj-2019-054
  43. Van de Sompel, H, Nelson, ML, Lagoze, C and Warner, S. 2004. Resource harvesting within the OAI-PMH framework. D-Lib Magazine, 10(12): 1–8. DOI: 10.1045/december2004-vandesompel
  44. Wilkinson, MD, Dumontier, M, Packer, AL, Gray, AJG, Mons, A, Gonzalez-Beltran, A, Waagmeester, A, Baak, A, Brookes, AJ, Evelo, CT, Mons, B, Persson, B, Goble, C, Schultes, E, van Mulligen, E, Aalbersberg, IjJ, Appleton, G, Boiten, J-W, Dillo, I, Grethe, JS, Heringa, J, Strawn, G, Velterop, J, Bouwman, J, van der Lei, J, Kok, J, Zhao, J, Wolstencroft, K, da Santos, LB, Roos, M, Thompson, M, Martone, ME, Crosas, M, Swertz, MA, Axton, M, Blomberg, N, Dumon, O, Groth, P, ’t Hoen, PAC, Wittenburg, P, Bourne, PE, Rocca-Serra, P, van Schaik, R, Finkers, R, Hooft, R, Kok, R, Edmunds, S, Lusher, SJ, Sansone, S-A, Slater, T, Sengstag, T, Clark, T and Kuhn, T. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3: 160018. DOI: 10.1038/sdata.2016.18
Language: English
Submitted on: Sep 8, 2022
Accepted on: Dec 20, 2022
Published on: Mar 1, 2023
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Jens Klump, Doug Fils, Anusuriya Devaraju, Sarah Ramdeen, Jess Robertson, Lesley Wyborn, Kerstin Lehnert, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.