Table 1
Mapping of metadata fields in this study to core required elements of a data citation (ESIP DPSC, 2019; FORCE11’s Data Citation Implementation Pilot Project Repository Expert Group, Fenner et al., 2019) and properties available in DataCite Metadata Schema Version 4.5 (DataCite Metadata Working Group, 2024).
| THIS STUDY | DATA CITATION GUIDELINES FOR EARTH SCIENCE DATA | FORCE11, DATA CITATION IMPLEMENTATION PILOT | DATACITE PROPERTIES |
|---|---|---|---|
| Resource Type | [N/A; ‘resource type’ is not a required concept in these guidelines] | Type | ResourceTypeGeneral |
| Creator | Author or Creator | Creator | Creator(s) |
| Title | Title | Title | Title(s) |
| Publication Year | Public Release Date | Publication Date | PublicationYear |
| Version | Version ID | Version | Version |
| Publisher | Repository | Data repository or Archive | Publisher |
| DOI | Resolvable Persistent Identifier | Dataset Identifier | DOI |
| Access Date | Access Date | N/A | N/A |
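The mapping in the rightmost column of Table 1 can be written as a small lookup structure. The following is a minimal sketch, not the procedure used in this study. It assumes a hypothetical flat record keyed by the singular DataCite property names (`Creator`, `Title`, and so on), which does not match the nested JSON that DataCite actually returns. The function name `missing_fields` is likewise illustrative.

```python
# Study field -> DataCite property, following the last column of Table 1.
# "Access Date" has no DataCite counterpart (N/A in Table 1), so it maps to None.
STUDY_TO_DATACITE = {
    "Resource Type": "ResourceTypeGeneral",
    "Creator": "Creator",
    "Title": "Title",
    "Publication Year": "PublicationYear",
    "Version": "Version",
    "Publisher": "Publisher",
    "DOI": "DOI",
    "Access Date": None,
}


def missing_fields(record):
    """Return the study fields whose DataCite property is absent or empty.

    `record` is a hypothetical flat dict keyed by the property names above;
    it is an assumption made for illustration only.
    """
    missing = []
    for field, prop in STUDY_TO_DATACITE.items():
        if prop is None:
            # No DataCite equivalent exists, so there is nothing to check.
            continue
        if not record.get(prop):
            missing.append(field)
    return missing
```

Applied to a record that lacks a resource type and a version, `missing_fields` would report those two study fields and skip Access Date, which has no DataCite property to check.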
Table 2
Reference manager software examined in this study.
| REFERENCE MANAGER | DESCRIPTION |
|---|---|
| EndNote | A proprietary reference manager released in 1989 and now produced by Clarivate for Windows and macOS. |
| Mendeley | A proprietary reference manager released in 2007 and acquired by Elsevier in 2013 for Windows, macOS, and Linux. |
| Paperpile | A web-based proprietary reference manager released in 2012 and produced by Paperpile, LLC. |
| Papers | A proprietary reference manager released in 2007 and produced by ReadCube for Windows and macOS. |
| RefWorks | A web-based proprietary reference manager released in 2001 and acquired by ProQuest in 2008. |
| Sciwheel | A web-based proprietary reference manager. Formerly called F1000 Workspace; acquired by SAGE Publishing in 2022. |
| Zotero | An open-source reference manager released in 2006 and managed by the non-profit Corporation for Digital Scholarship as of 2021. |
Table 3
Data repositories examined in this study and their scope.
| DATA REPOSITORY | DESCRIPTION |
|---|---|
| Climate Data Store | Disciplinary climate data repository for the Copernicus Climate Change Service (C3S). |
| DataverseNO | Generalist repository for data produced by researchers at Norwegian institutions. |
| Dryad | Generalist repository for research data from all disciplines. |
| Figshare | Generalist repository for research data from all disciplines. |
| EarthChem (Interdisciplinary Earth Data Alliance; IEDA) | Disciplinary data repository for geoscience research (analytical data, data syntheses, models, and technical reports). |
| Environmental Data Initiative Data Portal (EDI) | Disciplinary data repository for environmental and ecological data. |
| International Federation of Digital Seismograph Networks (FDSN) | Disciplinary organization and data repository exposing seismological data from member organizations for free and open use. |
| Mendeley Data | Generalist repository for research data from all disciplines. |
| NASA Goddard Earth Sciences Data and Information Services Center (NASA GES DISC) | Disciplinary repository serving NASA’s Atmospheric Composition, Water & Energy Cycles, and Climate Variability Focus Areas. |
| NSF National Center for Atmospheric Research Research Data Archive (NCAR) | Disciplinary data repository containing meteorological, atmospheric composition, and oceanographic observations. |
| Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC) | One of NASA’s Earth Observing System Data and Information System data centers, containing data on biogeochemical dynamics, ecology, and environmental processes. |
| PANGAEA | Data repository publishing georeferenced data from Earth systems research. |
| Planetary Data System (PDS) | Data repository hosting data from NASA’s planetary missions, astronomical observations, and laboratory measurements. |
| Zenodo | Generalist repository for research data from all disciplines. |

Figure 1
Counts of correct (blue), missing (orange), and incorrect (red) information in repository-provided citations available on dataset landing pages for the 14 repositories surveyed.
Table 4
Discrepancies in data publisher names provided in repository-provided citations versus DataCite metadata.
| DATASET DOI | PUBLISHER NAME (REPOSITORY-PROVIDED CITATION) | PUBLISHER NAME (DATACITE) |
|---|---|---|
| 10.24381/cds.ce973f02 | ‘Copernicus Climate Change Service (C3S) Climate Data Store (CDS)’ | ‘ECMWF’ |
| 10.5065/MM6J-9282 | ‘Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory’ | ‘UCAR/NCAR- Research Data Archive’ |
| 10.3334/ORNLDAAC/1868 | ‘ORNL DAAC’ | ‘ORNL Distributed Active Archive Center’ |
| 10.5067/OMPS/OMPS_N20_NMSO2_PCA_L2_Step1.1 | ‘NASA GES DISC’ | ‘NASA Goddard Earth Sciences Data and Information Services Center’ |
| 10.17632/4dyn8f8srx.2 | ‘Mendeley Data’ | ‘Mendeley’ |
Table 5
Discrepancies in author names in repository-provided citations versus DataCite metadata.
| DATASET DOI | AUTHOR NAME(S) (VERBATIM, REPOSITORY-PROVIDED CITATION) | AUTHOR (‘CREATOR’) NAME(S) (DATACITE, ONLY RELEVANT COMPONENTS SHOWN) | ISSUE |
|---|---|---|---|
| 10.24381/cds.ce973f02 | ‘Copernicus Climate Change Service, Climate Data Store’ | ‘creators’: ‘name’: ‘Copernicus Climate Change Service’, ‘affiliation’: [], ‘nameIdentifiers’: [] | Repository-provided citation and DataCite ‘creator’ fields not aligned. |
| 10.17632/4dyn8f8srx.2 | ‘Xiong, Wei; Mei, Xi; Huang, Long’ | ‘creators’: ‘name’: ‘Wei Xiong’ … ‘contributors’: ‘name’: ‘Xi Mei’, ‘contributorType’: ‘Other’ ‘name’: ‘Wei Xiong’, ‘contributorType’: ‘Other’ ‘name’: ‘Long Huang’, ‘contributorType’: ‘Other’ | DataCite metadata lists a sole author and three contributors, one of whom is also the author. No ‘givenName’ and ‘familyName’ sub-properties in DataCite metadata files to disambiguate first and last names. |
| 10.5067/OMPS/OMPS_N20_NMSO2_PCA_L2_Step1.1 | ‘Can Li, Nickolay A. Krotkov, Peter Leonard, et al.’ | ‘creators’: ‘name’: ‘Can Li, Nickolay A. Krotkov’, ‘nameType’: ‘Personal’, ‘givenName’: ‘Nickolay A. Krotkov’, ‘familyName’: ‘Can Li’ ‘nameIdentifiers’: [] … ‘contributors’: [] | DataCite metadata lists two authors’ names under a single creator property. Author order is misaligned between repository-provided citation and DataCite metadata. Repository-provided citation has more authors than DataCite metadata but does not name them (‘et al.’). |
| 10.17189/1522849 | ‘Rodriguez-Manfredi, Jose A; de la Torre Juarez, Manuel’ + 11 editor names | ‘creators’: [ ‘name’: ‘Manuel de la Torre Juarez’, ‘nameType’: ‘Personal’, ‘nameIdentifiers’: [] ‘name’: ‘Jose A Rodriguez-Manfredi’, ‘nameType’: ‘Personal’, ‘nameIdentifiers’: [] ‘contributors’: … ‘contributorType’: ‘Editor’ … (eleven editors from PDS site are listed, see Vrouwenvelder and Raia, 2025a) | First and second authors are switched. The PDS ‘citation’ table leaves it ambiguous whether editors should be included in the data citation. |
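Two of the issues in Table 5 (creators without ‘givenName’ and ‘familyName’ sub-properties, and a person listed as both creator and contributor) lend themselves to an automated check. The sketch below is illustrative only and is not the method used in this study. It assumes the metadata has already been parsed into a dict with `creators` and `contributors` lists, and the function name `creator_issues` is hypothetical. The example record reproduces only the components shown in Table 5 for DOI 10.17632/4dyn8f8srx.2.

```python
def creator_issues(attributes):
    """Flag two of the creator-related problems described in Table 5.

    `attributes` is assumed to be a dict with optional 'creators' and
    'contributors' lists of dicts, each carrying a 'name' key.
    """
    creators = attributes.get("creators", [])
    contributors = attributes.get("contributors", [])
    issues = []

    # Check 1: creators lacking the sub-properties that separate first and last names.
    for creator in creators:
        if not (creator.get("givenName") and creator.get("familyName")):
            issues.append(f"creator '{creator.get('name')}' lacks givenName/familyName")

    # Check 2: the same name recorded as both creator and contributor.
    creator_names = {creator.get("name") for creator in creators}
    for contributor in contributors:
        if contributor.get("name") in creator_names:
            issues.append(
                f"'{contributor.get('name')}' is listed as both creator and contributor"
            )
    return issues


# Components of the DataCite record for 10.17632/4dyn8f8srx.2 as shown in Table 5.
record = {
    "creators": [{"name": "Wei Xiong"}],
    "contributors": [
        {"name": "Xi Mei", "contributorType": "Other"},
        {"name": "Wei Xiong", "contributorType": "Other"},
        {"name": "Long Huang", "contributorType": "Other"},
    ],
}

for issue in creator_issues(record):
    print(issue)
```

The check compares names as exact strings, so it would miss the same person recorded under differently formatted names (for example ‘Xiong, Wei’ versus ‘Wei Xiong’).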

Figure 2
Percentage of surveyed repositories that provide downloadable .bib files for datasets and the types of templates employed.

Figure 3
Counts of correct (blue), missing (orange), and incorrect (red) information present in .bib files available for download on dataset landing pages for the seven repositories providing this download option to users.

Figure 4
(A) Percentages of correct (blue), missing (orange), and incorrect (red) data citation metadata across all datasets, categorized for each reference manager and metadata import method. (B) Percentage of successfully imported metadata fields (correct or incorrect) that are dropped during export. (C) Mean correct metadata fields with standard error.

Figure 5
Percentage of correct (blue), missing (orange), and incorrect (red) metadata fields for each type of metadata field across all repositories, reference managers, and import methods.

Figure 6
Percentage of correct (blue), missing (orange), and incorrect (red) metadata fields for each type of metadata field across all repositories, reference managers, and export methods.

Figure 7
Actionable recommendations for improving data citation by stakeholder.
