
Obstacles to Dataset Citation Using Bibliographic Management Software

Open Access | May 2025

Figures & Tables

Table 1

Mapping of metadata fields in this study to core required elements of a data citation (ESIP DPSC, 2019; FORCE11’s Data Citation Implementation Pilot Project Repository Expert Group, Fenner et al., 2019) and properties available in DataCite Metadata Schema Version 4.5 (DataCite Metadata Working Group, 2024).

| This study | Data Citation Guidelines for Earth Science Data | FORCE11, Data Citation Implementation Pilot | DataCite properties |
| --- | --- | --- | --- |
| Resource Type | [N/A; ‘resource type’ is not a required concept in these guidelines] | Type | ResourceTypeGeneral |
| Creator | Author or Creator | Creator | Creator(s) |
| Title | Title | Title | Title(s) |
| Publication Year | Public Release Date | Publication Date | PublicationYear |
| Version | Version ID | Version | Version |
| Publisher | Repository | Data repository or Archive | Publisher |
| DOI | Resolvable Persistent Identifier | Dataset Identifier | DOI |
| Access Date | Access Date | N/A | N/A |

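
The mapping in Table 1 can be written as a small lookup table. The sketch below is illustrative only: the flat `record` dictionary, the function name `missing_fields`, and the placeholder title are assumptions made for this example and are not part of the study or of any DataCite API. Property names follow Table 1, with the plural markers dropped.

```python
# Study field -> DataCite property name, following Table 1.
# Access Date has no DataCite counterpart (N/A in the table), so it maps to None.
STUDY_TO_DATACITE = {
    "Resource Type": "ResourceTypeGeneral",
    "Creator": "Creator",
    "Title": "Title",
    "Publication Year": "PublicationYear",
    "Version": "Version",
    "Publisher": "Publisher",
    "DOI": "DOI",
    "Access Date": None,
}


def missing_fields(record):
    """Return the study fields whose DataCite property is absent or empty.

    `record` is a hypothetical flat dict keyed by DataCite property name.
    """
    missing = []
    for study_field, prop in STUDY_TO_DATACITE.items():
        if prop is None:
            continue  # no DataCite property to check against
        if not record.get(prop):
            missing.append(study_field)
    return missing


# Hypothetical, incomplete record. The DOI, creator, and publisher strings are
# taken from Tables 4 and 5; the title is a placeholder.
example = {
    "ResourceTypeGeneral": "Dataset",
    "Creator": ["Copernicus Climate Change Service"],
    "Title": "(placeholder title)",
    "Publisher": "ECMWF",
    "DOI": "10.24381/cds.ce973f02",
}

print(missing_fields(example))
```

Access Date is skipped rather than reported as missing because, per Table 1, it is recorded by the person citing the dataset and has no corresponding DataCite property.
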
Table 2

Reference manager software examined in this study.

| Reference manager | Description |
| --- | --- |
| EndNote | A proprietary reference manager released in 1989 and now produced by Clarivate for Windows and macOS. |
| Mendeley | A proprietary reference manager released in 2007 and acquired by Elsevier in 2013 for Windows, macOS, and Linux. |
| Paperpile | A web-based proprietary reference manager released in 2012 and produced by Paperpile, LLC. |
| Papers | A proprietary reference manager released in 2007 and produced by ReadCube for Windows and macOS. |
| RefWorks | A web-based proprietary reference manager released in 2001 and acquired by ProQuest in 2008. |
| Sciwheel | A web-based proprietary reference manager. Formerly called F1000 Workspace; acquired by SAGE Publishing in 2022. |
| Zotero | An open-source reference manager released in 2006 and managed by the non-profit Corporation for Digital Scholarship as of 2021. |

Table 3

Data repositories examined in this study and their scope.

| Data repository | Description |
| --- | --- |
| Climate Data Store | Disciplinary climate data repository for the Copernicus Climate Change Service (C3S). |
| DataverseNO | Generalist repository for data produced by researchers at Norwegian institutions. |
| Dryad | Generalist repository for research data from all disciplines. |
| Figshare | Generalist repository for research data from all disciplines. |
| EarthChem (Interdisciplinary Earth Data Alliance; IEDA) | Disciplinary data repository for geoscience research (analytical data, data syntheses, models, and technical reports). |
| Environmental Data Initiative Data Portal (EDI) | Disciplinary data repository for environmental and ecological data. |
| International Federation of Digital Seismograph Networks (FDSN) | Disciplinary organization and data repository exposing seismological data from member organizations for free and open use. |
| Mendeley Data | Generalist repository for research data from all disciplines. |
| NASA Goddard Earth Sciences Data and Information Services Center (NASA GES DISC) | Disciplinary repository serving NASA’s Atmospheric Composition, Water & Energy Cycles, and Climate Variability Focus Areas. |
| NSF National Center for Atmospheric Research Research Data Archive (NCAR) | Disciplinary data repository containing meteorological, atmospheric composition, and oceanographic observations. |
| Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC) | One of NASA’s Earth Observing System Data and Information System data centers, containing data on biogeochemical dynamics, ecology, and environmental processes. |
| PANGAEA | Data repository publishing georeferenced data from Earth systems research. |
| Planetary Data System (PDS) | Data repository hosting data from NASA’s planetary missions, astronomical observations, and laboratory measurements. |
| Zenodo | Generalist repository for research data from all disciplines. |

Figure 1

Counts of correct (blue), missing (orange), and incorrect (red) information in repository-provided citations available on dataset landing pages for the 14 repositories surveyed.

Table 4

Discrepancies in data publisher names provided in repository-provided citations versus DataCite metadata.

| Dataset DOI | Publisher name (repository-provided citation) | Publisher name (DataCite) |
| --- | --- | --- |
| 10.24381/cds.ce973f02 | ‘Copernicus Climate Change Service (C3S) Climate Data Store (CDS)’ | ‘ECMWF’ |
| 10.5065/MM6J-9282 | ‘Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory’ | ‘UCAR/NCAR- Research Data Archive’ |
| 10.3334/ORNLDAAC/1868 | ‘ORNL DAAC’ | ‘ORNL Distributed Active Archive Center’ |
| 10.5067/OMPS/OMPS_N20_NMSO2_PCA_L2_Step1.1 | ‘NASA GES DISC’ | ‘NASA Goddard Earth Sciences Data and Information Services Center’ |
| 10.17632/4dyn8f8srx.2 | ‘Mendeley Data’ | ‘Mendeley’ |

Table 5

Discrepancies in author names in repository-provided citations versus DataCite metadata.

Each entry below gives, in order: the dataset DOI; the author name(s), verbatim, from the repository-provided citation; the author (‘creator’) name(s) from DataCite, with only relevant components shown; and the issue(s) identified.

Dataset DOI: 10.24381/cds.ce973f02
Author name(s) (repository-provided citation): ‘Copernicus Climate Change Service, Climate Data Store’
Author (‘creator’) name(s) (DataCite):
    ‘creators’:
        ‘name’: ‘Copernicus Climate Change Service’,
        ‘affiliation’: [],
        ‘nameIdentifiers’: []
Issue:
    Repository-provided citation and DataCite ‘creator’ fields not aligned.

Dataset DOI: 10.17632/4dyn8f8srx.2
Author name(s) (repository-provided citation): ‘Xiong, Wei; Mei, Xi; Huang, Long’
Author (‘creator’) name(s) (DataCite):
    ‘creators’:
        ‘name’: ‘Wei Xiong’
    ‘contributors’:
        ‘name’: ‘Xi Mei’,
        ‘contributorType’: ‘Other’
        ‘name’: ‘Wei Xiong’,
        ‘contributorType’: ‘Other’
        ‘name’: ‘Long Huang’,
        ‘contributorType’: ‘Other’
Issues:
    DataCite metadata lists a sole author and three contributors, one of whom is also the author.
    No ‘givenName’ and ‘familyName’ sub-properties in DataCite metadata files to disambiguate first and last names.

Dataset DOI: 10.5067/OMPS/OMPS_N20_NMSO2_PCA_L2_Step1.1
Author name(s) (repository-provided citation): ‘Can Li, Nickolay A. Krotkov, Peter Leonard, et al.’
Author (‘creator’) name(s) (DataCite):
    ‘creators’:
        ‘name’: ‘Can Li, Nickolay A. Krotkov’,
        ‘nameType’: ‘Personal’,
        ‘givenName’: ‘Nickolay A. Krotkov’,
        ‘familyName’: ‘Can Li’
        ‘nameIdentifiers’: []
    ‘contributors’: []
Issues:
    DataCite metadata lists two authors’ names under a single creator property.
    Author order misaligned between repository-provided citation and DataCite metadata.
    Repository-provided citation has more authors than DataCite metadata, but no identifying names (‘… et al.’).

Dataset DOI: 10.17189/1522849
Author name(s) (repository-provided citation): ‘Rodriguez-Manfredi, Jose A; de la Torre Juarez, Manuel’ + 11 editor names
Author (‘creator’) name(s) (DataCite):
    ‘creators’: [
        ‘name’: ‘Manuel de la Torre Juarez’,
        ‘nameType’: ‘Personal’,
        ‘nameIdentifiers’: []
        ‘name’: ‘Jose A Rodriguez-Manfredi’,
        ‘nameType’: ‘Personal’,
        ‘nameIdentifiers’: []
    ‘contributors’:
        … ‘contributorType’: ‘Editor’
        … (eleven editors from PDS site are listed, see Vrouwenvelder and Raia, 2025a)
Issues:
    First and second author are switched.
    PDS’ ‘citation’ table leaves ambiguity as to whether editors should be included in data citation.

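
Two of the issues listed in Table 5 (absent ‘givenName’/‘familyName’ sub-properties, and a creator who is also listed as a contributor) can be checked mechanically. The following is a minimal sketch, assuming the metadata has already been loaded into a Python dict with ‘creators’ and ‘contributors’ lists as excerpted in the table; the function name `creator_issues` and the wording of the messages are hypothetical, and this is not the procedure used in the study.

```python
def creator_issues(metadata):
    """Flag two creator problems of the kind listed in Table 5.

    `metadata` is assumed to be a dict with optional 'creators' and
    'contributors' lists, each item a dict with at least a 'name' key.
    """
    issues = []
    contributor_names = {c.get("name") for c in metadata.get("contributors", [])}
    for creator in metadata.get("creators", []):
        name = creator.get("name", "")
        # Organizational creators are not expected to have given/family names.
        if creator.get("nameType") != "Organizational":
            if not (creator.get("givenName") and creator.get("familyName")):
                issues.append(f"{name}: no givenName/familyName")
        if name in contributor_names:
            issues.append(f"{name}: also listed as contributor")
    return issues


# Record excerpted from Table 5 (DOI 10.17632/4dyn8f8srx.2).
mendeley_record = {
    "creators": [{"name": "Wei Xiong"}],
    "contributors": [
        {"name": "Xi Mei", "contributorType": "Other"},
        {"name": "Wei Xiong", "contributorType": "Other"},
        {"name": "Long Huang", "contributorType": "Other"},
    ],
}

for issue in creator_issues(mendeley_record):
    print(issue)
```

A check like this does not catch switched author order or two names packed into one ‘name’ value; those need a comparison against the repository-provided citation.
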
Figure 2

Percentage of surveyed repositories that provide downloadable .bib files for datasets and the types of templates employed.

Figure 3

Counts of correct (blue), missing (orange), and incorrect (red) information present in .bib files available for download on dataset landing pages for the seven repositories providing this download option to users.

Figure 4

(A) Percentages of correct (blue), missing (orange), and incorrect (red) data citation metadata across all datasets, categorized for each reference manager and metadata import method. (B) Percentage of successfully imported metadata fields (correct or incorrect) that are dropped during export. (C) Mean correct metadata fields with standard error.
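
For readers unfamiliar with the statistic in panel (C), a mean with standard error can be computed as below. This is a sketch under stated assumptions: the caption does not say how the standard error was calculated, so the usual standard error of the mean (sample standard deviation divided by the square root of the sample size) is assumed, and the counts are hypothetical values, not data from the study.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a sequence of numbers."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, standard_error


# Hypothetical counts of correct metadata fields for four datasets.
correct_counts = [8, 6, 7, 5]
mean, se = mean_and_standard_error(correct_counts)
print(f"{mean:.2f} +/- {se:.2f}")  # prints "6.50 +/- 0.65"
```
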

Figure 5

Percent of correct (blue), missing (orange), and incorrect (red) metadata fields for each type of metadata field across all repositories, reference managers, and import methods.

Figure 6

Percent of correct (blue), missing (orange), and incorrect (red) metadata fields for each type of metadata field across all repositories, reference managers, and export methods.

Figure 7

Actionable recommendations for improving data citation by stakeholder.

Language: English
Submitted on: Dec 14, 2024
Accepted on: Apr 24, 2025
Published on: May 20, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Kristina Vrouwenvelder, Natalie H. Raia, Andrea K. Thomer, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.