The Integration of the Japan Link Center’s Bibliographic Data into OpenCitations: The production of bibliographic and citation data structured according to the OpenCitations Data Model, originating from an Anglo-Japanese dataset

Open Access | Feb 2024

Figures & Tables

Figure 1

Workflow for the ingestion of citation data and bibliographic metadata into the OpenCitations datasets.

Figure 2

Flowchart describing the preliminary processing of citing bibliographic entities.

Figure 3

Flowchart describing the processing of cited bibliographic entities, their validation, and the production of metadata and citation tables.

Table 1

Sample of Meta input tables produced by oc_ds_converter, storing bibliographic entities’ metadata.

| ID | TITLE | AUTHOR | PUB_DATE | VENUE | VOLUME | ISSUE | PAGE | TYPE | PUBLISHER | EDITOR |
|---|---|---|---|---|---|---|---|---|---|---|
| DOI: 10.14825/kaseki.68.0_14 | 本邦産白亜紀アンモナイトデータベースおよび種多様性について | 利光, 誠一; 平野, 弘道; 松本, 崇; 高橋, 一晴 | 2000 | 化石 [issn:0022-9202 issn:2424-2632 jid:kaseki] | 68 | 0 | 14–16 | journal article | 日本古生物学会 | |
| DOI: 10.1126/science.235.4793.1156 | Chronology of fluctuating sea levels since the Triassic | | 1987 | Science | 235 | | 1156–1167 | | | |

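
The VENUE cell in the sample above combines a venue name with bracketed, space-separated identifiers in `scheme:value` form. As a minimal sketch of how such a cell could be split apart, the following Python snippet parses it into a name and a dictionary of identifiers. The `parse_venue` helper and its regular expression are hypothetical and written only for illustration; they are not part of oc_ds_converter, and the cell layout is assumed from the two sample rows alone.

```python
import re

# Hypothetical helper (not an oc_ds_converter function): splits a VENUE cell
# into the venue name and the identifiers listed inside square brackets.
VENUE_PATTERN = re.compile(r"^(?P<name>.*?)\s*(?:\[(?P<ids>[^\]]*)\])?$")


def parse_venue(cell: str) -> tuple[str, dict[str, list[str]]]:
    """Return (name, {scheme: [values]}) for a cell like 'Name [issn:... jid:...]'."""
    match = VENUE_PATTERN.match(cell.strip())
    name = match.group("name")
    ids: dict[str, list[str]] = {}
    # The bracketed part is optional; identifiers are separated by whitespace.
    for token in (match.group("ids") or "").split():
        scheme, _, value = token.partition(":")
        ids.setdefault(scheme, []).append(value)
    return name, ids


name, ids = parse_venue("化石 [issn:0022-9202 issn:2424-2632 jid:kaseki]")
print(name)  # 化石
print(ids)   # {'issn': ['0022-9202', '2424-2632'], 'jid': ['kaseki']}
```

A cell without brackets, such as `Science` in the second row, yields the name and an empty identifier dictionary.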
Table 2

Sample of Index input tables, produced by oc_ds_converter, storing citation data.

Figure 4

Language distribution in Meta bibliographic entities, calculated on version 5 of the Meta dump (https://doi.org/10.6084/m9.figshare.21747461.v5). The analysis was performed on bibliographic entities with a declared title.

Figure 5

Bar charts illustrating the analysis of multilingualism within the input dataset, categorized by bibliographic metadata fields.

Table 3

Metadata languages in the original dataset and the linguistic information lost because of OCDM constraints. The total number of values provided for a field is the number of values provided in one language only, plus twice the number of values provided in two languages, plus the number of values provided in more than two languages multiplied by the exact number of languages supplied. The information loss is the number of additional-language values that cannot be retained, given both as a count and as a percentage of the total values provided. The publisher’s name field is excluded from the table because its information loss is not necessarily linguistic: it may also derive from multi-publisher values.

| | 1 LANGUAGE | 2 LANGUAGES | 3+ LANGUAGES | TOTAL VALUES PROVIDED | INFORMATION LOSS WRT. THE ORIGINAL DATASET |
|---|---|---|---|---|---|
| title citing | 5,701,285 | 1,641,895 | 39 (3 languages) | 8,985,192 | 1,641,973; 18.27% |
| title cited | 217,316 | 12,616 | 0 | 242,548 | 12,616; 5.2% |
| authors citing | 9,892,522 | 4,556,812 | 39 (3 languages) | 19,006,263 | 4,556,890; 23.98% |
| authors cited | 308,079 | 157,556 | 0 | 623,191 | 157,556; 25.28% |
| journal title citing | 1,137,368 | 2,658,678 | 21,213 (20,572 3 languages; 641 4 languages) | 6,519,004 | 2,701,745; 41.44% |
| journal title cited | 180,515 | 0 | 0 | 180,515 | 0 |

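
The arithmetic described in the Table 3 caption can be checked with a short Python sketch. It assumes, consistently with the published figures, that a value supplied in n languages contributes n to the total and n − 1 to the information loss. The function name `totals_and_loss` and the dictionary layout are illustrative choices, not taken from the authors' code.

```python
def totals_and_loss(counts: dict[int, int]) -> tuple[int, int, float]:
    """counts maps a number of languages to the number of values provided in
    exactly that many languages. Returns (total, loss, loss percentage)."""
    # Each value in n languages contributes n language-specific values.
    total = sum(n * c for n, c in counts.items())
    # Assumption: only one language variant is retained, so n - 1 are lost.
    loss = sum((n - 1) * c for n, c in counts.items())
    share = 100 * loss / total if total else 0.0
    return total, loss, round(share, 2)


# "journal title citing" row: 20,572 values in 3 languages, 641 in 4 languages.
journal_title_citing = {1: 1_137_368, 2: 2_658_678, 3: 20_572, 4: 641}
print(totals_and_loss(journal_title_citing))  # (6519004, 2701745, 41.44)

# "title citing" row.
title_citing = {1: 5_701_285, 2: 1_641_895, 3: 39}
print(totals_and_loss(title_citing))  # (8985192, 1641973, 18.27)
```

Both results match the corresponding rows of the table.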
Figure 6

Language distribution in Meta bibliographic entities, calculated on version 6 of the Meta dump (https://doi.org/10.6084/m9.figshare.21747461.v6). The analysis was performed on bibliographic entities with a declared title.

Table 4

Diagonal cells (yellow in the original table) represent the individual contribution of each collection to the OpenCitations Index, i.e., the number of citations derived uniquely from a given source. Off-diagonal cells (pink in the original table) represent the number of citations in the intersection of two sources. The table is based on OpenCitations data as of its latest update (29 November 2023).

| | INDEX | CROSSREF | DATACITE | PUBMED | OPENAIRE | JALC |
|---|---|---|---|---|---|---|
| INDEX | 1,975,552,846 | 1,563,218,160 | 169,814,412 | 695,988,810 | 14,645,838 | 396,788 |
| Crossref | | 1,100,963,346 | 27,051 | 458,309,297 | 3,917,329 | 1,137 |
| DataCite | | | 169,663,255 | 9,623 | 114,483 | 0 |
| PubMed | | | | 237,208,867 | 9,711,789 | 125 |
| OpenAIRE | | | | | 1,067,712 | 0 |
| JaLC | | | | | | 395,526 |

DOI: https://doi.org/10.5334/johd.178 | Journal eISSN: 2059-481X
Language: English
Submitted on: Oct 28, 2023
Accepted on: Jan 30, 2024
Published on: Feb 29, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Arianna Moretti, Marta Soricetti, Ivan Heibi, Arcangelo Massari, Silvio Peroni, Elia Rizzetto, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.