Making the Complete OpenAIRE Citation Graph Easily Accessible Through Compact Data Representation

By: Joakim Skarding and Pavel Sanda
Open Access | Apr 2026

Abstract

The OpenAIRE graph contains a large citation graph dataset, with over 200 million publications and over two billion citations. The current graph is distributed as a dump with metadata which, when uncompressed, totals approximately 2.5 TB, making it hard to process on conventional computers. To make this network more accessible to the community, we provide a processed OpenAIRE graph that is reduced in size to fit in 16 GB of RAM while preserving the full graph structure. In addition, we offer the processed data in a very simple format that allows straightforward further manipulation. We also provide (1) a Python pipeline that can be used to process future releases of the OpenAIRE graph, and (2) a larger version of the dataset that includes more publication fields, such as the title and the list of authors.
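
To give a feel for why a citation graph of this size can fit in memory, the following is a minimal sketch of one common compact layout, compressed sparse row (CSR) adjacency with integer node IDs. The abstract does not state which representation the authors use, so the layout, the function names (`build_csr`, `references_of`, `estimate_bytes`), and the 4-byte ID and 8-byte offset widths are all assumptions made for illustration only.

```python
from array import array


def build_csr(num_nodes, edges):
    """Build a CSR adjacency from (citing, cited) integer ID pairs.

    Hypothetical helper: not taken from the authors' pipeline.
    """
    # offsets[i]..offsets[i + 1] will delimit the references of node i.
    offsets = array("Q", [0]) * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1  # count outgoing citations per node
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]  # prefix sum: counts -> end positions

    # "I" is an unsigned int, typically 4 bytes (check array.itemsize).
    targets = array("I", [0]) * len(edges)
    cursor = offsets[:num_nodes]  # slicing copies: next free slot per node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def references_of(offsets, targets, node):
    """Return the IDs cited by `node`."""
    return list(targets[offsets[node]:offsets[node + 1]])


def estimate_bytes(num_nodes, num_edges, id_bytes=4, offset_bytes=8):
    """Rough memory footprint of the two CSR arrays (assumed widths)."""
    return (num_nodes + 1) * offset_bytes + num_edges * id_bytes


# Tiny example: paper 0 cites papers 1 and 2, paper 2 cites paper 1.
offsets, targets = build_csr(3, [(0, 1), (0, 2), (2, 1)])
print(references_of(offsets, targets, 0))

# Back-of-the-envelope figure using the round numbers from the abstract.
print(estimate_bytes(200_000_000, 2_000_000_000) / 1e9, "GB")
```

Under these assumptions, the round figures of 200 million nodes and two billion edges come to roughly 9.6 GB for the structure alone, which is consistent in order of magnitude with a 16 GB target. It should not be read as the authors' own accounting, since the actual counts are higher and the real format may differ.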

DOI: https://doi.org/10.5334/johd.520 | Journal eISSN: 2059-481X
Language: English
Page range: 63 - 63
Submitted on: Feb 13, 2026
Accepted on: Apr 10, 2026
Published on: Apr 30, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Joakim Skarding, Pavel Sanda, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.