Guidelines for Publicly Archiving Terrestrial Model Data to Enhance Usability, Intercomparison, and Synthesis

Figures & Tables

Table 1

Summary of data centers, their storage limits per data publication, and whether they provide data contributors with best-practice guidelines for curating data packages, either specific to model data or general.

DATA CENTER | STORAGE LIMIT PER DATA PUBLICATION | PROVIDES DATA CONTRIBUTOR GUIDELINES: MODEL-DATA SPECIFIC? | PROVIDES DATA CONTRIBUTOR GUIDELINES: OTHER?
National Science Foundation Arctic Data Center | No limit | Yes | Yes
Oak Ridge National Laboratory DAAC | NA¹ | Yes | Yes
NASA’s Earth Observing System Data and Information System (EOSDIS) | NA¹ | NA¹ | Yes
U.S. DOE ESS-DIVE | 10 GB/500 GB² | No | Yes
Dryad | 300 GB² | No | Yes
Zenodo | 50 GB | No | No
Earth System Grid Federation (ESGF) | NA¹ | NA¹ | NA¹

¹ NA: Not available, i.e., no public information found.

² Limit on size of individual files. For ESS-DIVE, 10 GB is the default file size limit, which can be increased up to 500 GB by request. Files >500 GB are considered upon review.
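
The limits in Table 1 can be turned into a quick pre-submission check. The following is a minimal sketch, assuming a data package is described simply as a list of file sizes in GB; the names `Limit`, `LIMITS`, and `fits` are hypothetical and not part of any repository's API. Only repositories with a published limit in Table 1 are included, and the ESS-DIVE entry uses the default per-file limit rather than the 500 GB available by request.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Limit:
    """Storage limits in GB; None means no limit of that kind is listed."""
    per_file_gb: Optional[float] = None         # limit on individual files (footnote 2)
    per_publication_gb: Optional[float] = None  # limit on the whole data publication


# Values transcribed from Table 1 (illustrative; check each repository's current policy).
LIMITS = {
    "Arctic Data Center": Limit(),                # "No limit"
    "ESS-DIVE": Limit(per_file_gb=10.0),          # default; up to 500 GB by request
    "Dryad": Limit(per_file_gb=300.0),
    "Zenodo": Limit(per_publication_gb=50.0),
}


def fits(file_sizes_gb: Sequence[float], limit: Limit) -> bool:
    """Return True if every file and the package total respect the given limit."""
    if limit.per_file_gb is not None:
        if any(size > limit.per_file_gb for size in file_sizes_gb):
            return False
    if limit.per_publication_gb is not None:
        if sum(file_sizes_gb) > limit.per_publication_gb:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical package: 2500 files of 0.2 GB each.
    package = [0.2] * 2500
    for name, limit in LIMITS.items():
        print(name, "OK" if fits(package, limit) else "exceeds limit")
```

A check like this only flags size constraints; it says nothing about the other curation requirements a repository may impose.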

Table 2

Summary of the standalone terrestrial models used by 12 researchers participating in this study. Coupled models (e.g., ELM-FATES and ELM-PFLOTRAN) are not listed but were also considered in evaluating archiving needs.

MODEL ACRONYM | MODEL NAME (ORGANIZATION) | REFERENCES | DESCRIPTION
ELM | Energy Exascale Earth System Model (E3SM) Land Model (DOE) | Golaz et al. (2019); https://e3sm.org/ | Land model component of the E3SM Earth System Model
FATES | Functionally Assembled Terrestrial Ecosystem Simulator (DOE) | Koven et al. (2020); https://github.com/NGEET/fates-release | Size and age-structured vegetation demographic model within a land surface model and can be coupled with an Earth system model
PFLOTRAN | Parallel Flow and Transport (DOE) | Hammond, Lichtner and Mills (2014); https://www.pflotran.org | Parallel reactive flow and transport model for subsurface hydrobiogeochemical processes
ATS | Advanced Terrestrial Simulator (DOE) | Coon et al. (2020); https://amanzi.github.io/ats/ | An integrated, distributed watershed hydrology model including surface and subsurface flow, energy transport, reactive transport, and ecohydrology.
CrunchFlow | N/A (DOE) | Steefel and Molins (2009) | Model for simulating multicomponent multi-dimensional reactive transport in porous media
MAAT | Multi-Assumption Architecture & Testbed (DOE) | Walker, Ye, et al. (2018); https://github.com/walkeranthonyp/MAAT | Modular terrestrial ecosystem process modeling framework for building multiple models that vary in process representation/hypotheses.
CLM | Community Land Model (NCAR) | Lawrence et al. (2019); https://www.cesm.ucar.edu/models/clm/ | Land model for the Community Earth System Model (CESM), a fully-coupled global climate model
ED2 | Ecosystem Demography Biosphere Model (NSF/NASA) | Longo et al. (2019); https://github.com/EDmodel/ED2 | Size- and age-structured terrestrial biosphere model
PRMS | Precipitation Runoff Modeling System (USGS) | Markstrom et al. (2015); https://www.usgs.gov/software/precipitation-runoff-modeling-system-prms | Deterministic process-based model developed to evaluate the impacts of climate and land use on streamflow and watershed hydrology.
SWAT | Soil and Water Assessment Tool (USDA/Texas A&M University) | Bieger et al. (2017); https://swat.tamu.edu/ | Watershed to river basin-scale model used to simulate the quality and quantity of surface and ground water and predict the environmental impact of land use, land management practices, and climate change.
LPJ-GUESS | Lund-Potsdam-Jena General Ecosystem Simulator (Lund University) | Smith, Prentice and Sykes (2001); https://web.nateko.lu.se/lpj-guess/ | Dynamic vegetation-terrestrial ecosystem model for regional or global studies
GDAY | Generic Decomposition and Yield | Comins and McMurtrie (1993); https://github.com/mdekauwe/GDAY | Stand-scale ecosystem model that simulates carbon, nitrogen, and water dynamics.
SDGVM | Sheffield Dynamic Global Vegetation Model (Sheffield University) | Woodward and Lomas (2004); https://bitbucket.org/walkeranthonyp/sdgvm/ | Terrestrial biosphere carbon cycle model for ecosystem to global scale simulations. Simple size and age structure.
OpenFOAM | N/A (OpenFOAM foundation) | https://openfoam.org/ | Computational fluid dynamics open source software
CALAND | California Natural and Working Lands Carbon and Greenhouse Gas Model (California Natural Resources Agency) | Di Vittorio and Simmonds (2019); https://doi.org/10.5281/zenodo.3256727 | Carbon stock and flux model that simulates the effects of various management practices, land use and land cover change, wildfire, and climate change on ecosystem carbon dynamics across all California lands

Table 3

Estimates of archiving needs for typical spatial and temporal representations of simulation data from DOE terrestrial models, which are the models most commonly used by the researchers in this study. Note that the same models are often run at different spatial extents (e.g., site to global) and temporal durations (e.g., weeks to centuries).

DETAILS FOR TYPICAL SIMULATION¹ TO BE ARCHIVED
MODEL | SPATIAL RESOLUTION OR REPRESENTATION | SPATIAL EXTENT | TEMPORAL RESOLUTION² | TEMPORAL DURATION | NO. OF FILES | MEAN FILE SIZE (GB) | TYPES OF FILE FORMATS | TOTAL ANNUAL STORAGE NEEDS (GB)
Multiple LSMs³ | Point⁴ | point | daily | 200 yrs | 300 | 0.1 | CSV | 50
ELM | point | point | hourly, daily | 10 – 20 yrs | 20 | 0.004 | netCDF | 3
ELM | 1/2° – 2° | global | monthly | 250 yrs | 2500 | 0.2 | netCDF | 15000
ELM-FATES | point, ~1 km, ~1 degree | point, regional, and global modes | sub-daily, monthly | ~500 yrs | 1K – 10K | 50 | netCDF | 1000
FATES | point | point | <hourly | 10 yrs | 70 | 3 | netCDF | 2000
ELM-PFLOTRAN | 1 – 100 m | 100 m – 10 km | hourly/daily | 10+ yrs | 10 – 100 | 10 | HDF5, netCDF | 1000
PFLOTRAN | <1 m | 5-6 km | <hourly | 30 yrs | 5 | 1000 | HDF5 | 10000
ATS | 100 m – 250 m | 10 km | daily | 10 – 100 yrs | 20 | 100 | XML + HDF5, CSV | 1000
ATS | <1 – 100 m | 10 m – 10 km | daily | 10 – 100 yrs | 2 | | XML + HDF5 | 1000
ATS | 0.25 m | 25 m | daily | 100 yrs | 50 – 200 | | XML + HDF5 | 10
CrunchFlow | <1 m | <1 km | <hourly | 30 days | 100 | 0.001 | TXT | 1

¹ Note that “ensembles” of simulations were not considered in this survey, except in the total annual storage needs reported.

² This could represent either the simulation temporal resolution or the output file temporal resolution.

³ Here we use Land Surface Models (“LSMs”) to include both standard CMIP-style Earth System Models (e.g., ELM) and more complex vegetation phenology models (e.g., FATES).

⁴ Note that “point” is used to indicate a single vertical column of cells or otherwise a single location in horizontal space.
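
As a rough worked example of how the per-simulation columns in Table 3 relate, the size of one typical simulation can be approximated as the number of files multiplied by the mean file size. The sketch below is illustrative only: the function name `simulation_size_gb` is hypothetical, and the input values are read from Table 3 rows that list single numbers rather than ranges. Because ensembles are counted only in the total annual storage needs (footnote 1), this product is not expected to match the last column of the table.

```python
import math


def simulation_size_gb(n_files: int, mean_file_gb: float) -> float:
    """Approximate size of one simulation as file count times mean file size (GB)."""
    return n_files * mean_file_gb


# (model, no. of files, mean file size in GB), as read from Table 3.
TYPICAL_SIMULATIONS = [
    ("ELM (point)", 20, 0.004),
    ("ELM (global)", 2500, 0.2),
    ("PFLOTRAN", 5, 1000.0),
    ("CrunchFlow", 100, 0.001),
]

if __name__ == "__main__":
    for model, n_files, mean_gb in TYPICAL_SIMULATIONS:
        size = simulation_size_gb(n_files, mean_gb)
        print(f"{model}: about {size:g} GB per simulation")

    # Sanity check on one row: 2500 files of 0.2 GB is about 500 GB.
    assert math.isclose(simulation_size_gb(2500, 0.2), 500.0)
```

Comparing these per-simulation estimates with the per-file and per-publication limits in Table 1 gives a first indication of which repositories could hold a given model's output without special arrangements.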

Figure 1

Perspectives from a group of 12 U.S. Department of Energy terrestrial model researchers on (a) the importance of archiving different components of model data in a public repository, (b) the period of time over which publicly archived model data remain useful, and (c) the purposes served by archiving model data in a public repository. Importance rankings for (a) and (c) are shown on a scale from 1 (not important at all) to 5 (extremely important) and represent average scores across the 12 researchers.

Figure 2

Decision tree for determining the recommended approach for grouping model-related files for public archiving.

Language: English
Submitted on: Jun 22, 2021
Accepted on: Nov 23, 2021
Published on: Feb 7, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2022 Maegen B. Simmonds, William J. Riley, Deborah A. Agarwal, Xingyuan Chen, Shreyas Cholia, Robert Crystal-Ornelas, Ethan T. Coon, Dipankar Dwivedi, Valerie C. Hendrix, Maoyi Huang, Ahmad Jan, Zarine Kakalia, Jitendra Kumar, Charles D. Koven, Li Li, Mario Melara, Lavanya Ramakrishnan, Daniel M. Ricciuto, Anthony P. Walker, Wei Zhi, Qing Zhu, Charuleka Varadharajan, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.