
Figure 1
The double challenge of reducing the scope for both producers and consumers in their development and exploitation of essential data variables: from heterogeneous sources to broad dissemination.

Figure 2
CCI Open Data Portal architecture showing key public and private interfaces, including the CCI Web Presence and Toolbox interface, data services, and other CEDA, ESG and OGC interfaces (all discussed in the text).

Figure 3
The CCI Open Data Portal data publishing and querying workflow. Publishing consists of five key steps: acquisition of data into the archive (ingestion and validation), archive metadata creation, DRS metadata creation, ESGF ingestion, and OGC-CSW integration. Querying can exploit both the ESGF and OGC-CSW interfaces as well as a vocabulary service.
Table 1
Comparison of ESGF Search and CEDA CSW metadata catalogue and search services.
| Catalogue and Search Technology | Granularity of content | Supported data types | Use of controlled vocabularies | Dataset metadata | Links to data access services |
|---|---|---|---|---|---|
| OGC CSW/ISO 19115 metadata for CEDA MOLES Metadata Catalogue | Supports collections and datasets only | Any – dataset and dataset collection information is input manually | Limited use of unbound keywords | Full scope of ISO 19115, including abstract, responsible party, licensing and other constraints on use | Links at the granularity of datasets only. Links are provided using OnlineResources. Links have a name and description but are not classified by service type. |
| ESGF Search | Dataset and file level metadata | ESGF publisher works with gridded netCDF data only | Per-project DRS controlled vocabularies support faceted search | Limited dataset metadata, such as dataset variable information | Links provided at dataset and file level. Links are categorised by type, e.g. OPeNDAP, HTTPServer, GridFTP |

Figure 4
SKOS concept scheme for CCI DRS vocabularies (showing only downward concept mappings). Each is represented as a Collection (consistent with conventions adopted for the NVS) and a ConceptScheme, to enable bidirectional navigation between ConceptScheme and Concept.

Figure 5
CCI Open Data Portal Dashboard showing the temporal extent of datasets (top panel) and discovery metadata and temporal extent for contents within a specific dataset (ocean colour, bottom panel).

Figure 6
The Web Presence faceted search interface – integrating content from the Vocabulary Server, CSW and ESGF Search services.

Figure 7
Example CSW query using Processing Level and daily frequency SKOS concepts for keyword searching. This will query for records corresponding to Level 4 processed data (SKOS vocabulary concept http://vocab.ceda.ac.uk/collection/cci/procLev/proc_level4) with a temporal frequency of daily (SKOS vocabulary concept http://vocab.ceda.ac.uk/collection/cci/freq/freq_day).
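
A query of the kind described in the Figure 7 caption can be sketched as a CSW 2.0.2 GetRecords request whose OGC filter requires both concept URIs. This is a minimal illustration, not the request shown in the figure: the function name `build_getrecords`, the choice of `dc:subject` as the keyword queryable, and the `summary` element set are assumptions, and the service endpoint is deliberately left out. Only the two concept URIs come from the caption.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for CSW 2.0.2 and OGC Filter Encoding 1.1.
CSW = "http://www.opengis.net/cat/csw/2.0.2"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("csw", CSW)
ET.register_namespace("ogc", OGC)

# Concept URIs quoted in the Figure 7 caption.
PROC_LEVEL4 = "http://vocab.ceda.ac.uk/collection/cci/procLev/proc_level4"
FREQ_DAY = "http://vocab.ceda.ac.uk/collection/cci/freq/freq_day"


def build_getrecords(concept_uris, property_name="dc:subject"):
    """Build a GetRecords body matching records tagged with every URI.

    `property_name` is an assumed keyword queryable; the catalogue in
    the paper may expose keywords under a different name.
    """
    root = ET.Element(
        f"{{{CSW}}}GetRecords",
        {"service": "CSW", "version": "2.0.2", "resultType": "results"},
    )
    query = ET.SubElement(root, f"{{{CSW}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW}}}ElementSetName").text = "summary"
    constraint = ET.SubElement(query, f"{{{CSW}}}Constraint", {"version": "1.1.0"})
    flt = ET.SubElement(constraint, f"{{{OGC}}}Filter")
    # Combine several keyword tests with a logical AND.
    parent = ET.SubElement(flt, f"{{{OGC}}}And") if len(concept_uris) > 1 else flt
    for uri in concept_uris:
        test = ET.SubElement(parent, f"{{{OGC}}}PropertyIsEqualTo")
        ET.SubElement(test, f"{{{OGC}}}PropertyName").text = property_name
        ET.SubElement(test, f"{{{OGC}}}Literal").text = uri
    return ET.tostring(root, encoding="unicode")


request_xml = build_getrecords([PROC_LEVEL4, FREQ_DAY])
print(request_xml)
```

The resulting XML would be sent as the body of an HTTP POST to the catalogue's CSW endpoint. Using the concept URIs rather than free-text labels keeps the search tied to the controlled vocabulary.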

Figure 8
Two examples of third-party applications consuming CCI Open Data Portal services: A WMS client (top panel) developed by PML Applications Ltd showing Sea Surface Temperature and the CCI ToolBox (bottom panel) showing soil moisture data.

| Abbreviation | Definition |
|---|---|
| CEDA | Centre for Environmental Data Analysis |
| CF-NetCDF | Denotes data formatted using the Network Common Data Form (NetCDF) from Unidata which complies with the Climate and Forecast Conventions for describing and naming data variables. |
| CCI | The ESA Climate Change Initiative |
| CMIP5 | 5th Coupled Model Intercomparison Project |
| CSW | Catalogue Service for the Web – standard from the Open Geospatial Consortium defining a web service interface for discovery of catalogue records that describe geospatial data. |
| CWIC | CEOS (Committee on Earth Observation Satellites) WGISS (Working Group on Information Systems and Services) Integrated Catalogue |
| DRS | Data Reference Syntax – a system of controlled vocabularies for describing data defined for datasets hosted in the Earth System Grid Federation. |
| ECV | Essential Climate Variable (within the ESA CCI). |
| ESA | European Space Agency |
| ESGF | Earth System Grid Federation |
| GEOSS | Global Earth Observation System of Systems |
| HTTP | Hypertext Transfer Protocol |
| JASMIN | The UK Joint Analysis supercomputer |
| MOLES | Metadata Objects for Linking Environmental Sciences – a metadata information model developed for CEDA. |
| NCML | NetCDF Markup Language |
| NetCDF | Network Common Data Form – data format and software libraries from Unidata for array-based scientific data |
| NOSQL | Not-only SQL – Collective name for database technologies which do not follow the strict rules governing traditional relational databases. |
| NVS | NERC (Natural Environment Research Council) Vocabulary Service. |
| OBS4MIPS | The Project for Observations for Model Intercomparison Projects |
| OGC | Open Geospatial Consortium |
| OPeNDAP | Open-source Project for a Network Data Access Protocol – initiative focussed on the development of services for remote access of gridded data through the DAP (Data Access Protocol) specification. |
| OWL | Web Ontology Language |
| SKOS | Simple Knowledge Organisation System |
| THREDDS | Thematic Real-time Environmental Distributed Data Services – project from Unidata to develop middleware to bridge between data providers and users |
| WMS | Web Map Service – Open Geospatial Consortium standard which defines a web service interface for accessing geo-referenced map images |
| WCS | Web Coverage Service – Open Geospatial Consortium standard which defines a web service interface for serving data coverages defined by geo-temporal query parameters. |
| WCPS | Web Coverage Processing Service – Open Geospatial Consortium standard which defines a query language for filtering and processing multi-dimensional coverage data. |
