Table 1
Examples of use cases being re-written.
| As A | Theme | I want | So that |
|---|---|---|---|
| Ph.D Candidate | Economics | To have advanced search functionality | So he can refine a search when needed |
| Researcher | Herpetology | To find more data to correlate with the locations of her tortoise populations | So she can put her research into perspective and identify collaborators |

Figure 1
Two-layered grouping of use cases. The first and second layers are derived from axial coding and open coding, respectively.
Table 2
From Use Cases to Requirements.
| As A | Theme | I want | So that |
|---|---|---|---|
| Researcher | Social Science | To see what data is available right now | Make a forecast |
| Researcher | Social Science | Cares about Access Conditions | |
| Researcher | Physical Science | wants a very prominent Download button | |
| Researcher | Computer Science | see (data) publish date or available date | |
| Researcher | Health Science | data | |
| Requirement | Indication of data availability | | |
Table 3
Nine user requirements elicited from use cases.
| User Requirements | User Type | Actors who can meet requirement | Description (extracts from the ‘so that’ field) |
|---|---|---|---|
| REQ 1. Indication of data availability | Researcher/Research Student | Data Repository Operator, Data Provider | If there is no clear indication of data availability, the search is usually dropped within the first 2 minutes. A ‘sort by availability’ function could also reveal potential data embargoes. Ideally, there should be a large, prominent ‘Download’ button. |
| REQ 2. Connection of data with person/institution/paper/citations/grants | Funder/Researcher/Research Student | Data Repository Operator, Data Provider | This allows for the ranking of datasets and the connection of the displayed information with personal details, as well as accountability. This information can also be used for grant applications and for comparative studies (e.g. datasets that underpin several papers). Finally, the repository should allow a manuscript to be uploaded for direct connection. |
| REQ 3. Fully annotated data (including granularity, origin, licensing, provenance, method of production, and times downloaded) | Researcher/Research Student | Data Provider, Data Repository Operator | This information validates the use of a dataset in a particular study and removes the need to read the corresponding manuscript to understand the data. To judge validity, users need to know where and when the data was measured, and the basic experimental and instrumental parameters; these are more important than, e.g., who created the data. To assess the validity of the data, users look at the repository or paper, then at the data itself to see if it makes sense. |
| REQ 4. Filtering of data based on specific criteria on multiple fields at the same time (such as release date, geo coverage, text content, date range, specific events). | Researcher/Research Student | Data Repository Operator, Data Provider | Support targeted studies (e.g. find global temperature records for volcanic eruptions in the last century; find articles on the Bronze Age in Britain). |
| REQ 5. Cross-referencing of data (same or different repositories). | Researcher/Research Student | Data Provider, Data Repository Operator | Having the same data under different identifiers is inconvenient for studies. Also, there are multiple instances/versions, and reproducibility necessitates specific uses every time. Finally, cross-referencing will avoid duplication and maximize efficiency and access. |
| REQ 6. Visual analytics/inspection of data/thumbnail preview | Researcher | Data Repository Operator | Decide if this data set is right for a research purpose. Also allows quick visual filtering from a results set. |
| REQ 7. Sharing data (either whole dataset, particular records, or bibliographic information) in a collaborative environment | Researcher/Research Student | Data Repository Operator | Ensure that there is a common space for keeping both data and their versions over time; this alleviates the need to rerun a search at the last minute to check that nothing has been published since the last study/search, or to share bibliographic information about data. |
| REQ 8. Accompanying educational/training material | Librarian | Research Office/Libraries, Data Repository | Help researchers manage and discover data in a methodical and seamless manner. |
| REQ 9. Portal functionality similar to other established academic portals | Researcher | Data Repository Operator | For example, finding more within a subject, visual search (i.e. drawing a structure to search for), free-text search, query-building functionality, subscriptions, and saved lists. |
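
As an illustration of REQ 4 (filtering on multiple fields at the same time) and the ‘sort by availability’ idea mentioned under REQ 1, the following is a minimal sketch rather than an implementation used by any repository. The `DatasetRecord` fields, the helper names `filter_records` and `sort_by_availability`, and the sample records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List


@dataclass
class DatasetRecord:
    # Hypothetical metadata fields; real repositories expose different schemas.
    title: str
    release_date: date
    region: str
    available: bool  # False could indicate an embargo


Predicate = Callable[[DatasetRecord], bool]


def filter_records(records: Iterable[DatasetRecord],
                   predicates: Iterable[Predicate]) -> List[DatasetRecord]:
    """Keep records that satisfy every predicate (logical AND across fields)."""
    checks = list(predicates)
    return [r for r in records if all(check(r) for check in checks)]


def sort_by_availability(records: Iterable[DatasetRecord]) -> List[DatasetRecord]:
    """Stable sort that lists currently available datasets first."""
    return sorted(records, key=lambda r: not r.available)


# Made-up sample records, for illustration only.
records = [
    DatasetRecord("Temperature records B", date(2015, 1, 1), "global", False),
    DatasetRecord("Temperature records A", date(1991, 6, 15), "global", True),
    DatasetRecord("Bronze Age finds", date(2003, 3, 2), "Britain", True),
]

# Combine two criteria: geographic coverage and a release-date lower bound.
hits = filter_records(records, [
    lambda r: r.region == "global",
    lambda r: r.release_date >= date(1990, 1, 1),
])
ranked = sort_by_availability(hits)
```

Expressing each criterion as an independent predicate keeps the filter open to further fields (text content, specific events) without changing the filtering function itself.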

Figure 2
Query interface from National Snow & Ice Data Center (http://nsidc.org/data/search/), with spatial and temporal search up front.

Figure 3
An example of preview data from Elsevier Datasearch (https://datasearch.elsevier.com/#/).
Table 4
Matching requirements to recommendations.
| Requirement | Recommendations |
|---|---|
| REQ1: data availability | REC3 Assessable search result |
| REQ2: Connection of data | REC2 Multiple access points; REC4 Readable metadata records; REC8 Identifiable duplicates |
| REQ3: Annotations | REC3 Assessable search result; REC4 Readable and analysable metadata records; REC6 Available data usage statistics |
| REQ4: Filtering with single or multiple criteria | REC1 Multiple query interfaces; REC2 Multiple access points |
| REQ5: Cross-reference | REC8 Identifiable duplicates |
| REQ6: Inspection of data | REC1 Multiple query interfaces; REC2 Multiple access points; REC3 Assessable search result |
| REQ7: Collaborative environment | REC5 Available bibliographic references |
| REQ8: Training material | Eleven quick tips for finding research data |
| REQ9: Similarity across portals | REC1 Multiple query interfaces; REC2 Multiple access points; REC7 Consistent interface |
| Support data searchers from web search engines | REC9 Findable from web search engines |
| The FAIR Data Principles – interoperability | REC10 Interoperability with other repositories |
