
Figure 1
Overview of the stages performed to develop a smart questionnaire for the Data Stewardship Wizard (DSW): gathering relevant knowledge sources, developing a DSW knowledge model (questionnaire), validating the questionnaire, aligning with the Research Data Management toolkit for Life Sciences (RDMkit) (ELIXIR-CONVERGE 2022) and FAIR Cookbook (FAIRplus 2022), and publishing the questionnaire in a DSW instance.
Abbreviations: Data Stewardship Wizard (DSW); European Reference Networks (ERNs); findable, accessible, interoperable, and reusable (FAIR).
Legend: input (yellow), output (blue).
Table 1
Overview of the workflow steps and inventory of topics and implementations.
| WORKFLOW STEP | RELATED TOPICS | IMPLEMENTATION |
|---|---|---|
| 1. Identify FAIR objectives and expertise | a. Defining objectives | |
| | b. Giving training | |
| | c. Hiring of personnel | |
| 2. Define data elements to be collected | a. Common data elements | CDE core elements (European Commission 2019) |
| | b. Data dictionary | ERDRI.mdr (European Commission 2022a) |
| | c. Central metadata repository registration | |
| 3. Define metadata elements to be collected | a. Machine interpretable metadata | EJP RD metadata model (EJP RD 2022d) |
| | b. Metadata store | FAIR data point (Bonino da Silva Santos et al. 2023) |
| 4. Create a semantic data model | a. Reuse of existing model(s) | CDE semantic model (Kaliyaperumal et al. 2022); CDISC ODM (CDISC 2022); HL7 FHIR (HL7 2022); OMOP CDM (OHDSI 2022) |
| 5. Obtain consent | a. Standardized informed consent form | ERN ICF (EJP RD 2022b) |
| 6. Enter (FAIR) data | a. Electronic data capture systems | |
| 7. Standardize metadata | a. Metadata model(s) | EJP RD metadata model (EJP RD 2022d) |
| | b. Standard terminology | CDE semantic model terminology (EJP RD 2022e) |
| 8. Transform (meta)data to RDF | a. Data transformation | CDE in a box (EJP RD 2022a) |
| | b. Terminology mappings | |
| 9. Manage authentication and authorization | a. Authorization roles | |
| | b. Access conditions | |
| | c. Data pseudonymization | |
| | d. Querying | |
[i] Abbreviations: common data elements on rare disease registration (CDE); European Platform on Rare Disease Registration metadata repository (ERDRI.mdr); European Reference Network (ERN); Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR); Clinical Data Interchange Standards Consortium Operational Data Model (CDISC ODM); Observational Medical Outcomes Partnership Common Data Model (OMOP CDM); findable, accessible, interoperable, and reusable (FAIR); informed consent form (ICF); Resource Description Framework (RDF).
Table 2
Challenges and categories from Dos Santos Vieira et al. (2022) that were included in the questionnaire during the mind mapping phase. Challenges marked as indirectly covered are not specifically mentioned in the questionnaire but were solved solely by the use of the Data Stewardship Wizard (DSW) and the questionnaire.
| CATEGORY | DIRECTLY INCLUDED | INDIRECTLY INCLUDED | MOTIVATION |
|---|---|---|---|
| Community | 0 out of 7 | 7 out of 7 | All challenges addressed a lack of alignment between registries. The DSW questionnaire solves this issue. |
| Implementation | 7 out of 9 | 0 out of 9 | Two not-included challenges were irrelevant at the time of developing the questionnaire. |
| Legal | 3 out of 5 | 0 out of 5 | Two not-included challenges addressed a tool that was not relevant for developing the questionnaire. |
| Modeling | 3 out of 5 | 0 out of 5 | Two not-included challenges addressed issues that were too specific. |
| Training | 9 out of 15 | 1 out of 15 | Five not-included challenges addressed irrelevant tools. One indirectly covered challenge was not mentioned specifically in the questionnaire but could be deduced from the information. |
| All categories | 22 out of 41 | 8 out of 41 | |
Table 3
Quantification of the received feedback per chapter. Feedback is categorized as textual change, structural change, or software issue.
| CHAPTER | TEXTUAL CHANGES | STRUCTURAL CHANGES | SOFTWARE ISSUES |
|---|---|---|---|
| Administrative information | 18 | 5 | 3 |
| Reusing data | 13 | 2 | 0 |
| Creating and collecting data | 3 | 2 | 2 |
| Processing data | 19 | 3 | 0 |
| Interpreting data | 47 | 8 | 0 |
| Describing data | 10 | 1 | 0 |
| Giving access to data | 27 | 3 | 0 |
| All chapters | 137 | 24 | 5 |
Table 4
Questions and external references per chapter. Top-level questions precede all other questions and are always presented to the user.
| CHAPTER | TOP-LEVEL QUESTIONS | TOTAL QUESTIONS | REFERENCES TO FAIR COOKBOOK | REFERENCES TO RDMKIT |
|---|---|---|---|---|
| Administrative information | 6 | 15 | 1 | 4 |
| Reusing data | 2 | 9 | 3 | 3 |
| Creating and collecting data | 2 | 5 | 1 | 1 |
| Processing data | 1 | 5 | 0 | 2 |
| Interpreting data | 2 | 12 | 4 | 1 |
| Describing data | 2 | 4 | 0 | 0 |
| Giving access to data | 4 | 7 | 1 | 2 |
| All chapters | 19 | 57 | 10 | 13 |
[i] Abbreviations: findable, accessible, interoperable, and reusable (FAIR); Research Data Management toolkit for Life Sciences (RDMkit) (ELIXIR-CONVERGE 2022); FAIR Cookbook (FAIRplus 2022).

Figure 2
Simplified view of the knowledge model.
Abbreviations: common data elements (CDE); electronic data capture (EDC); European Joint Programme on Rare Diseases (EJP RD); European Platform on Rare Disease Registration (ERDRI); findable, accessible, interoperable, and reusable (FAIR); Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR); Clinical Data Interchange Standards Consortium Operational Data Model (CDISC ODM); Observational Medical Outcomes Partnership Common Data Model (OMOP CDM); REpresentational State Transfer Application Programming Interface (REST API); SPARQL Protocol and RDF Query Language (SPARQL).

Figure 3
Screenshot of the knowledge model with top-level questions (Data Stewardship Wizard knowledge model editor module).

Figure 4
Screenshot of the first question of the ‘Describing data’ chapter (Data Stewardship Wizard questionnaire module).
