Research Data Governance. The Need for a System of Cross-organisational Responsibility for the Researcher’s Data Domain

Open Access | Apr 2025

1. Research Data Life Cycle and Dead Ends for Responsibility and Accountability Systems

Although researchers can draw on a full range of services and support materials for research data management (RDM) [1], integrating these services and materials into research (data) life cycles remains a challenge. As a consequence, it is not uncommon for researchers to fall short in finding, interpreting, and/or integrating relevant topics and guidelines at particular stages of the research (data) life cycle, or to come to a dead end regarding responsibility and accountability, especially with respect to service conditions, data curation and quality issues, legal contexts, and ethical approval. Three examples illustrate these dead ends. They are an abstract distillation of several years of consulting experience in research data management in the digital humanities, cultural and social sciences, which led to a reflection on the existing obstacles to implementing research data management from the perspective of researchers.

Example 1. Publishing: Data and/or articles might not be published because the ethical approval required by journals or repository services is missing. One reason for missing approval may be that responsibility is insufficiently defined when several organisations, academic bodies, or projects are involved.

Example 2. Hosting: Researchers might fail to establish a sustainable data hosting strategy for the active stages of the research data life cycle (e.g., collection, creation, or enrichment) due to the lack of existing or explicitly communicated regulations regarding responsibility models that formulate requirements for data maintenance with the help of a software service.

Example 3. Transfer: Researchers might have to deal with different or incompatible data transfer regulations because these regulations vary from (service) provider to provider across the entire research data life cycle, leading to obstacles to reusing and/or documenting, licensing, and publishing new or adapted research data.

What these examples have in common are challenges that are caused by responsibility systems that are situated outside of the individual researcher’s responsibility structures in or across organisations and are part of research data governance. The questions are: Which data governance system affects research data in which stages of the research data life cycle? What are the decision-making areas of (individual) researchers and of other stakeholders?

The research data policies of academic organisations typically state that researchers and/or principal investigators are themselves accountable for the entire research data life cycle. The role of the organisation in supporting researchers is left vague. Commonly, organisations install supportive stakeholders, such as data privacy officers, RDM offices/advisory teams, support units at computing centres for specific IT services, open access services at libraries, ethics committees (on the organisational or faculty level), and research units/offices on the organisational level that support proposals for (often large-scale) research projects in compliance with research funders’ requirements. These stakeholders typically work individually on a high level, have different responsibility systems, and are not usually closely integrated in RDM processes that address research- and/or discipline-specific requirements.

In the researchers’ understanding, they should be accountable for nothing less than sustainable, reproducible, and FAIR data (Wilkinson et al., 2016). At the same time, there is no established or communicated responsibility and decision-making structure, and the RDM stakeholders in their organisation(s) are not connected with the research (data) life cycles, accountability, and responsibility of the researchers.

The research data life cycle is a common process model used by researchers to identify the actual stages where responsibility systems (might fail to) apply from the researcher’s perspective. Importantly, the life cycle can be understood as a data domain of data governance programs according to Abraham, Schneider, and Vom Brocke (2019, p. 431). In this way, the research data life cycle is already connected to data governance.

Research data governance [2], as discussed here, is focused on researchers (and not on library and information professionals or IT professionals), as researchers are the owners of their research data, from design to collection, creation, enrichment, analysis, and publication. Furthermore, the focus is on research data rather than administrative data such as enrolment records (the third type of data according to Jim and Chang, 2018, p. 198). This essay focuses on the perspective of researchers in the digital humanities, social sciences, and cultural studies. These research areas are characterised by heterogeneous digital methods and workflows in their research data life cycle and similarly heterogeneous starting points with regard to data governance systems. At least according to a survey by Benfeldt Nielsen (2017), data governance is more established in computational science-related research fields. The degree of experience and standardisation, and the methods used throughout the research data life cycle in the digital social and cultural studies and humanities, vary across disciplines and academic organisations at faculty and department level. Data governance at the organisation-wide level, e.g., with the help of policies (cf. Section 3), can typically only state that discipline-specific standards should be applied.

I argue that a governance structure is missing that covers the researcher’s responsibilities and accountability structures. In addition, the data governance of domain-specific/relevant service providers installed in and/or across academic bodies or organisations should be part of the body of knowledge of such a research data governance model (e.g., covering and developing standard or best practice cooperation and regulations). The distinction between responsibility and accountability is important: an accountable person is able to assign a task to another person, who is in turn responsible for completing it. We need to develop a research data governance system that actually fits the researcher-specific and discipline- or domain-specific needs and that is able to create accountability relations between existing governance models at and across academic organisations.
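The distinction between accountability and responsibility, and the notion of a dead end, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the stage names follow the wording of this essay, while the `Assignment` record, the helper functions, and all example tasks and roles are hypothetical and not taken from any existing governance model.

```python
from dataclasses import dataclass
from typing import Optional

# Stage names follow the essay's wording; everything else is illustrative.
STAGES = ["design", "collection", "creation", "enrichment", "analysis", "publication"]


@dataclass(frozen=True)
class Assignment:
    stage: str
    task: str
    accountable: Optional[str]  # who may assign the task and answers for it
    responsible: Optional[str]  # who is asked to complete the task


def dead_ends(assignments):
    """Return assignments that lack an accountable or a responsible party."""
    return [a for a in assignments if a.accountable is None or a.responsible is None]


def uncovered_stages(assignments):
    """Return life cycle stages for which no assignment exists at all."""
    covered = {a.stage for a in assignments}
    return [s for s in STAGES if s not in covered]


# Hypothetical assignments, loosely modelled on Examples 1 and 2.
assignments = [
    Assignment("collection", "ethical approval", "principal investigator", None),
    Assignment("enrichment", "data hosting", None, "computing centre"),
    Assignment("publication", "repository deposit", "principal investigator", "library"),
]

for a in dead_ends(assignments):
    print(f"dead end: {a.task} ({a.stage})")
print("stages without any assignment:", uncovered_stages(assignments))
```

In this toy data, the ethical approval has an accountable person but nobody responsible for it, and the hosting task has a responsible unit but nobody accountable. Both would surface as dead ends from the researcher’s perspective.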

2. Data Governance as an Organisational System for Responsibility and Decision-Making

The term data governance is mainly defined and implemented in non-academic companies and organisations. The Data Governance Institute in Greensboro (U.S.), for example, provides the following definition:

‘Data Governance is a system of decision rights and accountability for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.’ (Data Governance Institute, 2024)

Data governance can thereby refer to organisational bodies, rules (policies, standards, guidelines, business rules), decision rights (how we ‘decide how to decide’), accountability, enforcement methods for people, and information systems as they perform information-related processes [3]. Following Khatri and Brown (2010, p. 148), the relation between management and governance can be defined as follows:

  • ‘Governance refers to what decisions must be made to ensure effective management and use of IT (decision domains) and who makes the decisions (locus of accountability for decision making).

  • Management involves making and implementing decisions.’

Data management is an explicit planning and monitoring tool for the organisation and administration of research data. Without data governance, data management may be understood as the implementation of an unwritten governance system that defines the accountability and responsibility for the various tasks and activities undertaken within the domain of research data management.
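The definition quoted above (who can take what actions with what information, when, under what circumstances, using what methods) can be read as a record with six fields. The following minimal sketch makes such a governance model explicit and queryable. The `DecisionRight` record, the `matching_rights` helper, and the example entry are assumptions made for illustration only; the life cycle stage is used as a stand-in for ‘when’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRight:
    who: str            # role holding the right
    action: str         # what action may be taken
    information: str    # with what information
    when: str           # life cycle stage, used here as a stand-in for "when"
    circumstances: str  # under what circumstances
    method: str         # using what methods


def matching_rights(rights, who, action, information, when):
    """Return the written rules covering a request; an empty list means none exists."""
    return [
        r for r in rights
        if (r.who, r.action, r.information, r.when) == (who, action, information, when)
    ]


# One hypothetical, explicitly written rule.
rights = [
    DecisionRight(
        who="researcher",
        action="publish",
        information="interview transcripts",
        when="publication",
        circumstances="after ethical approval",
        method="institutional repository",
    ),
]

covered = matching_rights(rights, "researcher", "publish", "interview transcripts", "publication")
uncovered = matching_rights(rights, "researcher", "transfer", "interview transcripts", "enrichment")
print(len(covered), "rule(s) cover publishing;", len(uncovered), "rule(s) cover transfer")
```

A request for which no rule is found corresponds to the situation described above, in which data management has to rely on an unwritten governance system.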

Focusing on data quality, data governance is defined by Otto (2011, p. 74) as follows: a ‘companywide framework for assigning decision-related rights and duties in order to be able to adequately handle data as a company asset’. Data governance establishes processes and tools that enable responsible, transparent, and comprehensible regulation and monitoring. Governance is then instantiated by a governance body (Ebel, 2021, pp. 163, 178). In the context of higher education, there are a number of institutional bodies that assume responsibility for the governance of data. These typically include computing centres and libraries.

Research data policies and guidelines for good scientific practices are standard tools in academic organisations at the rule level. These policies cannot, by definition, intervene profoundly in the research (data) life cycles. Such policies or guidelines specify the overall accountability of researchers for their data and software as well as the organisation’s responsibility to provide the framework for good RDM. On the same rule level, the FAIR Guidelines, as cross-organisational guidelines, provide a specific focus on the definition of goals for high quality metadata and data. Policies and requirements for data handling of research funding agencies such as the European Commission [4] or the German Research Foundation (Deutsche Forschungsgemeinschaft, 2022) operate on a similar level. Importantly, these policies refer to each other. Organisational policies for good scientific practices often refer, for example, to FAIR data and are, at least in the German academic system, compliant with common funding agencies’ requirements. On the one hand, data governance on the rule level provides high-level compliance. On the other hand, it may also refer to common standard RDM tools that cannot (and are not designed to) provide a system of decision rights and accountability. For example, the RDM process model at the standard level is the research data life cycle, which is a highly useful process model for implementing RDM. However, it is devoid of insights pertaining to accountability and decision-making. The standard with respect to guidelines is the data management plan, where responsibility, except for that of the principal investigator, is often not clearly defined in terms of what accountability means and how decisions are made.

Data governance comes with a vast array of approaches and models. I will focus on organisation and compliance, which are two of the four governance principles (the other two being alignment and common understanding) described by Brous, Janssen and Vilminko-Heikkinen (2016). Both principles illustrate the necessity for the development of a complementary data governance system (cf. Section 3). On the basis of the data governance framework of Abraham, Schneider, and Vom Brocke (2019, pp. 426–428), I focus on parameters relevant for organisations and data.

3. Research Data Governance as a Cross-organisational System for the Researcher’s Data Domain

For researchers, research data (research domain/data scope) and the research data life cycle (as a data decision domain) are not bound to a single organisation. Consequently, being compliant would require negotiating between the compliance regulations of different organisations. Data reuse from different organisations might add even more complexity to data governance systems. With the help of a research data governance system, researchers might be able to design their data ownership within and between disciplines and organisations. This is in line with Koltay (2016, p. 123), who argues the following:

‘To broaden such an ecosystem, researchers and policy makers must grapple with the growing and increasingly diverse landscape of organizations and stakeholders involved in the production and use of research data, seeking to understand the relationships both among these entities and between these entities and individuals who carry out research.’

For research data management, it is common sense that it ‘takes a village’ to manage data and software (Borgman and Bourne, 2022). If governance enables management (cf. Section 2), I argue that we also require a similar approach in the context of data governance systems, which must be defined, coordinated, and lived in a way that aligns with the different stages of the research data life cycle, which is based more on specific research questions and disciplines than on organisation-wide processes.

Abraham, Schneider, and Vom Brocke (2019, p. 426) and Lis and Otto (2020, pp. 2–3) distinguish between the intra- and inter-organisational scope. The intra-organisational scope can be applied to an entire organisation, while the inter-organisational scope defines data governance between organisations. While there is a growing emphasis on the development and communication of explicit data and IT governance models within libraries, computing centres, and larger research units, there is still a lack of integration between these models and a narrow focus on specific stages of the research data life cycle in research contexts: departments and faculties. This is at odds with the need for researchers to consider the entire research data life cycle and to integrate governance models with their own discipline-, institution-, or project-specific governance models. The exploratory study conducted by Kouper, Raymond, and Giroux (2020) on the researchers’ perception of data governance through a survey points in a similar direction, indicating that ‘[m]any individuals also belong to multiple communities, identifying themselves with a specific discipline and with interdisciplinary communities and with communities that are involved in various aspects of the data lifecycle’ (ibid., p. 131).

In terms of data scope (Abraham, Schneider, and Vom Brocke, 2019, p. 432), data governance usually focuses on administrative data generated or accumulated within academic contexts, e.g., enrolment data, financial data, and employee data, among others. But the nature of research data differs from administrative data. Research data is typically not collected, created, or analysed within a single organisation, but across multiple organisations. Furthermore, research data often varies in scope, domain, methods, format, analysis, and (re-)usage scenarios, in contrast to administrative data, which is typically standardised within a single organisation. This particular variation is typically better understood in the research domain and field than it is at the organisational level. It is therefore possible that a university data governance structure might fall short in terms of considering the variability and the specific requirements of different disciplines or research domains. In addition, research data has specific value throughout the entire research data life cycle for researchers, the research community, and the discipline or research area in question. From an organisation-wide perspective (e.g., university), the open access publication in repositories is of value (last stage of the research data life cycle). Usually, the service portfolio of one organisation is not sufficient for all stages of the research data life cycle (cf. Examples 2 and 3). Researchers may use tools and services across organisations and across national and disciplinary boundaries to create, enrich, analyse, and publish their data. For example, if libraries work with a data governance model, their research data services are typically repositories that play a role at either the conclusion or the outset of the research data life cycle.
Computing centres might operate under an IT/data governance model, and they provide research data services to the research community, such as Jupyter Hubs [5]. These services contribute to the research data life cycle stages of creation, enrichment, and analysis as a basic service, to which researchers might add research- and discipline-specific software and applications that usually require different governance models. Transferring data from one service to another or combining services may prove challenging when conflicting data governance regulations, e.g., related to data privacy, copyright, licensing, and identity management, must be combined (Example 3).
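The transfer problem of Example 3 can be sketched as a simple check of a data set against the regulations of a target service. This is a deliberately reduced illustration: the two rule types (accepted licences, handling of personal data), the service and data set names, and the licence identifiers are assumptions chosen for the example, and real regulations are far less uniform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRules:
    name: str
    accepted_licences: frozenset
    personal_data_allowed: bool


@dataclass(frozen=True)
class Dataset:
    title: str
    licence: str
    contains_personal_data: bool


def transfer_conflicts(dataset, target):
    """List the reasons why `target` cannot accept `dataset` under its rules."""
    conflicts = []
    if dataset.licence not in target.accepted_licences:
        conflicts.append(f"licence {dataset.licence} not accepted by {target.name}")
    if dataset.contains_personal_data and not target.personal_data_allowed:
        conflicts.append(f"{target.name} does not accept personal data")
    return conflicts


# Hypothetical repository and data set.
repository = ServiceRules(
    name="library repository",
    accepted_licences=frozenset({"CC-BY-4.0", "CC0-1.0"}),
    personal_data_allowed=False,
)
corpus = Dataset("annotated interview corpus", "CC-BY-NC-4.0", contains_personal_data=True)

for reason in transfer_conflicts(corpus, repository):
    print("conflict:", reason)
```

Even in this reduced form, the check presupposes that each provider’s regulations are written down in a comparable way, which is precisely what is often missing across organisations.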

Another question that arises is the organisational location of a research data governance system. The organisational level of a university, library, or computing centre is too broad, because they can often only focus on single stages in the research data life cycle, and existing governance systems are not designed for research data and varying cross-organisational research data life cycles. Individual researchers need a research data governance system that, among other things, acts as a broker between organisation-wide data governance systems within and across organisations, while addressing domain- or discipline-specific requirements (data domain, research data life cycle). The research data governance system should ideally be installed in the context of their research, e.g., at the faculty or department level, where there are already established and tested communication channels and responsibility domains (for different goals/purposes). A new decision-making body could be created and installed in the bodies of academic self-government (e.g., faculty or department) with the possibility to connect to discipline-related research infrastructures outside the university. In this way, faculties or departments would be able, firstly, to develop and, secondly, to decide with researcher communities, disciplines, projects or teams, what such a governance system should look like and how to adapt it in response to evolving requirements specific to the researcher’s data domain. A second advantage would be that research data governance could be more easily integrated into the curriculum through a department or faculty structure that supports the development of data literacies, because literacies are in turn also discussed in the context of data governance (e.g., Koltay, 2016).
Locating research data governance at this level (e.g., in a faculty) would ensure that relevant committees of (parts of) the university (e.g., data ethics committees or data protection advisory positions) are integrated into the research data governance systems (Example 1). A potential design of such a research data governance system could be a polycentric governance structure for a knowledge commons (cf. Kouper, Raymond, and Giroux, 2020). A polycentric approach would then enable the functional link between, for example, the research data governance of a faculty and the IT governance of the in-house or external computing centre or libraries that provide research data services to faculty researchers (Examples 2 and 3). A connecting nucleus in the polycentric structure of the research data governance model could be an RDM committee that represents the faculty disciplines/departments and consists of intra-organisational stakeholders from labs, centres, other committees and support units relevant to the faculty’s research fields. To legitimise the committee’s work and integrate it into the faculty’s body of knowledge, the faculty council would approve the committee’s proposals. The RDM committee could then establish cooperation with domain- and research-relevant inter-organisational units, and act as a steering board for the development of research data governance and as a broker and knowledge base for researchers. Consider Examples 1–3 again. In Example 1 (Publishing), the RDM committee would help to define the responsibility domains of the in-house ethical approval and point to external responsibility domains if, for example, a study is being conducted in another country. In Example 2 (Hosting), the researcher would benefit from IT service staff being part of the RDM committee, where hosting and maintenance strategies can be developed in-house.
For maintenance strategies that are external to the organisation, the RDM committee would at least be able to support/advise researchers in formulating principles or requirements that need to be met. In Example 3 (Transfer), the RDM committee would act as a hub of information and expertise to which researchers could turn to resolve case-specific transfer regulation issues. With a long-term perspective, the RDM committee could gain research-specific experience and build up and further develop faculty-wide knowledge and case management. As committees are typically appointed by the faculty itself, a sustainable staffing and organisational structure is also feasible without directly creating new positions.

The approach to research data governance discussed here has the potential to bridge the gap between organisation-wide governance structures (within and across organisations) and to establish a subject-specific and data-domain-related responsibility system for researchers in academic self-governance. This would enable researchers to gain insight into and develop their areas of decision-making and accountability throughout their research data life cycle, while respecting the freedom and domain-specificity of research.

Notes

[1] The RDM services include, for example, institutional research data managers, IT services for data preparation such as version control systems, and publication services such as repositories and backup systems. RDM materials include, for example, guidelines and policies such as the FAIR Guiding Principles for research (meta)data and software (Wilkinson et al., 2016; Barker et al., 2022; Chue Hong et al., 2021; Strasser, 2015), research data life cycle models (Cox and Tam, 2018), data management plans (Grossmann et al., 2024), software management plans (Bishop et al., 2023), RDM blogs such as the blog of the Digital Curation Centre (https://www.dcc.ac.uk/), discipline-specific RDM material from national consortia (https://www.nfdi.de/consortia/?lang=en), Open Access communities such as copim (https://www.copim.ac.uk/), funding agency guidelines for RDM such as those of OpenAIRE for Horizon Europe (https://www.openaire.eu/rdm-in-horizon-europe-proposals), and the Guidelines for Safeguarding Good Research Practice, the Code of Conduct of the German Research Foundation (Deutsche Forschungsgemeinschaft, 2022).

[2] At least to my knowledge, a research data governance model with a focus on the researcher’s perspective and the integration of an inter- and cross-organisational responsibility model in academic self-organisation is a novel approach.

[4] Recommendation of OpenAIRE for RDM in EU projects: https://www.openaire.eu/rdm-in-horizon-europe-proposals, accessed February 26, 2025.

[5] https://jupyter.org/hub, accessed December 18, 2024.

Acknowledgements

Many thanks for the critical discussion of this contribution go to Dr. Malte Belz (Humboldt-Universität zu Berlin). For the extensive support in all matters, especially concerning the effort and support in designing, implementing, and evaluating new profiles and structures for domain-specific research data management, I thank the Faculty of Language, Literature and Humanities at Humboldt-Universität zu Berlin.

Competing Interests

The author has no competing interests to declare.

Language: English
Submitted on: Jun 20, 2024
Accepted on: Mar 24, 2025
Published on: Apr 11, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Carolin Odebrecht, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.