Data Management in a Community-Based Birth Cohort: What the SEMILLA Study Teaches Us

Open Access | Feb 2026

Abstract

In cohort studies, systematic information management often receives limited attention in study protocols, resulting in delays, quality issues, and threats to data validity. This paper describes the data management process of a community-based cohort study, using the SEMILLA (Study of Environmental Exposure of Mothers and Infants Impacted by Large-Scale Agriculture) study conducted in Cayambe, Ecuador, as a case example, and highlights the challenges, adaptations, and lessons learned, with the aim of informing similar studies.

The SEMILLA data management process was structured around three key responsibility areas: strategic management, technical coordination, and data administration. The process unfolded in two main stages. The Preparatory stage involved iterative refinement of data collection instruments, definition of coding rules, platform adjustments and migration, continuous team training, and implementation of security and anonymization protocols. The Organization stage included the assignment of interviewers for field data entry, primary data cleaning, creation of additional variables to better describe the sample composition and operational conditions, and the production of technical documentation.

This approach improved data entry consistency, reduced recurrent errors, and strengthened record traceability throughout follow-up through operational monitoring procedures. Key lessons include the importance of establishing a data management protocol and involving a data manager from the study design phase, maintaining flexibility in selecting data collection platforms, ensuring proper assignment of interviewers for each instrument, automating quality control processes, and continuously generating technical and operational documentation. Collectively, these practices help preserve data quality and promote operational efficiency in longitudinal studies conducted in similar contexts.

Language: English
Submitted on: Jul 1, 2025
Accepted on: Jan 13, 2026
Published on: Feb 6, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Nataly Cadena, Fadya Orozco, Stephanie Montenegro, Fabián Muñoz, Alexis J. Handal, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.