
Data Organization Made Easy: Comprehensive Folder Structure Template for Early Career Life/Natural Science Researchers
By: Yasmin Demerdash, Ron Dockhorn and Jeanne Wilbrandt
DOI: https://doi.org/10.5334/dsj-2025-035 | Journal eISSN: 1683-1470
Language: English
Page range: 35 - 35
Submitted on: Jul 21, 2025
Accepted on: Nov 6, 2025
Published on: Dec 2, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2025 Yasmin Demerdash, Ron Dockhorn, Jeanne Wilbrandt, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.