Table 1
Description of data types for submission.
| Exome/Whole Genome Sequence | 16S rRNA Microbiome studies | Genome Wide Association studies/genotyping arrays |
|---|---|---|
| Study type and description | Study type and description | Study type and description |
| Sequencing platform and technology used | Sequencing platform and technology used | Genotyping array model/name and description of the software and version used for calling the genotypes |
| FASTQ files linked with de-identified participant ID (minus technical reads such as adapters, linkers, barcodes) | FASTQ files linked with de-identified participant ID (minus technical reads such as adapters, linkers, barcodes) | Raw intensity files linked with de-identified participant IDs (IDATs, CELs) |
| Binary Alignment Map files (BAMs, de-multiplexed) linked with de-identified participant ID | | Manifest file describing SNP or probe content on the genotyping array |
| Associated phenotypic data collected | Associated phenotypic data collected | Associated phenotypic data collected |
| Variant Call Format files (VCFs) | Final analysis BIOM files (at minimum must contain OTUs) | Final reports and analysis files generated |
| Mapping file indicating the relationship between the submitted files | Mapping file indicating the relationship between the submitted files | Mapping file indicating the relationship between the submitted files (completed Array Format template) |

Figure 1
Timeline for submission of data to public repositories, extracted from the H3Africa Data Sharing, Access and Release Policy.

Figure 2
Diagram showing the process for submission of data to the Archive and EGA.

Table 2
Example transfer speeds and times for moving data within the Archive and between the Archive and EGA.
| # | From | To | Average Mb/s (megabits) | Average MB/s (megabytes) | Time to transfer (days) | Size |
|---|---|---|---|---|---|---|
| 1 | Vault | Landing Area | 120 | 15 | 6 | 8.9 TB |
| 2 | Other local server | EGA (Aspera) | 16 | 2 | 90 | 8.9 TB |
| 3 | Vault | Hard Drive (directly into port) | 200 | 25 | 3 | 8.9 TB |
| 4 | Landing Area | EGA (Aspera) | 240 | 30 | 2 | 8.9 TB |

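The relationship between the two speed columns in Table 2 is a unit conversion (8 bits per byte), and a rough transfer time can be estimated from the data size and the rate. The following is a minimal sketch, not part of the original workflow: the function names are hypothetical, decimal units (1 TB = 10^6 MB) are assumed, and the result is an idealised figure for a constant sustained rate. The observed times reported in Table 2 need not match it, since real transfers rarely hold a constant rate.

```python
def mbps_to_mbytes_per_s(megabits_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return megabits_per_s / 8


def ideal_transfer_days(size_tb: float, mbytes_per_s: float) -> float:
    """Idealised transfer time in days at a constant rate.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    seconds = size_tb * 1_000_000 / mbytes_per_s
    return seconds / 86_400  # seconds per day


# Routes and average megabit rates taken from Table 2; the size is 8.9 TB throughout.
routes = [
    ("Vault", "Landing Area", 120),
    ("Other local server", "EGA (Aspera)", 16),
    ("Vault", "Hard Drive", 200),
    ("Landing Area", "EGA (Aspera)", 240),
]

for source, destination, megabits in routes:
    rate = mbps_to_mbytes_per_s(megabits)
    days = ideal_transfer_days(8.9, rate)
    print(f"{source} -> {destination}: {rate:g} MB/s, about {days:.1f} days (idealised)")
```

An estimate of this kind is only useful for planning: it ignores protocol overhead, contention on shared links, and interrupted or restarted transfers.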