
Time-Series Trend of Pandemic SARS-CoV-2 Variants Visualized Using Batch-Learning Self-Organizing Map for Oligonucleotide Compositions

Open Access | Sep 2021

Figures & Tables

Table 1

Number of SARS-CoV-2 genome sequences with less than 10% (A) and less than 1% (B) unknown nucleotides used in this study.

Unknown: genome sequences for which continent was not registered.

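(A) Number of sequences with less than 10% unknown nucleotides

| Clade \ Continent | Asia | Europe | North America | Oceania | Africa | South America | Unknown | Total |
|---|---|---|---|---|---|---|---|---|
| S | 794 | 1,860 | 3,449 | 664 | 110 | 74 | 0 | 6,951 |
| L | 823 | 3,196 | 600 | 65 | 4 | 11 | 0 | 4,699 |
| V | 247 | 4,687 | 402 | 253 | 13 | 23 | 0 | 5,625 |
| G | 979 | 20,928 | 6,568 | 1,106 | 1,141 | 461 | 0 | 31,183 |
| GH | 2,058 | 10,325 | 23,916 | 964 | 232 | 176 | 0 | 37,671 |
| GR | 2,657 | 42,888 | 5,251 | 11,135 | 1,632 | 1,129 | 0 | 64,692 |
| GV | 3 | 12,229 | 3 | 14 | 0 | 0 | 0 | 12,249 |
| O | 2,220 | 1,127 | 553 | 531 | 60 | 25 | 0 | 4,516 |
| Non-human host | 35 | 247 | 19 | 0 | 1 | 4 | 13 | 319 |
| Total | 9,816 | 97,487 | 40,761 | 14,732 | 3,193 | 1,903 | 13 | 167,905 |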
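
(B) Number of sequences with less than 1% unknown nucleotides

| Clade \ Continent | Asia | Europe | North America | Oceania | Africa | South America | Unknown | Total |
|---|---|---|---|---|---|---|---|---|
| S | 731 | 1,047 | 3,056 | 466 | 71 | 58 | 0 | 5,429 |
| L | 760 | 1,964 | 549 | 49 | 2 | 10 | 0 | 3,334 |
| V | 228 | 3,036 | 366 | 207 | 10 | 17 | 0 | 3,864 |
| G | 877 | 15,200 | 5,071 | 858 | 634 | 300 | 0 | 22,940 |
| GH | 1,923 | 8,365 | 19,014 | 717 | 191 | 150 | 0 | 30,360 |
| GR | 2,425 | 32,518 | 4,549 | 9,166 | 1,180 | 871 | 0 | 50,709 |
| GV | 3 | 10,712 | 3 | 11 | 0 | 0 | 0 | 10,729 |
| O | 1,824 | 522 | 349 | 415 | 30 | 9 | 0 | 3,149 |
| Non-human host | 30 | 176 | 19 | 0 | 1 | 0 | 13 | 239 |
| Total | 8,801 | 73,540 | 32,976 | 11,889 | 2,119 | 1,415 | 13 | 130,753 |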
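
The two quality thresholds behind Table 1 can be illustrated with a minimal sketch. It assumes that an "unknown nucleotide" is any character other than A, C, G, T or U; the source does not spell out its exact definition, and the function names here are hypothetical.

```python
from typing import Dict

# Assumption: anything outside this set counts as an unknown nucleotide.
STANDARD_BASES = frozenset("ACGTU")


def unknown_fraction(seq: str) -> float:
    """Return the fraction of characters in seq that are not standard bases."""
    if not seq:
        return 1.0  # treat an empty sequence as entirely unknown
    unknown = sum(1 for base in seq.upper() if base not in STANDARD_BASES)
    return unknown / len(seq)


def filter_by_unknown(seqs: Dict[str, str], threshold: float) -> Dict[str, str]:
    """Keep sequences whose unknown fraction is strictly below threshold."""
    return {name: s for name, s in seqs.items() if unknown_fraction(s) < threshold}
```

Under these assumptions, calling `filter_by_unknown(genomes, 0.10)` and `filter_by_unknown(genomes, 0.01)` on the same input would give the two nested datasets that panels (A) and (B) count.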
Figure 1

BLSOM for pentanucleotide usage. (A) Pentanucleotide composition and (B) the corresponding odds ratio for sequences with less than 10% unknown nucleotides. (C) Pentanucleotide composition and (D) the corresponding odds ratio for sequences with less than 1% unknown nucleotides. Lattice points that include sequences from more than one clade are shown in black, those that contain no genomic sequences are left blank, and those containing sequences from a single clade are shown in a clade-specific color (color key in the original figure) for S, L, V, G, GH, GR, GV, O, and non-human host. (E) Distribution of sequences by continent on the BLSOM with the pentanucleotide odds ratio. Lattice points that include sequences from more than one continent are shown in black, those that contain no genomic sequences are left blank, and those containing sequences from a single continent are shown in a continent-specific color (color key in the original figure) for Asia, Europe, North America, Oceania, Africa, and South America.
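
The coloring rule in this caption (black for mixed lattice points, blank for empty ones, a category color otherwise) can be sketched as follows. This is only an illustration: the input format, the function name, and the palette are assumptions, and the actual colors used in the figure are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Point = Tuple[int, int]
MIXED = "black"  # label for lattice points holding more than one category


def label_lattice_points(
    assignments: Iterable[Tuple[Point, str]],
    grid_shape: Tuple[int, int],
    palette: Dict[str, str],
) -> Dict[Point, Optional[str]]:
    """Map every lattice point to a color, MIXED, or None (blank).

    assignments yields (lattice_point, category) pairs, one per sequence;
    category is a clade or a continent. palette is a hypothetical color key.
    """
    categories_at = defaultdict(set)
    for point, category in assignments:
        categories_at[point].add(category)

    rows, cols = grid_shape
    labels: Dict[Point, Optional[str]] = {}
    for r in range(rows):
        for c in range(cols):
            found = categories_at.get((r, c), set())
            if not found:
                labels[(r, c)] = None  # no sequences: left blank
            elif len(found) == 1:
                labels[(r, c)] = palette[next(iter(found))]
            else:
                labels[(r, c)] = MIXED  # more than one category
    return labels
```

The same function covers panels (A) to (D) when the category is the clade and panel (E) when it is the continent.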

Figure 2

3D display of viral classification by clade and continent. The Z-axis corresponds to the number of sequences attributed to each lattice point. Results for all continents are shown in the ALL panel for each clade. In clades G, GH, GR, and GV, lattice points with fewer than five sequences are not shown. The vertical bars for individual continents are distinguished by continent-specific colors (color key in the original figure) for Asia, Europe, North America, and Oceania. Different subclusters are given suffix numbers.
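
The bar heights described in this caption amount to counting sequences per lattice point, with sparse points hidden for the four larger clades. The sketch below assumes the five-sequence cutoff applies to the per-clade count at each lattice point; the caption does not say whether it is applied per panel, and the names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Point = Tuple[int, int]

# Clades for which the caption says sparse lattice points are hidden.
SPARSE_FILTERED_CLADES = frozenset({"G", "GH", "GR", "GV"})


def lattice_heights(
    records: Iterable[Tuple[Point, str]], clade: str, min_count: int = 5
) -> Dict[Point, int]:
    """Count sequences of one clade per lattice point (the Z-axis value).

    records yields (lattice_point, clade) pairs, one per sequence. For the
    clades in SPARSE_FILTERED_CLADES, points with fewer than min_count
    sequences are dropped.
    """
    counts = Counter(point for point, c in records if c == clade)
    if clade in SPARSE_FILTERED_CLADES:
        return {p: n for p, n in counts.items() if n >= min_count}
    return dict(counts)
```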

Figure 3

3D display of temporospatial changes. The Z-axis corresponds to the number of sequences attributed to each lattice point. Results for all collection months are shown in the ALL panel for each continent. The vertical bars for individual clades are distinguished by clade-specific colors (color key in the original figure) for S, L, V, G, GH, GR, and GV.

Figure 4

100% stacked bar graphs showing the time-series transition of each subcluster in each continent for clades S (A), L (B), V (C), G (D), GH (E), and GR (F). The colors of each subcluster are indicated at the bottom of each figure. The results for months with more than 100 sequences are shown as thick horizontal bars. The number of sequences used in this analysis is given in Supplementary Table S1.
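
A 100% stacked bar is obtained by rescaling the subcluster counts of each month so that they sum to 100. The sketch below shows that normalization together with the "more than 100 sequences" rule for thick bars; the data layout and function name are assumptions, not taken from the paper.

```python
from typing import Dict


def to_percent_stack(
    counts_by_month: Dict[str, Dict[str, int]], thick_threshold: int = 100
) -> Dict[str, dict]:
    """Convert monthly subcluster counts into percentage shares.

    counts_by_month maps a month label to {subcluster: sequence count}.
    Months with no sequences are skipped. A month is flagged "thick" when
    its total exceeds thick_threshold.
    """
    result: Dict[str, dict] = {}
    for month, counts in counts_by_month.items():
        total = sum(counts.values())
        if total == 0:
            continue
        shares = {sub: 100.0 * n / total for sub, n in counts.items()}
        result[month] = {
            "shares": shares,
            "total": total,
            "thick": total > thick_threshold,
        }
    return result
```

Each month's `shares` would become one horizontal bar, with the `thick` flag deciding the bar width.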

Language: English
Submitted on: Mar 23, 2021
Accepted on: Sep 10, 2021
Published on: Sep 21, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Takashi Abe, Ryuki Furukawa, Yuki Iwasaki, Toshimichi Ikemura, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.