
A Framework for Data-Driven Solutions with COVID-19 Illustrations

Open Access | Nov 2021

Figures & Tables

Figure 1

A Johns Hopkins COVID–19 visualisation dashboard.

(Source: https://coronavirus.jhu.edu/map.html)

Figure 2

A diagrammatical illustration of the interaction of challenges, data and skills.

Figure 3

Graphical illustration of the CNN classification and assessment process.

Algorithm 1

Adaptation of the SMA Algorithm (Mwitondi et al. 2020) for Animation & Visualisation.

Table 1

Typical variables of interest for animation and visualisation.

| VARIABLES | NOTATION | DESCRIPTION AND RELEVANCE |
|---|---|---|
| Population | δ | Population affected by a phenomenon: This may be a national, regional or city population from which other variables are obtained |
| GDP | γ | Gross Domestic Product of a country: Vital for comparative purposes |
| Unemployment | ξ | Unemployment rate: Global, national, regional or city level |
| Location | λ | Where a phenomenon happens: Useful for spatio–temporal comparisons |
| Time | τ | Year, month, week, day etc: Useful for spatio–temporal comparisons |
| COVID–19 | κ | Deaths, infections, hospitalisation rates, variants |
| PPE | π | Personal Protective Equipment: Associated with COVID–19 etc. |
Figure 4

An architecture of a CNN model.

Figure 5

Convolutional values are obtained by sliding the kernel over data.
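
The sliding operation named in the Figure 5 caption can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `slide_kernel`, the stride of 1, the absence of padding and the toy `image` and `kernel` values are all assumptions made here. As in most CNN libraries, the kernel is applied without flipping (strictly, a cross-correlation).

```python
def slide_kernel(data, kernel):
    """Slide `kernel` over `data` (stride 1, no padding).

    Each output value is the sum of elementwise products between the
    kernel and the patch of data it currently covers.
    """
    rows, cols = len(data), len(data[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for i in range(rows - k_rows + 1):          # vertical kernel positions
        output_row = []
        for j in range(cols - k_cols + 1):      # horizontal kernel positions
            total = 0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += data[i + a][j + b] * kernel[a][b]
            output_row.append(total)
        output.append(output_row)
    return output


# Hypothetical 3x3 input and 2x2 kernel, chosen only for illustration.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = slide_kernel(image, kernel)
print(feature_map)  # [[-4, -4], [-4, -4]]
```

A 3×3 input and a 2×2 kernel give a 2×2 output, because the kernel fits in two positions along each axis.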

Table 2

Strategies for Reducing Learning Rate.

| STRATEGY | FORMULATION | DESCRIPTION |
|---|---|---|
| Power Scheduling | $\eta(t) = \dfrac{\eta_0}{(1 + t/k)^{\nu}}$ | The learning rate $\eta_0$, the steps $k$ and the power $\nu$ are typically set to 1 at the beginning. The learning rate will keep dropping at each step, much faster in the early stages than later on. Fine tuning $\eta(t)$ is one of the functions of the algorithm. |
| Exponential Scheduling | $\eta(t) = \eta_0 \times 0.1^{t/k}$ | A much faster option for reducing $\eta_0$, which drops by a factor of 10 every $k$ steps. The researcher can fine tune the constant 0.1 to suit their needs. |
| Piecewise Constant Scheduling | $\eta(t) = \eta_0$ for 10 epochs | A constant $\eta_0$ for a number of epochs (e.g. $\eta_0 = 0.2$ for $k = 10$ then $\eta_0 = 0.1$ for $k = 30$ etc). |
| Performance Scheduling | $\varepsilon_v$ | Validation error: Measuring it helps decide on reducing $\eta_0$ by a specified factor when $\varepsilon_v$ stops dropping. |
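
The four strategies in Table 2 can be sketched as small Python functions. This is a minimal sketch under stated assumptions rather than the authors' code: the function names, the default values, the reading of the piecewise example as "0.2 up to epoch 10, then 0.1 up to epoch 30", the hypothetical final rate of 0.05, and the `factor` and `patience` parameters of the performance rule are all choices made here for illustration.

```python
def power_schedule(t, eta0=1.0, k=1.0, nu=1.0):
    """Power scheduling: eta(t) = eta0 / (1 + t/k)**nu."""
    return eta0 / (1.0 + t / k) ** nu


def exponential_schedule(t, eta0=1.0, k=10.0, base=0.1):
    """Exponential scheduling: eta(t) = eta0 * base**(t/k).

    With base=0.1 the rate drops by a factor of 10 every k steps.
    """
    return eta0 * base ** (t / k)


def piecewise_constant_schedule(epoch, segments=((10, 0.2), (30, 0.1)),
                                final_rate=0.05):
    """Piecewise constant scheduling.

    `segments` holds (end_epoch, rate) pairs; `final_rate` is a
    hypothetical value used once all segments are exhausted.
    """
    for end_epoch, rate in segments:
        if epoch < end_epoch:
            return rate
    return final_rate


def performance_schedule(eta, val_errors, factor=0.5, patience=3):
    """Performance scheduling.

    Reduce eta by `factor` when the validation error has not improved
    over the last `patience` epochs; otherwise leave eta unchanged.
    """
    if len(val_errors) > patience:
        recent_best = min(val_errors[-patience:])
        earlier_best = min(val_errors[:-patience])
        if recent_best >= earlier_best:
            return eta * factor
    return eta


# Power scheduling falls fastest early on: 1.0, 0.5, 0.333..., 0.25
print([power_schedule(t) for t in range(4)])
```

Power scheduling halves the rate between steps 0 and 1 but changes it far less between later steps, which matches the "much faster in the early stages" remark in the table.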
Algorithm 2

Adaptation of the SMA Algorithm (Mwitondi et al. 2020) for CNN Classification.

Figure 6

GDP and unemployment patterns for selected 1st and 2nd quarters over the period 2008–2020.

Figure 7

Recorded deaths in parts of the UK between March and July 2020.

Figure 8

Images captured from animated plots for the first 7 months of 2020.

Figure 9

A CNN model is trained on imagery data to perform classification based on known classes.

Table 3

Selected training and validation model accuracy based on 50 CNN epochs.

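| SAMPLE # | TRAIN % | VALID % | TRAIN-START | TRAIN-CONVERGE | VALID-START | VALID-CONVERGE |
|---|---|---|---|---|---|---|
| 1 | 80% | 20% | 87.98% | 99.57% | 20.99% | 98.95% |
| 2 | 70% | 30% | 88.79% | 99.71% | 95.99% | 100.00% |
| 3 | 60% | 40% | 90.68% | 100.00% | 83.99% | 99.00% |
| 4 | 50% | 50% | 87.00% | 99.71% | 94.99% | 100.00% |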
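
The four train/validation proportions in Table 3 can be produced with a simple shuffled index split. The sketch below is illustrative only: the helper `split_indices`, the fixed seed and the dataset size of 100 are hypothetical, and the source does not state how the authors partitioned their images.

```python
import random


def split_indices(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    cut = round(n_samples * train_fraction)    # size of the training set
    return indices[:cut], indices[cut:]


# The four proportions listed in Table 3, applied to a hypothetical
# dataset of 100 images.
for train_fraction in (0.8, 0.7, 0.6, 0.5):
    train_idx, valid_idx = split_indices(100, train_fraction)
    print(train_fraction, len(train_idx), len(valid_idx))
```

Shuffling before cutting keeps both subsets representative when the stored images are ordered by class.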
Figure 10

Training and validation accuracy and loss patterns based on the 80%–20% split.

Figure 11

Training and validation accuracy and loss patterns based on the 70%–30% split.

Figure 12

Accuracy and loss patterns based on the 60%–40% (top) and 50%–50% (bottom) splits.

Figure 13

Accurate predictions of unlabelled new data for both positive and negative COVID–19 cases.

Table 4

Selected scenarios of interest for intervention through Algorithm 1.

| SDG APPLICATION | RELATED ASPECTS OF DEVELOPMENT | INTERDISCIPLINARITY |
|---|---|---|
| SDG #1 (Poverty) | 1. Sustainable livelihoods; 2. Access to basic social services; 3. International cooperation | Various attributes describe poverty eradication & empowerment: The impact of poverty on women requires gender specialist intervention (SDG #5). Co-ordinated efforts between donors & recipients (SDG #17). Good health (SDG #3) and education (SDG #4) lead to productivity (SDG #9), improved income and reduced inequality (SDG #10). |
| SDG #9 (Innovation) | 1. Resilient infrastructure; 2. Supporting economic development and human well-being; 3. Research and development; 4. Industrialisation | To deliver sustainable and resilient infrastructure, countries need enhanced financial, technological and technical co-operation (SDG #17). Enhanced productivity in the manufacturing, agriculture & services sectors requires quality education (SDG #4). |
| SDG #13 (Climate Action) | 1. Disaster risk reduction; 2. Sustainable transport; 3. Sustainable human settlement; 4. National strategies | Climate action spans the SDGs from multi-disciplinary angles. Its key aspects include national strategies, disaster risk reduction, sustainable transport, sustainable cities & human settlement (SDG #11). |
Table 5

Selected examples of interdisciplinary involvement for machine learning.

| MODELLING TECHNIQUE | PERFORMANCE INFLUENTIAL FACTORS | INTERDISCIPLINARY INVOLVEMENT |
|---|---|---|
| K-Means | 1. Data distributional behaviour; 2. Initial centroids; 3. Distance function adopted | Data choice is problem-driven, but it is vital to consider thoroughly “what is interesting” before, during and after clustering. |
| CNN | 1. Topology/Architecture; 2. Initial weights; 3. Updating rule; 4. Learning rate; 5. Epochs; 6. Data/Data augmentation | Data choice is problem-driven and, while the decision on the architecture may initially be made by a data scientist, underlying domain knowledge is crucial in interpreting the results. Parameter tuning, image data augmentation and the handling of over-fitting/under-fitting require interdisciplinarity. |
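
Two of the K-Means factors listed in Table 5, the initial centroids and the distance function, can be made explicit as arguments of a small routine. The following is a minimal sketch of the standard K-Means iteration, not the authors' algorithm: the function names, the fixed iteration count and the toy `points` are assumptions chosen for illustration.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_means(points, initial_centroids, distance=euclidean, iterations=10):
    """Basic K-Means with caller-supplied starting centroids and distance."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda j: distance(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)  # [(0.0, 0.5), (10.0, 10.5)]
print(labels)     # [0, 0, 1, 1]
```

Passing `distance=manhattan` or a different list of starting centroids lets the effect of each factor be examined separately on the same data.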
Table 6

Basic considerations for data–driven approaches to addressing global challenges.

| COMMONALITIES | FOCAL POINTS | DESCRIPTION |
|---|---|---|
| Data | 1. Data owners; 2. Data managers; 3. National Statistics Offices; 4. Open access repositories | Making relevant data available to those who need it, when they need it |
| Computing Resources | 1. High Performance Computing; 2. Security; 3. Internet of Things (IoT) | Providing robust, secure and versatile computing resources for users by both the public and private sectors |
| Skills | 1. Data Science; 2. Domain–specific knowledge; 3. Interdisciplinarity | Adopting interdisciplinary approaches for the purpose of attaining unified solutions to global challenges |
| Strategies | 1. Research collaboration; 2. Student Exchange Programmes; 3. Apprenticeships & Internships; 4. Knowledge Transfer Partnerships | Devising institutional frameworks for sharing resources and knowledge through educational, vocational and research institutions |
| Legislation | 1. Privacy (e.g., GDPR); 2. Cross–border data sharing; 3. Access to computing resources; 4. Patents and copyrights | Working towards operating open systems that talk to each other |
Language: English
Submitted on: Apr 8, 2021 | Accepted on: Oct 26, 2021 | Published on: Nov 18, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Kassim S. Mwitondi, Raed A. Said, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.