Smartwatch-Based Audio–Gestural Insights in Violin Bow Stroke Analyses

Open Access | Sep 2025

Figures & Tables

Figure 1

A synchronous inertial measurement unit (IMU)–audio dataset recording, with vertical lines denoting note onsets peak‑picked from an onset detection function computed with the Madmom audio signal‑processing library.
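
The peak-picking step named in this caption can be illustrated with a minimal sketch. This is not Madmom's implementation: the function `pick_onsets`, its `threshold` and `min_gap` parameters, and the sample onset detection function values are all hypothetical, chosen only to show how local maxima of an onset detection function become onset times.

```python
def pick_onsets(odf, fps, threshold=0.5, min_gap=0.05):
    """Return onset times in seconds from an onset detection function.

    odf       -- sequence of onset-strength values, one per frame
    fps       -- frames per second of the detection function
    threshold -- minimum strength for a peak to count (assumed value)
    min_gap   -- minimum spacing between onsets in seconds (assumed value)
    """
    onsets = []
    for i in range(1, len(odf) - 1):
        # A peak is strictly higher than its left neighbour and
        # at least as high as its right neighbour.
        is_peak = odf[i] > odf[i - 1] and odf[i] >= odf[i + 1]
        if not is_peak or odf[i] < threshold:
            continue
        t = i / fps
        # Suppress peaks that follow the previous onset too closely.
        if not onsets or t - onsets[-1] >= min_gap:
            onsets.append(t)
    return onsets


# Hypothetical detection function sampled at 10 frames per second.
odf = [0.0, 0.1, 0.9, 0.2, 0.1, 0.6, 0.7, 0.3, 0.0, 0.4]
print(pick_onsets(odf, fps=10))  # [0.2, 0.6]
```

The returned times correspond to the positions at which vertical onset lines would be drawn over the synchronised audio and IMU traces.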

Table 1

Participant Recognition Network Parameters.

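| Layer | Parameter | Values |
|---|---|---|
| A | Units | 128 |
| A | Activation | ReLU |
| B | Units/Filters* | MLP: 128; GRU: 128; LSTM: 128; CNN: 50* |
| B | Units/Filters* | MLP: 128; GRU: 88; LSTM: 88; CNN: 50* |
| B | Kernel Size | 5 |
| B | Activation | MLP: ReLU; GRU: ReLU; LSTM: ReLU; CNN: ReLU |
| C | Units | 88 |
| C | Activation | ReLU |
| D | N/A | |
| E | Units | 64 |
| F | Units | N Classes |
| F | Activation | Softmax |
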
Figure 2

Apple Watch rotational axes, shown relative to the device.

Figure 3

Conventional deep neural network used in unimodal classification implementations, comprising an input layer (A), two variable layers (B), a dense layer (E) and an output layer (F).

Table 2

Participant recognition classification accuracy metrics, by network architecture and input data type.

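| Input Data Type | Network Architecture | Participant Recognition Acc (%) | Participant Recognition AUC | Participant Recognition F‑Score |
|---|---|---|---|---|
| Audio | MLP | 76.38 | 0.948 | 0.765 |
| Audio | LSTM | 82.43 | 0.967 | 0.828 |
| Audio | CNN1D | 80.75 | 0.964 | 0.812 |
| Audio | GRU | 79.39 | 0.957 | 0.796 |
| IMU | MLP | 94.41 | 0.991 | 0.947 |
| IMU | LSTM | 96.43 | 0.996 | 0.965 |
| IMU | CNN1D | 96.00 | 0.994 | 0.961 |
| IMU | GRU | 91.83 | 0.989 | 0.918 |
| Audio + IMU | MLP, MLP | 94.66 | 0.993 | 0.950 |
| Audio + IMU | LSTM, LSTM | 93.85 | 0.988 | 0.938 |
| Audio + IMU | CNN1D, CNN1D | 94.20 | 0.986 | 0.942 |
| Audio + IMU | GRU, GRU | 95.59 | 0.995 | 0.956 |
| Audio + IMU | MLP, CNN1D | 94.08 | 0.992 | 0.942 |
| Audio + IMU | CNN1D, MLP | 93.68 | 0.992 | 0.937 |
| Audio + IMU | LSTM, CNN1D | 93.79 | 0.987 | 0.936 |
| Audio + IMU | CNN1D, LSTM | 91.72 | 0.979 | 0.911 |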

[i] denotes corresponding subnetwork modality.

Figure 4

Multi‑input deep neural network used in multimodal classification implementations, comprising two input layers (A), four variable layers (B), two flattened layers (C), a concatenation layer (D), a dense layer (E) and an output layer (F).

Figure 5

Participant recognition training and validation accuracies per fold, by epoch, averaged across networks for each input data type.

Figure 6

Distribution of participants’ G Major scale individual note tempi, by articulation.
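
As a rough illustration of how individual note tempi might be derived, the sketch below converts inter-onset intervals into BPM. It assumes each scale note lasts one beat; that assumption, the function name `note_tempi`, and the sample onset times are hypothetical and are not taken from the article.

```python
def note_tempi(onsets, beats_per_note=1.0):
    """Convert onset times (seconds) into per-note tempi in BPM.

    Assumes every note spans `beats_per_note` beats, so a note whose
    inter-onset interval is 0.5 s corresponds to 120 BPM.
    """
    tempi = []
    for prev, cur in zip(onsets, onsets[1:]):
        ioi = cur - prev  # inter-onset interval in seconds
        if ioi <= 0:
            raise ValueError("onset times must be strictly increasing")
        tempi.append(60.0 * beats_per_note / ioi)
    return tempi


# Hypothetical onsets: two notes at 120 BPM, then one held twice as long.
print(note_tempi([0.0, 0.5, 1.0, 2.0]))  # [120.0, 120.0, 60.0]
```

Subtracting a target tempo from each value would give a per-note temporal deviation in BPM of the kind plotted per articulation.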

Figure 7

G Major scale tonal distributions per participant, by articulation.

Table 3

Tonal and temporal deviation descriptive statistics, by bow articulation and direction.

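| Statistic | Bow Condition | Tonal Deviation (cents): Down | Tonal Deviation (cents): Up | Tonal Deviation (cents): Both | Temporal Deviation (BPM): Down | Temporal Deviation (BPM): Up | Temporal Deviation (BPM): Both |
|---|---|---|---|---|---|---|---|
| Mean | Legato | 0.48 | 4.40 | 2.43 | 8.97 | 15.90 | 12.42 |
| Mean | Spiccato | −6.16 | −4.19 | −5.17 | 33.64 | 53.03 | 43.35 |
| Mean | Both | −3.04 | −0.19 | −1.62 | 22.07 | 35.72 | 28.89 |
| Std. | Legato | 12.03 | 11.70 | 12.03 | 23.78 | 23.44 | 23.86 |
| Std. | Spiccato | 15.03 | 14.95 | 15.20 | 52.50 | 57.08 | 55.68 |
| Std. | Both | 14.10 | 14.19 | 14.22 | 43.35 | 48.35 | 46.42 |
| Variance | Legato | 144.8 | 136.6 | 144.5 | 565.3 | 549.3 | 569.1 |
| Variance | Spiccato | 225.9 | 223.4 | 225.5 | 2756.2 | 3257.9 | 3100.5 |
| Variance | Both | 198.8 | 201.3 | 202.1 | 1879.6 | 2337.6 | 2154.6 |
| Median | Legato | 1.90 | 5.30 | 3.70 | 6.02 | 12.22 | 8.80 |
| Median | Spiccato | −4.30 | −1.90 | −3.10 | 18.60 | 38.29 | 27.89 |
| Median | Both | −1.40 | 2.00 | 0.60 | 9.28 | 19.64 | 14.21 |
| Kurtosis | Legato | 1.613 | 2.064 | 1.650 | 37.37 | 29.61 | 31.87 |
| Kurtosis | Spiccato | 0.510 | 0.884 | 0.658 | 5.301 | 1.631 | 2.866 |
| Kurtosis | Both | 1.080 | 1.471 | 1.199 | 10.41 | 4.743 | 6.755 |
| Skew | Legato | −2.16 | −0.49 | −0.347 | 5.316 | 4.527 | 4.767 |
| Skew | Spiccato | −0.50 | −0.722 | −0.608 | 2.305 | 1.431 | 1.777 |
| Skew | Both | −0.528 | −0.770 | −0.638 | 3.075 | 2.152 | 2.514 |
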
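
The six statistics reported in Table 3 can be computed as in the sketch below. It assumes population moments and excess (Fisher) kurtosis; the article's exact estimators are not stated on this page, so these choices, the helper name `describe`, and the sample values are illustrative only.

```python
import statistics


def describe(values):
    """Return mean, std, variance, median, skew and excess kurtosis.

    Uses population (biased) moments throughout; this is an assumption,
    not necessarily the estimator used to produce Table 3.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    if var == 0:
        raise ValueError("skew and kurtosis are undefined for constant data")
    std = var ** 0.5
    # Third and fourth central moments.
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": std,
        "variance": var,
        "median": statistics.median(values),
        "skew": m3 / std ** 3,
        "kurtosis": m4 / var ** 2 - 3.0,  # excess kurtosis
    }


# Hypothetical deviations (for example, cents from the target pitch).
stats = describe([-2, -1, 0, 1, 2])
print(stats)
```

The variance rows in Table 3 are consistent with the squared standard deviations (for example, 12.03² ≈ 144.7 against a reported 144.8), which is the relationship `var = std ** 2` encodes here.
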
Figure 8

Note‑by‑note tunings, in cents, of a recorded D Major scale (Legato).
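
A tuning value in cents compares a measured frequency with a reference frequency on a logarithmic scale, with 100 cents per equal-tempered semitone. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz as the reference grid; the reference tuning used for this figure is not stated here, and the function names are hypothetical.

```python
import math


def cents_deviation(f_measured, f_reference):
    """Signed deviation of f_measured from f_reference, in cents."""
    return 1200.0 * math.log2(f_measured / f_reference)


def nearest_equal_tempered(f, a4=440.0):
    """Frequency of the equal-tempered pitch closest to f (assumes A4 = a4 Hz)."""
    midi = round(69 + 12 * math.log2(f / a4))
    return a4 * 2 ** ((midi - 69) / 12)


# A hypothetical note played slightly sharp of A4.
played = 445.0
target = nearest_equal_tempered(played)
print(target, round(cents_deviation(played, target), 2))
```

Positive values indicate a note played sharp of the reference and negative values a note played flat.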

Figure 9

Piano roll graph depicting a recorded D Major scale (Legato).

Figure 10

Piano roll graph depicting a multimodal recording of an excerpt from Gossec’s Gavotte, bars 27–34. Tempo: Allegretto (120 BPM).

DOI: https://doi.org/10.5334/tismir.216 | Journal eISSN: 2514-3298
Language: English
Submitted on: Aug 19, 2024
Accepted on: Aug 1, 2025
Published on: Sep 4, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 William Wilson, Niccolò Granieri, Samuel Smith, Carlo Harvey, Islah Ali-MacLachlan, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.