
CCOM-HuQin: An Annotated Multimodal Chinese Fiddle Performance Dataset

Open Access | Jul 2023

Figures & Tables

Figure 1

Illustration of Chinese bowed instruments. The image of the XiQin (奚琴) is from an ancient book by Chen (1101); the images of the other HuQin instruments are from Liu (1992).

Table 1

Summary of relevant musical performance datasets. All statistics are counted for bowed string instruments only, and the reported durations refer to solo excerpts. (*) The number of clips is not given in the dataset's documentation; the data consist of randomly generated samples from a sound bank.

| Dataset | #Ins | #PT clips | Excerpts’ duration | Annotation | Content |
|---|---|---|---|---|---|
| SOL | 4 | 12,000 clips, 15-class | N/A | N/A | audio |
| RWC | 4 | 236 clips, 5-class | 33.7 min | N/A | audio |
| IPT-cello | 1 | 13.5 h *, 18-class | N/A | PTs | audio |
| TU-NOTE | 1 | 1,005 clips, 4-class | 15.8 min | note transitions, PTs | audio |
| CVD | 1 | 718 clips, 5-class | 37.3 min | N/A | audio |
| HF1 | 1 | N/A | 42.6 min | pitch, emotion | audio, transcription |
| URMP | 4 | N/A | 78 min | pitch | audio, video, score, transcription |
| TELMI | 1 | N/A | N/A | N/A | audio, video, sensor data |
| CTIS | 8 | 1,072 clips, 11-class | 10.3 min | N/A | audio |
| CCOM-HuQin | 8 | 11,992 clips, 12-class | 77 min | pitch, PTs | audio, video, score, transcription |

Table 2

Playing techniques (PTs) in Chinese and Pinyin, with similar violin-family techniques where applicable and the abbreviations used in this paper. N/A means there is no corresponding technique in the violin family.

| CH | CH-Pinyin | In violin family | Abbr. |
|---|---|---|---|
| Bowing techniques | | | |
| 颤弓 | ChanGong | Tremolo | Tremolo |
| 垫弓 | DianGong | N/A | DianG |
| 顿弓 | DunGong | Martelé | DunG |
| 断弓 | DuanGong | Détaché | DuanG |
| 跳弓 | TiaoGong | Spiccato | TiaoG |
| 抛弓 | PaoGong | Ricochet | PaoG |
| 击弓 | JiGong | N/A | JiG |
| 大击弓 | DaJiGong | N/A | DaJiG |
| Fingering techniques | | | |
| 揉弦 | RouXian | Vibrato | Vibrato |
| 滚揉 | GunRou | Rolling Vibrato | RVib |
| 压揉 | YaRou | Pressing Vibrato | PVib |
| 滑揉 | HuaRou | Sliding Vibrato | SVib |
| 滑音 | HuaYin | Portamento | Port |
| 上滑音 | Shang-Hua Yin | Upward Portamento | UPort |
| 下滑音 | Xia-Hua Yin | Downward Portamento | DPort |
| 上回滑音 | Shanghui HuaYin | Up-Down Portamento | UDPort |
| 下回滑音 | Xiahui HuaYin | Down-Up Portamento | DUPort |
| 垫指滑音 | Dianzhi HuaYin | Intermediate Portamento | IPort |
| 颤音 | ChanYin | Trill | Trill |
| 打音 | DaYin | N/A | DaYin |
| 短颤音 | DuanChanYin | Short Trill | ShTrill |
| 长颤音 | ChangChanYin | Long Trill | LoTrill |
| 拨弦 | BoXian | Pizzicato | Pizz |

Figure 2

RMS envelopes of the bowing techniques (a-g) and of the special fingering technique Pizz (h), with amplitude on the y-axis; pitch trajectories (i-p) of the other fingering techniques, with F0 on the y-axis.
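
For readers unfamiliar with the term, an RMS envelope is the root-mean-square amplitude computed over short, overlapping frames of the signal. The following is a minimal sketch, not the authors' code: the function name `rms_envelope`, the frame and hop sizes, and the synthetic fading tone are all assumptions chosen for illustration.

```python
import math

def rms_envelope(samples, frame_size=1024, hop_size=512):
    """Return one RMS value per frame of a mono signal (a list of floats)."""
    if frame_size <= 0 or hop_size <= 0:
        raise ValueError("frame_size and hop_size must be positive")
    envelope = []
    last_start = max(len(samples) - frame_size, 0)
    for start in range(0, last_start + 1, hop_size):
        frame = samples[start:start + frame_size]
        if not frame:  # empty input: nothing to measure
            break
        envelope.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return envelope

# Synthetic 440 Hz tone with a linear fade-out, standing in for a real clip.
sr = 16000  # assumed sample rate in Hz
tone = [(1 - n / sr) * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
env = rms_envelope(tone)
```

Plotting `env` against frame index gives a decaying curve for this synthetic tone; with real recordings, the shape of that curve is what distinguishes, for example, a sustained bow stroke from a bouncing one.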

Figure 3

(a) The floorplan of the recording studio. (b) Examples of three camera views.

Figure 4

The annotation pipeline.

Figure 5

PT annotation examples of (a) Tremolo; (b) DianG; (c) PaoG; (d) Port; (e) Trill; (f) Vibrato.

Figure 6

Statistics for PT short clips: (a) count distribution for HuQin instruments; (b) count distribution of PTs; (c) pitch distribution of HuQin instruments (A4=440Hz); (d) duration distribution of PTs.

Figure 7

(a) Number of notes in the excerpts for each HuQin instrument, with the percentage of annotated notes; (b) PT distribution over all annotations.

Figure 8

Pitch variation visualization of ground-truth pitch tracks (red) and score representation (black) for typical excerpts played on (a) Banhu, (b) Gaohu, (c) Erhu, (d) Zhuihu.
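
The gap between a performed pitch track and the notated pitch is commonly expressed in cents (1200 cents per octave). The sketch below shows that conversion under stated assumptions: the helper names `cents_deviation` and `midi_to_hz` and the five-frame pitch track are hypothetical, and only the A4 = 440 Hz reference is taken from the captions on this page.

```python
import math

def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = MIDI 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)

def cents_deviation(f0_hz, score_hz):
    """Signed distance in cents from the score pitch to the performed pitch."""
    if f0_hz <= 0 or score_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(f0_hz / score_hz)

# Hypothetical pitch-track frames (Hz) sliding up to a notated A4 (MIDI 69).
track = [415.3, 425.0, 435.0, 440.0, 442.0]
deviations = [cents_deviation(f, midi_to_hz(69)) for f in track]
```

Here the first frame sits about a semitone (roughly 100 cents) below the notated pitch and the last frame slightly above it, which is the kind of departure from the score that a slide into a note produces.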

Table 3

Statistics of training and testing sets.

| Dataset | Bowed-string instruments | Count |
|---|---|---|
| CTIS-I | Erhu | 787 |
| CTIS-II | Banhu, Soprano Banhu, Alto Banhu, XiQin, Zhonghu, Zhuihu | 285 |
| Hybrid | Erhu | 252 |
| CCOM-HuQin | Erhu, Soprano Banhu, Alto Banhu, Tenor Banhu, Bass Banhu, Gaohu, Zhonghu, Zhuihu | 11,014 |

Table 4

F1 scores of the classification results.

| Dataset | CNN | CRNN |
|---|---|---|
| Homogeneous | | |
| CTIS-I | 97.54% | 99.19% |
| CCOM-HuQin | 96.07% | 97.85% |
| Heterogeneous (Train Validation/Test) | | |
| CTIS-I/CTIS-II + Hybrid | 69.21% | 70.39% |
| CCOM-HuQin/CTIS-I & II + Hybrid | 77.01% | 87.01% |

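
As a reminder of how an F1 score is obtained, the sketch below derives per-class F1 from a confusion matrix and averages it across classes. This page does not state which averaging scheme the authors used, so the macro average here is an assumption, and the 3-class counts are made up for illustration rather than taken from the paper.

```python
def f1_scores(confusion):
    """Per-class F1 from a square confusion matrix (rows: true, cols: predicted)."""
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[r][k] for r in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(confusion):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_scores(confusion)
    return sum(scores) / len(scores)

# Made-up 3-class counts, not results from the paper.
toy = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
toy_macro = macro_f1(toy)
```

Because macro averaging weights every class equally, a rarely occurring technique that is often misclassified lowers the score as much as a common one would.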
Figure 9

(a) Nine-class confusion matrix of the CRNN classification results. (b) Spectrogram examples of two pairs of easily confused PTs.

Table 5

SVM classification accuracy on two pairs of confusing PTs.

| Coordinates | DaYin/Port | Trill/Vibrato |
|---|---|---|
| X-axis | 87.24% | 77.82% |
| Y-axis | 86.40% | 72.30% |
| Z-axis | 86.31% | 76.33% |

Figure 10

Hand pose visualization of Trill and Vibrato: (a) changes in key-point coordinates along the x-, y- and z-axes; (b) selected video frames with fingertip labels.

Figure 11

Comparison of RMS envelopes between (a) PaoG on Erhu and (b) ricochet on the violin.

Table 6

Comparison of the number of Ports between Erhu and violin music.

| Excerpts | Duration | #Port | #Port/min |
|---|---|---|---|
| Erhu | 22.7 min | 484 | 21.4 |
| Violin | 16.2 min | 70 | 4.3 |

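
The per-minute rate is the Port count divided by the excerpt duration. The short sketch below recomputes it from the rounded figures printed in the table; the function name `events_per_minute` is hypothetical, and small differences from the published column are to be expected if the original calculation used unrounded durations.

```python
def events_per_minute(count, duration_min):
    """Average number of events per minute of audio."""
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    return count / duration_min

# Counts and durations as printed in the table (durations are rounded).
erhu_rate = events_per_minute(484, 22.7)
violin_rate = events_per_minute(70, 16.2)
ratio = erhu_rate / violin_rate  # how much denser Ports are in the Erhu excerpts
```

With these inputs the Erhu excerpts contain roughly five times as many Ports per minute as the violin excerpts.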
DOI: https://doi.org/10.5334/tismir.146 | Journal eISSN: 2514-3298
Language: English
Submitted on: Aug 10, 2022
Accepted on: Mar 11, 2023
Published on: Jul 12, 2023
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Yu Zhang, Ziya Zhou, Xiaobing Li, Feng Yu, Maosong Sun, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.