Have a personal or library account? Click to login
Piano Sheet Music Identification Using Dynamic N-gram Fingerprinting Cover

Piano Sheet Music Identification Using Dynamic N-gram Fingerprinting

By: Daniel Yang and  T. J. Tsai  
Open Access
|Apr 2021

Figures & Tables

tismir-4-1-70-g1.png
Figure 1

Proposed architecture for large-scale piano sheet music retrieval.

tismir-4-1-70-g2.png
Figure 2

An excerpt of sheet music and its corresponding bootleg score.

tismir-4-1-70-g3.png
Figure 3

An example of constructing a dynamic N-gram. The length of the N-gram is chosen to balance runtime and retrieval accuracy.

Table 1

Comparing system performance on the camera-based piano sheet music identification task. The first three columns show the mean reciprocal rank on three retrieval tasks at varying levels of specificity. The last two columns show the system runtimes. The 1-gram system was proposed by Tsai (2020) and corresponds to the previous state-of-the-art.

Retrieval (MRR)Runtime (s)
SystemPieceVersionPageAvgSth
MAC.037.023.0261.17.12
SPoC.003.002.0021.14.09
GeM.025.017.0171.18.11
R-MAC.036.023.0240.96.11
1-gram.709.560.62021.512.5
2-gram.845.525.7752.76.36
3-gram.808.652.7231.99.21
4-gram.755.617.6651.23.13
5-gram.608.493.5991.07.08
dynamic.853.692.7850.98.12
tismir-4-1-70-g4.png
Figure 4

Frequency of N-gram fingerprints in the database, where fingerprints have been sorted from most frequent (left) to least frequent (right). Note that both axes are on a log scale.

tismir-4-1-70-g5.png
Figure 5

Effect of database size on system performance for the sheet music identification task. The largest database size corresponds to the full IMSLP piano database.

Table 2

Comparing performance on the sheet music identification task under two conditions: when the exact same printed edition exists in the database (left column) and when only an alternate edition of the same piece exists (right column).

Piece Retrieval (MRR)
SystemSame VersionAlternate Version
MAC.037.043
SPoC.003.004
GeM.025.029
R-MAC.036.039
1-gram.709.659
2-gram.845.784
3-gram.808.767
4-gram.755.722
5-gram.688.668
dynamic.853.812
Table 3

Comparison of system performance on MIDI-sheet image retrieval. Columns indicate the database size (DB), precision (P) and recall (R) and F-measure (F) for passage-level retrieval, mean reciprocal rank (MRR) for file-level retrieval, and average runtime.

SystemDBPRFMRRTavg
Random1.16.16.160.0s
SharpEye1.43.08.13
PhotoScore1.64.62.63
RetinaNet1.52.26.3511.7s
Dorfer 2018c1.69.28.4017.5s
Faster R-CNN1.84.87.8549.9s
DWD1.91.87.89213s
BS-DTW1.89.89.890.90s
BS-DTW200.90.87.88.9210.9s
1-gram200.58.59.59.691.28s
2-gram200.70.73.71.841.00s
3-gram200.60.72.66.791.00s
4-gram200.41.57.48.631.00s
5-gram200.36.49.42.531.00s
dynamic200.74.77.75.860.98s
BS-DTW2k.89.83.86.88167s
1-gram2k.48.51.49.592.28s
2-gram2k.63.69.66.781.10s
3-gram2k.49.63.55.691.09s
4-gram2k.39.55.46.601.04s
5-gram2k.35.47.40.511.05s
dynamic2k.66.72.69.80.98s
BS-DTW10k.83.77.80.83458s
1-gram10k.28.34.31.436.78s
2-gram10k.43.58.49.681.45s
3-gram10k.40.53.46.561.10s
4-gram10k.34.48.40.501.04s
5-gram10k.28.42.34.461.00s
dynamic10k.55.66.60.73.99s
tismir-4-1-70-g6.png
Figure 6

Comparing different strategies for handling repeats and jumps in MIDI-sheet image retrieval. The MIDI query is broken up into fragments of length L, either through deterministic shingling or a fixed number (K) of randomly sampled windows.

Table 4

Summary of verified matches between the Lakh MIDI dataset and IMSLP.

Composer# Lakh MIDI# IMSLP Pieces
Bach35694
Chopin29149
Beethoven18650
Mozart11630
Liszt7732
Debussy489
Brahms4110
Other31875
Total1433349
DOI: https://doi.org/10.5334/tismir.70 | Journal eISSN: 2514-3298
Language: English
Submitted on: Aug 29, 2020
Accepted on: Jan 23, 2021
Published on: Apr 1, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Daniel Yang, T. J. Tsai, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.