Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription

Adrien Ycart; Lele Liu; Emmanouil Benetos; Marcus T. Pearce

doi:10.5334/tismir.57

References

1. Allali, J., Ferraro, P., Hanna, P., & Robine, M. (2009). Polyphonic alignment algorithms for symbolic music retrieval. In 6th International Symposium on Auditory Display, CMMR/ICAD, pages 466–482. DOI: 10.1007/978-3-642-12439-6_24
2. Allan, H., Müllensiefen, D., & Wiggins, G. A. (2007). Methodological considerations in studies of musical similarity. In Proceedings of the 8th International Conference on Music Information Retrieval, ISMIR, pages 473–478.
3. Allison, P. D. (2009). Fixed Effects Regression Models, volume 160. Sage Publications. DOI: 10.4135/9781412993869
4. Bay, M., Ehmann, A. F., & Downie, J. S. (2009). Evaluation of multiple-f0 estimation and tracking systems. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR, pages 315–320.
5. Benetos, E., Dixon, S., Duan, Z., & Ewert, S. (2019). Automatic music transcription: An overview. IEEE Signal Processing Magazine, 36(1), 20–30. DOI: 10.1109/MSP.2018.2869928
6. Bittner, R. M., & Bosch, J. J. (2019). Generalized metrics for single-f0 estimation evaluation. In Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR, pages 738–745.
7. Cheng, T., Mauch, M., Benetos, E., & Dixon, S. (2016). An attack/decay model for piano transcription. In Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR, pages 584–590.
8. Cogliati, A., & Duan, Z. (2017). A metric for music notation transcription accuracy. In Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR, pages 407–413.
9. Daniel, A., Emiya, V., & David, B. (2008). Perceptually-based evaluation of the errors usually made when automatically transcribing music. In Proceedings of the 9th International Conference on Music Information Retrieval, ISMIR, pages 550–556.
10. Efron, B. (1992). Bootstrap methods: another look at the jackknife. In Breakthroughs in Statistics, pages 569–593. Springer. DOI: 10.1007/978-1-4612-4380-9_41
11. Emiya, V., Badeau, R., & David, B. (2010). Multipitch estimation of piano sounds using a new probabilistic spectral smoothness principle. IEEE Transactions on Audio, Speech and Language Processing, TASLP, 18(6), 1643–1654. DOI: 10.1109/TASL.2009.2038819
12. Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378. DOI: 10.1037/h0031619
13. Flexer, A., & Grill, T. (2016). The problem of limited inter-rater agreement in modelling music similarity. Journal of New Music Research, 45(3), 239–251. DOI: 10.1080/09298215.2016.1200631
14. Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality reduction by learning an invariant mapping. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, pages 1735–1742. DOI: 10.1109/CVPR.2006.100
15. Hawthorne, C., Elsen, E., Song, J., Roberts, A., Simon, I., Raffel, C., Engel, J. H., Oore, S., & Eck, D. (2018). Onsets and frames: Dual-objective piano transcription. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR, pages 50–57.
16. Hawthorne, C., Stasyuk, A., Roberts, A., Simon, I., Huang, C. A., Dieleman, S., Elsen, E., Engel, J. H., & Eck, D. (2019). Enabling factorized piano music modeling and generation with the MAESTRO dataset. In 7th International Conference on Learning Representations, ICLR.
17. Johnston, J. D. (1988). Transform coding of audio signals using perceptual noise criteria. IEEE Journal on Selected Areas in Communications, 6(2), 314–323. DOI: 10.1109/49.608
18. Kelz, R., Dorfer, M., Korzeniowski, F., Böck, S., Arzt, A., & Widmer, G. (2016). On the potential of simple framewise approaches to piano transcription. In Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR, pages 475–481.
19. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR.
20. Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology.
21. McLeod, A., & Steedman, M. (2018). Evaluating automatic polyphonic music transcription. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR, pages 42–49.
22. Molina, E., Barbancho, A. M., Tardón, L. J., & Barbancho, I. (2014). Evaluation framework for automatic singing transcription. In Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR, pages 567–572.
23. Mongeau, M., & Sankoff, D. (1990). Comparison of musical sequences. Computers and the Humanities, 24(3), 161–175. DOI: 10.1007/BF00117340
24. Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. (2014). The musicality of non-musicians: An index for assessing musical sophistication in the general population. PLoS One, 9(2). DOI: 10.1371/journal.pone.0089642
25. Müllensiefen, D., Gingras, B., Stewart, L., & Musil, J. (2011). The Goldsmiths Musical Sophistication Index (Gold-MSI): Technical report and documentation v1.0. Technical report, Goldsmiths, University of London.
26. Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
27. Raffel, C., McFee, B., Humphrey, E. J., Salamon, J., Nieto, O., Liang, D., & Ellis, D. P. W. (2014). mir_eval: A transparent implementation of common MIR metrics. In Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR, pages 367–372.
28. Schramm, R., Nunes, H. D. S., & Jung, C. R. (2016). Audiovisual tool for solfège assessment. ACM Transactions on Multimedia Computing, Communications, and Applications, 13(1). DOI: 10.1145/3007194
29. Su, L., & Yang, Y.-H. (2015). Combining spectral and temporal representations for multipitch estimation of polyphonic music. IEEE/ACM Transactions on Audio, Speech and Language Processing, TASLP, 23(10), 1600–1612. DOI: 10.1109/TASLP.2015.2442411
30. Velardo, V., Vallati, M., & Jan, S. (2016). Symbolic melodic similarity: State of the art and future challenges. Computer Music Journal, 40(2), 70–83. DOI: 10.1162/COMJ_a_00359
31. Ycart, A., & Benetos, E. (2018). A-MAPS: Augmented MAPS dataset with rhythm and key annotations. In 19th International Society for Music Information Retrieval Conference, ISMIR, Late Breaking and Demos Papers.
32. Ycart, A., Liu, L., Benetos, E., & Pearce, M. T. (2020). Musical features for automatic music transcription evaluation. Technical report, Queen Mary University of London, UK.
