Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription

Open Access | Jun 2020

References

  1. Allali, J., Ferraro, P., Hanna, P., & Robine, M. (2009). Polyphonic alignment algorithms for symbolic music retrieval. In 6th International Symposium on Auditory Display, CMMR/ICAD, pages 466–482. DOI: 10.1007/978-3-642-12439-6_24
  2. Allan, H., Müllensiefen, D., & Wiggins, G. A. (2007). Methodological considerations in studies of musical similarity. In Proceedings of the 8th International Conference on Music Information Retrieval, ISMIR, pages 473–478.
  3. Allison, P. D. (2009). Fixed Effects Regression Models, volume 160. Sage Publications. DOI: 10.4135/9781412993869
  4. Bay, M., Ehmann, A. F., & Downie, J. S. (2009). Evaluation of multiple-f0 estimation and tracking systems. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR, pages 315–320.
  5. Benetos, E., Dixon, S., Duan, Z., & Ewert, S. (2019). Automatic music transcription: An overview. IEEE Signal Processing Magazine, 36(1), 20–30. DOI: 10.1109/MSP.2018.2869928
  6. Bittner, R. M., & Bosch, J. J. (2019). Generalized metrics for single-f0 estimation evaluation. In Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR, pages 738–745.
  7. Cheng, T., Mauch, M., Benetos, E., & Dixon, S. (2016). An attack/decay model for piano transcription. In Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR, pages 584–590.
  8. Cogliati, A., & Duan, Z. (2017). A metric for music notation transcription accuracy. In Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR, pages 407–413.
  9. Daniel, A., Emiya, V., & David, B. (2008). Perceptually-based evaluation of the errors usually made when automatically transcribing music. In Proceedings of the 9th International Conference on Music Information Retrieval, ISMIR, pages 550–556.
  10. Efron, B. (1992). Bootstrap methods: another look at the jackknife. In Breakthroughs in Statistics, pages 569–593. Springer. DOI: 10.1007/978-1-4612-4380-9_41
  11. Emiya, V., Badeau, R., & David, B. (2010). Multipitch estimation of piano sounds using a new probabilistic spectral smoothness principle. IEEE Transactions on Audio, Speech and Language Processing, TASLP, 18(6), 1643–1654. DOI: 10.1109/TASL.2009.2038819
  12. Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378. DOI: 10.1037/h0031619
  13. Flexer, A., & Grill, T. (2016). The problem of limited inter-rater agreement in modelling music similarity. Journal of New Music Research, 45(3), 239–251. DOI: 10.1080/09298215.2016.1200631
  14. Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality reduction by learning an invariant mapping. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, pages 1735–1742. DOI: 10.1109/CVPR.2006.100
  15. Hawthorne, C., Elsen, E., Song, J., Roberts, A., Simon, I., Raffel, C., Engel, J. H., Oore, S., & Eck, D. (2018). Onsets and frames: Dual-objective piano transcription. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR, pages 50–57.
  16. Hawthorne, C., Stasyuk, A., Roberts, A., Simon, I., Huang, C. A., Dieleman, S., Elsen, E., Engel, J. H., & Eck, D. (2019). Enabling factorized piano music modeling and generation with the MAESTRO dataset. In 7th International Conference on Learning Representations, ICLR.
  17. Johnston, J. D. (1988). Transform coding of audio signals using perceptual noise criteria. IEEE Journal on Selected Areas in Communications, 6(2), 314–323. DOI: 10.1109/49.608
  18. Kelz, R., Dorfer, M., Korzeniowski, F., Böck, S., Arzt, A., & Widmer, G. (2016). On the potential of simple framewise approaches to piano transcription. In Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR, pages 475–481.
  19. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR.
  20. Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology.
  21. McLeod, A., & Steedman, M. (2018). Evaluating automatic polyphonic music transcription. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR, pages 42–49.
  22. Molina, E., Barbancho, A. M., Tardón, L. J., & Barbancho, I. (2014). Evaluation framework for automatic singing transcription. In Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR, pages 567–572.
  23. Mongeau, M., & Sankoff, D. (1990). Comparison of musical sequences. Computers and the Humanities, 24(3), 161–175. DOI: 10.1007/BF00117340
  24. Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. (2014). The musicality of non-musicians: An index for assessing musical sophistication in the general population. PLoS One, 9(2). DOI: 10.1371/journal.pone.0089642
  25. Müllensiefen, D., Gingras, B., Stewart, L., & Musil, J. (2011). The Goldsmiths Musical Sophistication Index (Gold-MSI): Technical report and documentation v1.0. Technical report, Goldsmiths, University of London.
  26. Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
  27. Raffel, C., McFee, B., Humphrey, E. J., Salamon, J., Nieto, O., Liang, D., & Ellis, D. P. W. (2014). mir_eval: A transparent implementation of common MIR metrics. In Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR, pages 367–372.
  28. Schramm, R., Nunes, H. D. S., & Jung, C. R. (2016). Audiovisual tool for solfège assessment. ACM Transactions on Multimedia Computing, Communications, and Applications, 13(1). DOI: 10.1145/3007194
  29. Su, L., & Yang, Y.-H. (2015). Combining spectral and temporal representations for multipitch estimation of polyphonic music. IEEE/ACM Transactions on Audio, Speech and Language Processing, TASLP, 23(10), 1600–1612. DOI: 10.1109/TASLP.2015.2442411
  30. Velardo, V., Vallati, M., & Jan, S. (2016). Symbolic melodic similarity: State of the art and future challenges. Computer Music Journal, 40(2), 70–83. DOI: 10.1162/COMJ_a_00359
  31. Ycart, A., & Benetos, E. (2018). A-MAPS: Augmented MAPS dataset with rhythm and key annotations. In 19th International Society for Music Information Retrieval Conference, ISMIR, Late Breaking and Demos Papers.
  32. Ycart, A., Liu, L., Benetos, E., & Pearce, M. T. (2020). Musical features for automatic music transcription evaluation. Technical report, Queen Mary University of London, UK.
DOI: https://doi.org/10.5334/tismir.57 | Journal eISSN: 2514-3298
Language: English
Submitted on: Mar 1, 2020
Accepted on: Apr 20, 2020
Published on: Jun 12, 2020
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2020 Adrien Ycart, Lele Liu, Emmanouil Benetos, Marcus T. Pearce, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.