PianoCoRe: Combined and Refined Piano MIDI Dataset
By: Ilya Borovik
Open Access | Apr 2026

References

  1. Benetos, E., Dixon, S., Duan, Z., and Ewert, S. (2018). Automatic music transcription: An overview. IEEE Signal Processing Magazine, 36(1), 20–30.
  2. Borovik, I., Gavrilev, D., and Viro, V. (2025). SyMuPe: Affective and controllable symbolic music performance. In Proceedings of the 33rd ACM International Conference on Multimedia, Dublin, Ireland, pp. 10699–10708.
  3. Borovik, I., and Viro, V. (2023). ScorePerformer: Expressive piano performance rendering with fine‑grained control. In Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR), Milan, Italy, pp. 588–596.
  4. Bradshaw, L., and Colton, S. (2025). Aria‑MIDI: A dataset of piano MIDI files for symbolic music modeling. In Proceedings of the 13th International Conference on Learning Representations (ICLR), Singapore.
  5. Bradshaw, L., Fan, H., Spangher, A., Biderman, S., and Colton, S. (2025). Scaling self‑supervised representation learning for symbolic piano performance. In Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), Daejeon, Korea, pp. 451–459.
  6. Cancino‑Chacón, C. E., Grachten, M., Goebl, W., and Widmer, G. (2018). Computational models of expressive music performance: A comprehensive and critical review. Frontiers in Digital Humanities, 5, 25.
  7. Cancino‑Chacón, C. E., Peter, S. D., Karystinaios, E., Foscarin, F., Grachten, M., and Widmer, G. (2022). Partitura: A Python package for symbolic music processing. In Proceedings of the Music Encoding Conference (MEC), Halifax, Canada.
  8. Chou, Y.‑H., Chen, I.‑C., Ching, J., Chang, C.‑J., and Yang, Y.‑H. (2024). MidiBERT‑Piano: Large‑scale pre‑training for symbolic music classification tasks. Journal of Creative Music Systems, 8(1).
  9. Edwards, D., Dixon, S., and Benetos, E. (2023). PiJAMA: Piano jazz with automatic MIDI annotations. Transactions of the International Society for Music Information Retrieval, 6(1), 89–102.
  10. Edwards, D., Dixon, S., Benetos, E., Maezawa, A., and Kusaka, Y. (2024). A data‑driven analysis of robust automatic piano transcription. IEEE Signal Processing Letters, 31, 681–685.
  11. Emerson, K., and Harrison, P. M. C. (2025). Multimodal datasets for studying expert performances of musical scores. Transactions of the International Society for Music Information Retrieval, 8(1), 400–428.
  12. Foscarin, F., Mcleod, A., Rigaux, P., Jacquemard, F., and Sakai, M. (2020). ASAP: A dataset of aligned scores and performances for piano transcription. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), Montréal, Canada, pp. 534–541.
  13. Goebl, W. (1999). The Vienna 4x22 Piano Corpus. DOI: 10.21939/4X22.
  14. Good, M. (2001). MusicXML for notation and analysis. In The Virtual Score: Representation, Retrieval, Restoration, 12, 113–124.
  15. Guo, Z., Kang, J., and Herremans, D. (2023). A domain‑knowledge‑inspired music embedding space and a novel attention mechanism for symbolic music modeling. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, Volume 37, Washington, DC, USA, pp. 5070–5077.
  16. Hashida, M., Nakamura, E., and Katayose, H. (2018). CrestMusePEDB 2nd edition: Music performance database with phrase information. In Proceedings of the 15th Sound and Music Computing Conference (SMC), Limassol, Cyprus.
  17. Hawthorne, C., Stasyuk, A., Roberts, A., Simon, I., Huang, C.‑Z. A., Dieleman, S., Elsen, E., Engel, J., and Eck, D. (2019). Enabling factorized piano music modeling and generation with the MAESTRO dataset. In Proceedings of the 7th International Conference on Learning Representations (ICLR), New Orleans, LA, USA.
  18. Hsiao, W.‑Y., Liu, J.‑Y., Yeh, Y.‑C., and Yang, Y.‑H. (2021). Compound word transformer: Learning to compose full‑song music over dynamic directed hypergraphs. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, Volume 35, Virtual Event, pp. 178–186.
  19. Hu, P., Marták, L. S., Cancino‑Chacón, C., and Widmer, G. (2024). Towards musically informed evaluation of piano transcription models. In Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR), San Francisco, CA, USA, pp. 1068–1075.
  20. Hu, P., and Widmer, G. (2023). The Batik‑Plays‑Mozart corpus: Linking performance to score to musicological annotations. In Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR), Milan, Italy, pp. 297–303.
  21. Huang, Y.‑S., and Yang, Y.‑H. (2020). Pop music transformer: Beat‑based modeling and generation of expressive pop piano compositions. In Proceedings of the 28th ACM International Conference on Multimedia, Virtual Event and Seattle, WA, USA, pp. 1180–1188.
  22. Hung, H.‑T., Ching, J., Doh, S., Kim, N., Nam, J., and Yang, Y.‑H. (2021). EMOPIA: A multi‑modal pop piano dataset for emotion recognition and emotion‑based music generation. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), Online, pp. 318–325.
  23. Jeong, D., Kwon, T., Kim, Y., Lee, K., and Nam, J. (2019a). VirtuosoNet: A hierarchical RNN‑based system for modeling expressive piano performance. In Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), Delft, Netherlands, pp. 908–915.
  24. Jeong, D., Kwon, T., Kim, Y., and Nam, J. (2019b). Graph neural network for music score data and modeling expressive piano performance. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, pp. 3060–3070. PMLR.
  25. Kong, Q., Li, B., Chen, J., and Wang, Y. (2022). GiantMIDI‑Piano: A large‑scale MIDI dataset for classical piano music. Transactions of the International Society for Music Information Retrieval, 5(1), 87–98.
  26. Kong, Q., Li, B., Song, X., Wan, Y., and Wang, Y. (2021). High‑resolution piano transcription with pedals by regressing onset and offset times. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 3707–3717.
  27. Kosta, K., Bandtlow, O. F., and Chew, E. (2018). MazurkaBL: Score‑aligned loudness, beat, expressive markings data for 2000 Chopin Mazurka recordings. In Proceedings of the 4th International Conference on Technologies for Music Notation and Representation (TENOR), Montréal, Canada, pp. 85–94.
  28. Lam, S. K., Pitrou, A., and Seibert, S. (2015). Numba: A LLVM‑based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, Austin, TX, USA, pp. 1–6.
  29. Lee, K. J. M., Ens, J., Adkins, S., Sarmento, P., Barthet, M., and Pasquier, P. (2025). The GigaMIDI dataset with features for expressive music performance detection. Transactions of the International Society for Music Information Retrieval, 8(1), 1–19.
  30. Lerch, A., Arthur, C., Pati, A., and Gururani, S. (2020). An interdisciplinary review of music performance analysis. Transactions of the International Society for Music Information Retrieval, 3(1), 221–245.
  31. Liang, X., Zhao, Z., Zeng, W., He, Y., He, F., Wang, Y., and Gao, C. (2024). PianoBART: Symbolic piano music generation and understanding with large‑scale pre‑training. In Proceedings of the 25th IEEE International Conference on Multimedia and Expo (ICME), IEEE, Niagara Falls, ON, Canada, pp. 1–6.
  32. Liao, Y., Luo, Z., Wang, Y., and Yin, Y. (2024). Symusic: A swift and unified toolkit for symbolic music processing. In Extended Abstracts of the 25th International Society for Music Information Retrieval Conference (ISMIR), San Francisco, CA, USA.
  33. Lipman, Y., Chen, R. T., Ben‑Hamu, H., Nickel, M., and Le, M. (2022). Flow matching for generative modeling. In Proceedings of the 11th International Conference on Learning Representations (ICLR), Virtual Event.
  34. Long, P., Novack, Z., Berg‑Kirkpatrick, T., and McAuley, J. (2025). PDMX: A large‑scale public domain MusicXML dataset for symbolic music processing. In Proceedings of the 50th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Hyderabad, India, pp. 1–5.
  35. Müller, M., Konz, V., Bogler, W., and Arifi‑Müller, V. (2011). Saarland Music Data (SMD). In Extended Abstracts for the Late‑Breaking Demo Session of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami, FL, USA.
  36. Nakamura, E., Yoshii, K., and Katayose, H. (2017). Performance error detection and post‑processing for fast and accurate symbolic music alignment. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), Suzhou, China, pp. 347–353.
  37. Peter, S. D. (2023). Online symbolic music alignment with offline reinforcement learning. In Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR), Milan, Italy, pp. 634–641.
  38. Peter, S. D., Cancino‑Chacón, C. E., Foscarin, F., McLeod, A. P., Henkel, F., Karystinaios, E., and Widmer, G. (2023). Automatic note‑level score‑to‑performance alignments in the ASAP dataset. Transactions of the International Society for Music Information Retrieval, 6(1), 27–42.
  39. Rhyu, S., Kim, S., and Lee, K. (2022). Sketching the expression: Flexible rendering of expressive piano performance with self‑supervised learning. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR), Bengaluru, India, pp. 178–185.
  40. Shi, Z., Sapp, C., Arul, K., McBride, J., and Smith III, J. O. (2019). SUPRA: Digitizing the Stanford University Piano Roll Archive. In Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), Delft, Netherlands, pp. 517–523.
  41. Simonetta, F., Avanzini, F., and Ntalampiras, S. (2022). A perceptual measure for evaluating the resynthesis of automatic music transcriptions. Multimedia Tools and Applications, 81(22), 32371–32391.
  42. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., and Liu, Y. (2024). RoFormer: Enhanced Transformer with rotary position embedding. Neurocomputing, 568, 127063.
  43. Tang, J., Cooper, E., Wang, X., Yamagishi, J., and Fazekas, G. (2025). Towards an integrated approach for expressive piano performance synthesis from music scores. In Proceedings of the 50th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, pp. 1–5.
  44. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (NIPS), Volume 30, pp. 5998–6008. Curran Associates, Inc.
  45. Watson, M. (2018). MuseScore. Journal of the Musical Arts in Africa, 15(1–2), 143–147.
  46. Xia, G. G. (2016). Expressive Collaborative Music Performance via Machine Learning [PhD thesis]. Carnegie Mellon University.
  47. Yan, Y., and Duan, Z. (2024). Scoring time intervals using non‑hierarchical Transformer for automatic piano transcription. In Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR), San Francisco, CA, USA, pp. 973–980.
  48. Ycart, A., Liu, L., Benetos, E., and Pearce, M. (2020). Investigating the perceptual validity of evaluation metrics for automatic piano music transcription. Transactions of the International Society for Music Information Retrieval, 3(1), 68–81.
  49. Zacharov, I., Arslanov, R., Gunin, M., Stefonishin, D., Bykov, A., Pavlov, S., Panarin, O., Maliutin, A., Rykovanov, S., and Fedorov, M. (2019). “Zhores”: Petaflops supercomputer for data‑driven modeling, machine learning and artificial intelligence installed in Skolkovo Institute of Science and Technology. Open Engineering, 9(1), 512–520.
  50. Zeng, M., Tan, X., Wang, R., Ju, Z., Qin, T., and Liu, T.‑Y. (2021). MusicBERT: Symbolic music understanding with large‑scale pre‑training. In Findings of the Association for Computational Linguistics: ACL‑IJCNLP 2021, pp. 791–800.
  51. Zhang, H., Chowdhury, S., Cancino‑Chacón, C. E., Liang, J., Dixon, S., and Widmer, G. (2024). DExter: Learning and controlling performance expression with diffusion models. Applied Sciences, 14(15), 6543.
  52. Zhang, H., Tang, J., Rafee, S. R. M., Dixon, S., and Fazekas, G. (2022). ATEPP: A dataset of automatically transcribed expressive piano performance. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR), Bengaluru, India, pp. 446–453.
DOI: https://doi.org/10.5334/tismir.333 | Journal eISSN: 2514-3298
Language: English
Submitted on: Aug 17, 2025
Accepted on: Mar 16, 2026
Published on: Apr 27, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Ilya Borovik, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.