Measure by Measure: Measure-Based Automatic Music Composition with Modern Staff Notation

By: Yujia Yan and Zhiyao Duan
Open Access | Nov 2024

References

  1. Arnold, B. C., Castillo, E., and Sarabia, J. M. (2001). Conditionally specified distributions: An introduction. Statistical Science, 16(3), 249–274.
  2. Bretan, M., Weinberg, G., and Heck, L. (2016). A unit selection methodology for music generation using deep neural networks. In International Conference on Innovative Computing and Cloud Computing (ICCC), pp. 72–79.
  3. Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder‑decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734.
  4. Devlin, J., Chang, M.‑W., Lee, K., and Toutanova, K. (2019). BERT: Pre‑training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186.
  5. Dong, H.‑W., Chen, K., Dubnov, S., McAuley, J., and Berg‑Kirkpatrick, T. (2023). Multitrack music transformer. In International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5, IEEE.
  6. Ens, J., and Pasquier, P. (2020). MMM: Exploring conditional multi‑track music generation with the transformer. ArXiv, 10.48550/arXiv.2008.06048.
  7. Germain, M., Gregor, K., Murray, I., and Larochelle, H. (2015). MADE: Masked autoencoder for distribution estimation. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, pp. 881–889, PMLR.
  8. Gillick, J., Yang, J., Cella, C., and Bamman, D. (2021). Drumroll please: Modeling multi‑scale rhythmic gestures with flexible grids. Transactions of the International Society for Music Information Retrieval, 4(1), 156.
  9. Hadjeres, G., Pachet, F., and Nielsen, F. (2017). DeepBach: A steerable model for Bach chorales generation. In Proceedings of the 34th International Conference on Machine Learning (ICML), pp. 1362–1371.
  10. Heckerman, D., Chickering, D. M., Meek, C., Rounthwaite, R., and Kadie, C. (2000). Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research, 1(Oct), 49–75.
  11. Hochreiter, S., and Schmidhuber, J. (1997). Long short‑term memory. Neural Computation, 9(8), 1735–1780.
  12. Huang, C.‑Z. A., Cooijmans, T., Roberts, A., Courville, A. C., and Eck, D. (2017). Counterpoint by convolution. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), Suzhou, China, pp. 211–218.
  13. Huang, C.‑Z. A., Vaswani, A., Uszkoreit, J., Simon, I., Hawthorne, C., Shazeer, N., Dai, A. M., Hoffman, M. D., Dinculescu, M., and Eck, D. (2019). Music transformer. In 7th International Conference on Learning Representations (ICLR).
  14. Huang, Y.‑S., and Yang, Y.‑H. (2020). Pop music transformer: Beat‑based modeling and generation of expressive pop piano compositions. In Proceedings of the 28th ACM International Conference on Multimedia, pp. 1180–1188.
  15. Jaques, N., Gu, S., Bahdanau, D., Lobato, J. M. H., Turner, R. E., and Eck, D. (2017). Tuning recurrent neural networks with reinforcement learning. In International Conference on Learning Representations (ICLR), Workshop Track.
  16. Jiang, J., Xia, G. G., Carlton, D. B., Anderson, C. N., and Miyakawa, R. H. (2020). Transformer VAE: A hierarchical model for structure‑aware and interpretable music representation learning. In International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 516–520, IEEE.
  17. Kågebäck, M., and Salomonsson, H. (2016). Word sense disambiguation using a bidirectional LSTM. In Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex‑V), Osaka, Japan, pp. 51–56.
  18. Mittal, G., Engel, J. H., Hawthorne, C., and Simon, I. (2021). Symbolic music generation with diffusion models. In J. H. Lee, A. Lerch, Z. Duan, J. Nam, P. Rao, P. van Kranenburg, and A. Srinivasamurthy (Eds.), Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), Online, pp. 468–475.
  19. Pati, A., Lerch, A., and Hadjeres, G. (2019). Learning to traverse latent spaces for musical score inpainting. In Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), Delft, The Netherlands, pp. 343–351.
  20. Roberts, A., Engel, J. H., Raffel, C., Hawthorne, C., and Eck, D. (2018). A hierarchical latent vector model for learning long‑term structure in music. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholmsmässan, Stockholm, Sweden, pp. 4361–4370, PMLR.
  21. Sennrich, R., Haddow, B., and Birch, A. (2016). Edinburgh neural machine translation systems for WMT 16. In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, Berlin, Germany, pp. 371–376, Association for Computational Linguistics.
  22. Shaw, P., Uszkoreit, J., and Vaswani, A. (2018). Self‑attention with relative position representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pp. 464–468.
  23. Simon, I., and Oore, S. (2017). Performance RNN: Generating music with expressive timing and dynamics. https://magenta.tensorflow.org/performance-rnn
  24. Stone, K. (1980). Music Notation in the Twentieth Century: A Practical Guidebook. W. W. Norton.
  25. Suzuki, M. (2022). Score transformer: Generating musical score from note‑level representation. In ACM Multimedia Asia (MMAsia), New York, NY, USA.
  26. Uria, B., Côté, M.‑A., Gregor, K., Murray, I., and Larochelle, H. (2016). Neural autoregressive distribution estimation. The Journal of Machine Learning Research, 17(1), 7184–7220.
  27. Uria, B., Murray, I., and Larochelle, H. (2014). A deep and tractable density estimator. In Proceedings of the 31st International Conference on Machine Learning, volume 32 of Proceedings of Machine Learning Research, Beijing, China, pp. 467–475.
  28. Van Buuren, S., Brand, J. P., Groothuis‑Oudshoorn, C. G., and Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049–1064.
  29. von Rütte, D., Biggio, L., Kilcher, Y., and Hofmann, T. (2023). FIGARO: Controllable music generation using learned and expert features. In The Eleventh International Conference on Learning Representations.
  30. Walder, C. (2016). Modelling symbolic music: Beyond the piano roll. In Proceedings of the 8th Asian Conference on Machine Learning, pp. 174–189.
  31. Willshaw, D. J., Buneman, O. P., and Longuet‑Higgins, H. C. (1969). Non‑holographic associative memory. Nature, 222(5197), 960–962.
  32. Yan, Y., Lustig, E., VanderStel, J., and Duan, Z. (2018). Part‑invariant model for music generation and harmonization. In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), Paris, France, pp. 204–210.
  33. Zhang, H., Qiu, S., Duan, X., and Zhang, M. (2020). Token drop mechanism for neural machine translation. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, pp. 4298–4303, Online.
DOI: https://doi.org/10.5334/tismir.163 | Journal eISSN: 2514-3298
Language: English
Submitted on: Mar 4, 2023
Accepted on: Aug 12, 2024
Published on: Nov 1, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Yujia Yan, Zhiyao Duan, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.