
Differentiable Short-Term Models for Efficient Online Learning and Prediction in Monophonic Music

Open Access | Nov 2022

DOI: https://doi.org/10.5334/tismir.123 | Journal eISSN: 2514-3298
Language: English
Submitted on: Nov 5, 2021
Accepted on: Sep 12, 2022
Published on: Nov 29, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2022 Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.