References
- Bain, M. N. (2008). Real time music visualization: A study in the visual extension of music. Master’s thesis, The Ohio State University.
- Bégel, V., Seilles, A., and Dalla Bella, S. (2018). Rhythm workers: A music‑based serious game for training rhythm skills. Music & Science, 1, 2059204318794369.
- Bi, T., Fankhauser, P., Bellicoso, D., and Hutter, M. (2018). Real‑time dance generation to music for a legged robot. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1038–1044. IEEE.
- Böck, S., and Davies, M. E. (2020). Deconstruct, analyse, reconstruct: How to improve tempo, beat, and downbeat estimation. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), pp. 574–582.
- Böck, S., Krebs, F., and Widmer, G. (2014). A multi‑model approach to beat tracking considering heterogeneous music styles. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), pp. 603–608.
- Böck, S., Krebs, F., and Widmer, G. (2016). Joint beat and downbeat tracking with recurrent neural networks. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), pp. 255–261.
- Böck, S., and Schedl, M. (2011). Enhanced beat tracking with context‑aware neural networks. In Proceedings of the 14th International Conference on Digital Audio Effects (DAFx), pp. 135–139.
- Chang, C.‑C., and Su, L. (2024). BEAST: Online joint beat and downbeat tracking based on streaming transformer. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Chang, H.‑J., Yang, S.‑W., and Lee, H.‑Y. (2022). DistilHuBERT: Speech representation learning by layer‑wise distillation of hidden‑unit BERT. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Chen, S., Wang, C., Chen, Z., Wu, Y., Liu, S., Chen, Z., Li, J., Kanda, N., Yoshioka, T., Xiao, X., Wu, J., Zhou, L., Ren, S., Qian, Y., Wu, J., Zeng, M., Yu, X., and Wei, F. (2022). WavLM: Large‑scale self‑supervised pre‑training for full stack speech processing. IEEE Journal of Selected Topics in Signal Processing, 16(6), 1505–1518.
- Chiu, C.‑Y., Müller, M., Davies, M. E. P., Su, A. W.‑Y., and Yang, Y.‑H. (2023). Local periodicity‑based beat tracking for expressive classical piano music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 2824–2835.
- Chiu, C.‑Y., Su, A. W., and Yang, Y.‑H. (2021). Drum‑aware ensemble architecture for improved joint musical beat and downbeat tracking. IEEE Signal Processing Letters, 28, 1100–1104.
- Cliff, D. (2000). Hang the DJ: Automatic sequencing and seamless mixing of dance‑music tracks. HP Laboratories Technical Report HPL‑2000‑104.
- Davies, M. E., and Böck, S. (2019). Temporal convolutional networks for musical audio beat tracking. In 2019 27th European Signal Processing Conference (EUSIPCO), pp. 1–5. IEEE.
- Davis, A., and Agrawala, M. (2018). Visual rhythm and beat. ACM Transactions on Graphics (TOG), 37(4), 1–11.
- De Clercq, T., and Temperley, D. (2011). A corpus analysis of rock harmony. Popular Music, 30(1), 47–70.
- Défossez, A. (2021). Hybrid spectrogram and waveform source separation. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), 2021 Workshop on Music Source Separation.
- Desblancs, D., Lostanlen, V., and Hennequin, R. (2023). Zero‑Note Samba: Self‑supervised beat tracking. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
- Elowsson, A. (2016). Beat tracking with a cepstroid invariant neural network. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), pp. 351–357.
- Eskimez, S. E., Maddox, R. K., Xu, C., and Duan, Z. (2019). Noise‑resilient training method for face landmark generation from speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 27–38.
- Eyben, F., Weninger, F., Ferroni, G., and Schuller, B. (2013). Tempo estimation and beat tracking with long short‑term memory neural networks and comb‑filters. Technical report, Universität Augsburg.
- Federgruen, A., and Tzur, M. (1991). A simple forward algorithm to solve general dynamic lot sizing models with n periods in O(n log n) or O(n) time. Management Science, 37(8), 909–925.
- Gkiokas, A., and Katsouros, V. (2017). Convolutional neural networks for real‑time beat tracking: A dancing robot application. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), pp. 286–293.
- Gkiokas, A., Katsouros, V., and Carayannis, G. (2012). Reducing tempo octave errors by periodicity vector coding and SVM learning. In Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 301–306.
- Goto, M. (2001). An audio‑based real‑time beat tracking system for music with or without drum‑sounds. Journal of New Music Research, 30(2), 159–171.
- Goto, M. (2004). Development of the RWC music database. In Proceedings of the 18th International Congress on Acoustics, 2004, pp. 553–556.
- Goto, M., Hashiguchi, H., Nishimura, T., and Oka, R. (2002). RWC music database: Popular, classical and jazz music databases. In Proceedings of the 3rd International Society for Music Information Retrieval Conference (ISMIR), pp. 287–288.
- Goto, M., and Muraoka, Y. (1999). Real‑time beat tracking for drumless audio signals: Chord change detection for musical decisions. Speech Communication, 27(3–4), 311–335.
- Gouyon, F., Klapuri, A., Dixon, S., Alonso, M., Tzanetakis, G., Uhle, C., and Cano, P. (2006). An experimental comparison of audio tempo induction algorithms. IEEE Transactions on Audio, Speech, and Language Processing, 14(5), 1832–1844.
- Greenlees, M. (2020). Beat Tracking with Autoencoders. Zenodo. 10.5281/zenodo.4091524.
- Hainsworth, S. W., and Macleod, M. D. (2004). Particle filtering applied to musical tempo tracking. EURASIP Journal on Advances in Signal Processing, 2004, 1–11.
- Heydari, M., Cwitkowitz, F., and Duan, Z. (2021). BeatNet: CRNN and particle filtering for online joint beat downbeat and meter tracking. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR).
- Heydari, M., and Duan, Z. (2021). Don’t look back: An online beat tracking method using RNN and enhanced particle filtering. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Heydari, M., and Duan, Z. (2022). Singing beat tracking with self‑supervised front‑end and linear transformers. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR).
- Heydari, M., McCallum, M., Ehmann, A., and Duan, Z. (2022). A novel 1D state space for efficient music rhythmic analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 421–425. IEEE.
- Heydari, M., Wang, J.‑C., and Duan, Z. (2023). SingNet: A real‑time singing voice beat and downbeat tracking system. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE.
- Holzapfel, A., Davies, M. E., Zapata, J. R., Oliveira, J. L., and Gouyon, F. (2012). Selective sampling for beat tracking evaluation. IEEE Transactions on Audio, Speech, and Language Processing, 20(9), 2539–2548.
- Hung, Y.‑N., Wang, J.‑C., Song, X., Lu, W.‑T., and Won, M. (2022). Modeling beats and downbeats with a time‑frequency transformer. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Katharopoulos, A., Vyas, A., Pappas, N., and Fleuret, F. (2020). Transformers are RNNs: Fast autoregressive transformers with linear attention. In International Conference on Machine Learning, pp. 5156–5165. PMLR.
- Kim, Y., and Rush, A. M. (2016). Sequence‑level knowledge distillation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1317–1327. Austin, Texas.
- Krebs, F., Böck, S., and Widmer, G. (2013). Rhythmic pattern modeling for beat and downbeat tracking in musical audio. In Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), pp. 227–232.
- Krebs, F., Böck, S., and Widmer, G. (2015). An efficient state‑space model for joint tempo and meter tracking. In Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), pp. 72–78.
- Li, B., Wang, Y., and Duan, Z. (2021). Audiovisual singing voice separation. Transactions of the International Society for Music Information Retrieval (TISMIR), 4(1), 195–209.
- Lu, W.‑T., Wang, J.‑C., Won, M., Choi, K., and Song, X. (2021). SpecTNT: A time‑frequency transformer for music audio. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), pp. 396–403.
- Marchand, U., and Peeters, G. (2015). Swing ratio estimation. In Digital Audio Effects 2015 (DAFx‑15), pp. 1–7. Trondheim, Norway.
- Masri, P. (1996). Computer Modeling of Sound for Transformation and Synthesis of Musical Signals. PhD thesis, University of Bristol, Bristol, England.
- Meier, P., Krump, G., and Müller, M. (2021). A real‑time beat tracking system based on predominant local pulse information. In Demos and Late Breaking News of the 22nd International Society for Music Information Retrieval Conference (ISMIR).
- Morais, G., Davies, M. E. P., Queiroz, M., and Fuentes, M. (2023). Tempo vs. pitch: Understanding self‑supervised tempo estimation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Morena, E. (2021). A Creative Exploration of Techniques Employed in Pop/Rock Drum Patterns (1965–1992): A dissertation with supporting audio and video recordings. PhD thesis, University of Adelaide, Adelaide, Australia.
- Mottaghi, A., Behdin, K., Esmaeili, A., Heydari, M., and Marvasti, F. (2017). OBTAIN: Real‑time beat tracking in audio signals. International Journal of Signal Processing Systems, 5(4), 123–129.
- Oliveira, J. L., Gouyon, F., Martins, L. G., and Reis, L. P. (2010). IBT: A real‑time tempo and beat tracking system. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 291–296.
- Rafii, Z., Liutkus, A., Stöter, F.‑R., Mimilakis, S. I., and Bittner, R. (2017). The MUSDB18 corpus for music separation. Zenodo. 10.5281/zenodo.1117372.
- Schloss, W. A. (1985). On the Automatic Transcription of Percussive Music: From Acoustic Signal to High‑level Analysis. PhD thesis, Stanford University.
- Shiu, Y., and Kuo, C.‑C. J. (2007). A modified Kalman filtering approach to on‑line musical beat tracking. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
- Steinmetz, C. J., and Reiss, J. D. (2021). WaveBeat: End‑to‑end beat and downbeat tracking in the time domain. In Proceedings of the 151st Audio Engineering Society Convention.
- Tsunoo, E., Kashiwagi, Y., Kumakura, T., and Watanabe, S. (2019). Transformer ASR with contextual block processing. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 427–433. IEEE.
- Tzanetakis, G., and Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293–302.
- Wu, Y.‑K., Chiu, C.‑Y., and Yang, Y.‑H. (2022). Jukedrummer: Conditional beat‑aware audio‑domain drum accompaniment generation via transformer VQ‑VAE. In Proceedings of the 23rd International Society for Music Information Retrieval Conference, pp. 1–8. Bengaluru, India.
- Zhao, J., Xia, G., and Wang, Y. (2022). Beat transformer: Demixed beat and downbeat tracking with dilated self‑attention. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR).
- Zheng‑qing, C., and Jian‑hua, H. (2005). A comparative study between time‑domain method and frequency‑domain method for identification of bridge flutter derivatives. Engineering Mechanics, 22(6), 127–133.
