BeatNet+: Real‑Time Rhythm Analysis for Diverse Music Audio
By: Mojtaba Heydari and Zhiyao Duan
Open Access | Dec 2024

References

  1. Bain, M. N. (2008). Real time music visualization: A study in the visual extension of music. Master’s thesis, The Ohio State University.
  2. Bégel, V., Seilles, A., and Dalla Bella, S. (2018). Rhythm workers: A music‑based serious game for training rhythm skills. Music & Science, 1, 2059204318794369.
  3. Bi, T., Fankhauser, P., Bellicoso, D., and Hutter, M. (2018). Real‑time dance generation to music for a legged robot. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1038–1044. IEEE.
  4. Böck, S., and Davies, M. E. (2020). Deconstruct, analyse, reconstruct: How to improve tempo, beat, and downbeat estimation. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), pp. 574–582.
  5. Böck, S., Krebs, F., and Widmer, G. (2014). A multi‑model approach to beat tracking considering heterogeneous music styles. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), pp. 603–608.
  6. Böck, S., Krebs, F., and Widmer, G. (2016). Joint beat and downbeat tracking with recurrent neural networks. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), pp. 255–261.
  7. Böck, S., and Schedl, M. (2011). Enhanced beat tracking with context‑aware neural networks. In Proceedings of the 14th International Conference on Digital Audio Effects (DAFx), pp. 135–139.
  8. Chang, C.‑C., and Su, L. (2024). BEAST: Online joint beat and downbeat tracking based on streaming transformer. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  9. Chang, H.‑J., Yang, S.‑W., and Lee, H.‑Y. (2022). Distilhubert: Speech representation learning by layer‑wise distillation of hidden‑unit BERT. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  10. Chen, S., Wang, C., Chen, Z., Wu, Y., Liu, S., Chen, Z., Li, J., Kanda, N., Yoshioka, T., Xiao, X., Wu, J., Zhou, L., Ren, S., Qian, Y., Wu, J., Zeng, M., Yu, X., and Wei, F. (2022). WavLM: Large‑scale self‑supervised pre‑training for full stack speech processing. IEEE Journal of Selected Topics in Signal Processing, 16(6), 1505–1518.
  11. Chiu, C.‑Y., Müller, M., Davies, M. E. P., Su, A. W.‑Y., and Yang, Y.‑H. (2023). Local periodicity‑based beat tracking for expressive classical piano music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 2824–2835.
  12. Chiu, C.‑Y., Su, A. W., and Yang, Y.‑H. (2021). Drum‑aware ensemble architecture for improved joint musical beat and downbeat tracking. IEEE Signal Processing Letters, 28, 1100–1104.
  13. Cliff, D. (2000). Hang the DJ: Automatic sequencing and seamless mixing of dance‑music tracks. HP Laboratories Technical Report HPL‑2000‑104.
  14. Davies, M. E., and Böck, S. (2019). Temporal convolutional networks for musical audio beat tracking. In 2019 27th European Signal Processing Conference (EUSIPCO), pp. 1–5. IEEE.
  15. Davis, A., and Agrawala, M. (2018). Visual rhythm and beat. ACM Transactions on Graphics (TOG), 37(4), 1–11.
  16. De Clercq, T., and Temperley, D. (2011). A corpus analysis of rock harmony. Popular Music, 30(1), 47–70.
  17. Défossez, A. (2021). Hybrid spectrogram and waveform source separation. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), 2021 Workshop on Music Source Separation.
  18. Desblancs, D., Lostanlen, V., and Hennequin, R. (2023). Zero‑Note Samba: Self‑supervised beat tracking. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
  19. Elowsson, A. (2016). Beat tracking with a cepstroid invariant neural network. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), pp. 351–357.
  20. Eskimez, S. E., Maddox, R. K., Xu, C., and Duan, Z. (2019). Noise‑resilient training method for face landmark generation from speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 27–38.
  21. Eyben, F., Weninger, F., Ferroni, G., and Schuller, B. (2013). Tempo estimation and beat tracking with long short‑term memory neural networks and comb‑filters. Technical report, Universität Augsburg.
  22. Federgruen, A., and Tzur, M. (1991). A simple forward algorithm to solve general dynamic lot sizing models with n periods in O(n log n) or O(n) time. Management Science, 37(8), 909–925.
  23. Gkiokas, A., and Katsouros, V. (2017). Convolutional neural networks for real‑time beat tracking: A dancing robot application. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), pp. 286–293.
  24. Gkiokas, A., Katsouros, V., and Carayannis, G. (2012). Reducing tempo octave errors by periodicity vector coding and SVM learning. In Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 301–306.
  25. Goto, M. (2001). An audio‑based real‑time beat tracking system for music with or without drum‑sounds. Journal of New Music Research, 30(2), 159–171.
  26. Goto, M. (2004). Development of the RWC music database. In Proceedings of the 18th International Congress on Acoustics, 2004, pp. 553–556.
  27. Goto, M., Hashiguchi, H., Nishimura, T., and Oka, R. (2002). RWC music database: Popular, classical and jazz music databases. In Proceedings of the 3rd International Society for Music Information Retrieval Conference (ISMIR), pp. 287–288.
  28. Goto, M., and Muraoka, Y. (1999). Real‑time beat tracking for drumless audio signals: Chord change detection for musical decisions. Speech Communication, 27(3–4), 311–335.
  29. Gouyon, F., Klapuri, A., Dixon, S., Alonso, M., Tzanetakis, G., Uhle, C., and Cano, P. (2006). An experimental comparison of audio tempo induction algorithms. IEEE Transactions on Audio, Speech, and Language Processing, 14(5), 1832–1844.
  30. Greenlees, M. (2020). Beat Tracking with Autoencoders. Zenodo. 10.5281/zenodo.4091524.
  31. Hainsworth, S. W., and Macleod, M. D. (2004). Particle filtering applied to musical tempo tracking. EURASIP Journal on Advances in Signal Processing, 2004, 1–11.
  32. Heydari, M., Cwitkowitz, F., and Duan, Z. (2021). BeatNet: CRNN and particle filtering for online joint beat downbeat and meter tracking. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR).
  33. Heydari, M., and Duan, Z. (2021). Don’t look back: An online beat tracking method using RNN and enhanced particle filtering. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  34. Heydari, M., and Duan, Z. (2022). Singing beat tracking with self‑supervised front‑end and linear transformers. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR).
  35. Heydari, M., McCallum, M., Ehmann, A., and Duan, Z. (2022). A novel 1D state space for efficient music rhythmic analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 421–425. IEEE.
  36. Heydari, M., Wang, J.‑C., and Duan, Z. (2023). SingNet: A real‑time singing voice beat and downbeat tracking system. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE.
  37. Holzapfel, A., Davies, M. E., Zapata, J. R., Oliveira, J. L., and Gouyon, F. (2012). Selective sampling for beat tracking evaluation. IEEE Transactions on Audio, Speech, and Language Processing, 20(9), 2539–2548.
  38. Hung, Y.‑N., Wang, J.‑C., Song, X., Lu, W.‑T., and Won, M. (2022). Modeling beats and downbeats with a time‑frequency transformer. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  39. Katharopoulos, A., Vyas, A., Pappas, N., and Fleuret, F. (2020). Transformers are RNNs: Fast autoregressive transformers with linear attention. In International Conference on Machine Learning, pp. 5156–5165. PMLR.
  40. Kim, Y., and Rush, A. M. (2016). Sequence‑level knowledge distillation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1317–1327. Austin, Texas.
  41. Krebs, F., Böck, S., and Widmer, G. (2013). Rhythmic pattern modeling for beat and downbeat tracking in musical audio. In Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), pp. 227–232.
  42. Krebs, F., Böck, S., and Widmer, G. (2015). An efficient state‑space model for joint tempo and meter tracking. In Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), pp. 72–78.
  43. Li, B., Wang, Y., and Duan, Z. (2021). Audiovisual singing voice separation. Transactions of the International Society for Music Information Retrieval (TISMIR), 4(1), 195–209.
  44. Lu, W.‑T., Wang, J.‑C., Won, M., Choi, K., and Song, X. (2021). SpecTNT: A time‑frequency transformer for music audio. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), pp. 396–403.
  45. Marchand, U., and Peeters, G. (2015). Swing ratio estimation. In Proceedings of the 18th International Conference on Digital Audio Effects (DAFx‑15), pp. 1–7. Trondheim, Norway.
  46. Masri, P. (1996). Computer Modeling of Sound for Transformation and Synthesis of Musical Signals. PhD thesis, University of Bristol, Bristol, England.
  47. Meier, P., Krump, G., and Müller, M. (2021). A real‑time beat tracking system based on predominant local pulse information. In Demos and Late Breaking News of the 22nd International Society for Music Information Retrieval Conference (ISMIR).
  48. Morais, G., Davies, M. E. P., Queiroz, M., and Fuentes, M. (2023). Tempo vs. pitch: Understanding self‑supervised tempo estimation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  49. Morena, E. (2021). A Creative Exploration of Techniques Employed in Pop/Rock Drum Patterns (1965–1992): A dissertation with supporting audio and video recordings. PhD thesis, University of Adelaide, Adelaide, Australia.
  50. Mottaghi, A., Behdin, K., Esmaeili, A., Heydari, M., and Marvasti, F. (2017). OBTAIN: Real‑time beat tracking in audio signals. International Journal of Signal Processing Systems, 5(4), 123–129.
  51. Oliveira, J. L., Gouyon, F., Martins, L. G., and Reis, L. P. (2010). IBT: A real‑time tempo and beat tracking system. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 291–296.
  52. Rafii, Z., Liutkus, A., Stöter, F.‑R., Mimilakis, S. I., and Bittner, R. (2017). The MUSDB18 corpus for music separation [Data set]. Zenodo.
  53. Schloss, W. A. (1985). On the Automatic Transcription of Percussive Music – From Acoustic Signal to High‑level Analysis. PhD thesis, Stanford University.
  54. Shiu, Y., and Kuo, C.‑C. J. (2007). A modified Kalman filtering approach to on‑line musical beat tracking. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
  55. Steinmetz, C. J., and Reiss, J. D. (2021). WaveBeat: End‑to‑end beat and downbeat tracking in the time domain.
  56. Tsunoo, E., Kashiwagi, Y., Kumakura, T., and Watanabe, S. (2019). Transformer ASR with contextual block processing. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 427–433. IEEE.
  57. Tzanetakis, G., and Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293–302.
  58. Wu, Y.‑K., Chiu, C.‑Y., and Yang, Y.‑H. (2022). Jukedrummer: Conditional beat‑aware audio‑domain drum accompaniment generation via transformer VQ‑VAE. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR), pp. 1–8. Bengaluru, India.
  59. Zhao, J., Xia, G., and Wang, Y. (2022). Beat transformer: Demixed beat and downbeat tracking with dilated self‑attention. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR).
  60. Zheng‑qing, C., and Jian‑hua, H. (2005). A comparative study between time‑domain method and frequency‑domain method for identification of bridge flutter derivatives. Engineering Mechanics, 22(6), 127–133.
DOI: https://doi.org/10.5334/tismir.198 | Journal eISSN: 2514-3298
Language: English
Submitted on: Apr 1, 2024
Accepted on: Sep 11, 2024
Published on: Dec 6, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Mojtaba Heydari, Zhiyao Duan, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.