
Audio-Based Music Structure Analysis: Current Trends, Open Challenges, and Applications

Open Access | Dec 2020

References

1. Befus, C. (2010). Design and evaluation of dynamic feature-based segmentation on music. Master’s thesis, University of Lethbridge, Lethbridge, Alberta, Canada.
2. Bimbot, F., Blouch, O. L., Sargent, G., & Vincent, E. (2010). Decomposition into autonomous and comparable blocks: A structural description of music pieces. In Proc. of the 11th International Society for Music Information Retrieval Conference, pages 189–194. Utrecht, The Netherlands.
3. Bimbot, F., Sargent, G., Deruty, E., Guichaoua, C., & Vincent, E. (2014). Semiotic description of music structure: An introduction to the Quaero/Metiss structural annotations. In Proc. of the AES 53rd Conference on Semantic Audio.
4. Bittner, R., Fuentes, M., Rubinstein, D., Jansson, A., Choi, K., & Kell, T. (2019). mirdata: Software for reproducible usage of datasets. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 99–106. Delft, The Netherlands.
5. Bruderer, M. J. (2008). Perception and Modeling of Segment Boundaries in Popular Music. PhD thesis, Technische Universiteit Eindhoven.
6. Bruderer, M. J., McKinney, M. F., & Kohlrausch, A. (2009). The perception of structural boundaries in melody lines of Western popular music. Musicæ Scientiæ, 13(2), 273–313. DOI: 10.1177/102986490901300204
7. Cambouropoulos, E. (2001). The local boundary detection model (LBDM) and its application in the study of expressive timing. In Proc. of the International Computer Music Conference, pages 17–22. La Havana, Cuba.
8. Cannam, C., Landone, C., & Sandler, M. (2010). Sonic Visualiser: An open source application for viewing, analysing, and annotating music audio files. In Proc. of the 18th ACM International Conference on Multimedia, pages 1467–1468. ACM. DOI: 10.1145/1873951.1874248
9. Chen, T.-P., & Su, L. (2019). Harmony Transformer: Incorporating chord segmentation into harmony recognition. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 259–267. Delft, The Netherlands.
10. Cheng, T., Smith, J. B. L., & Goto, M. (2018). Music structure boundary detection and labelling by a deconvolution of path-enhanced self-similarity matrix. In IEEE International Conference on Acoustics, Speech and Signal Processing, pages 106–110. Calgary, Alberta, Canada. DOI: 10.1109/ICASSP.2018.8461319
11. Collins, T., Arzt, A., Flossman, S., & Widmer, G. (2013). SIARCT-CFP: Improving precision and the discovery of inexact musical patterns in point-set representations. In Proc. of the 14th International Society for Music Information Retrieval Conference, pages 549–554. Curitiba, Brazil.
12. Dannenberg, R. B., & Goto, M. (2008). Music structure analysis from acoustic signals. In Havelock, D., Kuwano, S., & Vorländer, M., editors, Handbook of Signal Processing in Acoustics, pages 305–331. Springer, New York, NY. DOI: 10.1007/978-0-387-30441-0_21
13. Dhariwal, P., Jun, H., Payne, C., Kim, J. W., Radford, A., & Sutskever, I. (2020). Jukebox: A generative model for music. arXiv preprint 2005.00341.
14. Dieleman, S., van den Oord, A., & Simonyan, K. (2018). The challenge of realistic music generation: Modelling raw audio at scale. In Advances in Neural Information Processing Systems 31, pages 7989–7999. Curran Associates, Inc.
15. Dinh, L., Sohl-Dickstein, J., & Bengio, S. (2017). Density estimation using real NVP. In 5th International Conference on Learning Representations (ICLR). Toulon, France.
16. Dong, H.-W., Hsiao, W.-Y., Yang, L.-C., & Yang, Y.-H. (2018). MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Proc. of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, LA, USA.
17. Engel, J., Agrawal, K. K., Chen, S., Gulrajani, I., Donahue, C., & Roberts, A. (2019). GANSynth: Adversarial neural audio synthesis. In 7th International Conference on Learning Representations (ICLR). New Orleans, LA, USA.
18. Flexer, A., & Grill, T. (2016). The problem of limited inter-rater agreement in modelling music similarity. Journal of New Music Research, 45(3), 239–251. DOI: 10.1080/09298215.2016.1200631
19. Foote, J. (2000). Automatic audio segmentation using a measure of audio novelty. In Proc. of the IEEE International Conference on Multimedia and Expo (ICME), pages 452–455. New York City, NY, USA. DOI: 10.1109/ICME.2000.869637
20. Fuentes, M., Maia, L. S., & Biscainho, L. W. P. (2019a). Tracking beats and microtiming in Afro-Latin American music using conditional random fields and deep learning. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 251–258. Delft, The Netherlands.
21. Fuentes, M., McFee, B., Crayencour, H., Essid, S., & Bello, J. (2019b). A music structure informed downbeat tracking system using skip-chain conditional random fields and deep learning. In Proc. of the 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 481–485. DOI: 10.1109/ICASSP.2019.8682870
22. Gemmeke, J. F., Ellis, D. P., Freedman, D., Jansen, A., Lawrence, W., Moore, R. C., Plakal, M., & Ritter, M. (2017). Audio Set: An ontology and human-labeled dataset for audio events. In Proc. of the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing, pages 776–780. DOI: 10.1109/ICASSP.2017.7952261
23. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems 27, pages 2672–2680.
24. Goto, M. (2003). A chorus-section detecting method for musical audio signals. In Proc. of the 28th IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 437–440. Hong Kong, China.
25. Goto, M. (2006a). AIST annotation for the RWC Music Database. In Proc. of the 7th International Conference on Music Information Retrieval, pages 359–360. Victoria, BC, Canada.
26. Goto, M. (2006b). A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Transactions on Audio, Speech, and Language Processing, 14(5), 1783–1794. DOI: 10.1109/TSA.2005.863204
27. Goto, M., Yoshii, K., Fujihara, H., Mauch, M., & Nakano, T. (2011). Songle: A web service for active music listening improved by user contributions. In Proc. of the 12th International Society for Music Information Retrieval Conference, pages 311–316. Miami, FL, USA.
28. Grill, T., & Schlüter, J. (2015a). Music boundary detection using neural networks on combined features and two-level annotations. In Proc. of the 16th International Society for Music Information Retrieval Conference. Málaga, Spain.
29. Grill, T., & Schlüter, J. (2015b). Music boundary detection using neural networks on spectrograms and self-similarity lag matrices. In Proc. of the 23rd European Signal Processing Conference (EUSIPCO). Nice, France. DOI: 10.1109/EUSIPCO.2015.7362593
30. Groves, R. (2016). Automatic melodic reduction using a supervised probabilistic context-free grammar. In Proc. of the 17th International Society for Music Information Retrieval Conference, pages 775–781. New York, NY, USA.
31. Guérin, É., Digne, J., Galin, É., Peytavie, A., Wolf, C., Benes, B., & Martinez, B. (2017). Interactive example-based terrain authoring with conditional generative adversarial networks. ACM Transactions on Graphics, 36(6), 228:1–228:13. DOI: 10.1145/3130800.3130804
32. Hamanaka, M., Hirata, K., & Tojo, S. (2006). Implementing “A Generative Theory of Tonal Music”. Journal of New Music Research, 35(4), 249–277. DOI: 10.1080/09298210701563238
33. Hargreaves, S., Klapuri, A., & Sandler, M. (2012). Structural segmentation of multitrack audio. IEEE Transactions on Audio, Speech, and Language Processing, 20(10), 2637–2647. DOI: 10.1109/TASL.2012.2209419
34. Humphrey, E. J., Bello, J. P., & LeCun, Y. (2012). Moving beyond feature design: Deep architecture and automatic feature learning in music informatics. In Proc. of the 13th International Society for Music Information Retrieval Conference, pages 403–408. Porto, Portugal.
35. Humphrey, E. J., Salamon, J., Nieto, O., Forsyth, J., Bittner, R. M., & Bello, J. P. (2014). JAMS: A JSON annotated music specification for reproducible MIR research. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 591–596. Taipei, Taiwan.
36. Janssen, B., de Haas, W., Volk, A., & Van Kranenburg, P. (2013). Discovering repeated patterns in music: State of knowledge, challenges, perspectives. In Proc. of the 10th International Symposium on Computer Music Multidisciplinary Research (CMMR), pages 225–240. Marseille, France.
37. Jhamtani, H., & Berg-Kirkpatrick, T. (2019). Modeling self-repetition in music generation using generative adversarial networks. In Machine Learning for Music Discovery Workshop, ICML. Long Beach, USA.
38. Kaiser, F., & Peeters, G. (2013). A simple fusion method of state and sequence segmentation for music structure discovery. In Proc. of the 14th International Society for Music Information Retrieval Conference. Curitiba, Brazil.
39. Kaiser, F., & Sikora, T. (2010). Music structure discovery in popular music using non-negative matrix factorization. In Proc. of the 11th International Society for Music Information Retrieval Conference, pages 429–434. Utrecht, The Netherlands.
40. Kim, J. W., & Bello, J. P. (2019). Adversarial learning for improved onsets and frames music transcription. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 670–677. Delft, The Netherlands.
41. Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1 × 1 convolutions. In Advances in Neural Information Processing Systems 31, pages 10215–10224. Montreal, Canada.
42. Kinnaird, K. M. (2016). Aligned hierarchies: A multiscale structure-based representation for music-based data streams. In Proc. of the 17th International Society for Music Information Retrieval Conference, pages 337–343. New York City, NY, USA.
43. Kinnaird, K. M. (2018). Aligned sub-hierarchies: A structure-based approach to the cover song task. In Proc. of the 19th International Society for Music Information Retrieval Conference, pages 585–591. Paris, France.
44. Klien, V., Grill, T., & Flexer, A. (2012). On automated annotation of acousmatic music. Journal of New Music Research, 41(2), 153–173. DOI: 10.1080/09298215.2011.618226
45. Lerdahl, F., & Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press.
46. Levy, M., & Sandler, M. (2008). Structural segmentation of musical audio by constrained clustering. IEEE Transactions on Audio, Speech, and Language Processing, 16(2), 318–326. DOI: 10.1109/TASL.2007.910781
47. Levy, M., Sandler, M., & Casey, M. (2006). Extraction of high-level musical structure from audio data and its application to thumbnail generation. In Proc. of the 31st IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 5. DOI: 10.1109/ICASSP.2006.1661200
48. Liem, C. C., Gómez, E., & Schedl, M. (2015). PHENICX: Innovating the classical music experience. In Proc. of the 2015 IEEE International Conference on Multimedia and Expo Workshops, pages 3–6. Torino, Italy. DOI: 10.1109/ICMEW.2015.7169835
49. Logan, B., & Chu, S. (2000). Music summarization using key phrases. In Proc. of the 25th IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 2, pages 749–752. Istanbul, Turkey. DOI: 10.1109/ICASSP.2000.859068
50. Lukashevich, H. (2008). Towards quantitative measures of evaluating song segmentation. In Proc. of the 9th International Conference on Music Information Retrieval, pages 375–380. Philadelphia, PA, USA.
51. Maezawa, A. (2019). Music boundary detection based on a hybrid deep model of novelty, homogeneity, repetition and duration. In Proc. of the 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 206–210. Brighton, United Kingdom. DOI: 10.1109/ICASSP.2019.8683249
52. Manzelli, R., Thakkar, V., Siahkamari, A., & Kulis, B. (2018). Conditioning deep generative raw audio models for structured automatic music. In Proc. of the 19th International Society for Music Information Retrieval Conference, pages 182–189. Paris, France.
53. Martin, D. R., Fowlkes, C. C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 530–549. DOI: 10.1109/TPAMI.2004.1273918
54. Mauch, M., Cannam, C., Davies, M., Dixon, S., Harte, C., Kolozali, S., Tidhar, D., & Sandler, M. (2009a). OMRAS2 Metadata Project 2009. In Late Breaking/Demo at the 10th International Society for Music Information Retrieval Conference. Kobe, Japan.
55. Mauch, M., Noland, K., & Dixon, S. (2009b). Using musical structure to enhance automatic chord transcription. In Proc. of the 10th International Society for Music Information Retrieval Conference, pages 231–236. Kobe, Japan.
56. McCallum, M. (2019). Unsupervised learning of deep features for music segmentation. In Proc. of the 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 346–350. Brighton, United Kingdom. DOI: 10.1109/ICASSP.2019.8683407
57. McFee, B., & Ellis, D. P. W. (2014a). Analyzing song structure with spectral clustering. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 405–410. Taipei, Taiwan.
58. McFee, B., & Ellis, D. P. W. (2014b). Learning to segment songs with ordinal linear discriminant analysis. In Proc. of the 39th IEEE International Conference on Acoustics, Speech and Signal Processing, pages 5197–5201. Florence, Italy. DOI: 10.1109/ICASSP.2014.6854594
59. McFee, B., & Kinnaird, K. (2019). Improving structure evaluation through automatic hierarchy expansion. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 152–158. Delft, The Netherlands.
60. McFee, B., Nieto, O., Farbood, M. M., & Bello, J. P. (2017). Evaluating hierarchical structure in music annotations. Frontiers in Psychology, 8: 1337. DOI: 10.3389/fpsyg.2017.01337
61. McFee, B., Raffel, C., Liang, D., Ellis, D. P. W., McVicar, M., Battenberg, E., & Nieto, O. (2015). librosa: Audio and music signal analysis in Python. In Proc. of the 14th Python in Science Conference (SciPy), pages 18–25. Austin, TX, USA. DOI: 10.25080/Majora-7b98e3ed-003
62. Müller, M., & Jiang, N. (2012). A scape plot representation for visualizing repetitive structures of music recordings. In Proc. of the 13th International Society for Music Information Retrieval Conference, pages 97–102. Porto, Portugal.
63. Murthy, Y. V. S., & Koolagudi, S. G. (2018). Content-based music information retrieval (CB-MIR) and its applications toward the music industry: A review. ACM Computing Surveys, 51(3). DOI: 10.1145/3177849
64. Nieto, O. (2015). Discovering Structure in Music: Automatic Approaches and Perceptual Evaluations. PhD thesis, New York University.
65. Nieto, O., & Bello, J. P. (2014). Music segment similarity using 2D-Fourier magnitude coefficients. In Proc. of the 39th IEEE International Conference on Acoustics, Speech and Signal Processing, pages 664–668. Florence, Italy. DOI: 10.1109/ICASSP.2014.6853679
66. Nieto, O., & Bello, J. P. (2016). Systematic exploration of computational music structure research. In Proc. of the 17th International Society for Music Information Retrieval Conference, pages 547–553. New York City, NY, USA.
67. Nieto, O., Farbood, M. M., Jehan, T., & Bello, J. P. (2014). Perceptual analysis of the F-measure for evaluating section boundaries in music. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 265–270. Taipei, Taiwan.
68. Nieto, O., Humphrey, E. J., & Bello, J. P. (2012). Compressing music recordings into audio summaries. In Proc. of the 13th International Society for Music Information Retrieval Conference, pages 313–318. Porto, Portugal.
69. Nieto, O., & Jehan, T. (2013). Convex non-negative matrix factorization for automatic music structure identification. In Proc. of the 38th IEEE International Conference on Acoustics, Speech, and Signal Processing. DOI: 10.1109/ICASSP.2013.6637644
70. Nieto, O., McCallum, M., Davies, M., Robertson, A., Stark, A., & Egozy, E. (2019). The Harmonix Set: Beats, downbeats, and functional segment annotations of Western popular music. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 565–572. Delft, The Netherlands.
71. Panagakis, Y., & Kotropoulos, C. (2012). Music structure analysis by ridge regression of beat-synchronous audio features. In Proc. of the 13th International Society for Music Information Retrieval Conference, pages 271–276. Porto, Portugal.
72. Paulus, J., & Klapuri, A. (2009). Music structure analysis using a probabilistic fitness measure and a greedy search algorithm. IEEE Transactions on Audio, Speech, and Language Processing, 17(6), 1159–1170. DOI: 10.1109/TASL.2009.2020533
73. Paulus, J., Müller, M., & Klapuri, A. (2010). Audio-based music structure analysis. In Proc. of the 11th International Society for Music Information Retrieval Conference, pages 625–636. Utrecht, The Netherlands.
74. Pauwels, J., Kaiser, F., & Peeters, G. (2013). Combining harmony-based and novelty-based approaches for structural segmentation. In Proc. of the 14th International Society for Music Information Retrieval Conference. Curitiba, Brazil.
75. Peeters, G., & Bisot, V. (2014). Improving music structure segmentation using lag-priors. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 337–342. Taipei, Taiwan.
76. Peeters, G., Burthe, A. L., & Rodet, X. (2002). Toward automatic music audio summary generation from signal analysis. In Proc. of the 3rd International Conference on Music Information Retrieval. Paris, France.
77. Peeters, G., & Deruty, E. (2009). Is music structure annotation multi-dimensional? A proposal for robust local music annotation. In Proc. of the 3rd International Workshop on Learning the Semantics of Audio Signals (LSAS), pages 75–90. Graz, Austria.
78. Pons, J., Nieto, O., Prockup, M., Schmidt, E. M., Ehmann, A. F., & Serra, X. (2018). End-to-end learning for music audio tagging at scale. In Proc. of the 19th International Society for Music Information Retrieval Conference, pages 637–644. Paris, France.
79. Raffel, C., McFee, B., Humphrey, E. J., Salamon, J., Nieto, O., Liang, D., & Ellis, D. P. W. (2014). mir_eval: A transparent implementation of common MIR metrics. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 367–372. Taipei, Taiwan.
80. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140), 1–67.
81. Rafii, Z., Liutkus, A., & Pardo, B. (2014). REPET for background/foreground separation in audio. In Naik, G. R., & Wang, W., editors, Blind Source Separation, pages 395–411. Springer. DOI: 10.1007/978-3-642-55016-4_14
82. Roberts, A., Engel, J., Raffel, C., Hawthorne, C., & Eck, D. (2018). A hierarchical latent vector model for learning long-term structure in music. In Proc. of the 35th International Conference on Machine Learning, volume 80 of Proc. of Machine Learning Research, pages 4364–4373. Stockholm, Sweden.
83. Rosenberg, A., & Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. In Proc. of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 410–420.
84. Sargent, G., Bimbot, F., & Vincent, E. (2011). A regularity-constrained Viterbi algorithm and its application to the structural segmentation of songs. In Proc. of the 12th International Society for Music Information Retrieval Conference, pages 483–488. Miami, FL, USA.
85. Sargent, G., Bimbot, F., & Vincent, E. (2017). Estimating the structural segmentation of popular music pieces under regularity constraints. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(2), 344–358. DOI: 10.1109/TASLP.2016.2635031
86. Schedl, M., Gómez, E., & Urbano, J. (2014). Music information retrieval: Recent developments and applications. Foundations and Trends in Information Retrieval, 8(2–3), 127–261. DOI: 10.1561/1500000042
87. Schnitzer, D., Flexer, A., Schedl, M., & Widmer, G. (2011). Using mutual proximity to improve content-based audio similarity. In Proc. of the 12th International Society for Music Information Retrieval Conference, pages 79–84. Miami, FL, USA.
88. Seetharaman, P., & Pardo, B. (2016). Simultaneous separation and segmentation in layered music. In Proc. of the 17th International Society for Music Information Retrieval Conference. New York City, NY, USA.
89. Serrà, J., Müller, M., Grosche, P., & Arcos, J. L. (2014). Unsupervised music structure annotation by time series structure features and segment similarity. IEEE Transactions on Multimedia, Special Issue on Music Data Mining, 16(5), 1229–1240. DOI: 10.1109/TMM.2014.2310701
90. Serrà, J., Serra, X., & Andrzejak, R. G. (2009). Cross recurrence quantification for cover song identification. New Journal of Physics, 11(9), 1138–1151. DOI: 10.1088/1367-2630/11/9/093017
91. Shibata, G., Nishikimi, R., Nakamura, E., & Yoshii, K. (2019). Statistical music structure analysis based on a homogeneity-, repetitiveness-, and regularity-aware hierarchical hidden semi-Markov model. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 268–275. Delft, The Netherlands.
92. Smith, J. B. L. (2014). Explaining listener differences in the perception of musical structure. PhD thesis, Queen Mary University of London.
93. Smith, J. B. L., Burgoyne, J. A., Fujinaga, I., De Roure, D., & Downie, J. S. (2011). Design and creation of a large-scale database of structural annotations. In Proc. of the 12th International Society for Music Information Retrieval Conference, pages 555–560. Miami, FL, USA.
94. Smith, J. B. L., & Chew, E. (2013). A meta-analysis of the MIREX structure segmentation task. In Proc. of the 14th International Society for Music Information Retrieval Conference, pages 251–256. Curitiba, Brazil.
95. Smith, J. B. L., & Goto, M. (2016). Using priors to improve estimates of music structure. In Proc. of the 17th International Society for Music Information Retrieval Conference, pages 554–560. New York City, NY, USA.
96. Smith, J. B. L., & Goto, M. (2017). Multi-part pattern analysis: Combining structure analysis and source separation to discover intra-part repeated sequences. In Proc. of the 18th International Society for Music Information Retrieval Conference, pages 716–723. Suzhou, China.
97. Thickstun, J., Harchaoui, Z., Foster, D., & Kakade, S. (2019). Coupled recurrent models for polyphonic music composition. In Proc. of the 20th International Society for Music Information Retrieval Conference, pages 311–318. Delft, The Netherlands.
98. Tian, M., & Sandler, M. B. (2016). Towards music structural segmentation across genres. ACM Transactions on Intelligent Systems and Technology, 8(2), 1–19. DOI: 10.1145/2950066
99. Tralie, C. J., & McFee, B. (2019). Enhanced hierarchical music structure annotations via feature level similarity fusion. In Proc. of the 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 201–205. Brighton, United Kingdom. DOI: 10.1109/ICASSP.2019.8683492
100. Turnbull, D., Lanckriet, G., Pampalk, E., & Goto, M. (2007). A supervised approach for detecting boundaries in music using difference features and boosting. In Proc. of the 8th International Conference on Music Information Retrieval, pages 42–49. Vienna, Austria.
101. Ullrich, K., Schlüter, J., & Grill, T. (2014). Boundary detection in music structure analysis using convolutional neural networks. In Proc. of the 15th International Society for Music Information Retrieval Conference, pages 417–422. Taipei, Taiwan.
102. van den Oord, A., Dieleman, S., & Schrauwen, B. (2013). Deep content-based music recommendation. In Advances in Neural Information Processing Systems 26, pages 2643–2651.
103. Wang, C.-I., Mysore, G. J., & Dubnov, S. (2017). Re-visiting the music segmentation problem with crowdsourcing. In Proc. of the 18th International Society for Music Information Retrieval Conference, pages 738–744. Suzhou, China.
104. Wang, J.-C., Lee, H.-S., Wang, H.-M., & Jeng, S.-K. (2011). Learning the similarity of audio music in bag-of-frames representation from tagged music data. In Proc. of the 12th International Society for Music Information Retrieval Conference, pages 85–90. Miami, FL, USA.
105. Weiss, R., & Bello, J. P. (2011). Unsupervised discovery of temporal structure in music. IEEE Journal of Selected Topics in Signal Processing, 5(6), 1240–1251. DOI: 10.1109/JSTSP.2011.2145356
DOI: https://doi.org/10.5334/tismir.54 | Journal eISSN: 2514-3298
Language: English
Submitted on: Feb 29, 2020
Accepted on: Oct 6, 2020
Published on: Dec 11, 2020
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2020 Oriol Nieto, Gautham J. Mysore, Cheng-i Wang, Jordan B. L. Smith, Jan Schlüter, Thomas Grill, Brian McFee, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.