Structural Segmentation of Alap in Dhrupad Vocal Concerts

Preeti Rao; Thallam Prasad Vinutha; Mattur Ananthanarayana Rohit

doi:10.5334/tismir.64

Abstract

Dhrupad vocal concerts exhibit a temporal evolution through a sequence of homogeneous sections marked by shared rhythmic characteristics. In this work, we address the segmentation of a concert audio’s unmetered improvisatory section into musically meaningful segments at the highest time scale. Motivated by the distinct musical properties of the sections and their corresponding acoustic correlates, we compute a number of features for the segment boundary detection task. Both supervised and unsupervised approaches are tested on a manually annotated dataset of commercial performance recordings. The dataset is suitably augmented for training and testing of the models, yielding new insights into the relevance of the different rhythmic, melodic and timbral cues in automatic boundary detection. We also explore a convolutional neural network trained on mel-scale magnitude spectrograms for the same task and observe that, while the network largely learns the implicit musical cues, it is less robust to deviations from the characteristics of the training data. We conclude that it can be rewarding to investigate knowledge-driven features on new genres and tasks, both to achieve reasonable performance with limited datasets and to draw a deeper understanding of genre characteristics from the acoustic analyses.

