
Figure 1
Back cover of a Dhrupad concert audio CD by Pt. Nirmalya Dey.
Table 1
Dhrupad alap sections with descriptions.
| Section | Musical characteristics |
|---|---|
| Alap-proper | Rhythm-free, slow and elaborate development of raga notes and phrases. A wide melodic range is spanned, with the focus gradually shifting from the middle-octave tonic to the lower and then the higher octave. The melodic glide noom and the mohra phrase serve as boundary cues. |
| Jod | Introduction of regular and slow pulsations via syllable rate. Melodic development and boundary cues similar to Alap-proper. |
| Jhala | Pulsation accelerates indicating climax. Syllable articulation more regular. The melodic range spanned is relatively narrow. |

Figure 2
Waveform and spectrogram of a 30 s excerpt around a Jod-Jhala boundary (labeled with a vertical dashed line) showing the melodic glide noom (with its first harmonic in the box).

Figure 3
Distribution of (a) section durations and (b) mean tempi of Dhrupad alap sections.

Figure 4
One instance (fold) of the 20-fold cross-validation process adopted for train-test data splitting.
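
The fold structure in Figure 4 can be illustrated with a minimal sketch. It assumes that each of the 20 folds holds out exactly one recording for testing and trains on the remaining ones; the caption does not state this, and the function name `leave_one_out_folds` is hypothetical.

```python
def leave_one_out_folds(recordings):
    """Yield (train, test) pairs, holding out one recording per fold.

    Assumption: one held-out recording per fold, so N recordings give N folds.
    """
    for held_out in recordings:
        train = [name for name in recordings if name != held_out]
        yield train, [held_out]


# A few alap identifiers taken from Table 8, used purely as labels here.
alaps = ["GB_AhirBhrv", "GB_Bhg", "UB_AhirBhrv", "WD_Bhg"]
folds = list(leave_one_out_folds(alaps))
```

With the full list of 20 alaps, this scheme would produce 20 folds, each testing on a recording the model has not seen during training.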

Figure 5
Bi-phasic filter impulse response, with the discrete samples superposed, as applied to generate (a) onset detection function from sub-band energy, and (b) derivative features from short-time energy and short-time spectral centroid.

Figure 6
Rhythmogram of the UB_AhirBhrv alap (dashed lines indicate labeled boundaries).

Figure 7
Analysis of the UB_AhirBhrv alap, which contains 4 sections (alap-proper, jod, jhala and jalad-jhala): (a) Tempo, (b) Salience or pulse clarity, (c) Posteriors of rhythm, (d) Short-time energy difference, (e) Short-time centroid difference, and (f) MFCC C-1 coefficient. Dashed lines indicate manually labeled section boundaries.

Figure 8
Mel-spectrogram of UB_AhirBhrv alap (dashed lines indicate labeled boundaries).

Figure 9
Block diagram for the extraction of acoustic features.
Table 2
A key to the naming of feature subsets.
| Feature subset name | Features |
|---|---|
| Rhythm | Posteriors of (tempo, salience) |
| MFCC | First 13 MFCCs |
| Timbre | MFCC, short-time energy difference, short-time centroid difference |
| All | Rhythm, Timbre |

Figure 10
SDM for UB_AhirBhrv alap with (a) ACF, (b) rhythm features, (c) posterior features, and (d) MFCC.

Figure 11
Novelty function for UB_AhirBhrv alap with (a) ACF, (b) rhythm features, (c) posterior features, and (d) MFCC. Dashed lines indicate manual boundaries.
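
A common way to obtain a novelty function from a self-distance matrix (SDM) is to slide a checkerboard kernel along its main diagonal. The sketch below illustrates that general idea in plain Python; the Euclidean distance, the unweighted kernel, the edge handling, and the names `self_distance_matrix` and `novelty_function` are assumptions made for illustration, not details taken from this work.

```python
def self_distance_matrix(features):
    """Pairwise Euclidean distances between frame-level feature vectors."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return [[dist(a, b) for b in features] for a in features]


def novelty_function(sdm, half_width):
    """Correlate a checkerboard kernel with the SDM along its diagonal.

    Frame pairs on opposite sides of frame i get weight +1, pairs on the
    same side get weight -1, so the value is large where frames before i
    differ from frames after i but each side is internally similar.
    """
    n = len(sdm)

    def clamp(k):
        # Repeat the edge frames when the kernel runs past the matrix.
        return min(max(k, 0), n - 1)

    novelty = []
    for i in range(n):
        total = 0.0
        for r in range(-half_width, half_width):
            for c in range(-half_width, half_width):
                weight = 1.0 if (r < 0) != (c < 0) else -1.0
                total += weight * sdm[clamp(i + r)][clamp(i + c)]
        novelty.append(total)
    return novelty


# Toy input: 40 frames of one constant feature value, then 40 of another.
frames = [[0.0]] * 40 + [[1.0]] * 40
novelty = novelty_function(self_distance_matrix(frames), half_width=10)
boundary = novelty.index(max(novelty))  # frame 40, where the feature changes
```

Peaks of such a curve are candidate section boundaries; the kernel half-width trades off temporal resolution against robustness to short-term variation.
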
Table 3
Performance of the unsupervised approach using different feature subsets with specified averaging window durations.
| Feature subset | Window (s) | Precision | Recall | F-score |
|---|---|---|---|---|
| Rhythm | 20 | 0.40 | 0.57 | 0.47 |
| MFCC | 3 | 0.59 | 0.57 | 0.58 |
| Timbre | 3 | 0.61 | 0.57 | 0.59 |
| All | 20 or 3 | 0.72 | 0.66 | 0.69 |
Table 4
Performance of RF classifier using different feature subsets with an averaging window of 20 s, for values of context (C) and #trees giving the best F-scores. In parentheses are results without training data augmentation.
| Feature subset | C (±s) | # trees | Precision | Recall | F-score |
|---|---|---|---|---|---|
| Rhythm | 50 | 30 | 0.17 | 0.26 | 0.21 |
| MFCC | 50 | 10 | 0.85 | 0.74 | 0.79 |
| Timbre | 20 | 50 | 0.86 | 0.81 | 0.83 |
| All | 20 | 100 | 0.90 (0.89) | 0.81 (0.75) | 0.85 (0.81) |
Table 5
Performance of CNN classifier with averaging window of 3 s with different context durations C.
| C (±s) | Precision | Recall | F-score |
|---|---|---|---|
| 20 | 0.69 | 0.77 | 0.73 |
| 50 | 0.92 | 0.81 | 0.86 |
Table 6
Performance comparison of different feature combinations and methods.
| Segmentation approach | Precision | Recall | F-score |
|---|---|---|---|
| Without rhythm features | | | |
| RF | 0.86 | 0.81 | 0.83 |
| CNN | 0.92 | 0.81 | 0.86 |
| Unsupervised | 0.61 | 0.57 | 0.59 |
| With rhythm features | | | |
| RF | 0.90 | 0.81 | 0.85 |
| Unsupervised | 0.72 | 0.66 | 0.69 |
Table 7
Segmentation performance of the different methods on test concerts.
| Test alap | Unsupervised Precision | Unsupervised Recall | Unsupervised F-score | RF Classifier Precision | RF Classifier Recall | RF Classifier F-score | CNN Precision | CNN Recall | CNN F-score |
|---|---|---|---|---|---|---|---|---|---|
| PN_Jog | 0.50 | 0.50 | 0.50 | 0.90 | 0.90 | 0.90 | 0.50 | 1.0 | 0.67 |
| PN_Maru | 0.50 | 0.50 | 0.50 | 0.47 | 0.70 | 0.56 | 0 | 0 | 0 |
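
The F-scores reported in the tables are consistent with the usual harmonic mean of precision and recall. The short sketch below shows that computation; the function name `f_score` and the convention of returning 0 when both inputs are 0 are assumptions for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        # Assumed convention: no correct detections gives an F-score of 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the CNN on PN_Jog, as listed in Table 7.
cnn_pn_jog = round(f_score(0.50, 1.0), 2)  # 0.67
```

Because the harmonic mean is dominated by the smaller of the two values, a method with perfect recall but many false boundaries still receives a modest F-score.
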
Table 8
The Dhrupad alap dataset used in this work (see Github repository mentioned in Section 7 for details).
| Sl.# | Alap | Artist | Raga | Dur (min) | #Sections |
|---|---|---|---|---|---|
| 1 | GB_AhirBhrv | Gundecha Brothers | Ahir Bhairav | 49:47 | 4 |
| 2 | GB_Bhg | Gundecha Brothers | Bihag | 21:23 | 3 |
| 3 | GB_Bhim | Gundecha Brothers | Bhimpalasi | 17:22 | 3 |
| 4 | GB_Bhrv | Gundecha Brothers | Bhairav | 53:11 | 6 |
| 5 | GB_BKT | Gundecha Brothers | Bilaskhani Todi | 43:00 | 4 |
| 6 | GB_KRA | Gundecha Brothers | Komal Rishabh Asavari | 36:30 | 4 |
| 7 | GB_Mar | Gundecha Brothers | Marwa | 48:34 | 5 |
| 8 | GB_MMal | Gundecha Brothers | Miya Malhar | 45:42 | 5 |
| 9 | GB_Yam | Gundecha Brothers | Yaman | 46:32 | 4 |
| 10 | RS_Bind | Ritwik Sanyal | Bindeshwari | 19:57 | 4 |
| 11 | RS_Shr | Ritwik Sanyal | Shree | 26:90 | 3 |
| 12 | Sul_Man_Yam | Sulabha – Manoj Saraf | Yaman | 21:46 | 3 |
| 13 | UB_AhirBhrv | Uday Bhawalkar | Ahir Bhairav | 48:00 | 4 |
| 14 | UB_Bhg | Uday Bhawalkar | Bihag | 51:10 | 3 |
| 15 | UB_Bhrv | Uday Bhawalkar | Bhairav | 50:22 | 3 |
| 16 | UB_Jog | Uday Bhawalkar | Jog | 25:46 | 3 |
| 17 | UB_Malk | Uday Bhawalkar | Malkauns | 61:16 | 3 |
| 18 | UB_Maru | Uday Bhawalkar | Maru | 35:35 | 3 |
| 19 | UB_Shr | Uday Bhawalkar | Shree | 19:45 | 3 |
| 20 | WD_Bhg | Wasifuddin Dagar | Bihag | 40:22 | 3 |
