
Supervised Contrastive Models for Music Information Retrieval in Classical Persian Music

Open Access | Jan 2026

Figures & Tables

Table 1

Data distribution of the PCID dataset.

| Instrument | # Train | # Test | # Val |
| --- | --- | --- | --- |
| Daf | (52 m) | (6.5 m) | (6.5 m) |
| Divan | (59 m) | (7 m) | (7 m) |
| Dutar | (50.5 m) | (6 m) | (6 m) |
| Gheychak | (50 m) | (6 m) | (6 m) |
| Kamancheh | (2 h, 14 m) | (16.5 m) | (16.5 m) |
| Ney Anban | (1 h, 6 m) | (8 m) | (8 m) |
| Ney | (2 h, 15 m) | (17 m) | (17 m) |
| Oud | (2 h, 32 m) | (19 m) | (19 m) |
| Qanun | (1 h, 1 m) | (7.5 m) | (7.5 m) |
| Rubab | (50 m) | (6 m) | (6 m) |
| Santur | (2 h, 11 m) | (16 m) | (16 m) |
| Setar | (3 h, 22 m) | (25 m) | (25 m) |
| Tanbour | (1 h, 18 m) | (9.5 m) | (9.5 m) |
| Tar | (2 h, 7 m) | (16 m) | (16 m) |
| Tonbak | (1 h, 9 m) | (8.5 m) | (8.5 m) |
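The durations in Table 1 appear to follow a split of roughly 80% train, 10% test, and 10% validation per instrument. The following is a minimal sketch for checking that reading on three rows copied from the table. The helper names (`to_minutes`, `split_shares`) are illustrative and do not come from the paper, and the split ratio is an inference from the listed durations, not a figure stated by the authors.

```python
import re

# Durations copied from three rows of Table 1: (train, test, val).
TABLE_1_ROWS = {
    "Daf": ("52 m", "6.5 m", "6.5 m"),
    "Kamancheh": ("2 h, 14 m", "16.5 m", "16.5 m"),
    "Setar": ("3 h, 22 m", "25 m", "25 m"),
}


def to_minutes(text: str) -> float:
    """Convert a duration such as '2 h, 14 m' or '6.5 m' to minutes."""
    hours = re.search(r"([\d.]+)\s*h", text)
    minutes = re.search(r"([\d.]+)\s*m", text)
    total = 0.0
    if hours:
        total += 60.0 * float(hours.group(1))
    if minutes:
        total += float(minutes.group(1))
    return total


def split_shares(row):
    """Return the (train, test, val) fractions of the total duration."""
    train, test, val = (to_minutes(cell) for cell in row)
    total = train + test + val
    return train / total, test / total, val / total


for name, row in TABLE_1_ROWS.items():
    train, test, val = split_shares(row)
    print(f"{name}: train={train:.3f} test={test:.3f} val={val:.3f}")
```

Running the same check on the remaining rows is a matter of adding them to `TABLE_1_ROWS`.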
Figure 1

Flowchart of the proposed model structure.

Figure 2

Our proposed contrastive (base) model architecture.

Figure 3

Accuracy vs. input length tested on the Nava and PCID datasets (trained on the PCID 5 Instruments subset).

Figure 4

Accuracy vs. input length tested on the Nava and PCID datasets (trained on PCID).

Figure 5

Accuracy vs. input length tested on the Nava and PCID datasets (trained on the original Nava dataset).

Figure 6

Comparison of test accuracy between the proposed model, Baba Ali et al. (2019), and Baba Ali (2024).

Figure 7

Comparison of accuracy for Dastgah detection across Baba Ali et al. (2019), Baba Ali (2024), and the proposed method.

Figure 8

Architecture of the best classifier model for the one‑second, 15‑class classification task.

Figure 9

Architecture of the best meta‑classifier model for the 20‑second, 15‑class classification task.

Figure 10

t‑SNE projection of penultimate‑layer features for 10,000 one‑second test segments from the PCID dataset.

Figure 11

Normalized confusion matrix (one‑second input, PCID test set).

Table 2

Comparison of instrument classification performance across different studies.

| Study | Dataset | # of classes | Methodology | Accuracy (%) | F1‑Score (%) |
| --- | --- | --- | --- | --- | --- |
| Our Study | Extended Dataset (15 instruments) | 15 | Supervised contrastive learning with SSA | 97.48 | 98 |
| Our Study | Subset of Extended Dataset (5 instruments) | 5 | Supervised contrastive learning with SSA | 99.78 | 100 |
| Our Study | Nava Dataset (Modified) | 5 | Supervised contrastive learning with SSA | 99.88 | 100 |
| Agostini et al. (2003) | Orchestral Instruments Dataset | 27 | Spectral features with KNN and neural networks | 70–80 | N/A |
| Essid et al. (2006) | Solo Recordings and Mixtures of Western Instruments | 7 | MFCCs, timbral descriptors with SVM | 65–75 | N/A |
| Han et al. (2016) | Subset of MIREX Dataset (Various Genres and Instruments) | 11 | Deep CNNs for predominant instrument recognition | 75 | 80 |
| Solanki and Pandey (2022) | IRMAS Dataset (6705 recordings) | 11 | Eight‑layer deep CNN with mel spectrogram input | 92.61 | N/A |
| Prabavathy et al. (2020) | RWC Database, MusicBrainz.org, IRMAS, NSynth | 16 | SVM and KNN with MFCC and sonogram features | 99.29 | 95.15 |
| Gong et al. (2021) | ChMusic Dataset (Traditional Chinese Instruments) | 11 | MFCCs with KNN and majority voting | 94.15 | N/A |
| Humphrey et al. (2018) | OpenMIC‑2018 Dataset | 20 | Deep learning with CNN and multi‑instance learning | N/A | 78 (AUC‑PR) |
| Reghunath and Rajan (2022) | Polyphonic Music Dataset | 11 | Transformer‑based ensemble method | 85 | 79 |
| Mousavi et al. (2019) | PCMIR Dataset (Persian Classical Music) | 6 | MFCCs, spectral features with neural network | 80 | N/A |
| Baba Ali et al. (2019) | Nava Dataset (Original) | 5 | MFCC and i‑vector with SVM | 84.75 | 84 |
| Baba Ali et al. (2024) | Nava Dataset (Original) | 5 | Self‑supervised, pre‑trained models | 99.64 | 99.64 |
DOI: https://doi.org/10.5334/tismir.271 | Journal eISSN: 2514-3298
Language: English
Submitted on: Apr 26, 2025 | Accepted on: Dec 6, 2025 | Published on: Jan 7, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Ali Ahmadi Katamjani, Seyed Abolghasem Mirroshandel, Mahdi Aminian, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.

Volume 9 (2026): Issue 1