Depict or Discern? Fingerprinting Musical Taste from Explicit Preferences

Open Access | Jan 2024

Figures & Tables

Figure 1

Heavy-tailed empirical distributions in the DL data sample. Top: distribution of the number of “fans” of artists and songs (i.e., users who marked these artists or songs as “liked”). A large proportion of items are liked by only a few users, while some items are very popular (hundreds of thousands of fans). Bottom: the number of likes given per user is also heavy-tailed, with some users liking ten thousand more items than other users. The proportion of users liking many items drops faster for artists than for songs.

Table 1

DL’s artists split into six popularity bins. The total number of likes, summed over the artists in a bin, is the same for every bin.

| Bin | Number of artists | Number of likes |
|---|---|---|
| 0 | 116 | 19283 – 86877 |
| 1 | 308 | 8534 – 19283 |
| 2 | 676 | 3690 – 8534 |
| 3 | 1865 | 1253 – 3690 |
| 4 | 7925 | 196 – 1253 |
| 5 | 575622 | 1 – 196 |

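The binning rule stated in the caption of Table 1 can be illustrated with a short sketch. This is not the authors' code: the function name `popularity_bins`, the input dictionary, and the handling of ties and bin boundaries are assumptions made for illustration.

```python
def popularity_bins(likes_per_artist, n_bins=6):
    """Assign artists to bins holding roughly equal total likes.

    likes_per_artist: dict mapping artist id -> number of likes
    (hypothetical input format). Bin 0 contains the most popular artists.
    """
    total = sum(likes_per_artist.values())
    if total == 0:
        # Degenerate case: nothing to split, put everything in bin 0.
        return {artist: 0 for artist in likes_per_artist}
    target = total / n_bins  # likes each bin should hold
    # Rank artists from most liked to least liked.
    ranked = sorted(likes_per_artist.items(), key=lambda kv: kv[1], reverse=True)
    bins = {}
    cumulative = 0
    for artist, likes in ranked:
        # The bin is decided by the likes accumulated before this artist.
        bins[artist] = min(int(cumulative // target), n_bins - 1)
        cumulative += likes
    return bins


# Toy data (invented for the example, not taken from DL).
toy_likes = {"a": 6, "b": 3, "c": 3, "d": 2, "e": 2, "f": 2}
toy_bins = popularity_bins(toy_likes, n_bins=3)
```

On real data the per-bin totals are only approximately equal, because a single artist's likes cannot be divided between two bins.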
Figure 2

Proportion of DL users’ favorite artists in each music genre.

Algorithm 1

Funiq_minsize(u)
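The body of Algorithm 1 is only available as an image, so its exact steps are not reproduced here. As a rough illustration of what a small identifying fingerprint can look like, the sketch below uses a simple greedy heuristic: it repeatedly picks the item of user u shared with the fewest users who could still be confused with u. The heuristic, the function name `greedy_min_fingerprint`, and the data layout are assumptions and may differ from the authors' Funiq_minsize.

```python
def greedy_min_fingerprint(user, likes):
    """Greedy sketch of a small identifying subset of a user's liked items.

    likes: dict mapping user id -> set of item ids (hypothetical layout).
    Returns a set of items that no other user fully shares, or None if
    the user cannot be told apart from at least one other user.
    """
    remaining = set(likes[user])
    # Users who still share every item chosen so far.
    candidates = {v for v in likes if v != user}
    fingerprint = set()
    while candidates and remaining:
        # Choose the item shared with the fewest remaining candidates;
        # str(i) breaks ties deterministically.
        best = min(
            remaining,
            key=lambda i: (sum(i in likes[v] for v in candidates), str(i)),
        )
        fingerprint.add(best)
        remaining.discard(best)
        candidates = {v for v in candidates if best in likes[v]}
    return fingerprint if not candidates else None


# Toy data (invented for the example).
toy = {
    "u1": {"A", "B"},
    "u2": {"A", "C"},
    "u3": {"B", "C"},
    "u4": {"A"},
}
fp_u1 = greedy_min_fingerprint("u1", toy)  # needs both A and B
fp_u4 = greedy_min_fingerprint("u4", toy)  # None: {"A"} is included in u1 and u2
```

A greedy choice does not guarantee the smallest possible fingerprint; it only gives a quick upper bound on its size.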

Figure 3

Share of identifiable users in DL as a function of the number of items they have liked. For example, among users with 10 or more favorite artists, about 60% can be identified.

Figure 4

Distributions of how many users (in proportion of DL) have all their favorite artists included in those of a “power-user”, for various ranges of “power-user” collection size. For example, the likes of 1% of users are fully included on average in those of a user with 750–1000 favorite artists.

Figure 5

Proportion of users (from DL) whose favorite artists are all included in the favorite artists of “power-users”. For example, the favorite artists of 40% of users are included in those of users with more than 250 favorite artists.
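The inclusion test described in the captions of Figures 4 and 5 amounts to a subset check between collections. The sketch below shows one way to compute it. The function name `included_users`, the threshold parameter `min_power_size`, and the data layout are assumptions for illustration, not the authors' implementation.

```python
def included_users(likes, min_power_size=250):
    """Return users whose whole collection is contained in a power-user's.

    likes: dict mapping user id -> set of favorite artist ids
    (hypothetical layout). A power-user is taken here to be any user
    with more than `min_power_size` favorite artists.
    """
    power_users = [u for u, items in likes.items() if len(items) > min_power_size]
    included = set()
    for u, items in likes.items():
        for p in power_users:
            # `<=` on sets tests whether `items` is a subset of `likes[p]`.
            if p != u and items <= likes[p]:
                included.add(u)
                break
    return included


# Toy data (invented for the example) with a deliberately low threshold.
toy = {"p": {1, 2, 3, 4}, "a": {1, 2}, "b": {2, 5}, "c": {3}}
covered = included_users(toy, min_power_size=2)
share = len(covered) / len(toy)
```

This pairwise loop is quadratic in the worst case; on a dataset the size of DL an inverted index from artist to users would likely be needed.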

Figure 6

Ratio of users (from DS) identifiable through their liked and streamed artists, for different time periods. For example, 97% of the users are identifiable via their yearly streamed artists.

Figure 7

Distributions of fingerprint sizes, computed with Funiq_rand and Funiq_minsize based on users’ favorite artists (DL).

Table 2

Distributions of fingerprint sizes, computed with Funiq_rand and Funiq_minsize based on favorite artists and songs, for different numbers of users in the dataset.

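Artists

| Sampling method | Number of users | Unique users (%) | Min F(u) size | Max F(u) size | Median F(u) size | Mean F(u) size | Standard deviation |
|---|---|---|---|---|---|---|---|
| Funiq_rand | 1000 | 87.3 | 1 | 13 | 2 | 2.4 | 1.4 |
| Funiq_rand | 10000 | 77.5 | 1 | 33 | 3 | 3.5 | 2.3 |
| Funiq_rand | 100000 | 67.7 | 1 | 58 | 4 | 4.9 | 3.6 |
| Funiq_rand | 871248 | 58.1 | 1 | 137 | 5 | 6.7 | 5.3 |
| Funiq_minsize | 1000 | 87.3 | 1 | 4 | 1 | 1.3 | 0.5 |
| Funiq_minsize | 10000 | 77.5 | 1 | 7 | 1 | 1.6 | 0.7 |
| Funiq_minsize | 100000 | 67.7 | 1 | 10 | 2 | 1.9 | 1.0 |
| Funiq_minsize | 871248 | 58.1 | 1 | 14 | 2 | 2.3 | 1.2 |

Songs

| Sampling method | Number of users | Unique users (%) | Min F(u) size | Max F(u) size | Median F(u) size | Mean F(u) size | Standard deviation |
|---|---|---|---|---|---|---|---|
| Funiq_rand | 1000 | 96.8 | 1 | 8 | 1.9 | 1.7 | 0.8 |
| Funiq_rand | 10000 | 94.4 | 1 | 33 | 2 | 2.2 | 1.2 |
| Funiq_rand | 100000 | 92 | 1 | 98 | 3 | 2.9 | 1.7 |
| Funiq_rand | 889017 | 89.9 | 1 | 176 | 3 | 3.6 | 2.4 |
| Funiq_minsize | 1000 | 96.8 | 1 | 2 | 1 | 1.0 | 0.1 |
| Funiq_minsize | 10000 | 94.4 | 1 | 5 | 1 | 1.1 | 0.3 |
| Funiq_minsize | 100000 | 92 | 1 | 8 | 1 | 1.3 | 0.5 |
| Funiq_minsize | 889017 | 89.9 | 1 | 194 | 1 | 1.4 | 1.1 |
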
Figure 8

Distribution of popularity among the artists in the fingerprints. We compare the distribution of popularity among users’ favorite artists, Funiq_rand fingerprints and Funiq_minsize fingerprints (DL).

Figure 9

Distribution of genres among the artists in the fingerprints. We compare the distribution of genres among users’ favorite artists, Funiq_rand fingerprints and Funiq_minsize fingerprints (DL).

Table 3

Item-wise and genre-wise prediction accuracy with Frep_kmedoid fingerprints and randomly sampled fingerprints of the same sizes on DS_favart.

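| Evaluation | Number of favorite artists | Frep_rand mean accuracy | Frep_rand standard deviation | Frep_kmedoid mean accuracy | Frep_kmedoid standard deviation |
|---|---|---|---|---|---|
| Item-wise | <25 | 0.05 | 0.11 | 0.08 | 0.13 |
| Item-wise | 25–50 | 0.14 | 0.12 | 0.25 | 0.13 |
| Item-wise | 50–75 | 0.16 | 0.12 | 0.28 | 0.12 |
| Item-wise | 75–100 | 0.18 | 0.11 | 0.30 | 0.12 |
| Item-wise | 100–150 | 0.21 | 0.12 | 0.32 | 0.12 |
| Item-wise | >150 | 0.26 | 0.12 | 0.37 | 0.10 |
| Genre-wise | <25 | 0.38 | 0.31 | 0.40 | 0.28 |
| Genre-wise | 25–50 | 0.65 | 0.14 | 0.73 | 0.09 |
| Genre-wise | 50–75 | 0.70 | 0.13 | 0.78 | 0.08 |
| Genre-wise | 75–100 | 0.71 | 0.12 | 0.81 | 0.07 |
| Genre-wise | 100–150 | 0.77 | 0.10 | 0.83 | 0.08 |
| Genre-wise | >150 | 0.88 | 0.12 | 0.97 | 0.05 |
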
Table 4

Prediction accuracy and optimal k with an item-to-item evaluation for Frep_kmedoid on favorite artists and streamed artists for different time periods (DS).

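| Data sample | Accuracy (mean) | Accuracy (standard deviation) | Optimal k (mean) | Optimal k (standard deviation) |
|---|---|---|---|---|
| Favorite artists | 0.09 | 0.12 | 2.66 | 3.26 |
| Day streams | 0.07 | 0.11 | 1.86 | 1.67 |
| Week streams | 0.13 | 0.11 | 5.03 | 4.42 |
| Month streams | 0.26 | 0.13 | 8.82 | 5.70 |
| Year streams | 0.35 | 0.12 | 9.73 | 6.23 |
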
Figure 10

Item-wise (top) and genre-wise (bottom) prediction accuracy with Funiq_minsize fingerprints and Frep_kmedoid fingerprints, performed on DL_uniq.

DOI: https://doi.org/10.5334/tismir.158 | Journal eISSN: 2514-3298
Language: English
Submitted on: Dec 23, 2022
Accepted on: Nov 20, 2023
Published on: Jan 22, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Kristina Matrosova, Manuel Moussallam, Thomas Louail, Olivier Bodini, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.