Inferring Communities of Medieval Music Manuscripts Using Stochastic Block Models

Tim Eipert; Fabian C. Moss

doi:10.5334/tismir.298


1 Introduction

In digital musicology, the analysis of medieval chant is a field that has long employed computational methods. The relatively simple notation of chant melodies (Hiley, 2009) lends itself to digital encoding (Helsen and Lacoste, 2011), which has led to one of the largest musicological datasets to date, the Cantus database (Lacoste, 2012). Complementarily, various websites and tools have been developed to provide access to the different chant datasets through advanced search tools (Morent, 2018).^¹ Moreover, various tools for manual (Burlet et al., 2012; Eipert et al., 2019; Regimbal et al., 2019, 2020) and automatic (Fujinaga and Vigliensoni, 2019, 2023; Hartelt et al., 2024) transcription, as well as infrastructure for computer‑based analysis (Cornelissen et al., 2020a; Eipert and Moss, 2023), increasingly enable researchers to address complex questions with quantitative methods.

Though many have worked on the melodic structure of medieval liturgical chants (Cornelissen et al., 2020a,2020b; Hajič Jr. et al., 2023; Lanz and Hajič Jr., 2023; Nakamura et al., 2023; Van Kranenburg and Maessen, 2017), one important genre, the so‑called tropes (Haug, 2018a), has been overlooked in computational studies. Trope elements are insertions into Gregorian chants, making them contingent on their primary chant. They thus represent a unique chant category, as they structurally depend on a primary repertoire—namely, Gregorian chant^². Simultaneously, tropes developed into a regionally diverse repertoire of more or less stable phrases that spread from the Carolingian Empire throughout Christian territories in medieval Europe. The oral transmission and ensuing evolution of these melodies occurred centuries before they were ever recorded in written manuscripts. To understand the complex context of the trope genre, scholars have typically examined the repertoires and musical variants preserved in regional manuscript collections (e.g., Hughes, 2017). In this way, the genre can stand in for the entire field, illustrating how oral musical transmission functions when our only witnesses are scattered and fragmentary manuscripts. Thus far, however, most studies have relied on manual examination and simple statistical techniques (Hiley, 2017).

Here, we aim to build on earlier research by introducing a quantitative model of a trope dataset using network analysis. We are interested in exploring the relationship between musical diversity and geographical spread. Since our specific goal is to trace the transmission of trope elements, our research question is:

How did trope elements spread through Central Europe, and what community blocks can be identified from the repertoire differences we observe?

Our study contributes to the field of computational musicology by addressing historical questions using large domain‑specific datasets and applying suitable computational methods. We draw on the largest available dataset of trope metadata. We compare a family of stochastic block models (SBMs) to infer communities of manuscripts from this dataset. For evaluation, we assess the inferred communities qualitatively, by comparing them with the grouping found in previous literature, as well as quantitatively, by comparing the inferred communities with a regional grouping. After fitting our model to the data, we find that there is a good overlap between regional groups, the groupings of previous literature, and our inferred communities.

2 Related Work

2.1 Computational analysis of medieval chants

Many computational analyses of medieval music focus on the melodic structure of chants. For instance, Cornelissen et al. (2020b) evaluate various models for classifying modes based on melody transcriptions in the Cantus database and suggest that even the mere contour of melodies contains information about the mode they belong to. Their results also indicate that segmenting pitch sequences according to neumes, syllables, and words of the sung text yields better classification outcomes than using simple n‑grams. This, however, has been contested by Lanz and Hajič Jr. (2023), who show that this result cannot be replicated with musicologically informed data cleaning. Nakamura et al. (2023) extend investigations on modes of medieval chants by incorporating the historical dimension into the analysis. They find that, while the relative frequency of modes is stable over time, the internal pitch structure changes. Van Kranenburg and Maessen (2017) analyze melodies from five chant repertoires and classify them solely by melodic features.

Another small but significant research field concerns the relationships between musical manuscripts. Hajič Jr. et al. (2023) infer an evolutionary tree of manuscripts based on variations in their preserved repertoire. The resulting tree expresses the content‑based relationships between manuscripts. This methodology can be further developed to enable the tree to infer not only manuscript relationships but also their chronological sequence (Ballen et al., 2024; Hajič Jr. et al., 2025).

2.2 Trope manuscript comparisons

If we limit the scope to studies related to the trope repertoire, we can identify similar questions (manuscript comparison), albeit with mostly different (qualitative or noncomputational) methodologies. Early scholarship on the transmission of trope manuscripts often addressed questions of regional traditions. This is exemplified by Weiss (1964), who tackles the problem of grouping South French trope manuscripts, and Hiley (1980), who compares the manuscripts of Norman chant traditions and demonstrates connections between these and Britain and Southern Italy. These specific manuscript groupings often reflect broader historical divisions. Huglo (1999), for instance, links the fundamental divergence of Gregorian chants into eastern and western traditions directly to the political fragmentation of the Carolingian Empire following the Treaty of Verdun in 843 CE.

Kruckenberg (2006) argued for the importance of the individual trope element as a unit of analysis. She shows that examining the geographical spread of those elements and their contextualization within historical territorial borders can yield insights into aspects such as their approximate date, which is generally quite different from the date of the manuscript in which they are first transmitted. Complementing such analytical approaches, new conceptual models have recently emerged from digital editing practices. For instance, Nardini (2021) has published a digital edition of prosulae—a specific, related form of tropes—as a website, using the analogy of hypertext to capture the material’s multidirectional and nonlinear nature. The methodological shift toward analyzing repertoires in a more quantitative way was further developed by Hughes (2017). He creates a similarity matrix to compare the repertoires of 12 Aquitanian manuscripts. The application of computational techniques represents the most recent step in the field. Hiley (2017) employs a simple computational approach using a self‑created dataset from the first volume of the Corpus Troporum (CT). His study quantifies and compares the concordances of trope elements across the manuscripts in the dataset, and his analysis yields the identification of distinct manuscript groups based on the frequency of agreement; notably, the groups with the highest levels of concordance tend to cluster in regionally coherent patterns. Our study builds directly on the concept of Hiley (2017) but introduces a new methodological approach by using better‑suited statistical models, e.g., SBMs.

2.3 Stochastic block models

Holland et al. (1983) introduce the SBM as a combination of block models (White et al., 1976) and probabilistic models of digraphs (Holland and Leinhardt, 1981). Their version partitions the vertices of a social network into blocks and lets the probability of an edge depend only on the blocks of the two endpoints. After this first formulation, researchers generalized the SBM in several directions. Nowicki and Snijders (2001) place the model in a Bayesian framework for networks without labeled blocks. Airoldi et al. (2007) allow each vertex to belong to multiple blocks, utilizing a mixed‑membership specification. Rohe et al. (2010) develop a spectral clustering estimator that scales to large, sparse graphs.

For many years, the SBM remained a rather niche method—descriptive algorithms dominated empirical network analysis instead. The situation changed with the study by Karrer and Newman (2011), who introduce the degree‑corrected SBM (DC‑SBM). By adding a single propensity parameter to every vertex, the DC‑SBM eliminates the unrealistic requirement that all vertices within the same block share similar degrees.

Peixoto (2011) introduces entropy as an objective criterion for comparing alternative formulations of the SBM. The subsequent release of a high‑performance graph‑tool software program (Peixoto, 2014a) made efficient inference algorithms widely accessible, allowing large SBMs and their extensions to be estimated routinely (Peixoto, 2017). Within the same framework, several variants have been proposed, including an SBM with nested, hierarchical blocks (Peixoto, 2014b) and versions that handle weighted edges (Peixoto, 2018a).

A growing body of research demonstrates the advantages of viewing networks as outcomes of explicit generative models. For example, Peixoto (2018b) shows how measurement error can be incorporated directly into the SBM likelihood, while Peixoto (2019) provides a fully Bayesian treatment. Building on these insights, Peixoto (2023) argues that, especially for historical and social‑science networks, the objective should be inference (explaining the observed structure) rather than mere pattern description. Peel et al. (2022) reinforce this position, contending that any substantive claim about network organization must account for uncertainty and that such quantification is only possible within a valid generative framework such as the SBM.

3 Dataset

3.1 Tropes and trope elements

A trope is an intentional interpolation—new words, new music, or a combination of both—inserted at fixed points within an established liturgical chant (Haug, 2018b). The added material enlarges the original piece and can offer additional theological or rhetorical nuance.

Tropes consist of a so‑called primary chant as their basis, with textual–melodic interjections or additions, which we call trope elements. Figure 1 illustrates a transcription of a trope to the primary chant Dominus Dixit from the Corpus Monodicum edition. The trope elements (red) are indexed by numbers; the primary chant parts (blue; often appearing only as incomplete cues) are indexed by letters. A trope element is defined as the smallest unit that is not interrupted by a primary chant cue across all manuscripts. Trope elements can therefore have very diverse lengths, ranging from a few words to entire pages.

Excerpt from a transcription of a trope to the antiphon *Dominus Dixit* with indicated structural elements. Trope elements have a numerical ID; primary chant cues are indicated by letters. Source: Tropus Hodie Cantandus Est, transcribed by David Catalunya, Corpus Monodicum Online Edition, https://corpus-monodicum.de/d/bdeea6d3-3c3f-4314-9f92-0ec8950535d6.

Analyzing the trope repertoire can reveal insights about local cultures, interactions between regional communities, and musical transmission processes. Variations in this repertoire, e.g., melodic variants of essentially the same trope, could be attributed to either intentional change or the fact that, for centuries, transmission was an oral phenomenon. Tropes were written down centuries after they were first conceived.

3.2 Corpus troporum dataset

To study possible transmission patterns of trope elements, we use a dataset of tropes for the Proper of the Mass, the CT Dataset (Eipert et al., 2025), which contains 18,239 edges linking 163 manuscripts to 4,407 trope elements in total. The dataset was derived by transcribing the trope element IDs from the concordance tables of CT. The full dataset is freely available at https://osf.io/fkdq5/.

The CT Dataset was derived from the printed scholarly edition of CT, which has documented trope transmission by identifying substantial portions of the repertoire through inventories of trope elements in manuscripts (Jonsson, 1978).

The dataset’s scope is bound to the selection of the editors of the CT volumes. Originally, only manuscripts copied before 1100 CE were considered for inclusion in the edition (Jonsson, 1975). For CT Volume X, the editorial guidelines were relaxed to accommodate all later sources that came to the editors' attention, resulting in a marked increase in the number of manuscripts, some of which date from as late as the 14th century (Jacobsson, 2011). Only tropes attested with musical notation in chant books were included; mere incipits of trope texts in ordines, ordinaria, or consuetudines were excluded, since they do not transmit the full text.

While CT lacks melodic transcriptions (a complex future task due to diverse notation systems), it provides a comprehensive foundation for analyzing transmission patterns based solely on metadata and text editions. For analytical work, it proved helpful to isolate the smallest coherent unit of tropes. CT refers to these units as trope elements (Jonsson, 1975), representing the smallest phrases that form a coherent whole. Delimiting the elements is an editorial decision made by the editor of each volume. CT documents their occurrence across the surviving manuscripts by assigning a unique ID to every trope element and by listing their order of appearance in specific chants in so‑called concordance tables. Because these tables are organized around trope element IDs, different sources can be linked by the trope elements they share, making trope elements a reliable token for computational comparison of regional chant traditions.

Table 1 shows an extract of a concordance table from CT. The table lists, in sequence of appearance, the trope elements that accompany the Christmas introit Puer natus est in two different manuscript sources, Ba 5 and SG 376; the numbers in the ‘Trope Elements’ column correspond to the element identifiers assigned by CT. While, in both cases, the first inserted trope element is the one with ID 25, and both share some trope elements, the insertions throughout the chant vary greatly, raising questions about the historical and geographical transmission of these units of medieval musical information.

Table 1

Occurrences of trope elements for the incipit Puer Natus est.

Primary Chant Incipit	Feast	Genre	Manuscript	Trope Elements (Sequential Order)
Puer Natus est	Nat III	intr	Ba 5	25 31 32 35 18 19 36 14 1 2 3 4 5 34 37 33
Puer Natus est	Nat III	intr	SG 376	25 2 3 4 11 12 30 13 1 7 8 9 10 5 26

For the present study, we selected two components of the CT Dataset: (i) the occurrences of individual trope elements within each manuscript and (ii) the manuscript‑level metadata specifying provenance. The precise geographic coordinates of every provenance site are often unknown. Therefore, they were manually annotated using geolocation from Google Maps, thereby generating latitude and longitude fields that permit cartographic visualization.

3.3 Geographical division

To evaluate whether the grouping of manuscripts actually forms regional clusters due to overlapping content, we draw on earlier musicological scholarship—namely, we use the geographical divisions of CT, Volume 10, which partitions the entire manuscript collection into the regions: East, Northern Italy, Southern Italy, Northwest and the Transition Zone, and Southwest. For manuscripts in the CT Dataset that were not included in that earlier compilation, we assigned the regional category manually. This was straightforward when another, already‑classified manuscript was housed in the same library; otherwise, we mapped the manuscript to the closest location for which a classification was available.

4 Methods

Network science provides a powerful framework for analyzing complex systems through their fundamental components. This field aims to describe an interaction system holistically, revealing insights beyond what could be observed by merely examining the sum of its parts (Peel et al., 2022; Pósfai and Barabási, 2016). The actual structure of the historical network depicting the transmission of trope elements across medieval Europe through manuscripts is unknown, and many parts are missing. The challenge commonly addressed in network analysis is therefore one of inference: reconstructing a presumed, hidden network from the available data based on reasonable assumptions (Peel et al., 2022) and informed by domain expertise—in our case, historical musicology.

4.1 Network construction

We model the data provided by the CT Dataset as follows: Let the complete set of trope elements be denoted by $T = \{t_{1}, t_{2}, \dots, t_{|T|}\}$ and the set of manuscripts be denoted by $M = \{m_{1}, m_{2}, \dots, m_{|M|}\}$. In our case, $|T| = 4{,}407$, $|M| = 163$, and $|T \cup M| = |T| + |M| = 4{,}570$.

We record the co‑occurrence of trope elements and manuscripts in a binary incidence matrix $A = (A_{ij}) \in \{0, 1\}^{|T| \times |M|}$, whose entries are defined as $A_{ij} = 1$ if trope element $t_{i}$ occurs in manuscript $m_{j}$ and $A_{ij} = 0$ otherwise. The matrix $A$ defines a bipartite graph $G = (T \cup M, E)$, where an edge $(t_{i}, m_{j}) \in E$ indicates that trope element $t_{i}$ appears in manuscript $m_{j}$. Because the edges are undirected, the full $(|T| + |M|) \times (|T| + |M|)$ adjacency matrix of $G$ is symmetrical. Accordingly, all subsequent network analyses treat trope elements and manuscripts as separate node partitions connected exclusively by these incidence edges.
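As a minimal sketch of this construction, the two concordance rows from Table 1 can be turned into a small bipartite edge set and incidence matrix. The variable names (`occurrences`, `incidence`, and so on) and the dictionary layout are illustrative assumptions; they do not reflect the actual file format of the CT Dataset.

```python
# Toy bipartite network built from the two concordance rows in Table 1.
# Names and data layout are illustrative assumptions, not the CT Dataset schema.
occurrences = {
    "Ba 5": [25, 31, 32, 35, 18, 19, 36, 14, 1, 2, 3, 4, 5, 34, 37, 33],
    "SG 376": [25, 2, 3, 4, 11, 12, 30, 13, 1, 7, 8, 9, 10, 5, 26],
}

manuscripts = sorted(occurrences)                                  # node set M
tropes = sorted({t for ids in occurrences.values() for t in ids})  # node set T

# Edge set E: one undirected edge per (trope element, manuscript) occurrence.
edges = {(t, m) for m, ids in occurrences.items() for t in ids}

# Binary incidence matrix with |T| rows and |M| columns.
incidence = [[1 if (t, m) in edges else 0 for m in manuscripts] for t in tropes]

# Node degrees: row sums for trope elements, column sums for manuscripts.
trope_degree = {t: sum(row) for t, row in zip(tropes, incidence)}
manuscript_degree = {
    m: sum(row[j] for row in incidence) for j, m in enumerate(manuscripts)
}
```

In this toy example, the degree of a manuscript is simply the number of trope elements listed for it, which is the same notion of degree used for the full network.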

4.2 Stochastic block models

To infer communities from a generative model, we employ and compare various types of SBMs. A standard SBM is defined by a partition of $N$ nodes into $B$ blocks^³, specified by the $N$‑dimensional vector $b$ with entries $b_{i} \in \{1, \dots, B\}$ giving the block membership of node $i$. The probability of an edge existing between any two nodes $i$ and $j$ depends only on their block assignments $b_{i}$ and $b_{j}$; it is generally assumed that nodes within the same block are more likely to be connected than nodes from different blocks. This probability is given by a symmetrical $B \times B$ matrix $p$, where $p_{b_{i}, b_{j}}$ represents the probability of an edge between any two nodes in blocks $b_{i}$ and $b_{j}$, respectively (Peixoto, 2019). The SBM thus shifts the perspective from the nodes of the network to its node communities, or blocks.

The simplest SBM can be derived via maximization of entropy (Jaynes, 1982) under the constraint of an expected edge count for each pair of blocks. It is characterized as follows:^⁴

$$ P (A | p, b) = \prod_{i < j} p_{b_{i}, b_{j}}^{A_{i j}} \left(1 - p_{b_{i}, b_{j}}\right)^{1 - A_{i j}} \tag{1} $$
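To make Equation 1 concrete, the following sketch evaluates its logarithm on a hypothetical four‑node graph for two candidate partitions. The function name, the toy adjacency matrix, and the probability values are assumptions chosen for illustration; they are unrelated to the trope network and to the inference code used in this study.

```python
import math
from itertools import combinations


def sbm_log_likelihood(adjacency, blocks, p):
    """Log-probability of an undirected simple graph under Equation 1.

    adjacency: symmetric 0/1 matrix as a list of lists
    blocks:    blocks[i] is the block index b_i of node i
    p:         symmetric B x B matrix of edge probabilities, each in (0, 1)
    """
    log_prob = 0.0
    for i, j in combinations(range(len(blocks)), 2):  # every pair i < j once
        p_ij = p[blocks[i]][blocks[j]]
        if adjacency[i][j]:
            log_prob += math.log(p_ij)        # edge present
        else:
            log_prob += math.log(1.0 - p_ij)  # edge absent
    return log_prob


# Hypothetical toy graph: a path on four nodes (0-1-2-3).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Dense within blocks, sparse between blocks.
p = [[0.9, 0.1],
     [0.1, 0.9]]

good = sbm_log_likelihood(adjacency, [0, 0, 1, 1], p)
bad = sbm_log_likelihood(adjacency, [0, 1, 0, 1], p)
```

With these values, the partition that keeps neighboring nodes together (`good`) receives a higher log‑likelihood than the alternating partition (`bad`), which is the intuition the standard SBM encodes.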

The standard SBM captures the intuition that group membership should affect the connectivity between nodes. It has been observed, however, that node degrees (i.e., the number of edges a node shares) vary significantly in empirical networks, even within the same blocks (Karrer and Newman, 2011). While a standard SBM assumes that all nodes belonging to a block have, on average, the same number of edges, the DC‑SBM (Peixoto, 2023) introduces an additional parameter to allow for heterogeneous node degrees within blocks.

Both the standard and the DC‑SBM variants abstract from the empirical nodes and model latent communities as blocks, which can be conceived of as new nodes on a higher level. A nested SBM (N‑SBM) (Peixoto, 2014b) extends this logic and stacks multiple blocks hierarchically: Instead of having one vector $b$ containing the block indices, this model consists of several block vectors $b^{0}, b^{1}, \dots, b^{L}$ , where $b^{l}$ represents the blocks on the $l$ ‑th level. Levels are nested hierarchically, which means that a block $b^{l}$ on level $l$ becomes a new node at level $l + 1$ . Note that the total number of hierarchical levels, $L$ , is a parameter of the model that takes part in the inference. This allows for analysis of both coarse and fine network structures without pre‑fixing the number of levels.
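The nesting of levels can be illustrated with a few lines of code: a node's block at level $l$ is obtained by following the block maps from level 0 upward. The list‑of‑lists representation and the function name below are assumptions made for this sketch, and the six‑node hierarchy is hypothetical.

```python
def membership_at_level(nested_blocks, level):
    """Return each node's block at the given level of a nested partition.

    nested_blocks[0] maps nodes to level-0 blocks; for l > 0,
    nested_blocks[l] maps level-(l-1) blocks to level-l blocks.
    """
    membership = list(nested_blocks[0])
    for block_map in nested_blocks[1:level + 1]:
        # A block on the lower level acts as a node on the next level.
        membership = [block_map[b] for b in membership]
    return membership


# Hypothetical hierarchy: 6 nodes -> 4 blocks -> 2 blocks.
nested_blocks = [
    [0, 0, 1, 2, 3, 3],  # b^0: node -> level-0 block
    [0, 0, 1, 1],        # b^1: level-0 block -> level-1 block
]
level0 = membership_at_level(nested_blocks, 0)
level1 = membership_at_level(nested_blocks, 1)
```

Here `level1` groups the first three nodes into one coarse block and the last three into another, while `level0` retains the finer four‑block partition.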

4.3 Inference procedure

SBMs assume that a specific node partition $b$ underlies the particular layout of a complex network $A$ . This corresponds to the generative model $P (A | b)$ . To detect communities, we want to infer which partition generated an observed network. We can obtain this by applying Bayes' rule to get $P (b | A)$ (Peixoto, 2017).

To evaluate the fit of a model, we use its description length (Grunwald, 2007), which represents the amount of information needed to encode the observed data together with the model parameters and is defined by:

$$ Σ = - \log_{2} P (A, b) . \tag{2} $$
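Combined with Bayes' rule, this definition ties the description length directly to the posterior over partitions. Writing out the posterior and substituting $P (A, b) = 2^{-Σ}$ gives:

$$ P (b | A) = \frac{P (A | b) P (b)}{P (A)} = \frac{P (A, b)}{P (A)} = \frac{2^{-Σ}}{P (A)} $$

Here, $P (A) = \sum_{b} P (A, b)$ does not depend on the partition $b$ and is therefore a constant for a fixed network.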

The partition of a network that maximizes the posterior distribution $P (b | A)$ simultaneously minimizes its description length $Σ$ . In practice, we can use the Markov chain Monte Carlo (MCMC) method to sample from the posterior distribution or to find its maximum (the maximum a posteriori estimate), but, for performance reasons, we use the implementation of the Python library graph‑tool for all inferences of the various SBM variants (Peixoto, 2014a), which uses an optimized heuristic (Peixoto, 2014b, pp. 7–8).^⁵ With this implementation, it is also possible to constrain the algorithm to respect the pre‑assignment of the two different node types. Because graph‑tool's agglomerative heuristic can converge to local minima (Peixoto, 2014b), we perform 10 runs and retain the partition with the lowest $Σ$ as a point estimate.

4.4 Quantitative model comparison

Two models can be compared using the description length $Σ$ , which is defined by the total number of bits needed to encode the adjacency matrix $A$ under model $M$ and partition $b$ . A consistent approach for this comparison is the posterior odds ratio $Λ$ (Peixoto, 2019). Suppose we have two different SBMs, $M_{1}$ and $M_{2}$ , with two distinct partitions $b_{1}$ and $b_{2}$ . We can now compare the two models using the posterior odds ratio that can be characterized by

$$ Λ = \frac{P (b_{1}, M_{1} | A)}{P (b_{2}, M_{2} | A)} = \frac{P (A | b_{1}, M_{1}) P (b_{1}) P (M_{1})}{P (A | b_{2}, M_{2}) P (b_{2}) P (M_{2})} . \tag{3} $$

As we have no prior preference of one model over the other, we set $P (M_{1}) = P (M_{2})$ , which yields

$$ Λ = \frac{P (A | b_{1}, M_{1}) P (b_{1})}{P (A | b_{2}, M_{2}) P (b_{2})} = 2^{- (Σ_{M_{1}} - Σ_{M_{2}})} . \tag{4} $$

If $Λ > 1$ , $M_{1}$ is $Λ$ times more plausible as an explanation for the data than $M_{2}$ . In contrast, a value of $Λ$ close to 1 should not be taken as grounds for rejecting model $M_{2}$ .
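Because description lengths of large networks differ by many bits, $Λ$ itself quickly exceeds floating‑point range, so it is convenient to work with $\log_{10} Λ$ directly. The sketch below does this for two description lengths; the function name and the numeric values are hypothetical and are not results from this study.

```python
import math


def log10_posterior_odds(sigma_1, sigma_2):
    """Return log10 of Λ = 2 ** -(Σ_1 - Σ_2), with both Σ given in bits."""
    return -(sigma_1 - sigma_2) * math.log10(2)


# Hypothetical description lengths in bits (not values from this study).
sigma_a, sigma_b = 100_000.0, 100_050.0
log_odds = log10_posterior_odds(sigma_a, sigma_b)
```

A positive value favors the first model. In this example, a difference of 50 bits already corresponds to odds of roughly $10^{15}$ in favor of the model with the shorter description length.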

4.5 Visualizing hierarchical blocks on a map

The result of block inference is a partition of the nodes, or, in the case of nested block models, multiple partitions across different levels. We aim to track the distribution of manuscripts that share similar tropes across geographic regions. To focus on manuscripts, we project our bipartite network onto the manuscript layer so that each manuscript inherits connections based on shared tropes. When using nested (hierarchical) block models, we repeat this projection at each level of the hierarchy. That way, we can compare how the grouping of manuscripts changes from coarse to fine partitions. Each manuscript is plotted on a map at its true location based on the latitude/longitude of its place of origin. On the map, each point (manuscript) is colored according to its block membership. This reveals spatial clustering of similar‑trope manuscripts. However, as the number of blocks grows, regional patterns for each block become harder to discern by visual inspection alone.

To streamline the visualization, we designate the top hierarchy level ( $b^{L}$ ) as the reference point. Its blocks—and all blocks derived from them—are color‑coded with the palette assigned to that level. For better visibility of the block structure, we outline the blocks of every level as polygons, each of which traces the concave hull enclosing all points belonging to the respective block. This makes it possible to visualize the emerging block subdivision of the blocks $b^{L - 1}, \dots, b^{0}$ .
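The projection step described at the start of this section can be sketched as follows, again using the two rows of Table 1. Weighting each manuscript pair by the number of shared trope elements is an assumption made for this illustration, since the text does not specify whether the projection is weighted; the function name is likewise hypothetical.

```python
from itertools import combinations


def project_onto_manuscripts(occurrences):
    """One-mode projection onto manuscripts.

    Returns a dict mapping each manuscript pair that shares at least one
    trope element to the number of shared elements.
    """
    weights = {}
    for m1, m2 in combinations(sorted(occurrences), 2):
        shared = set(occurrences[m1]) & set(occurrences[m2])
        if shared:
            weights[(m1, m2)] = len(shared)
    return weights


# Trope element IDs for Puer natus est, taken from Table 1.
occurrences = {
    "Ba 5": [25, 31, 32, 35, 18, 19, 36, 14, 1, 2, 3, 4, 5, 34, 37, 33],
    "SG 376": [25, 2, 3, 4, 11, 12, 30, 13, 1, 7, 8, 9, 10, 5, 26],
}
projection = project_onto_manuscripts(occurrences)
```

For this chant, Ba 5 and SG 376 are connected through the six trope elements they have in common (IDs 1, 2, 3, 4, 5, and 25).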

4.6 Evaluation of geographic correspondence

To evaluate the geographical correspondence between inferred node blocks and known manuscript origins not only visually but also quantitatively, we apply the same model comparison approach as described in Section 4.4. Based on the relevant literature, we have two predefined regional groupings of manuscripts and aim to assess how well these explain the data compared to the partition inferred through our modeling procedure.

To do this, we initialize the network with a fixed manuscript partition $b^{*}$ and optimize only the trope element blocks by minimizing the description length. If the geographically defined models result in a description length that is the same as, or only marginally larger than, that of the freely inferred model, this would indicate that regional divisions alone can explain the distribution of trope elements. In contrast, a freely inferred model may additionally capture nongeographical transmission patterns, such as those shaped by institutional ties across distant locations. Description length measures how well the model explains the data in relation to its complexity. This makes it a fair basis for comparing a standard SBM with a fixed manuscript partition against an N‑SBM, since the standard SBM would, in principle, be capable of achieving the same description length if the entire structure of the data could indeed be perfectly explained by regional factors alone.

5 Results

5.1 Descriptive network measures

Figure 2 illustrates both how frequently trope elements occur in individual manuscripts and how widely each trope element is distributed across the manuscript corpus. It shows that, on average, a manuscript includes 112 trope elements, and a trope element is included in approximately four manuscripts. Meanwhile, most manuscripts and trope elements have a very low degree of inclusion.

Degree distributions of manuscript nodes **(top)** and trope element nodes **(bottom)**. A base‑10 logarithmic scale is used for both y‑axes. On average, a manuscript node contains around 112 trope elements, and a trope element is contained in around four manuscripts.
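The two averages follow directly from the counts reported in Section 3.2, as the short calculation below shows. The variable names are illustrative.

```python
# Counts reported in Section 3.2.
n_edges = 18_239
n_manuscripts = 163
n_trope_elements = 4_407

# In a bipartite graph every edge has exactly one endpoint on each side,
# so the mean degree of a side is the edge count divided by that side's size.
mean_manuscript_degree = n_edges / n_manuscripts   # about 111.9
mean_trope_degree = n_edges / n_trope_elements     # about 4.1
```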

Table 2 shows the 10 most‑connected manuscripts of the dataset (highest node degrees). Most of them are Aquitanian or Northwestern manuscripts that have been well‑studied in previous literature. For example, Pa 1118 is a prominent manuscript in studies of the Aquitanian repertoire (Hiley, 2017; Hughes, 2017; Weiss, 1964).

Table 2

The 10 most‑connected manuscripts in the dataset. The degree of a manuscript represents the overall number of trope elements included in that manuscript.

#	Manuscript	Degree
1	Pa 1118	784
2	Apt 17	727
3	Pa 909	688
4	Pa 1871	613
5	Pa 887	610
6	Pa 1120	603
7	Pa 1119	595
8	Lo 14	556
9	Pa 1121	534
10	Ox 775	506

5.2 Selecting the best model

We inferred the parameters for four different SBM versions and compared the fit between the model and data using the minimum description length criterion (see Section 4.4). The models are the standard SBM and its degree‑corrected version (DC‑SBM), as well as the N‑SBM and its degree‑corrected version (NDC‑SBM). Since the inference is a probabilistic process, we run it 10 times for each model and use the partition with the smallest description length. Each run consists of a multiflip MCMC scheme with 3,000 iterations at inverse temperature $β = 1$ , followed by 10,000 iterations with $β = \infty$ .

Figure 3 compares the models by their description lengths, expressed as the posterior odds ratio $Λ$ . Because $Λ$ can become vanishingly small, we plot $\log_{10} Λ$ and fix the scale so that the best run has $Λ_{best} = 1$ . In every run, the NDC‑SBM attains the lowest minimum description length, making it the clear choice.

Posterior odds ratios represented as $\log_{10} Λ$ . Colored points represent 10 independent runs per model, normalized so that the best‑performing run has $Λ = 1$ , i.e., $\log_{10} Λ = 0$ (right). All other points show each alternative model's odds ratio relative to this winner.

5.3 Inferred manuscript partition

The parameters inferred by the model with the lowest description length yield a hierarchical block structure with three levels ( $L = 2$ ). Table 3 gives the number of blocks at each level for the network projections onto the manuscript and trope–element layers. Since the algorithm explicitly models a bipartite network, each block contains only one node type (either manuscripts or trope elements), allowing us to classify blocks into those two categories. Given this constraint, the algorithm identifies three nontrivial blocks.^⁶ The first split, corresponding to level $b^{2}$ , yields four trope–element sub‑blocks and four manuscript sub‑blocks. At the deeper levels, the relative sizes of the two node types remain similar.

Table 3

Number of inferred blocks in the NDC‑SBM at each hierarchical level, reported for its manuscript and trope–element projections.

	$b^{0}$	$b^{1}$	$b^{2}$
Manuscripts	39	11	4
Trope elements	34	13	4

Since our analysis focuses on the manuscript constellations, we display the manuscript‑relevant blocks as a dendrogram in Figure 4, where the structure of the hierarchical blocks and the grouping of the manuscripts become clear. The four branches of block $b^{2}$ are shown in four distinct colors that persist through the lower levels. The grey circles denote the blocks $b^{0}$ – $b^{2}$ and mark the manuscript nodes $M$ .

Dendrogram of the inferred hierarchical blocks of the manuscript side, with terminals representing manuscripts. The shaded circles represent the manuscript level ( $M$ ) as well as community assignments on increasing levels ( $b^{0}, b^{1}, b^{2}$ ). Branch colors represent the block assignment $b_{i}^{2}$ of the highest block level for manuscripts.

When two manuscripts belong to the same block at level $b^{0}$ , this is reflected by their branches joining directly at the boundary of that block. For example, Ox 27 and Col 41 (both at the top of the circle) lie in the same block at level 0. In contrast, Col 41 and Pro 12 are maximally separated, since the branches leading to them only diverge at the top level, 2—an arrangement also indicated by their different colors.

5.4 Comparison of manuscript affiliations with earlier studies

Previous research has shown that geographical groupings can be identified by comparing trope repertoires through content analysis alone. Building on this, we examine the blocks of manuscript nodes inferred from the data. To assess whether the manuscript groupings align with the results of existing literature, we compare our results to Hiley's groupings (Hiley, 2017), which were derived through semi‑quantitative methods, albeit using only a subset of the manuscripts analyzed in our study ( $58$ out of $163$ manuscripts).

Significant overlaps can be observed between the two computed partitions. At the lowest level of granularity ( $b^{0}$ for our partition, highest correspondence for Hiley), the manuscripts from Winchester (Cdg 473, Lo 14) and Nevers (Pa 1235, Pa 9449), as well as those from St. Gall (SG 376, SG 378, SG 380, SG 382) and Vercelli (Vce 56, Vce 161, Vce 162), each form discrete clusters, mirroring the configuration reported by Hiley. One Aquitanian cluster, however, diverges in its composition at this same hierarchical depth. Whereas Hiley grouped Pa 779 and Pa 1118 together with a concordance exceeding 96%, the SBM inference assigns these manuscripts to separate clusters: Pa 779 aligns with Pa 1083b and Pa 1084b, while Pa 1118 is incorporated into a cluster with Apt 17. Although Hiley also records links among the members of this latter assemblage, those links appear less salient. The Monza manuscripts (Mza 75, Mza 76) are likewise split across clusters. Some of the assignments diverge sharply, for example that of Pa 903. Hughes (2017) treats this manuscript as an isolated witness, identifying its closest affinities with Pa 1240, Pa 779, Pa 887, and Pa 1084b. These affinities are no longer visible in this partition, as none of the manuscripts named by Hughes are closely related to one another. The nearest relatives are Pa 1871 and Vic 105. Indeed, Pa 1240 appears in a different $b^{2}$ block (green). This difference can be explained by the fact that none of the manuscripts grouped with Pa 1240 in our results were included in the earlier analyses, and the divergence likely arises from the larger amount of data now considered. Hughes's classification and, to a lesser extent, Hiley's concur in separating the two largest and most prototypical witnesses, Pa 1240 and Pa 1118, into distinct blocks; the two authors diverge, however, in the allocation of the remaining manuscripts.

Björkvall and Haug (2017) describe an ensemble of manuscripts originating from St. Gall and its immediate sphere of influence. In the present hierarchy, this corpus emerges as a $b^{1}$ block that is further partitioned at $b^{0}$ into four subdivisions: (i) a stratum containing the earliest witnesses (SG 484, SG 381); (ii) a stratum comprising direct successors from St. Gall, together with a troper copied at Rheinau (SG 376, SG 378, SG 380, SG 382, Zü 97); (iii) a manuscript produced at St. Gall for Minden that likewise affords indirect insight into the repertoire (Be 11), which appears as a singleton cluster; and (iv) Wi 1609, which is grouped with Ba 6, Vat 833, Wi 1845, Ka 55, and Ka 25.
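The nesting described here can be written down as a chain of lookups: each manuscript maps to a $b^{0}$ block, and each $b^{0}$ block maps to a $b^{1}$ block. The Python sketch below encodes only the four subdivisions listed above, under the assumption that all of them belong to the same $b^{1}$ block. The integer labels and the names `b0_of`, `b1_of_b0`, and `members_by_block` are hypothetical and serve illustration only.

```python
from collections import defaultdict

# Manuscript -> b^0 block (labels 0..3 are arbitrary placeholders).
b0_of = {
    "SG 484": 0, "SG 381": 0,
    "SG 376": 1, "SG 378": 1, "SG 380": 1, "SG 382": 1, "Zü 97": 1,
    "Be 11": 2,
    "Wi 1609": 3, "Ba 6": 3, "Vat 833": 3, "Wi 1845": 3, "Ka 55": 3, "Ka 25": 3,
}

# b^0 block -> b^1 block: all four subdivisions share one parent block.
b1_of_b0 = {0: 0, 1: 0, 2: 0, 3: 0}


def members_by_block(assignment):
    """Invert a manuscript -> block mapping into block -> list of manuscripts."""
    groups = defaultdict(list)
    for manuscript, block in assignment.items():
        groups[block].append(manuscript)
    return dict(groups)


# Compose the two levels to obtain each manuscript's b^1 block.
b1_of = {manuscript: b1_of_b0[block] for manuscript, block in b0_of.items()}

b0_groups = members_by_block(b0_of)
b1_groups = members_by_block(b1_of)
```

Composing the mappings in this way makes it easy to move between levels, for instance to list all manuscripts of a $b^{1}$ block regardless of their $b^{0}$ subdivision.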

5.5 Alignment of manuscript affiliations with broader geographic regions

We apply the visualization technique described in Section 4.6 to the best inferred DC‑NSBM. Figure 5 presents the spatial distribution of the blocks on the map, arranged in three subfigures, one for each of the three hierarchical levels.

Figure 5: Geographical distribution of the inferred partitions at $b^{2}$, $b^{1}$, and $b^{0}$. Each manuscript's provenance is indicated by a point colored by its $b^{2}$ group; each block at a given level is outlined by the concave hull of its member manuscripts.

At $b^{2}$ , the highest tier in the manuscript‑node hierarchy, four clear clusters emerge: a central cluster (green), a West‑Frankish cluster (red), an East‑Frankish cluster (blue), and a middle cluster (orange) that stretches from Italy to Great Britain.

As we descend through the hierarchy, this picture gains nuance. At $b^{0}$, the finest‑grained layer, the overall pattern persists, but the green cluster changes markedly: it divides into two subtypes. One subtype links northern Italy with the Western manuscripts, while the other bridges the Eastern and Western groups. Notably, there is no subgroup within the green cluster that connects Italy directly to East‑Frankish territory. A similar picture emerges for the red group, which is made up chiefly of West‑Frankish manuscripts and reaches as far as Britain. It further subdivides into small, distinctly regional clusters. The orange cluster also appears to differentiate primarily on a regional scale. In the blue block, by contrast, the clusters still overlap noticeably, even at $b^{0}$, and spread across a much wider area. Here, local distribution is therefore less decisive for the formation of clusters; instead, several supra‑regional repertoires overlap within it.

5.6 Evaluating geographic correspondence

Using the procedure outlined in Section 4.6, we construct a model in which the manuscript blocks are fixed according to the geographical groupings commonly employed in the scholarly literature. We then ask how well this partition actually explains the observed repertoire differences, compared with the SBMs estimated earlier, which infer blocks solely from the data and impose no geographic constraints. The result is $\log_{10} \Lambda = -1387$, a value markedly worse than that obtained even with the standard SBM. This means that even the poorest‑performing inferred model, which allows for groupings beyond the conventionally known geographic ones, provides a substantially better explanation of our data.
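To give a sense of the magnitude of such a value, the following Python sketch converts between description lengths and the logarithm of the posterior odds ratio. It assumes that description lengths are measured in nats, that both models receive equal prior weight, and that $\Lambda$ is the ratio of the tested model to a reference model, so that negative values indicate a worse fit of the tested model. The function names and the example description lengths are hypothetical; only the value $-1387$ is taken from the text.

```python
import math


def log10_posterior_odds(dl_model, dl_reference):
    """Return log10(Lambda) for Lambda = exp(-(dl_model - dl_reference)).

    Assumes description lengths in nats and equal prior weight on both models.
    """
    return -(dl_model - dl_reference) / math.log(10)


def nats_gap(log10_odds):
    """Description-length difference (in nats) implied by a log10 odds value."""
    return -log10_odds * math.log(10)


# Hypothetical description lengths, for illustration only.
dl_reference = 1000.0
dl_constrained = dl_reference + 10 * math.log(10)
example = log10_posterior_odds(dl_constrained, dl_reference)  # close to -10

# Gap implied by the value reported in the text (roughly 3.19e3 nats).
reported_gap = nats_gap(-1387)
```

Under these assumptions, the reported value corresponds to a description length that is roughly 3,190 nats longer for the geographically fixed partition than for the reference model.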

6 Discussion

The block structure that best explains the observed network aligns strikingly with regions that have long been considered pivotal in the transmission of medieval chants. In broad terms, scholars usually distinguish an East‑Frankish, a West‑Frankish, and an Italian tradition (Huglo, 1999). Our results also reveal a diagonal link that joins Italian manuscripts to Great Britain, an association hinted at in earlier studies. Those studies point to the so‑called Lotharingian Axis (Kruckenberg, 2006), the swath of territory granted to King Lothar I after the division of the Carolingian Empire in 843 CE, which extends from the North Sea to Italy. The four blocks we identify at $b^{2}$ therefore suggest that this political partition influenced the shape of the chant repertoires. This finding may even enable more precise estimates of the chronology of trope element development (Kruckenberg, 2006).

A significant portion of the network structure can be attributed to the geographical origins of the manuscripts. Yet, a substantial number of cases cannot be accounted for in this way. These instances become apparent in the differences between the block assignments and the geographical partitions.

A key assumption in our analysis is that trope elements are accurately identified in the CT editions. This identification, however, rests solely on textual features and does not consider melodic aspects. Consequently, a single trope element, defined by its text alone, may be transmitted with multiple distinct melodies. If melodic variation were taken into account, such cases might be treated as several separate trope elements and potentially assigned to different blocks. In the present dataset, all textually identical instances are treated as the same trope element, regardless of melodic differences. This limitation is particularly relevant in light of detailed studies on the interregional exchange and adaptation of tropes, which show that transmission processes often introduced both textual and melodic variations (Planchart, 1988).

Furthermore, the construction of the network deliberately omits the sequential order of trope elements within the manuscripts. This is one of the significant methodological constraints of our approach: at present, there is no straightforward way to represent sequential data within the chosen network model. With respect to the affiliations themselves, the bipartite structure of the network has an advantage over a projection (as in Hiley, 2017): all relevant entities are explicitly modeled, so no information about which manuscript contains which trope element is lost. Moreover, it allows for future analyses of the trope elements themselves.
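The information lost in a projection can be illustrated with a small Python sketch. It uses one common form of projection, in which two manuscripts are linked with a weight equal to the number of trope elements they share; whether this matches the projection used in earlier studies is not asserted here. The manuscript names `M1` to `M3`, the element names `t1` to `t3`, and the function `project` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def project(affiliations):
    """Project manuscript -> set-of-elements data onto manuscript pairs.

    Returns a Counter mapping each manuscript pair to the number of shared
    elements (pairs with no shared element are omitted).
    """
    weights = Counter()
    for a, b in combinations(sorted(affiliations), 2):
        shared = len(affiliations[a] & affiliations[b])
        if shared:
            weights[(a, b)] = shared
    return weights


# Case 1: a single element transmitted in all three manuscripts.
one_common_element = {"M1": {"t1"}, "M2": {"t1"}, "M3": {"t1"}}

# Case 2: three different elements, each shared by exactly one pair.
three_pairwise_elements = {
    "M1": {"t1", "t2"},
    "M2": {"t1", "t3"},
    "M3": {"t2", "t3"},
}

# Both bipartite datasets collapse to the same weighted projection.
same_projection = project(one_common_element) == project(three_pairwise_elements)
```

The two datasets describe different transmission situations, yet their projections are identical; only the bipartite representation keeps them apart.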

Another key aspect of our method is the use of hierarchical models, which permit partitions at multiple levels. Each level can capture a different historical process. In our case, the top level reveals a subdivision into historically significant political regions. The lower levels further refine these overarching divisions, uncovering additional processes that cannot be explained purely by geographic distance. Consequently, some blocks at the lower levels overlap geographically (as in the blue block).

The method we use is grounded in information‑theoretic criteria such as the minimum description length and the posterior odds ratio. A common criticism in the humanities is that computational results are presented as inherently objective, so that conclusions are unduly legitimized by invoking a supposed neutrality of the computer (Bishop, 2018; Huck, 2015). This, however, misconstrues the real strengths of computational methods: in our study, we employ models to describe musicological data and compare them using the minimum description length criterion. The result is objective in the sense that, assuming the heuristic algorithm truly finds the global optimum, it identifies the model that best compresses the data. By this measure, our method outperforms traditional approaches. Everything that follows is rooted in musicological interpretation and the search for coherent explanations. We illustrate this by comparing our findings with previous results and by invoking the theory of the political subdivisions of the Carolingian Empire. This exercise is neither fully objective nor definitive. However, because the abstraction of the data into four blocks, with a subsequent hierarchical partitioning, yields a far better description in information‑theoretic terms, it is justified to incorporate these findings into further musicological inquiry.

7 Conclusion

Our study employed an inferential community detection approach to analyze medieval chant repertoire and its geographical transmission. The resulting manuscript blocks broadly confirm previous findings from content‑based analyses while also revealing new structural patterns. Notably, a three‑part division (plus one pan‑regional block) emerges at higher hierarchical levels, aligning with the territorial fragmentation following the Treaty of Verdun (843 CE). The comparison with geographically predefined models reveals that regional origin accounts for a significant portion of the observed structure, although not all of it.

This raises questions for follow‑up studies: To what extent are other historical processes, such as institutional affiliations, trade, and pilgrimage routes, involved in the formation of blocks at different levels? Our results suggest that both geographic and nongeographic factors influenced the transmission of trope elements, demonstrating the potential of statistical network models for the study of historical musicology. Further studies should also take the sequence of trope elements into account, incorporate the actual chant text and melody variants in the model, and try to include temporal data on the manuscripts.

The method presented here is not limited to our specific application; it can be applied to any dataset exhibiting affiliation relationships. For example, one could use the same technique to explore connections among music albums via their contributing musicians or to measure the similarity of hip‑hop artists based on the samples they use, as has already been done for electronic music (Youngblood et al., 2021). This illustrates the method's relevance to the broader Music Information Retrieval community.

8 Reproducibility

The complete code for reproducing the results is available online at https://doi.org/10.17605/OSF.IO/UTS8F.

Acknowledgment

We would like to thank Jan Hajič Jr. for his ongoing support.

Competing Interests

The authors have no competing interests to declare.

Authors’ Contributions

Tim Eipert was the primary contributor and lead author of this article. Fabian C. Moss supervised the work and contributed to the writing. Both authors were actively involved in developing the conceptual framework and preparing the final manuscript.

Notes

[1] See, for example, Cantus Index (https://www.cantusindex.org/) and the Cantus Analysis Tool (https://www.cantusindex.org/analyse) or Corpus Monodicum Online (https://corpus-monodicum.de/).

[2] Unlike other chant genres, the trope is never sung without the primary chant.

[3] The blocks represent our assumption that the manuscripts were generated by different chant communities. Note that these communities need not be regional.

[4] A more detailed description of the derivation of this formula is given in Peixoto (2019).

[5] graph‑tool is released as free software under the LGPLv3 license (https://www.gnu.org/licenses/lgpl-3.0.de.html).

[6] Here, we call a block “nontrivial” if it has more than the two subdivisions (trope elements and manuscripts) that the parametrization of the algorithm enforces.