1. Introduction
Archaeological data are generally ambiguous (admitting several interpretations), partial, imprecise (lack of detail in the measurement of a variable), uncertain (giving rise to doubts) or inaccurate (far from the real value). Various terms have been used in the literature to describe these data-related factors; in our case, we use the term ‘imperfection’. In the field of Computer Science, imperfection in data — an umbrella term encompassing imprecision, uncertainty, and vagueness (Smets 1999; Jousselme et al. 2006) — is recognised as an inherent characteristic arising from both technical limitations and the nature of the phenomena being represented. Addressing this issue is essential for producing more rigorous and transparent interpretations of results (Wylie 2015). Although the term imperfection may initially evoke the idea of a ‘lack of perfection’, this is not the intended meaning in most scholarly contexts. In fact, the specialised literature reveals that it is among the most frequently used descriptors, covering not only the types of data imperfections noted above (Achich 2019:305; Achich et al. 2021) but also phenomena such as incoherence or redundancy, depending on the author. These imperfections, whether intrinsic to the data or introduced during processing, can have a direct impact on the reliability and validity of any subsequent analyses.
Challenges multiply when working with legacy data. Firstly, archaeological data are often collected over long periods of time (weeks, months or even years) and at different times and sites, which can cause a fragmentation of information, making it harder both to compile studies and to analyse the data coherently. In this sense, one of the most problematic issues in the discipline is the management of imperfect dating. The representation of time and of the temporal ranges that may exist for the same phase due to the absence of precise data, as well as the differences in the chronological attribution of the different cultural labels assigned to the sites, can lead to working with chronological data of very different resolutions (Achich et al. 2021). Moreover, humans “handle temporal information using certain temporal notions like time intervals or time points, and they often have to deal with imperfections like imprecision, vagueness, uncertainties or inconsistencies possibly contained in the descriptions of these temporal notions” (Pons et al. 2014:191).
Something quite similar happens with the interpretations of the site functionalities attributed to the different phases of occupation or frequentation that we find in a site: the same corpus of materials can lead to different interpretations. In this study, we address these challenges by proposing a set of guidelines for the fuzzification and assignment of membership functions to variables related to chronology and functionality in archaeological sites, illustrated through real case studies. We also present a fuzzy metrics framework, based on the FuzzySQL language, which applies existing metrics and develops new ones to calculate the uncertainty contained in each variable and site, enabling comparisons between archaeological sites or between different interventions at the same site. The main objective is to provide a lightweight, replicable, and adaptable computational framework for managing and quantifying imperfection in legacy archaeological data. By applying this framework to two case studies drawn from both published sources and our own fieldwork, we aim to assess its effectiveness in identifying strengths and weaknesses in datasets, improving the representation of uncertainty, and offering openly available tools for reuse in similar archaeological contexts.
2. Different ways of managing imperfection
Over the last decades, the management of the imperfection of archaeological data has become increasingly important in research; some authors have tried to apply it in practical ways in their projects (De Runz 2008; De Runz et al. 2011; Fusco 2016; Tobalina-Pulido and González-Pérez 2020); others have explored this at a more theoretical level (Sánchez Trigueros 2013). Especially since the 1990s, interesting reflections have been made in Anglo-Saxon (Huggett 2000; 2020) and French contexts (Desjardin, Lefebvre and de Runz 2015), but the Spanish bibliography on this topic is not very extensive. Certainly, it seems that the problems of ambiguity, partiality, incompleteness and vagueness of dating have been the main focus of research in this sense, followed by proposals focused on improving the classification of certain archaeological items (ceramics, skeletal remains, lithics, etc.). In general, three approaches have been used to tackle this problem: Bayesian statistical procedures (Crema et al. 2014; Vos et al. 2021; Crema 2024), the aoristic method (Willet 2014; Bevan et al. 2013) and fuzzy logic (Barceló 1996; Taheri, Ghadim and Kabirian 2019; Tirpáková et al. 2021).
Firstly, one of the most widely used methods is based on Bayesian statistical procedures. In this approach, observations are used to infer the probability that a hypothesis is true; in other words, it quantifies the degree of uncertainty about whether an event may or may not occur. The method is recurrently used for the calibration of radiocarbon dates (Prada Domínguez 2015; Bevan and Crema 2021; Crema and Bevan 2021), with a standard deviation σ reflecting the uncertainty in the process of determining the dating of the sample. Its use has prompted a reappraisal of stratigraphy as a source of chronological information for historical interpretation (Maestre, Padilla and Layrón 2014:52). Calibration of radiocarbon dating using this procedure is not yet widely applied, but it allows better estimates of the chronology of some archaeological contexts (Maestre, Padilla and Layrón 2014:52; Wescott 2005:56–61).
Secondly, the aoristic method is a technique that enables the quantification of temporal uncertainties and their incorporation into subsequent analyses. This approach assigns a probabilistic linear value within a range of values. It has been applied in archaeology principally as a way of statistically representing the temporal vagueness of archaeological artefacts. Willet (2014) applied it to the quantification of ceramic materials to confirm several hypotheses about the distribution patterns of Eastern sigillata ceramics and to validate the use of probabilistic methods in the formulation of models for subsequent archaeological interpretation. Building on this work, C. Cáceres Puerto and J. García Sánchez (2019) proposed applying it to model the use of burial areas, combining stratigraphic and typological dating of archaeological contexts with probabilistic and cartographic representation methods that make it possible to account for the vagueness of ceramic dating by chrono-typology (Cáceres-Puerto and García Sánchez 2019:60). The authors assign a probabilistic linear value within a specific date range. For example, an artefact chronotypologically dated between 100 BC and 100 AD will have a probabilistic distribution in ranges of 25 years (Cáceres-Puerto and García Sánchez 2019:60).
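The aoristic weighting just described can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the 25-year bin size follows the example in the text, while the function name and the year convention (negative values for BC) are our own assumptions.

```python
# Illustrative sketch of aoristic weighting: an artefact dated between
# 100 BC and AD 100 receives a uniform probability distribution over
# 25-year bins. Bin size and the negative-years-for-BC convention are
# illustrative choices, not part of the cited method's implementation.

def aoristic_weights(start: int, end: int, bin_size: int = 25) -> dict:
    """Distribute a total probability of 1 uniformly over bins of bin_size years."""
    bins = [(y, y + bin_size) for y in range(start, end, bin_size)]
    weight = 1 / len(bins)
    return {b: weight for b in bins}

weights = aoristic_weights(-100, 100)  # 100 BC to AD 100
print(len(weights))                    # 8 bins of 25 years
print(weights[(-100, -75)])            # 0.125 per bin
```

Summing the weights of the bins falling inside any period of interest then yields the aoristic sum for that period.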
Finally, fuzzy logic is the most widely used approach across disciplines because it enables the use of non-Boolean labels, which is essential in humanistic fields when working with qualitative data and highly linguistic content (Niccolucci and Hermon 2010; Owens and Coppola 2009; Martin-Rodilla and Gonzalez-Perez 2019; Martin-Rodilla, Pereira-Fariña and Gonzalez-Perez 2019). In research concerned with the management of data imperfection, fuzzy logic has been the most widely used approach in archaeology, both theoretically and practically, as the following examples show. S.M. Taheri, F.I. Ghadim and M. Kabirian (2019) consider three premises for the use of this approach in archaeology: a) when the data collected are imprecise, b) when the relationships between variables are imprecise and c) when there are differences of interpretation/opinion between specialists. In the 1990s, J. Barceló proposed the application of fuzzy logic to the study of Phoenician ceramics in his Pygmalion project (Barceló 1996). He proposed classifying pottery dated between 800 and 550 BC using a prototype “Fuzzy Cognitive Map”. This allows the classification of ceramics by assigning them to one or more categories, indicating the possibility of belonging to one type of pottery or another (Barceló 1996). The author’s final reflection is quite significant: at the end of the whole process, it is the archaeologist who must decide to which cluster the pottery under analysis belongs. The view of the expert is important in this type of analysis in the Humanities, both to establish the criteria of the fuzzy sets and to validate the framework.
S. Hermon and F. Niccolucci (2002:227–229) also applied fuzzy logic theory to materials, specifically to the classification of 50 lithic items from a protohistoric site in Israel. Their approach consists of a fuzzy coefficient assignment: five experts analysed the pieces, each assigning a number from 0 to 1 based on their confidence in the classification. While theoretically appealing, this approach is complex to implement, as it requires multiple specialists, making the study more difficult and time-consuming; it is also not suitable for all archaeological projects due to economic and time constraints. Using a broadly analogous approach, S.M. Taheri, F.I. Ghadim and M. Kabirian (2019) applied fuzzy logic to sex determination in human bones, since a poor state of preservation often raises doubts about the attribution of individuals to one sex or the other. They establish two membership functions, one for female and one for male subjects, with different variables that include bone samples in either set based on bone dimensions, pelvis width, etc. This allows the determination of the degree of membership by assessing the possibility of the subject being female or male, possibly female/male, indeterminate, or unobservable (Taheri, Ghadim and Kabirian 2019). Some other projects have used possibility estimation for bone measurement (e.g., Osteoware; Smithsonian National Museum of Natural History 2021), but without the linguistic tags for sex determination proposed by Taheri, Ghadim and Kabirian.
Fuzzy logic has been explored in various archaeological applications, particularly in relation to the uncertainty of categorical and spatial data (e.g., De Runz and Desjardin 2010; Figuera 2021; Zoghlami et al. 2012). However, its application to chronological reasoning remains comparatively limited. Existing studies often focus on case-specific implementations or conceptual proposals (e.g., Castiello 2023; Sifniotis 2012), and few offer standardized, transferable metrics or methodological frameworks that can be easily reused across projects. This suggests a potential for further development and generalization in the use of fuzzy logic for managing temporal uncertainty in archaeology. With studies such as Barceló (1996), Hermon and Niccolucci (2002), and Taheri, Ghadim and Kabirian (2019) as illustrative examples, we highlight a broader trend in archaeological research toward addressing uncertainty through formal and computational approaches. While these references do not represent the full scope of existing work, they exemplify key lines of inquiry that our study seeks to expand upon. Our intention is not to offer an exhaustive review, but rather to situate our proposal within this ongoing conversation. In this context, our central research question is: how can we quantify and manage the uncertainty — particularly in chronology and site functionality — that is inherited from legacy archaeological data? To address this, we present a computational framework based on fuzzy logic that proposes a set of measurable, reusable metrics for modelling imperfection in legacy datasets. Specifically, we define and apply fuzzy operators such as CDEG (Comprehensive Degree of Equivalence) and FEQ (Factor of Equivalence) to assess the degree of uncertainty in dating and functionality assignments across archaeological sites. 
We propose a framework adapted to the management of imperfect archaeological dating and site functionality data that solves some of the problems not considered by other approaches, such as the specialist’s view of data categorization (see also Borck et al. 2020, on the impact of diverse expert perspectives in ceramics classification). This complements our earlier CHR2024 proposal (Martín-Rodilla and Tobalina-Pulido 2024), which addressed annotator expertise at the micro-level during the value assignment process, whereas the present framework operates at the dataset level to provide general uncertainty metrics. Moreover, it also allows archaeologists themselves to improve the representation of imperfection and to be aware of these data imperfections.
We take two real case studies of sites from the bibliography and our own archaeological fieldwork, expressing the functionalities and chronologies using fuzzy logic principles. Then, using the fuzzy operators, the degree of uncertainty in the data is assessed. This will show the weaknesses and strengths of our data and will make the results of the subsequent analysis as honest as possible. Through the comparison of both sites, we discuss the usefulness of the proposed framework for measuring the uncertainty injected into archaeological data in cases of legacy data, detecting possible points for future improvement of the model. Finally, a series of guidelines is proposed for reusing the membership-function definitions for each site’s chronological and typological information. The code resulting from applying the metrics to the case studies allows the replication of the experiments carried out and their adaptation to other similar archaeological datasets. These scripts are openly available for use (https://anonymous.4open.science/r/ArqueoChronFuzzy-51D0). Therefore, this framework has three characteristics: 1) it is replicable (Figure 1); 2) it applies to any archaeological (and even historical) dataset; and 3) it is lightweight and efficient, requiring minimal time and computational resources to apply.

Figure 1
Workflow of the framework.
3. Methods
3.1. Theoretical foundations of the proposal. Fuzzy logic as an approach to the management of imperfection in archaeology
Fuzzy logic can be applied to concepts that do not have clear boundaries: it allows the introduction of a degree of imperfection in the items it qualifies. Thus, a fuzzy set “is a class of objects with a continuum of grades of membership. Such a set is characterised by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one” (Zadeh 1965:338). A fuzzy set allows the possibility that a certain element may belong to it only partially. Classical logic, by contrast, is binary (Boolean): it classifies elements into one group or another, so an element can belong to one set or the other, but not partially. This type of logic allows us to model quantitative values efficiently, but it does not allow us to capture the vagueness of the data, a major problem in archaeology.
Fuzzy logic extends this approach, enabling the possibility of partially assigning the same element to several sets by considering its degree of membership to each group. This gives the sets greater elasticity: fuzzy logic operates on a flexible basis, in contrast to rigid Boolean approaches. In human thought, elements are constructed by using linguistic-qualitative labels. Fuzzy logic makes it possible to represent this human knowledge quantitatively by using fuzzy set theory. Fuzzy sets are composed of two components: a domain (the range of values for the set) and a membership function. Membership functions “allow us to measure the degree to which objects belong to such sets and satisfy the imprecisely defined properties” (Cepeda-Negrete 2011:5). For fuzzy sets defined over continuous numerical domains, the most prominent types of membership functions in the literature are triangular, Gaussian and trapezoidal functions (Zadeh 1965; Thaker and Nagori 2018), each becoming more appropriate depending on the fuzzy logic issue being dealt with.
From a practical point of view, especially because of their balance between simplicity and expressive capacity (they allow us to represent totally precise numbers, approximate numbers, intervals and approximate intervals), we have opted in this work to use trapezoidal functions. A trapezoidal function is defined by the lower and upper support limits (alpha and delta) and the lower and upper kernel limits (beta and gamma). Thus, the core is the area where all the elements with membership value 1 for the set are located (i.e., they belong completely to the set), while the support contains all the elements with a membership value greater than zero, including those that do not belong completely to the given set.
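A trapezoidal membership function of this kind can be written in a few lines. The following is a minimal sketch, assuming strictly ordered parameters (alpha < beta ≤ gamma < delta); the function name is ours, not part of the framework:

```python
# Minimal sketch of a trapezoidal membership function. Parameter names
# mirror the text: alpha/delta are the lower/upper support limits,
# beta/gamma the lower/upper kernel (core) limits.
# Assumes alpha < beta <= gamma < delta.

def trapezoidal(x: float, alpha: float, beta: float, gamma: float, delta: float) -> float:
    """Degree of membership of x, between 0 (outside support) and 1 (in core)."""
    if x <= alpha or x >= delta:
        return 0.0                          # outside the support
    if beta <= x <= gamma:
        return 1.0                          # inside the core
    if x < beta:
        return (x - alpha) / (beta - alpha)  # rising edge
    return (delta - x) / (delta - gamma)     # falling edge

print(trapezoidal(150, 100, 200, 300, 400))  # 0.5 (on the rising edge)
print(trapezoidal(250, 100, 200, 300, 400))  # 1.0 (inside the core)
```

Elements on the sloping edges thus receive intermediate membership degrees, which is precisely the graded behaviour that Boolean intervals cannot express.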
Based on the imperfection management models developed by J.M. Medina (GEFRED Model; Medina Rodríguez 1994), we propose here a framework adapted to the management of imperfect archaeological data that resolves some of the issues not considered by more computer-based approaches, such as the specialist’s vision in the categorisation of the data, and that also allows archaeologists themselves to improve the representation of imperfection. We are aware of the potential concern, related to circular reasoning, that the same authors who designed the framework also applied the membership function values in the case studies. It is important to clarify that several of the metrics used were pre-existing in the literature, albeit not previously applied to the archaeological domain, while the new metrics proposed here were defined prior to any use with real archaeological data. This design-before-application approach avoids tailoring the metrics to fit the specific datasets analysed. Furthermore, to reduce any possible bias in value assignment, we have ongoing experiments involving external annotators, who apply the framework to independent datasets. These tests will provide an additional layer of validation and help assess the robustness of the framework when applied by researchers other than its authors.
This framework aims to overcome some of the problems encountered in recent research (Tobalina-Pulido 2019; Tobalina-Pulido and González-Pérez 2020) and to define a proposal that allows the global management of imperfection in two of the most diffuse issues in archaeological data: the interpretation of the functionality of the sites and the chronology.
3.2. From theory to practice: building the framework
Two of the main issues with archaeological data are the imperfection of dating records and the inherent imperfection of interpretations of site functionality. The two are closely linked, as each archaeological site may have several possible functionalities at the same chronological moment, or a single function at a given period. Dating and functionality interpretation are two essential aspects of any archaeological investigation. We will use the term ‘phase’ to refer to the chronological ranges of a site with specific cultural, archaeological, or historical characteristics (e.g., a century, a range of dates, or a cultural period), especially when absolute dates are unavailable. We will also treat dates obtained through techniques such as radiocarbon dating as phases. This choice is intended as a practical means of structuring data for computational modelling, particularly when absolute dates are incomplete or unavailable. While ‘phase’ has historical roots in cultural-historical classification (Renfrew and Bahn 2016), our use is methodological rather than interpretative: it refers to temporal groupings informed by stratigraphy, material culture, or radiocarbon data, without necessarily implying cultural homogeneity. This aligns with current practice in digital archaeology, where the term is applied flexibly for data modelling and analysis (Huggett 2020; Hermon and Niccolucci 2002).
On the other hand, the concept ‘functionality’ refers to the type of function the site had (villa, necropolis, agricultural exploitation, etc.) based on the possible functionalities of the space at a specific chronological moment (phase). For the phases, we have chosen to use chronological ranges determined by the expert, considering the archaeological characteristics of the site, its materials and the possible functionality it may have had. These phases may have been generated from absolute dating by 14C, from chrono-typological criteria of the materials, from stratigraphy, or from a combination of all of them. Moreover, one of the most important issues when managing historical chronologies in databases is the problem of homogenising dating criteria — a challenge long recognised in archaeological informatics and spatiotemporal modelling (Huggett 2000; De Runz and Desjardin 2010; Tobalina-Pulido and González-Pérez 2020). When absolute dates are unavailable, we have opted for numerical chronological ranges instead of textual ones, which allow us to set chronological limits for each phase by introducing a TPQ date (terminus post quem) and a TAQ date (terminus ante quem). This approach makes it possible to homogenise the chronological data, allowing for their computational processing.
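The conversion from a textual, century-based phase label to numeric TPQ/TAQ limits can be sketched as follows. This is a hypothetical illustration of the homogenisation step: the helper names and the convention that the nth century AD spans years (n−1)·100+1 to n·100 are our assumptions, not part of the framework itself.

```python
# Hypothetical sketch of homogenising century-based phase labels into
# numeric TPQ/TAQ limits. Convention assumed: the nth century AD runs
# from year (n-1)*100 + 1 to year n*100.

def century_to_years(century: int) -> tuple:
    """Return the first and last year (AD) of a given century."""
    return (century - 1) * 100 + 1, century * 100

def phase_to_tpq_taq(start_century: int, end_century: int) -> tuple:
    """TPQ = first year of the starting century; TAQ = last year of the ending one."""
    tpq, _ = century_to_years(start_century)
    _, taq = century_to_years(end_century)
    return tpq, taq

# A phase spanning the 1st to 3rd centuries AD:
print(phase_to_tpq_taq(1, 3))  # (1, 300)
```

Once every phase is expressed as a numeric (TPQ, TAQ) pair, the data become directly comparable and ready for the fuzzification step described below.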
One of the most important challenges this poses for the management of imperfect dating and typology is that we can create chained fuzzy sets, which can lead to a wider spread of data, but also to a greater complexity of the proposal. An example will help to illustrate this issue. Consider a site X with a phase between the 1st and 3rd centuries AD and a possible later occupation between the 4th and 5th centuries AD. We can create general linguistic labels: possible start date, possible end date, and certain start and end dates. Additionally, we can create sub-labels within these sets to address the vagueness of the chronologies within each century. This, however, would add complexity to the framework and, in the end, we would obtain results even fuzzier than the original data.
Nevertheless, the above chronological ranges establish intervals that do not allow modelling the uncertainty about the start and end of the intervals. This issue can be solved by modelling these intervals as fuzzy sets. Although there are several ways of expressing fuzzy logic graphically, for the practical reasons given above (simplicity and the ability to represent precise and approximate values and intervals), we have chosen trapezoidal functions, defined by the lower and upper support limits (alpha and delta) and the lower and upper kernel limits (beta and gamma; Figure 2). Interpreted chronologically, alpha indicates the earliest date from which a year could be considered within the range, although with minimal certainty; beta, the date from which the certainty is maximum; gamma, the end date of the range, after which the certainty is no longer maximum; and delta, the date from which the level of certainty is zero. From a practical standpoint, we have chosen to use chronology (to refer to the phases of the sites) and usage (to refer to the typologies of the sites) in our analyses.

Figure 2
Trapezoidal Membership Function for Chronological Assignment.
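Concretely, a phase might be encoded as a trapezoidal set over calendar years. The following sketch uses hypothetical values (occupation certain between AD 100 and 300, merely possible from AD 50 and up to AD 350); the membership function is a generic trapezoid, not the authors' implementation:

```python
# Illustrative sketch: a chronological phase as a trapezoidal fuzzy set.
# Hypothetical values: certain occupation AD 100-300 (core: beta-gamma),
# possible occupation AD 50-350 (support: alpha-delta).

def trapezoidal(x, alpha, beta, gamma, delta):
    if x <= alpha or x >= delta:
        return 0.0
    if beta <= x <= gamma:
        return 1.0
    if x < beta:
        return (x - alpha) / (beta - alpha)
    return (delta - x) / (delta - gamma)

phase = dict(alpha=50, beta=100, gamma=300, delta=350)

# Membership degree of individual years in this phase,
# e.g. year 200 -> 1.0 (core), year 75 -> 0.5 (rising edge):
for year in (40, 75, 200, 325, 360):
    print(year, trapezoidal(year, **phase))
```

A year inside the core belongs fully to the phase; years on the sloping edges belong only partially, which is exactly the graded start/end uncertainty described above.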
3.3. Fuzzification process: Framework guidelines and defined metrics
As explained above and described in previous methods, such as the GEFRED model (Medina Rodríguez 1994) or fuzzy model implementations such as that of Mamdani and Assilian (1975), the first step in using the principles and benefits of fuzzy logic is known as fuzzification. It consists of assigning and transforming each crisp input value of the domain into fuzzy inputs with the help of the membership function (Thaker and Nagori 2018). Thus, at least one human expert must define this membership function according to their knowledge of the domain and the available data, which may include data inherited from others who produced it. When dealing with legacy data, most of the expert knowledge we apply comes from the data itself. That is why we believe it is relevant at this point to offer some guidelines, arising from our experience with archaeological legacy data, for the definition of these membership functions. The expert will indicate the level of certainty regarding the assigned functionality and chronology using a value ranging from 0 to 1, in intervals of 0.25. A value of 0 represents no certainty about the functionality/chronology, while a value of 1 represents absolute certainty. Experts consider the following evaluation criteria when determining the appropriate value for the functionality of the site:
No certainty in the attribution (Value 0): The expert cannot assign a specific functionality due to a lack of sufficient evidence or because the area in question is outside their field of expertise. Example: Absence of clear evidence.
Basic indications, but inconclusive (Value 0.25): Some elements suggest a possible functionality, but the evidence is not definitive. Example: Fragments of dolia (storage jars) that could point to a house, farm, or villa, but without certainty as to which option applies.
Sufficient indications, but not definitive (Value 0.5): The evidence suggests a specific functionality, but it cannot be confirmed with certainty. Example: Mosaics found at a site that could indicate the presence of a villa, though this cannot be fully confirmed.
High likelihood of a specific functionality (Value 0.75): The expert has solid archaeological evidence supporting a specific functionality, based on established functionalities. Example: Structures and artifacts clearly identified as belonging to a specific type of building.
Complete certainty in the attribution (Value 1): The expert is entirely confident in the assigned functionality, supported by conclusive evidence. Example: All contextual elements match the proposed functionality without any doubt.
Similarly, they apply these criteria when assessing the chronology of the site:
No certainty in the chronology (Value 0): The expert cannot assign a chronology due to a lack of evidence. Example: No clear chronological markers are present.
Basic chronological indications, but inconclusive (Value 0.25): Some elements provide a general idea of chronology, but they are not definitive. Example: Only tegulae (roof tiles) have been found, suggesting a Roman period but without specifying the century.
Sufficient indications, but not definitive (Value 0.5): The evidence points to a specific chronology, but it cannot be confirmed with certainty. Example: The presence of terra sigillata (a type of Roman pottery) from the High Imperial period, found alongside tegulae.
High likelihood of a specific chronology (Value 0.75): The expert has solid archaeological evidence supporting a specific chronological attribution. Example: Well-defined chronological markers align with established archaeological functionalities.
Complete certainty in the chronology (Value 1): The expert is entirely confident in the assigned chronology, supported by conclusive evidence. Example: All elements of the context match the proposed period without any doubt.
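The five-step certainty scale above lends itself to a simple encoding with validation. The following is a minimal sketch: the label strings, the site dictionary, and the variable names are hypothetical illustrations, not part of the framework's implementation.

```python
# Minimal sketch encoding the expert certainty scale described above
# (values 0 to 1 in steps of 0.25). Labels and the example site are
# hypothetical illustrations.

CERTAINTY_LABELS = {
    0.0: "no certainty",
    0.25: "basic indications, inconclusive",
    0.5: "sufficient indications, not definitive",
    0.75: "high likelihood",
    1.0: "complete certainty",
}

def check_certainty(value: float) -> str:
    """Validate an expert-assigned value and return its linguistic label."""
    if value not in CERTAINTY_LABELS:
        raise ValueError("certainty must be one of 0, 0.25, 0.5, 0.75, 1")
    return CERTAINTY_LABELS[value]

# Hypothetical assignments: mosaics suggest a villa (0.5); terra
# sigillata plus tegulae suggest a High Imperial date (0.5).
site = {"functionality:villa": 0.5, "chronology:high_imperial": 0.5}
for var, v in site.items():
    print(var, "->", check_certainty(v))
```

Restricting assignments to the five discrete levels keeps expert annotations comparable across sites and annotators before the metrics are applied.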
Note that, in addition to the legacy data characteristics, the definition of the membership function is a process in which more qualitative factors may influence the application of these guidelines, for instance, the background and expertise of the annotator (the person or group of people who define said membership function). In this paper, this dimension will not be addressed, since the contribution presented as a framework (metrics to measure the uncertainty of the dataset) is applied once these membership functions have been defined. That said, the role of the uncertainty and background of the annotator in fuzzy systems for archaeology has recently been attracting more attention in research, making it possible to incorporate it into the proposal as part of the membership function definition phase. We will therefore focus on measuring the uncertainty inherited as part of the legacy data process for an annotator with sufficient expertise (on the domain and the inherited data) to define an appropriate membership function for the data, as discussed in the literature on fuzzification (Martín-Rodilla and Tobalina-Pulido 2024). It is worth clarifying the distinction with our earlier CHR24 proposal (Martín-Rodilla and Tobalina-Pulido 2024). In that work, we presented only a small pilot experiment aimed at explicitly incorporating annotator expertise — modelled in a fuzzy manner — into archaeological annotations at the micro level, during the value-assignment process itself. In contrast, the present study introduces a complete fuzzy-logic-based framework for uncertainty metrics, operating at the dataset level rather than annotation by annotation. While the CHR24 approach focuses on representation during annotation, the metrics proposed here are more general and designed to assess the overall uncertainty patterns of an entire dataset.
Once we have all membership functions defined and the fuzzification step is completed, our framework allows us to apply different metrics to the dataset to know the level of uncertainty that we have inherited when using legacy data. Below, we will present the metrics defined in the framework. This makes the framework generalisable, as it operates independently of specific annotation cases and can be applied to any dataset where variables can be represented with fuzzy membership functions.
CDEG (Comprehensive Degree of Equivalence)
The Comprehensive Degree of Equivalence (CDEG) operator in the FuzzySQL language calculates the degree of fulfilment of a fuzzy condition for all fuzzy variables (the attributes of a data tuple) or for a particular variable. It provides a measure of how well an element (in our case, each archaeological site, but it can be any specific piece of information) fulfils the specified fuzzy conditions. This operator is useful when the data and conditions are not precise and membership degrees need to be handled. When used for a particular variable, it is calculated using the membership function of each fuzzy set and the assigned membership value. In archaeology, evidence is rarely absolute. The CDEG operator captures this by quantifying how closely a site, phase, or artefact meets fuzzy criteria, allowing partial truths and uncertainty to be measured and compared systematically.
The general CDEG formula for a variable (for example, ‘archaeological site functionality’) is:

CDEG(var) = max_i min(μvar(i), value_var)

Where:
μvar(i) is the membership function for the value i of var.
value_var is the assigned fuzzy value.
Thus, CDEG(var) gives us information on the maximum value of the degree of conformity of each element concerning the given variable, for example, archaeological site functionality.
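As an illustration, the single-variable computation might be sketched as follows. This is a hedged reconstruction assuming a max–min reading of the degree of conformity (the maximum, over the candidate values of the variable, of the minimum between the membership function and the assigned value); the site data and labels are hypothetical.

```python
# Hedged sketch of a single-variable CDEG computation, assuming a
# max-min formulation: max over values i of min(mu_var(i), value_var(i)).
# All data below are hypothetical.

def cdeg(memberships: dict, assigned: dict) -> float:
    """Maximum degree of conformity of an element for one fuzzy variable."""
    return max(min(memberships[i], assigned[i]) for i in memberships)

# Hypothetical 'site functionality' variable with two candidate labels:
mu = {"villa": 1.0, "farm": 0.6}         # membership function values
assigned = {"villa": 0.5, "farm": 0.25}  # expert-assigned certainty values

print(cdeg(mu, assigned))  # 0.5
```

The result (0.5 here) is the best-supported reading of the variable: the site conforms to 'villa' with degree 0.5, which dominates the weaker 'farm' reading.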
The CDEG operator is especially useful when evaluating all the fuzzy conditions at once, providing results that allow us to know the level of uncertainty handled in each element (in our case, each archaeological site, combining all its variables):

CDEG(*) = min_j max_i min(μj(i), value_var)

Where:
μj(i) is the membership function for the value i of var j.
value_var is the assigned fuzzy value for each variable analyzed.
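As a minimal sketch of these two operators (assuming membership functions stored as plain Python dictionaries; the function names `cdeg` and `cdeg_all` are ours, not part of the FuzzySQL language):

```python
# Sketch of the CDEG operators, assuming each fuzzy variable is stored as a
# dict mapping its assigned fuzzy values to membership degrees in [0, 1].
# Function names are illustrative, not FuzzySQL syntax.

def cdeg(var: dict[str, float]) -> float:
    """CDEG(var): maximum membership degree among the values assigned to var."""
    return max(var.values())

def cdeg_all(element: dict[str, dict[str, float]]) -> float:
    """CDEG(*): minimum of the per-variable CDEG values of an element."""
    return min(cdeg(var) for var in element.values())

# Example with the San Blas membership functions defined later in the paper.
san_blas = {
    "functionality": {"A2": 1.00, "D1": 0.25},
    "chronology": {"c1": 0.75, "c2": 0.50},
}
print(cdeg(san_blas["functionality"]))  # 1.0
print(cdeg(san_blas["chronology"]))     # 0.75
print(cdeg_all(san_blas))               # 0.75
```

Taking the maximum within each variable and the minimum across variables means that a single highly uncertain variable caps the whole element's CDEG(*).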
CDEGavg(var) and CDEGavg(*): Average CDEG metrics by variable and combined
The previous section introduced how the CDEG FuzzySQL operator can give us a numerical measure of the degree of uncertainty presented by each element to be analysed, in our case, each archaeological site. However, note that the CDEG operators take the maximum value of each membership function and then the minimum across all the fuzzy variables. This means that they do not capture situations in which the membership functions present a lot of uncertainty in some variables and less in others: for example, an archaeological site whose chronology we know with a fair degree of certainty but whose archaeological functionalities are not so clear, or even variations within the same variable (such as knowing for sure that it was a hermitage but not whether it was also a cemetery). These situations are very common in archaeological data, especially when we have not collected the data ourselves (i.e., legacy data).
To offer metrics that capture these differences in uncertainty as a function of the variables evaluated, the framework is complemented with average versions of the previous operators, which take into account all the values of the membership functions and offer a single-value metric of the degree of uncertainty present in the complete element (archaeological site) or in a particular variable (for example, ‘archaeological chronology’ or ‘archaeological site functionality’):

CDEGavg(var) = (1 / n_var) · Σ_i μ_var(i)

Where:
μ_var(i) is the membership degree assigned to value i of var.
n_var is the total number of values defined for var.
Following the same notation, the average function for the CDEG(*) operator is defined as:

CDEGavg(*) = (1 / m) · Σ_j CDEGavg(var_j)

Where:
m is the total number of variables defined for each fuzzy element (in our case, the archaeological site).
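A minimal, self-contained sketch of the average metrics, under the same plain-dictionary representation of membership functions (the function names are illustrative, not FuzzySQL syntax):

```python
# Sketch of the average CDEG metrics, assuming each fuzzy variable is a dict
# mapping assigned fuzzy values to membership degrees in [0, 1].

def cdeg_avg(var: dict[str, float]) -> float:
    """CDEGavg(var): mean of the membership degrees defined for var."""
    return sum(var.values()) / len(var)

def cdeg_avg_all(element: dict[str, dict[str, float]]) -> float:
    """CDEGavg(*): mean of CDEGavg over the m variables of the element."""
    return sum(cdeg_avg(var) for var in element.values()) / len(element)

# Example with the Santa María de Hito membership functions defined later.
hito = {
    "functionality": {"A2": 0.75, "D1": 1.00},
    "chronology": {"c1": 1.00, "c2": 0.75, "c3": 0.75, "c4": 0.75},
}
print(cdeg_avg(hito["functionality"]))  # 0.875
print(cdeg_avg(hito["chronology"]))     # 0.8125
print(cdeg_avg_all(hito))               # 0.84375 (reported as 0.843 in Table 1)
```

Unlike CDEG(*), these averages reflect every membership value, so partial uncertainty in one variable is no longer masked by the max/min aggregation.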
Comparative metric based on the FSQL Fuzzy equal (FEQ) operator
We also propose to incorporate into the framework a fuzzy metric that allows us to compare legacy datasets in terms of their degree of global uncertainty with respect to another given legacy dataset. This involves applying the Factor of Equivalence (FEQ) fuzzy operator, also present in FuzzySQL (Hassine et al. 2008), to the previous values in order to compare the different archaeological sites with each other. FEQ gives us a metric for each pair of elements (archaeological sites in our case) that we want to compare, representing the degree of coverage of each site considering the membership functions provided and, therefore, a measure of their similarity (or not) in terms of inherited uncertainty. Therefore, for a pair of archaeological sites A and B:

FEQ(A, B) = CDEGavg_A(*) / CDEGavg_B(*)
The FEQ operator was defined in FuzzySQL as a final value between 0 and 1 based on the similarity of coverage between pairs. We have generalized the operator formula in our framework to work with continuous values of CDEG(*). This allows us to cover, in a continuum, all the possible cases in the comparison of archaeological sites: a dataset or element may present more, equal or less uncertainty than another, to any degree.
Note that this comparative functionality allows us to compare fuzzy legacy data elements (archaeological sites) with each other, but only regarding the level of uncertainty inherited from their fuzzy data. This means that the reasons in terms of the archaeological domain (that is, the archaeological context, the data management process, etc.) behind this metric can be varied, and it will be up to the expert archaeologist to go deeper into the reasons for the FEQ results in comparison.
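Consistent with the values later reported in Table 2, the generalized FEQ can be read as a ratio of the average certainty metrics of the two elements being compared; a minimal sketch under that assumption:

```python
# Sketch of the generalized FEQ comparison: the ratio of the average certainty
# metrics of two elements. A result above 1 means element A inherits less
# uncertainty than element B; below 1, more. The function name is illustrative.

def feq(cdeg_avg_a: float, cdeg_avg_b: float) -> float:
    """FEQ(A, B) over continuous CDEGavg values."""
    return cdeg_avg_a / cdeg_avg_b

# Using the CDEGavg(*) values obtained later for the two case studies:
print(round(feq(0.84375, 0.625), 2))  # 1.35 (Santa María de Hito vs San Blas)
```

The same ratio can be applied per variable (e.g. CDEGavg(functionality) only) or to the combined CDEGavg(*) values, which is how the per-column FEQ figures in Table 2 are obtained.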
As can be seen, these metrics derived from the FuzzySQL operators collect information about the uncertainty inherited in archaeological data from legacy contexts, with a simple modelling of this scenario. In the following section, we apply the defined metrics to two case studies (San Blas and Santa María de Hito), illustrating the use of the proposal to measure the uncertainty inherited from such legacy data in a real-world context and discussing its potential applications.
4. Applying the framework. The case studies of San Blas and Santa María de Hito
4.1. Archaeological Sites Selection
Two archaeological sites were selected, for which the archaeologists responsible defined the membership functions for both variables (functionality and chronology). The choice was made because the two sites present very different circumstances, temporalities, uses and narratives about their past.
San Blas. Site located in Olite (Navarra, Spain). The bibliography indicates that it is a Roman villa dated between the High Imperial period and the 5th century AD. A hermitage is believed to have existed in the same space in medieval times. In addition, some materials are mentioned that point to a possible function as a necropolis between the 3rd and 5th centuries AD, according to the ceramic and metal materials found (Iriarte Kortazar 2000). During the surveys carried out in 2021, we were able to confirm a chronology between the beginning of the 1st century AD and the second half of the 4th century or the beginning of the 5th century, and a function as a villa. No signs of a necropolis or a hermitage were found. This is a site that has not been excavated but to which, on the basis of chance finds, different functionalities have been attributed. This gave rise to significant uncertainty due to the vagueness of the information, although some archaeological materials from the site allowed us to attribute greater precision to some of the interpretations given. By carrying out the 2021 survey, we were able to verify and check some of the functionalities attributed to the site. The failure to locate remains of the medieval hermitage mentioned in the bibliography leads us to increase the uncertainty about this interpretation, while confirming the classification as a villa.
Santa María de Hito. Site located in Valderredible (Cantabria, Spain). The Roman levels have allowed us to date the villa functionality to the 3rd–4th centuries based on the documented ceramic materials. For the necropolis, we have several absolute radiocarbon dates: CSIC-838: 1430 ± 40 (14C) (sample: human bones, individual 292 with the initials E5-2-83); CSIC-840: 1360 ± 40 (14C) (sample: human bones, individual 45 with the initials E3-E4-2-82); CSIC-837: 1320 ± 50 (14C) (sample: human bones, individual 72 from sector II, slab tomb of the atrium); CSIC-839: 980 ± 40 (14C) (sample: human bones, individual 4 with the initials Z5-7-83), according to the compilation carried out by E. Gutiérrez Cuenca (2002:91). In the case of this site, the greatest chronological precision comes from the large number of absolute C14 datings available. In addition, the function of a necropolis is easier to characterize than others, so we have less imprecision. As for the villa functionality, although the archaeological intervention is not recent, it is a type of site whose defining characteristics are well established in the bibliography.
Both case studies are used to demonstrate the proposed framework in direct connection with our research question on how archaeological variables can be represented as fuzzy membership functions and whether the existing uncertainty can be quantified using fuzzy metrics. Its application is organized in three parts. First, the framework metrics are applied to the information from both sites for two variables — chronology and functionality — considered here as independent variables. We call this first validation of the framework Illustration 1. This stage addresses the first part of the research question, focusing on the representation and quantification of uncertainty when variables are modelled independently. Next, we explore how the framework metrics perform in the same case studies when one variable is considered dependent on the other — specifically, functionality dependent on chronology. This stage addresses the second part of the research question, examining how variable dependence influences the quantification of uncertainty. We call this second validation Illustration 2.
Finally, we focus on the comparative potential in terms of the uncertainty of the proposed framework, modelling a scenario for a single archaeological site whose uncertainty inside the legacy data has changed due to the performance of a recent archaeological intervention. We call this third validation Illustration 3.
4.2. Illustration 1. Uncertainty in functionality and chronology as independent variables in San Blas and Santa María de Hito
This section shows how the fuzzification process is applied and how the framework metrics are calculated for the two case studies presented. It involves fuzzy modelling of the membership functions of both sites (San Blas and Santa María de Hito), focusing on their variables of functionality and chronology, these variables being independent of each other.
As we explained previously, the first step is fuzzification. We define the membership functions in terms of the two variables functionality and chronology for each site. Note that this process is carried out by expert archaeologists and follows the guidelines detailed previously in section 3.3. In this case, San Blas has a certain functionality as a villa (A2) and, with little certainty, a second functionality as a necropolis (D1), since we only have one mention in the bibliography of materials that can be related to a funerary functionality, although they could also be part of a residential functionality related to the villa. As for Santa María de Hito, it has two defined functionalities (villa and necropolis) with a high degree of certainty, although the materials that allow a site to be classified as a necropolis are more certain than those of a villa, which can be interpreted as a farm or another type of settlement. The resulting membership functions are:
μ_functionality_sanblas = {'A2': 1.00, 'D1': 0.25}
μ_chronology_sanblas = {'c1': 0.75, 'c2': 0.50}
μ_functionality_hito = {'A2': 0.75, 'D1': 1.00}
μ_chronology_hito = {'c1': 1.00, 'c2': 0.75, 'c3': 0.75, 'c4': 0.75}
By applying the framework metrics, we can know what degree of injected uncertainty we inherit in legacy data from these variables independently, as well as measure how much uncertainty we inherit jointly. Once we have the membership functions defined following the previous guidelines, we can apply the framework's metrics. First, the operator CDEG(var) evaluates the membership functions for each variable considered, while CDEG(*) evaluates the membership functions of all the defined fuzzy variables (in our case, functionality and chronology at the same time).
In addition, the framework also incorporates the values of CDEGavg(var) and CDEGavg(*), which consider the average of the membership degrees per variable and the average of all the membership degrees of all the membership functions evaluated, respectively. These metrics allow us to obtain an average result of the uncertainty of each archaeological site. The results of the metrics applied to both archaeological sites are detailed in Table 1.
Table 1
Metrics applied to both archaeological sites.
| ARCHAEOLOGICAL SITE | CDEG (FUNCTIONALITY) | CDEG (CHRONOLOGY) | CDEG(*) | CDEGAVG (FUNCTIONALITY) | CDEGAVG (CHRONOLOGY) | CDEGAVG (*) TOTAL |
|---|---|---|---|---|---|---|
| San Blas | 1 | 0.75 | 0.75 | 0.625 | 0.625 | 0.625 |
| Santa María de Hito | 1 | 1 | 1 | 0.875 | 0.812 | 0.843 |
| Joint | – | – | – | 0.7500 | 0.7187 | 0.7343 |
San Blas: These results confirm the manual calculation previously performed. They show that the functionality of the site has a higher degree of certainty (CDEG(functionality) = 1.0), while the chronology has a lower degree of certainty (CDEG(chronology) = 0.75). The overall value of CDEG, considering both attributes (CDEG(*) = 0.75), indicates relatively low overall uncertainty.
Santa María de Hito: These results indicate that both the functionality and the chronology of the Hito site have maximum membership values for the considered fuzzy sets, resulting in a total degree of certainty (CDEG(functionality) = CDEG(chronology) = 1.0) for both attributes and, consequently, for the complete set of attributes (CDEG(*) = 1.0).
By applying the average CDEG formula of the framework (calculating an average value for each variable), we observe that the inherited uncertainty in San Blas is relatively high and equal for both variables (CDEGavg(functionality) = CDEGavg(chronology) = 0.625), while in the case of Santa María de Hito the uncertainty is clearly lower for both variables, being higher in chronology (CDEGavg(chronology) = 0.812) than in functionality (CDEGavg(functionality) = 0.875). Each metric yields a value between 0 and 1 that indicates the inherited uncertainty of each archaeological site. In addition, we can calculate CDEGavg(*), which gives us a joint idea of the uncertainty in the data for the functionality and chronology variables simultaneously. In this case, the uncertainty for the Santa María de Hito site relative to the specified fuzzy sets is low (CDEGavg(*) = 0.843), while for San Blas we inherit more injected uncertainty (CDEGavg(*) = 0.625).
It is also possible to carry out a joint analysis of the uncertainty contained in both sites at the same time. This can be useful, as mentioned above, to evaluate a dataset with several variables defined jointly. In this case, the uncertainty inherited when using the information from both sites at the same time is higher in terms of chronology (CDEGavg(chronology) = 0.7187) than in terms of functionality (CDEGavg(functionality) = 0.7500), and evaluating both sites together, the data present relatively low uncertainty (CDEGavg(*) = 0.7343).
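The joint row of Table 1 can be reproduced with a short, self-contained sketch (assuming, as the reported values suggest, that each joint figure is the mean of the per-site CDEGavg values; function names are ours):

```python
# Worked sketch of the joint (dataset-level) averages reported in Table 1:
# each joint value is the mean of the per-site CDEGavg values. Membership
# functions are those defined above for the two sites.

san_blas = {
    "functionality": {"A2": 1.00, "D1": 0.25},
    "chronology": {"c1": 0.75, "c2": 0.50},
}
hito = {
    "functionality": {"A2": 0.75, "D1": 1.00},
    "chronology": {"c1": 1.00, "c2": 0.75, "c3": 0.75, "c4": 0.75},
}

def cdeg_avg(var):
    """CDEGavg(var): mean of the membership degrees defined for var."""
    return sum(var.values()) / len(var)

def joint_avg(sites, variable):
    """Mean of CDEGavg(variable) across all sites in the dataset."""
    return sum(cdeg_avg(site[variable]) for site in sites) / len(sites)

sites = [san_blas, hito]
f = joint_avg(sites, "functionality")  # 0.75
c = joint_avg(sites, "chronology")     # 0.71875 (reported as 0.7187)
print(f, c, (f + c) / 2)               # joint CDEGavg(*) = 0.734375 (0.7343)
```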
Once we have all the values for the average metrics, we can apply the FuzzySQL FEQ operator to these values, with the formula presented in the framework, to compare the sites with each other. The FEQ operator provides a measure of how a fuzzy characteristic (in this case, functionality and/or chronology) compares between two sets, helping to understand the relative differences between them. It therefore compares how the membership functions of these variables are distributed between the two sites, giving an idea of which archaeological sites have, compared to the rest, a higher (less inherited uncertainty) or lower (more inherited uncertainty) degree of coverage. Table 2 presents the FEQ values for the variables of functionality and chronology, and for both together, comparing the archaeological sites of San Blas and Santa María de Hito:
Table 2
FEQ Values for Functionality, Chronology, and Combined Variables: Comparison of San Blas and Santa María de Hito.
| FEQ SANTA MARÍA DE HITO/SAN BLAS CDEGAVG(FUNCTIONALITY) | FEQ SANTA MARÍA DE HITO/SAN BLAS CDEGAVG(CHRONOLOGY) | FEQ SANTA MARÍA DE HITO/SAN BLAS CDEGAVG(*) |
|---|---|---|
| 1.4 | 1.3 | 1.35 |
These results indicate that the Santa María de Hito site has a higher degree of certainty on average than San Blas (greater in functionality, FEQ = 1.4, and slightly lower in chronology, FEQ = 1.3). This also means that the Santa María de Hito site presents 1.35 times greater certainty across both dimensions (functionality and chronology) than the San Blas site. This comparison value is consistent with the archaeologists' view of the evidence from both sites: for Santa María de Hito we have C14 dating that allows us to refine the chronologies, while for San Blas we only have data that allow characterization by ceramic chrono-typologies or other materials.
This first illustration of the application of the framework allows us to obtain values for five simple metrics from the membership function of each site, evaluating two independent fuzzy variables (functionality and chronology). In the following section we present a second illustration for the application of the framework to the same case studies but now considering the fuzzy variables as dependent variables.
4.3. Illustration 2. Uncertainty in functionality and chronology as dependent variables in San Blas and Santa María de Hito
This section shows how the fuzzification process and the framework are applied to the two case studies presented, focusing on the functionality variable as a chronology-dependent variable. We therefore consider that defining specific chronologies for an archaeological site in a fuzzy way already conditions the functionalities that we can attribute to the site, as well as their fuzzy values. That is, if we consider, for example, Roman-period chronologies in the membership function, this will influence the selection of functionalities attributed to the space. Thus, in this case, we consider that San Blas has a certain functionality as a villa (Phase 1) and another as an indeterminate rural settlement of which we have practically no evidence (Phase 2). For its part, Santa María de Hito has absolute datings for three temporal moments in its chronology, which lends great reliability to the dating and to its functionality as a necropolis (Phases 2 to 4). For the villa functionality, however, we only have relative dating by chrono-typology (Phase 1). The resulting membership functions are:
μ_functionality_SanBlasPhase1 = {'A2': 1.00}
μ_functionality_SanBlasPhase2 = {'A1': 0.25}
μ_functionality_HitoPhase1 = {'A2': 0.75}
μ_functionality_HitoPhase2 = {'D1': 1.00}
μ_functionality_HitoPhase3 = {'D1': 1.00}
μ_functionality_HitoPhase4 = {'D1': 1.00}
Once we have the membership functions defined following the previous guidelines, we apply the framework in a similar way to Illustration 1, obtaining values for the five defined metrics: CDEG(var), CDEG (*), CDEGavg(var), CDEGavg(*), and FEQ. Note that in this case, we evaluate only the variable functionality, a chronology-dependent variable that we have defined in the membership functions. The results are detailed in Table 3.
Table 3
CDEG(var), CDEG(*), CDEGavg(var), CDEGavg(*), and FEQ for Each Site and Combined.
| ARCHAEOLOGICAL SITE | CDEG (FUNCTIONALITY) | CDEG(*) | CDEGAVG (FUNCTIONALITY) | CDEGAVG(*) TOTAL |
|---|---|---|---|---|
| San Blas | 0.75 | 0.75 | 0.625 | 0.625 |
| Santa María de Hito | 0.75 | 0.75 | 0.9375 | 0.9375 |
| Joint | – | – | 0.7812 | 0.7812 |
These results indicate that at both sites the functionality has a high degree of certainty (CDEG(functionality) = 0.75) and, consequently, so does the complete set of attributes (CDEG(*) = 0.75). By calculating CDEGavg(var) and CDEGavg(*), we can obtain more information about the uncertainty inherited by modelling functionality as a variable dependent on chronology. We observe that the inherited uncertainty in San Blas is relatively high (CDEGavg(functionality) = 0.625), while in the case of Santa María de Hito the uncertainty is clearly lower regarding functionality, with a value of almost total certainty (CDEGavg(functionality) = 0.9375). It is also possible to carry out a joint analysis of the uncertainty contained in both sites at the same time. This can be useful, as mentioned above, to evaluate a dataset with several records jointly. In this case, the uncertainty inherited from using the information from both sites at the same time in terms of functionality (CDEGavg(*) = 0.7812) is relatively low.
As in Illustration 1, once we have all the values for the average metrics, we can apply the FEQ operator to these values to compare the different sites with each other. In this case, the variable to be compared is functionality.
Table 4 shows the FEQ value in this case. Its value of 1.5 indicates that the degree of coverage of the functionality variable (and therefore of certainty) at the site of Santa María de Hito is 1.5 times higher than at San Blas, considering the membership functions provided for each phase. Let us remember that this comparison only concerns the degree of inherited uncertainty in the data in terms of functionality, and not the archaeological reasons or the data management and representation reasons that underlie it. That said, the application of FEQ allows us to reinforce the idea that we have ‘better’ data (less uncertainty) both chronologically and functionally for Santa María de Hito.
Table 4
Application of the FEQ operator to average metrics for comparing sites based on the functionality variable.
| FEQ HITO/SAN BLAS CDEGAVG(FUNCTIONALITY) | FEQ HITO/SAN BLAS CDEGAVG(*) |
|---|---|
| 1.5 | 1.5 |
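The values in Tables 3 and 4 can be reproduced with a short sketch that averages the phase-wise membership degrees of each site (the function name is ours, not FuzzySQL syntax):

```python
# Sketch of the phase-wise averages of Illustration 2: functionality is
# modelled per chronological phase, and CDEGavg aggregates over all the
# phase-wise membership degrees of the site.

san_blas_phases = {"phase1": {"A2": 1.00}, "phase2": {"A1": 0.25}}
hito_phases = {
    "phase1": {"A2": 0.75}, "phase2": {"D1": 1.00},
    "phase3": {"D1": 1.00}, "phase4": {"D1": 1.00},
}

def cdeg_avg_phases(phases):
    """Mean membership degree over all phase-wise values of the site."""
    values = [v for phase in phases.values() for v in phase.values()]
    return sum(values) / len(values)

sb = cdeg_avg_phases(san_blas_phases)  # 0.625
hi = cdeg_avg_phases(hito_phases)      # 0.9375
print(sb, hi, (sb + hi) / 2)           # joint: 0.78125 (reported as 0.7812)
print(hi / sb)                         # FEQ Hito/San Blas = 1.5 (Table 4)
```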
If we evaluate both applications of the framework to the same case studies in Illustration 1 (as independent variables) and Illustration 2 (as dependent variables), modelling the variables as dependent shows less inherited uncertainty in the legacy data. Although further applications and evaluation of the metrics presented in the framework are necessary, a likely main reason is that the membership functions defined for a dependent variable are defined more precisely, since a membership function per phase is incorporated. Note also that when comparing the archaeological sites' data using FEQ, the independent-variable modelling yields a slightly smaller difference in certainty between Santa María de Hito and San Blas (FEQ = 1.4), while the dependent-variable modelling yields a greater difference in favour of Santa María de Hito (FEQ = 1.5).
These values also support what was stated above about fuzzy modelling with dependent variables showing better values. As a first approximation, we believe that when the chronology of the archaeological sites is clearly defined and leaves no room for fuzzification, the dependent approach is more precise. However, it is common for inherited data to also carry uncertainties regarding chronology. In that case, the independent approach (Illustration 1) tells us much more about how much uncertainty we inherit from the legacy data.
The following section shows the application of the proposed framework to measure the cases in which the uncertainty of the variables of functionality and chronology of the same site increases or decreases, generally because we inherit data but then investigate it again, obtaining new membership functions.
4.4. Illustration 3. Uncertainty in functionality and chronology to monitor changes in knowledge about a single site: San Blas
This section shows how the fuzzification process is applied and how the framework metrics are calculated to monitor changes in the inherited uncertainty at the same site. It involves defining the membership functions of the San Blas site for two temporal realities: what was known about the site (membership function μ_functionality_sanblas1) and what is known after a new survey at the site (membership function μ_functionality_sanblas2). This allows the metrics to be used to compare two temporal stages of our knowledge about the site (and therefore of its uncertainty). A new investigation may reduce the uncertainty (because new findings support the previous hypotheses) or increase it (for example, because findings point in other directions and/or support other hypotheses). We have chosen not to consider the case in which uncertainty remains unchanged, since we consider that new research always alters the knowledge we may have about a site. That said, future work will explore these scenarios and quantify them, to see whether the previous membership functions can be maintained and, therefore, whether a situation of continued uncertainty can persist despite recurrent research. We focus on the functionality variable, modelling membership functions for stages 1 and 2 of the research:
μ_functionality_sanblas1 = {'A2': 1.00, 'D1': 0.25, 'E': 0.50}
μ_functionality_sanblas2 = {'A2': 0.75, 'D1': 0.10, 'E': 0.25}
In this case, we define the membership functions considering that there is more uncertainty about the site's functionality as a hermitage (E) in San Blas in stage 2 than in stage 1. This is explained because in stage 2 an intensive archaeological survey was carried out, and no remains of hermitage buildings were found. This does not mean that the hermitage did not exist, since oral testimonies indicate that somewhere within that toponym there would have been a hermitage, but in the space studied archaeologically the recovered data lead us to reduce the certainty defined in stage 1 to a lower certainty in μ_functionality for stage 2.
Note that the proposed average metrics presented in the framework allow for such comparisons across time stages by implementing the FuzzySQL operators CDEG and FEQ for functionality with average uncertainty values (rather than with the maximum or minimum values of each set, as initially defined). This is especially pertinent in this case: given changes over time in the membership functions of the archaeological site (mainly due to subsequent investigations of functionality and chronology), such set values cannot reflect the change in uncertainty from one stage to another, while the metrics CDEGavg(var) — for each fuzzy variable treated — and CDEGavg(*) do reflect such a change.
Table 5 shows the values of the average metrics in each of the phases (μ_functionality_sanblas1 and μ_functionality_sanblas2).
Table 5
Average Metrics in Each Phase (μ_functionality_sanblas1 and μ_functionality_sanblas2).
| | CDEGAVG (FUNCTIONALITY) | CDEGAVG(*) TOTAL |
|---|---|---|
| μ_functionality_sanblas1 | 0.6250 | 0.6250 |
| μ_functionality_sanblas2 | 0.3375 | 0.3375 |
| Joint | – | 0.4812 |
| Difference | – | 0.2875 |
The results show the uncertainty inherited in stages 1 and 2 of the San Blas site in terms of its functionality, decreasing the average value of CDEG in functionality from stage 1 (CDEGavg(functionality) = 0.6250) to stage 2 (CDEGavg(functionality) = 0.3375).
The value of CDEGavg(*) for the set gives us an idea of the average uncertainty handled for the site if we want to take both studies into account, while the difference tells us the magnitude of the change from one stage to another. In this case, the uncertainty in the site has increased by 0.2875 points, so we have more doubts about the functionality of San Blas (remember that the metric takes values between 0 and 1, where 0 is maximum uncertainty and 1 is maximum certainty). Along the same lines and using the same type of analysis and metrics, the opposite situation could occur, in which a site's level of uncertainty decreases at a new stage of its investigation. In that case, the CDEGavg(*) values of stage 2 would be higher than those of stage 1, and the difference would be negative.
5. Conclusions, limitations of the framework and future work
The framework presented in this paper does not claim to be a unique and complete mathematical framework that captures hundreds of intrinsic characteristics of uncertainty in archaeology, or to implement theoretical models of the discipline (De Runz 2008; Fusco 2016; Martín-Rodilla and Tobalina-Pulido 2024). It is a proposal in line with previous works (Tobalina-Pulido and González-Pérez 2020; Martín-Rodilla and Tobalina-Pulido 2024) which, taking advantage of the predominant use of fuzzy logic in the literature on uncertainty in archaeological data, makes it possible to define, implement and use in a practical way a set of metrics that yield real data on the degree of inherited uncertainty in terms of functionality and chronology when we work with legacy data in archaeology. This is a very common scenario that entails difficulties due to the heterogeneity of archaeological data. In addition, the proposal includes guidelines that make it possible to define, in a simple way (and appropriate to the evidence available to the archaeologist), the membership functions for these fuzzy variables.
The ease of use of the approach has allowed us to illustrate throughout this paper how the proposed metrics reflect the uncertainty of two sites with very different past case histories and narratives. The metric results allow us to quantify the uncertainty inherited in each of them, comparing the modelling of the fuzzy variables as independent and dependent variables and illustrating how to compare these sites with each other. In addition, it also allows us to collect temporal changes in the uncertainty of the same site, since this can be different for each of the phases of the site, but it can also change with the investigations carried out.
However, we are aware that this approach has some limitations. Firstly, the uncertainty that we take into account in the defined metrics is collected in the definition of membership functions, which express levels of findings or evidence that we have at the level of functionality and/or chronology for a given archaeological site. This means that other dimensions of uncertainty when inheriting data in archaeology, such as errors in the data or the level of expertise of the annotator (the person who, seeing this evidence, evaluates it and chooses a value for the membership function) are not included in this framework. Regarding the last point, modelling of the annotator’s level of expertise is in line with some recent proposals (Martín-Rodilla and Tobalina-Pulido 2024) that could be connected with the framework presented. It is important to emphasise the difference between the two approaches and the fact that they address different dimensions of information. The framework described here focuses on overall uncertainty in legacy datasets, calculated without considering the identity or expertise of the annotators who originally assigned the values. In contrast, the CHR24 proposal by Martín-Rodilla and Tobalina-Pulido addresses the annotators themselves, modelling their level of expertise during the process of assigning membership function values. Both perspectives can be applied independently or in combination: one could first apply the CHR24 approach during value assignment to incorporate annotator expertise and then use the present framework to compute general uncertainty metrics already informed by that expertise.
Secondly, the presented framework is based on FuzzySQL operators, which allow fuzzy databases based on it to implement these metrics and perform searches according to them. FuzzySQL allows fuzzy modelling of data, preferably in disjoint categories. This means that the presented metrics allow the definition of membership functions in functionality and chronology as long as the category scheme or possible values of functionalities and chronologies do not overlap with each other or present hierarchical structures. The generalization of the proposed metrics for hierarchical category schemes and how to represent uncertainty and changes in them is another possible line of work.
One strength of the framework is that it builds on FuzzySQL, a language already robustly tested for uncertain data in various domains. This robustness provides confidence in applying it to the archaeological domain at the pilot-experiment level, ensuring that the uncertainty present in the data is faithfully reflected. At the same time, further research is needed to deepen its adaptation to archaeology and to ensure its generalization. In particular, aspects such as the framework’s suitability across diverse archaeological contexts, the sensitivity of the metrics to different membership functions and case configurations, and the perspectives of archaeologists regarding its usability and interpretability merit dedicated investigation. These considerations are included in our proposed future work, as they are crucial for moving from pilot applications to broader adoption in the discipline.
Finally, and related to the previously mentioned aspect of the annotator’s expertise, there is the fact that the attribution of membership values is carried out by an archaeologist, and there is always a degree of subjectivity in the interpretation of an archaeological site. We consider, however, that this approach alleviates the problem to a certain extent by providing metrics that facilitate comparison between data in databases. In this sense, we propose that the application of fuzzy logic to the functionalities of the sites themselves could be explored further, using tables of values similar to those used by Taheri, Ghadim and Kabirian (2019) for the classification of bones. This would allow tables of defined characteristics to be drawn up for each typology, which researchers could then use as a basis for establishing the functional attribution of each site analysed.
With these clarifications about their scope, the presented metrics make it possible, in a simple way, to define membership functions that fuzzify the functionality and chronology assigned to the phases of sites, and to study the degree of uncertainty present in archaeological legacy data, both at a micro level (studying a specific site) and at a macro level (comparing sites with each other, or measuring the global uncertainty of a dataset). These studies are replicable, independent of the particularities of each site, and enable comparative evaluation between sites in terms of uncertainty in functionality and chronology, as well as decision-making and conformity analysis based on these results.
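The micro/macro distinction can be sketched as follows. This is a minimal illustration under simplifying assumptions: the uncertainty measure used here (one minus the highest membership degree) is a simple stand-in chosen for clarity, not the specific metrics defined in this work, and the site names are hypothetical:

```python
# Minimal sketch of micro- vs macro-level uncertainty readings.
# The metric (1 - max membership degree) is an illustrative stand-in,
# not the metrics defined in the framework itself.

def phase_uncertainty(mu: dict) -> float:
    """Micro level: uncertainty of a single phase's attribution."""
    return 1.0 - max(mu.values(), default=0.0)

def dataset_uncertainty(phases: list) -> float:
    """Macro level: mean uncertainty across all phases in a dataset."""
    return sum(phase_uncertainty(mu) for mu in phases) / len(phases)

site_a = {"villa": 1.0}                      # confident attribution
site_b = {"villa": 0.6, "farmstead": 0.5}    # ambiguous attribution

print(phase_uncertainty(site_a))             # 0.0
print(phase_uncertainty(site_b))             # 0.4 (micro level)
print(dataset_uncertainty([site_a, site_b])) # 0.2 (macro level)
```

The same pattern scales from a single site (micro) to a whole legacy dataset (macro), which is what enables the comparative and conformity analyses described above.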
Nevertheless, we recognize that the present work does not include a formal validation of the framework’s results against independent benchmarks such as improvements in decision-making, increases in classification accuracy, or alignment with expert consensus. Such validation would be necessary to fully assess its practical utility and to substantiate claims about its applicability. While the conceptual basis of the framework may be transferable to a wide range of archaeological and even historical datasets, the specific implementation and metrics presented here have only been tested on two case studies. Broader applicability to other types of data — such as lithic assemblages or epigraphic corpora — remains to be demonstrated and should be the subject of future research.
Competing Interests
César Gonzalez-Perez supervised the first author for two years at Incipit-CSIC and has co-authored publications with both authors. He did not participate in the peer review process.
Author Contributions
First author: Conceptualization, methodology, validation, writing – original draft, writing – review & editing, supervision
Second author: Conceptualization, methodology, software, formal analysis, data curation, writing – original draft, writing – review & editing
