Machine Learning Applications in Archaeological Practices: A Review

Figures & Tables

Table 1

Summary of results of both automatic and manual protocol searches on the six online portals. Note that the “Sum of unique totals” refers only to the sum of the number of non-duplicate results from each search; however, there were many publications present in both searches (see below), which would further reduce the sum total of unique items.

| Bibliographical database | Keyword matches |
|---|---|
| Automatic screening: | |
| Web of Science | 969 |
| PubMed | 413 |
| Tübingen University Library | 51 |
| German Archaeological Institute | 24 |
| German National Library | 3 |
| Total unique | 558 |
| Manual screening: | |
| Google Scholar | 300 |
| Total unique | 285 |
| Total | 1760 |
| Sum of unique totals | 730 |

Figure 1

Review process from source selection to analysis. Inspired by the PRISMA 2020 flow diagram (Page et al. 2021). Reason 1 = Ineligible with automation tool; Reason 2 = Non-English record; Reason 3 = Full text not accessible; Reason 4 = Non-journal-based publications; Reason 5 = Absence of abstract; Reason 6 = Archaeology and machine learning keywords from the list not present in the text; Reason 7 = Archaeology and machine learning keywords from the list not present in the abstract or in the title; Reason 8 = Preliminary exclusion (i.e. no access to publication, publications or contributions by current authors, entire books, non-academic reports, preprints, reviews or theoretical papers, potentially predatory journal); Reason 9 = Excluded based on the title; Reason 10 = Excluded based on the abstract; Reason 11 = Excluded based on the first reading of the full text; Reason 12 = Full text does not involve archaeological research; Reason 13 = Full text does not involve machine learning methods as defined in our protocol; Reason 14 = Conflicts of interest (publication by the authors of this review or to which the authors contributed); Reason 15 = Theory or review paper. Figure created using Microsoft Word and Inkscape.

Table 2

The nine features collected systematically from the review.

| Feature | Number of categories |
|---|---|
| Model | 70 |
| Best model | 17 |
| Family | 9 |
| Subfield | 15 |
| Input data | 11 |
| Evaluation | 3 |
| Task | 19 |
| Result | 5 |
| Pre-training | 4 |

Figure 2

The fourth field of information recorded in the review presents significant characteristics to explain variation in machine learning applications in archaeology and their related classes/categories. One study case might have been attributed to several subfields or architecture categories. Figure generated with R 4.2.2 (code available in supplementary material 3) and additional editing with Inkscape.

Figure 3

Number of publications per year between 1997 and 2022. Articles published after 2018, shown in light blue, account for more than 80% of the publications. The dashed line represents publications from 1 January 2023 to 31 September 2024. Figure generated with R 4.2.2 (code available in supplementary material 3) with additional editing in Adobe Illustrator.

Table 3

The ten most represented journals, with their h-index, impact factor (IF), and each metric multiplied by the number of articles, n = 135. Metrics were consulted on 14/07/2024, on the journal's website for the impact factor and on SJR for the h-index (supplementary file).

| Journal | Num. of articles (N) | h-index | N ⋅ h-index | IF | N ⋅ IF |
|---|---|---|---|---|---|
| Remote Sensing | 15 | 193 | 2895 | 4.2 | 63 |
| Journal of Archaeological Science | 14 | 152 | 2128 | 2.6 | 36.4 |
| PLOS One | 11 | 435 | 4785 | 3.75 | 41.25 |
| Scientific Reports | 6 | 315 | 1890 | 3.8 | 22.8 |
| Journal of Computer Applications in Archaeology | 5 | 15 | 75 | N/A | N/A |
| Archaeological Prospection | 4 | 46 | 184 | 2.1 | 8.4 |
| Journal on Computing and Cultural Heritage | 3 | 35 | 105 | 2.7 | 8.1 |
| Archaeological and Anthropological Sciences | 3 | 42 | 126 | 2.14 | 6.42 |
| Palaeogeography, Palaeoclimatology, Palaeoecology | 3 | 177 | 531 | 2.6 | 7.8 |
| Virtual Archaeology Review | 3 | 17 | 51 | 1.6 | 4.8 |

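The two product columns of Table 3 can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `weighted_scores` and the rounding to two decimals are assumptions made for illustration, and the rows are a subset transcribed from the table rather than the authors' own script (their figures were produced in R).

```python
# Subset of rows transcribed from Table 3:
# (journal, number of articles N, h-index, impact factor or None when not listed)
JOURNALS = [
    ("Remote Sensing", 15, 193, 4.2),
    ("Journal of Archaeological Science", 14, 152, 2.6),
    ("PLOS One", 11, 435, 3.75),
    ("Journal of Computer Applications in Archaeology", 5, 15, None),
]


def weighted_scores(n_articles, h_index, impact_factor):
    """Return (N * h-index, N * IF); the IF product is None when no IF is listed.

    Hypothetical helper: the name and the two-decimal rounding are assumptions.
    """
    n_h = n_articles * h_index
    n_if = None if impact_factor is None else round(n_articles * impact_factor, 2)
    return n_h, n_if


if __name__ == "__main__":
    for name, n, h, jif in JOURNALS:
        n_h, n_if = weighted_scores(n, h, jif)
        shown_if = "N/A" if n_if is None else n_if
        print(f"{name}: N*h-index = {n_h}, N*IF = {shown_if}")
```

Keeping a missing impact factor as `None` avoids treating "N/A" as zero, which would otherwise make a journal without a listed IF look like the lowest-scoring one.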
Figure 4

Number of articles published per country based on the country of the first author’s affiliation. Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 5

(A) Number of articles from each archaeological subfield between 1997 and 2022. (B) Number of articles from each architecture class between 1997 and 2022. Empty bar charts represent the number 1. Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 6

Tree map of the different models seen in our corpus as well as the family of models they belonged to in our categorisation. Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 7

(A) The five most represented classes of input data among the reviewed papers, n = 148. (B) Results of the reviewed papers according to the authors or presented results, n = 147. Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 8

Alluvial diagram of the different tasks in the analysed studies on the left, the related architecture of machine learning models on the right and the evaluation process in the background. Tasks and architectures poorly represented (n < 5) have been classified as “others”. A study might have applied numerous models, or its research objectives could be classified into more than one task. In such cases, we created multiple entries for each paper where applicable (see supplementary material). Figure generated with R 4.2.2 (code available in supplementary material 3).
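
The data preparation described in this caption (one entry per task and architecture pairing, with rare categories grouped as "others") might look roughly like the sketch below. The records, field names, and helper functions are hypothetical, and pairing every task with every architecture is an assumption; the authors' actual R code is in supplementary material 3.

```python
from collections import Counter
from itertools import product

# Hypothetical records for illustration only; they are not taken from the corpus.
papers = [
    {"id": "paper_a", "tasks": ["classification"], "architectures": ["RF", "SVM"]},
    {"id": "paper_b", "tasks": ["object detection", "segmentation"], "architectures": ["CNN"]},
]


def expand_entries(paper):
    """Create one (id, task, architecture) entry per task/architecture pairing.

    Assumption: every task is paired with every architecture of the same paper.
    """
    return [
        (paper["id"], task, arch)
        for task, arch in product(paper["tasks"], paper["architectures"])
    ]


def lump_rare(labels, min_count=5, other="others"):
    """Replace labels that occur fewer than min_count times with a catch-all label."""
    counts = Counter(labels)
    return [label if counts[label] >= min_count else other for label in labels]


# Flatten all papers into a single list of entries, ready for plotting.
entries = [entry for paper in papers for entry in expand_entries(paper)]
```

With `min_count=5`, a label is kept only when it appears at least five times, matching the "n < 5" threshold stated in the caption.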

Figure 9

Alluvial diagram of the different tasks in the analysed studies on the left, the related archaeological subfields on the right with the evaluation process in the background. Tasks and subfields poorly represented (n < 5) have been classified as “others”. A study might have been attributed to several subfields, or its research objectives could be classified into more than one task. In such cases, we created multiple entries for each paper where applicable (see supplementary material). Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 10

Alluvial diagram of the different tasks in the analysed studies on the left, the related results on the right with the evaluation process in the background. Tasks poorly represented (n < 5) have been classified as “others”. A study might have its research objectives classified into more than one task. In such cases, we created multiple entries for each paper where applicable (see supplementary material). Figure generated with R 4.2.2 (code available in supplementary material 3).

Figure 11

Workflow adapted to machine learning solutions applied to archaeological problems. Figure generated with Microsoft Word and Inkscape.

Annexe 1

List of algorithms present in the case studies reviewed, organised by the approach and family of analysis methods in which they were categorised, along with their abbreviations, the number of times each was used in our corpus, and the number of times each was selected by the authors of a case study as the best algorithm when comparing various models.

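| Family | Model | Acronym | Nb. of uses | Nb. of times best |
|---|---|---|---|---|
| Artificial Neural Network | Feedforward Neural Network | FNN | 23 | 4 |
| Artificial Neural Network | Convolutional Neural Network | CNN | 14 | 1 |
| Artificial Neural Network | Residual Neural Network | ResNet | 12 | 2 |
| Artificial Neural Network | Mask Region-based Convolutional Neural Network | MR-CNN | 9 | 1 |
| Artificial Neural Network | Faster Region-based Convolutional Neural Network | FR-CNN | 8 | 0 |
| Artificial Neural Network | Visual Geometry Group | VGG | 8 | 2 |
| Artificial Neural Network | U-Net | U-Net | 7 | 4 |
| Artificial Neural Network | Inception Network | INC | 4 | 1 |
| Artificial Neural Network | AlexNet | AlexNet | 3 | 0 |
| Artificial Neural Network | RetinaNet | RN | 3 | 0 |
| Artificial Neural Network | YOLO | YOLO | 3 | 0 |
| Artificial Neural Network | DeepLabv3+ | DL3 | 2 | 0 |
| Artificial Neural Network | Semantic Segmentation Model | SegNet | 2 | 0 |
| Artificial Neural Network | Adaptive Deep Learning for Fine-grained Image Retrieval | ADLFIG | 1 | 0 |
| Artificial Neural Network | Bidirectional Encoder Representations from Transformers | BERT | 1 | 0 |
| Artificial Neural Network | Bidirectional Gated Recurrent Unit | BiGRU | 1 | 0 |
| Artificial Neural Network | Bidirectional Long Short-Term Memory Network | BiLSTM | 1 | 0 |
| Artificial Neural Network | Dynamic Graph Convolutional Neural Network | DGCNN | 1 | 0 |
| Artificial Neural Network | DenseNet201 | DN201 | 1 | 0 |
| Artificial Neural Network | Generative Adversarial Network | GAN | 1 | 0 |
| Artificial Neural Network | Jason 2 | JAS2 | 1 | 0 |
| Artificial Neural Network | Neural Support Vector Machine | NSVM | 1 | 0 |
| Artificial Neural Network | Region-based Convolutional Neural Network | R-CNN | 1 | 0 |
| Artificial Neural Network | Simple Network | SimpleNet | 1 | 0 |
| Artificial Neural Network | Single Shot MultiBox Detector | SSD | 1 | 0 |
| Bayesian Classifier | Naïve Bayes | NB | 11 | 0 |
| Bayesian Classifier | Maximum Entropy | MaxEnt | 2 | 0 |
| Decision Trees and Rule Induction | C5.0 | C5.0 | 7 | 2 |
| Decision Trees and Rule Induction | C4.5 | C4.5 | 4 | 0 |
| Decision Trees and Rule Induction | Decision Tree/Classification Tree | DT | 4 | 0 |
| Decision Trees and Rule Induction | Conditional Inference Trees | CTREE | 2 | 0 |
| Decision Trees and Rule Induction | Iterative Dichotomiser 3 | ID3 | 2 | 0 |
| Decision Trees and Rule Induction | Classification And Regression Tree | CART | 1 | 0 |
| Decision Trees and Rule Induction | Fast and Frugal Tree | FFT | 1 | 0 |
| Decision Trees and Rule Induction | Learning with Galois Lattice | LEGAL | 1 | 0 |
| Decision Trees and Rule Induction | Representative Trees | REPTree | 1 | 0 |
| Decision Trees and Rule Induction | Random Trees | RT | 1 | 0 |
| Ensemble Learning | Random Forest | RF | 54 | 20 |
| Ensemble Learning | Adaptive Boosting | AdaBoost | 2 | 0 |
| Ensemble Learning | Stochastic Gradient Boosting | SGB | 2 | 1 |
| Ensemble Learning | eXtreme Gradient Boosting | XGB | 2 | 1 |
| Ensemble Learning | Bootstrap Aggregating | BAgg | 1 | 0 |
| Ensemble Learning | Discrete Super Learner | dSL | 1 | 0 |
| Ensemble Learning | Fast Random Forest | FRF | 1 | 0 |
| Ensemble Learning | Gradient Boosting Regression Tree | GboostRT | 1 | 0 |
| Ensemble Learning | LogitBoost | LB | 1 | 0 |
| Ensemble Learning | Quantile Random Forest | QRF | 1 | 0 |
| Ensemble Learning | Sequential Backward Selection-Random Forest Regression | SBS-RFR | 1 | 1 |
| Ensemble Learning | Synthetic Minority Over-sampling Technique Boost | SMOTEBoost | 1 | 0 |
| Ensemble Learning | Synthetic Minority Oversampling Technique + Edited Nearest Neighbor Rule | SMOTEENN | 1 | 0 |
| Ensemble Learning | Super Learner | SP | 1 | 1 |
| Ensemble Learning | Viola-Jones Cascade Classifier | VL-CC | 1 | 0 |
| Genetic Algorithm | Genetic Algorithm | GA | 1 | 0 |
| Linear Classifier | Support Vector Machine | SVM | 26 | 2 |
| Linear Classifier | Structured Support Vector Machine | SSVM | 1 | 0 |
| Nearest Neighbour Classifier | k-nearest neighbors | kNN | 19 | 1 |
| Nearest Neighbour Classifier | Weighted k-nearest neighbors | kkNN | 3 | 0 |
| Polynomial Classifier | Support Vector Machine with Radial Basis Function Kernel | SVMr | 7 | 1 |
| Unsupervised Learning and Clustering | Affinity Propagation | AF | 1 | 0 |
| Unsupervised Learning and Clustering | Hierarchical Cluster-Based Peak Alignment | CluPA | 1 | 0 |
| Unsupervised Learning and Clustering | Databionic Swarm | DBS | 1 | 0 |
| Unsupervised Learning and Clustering | Expectation-Maximisation Clustering | EMC | 1 | 0 |
| Unsupervised Learning and Clustering | Graph-based Semi-Supervised Learning | GSSL | 1 | 1 |
| Unsupervised Learning and Clustering | Iterative Closest Point | ICP | 1 | 0 |
| Unsupervised Learning and Clustering | Iterative Self-Organizing Data Analysis | ISODATA | 1 | 0 |
| Unsupervised Learning and Clustering | Nearest Centroid | NC | 1 | 0 |
| Unsupervised Learning and Clustering | Simple Linear Iterative Clustering | SLIC | 1 | 0 |
| Unsupervised Learning and Clustering | Self-Organizing Map | SOM | 1 | 0 |
| Unsupervised Learning and Clustering | Tilburg Memory-Based Learning | TiMBL | 1 | 0 |
| Unsupervised Learning and Clustering | Time series clustering | TSC | 1 | 0 |
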
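As a rough illustration of how the counts in Annexe 1 can be summarised per family, the sketch below totals the uses and "best" selections for three small families whose rows are transcribed from the table. The function name `totals_by_family` and the tuple layout are assumptions made for this example, not part of the authors' supplementary code.

```python
from collections import defaultdict

# Rows transcribed from Annexe 1 for three small families:
# (family, acronym, number of uses, number of times selected as best)
ROWS = [
    ("Bayesian Classifier", "NB", 11, 0),
    ("Bayesian Classifier", "MaxEnt", 2, 0),
    ("Linear Classifier", "SVM", 26, 2),
    ("Linear Classifier", "SSVM", 1, 0),
    ("Nearest Neighbour Classifier", "kNN", 19, 1),
    ("Nearest Neighbour Classifier", "kkNN", 3, 0),
]


def totals_by_family(rows):
    """Sum uses and 'best' counts per family; returns {family: (uses, best)}."""
    totals = defaultdict(lambda: [0, 0])
    for family, _acronym, uses, best in rows:
        totals[family][0] += uses
        totals[family][1] += best
    return {family: tuple(counts) for family, counts in totals.items()}


if __name__ == "__main__":
    for family, (uses, best) in totals_by_family(ROWS).items():
        print(f"{family}: {uses} uses, {best} times best")
```

The same function works on the full table once every row is entered in the same four-field form.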
DOI: https://doi.org/10.5334/jcaa.201 | Journal eISSN: 2514-8362
Language: English
Submitted on: Jan 23, 2025
Accepted on: Oct 14, 2025
Published on: Dec 12, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Mathias Bellat, Jordy Didier Orellana Figueroa, Jonathan Scott Reeves, Ruhollah Taghizadeh-Mehrjardi, Claudio Tennie, Thomas Scholten, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.