Lay summary
We used machine learning to reconstruct the past ranges of 65 plant species that were used by prehistoric societies in West Asia around 10,000 years ago, including the ancestors of the first crops. Our results suggest that under the cold, arid conditions of that time these plants were much less widespread than they are now, and that they became even scarcer just before the emergence of agriculture. The Mediterranean coast of the Levant and, to our surprise, Cyprus and Western Anatolia appear to have been refugia for many economically important plants. Our data introduces a new line of evidence on early plant domestication and has implications for how archaeologists interpret the botanical remains they recover from sites: for example, the recovery of a plant from outside its ‘native range’ is often taken as evidence that it was being managed or cultivated. It also gives insights into the broader ecosystems from which the world’s first agricultural systems emerged.
Our approach was based on ‘ecological niche modelling’, a technique widely used by ecologists to model the distribution of species now or predict where they will be in the future (for example, under climate change) but which can also be combined with palaeoclimate data to ‘retro-dict’ where they were in the past. In our case, we used palaeoclimate data from the ‘CHELSA-TraCE21k’ experiment, which took one of the global models used to predict future climate change and essentially turned it backwards. Combining this with large open datasets of where plants have been observed growing today, in relation to the same set of environmental variables, we were able to model a large number of species which we know are found at early agricultural sites in West Asia. We also used data on the species observed at archaeological sites to test the predictions of our models, though this had mixed results – something we intend to probe further in future research.
1. Introduction
The Pleistocene–Holocene transition in West Asia marked a turning point in global environmental history, as humans brought the first plants under cultivation and began modifying surrounding ecosystems to support their own subsistence. West Asia is part of the native range of a remarkable number of domesticable plant species, including wild relatives of wheat, barley, peas, lentils, and other crops of global importance (Harlan and Zohary 1966; Diamond 2002; Zohary, Weiss, and Hopf 2012). These species supported uniquely dense and complex Late Epipalaeolithic (15–11.7 ka) societies (Bar-Yosef 1998; Maher, Richter, and Stock 2012) based on foraging (Harris and Hillman 1989; Colledge 2001; Weiss et al. 2004) and eventually plant management and pre-domestication cultivation (Colledge 2001; Weiss, Kislev, and Hartmann 2006; Harris 2007; Willcox, Fornite, and Herveux 2008). The first agro-ecosystems emerged as these plants were domesticated in the Pre-Pottery Neolithic period (11.7–8.5 ka) and were shaped by the broader ecosystems in which they were embedded.
Decades of research in archaeobotany and zooarchaeology have reconstructed the subsistence economies of Late Epipalaeolithic and Neolithic sites in great detail. Together with studies of other environmental archaeological records and a variety of palaeoclimate archives (Jones et al. 2019), they also tell us much about the environments surrounding these settlements. However, each of these sources of evidence is subject to the wide variety of taphonomic and recovery biases that are inherent in any direct record of the past. They are also, by definition, records of the (human) environment at particular times and places. Interpolating these snapshots to give a holistic picture of the regional ecologies is not straightforward – to date, it has tended to rely on non-explicit, inductive modelling. The majority are also filtered through human action, producing a mixed signal that makes it difficult to disentangle anthropic effects from the background of environmental change in this period of rapid climatic alteration.
In this paper we present a complementary, deductive approach based on ecological niche modelling. Rather than inferring environmental conditions from preserved physical evidence, we predict the potential niche of individual species relevant to human subsistence based on a model of their current environmental niche and simulations of palaeoclimate. Though hypothetical, this gives us a line of evidence on past ecologies that is independent of the environmental archaeological and palaeoclimatic records. This means that the ancient data can be reserved for assessing the model’s ability to ‘hindcast’ past conditions. In this sense, discrepancies between the two records are perhaps the most interesting result, as they indicate processes affecting one or both records that are not fully accounted for and therefore generate new questions. Our computational approach is also readily scaled up, allowing us to model spatially-explicit palaeodistributions for a large number of species, for the whole region, under multiple reconstructed climate scenarios.
2. Background
The transition to agriculture represents one of the most fundamental changes in human history. West Asia is one of the regions where this process has been studied in the most detail: decades of research have traced the gradual development of a Neolithic way of life and the changes that occurred in the plant species and their geographical distribution as a result. Although archaeobotanical assemblages can be biased due to issues of preservation, sampling, recovery techniques, and lab procedures (Dennell 1976; Hastorf and Popper 1988)—and although they include not just food remains but plant resources that were used for other purposes or arrived at the site accidentally (Hastorf and Popper 1988)—large-scale studies have still revealed coherent patterns in the exploitation of plants over time (Colledge, Conolly, and Shennan 2004; Arranz-Otaegui et al. 2016).
The possibility of an abrupt, geographically-constrained process of plant domestication was proposed in the 1990s (Hillman and Davies 1990, 1992; Heun et al. 1997; Özkan et al. 2002) and developed as an explanatory model in the 2000s (Lev-Yadun, Gopher, and Abbo 2000a; Gopher, Abbo, and Yadun 2001; Abbo et al. 2005). As part of this model, some authors (Lev-Yadun, Gopher, and Abbo 2000a; Gopher, Abbo, and Yadun 2001; Abbo, Lev-Yadun, and Gopher 2010, 2012) argued that eight plant species, collectively referred to as ‘founder crops’ or the ‘Neolithic crop package’ (Zohary and Hopf 1988), were selected and domesticated once, without any phase of pre-domestication cultivation (Abbo, Lev-Yadun, and Gopher 2011, 177). This process could have been rapid under strong artificial selection (Hillman and Davies 1990, 1992) and may have occurred in a single region or ‘core area’ – generally located in southeast Turkey (Ladizinsky and Adler 1976; Heun et al. 1997; Özkan et al. 2002; Özkan et al. 2005; Mori 2003; Luo et al. 2007). From this single point of origin, it was supposed that domesticated or semi-domesticated plants radiated outwards to other regions (Abbo et al. 2006; Kilian et al. 2007; Özkan et al. 2011).
The ‘short gestation’ paradigm was challenged by others (Helbæk 1969; Harris 1989; Kislev 1989; Colledge 2001; Weiss et al. 2004; Willcox, Fornite, and Herveux 2008; Fuller et al. 2018). Helbæk (in Kirkbride 1966) argued that before the appearance of domesticated plants, a phase of cultivation of wild seeds must have taken place. The existence of a phase of cultivation of morphologically wild cereals or ‘pre-domestication cultivation’ was identified in the archaeological record through the study of plant domestication traits such as grain size and shattering v. non-shattering rachises. The archaeobotanical evidence shows that during the Pre-Pottery Neolithic A (PPNA) cereals exhibited sizes similar to those recorded in domestic species, but their dispersal mechanism was still the same as the one present in morphologically-wild species (i.e. shattering, see Kirkbride 1966; Kislev 1989; Hillman et al. 2001; Colledge 2001; Willcox, Fornite, and Herveux 2008). This evidence suggested that wild cereal stands could have been cultivated for as much as a thousand years before non-shattering domestic forms became prevalent in the archaeological record (Tanno and Willcox 2006, 2012; Arranz-Otaegui et al. 2016). Additional archaeobotanical (Colledge 2001; Willcox, Fornite, and Herveux 2008; Willcox, Buxo, and Herveux 2009; Riehl, Zeidi, and Conard 2013; Arranz-Otaegui et al. 2016; Weide et al. 2018; Douché and Willcox 2018; Whitlam et al. 2018) and genetic data (Badr et al. 2000; Molina-Cano et al. 2005; Kilian et al. 2007; Özkan et al. 2011; Iob and Botigué 2023) in recent years has further challenged the short-gestation model to explain the origins of plant domestication and agriculture in West Asia.
Similarly, the concept of a limited set of eight ‘founder crops’ (Zohary and Hopf 1988) that were the first species cultivated, domesticated and then spread as the basis of Neolithic agricultural systems is not supported by the latest evidence. Our previous analyses of the composition of available archaeobotanical datasets show that these crops were of marginal importance during the Epipalaeolithic period (Arranz-Otaegui et al. 2018) and that Neolithic subsistence did not rely either solely or primarily on the exploitation of these species (Arranz-Otaegui and Roe 2023). Instead, multiple species of grasses, legumes, fruits, nuts, and other plants were exploited over the Late Pleistocene–Early Holocene transition in southwest Asia.
2.1 Biogeography and agricultural origins
The study of the natural distribution of the progenitors of domesticated crops has been a central part of discussions on the origins of agriculture and plant domestication from the beginning. von Humboldt (1807) acknowledged the importance of the natural distribution of wild species to explain the origin and domestication of crops like spelt and rye. Candolle (1886) integrated the study of plant ecology and biogeography and influenced Darwin (1859), who later reflected in detail about the geographical distribution of plants and species diversity. At that time, there was intense debate about whether there were single or multiple “centres of creation” of species. Researchers aimed to evaluate whether plant and animal species emerged in the same locations where they were currently distributed.
After Darwin, this early interest in crop origins evolved into more specific discussions about the “centres of plant domestication”. Vavilov (1926) was among the first to seek to determine the number of regions in which plants had been independently domesticated (Harris 1990). His main method was ‘differential phytogeography’: he classified the variation within a crop and established the regions of maximum diversity, to locate the geographic regions in which crops originated. Using this method, Vavilov suggested that there were at least eight centres of origin. His work was later criticised by Harlan (1971), who argued that ‘centres of origin’ and ‘centres of diversity’ had to be separated. For Harlan, a ‘centre’ was an “area in which things originate and out of which things are dispersed” (Harlan 1971, 468), and he suggested that three main centres of origin of domesticated crops existed. He further indicated that Vavilov’s approach to the question was simplistic and that more data proxies had to be considered (e.g. archaeology, history, geology), an approach more in the tradition of Candolle (1886). Indeed, the inclusion of archaeobotany and genetics in the last decades, together with the study of wild relative distributions, has been fundamental in characterising the origins of agriculture (Fuller and Colledge 2008). As a result of modern interdisciplinary studies, the number of recognised centres of plant domestication has increased considerably, from the three centres suggested by Harlan in 1971 to the six to eight centres argued for in the 1990s (Smith 1995) and up to as many as 24 potential centres reported in 2009 (Purugganan and Fuller 2009; see also Fuller 2010).
Biogeographic research into the centres of origin and/or domestication of crops has also long informed broader understanding of the process of agricultural origins. In Man Makes Himself, Childe (1936) correctly located the centre of origin of European agriculture in the ‘Fertile Crescent’ of West Asia (unlike for example Pumpelly (1908) before him). This was not based on the region’s prehistoric archaeological record, which at the time had only been cursorily explored. Instead he was guided to the region by biogeographic work by Vavilov and Peake and Fleure (1927); only later was this prediction validated by archaeological work on the Epipalaeolithic and Neolithic of Palestine (Boyd 2018). In subsequent decades, the search for more precise origin zones of specific domestic plants relied on the assumption that “the locus of domestication of a wild plant would presumably be within its area of original distribution in the wild state” (Butzer 1971; paraphrasing Helbæk 1959) – and that this “natural habitat” has not changed significantly over the last 12,000 years (Butzer 1971).
Contemporary research on crop origins was pioneered by Harlan and Zohary, who compared the current distribution of the wild progenitors of domesticated plants in southwest Asia (Harlan and Zohary 1966; D. Zohary 1969; M. Zohary 1973; Zohary and Hopf 1973) to the rapidly-expanding archaeobotanical record (Harlan 1971, 1977; see also Zohary and Hopf 1988; Harlan and Zohary 1966). Both were interested in identifying the wild ancestors of domesticated crops and in studying their natural distribution to understand the domestication process (D. Zohary 1969; M. Zohary 1973; Zohary and Spiegel-Roy 1975). Indeed, the natural distribution of the wild relatives of domestic plant species was later used as a criterion to infer ‘pre-domestic cultivation’ in the archaeological record. For example, the presence of seeds of chickpea at Jericho led M. Hopf (1986) to interpret the remains as cultivars, as the natural distribution of the wild form of chickpea was located further to the north. The same rationale was applied to the einkorn remains found at several Pre-Pottery Neolithic sites in the southern Levant (Hopf 1969; Colledge 2001), as the wild progenitor of this species was thought to be restricted to the northern Levantine area (Heun et al. 1997; Zohary, Weiss, and Hopf 2012). The same idea (the presence of plants outside their natural range) has been repeated in the literature more recently by several other authors (Tanno and Willcox 2006; Willcox, Fornite, and Herveux 2008; Hillman et al. 2001).
Despite the importance of biogeography in the development and validation of hypotheses regarding the origins of agriculture, there have been few studies of the wild range of specific crop progenitors or other relevant plant species (cf. for domestic animals, e.g. Yeomans, Martin, and Richter 2017). Observations regarding translocation or range expansion must therefore rely on a relatively rough and ahistoric notion of a species’ ‘natural distribution’ – that is, one based primarily on contemporary or recent-historic occurrences. Yet we know there has been considerable climatic and environmental change in West Asia since the terminal Pleistocene (Jones et al. 2019), so it is very unlikely that these ranges were in fact static. Reconstructions of broader environments have attempted to trace their fluctuations through time, either for the entire region (e.g. van Zeist and Bottema 1991; Hillman in Moore, Hillman, and Legge 2000) or parts of it (e.g. Cordova 2007), but these are at the level of the vegetation zone rather than individual species. They also invariably rely on what might be called ‘expert interpolation’ (where the author composes a map based on his or her own knowledge of the relevant data) rather than an explicit modelling process. This makes it difficult, if not impossible, for users of such reconstructions to understand exactly how they were derived or what could explain, for example, the significant discrepancies between the predictions of different experts.
2.2 Ecological niche modelling in archaeology
Ecological niche modelling or species distribution modelling is widely used by ecologists to mathematically predict the geographic distribution of a species from a sample of occurrences (Franklin and Miller 2009; Sillero et al. 2021). Essentially, it involves combining records of where an organism has been observed with environmental data (climate, topography, etc.) for those locations to model the range of environmental values at which that species occurs – its environmental niche. This model can then be used to predict the potential niche of the organism in question either in the same or a different environment. Townsend Peterson and Soberón (2012) suggest reserving the term ‘species distribution modelling’ for when the method is used to recover the verifiable range of a species in a real and existing environment, and using ‘ecological niche modelling’ as the broader term covering hypothetical or predictive applications – a convention we follow here when referring to predictive or ‘hindcast’ models of past niches. Within this overarching framework, ecological niche modelling encompasses a wide range of applications and a variety of potential environmental predictors, modelling approaches, and methodologies, which we will not attempt to review here.
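To make the core idea concrete, the following minimal sketch fits a rectangular ‘environmental envelope’ to a handful of presence records and then asks whether a location falls inside it under two different environments. This is a deliberately simplified stand-in and much cruder than the models used in practice; the variable names and all values are invented for illustration.

```python
from typing import Dict, List, Tuple

Env = Dict[str, float]  # environmental values at one location


def fit_envelope(presences: List[Env]) -> Dict[str, Tuple[float, float]]:
    """Record the min and max of each variable over the presence locations."""
    variables = presences[0].keys()
    return {v: (min(p[v] for p in presences), max(p[v] for p in presences))
            for v in variables}


def in_niche(envelope: Dict[str, Tuple[float, float]], env: Env) -> bool:
    """A location is suitable if every variable falls inside the envelope."""
    return all(lo <= env[v] <= hi for v, (lo, hi) in envelope.items())


# Hypothetical presences: mean annual temperature (deg C), annual precipitation (mm)
presences = [
    {"temp": 14.0, "precip": 450.0},
    {"temp": 17.5, "precip": 300.0},
    {"temp": 16.0, "precip": 600.0},
]
envelope = fit_envelope(presences)

# The same location under two hypothetical environments
today = {"temp": 15.0, "precip": 500.0}
colder_drier = {"temp": 11.0, "precip": 250.0}
print(in_niche(envelope, today))         # True
print(in_niche(envelope, colder_drier))  # False
```

The second call illustrates the predictive use described above: the fitted niche is unchanged, and only the environment it is projected onto differs.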
Ecological niche modelling has long been of interest to archaeologists as a means both of exploring the biological niche of humans and of reconstructing the past environments they inhabited (Polly and Eronen 2011; Franklin et al. 2015). In the first sense, it has been used most extensively to model the niche of humans and other hominin species (e.g. Benito et al. 2017; Yousefi et al. 2020; Banks et al. 2021; Yaworsky, Hussain, and Riede 2024; Yaworsky, Nielsen, and Nielsen 2024; Guran et al. 2024), especially in the Palaeolithic. This overlaps with what archaeologists usually call generically ‘predictive modelling’ (Verhagen and Whitley 2020), or more precisely ‘site distribution modelling’, which is essentially the same approach as (and often borrows methodologies from) ecological niche modelling but applied to the occurrence of archaeological sites. Here what is modelled is not strictly a biological niche alone, but also aspects of human geography, taphonomy, and archaeological visibility. These applications can be distinguished from ‘palaeoecological niche modelling’, where the object of modelling remains, as in ecology, a non-human biological niche.
Franklin et al. (2015) review palaeoecological niche modelling and advocate for its greater adoption in environmental archaeology. In an early application to West Asia, Conolly et al. (2012) used the occurrence of wild and domestic Bos remains at prehistoric archaeological sites to map the evolving niche of cattle over the Pleistocene–Holocene transition. It has been used to model the availability of fauna exploited by humans at wider scales (e.g. de Andrés-Herrero, Becker, and Weniger 2018; Yaworsky, Hussain, and Riede 2023) and, in a West Asian context, of foraged plant resources in the landscape around the Neolithic sites on the Konya Plain (Collins et al. 2018). Modelling the spread of crops has been another significant archaeological application (e.g. Krzyzanska et al. 2022; Krzyzanska 2023), though not as yet applied to West Asia.
In the majority of studies to date (palaeo)ecological niche modelling has been applied to archaeological data in an ‘inductive’ fashion, i.e. faunal and botanical remains from ancient sites are used as the occurrence dataset for training a model using either past or present environmental data. However, both the zooarchaeological and archaeobotanical records are sparse and subject to a complex array of depositional, taphonomic and recovery biases, many of which are not fully understood and/or cannot be corrected for. This means that while the archaeological attestation of the presence of a species might generally be relied upon, it is highly unlikely that its absence is representative of true past distributions.
The alternative approach is to train the model using contemporary occurrence and environmental data and then use palaeoenvironmental data to ‘hindcast’ its predictions backwards in time. Like Franklin et al. (2015), we view the hindcasting approach as more promising, because training datasets for both occurrences and environment are far more readily available, complete and reliable for the present than the past. There is some scepticism in the ecological niche modelling literature about the ability of such models to make accurate predictions in unknown environments (like the past, Franklin et al. 2015), but here the hindcasting approach also presents an opportunity: it reserves archaeological occurrence data as an independent dataset that can be used to assess the retrodictive performance of the model. This possibility was suggested by Franklin et al. (2015) but to our knowledge our study represents the first attempt to actually do so.
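Because archaeological absences are unreliable, one natural way to score a hindcast against archaeological occurrences is a presence-only measure such as sensitivity (the true positive rate). The sketch below shows that calculation under the assumption that each archaeological presence has already been matched to a binary model prediction; the function name and the example values are hypothetical.

```python
def sensitivity(predicted_present):
    """Share of known presences that the model also predicts as present.

    `predicted_present` holds one boolean per known presence record.
    Returns None when there are no known presences to test against.
    """
    if not predicted_present:
        return None
    return sum(predicted_present) / len(predicted_present)


# Hypothetical hindcast predictions at five sites where a taxon was recovered
hindcast_at_sites = [True, False, False, True, False]
print(sensitivity(hindcast_at_sites))  # 0.4
print(sensitivity([]))                 # None
```

A measure of this kind says nothing about false positives, which is appropriate when the absence of a taxon from a site cannot be taken at face value.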
The major practical limitation of the hindcasting approach is that it relies on spatially explicit, high resolution palaeoenvironmental surfaces with continuous coverage of the region and periods of interest. Until recently, such data has not been widely available for most applications, which is perhaps why only a minority of studies use it (cf. Krzyzanska et al. 2022; Yaworsky, Hussain, and Riede 2023). In this study, we are able to take advantage of the increasing availability of high resolution, global palaeoclimate data derived from simulation experiments with general circulation models of climate (Brown et al. 2018; Brown et al. 2020; Karger et al. 2023).
3. Data and model
The aim of our study was to model the biogeography of species relevant to human subsistence economies in West Asia (excluding the Southern Arabian peninsula, see Figure 1) during the archaeological Late Epipalaeolithic (15–11.7 ka) and Pre-Pottery Neolithic (11.7–8.3 ka) periods. Based on current understandings, we assume that plant-based subsistence during this period was broad, geographically- and temporally-varied, and reflects a gradual, geographically decentred, and nonlinear transition to greater reliance on cultivars (i.e. agriculture, see Section 2). Our starting point was a list of 68 taxa (Table 1) comprising the identified species observed at three or more Late Epipalaeolithic/Pre-Pottery Neolithic sites, according to our previous study of the regional archaeobotanical data (Arranz-Otaegui and Roe 2023). This was based on a dataset collated from three previously published regional archaeobotanical databases: ADEMNES (Riehl and Kümmel 2005), ORIGINS (Wallace, Livarda, et al. 2018), and COMPAG (Lucas and Fuller 2018; Fuller et al. 2018; based on Colledge, Conolly, and Shennan 2004; Shennan and Conolly 2007). We did not attempt to distinguish between the source of the remains (cf. Wallace, Jones, et al. 2018); archaeobotanical assemblages are subject to a variety of preservational and recovery biases, so by no means were all the species on our list consumed or even deliberately collected by people. However, we assume that their presence at a site of human settlement at least implies that they were part of the wider ecosystem that supported habitation there.

Figure 1
Map of the study region (West Asia, grey box) with locations of Late Epipalaeolithic and Pre-Pottery Neolithic archaeobotanical assemblages.
Table 1
Summary of species modelled.
| Taxon | Occurrences (arch.) | Occurrences (cur.) | Model acc. | Model ROC-AUC | Model sens. (cur.) | Model sens. (arch.) |
|---|---|---|---|---|---|---|
| Aegilops crassa | 4 | 223 | 0.98 | 0.98 | 0.26 | 0.00 |
| Aegilops speltoides | 0 | 448 | 0.99 | 0.99 | 0.77 | — |
| Aegilops tauschii | 0 | 1257 | 0.97 | 0.99 | 0.84 | — |
| Aizoanthemopsis hispanica¹ | 4 | 762 | 0.99 | 1.00 | 0.87 | 0.50 |
| Ammi majus | 4 | 10337 | 0.97 | 0.99 | 0.98 | 0.00 |
| Androsace maxima | 17 | 4946 | 0.94 | 0.98 | 0.90 | 0.00 |
| Arenaria serpyllifolia | 3 | 101617 | 0.98 | 0.99 | 1.00 | 0.00 |
| Arnebia decumbens | 21 | 617 | 0.97 | 0.97 | 0.59 | 0.05 |
| Arnebia linearifolia | 13 | 239 | 1.00 | 1.00 | 0.83 | 0.00 |
| Atriplex prostrata | 4 | 80186 | 0.99 | 0.99 | 0.99 | 0.00 |
| Avena sterilis | 5 | 15906 | 0.96 | 0.99 | 0.98 | 0.40 |
| Bassia arabica | 4 | 262 | 1.00 | 0.99 | 0.88 | 0.00 |
| Bolboschoenus glaucus | 5 | 418 | 0.98 | 0.97 | 0.63 | 0.00 |
| Bolboschoenus maritimus² | 31 | 53813 | 0.98 | 1.00 | 0.99 | 0.00 |
| Brachypodium distachyon | 5 | 13800 | 0.98 | 0.99 | 0.99 | 0.20 |
| Bromus sterilis | 3 | 89764 | 0.99 | 0.99 | 1.00 | 0.00 |
| Buglossoides arvensis | 23 | 23004 | 0.95 | 0.98 | 0.98 | 0.09 |
| Buglossoides tenuiflora | 26 | 399 | 0.99 | 0.98 | 0.79 | 0.04 |
| Capparis spinosa | 4 | 8617 | 0.96 | 0.99 | 0.95 | 0.00 |
| Carex divisa | 9 | 7400 | 0.96 | 0.99 | 0.97 | 0.00 |
| Chenopodium album | 5 | 184412 | 0.99 | 0.99 | 1.00 | 0.00 |
| Cicer reticulatum³ | 3 | 52 | 1.00 | 1.00 | 1.00 | 0.00 |
| Citrullus colocynthis | 4 | 1823 | 0.94 | 0.96 | 0.71 | 0.00 |
| Crithopsis delileana | 3 | 430 | 1.00 | 1.00 | 0.93 | 0.00 |
| Euclidium syriacum | 3 | 338 | 0.98 | 0.96 | 0.31 | 0.00 |
| Ficus carica | 8 | 64440 | 0.98 | 0.99 | 0.99 | 0.38 |
| Fumaria densiflora | 4 | 3704 | 0.96 | 0.99 | 0.95 | 0.00 |
| Gypsophila elegans | 3 | 1613 | 0.96 | 0.99 | 0.83 | 0.00 |
| Gypsophila pilosa | 5 | 348 | 0.99 | 0.99 | 0.69 | 0.00 |
| Gypsophila vaccaria⁴ | 8 | 1177 | 0.95 | 0.96 | 0.69 | 0.00 |
| Henrardia pubescens⁵ | 3 | 15 | — | — | — | — |
| Hordeum bulbosum | 4 | 4134 | 0.97 | 0.99 | 0.94 | 0.00 |
| Hordeum murinum | 5 | 74770 | 0.98 | 0.99 | 1.00 | 0.00 |
| Hordeum spontaneum⁶ | 76 | 3098 | 0.99 | 1.00 | 0.98 | 0.21 |
| Lathyrus aphaca | 4 | 18652 | 0.96 | 0.99 | 0.98 | 0.00 |
| Lathyrus oleraceus⁷ | 8 | 819 | 0.98 | 0.99 | 0.78 | 0.00 |
| Lathyrus sativus | 4 | 2120 | 0.93 | 0.97 | 0.77 | 0.00 |
| Lepidium perfoliatum | 3 | 2764 | 0.94 | 0.98 | 0.84 | 0.00 |
| Linum bienne⁸ | 14 | 12824 | 0.98 | 0.99 | 0.99 | 0.07 |
| Lolium rigidum | 5 | 10620 | 0.97 | 0.99 | 0.98 | 0.00 |
| Lolium temulentum | 3 | 4463 | 0.94 | 0.98 | 0.94 | 0.00 |
| Medicago radiata | 20 | 834 | 0.98 | 1.00 | 0.90 | 0.30 |
| Phalaris paradoxa | 3 | 3287 | 0.96 | 0.99 | 0.94 | 0.00 |
| Phragmites australis | 4 | 373066 | 0.99 | 0.99 | 1.00 | 0.00 |
| Pistacia atlantica | 6 | 2419 | 0.98 | 1.00 | 0.93 | 0.33 |
| Poa bulbosa | 5 | 30769 | 0.96 | 0.98 | 0.99 | 0.20 |
| Polygonum arenarium⁹ | 7 | 255 | — | — | — | — |
| Polygonum corrigioloides | 6 | 125 | — | — | — | — |
| Prosopis farcta | 5 | 2512 | 0.99 | 1.00 | 0.96 | 0.00 |
| Rumex pulcher | 6 | 17606 | 0.97 | 0.99 | 0.99 | 0.00 |
| Salsola kali | 6 | 14701 | 0.98 | 1.00 | 0.99 | 0.00 |
| Salvia absconditiflora¹⁰ | 3 | 128 | 1.00 | 1.00 | 0.81 | 0.00 |
| Secale cereale | 4 | 14813 | 0.95 | 0.98 | 0.98 | 0.00 |
| Secale strictum¹¹ | 3 | 162 | 0.99 | 0.98 | 0.62 | 0.00 |
| Suaeda fruticosa | 3 | 577 | 0.99 | 0.99 | 0.76 | 0.00 |
| Taeniatherum caput-medusae | 4 | 1969 | 0.97 | 0.99 | 0.90 | 0.00 |
| Triticum aestivum¹² | 4 | 217 | 0.99 | 0.96 | 0.50 | 0.00 |
| Triticum durum | 3 | 65 | 0.99 | 0.95 | 0.11 | 0.00 |
| Triticum monococcum¹³ | 47 | 870 | 0.97 | 0.99 | 0.80 | 0.04 |
| Triticum turgidum¹⁴ | 53 | 345 | 0.98 | 0.91 | 0.25 | 0.00 |
| Triticum urartu | 0 | 420 | 1.00 | 1.00 | 0.95 | — |
| Verbena officinalis | 3 | 89655 | 0.99 | 0.99 | 1.00 | 0.00 |
| Vicia ervilia | 26 | 1924 | 0.95 | 0.98 | 0.84 | 0.27 |
| Vicia faba | 7 | 42644 | 0.97 | 0.99 | 0.99 | 0.29 |
| Vicia narbonensis¹⁵ | 3 | 2826 | 0.95 | 0.99 | 0.90 | 0.33 |
| Vicia orientalis¹⁶ | 16 | 41 | 1.00 | 0.97 | 0.14 | 0.00 |
| Vitis sylvestris | 3 | 63 | 1.00 | 1.00 | 0.50 | 0.00 |
| Zygophyllum fabago | 3 | 3554 | 0.97 | 0.99 | 0.94 | 0.00 |
¹ Including Aizoon hispanicum.
² Including Scirpus maritimus.
³ Including Cicer arietinum.
⁴ Including Vaccaria pyramidata.
⁵ Excluded from modelling due to sample size.
⁶ Including Hordeum vulgare.
⁷ Including Pisum sativum and Pisum elatius.
⁸ Including Linum usitatissimum.
⁹ Including Polygonum venantianum.
¹⁰ Including Salvia cryptantha.
¹¹ Including Secale montanum.
¹² Including Triticum spelta and Triticum aestivocompactum.
¹³ Including Triticum boeoticum.
¹⁴ Including Triticum aestivum, Triticum dicoccum, and Triticum dicoccoides.
¹⁵ Including Vicia narbonense.
¹⁶ Including Lens culinaris and Lens orientalis.
The taxonomic identifications of archaeobotanical material given in our source databases were previously controlled to ensure consistency between sources and to remove taxa that cannot be reliably distinguished (for details see Arranz-Otaegui and Roe 2023). Taxonomic names were then matched to the canonical form specified in the GBIF Backbone Taxonomy (GBIF Secretariat 2023) so they could be related to modern occurrences. Archaeologically-attested domestic species meeting our inclusion criteria were replaced with their wild progenitors (where different) when gathering occurrence data, since the domestic forms are now widespread and presumably uninformative of the species’ original niche.
3.1 Occurrence data
Georeferenced occurrence data for West Eurasia between 0 and 60° of latitude was obtained from the Global Biodiversity Information Facility (GBIF) via the R package ‘rgbif’ (Chamberlain and Boettiger 2017; Chamberlain et al. 2024). The GBIF dataset (GBIF 2025a) excluded fossil occurrences, recorded absences, and records with missing or dubious coordinates. Although niche models have reasonable predictive power even with small training samples (Stockwell and Peterson 2002; Hernandez et al. 2006; Wisz et al. 2008), we did not attempt to model three taxa with fewer than 40 usable occurrences, following recommendations for niche models generally and Random Forest-based models specifically (Stockwell and Peterson 2002; Luan et al. 2020). Multiple records of the same taxon at the same coordinate were discarded because they do not impart additional information to the model. The resulting cleaned dataset used to train our niche models comprises 3,392,920 occurrences from 4769 constituent datasets.
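The two cleaning rules described here (one record per taxon and coordinate, and a minimum of 40 usable occurrences) can be sketched as follows. This is an illustration only: the function and the toy records are hypothetical, and we assume for the sketch that the threshold is applied after duplicates are removed.

```python
from collections import defaultdict

MIN_OCCURRENCES = 40  # threshold stated in the text


def clean_occurrences(records, min_occurrences=MIN_OCCURRENCES):
    """Drop duplicate (taxon, coordinate) records, then taxa that are too sparse.

    `records` is an iterable of (taxon, longitude, latitude) tuples.
    Returns a dict mapping each retained taxon to its sorted unique coordinates.
    """
    unique = defaultdict(set)
    for taxon, lon, lat in records:
        unique[taxon].add((lon, lat))  # a set keeps one record per coordinate
    return {taxon: sorted(coords)
            for taxon, coords in unique.items()
            if len(coords) >= min_occurrences}


# Hypothetical records: taxon A has 50 distinct points plus 5 exact duplicates,
# taxon B has only 15 points and falls below the threshold
records = [("Taxon A", 35.0 + i * 0.01, 32.0) for i in range(50)]
records += [("Taxon A", 35.0, 32.0)] * 5
records += [("Taxon B", 36.0 + i * 0.01, 33.0) for i in range(15)]

cleaned = clean_occurrences(records)
print({taxon: len(coords) for taxon, coords in cleaned.items()})
# {'Taxon A': 50}
```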
Although GBIF is currently the best available general-purpose occurrence dataset for the West Asia region, its coverage is uneven both geographically and from species to species. The Southern Levant, and Israel specifically, is significantly more densely sampled than other parts of West Asia (Figure 2).

Figure 2
Georeferenced occurrence records from West Eurasia used to train models (N = 1412083). Inset, right: prediction region (West Asia).
Random Forest is a presence–absence approach to niche modelling and therefore requires not just data on where a species is present, but where it is definitely not present. However, ‘absence’ data is rarely available because it requires exhaustive survey. In practice, most applications of niche modelling are ‘presence-only’ and, where absence data is required (as for Random Forest), it is supplied as a random background sample of points.¹ The purpose of this sample is to inform the model about the nature of the underlying environment. The stochastic generation process means that some of these points will overlap or fall close to presences, so ensuring the model is not overly influenced by background samples is critical to its predictive performance (Valavi et al. 2022). Here we follow the advice of Barbet-Massin et al. (2012) for regression-based species distribution models and use a large (≈10000) uniform sample of points from across the land area of the study region. These points are then weighted equally against the presences in the regression to produce a ‘balanced Random Forest’ (Valavi et al. 2022).
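The background sampling and equal weighting described above might look like the following sketch. It is not the study's implementation: the bounding box is a hypothetical placeholder, points are drawn from the whole box rather than from land only, and sampling uniformly in degrees is not uniform in area.

```python
import random


def sample_background(bbox, n, seed=0):
    """Draw n uniform random points from a (min_lon, min_lat, max_lon, max_lat) box.

    Simplification: a real workflow would keep only points that fall on land.
    """
    rng = random.Random(seed)
    min_lon, min_lat, max_lon, max_lat = bbox
    return [(rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
            for _ in range(n)]


def balanced_weights(n_presence, n_background):
    """Per-record weights chosen so that each class has the same total weight."""
    return 1.0 / n_presence, 1.0 / n_background


bbox = (25.0, 25.0, 50.0, 42.0)  # hypothetical study-region box
background = sample_background(bbox, n=10_000)
w_pres, w_bg = balanced_weights(n_presence=500, n_background=len(background))

print(len(background))  # 10000
print(w_pres > w_bg)    # True: each presence counts for more than each background point
```

With 500 presences and 10,000 background points, each presence carries twenty times the weight of a background point, so neither class dominates the fit simply by being more numerous.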
3.2 Predictor data
We modelled the occurrence of species as a function of 24 geospatial predictor variables. These included:
Sixteen ‘bioclimatic’ variables derived from monthly temperature and precipitation values, following standard practice for species distribution models (Hijmans et al. 2005). Contemporary bioclimatic predictor data for West Asia was extracted from the global CHELSA dataset (Karger et al. 2017), which predicts temperature and precipitation from downscaled general circulation model output at 1 km resolution.
Terrain aspect and slope, which at high resolution perform well as proxies for solar radiation when modelling plant occurrence (Austin and Van Niel 2011; Leempoel et al. 2015); and the topographic wetness index (TWI), which serves as a proxy for soil moisture and is particularly important in modelling arid environments (Kopecký and Čížková 2010; Campos et al. 2016; Di Virgilio et al. 2018). All three were derived from the SRTM30+ digital elevation model using algorithms from WhiteboxTools (Lindsay 2016).
Edaphic data from SoilGrids (Hengl et al. 2014, 2017), which improves model performance for plants (Dubuis et al. 2013; Mod et al. 2016; Velazco et al. 2017). Based on a recent assessment of the reliability of SoilGrids data for species distribution modelling (Miller, Blackwood, and Case 2024), we used a subset of four variables relating to soil texture (clay, silt, sand) and pH at the surface (0–5 cm depth).
For hindcasting, we used reconstructed bioclimatic data for three key climatologies generated from downscaled paleoclimate simulations from the HadCM3 general circulation model (Fordham et al. 2017; Brown et al. 2018): the Bølling–Allerød (c. 14.7–12.9 ka), the Younger Dryas (c. 12.9–11.7 ka), and the Early Holocene (11.7–8.3 ka). Terrain and soil predictors were held constant, since reconstructions of these variables in the past are not available at sufficient scale. It is unlikely that either macroscale topography or soil characteristics have altered significantly over the period of time considered here, so we assume that this does not degrade model performance, and may in fact benefit it by providing ‘anchoring’ predictors that are independent of climate change.
For training, test predictions, and archaeological predictions, predictor data was left in its native projection and resolution. For hindcast palaeodistributions, it was transformed to a common equal-area projection and a resolution of 5 km.
3.3 Random Forest
Ecological niche modelling is a classification problem that can be approached with a wide range of statistical methods. A substantial literature exists on the relative performance of these approaches and their respective parameterisations (reviewed in Valavi et al. 2022). Random Forest, a widely used machine learning algorithm, is amongst the best performing methods for presence-only species distribution models, provided it is appropriately parameterised to account for the class imbalance between presence and background samples (Valavi et al. 2021, 2022). For our application, it also has the advantage of requiring little to no manual parameter tuning to achieve good predictive results, which makes it easier to model a larger number of taxa.
For each taxon we trained a classification model to predict occurrence (presence or absence/background) based on up to 24 predictor variables (Section 3.2). Highly correlated (Pearson’s r > 0.7) predictors were removed on a taxon-by-taxon basis to mitigate overfitting due to collinearity (Dormann et al. 2013), as were redundant predictors with zero variance. We used the Random Forest algorithm implemented in the R package ‘ranger’ (Wright and Ziegler 2017) and the ‘tidymodels’ (Kuhn and Silge 2022) framework for data preprocessing and model selection. To avoid overfitting, we follow Valavi et al. (2021) in their recommended hyperparameters and use of down-sampling to balance presence and background samples. Models for each taxon were fit independently and assessed based on balanced training (¾) and test (¼) partitions.
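The predictor screening step can be illustrated with a short sketch. The following Python code is a hedged example, not the ‘tidymodels’ preprocessing used in the study: the function names are hypothetical, the greedy keep-first ordering is one of several possible ways to resolve correlated pairs, and comparing the absolute value of r against the threshold is an assumption (the text states only r > 0.7).

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_predictors(columns, max_abs_r=0.7):
    """Drop zero-variance predictors, then greedily drop highly correlated ones.

    `columns` maps predictor name -> values at the training points for one taxon.
    Predictors are visited in insertion order; one is kept only if |r| with every
    already-kept predictor does not exceed `max_abs_r`.
    """
    kept = []
    for name, values in columns.items():
        if max(values) == min(values):
            continue  # zero variance: constant across this taxon's training data
        if all(abs(pearson(values, columns[k])) <= max_abs_r for k in kept):
            kept.append(name)
    return kept
```

Because the screening runs on each taxon's own training data, the retained set can differ from taxon to taxon, which is why the text speaks of "up to" 24 predictors per model.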
4. Model assessment
We trained Random Forest models for 65 taxa using contemporary occurrence data from GBIF, a random sample of background points, and the predictor variables described in Section 3.2. Replacing the ‘current’ climate predictors with those derived from palaeoclimatic simulations (Brown et al. 2018), we could then generate hindcast predictions for reconstructed past environments in four key climate periods – a total of 260 modelled palaeodistributions. Predicted distributions for individual taxa are presented in the appendix and accompanying material.
We assessed the predictive performance of the fitted niche models in the contemporary environment based on the reserved test partition (Table 1). Model accuracy (proportion of correctly classified presence and background samples) ranged between 93% and 100%, with an average of 98%. Sensitivity (proportion of correctly classified presence samples) ranged between 11% and 100%, with an average of 83%. The area under the models’ receiver operating characteristic curves (ROC-AUC) was on average 0.986±0.017. Model sensitivity is loosely correlated with the number of occurrences available for training (Figure 3), with the worst-performing models all having fewer than 1000 recorded occurrences: Triticum durum, Euclidium syriacum, Triticum turgidum, Aegilops crassa, and Vicia orientalis. Test metrics and ROC curves for the individual models are included in the appendix.
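For readers less familiar with these metrics, their definitions can be written out explicitly. The Python sketch below is illustrative only; the function name and the toy inputs are assumptions and do not reproduce any value in Table 1. It uses the same decision rule as the text, P(present) > 0.5.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity and specificity for labels 1 (presence) / 0 (background).

    Assumes both classes are present in `y_true`.
    """
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1  # presence correctly predicted
        elif truth == 1:
            fn += 1  # presence missed
        elif predicted == 1:
            fp += 1  # background wrongly predicted as presence
        else:
            tn += 1  # background correctly predicted
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data (not from the study): 10 presences, 990 background samples.
toy_truth = [1] * 10 + [0] * 990
toy_prob = [0.9] * 2 + [0.1] * 8 + [0.1] * 990
print(classification_metrics(toy_truth, toy_prob))
```

In this toy case accuracy is 0.992 while sensitivity is only 0.2, which shows in general terms how a model can score highly on accuracy yet miss most presences when one class is much larger than the other.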

Figure 3
Model sensitivity by number of training occurrences.
The ability of the hindcast models to predict the occurrence of specific species at archaeological sites is worse, with just 8% of presences in archaeobotanical assemblages successfully predicted at a threshold of 0.5. Model sensitivity (proportion of correctly predicted presences) in relation to the archaeological data is on average 0.06±0.12 (Table 1). A full assessment of the hindcasting performance of the individual models can be found in the appendix.
In summary, when tested against the archaeological data, the majority of our models perform poorly as simple binary classifiers with a threshold of P(present) > 0.5. However, this is an arbitrary threshold; lowering it can improve sensitivity (proportion of occurrences correctly predicted) at the cost of specificity (proportion of non-occurrences correctly predicted). There are several approaches to selecting informative thresholds for binary classification in the ENM literature (Liu et al. 2005; Liu, White, and Newell 2013; Liu, Newell, and White 2016), with most aiming for an optimum compromise between sensitivity and specificity. We were unable to find a thresholding technique that significantly improved the hindcast models’ performance as binary classifiers. By definition, binarizing model output also discards information from the underlying probabilistic prediction, so we have opted to present results probabilistically as far as possible. In principle the archaeological occurrences themselves could be used to ‘calibrate’ the hindcast models and select an optimum threshold, but there are not enough archaeological attestations of most of the modelled species for this to be practicable at this time.
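One common criterion of the kind described above is to choose the threshold that maximises the sum of sensitivity and specificity. The Python sketch below illustrates that single criterion; it is not a record of the thresholding techniques actually tried in this study, and the function name and toy data are hypothetical.

```python
def max_sss_threshold(y_true, y_prob):
    """Return (threshold, sensitivity, specificity) maximising sensitivity + specificity.

    Samples with probability >= threshold are classified as present.
    Assumes `y_true` contains at least one 1 and at least one 0.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for t in sorted(set(y_prob)):  # every observed probability is a candidate
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)  # ties keep the lowest threshold
    return best


# Toy data (not from the study): no prediction exceeds 0.5.
toy_truth = [1, 1, 1, 0, 0, 0, 0]
toy_prob = [0.4, 0.3, 0.05, 0.2, 0.1, 0.05, 0.02]
print(max_sss_threshold(toy_truth, toy_prob))
```

In this toy case a fixed threshold of 0.5 would predict no presences at all, whereas the selected threshold of 0.3 recovers two of the three presences without any false positives. Selecting a threshold in this way requires enough known presences and absences to estimate both quantities reliably.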
5. Discussion
5.1 Reduction in potential niche sizes over the Pleistocene/Holocene boundary
Our reconstructed palaeodistributions (shown in full in the appendix) indicate that the majority of species had significantly different potential geographic niches in the Late Pleistocene/Early Holocene compared to today. 55 of 65 species are predicted to have had reduced niches in the past; 52 by 10% or more. Though the magnitude of the change in potential niche size from prehistory to the present likely reflects a degree of overfitting in the model (discussed further in Section 5.3), fluctuations in modelled niche size between the Bølling–Allerød (14.7–12.9 ka), Younger Dryas (12.9–11.7 ka), and Early Holocene (11.7–8.3 ka) are more directly comparable (Figure 4). The average potential niche size of modelled species was 35% smaller in the Early Holocene compared to the Bølling–Allerød, and 22% smaller during the Younger Dryas. This perhaps indicates that although this period is considered one of climatic amelioration globally (Jones et al. 2019), the colder conditions of the Pleistocene may have supported more extensive plant-based economies in West Asia specifically.
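Comparisons of niche size between periods reduce to a simple calculation on the prediction grids. The Python sketch below shows one way this could be done; it is an assumption-laden illustration, since the exact definition of niche size used for the figures above is not restated here. The 5 km equal-area cell size is taken from Section 3.2, while the function names, the optional threshold, and the use of summed probabilities as an 'expected area' are hypothetical choices.

```python
CELL_AREA_KM2 = 5 * 5  # one cell of the 5 km equal-area hindcast grid


def niche_area(prob_grid, threshold=None):
    """Potential niche area in km² from a grid of P(present) values.

    Cells set to None (e.g. sea) are ignored. With no threshold, probabilities
    are summed to give an expected area; otherwise cells above it are counted.
    """
    cells = [p for row in prob_grid for p in row if p is not None]
    if threshold is None:
        return CELL_AREA_KM2 * sum(cells)
    return CELL_AREA_KM2 * sum(1 for p in cells if p > threshold)


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent (negative = smaller)."""
    return 100.0 * (after - before) / before
```

For example, a species whose grid covers four cells in one period and two in the next has a niche area of 100 km² and then 50 km², a change of -50%.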

Figure 4
Distribution of predicted species niche size by period (logarithmic scale). Dashed lines indicate the median niche size.
Many taxa that occur (or are predicted to occur) across the ‘hilly flanks’ today—including most crop progenitors—are reconstructed to have had a significantly more restricted distribution in the terminal Pleistocene/Early Holocene (Figure 5). These include Ficus carica (fig); Hordeum spp. (wild barleys); Lathyrus aphaca and L. sativus (both marginally edible legumes); Triticum aestivum compactum (in the N. Levant), T. monococcum aegilopoides, T. durum, and Triticum urartu (but not the other wheat progenitor, T. turgidum dicoccum – see Section 5.2); Aegilops speltoides, but not Aegilops tauschii (goatgrasses); and Vicia spp. (vetches), including Vicia faba (broad beans). Most of Anatolia, Northern Mesopotamia, and the Zagros Mountains in particular disappear from the predicted niches of these species, leaving the Levant and to a lesser extent the Aegean and Cyprus as refugia.

Figure 5
Predicted species richness (sum of predicted ranges) by period.
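The richness surface in Figure 5 is described as a sum of predicted ranges, which can be sketched as a cell-wise sum over per-species grids. The Python code below is a minimal illustration under that reading; the function name is hypothetical, and whether the summed grids are binary ranges or probabilities is left to the caller.

```python
def species_richness(grids):
    """Cell-wise sum of per-species prediction grids of identical shape.

    Each grid is a list of rows; values may be 0/1 ranges or probabilities.
    """
    rows, cols = len(grids[0]), len(grids[0][0])
    for grid in grids:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all grids must share the same shape")
    return [
        [sum(grid[r][c] for grid in grids) for c in range(cols)]
        for r in range(rows)
    ]
```

Summing binary ranges gives a count of species predicted present in each cell, while summing probabilities gives an expected count.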
Our results for the Levant are consistent with the current understanding of this region as developing early intensive foraging economies (the Natufian culture, Bar-Yosef 1998) and as a centre of origin of agriculture (Zeder 2011). Within the Levant, many species show moderate reductions in potential niche size over the Pleistocene/Holocene boundary, retreating from the Badia/Transjordan region (e.g. barley, Figure 8).
Loss of the Northern Mesopotamia–Anatolia region from the predicted potential niches of crop progenitors is interesting in light of the ‘golden triangle’ hypothesis (Lev-Yadun, Gopher, and Abbo 2000b; Kozłowski and Aurenche 2005; Abbo, Lev-Yadun, and Gopher 2010), which puts this region at the centre of the development of agriculture and plant domestication. Multiple lines of archaeological evidence have emerged that point away from this hypothesis and towards a more geographically diverse origin (Asouti 2006; Fuller, Willcox, and Allaby 2011; Arranz-Otaegui et al. 2016), and our reconstructions are also consistent with the late arrival of intensive plant-based foraging economies in this region (cf. the Natufian of the Levant).
The near-absence of the Zagros in any predicted niches is also surprising, given mounting evidence that animal domestication took place just as early in the eastern Mashriq as it did in the west (Zeder 2024). We consider that the most likely explanation for this is that our flora does not include the species that were most important to plant subsistence in the east. Archaeobotanical data on Neolithic sites in the Zagros is limited (compared to the Levant in particular) due to a hiatus in field research there from the 1980s to early 2010s (Zeder 2024). Recent research (Riehl, Zeidi, and Conard 2013; Weide et al. 2017, 2018; Whitlam et al. 2018; González Carretero et al. 2023) indicates that plant subsistence in this region was based on a different set of species from that of the Levant and Anatolia.
Cyprus and the Aegean are not conventionally considered part of the primary zone of domestication but rather amongst the first regions that acquired agriculture from West Asia. Our analysis complicates this picture, as it indicates that the wild distribution of many crop progenitors included these regions. Early examples of several domesticates are recorded at sites in Cyprus, Western Anatolia and Greece (Arranz-Otaegui and Roe 2023), and the Aegean region was probably connected to West Asia by a land bridge via Anatolia until the Early Holocene (Aksu and Hiscott 2022). Were these areas part of the same broader ‘interaction sphere’ that produced Neolithic agriculture in West Asia?
Exceptions to the dominant trend of niche size reduction include Cicer reticulatum (wild chickpea), which has a relatively stable potential niche centered on Northern Mesopotamia; and Triticum turgidum dicoccum (wild emmer wheat), which is predicted to occur in two limited zones centered around the Black Sea Coast of Anatolia and the Palmyra basin. In the latter case, neither of these areas is part of the predicted (by our model) modern distribution of wild emmer, which is centered around the Caucasus and Northern Mesopotamia. This would, however, be consistent with archaeological evidence for early cultivation at sites in the Upper Euphrates (Willcox 2024).
5.2 Biogeography of crop progenitors
Almost all the cereal and legume crop progenitors we modelled are predicted to have only been found in the Levant during the terminal Pleistocene and Early Holocene (see appendix). Part of this may be to do with the fact that both our initial flora and training occurrence dataset have a strong bias towards the southern Levant, but the modelled current potential niches of these plants do tend to include Anatolia and the Zagros, so this cannot be the only factor. It has previously been proposed that the Southern Levant was an “important glacial refuge area for wild cereals” (Roberts et al. 2018), which accords with these results.
One notable exception is the wild ancestor of chickpea (Cicer reticulatum), which is predicted to have a distribution centered on Northern Mesopotamia but encompassing much of the ‘hilly flanks’ (except the southern Levant, Figure 6). Another is rye (Secale cereale), which is inferred to be primarily Anatolian (Figure 6). This is perhaps relevant to rye’s unusual domestication history, as a crop of West Asian origin that was intensively exploited (Hillman et al. 2001; Douché and Willcox 2018) but apparently not first cultivated until much later than the ‘founder crops’, in Europe (Schreiber et al. 2021).

Figure 6
Predicted palaeodistribution of wild chickpea and rye in the Early Holocene (11.7–8.2 ka).
Flax (Linum bienne) is predicted to have had a highly concentrated distribution in Cyprus and along the Mediterranean coast of the southern Levant (Figure 7). This is consistent with its low ubiquity in archaeobotanical assemblages (Arranz-Otaegui and Roe 2023), despite conventionally being considered a ‘founder crop’, and presumably implies that its domestication was similarly geographically constrained. It is the only unambiguous crop progenitor with such a restricted potential niche, though pistachio (Pistacia atlantica) and clubrush (Bolboschoenus maritimus) are similarly constrained to the Mediterranean coast (and North Africa, in the case of pistachio) (Figure 7). This is despite the fact that they are well-attested in the archaeobotanical record from across West Asia.

Figure 7
Predicted palaeodistribution of flax, pistachio and clubrush in the Early Holocene (11.7–8.2 ka).
Wild barley (Hordeum spontaneum) and its relatives show a contraction of their predicted potential niche from the Pleistocene to the Holocene, concurrent with barley being brought into cultivation (Figure 8). Barley also sees a marked decline in the archaeobotanical record from the Early PPNA/Early PPNB (where it was amongst the most common taxa) to the Late PPNB and Late Neolithic (Arranz-Otaegui and Roe 2023). Pistachio (Pistacia atlantica) shows similar trends, but it is less certain that this species was managed in the Neolithic. Conversely, the various wild wheat species native to West Asia show almost no response to Pleistocene/Holocene climate change, even within the Levant, and in the archaeobotanical record wheat displays the opposite trend to barley and pistachio – becoming gradually more abundant through the course of the Neolithic and dominant by its end (Arranz-Otaegui and Roe 2023).

Figure 8
Predicted palaeodistribution of wild barleys.
Bread wheat (Triticum aestivum), the most common wheat cultivar today, has a complex ancestry that involves two recent hybridisation events (Levy and Feldman 2022): most recently between domestic emmer (Triticum turgidum dicoccum) and a goatgrass (Aegilops tauschii) c. 9 ka, and before that, in emmer, between wild red einkorn (T. urartu) and another goatgrass (Aegilops speltoides). The predicted potential niches of the two recent progenitors (T. turgidum dicoccum and A. tauschii) have relatively limited areas of overlap. These include the Syrian Jazira (Figure 9), close to the only two sites in our archaeobotanical database where bread wheat (T. aestivum) co-occurs with both of the recent progenitors (A. tauschii and T. turgidum dicoccum): Tell Abu Hureyra and El Kowm II (Arranz-Otaegui and Roe 2023).[2] Taken together, this suggests that domesticated bread wheat, otherwise only loosely constrained geographically to the Levant–Upper Euphrates corridor (Levy and Feldman 2022), originated in this vicinity.

Figure 9
Combined predicted palaeodistributions of bread wheat progenitors.
5.3 Disagreement between hindcast models and archaeobotanical composition
In general there is middling to low agreement between our hindcast predictions of species niches and observed archaeological occurrences of the same species (Table 1).
The hindcasting performance of models is not strongly correlated with the number of occurrences in the training dataset, the performance of the model on test data, or the number of attested archaeological occurrences (Figure 10). This argues against a single, simple explanation for the discrepancy, i.e. that one or the other signal is ‘wrong’.

Figure 10
Summary of hindcasting performance of models in relation to model performance and available archaeological test data.
The core issue is that the two sets of evidence discussed here (modelled palaeodistributions and archaeologically-attested palaeooccurrences) are signals of related but distinct phenomena. The hindcast niche models are projections of the species’ present realised niche into a reconstructed past climatology. The archaeobotanical record is an incomplete sample of a past realised niche passed through various anthropic and taphonomic filters. Neither can be assumed to be a fully accurate representation of the ultimate target of this study, which is the fundamental niche of the species in relation to reconstructable climate and environmental factors – though both carry information about it.
For each specific species there are a variety of potential explanations for why the signals differ, but without further lines of evidence we are not in a position to distinguish them. In general, our models probably underestimate the fundamental niche or potential distribution of the target taxon for a variety of reasons:
The inherent tendency of machine learning models to overfit to the training scenario, despite our efforts to mitigate this;
Uneven spatial sampling intensity of the GBIF occurrence dataset (see Section 3.1), meaning that some parts of the species’ niche are probably either over- or under-represented in the training data;
Use of fairly coarse (2000 year) time-averaged climate slices, which capture a general trend but are probably unrepresentative of the environment around sites at the specific time at which they were occupied.
At the same time, there are several reasons to believe that the archaeobotanical data (at least as it has been compiled for this study) overestimates the fundamental niche:
Archaeological deposits are also time-averaged signals, in some cases over quite large spans of time;
The chronology attached to our archaeobotanical data is imprecise, e.g. primarily based on point estimates of radiocarbon dates (Michczyński 2007), producing a further averaging effect;
Archaeological occurrences could be the result of human transplanting, range expansion and/or transport, especially considering this is the period where agriculture began to be practiced;
Most archaeobotanical samples in our archaeological test dataset were small, so individual errors and misidentifications have a correspondingly large impact on quantification of model performance.
Beyond these limitations of the data, we cannot rule out more substantive reasons for the discrepancy between predicted and observed archaeological occurrences. The niches of the modelled species could have changed since the Early Holocene, which would not be captured in a model trained purely on modern specimens. Human economic choices (mobility, foraging strategies, cultivation, etc.) could also produce archaeobotanical assemblages whose composition departs significantly from that of the surrounding local flora.
What are the implications of this assessment? In the immediate term, the modelled palaeodistributions presented here should be taken as conservative estimates – a minimal likely potential niche of the species. Closer inspection of discrepancies between the modelled and attested distributions of particular species in relation to specific environmental, taphonomic and cultural factors could yield further insights. Future applications of hindcast palaeoecological niche models in archaeology could refine the methodology in several ways: using more finely resolved palaeoclimate sequences (e.g. Karger et al. 2023); building more refined archaeological chronologies; performing hyperparameter tuning to further mitigate overfitting;[3] and compiling expanded archaeological datasets that could be used to calibrate binary classification.
6. Conclusion
We present the first continuous, spatially explicit models of the palaeodistributions of 65 plant species found regularly in association with Late Epipalaeolithic and Early Neolithic sites in West Asia. This deductive approach—modelling the niche of a species based on its occurrence in relation to environmental factors today, and using this together with palaeoclimate simulations to infer its past distribution—represents a new line of evidence on the archaeoecology of the world’s first agricultural societies. It provides a complementary picture to that gleaned from environmental archaeology and climatological archives because it is independent of the taphonomic, anthropic, and recovery-related processes that affect these records.
The modelled palaeodistributions of the species presented here (see Appendix) represent plausible minimum potential niches under the average climatic conditions of the Bølling–Allerød, Younger Dryas, and Early Holocene periods. The models’ generally high performance as assessed against independent contemporary test datasets lends confidence to these predictions. Their application as a binary predictor of archaeological presence or absence is significantly less promising, which likely reflects a combination of methodological limitations of our modelling approach, the incomplete and coarsely temporally-resolved nature of the archaeobotanical test dataset, and genuine discrepancies between the species’ fundamental niche and its occurrence in the archaeological record. Whether on the broad scale (e.g. the restricted geographic range of most species compared to their attestation in the archaeological record), or relating to specific species (i.e. false positives and false negatives), these discrepancies suggest several avenues for future investigation.
Modelling a large number of species using machine learning, the substantial occurrence datasets available for the present day, and a hindcasting approach to past distributions also represents a significant advance in the methodology of palaeoecological niche modelling. This approach is enabled by the availability of high quality, global open datasets in ecology (GBIF 2025b; GBIF Secretariat 2023), earth science (Farr et al. 2007), and climatology (Karger et al. 2017; Brown et al. 2018). However, several areas of methodological improvement are evident. Notably, the state of open data availability in archaeology lags conspicuously behind the fields mentioned above. Though we have benefited from the relatively long tradition of compiling archaeobotanical data in our region of study (Colledge, Conolly, and Shennan 2004; Shennan and Conolly 2007; Riehl and Kümmel 2005; Lucas and Fuller 2018; Fuller et al. 2018; Wallace, Jones, et al. 2018; Wallace, Livarda, et al. 2018), further development of open, comprehensive and up-to-date ‘backbone’ datasets on site locations and chronologies is needed to advance archaeoecological modelling to the same level.
Data Accessibility Statement
The data and R code used to produce this study are archived with Zenodo at https://doi.org/10.5281/zenodo.14629984. Modelled palaeodistributions in raster format (.TIF) can be found in the same repository.
Additional File
The additional file for this article can be found as follows:
Appendix
The ecological niche models of 65 plant species relevant to the subsistence of Late Epipalaeolithic (15–11.7 ka) and Neolithic (11.7–8.2 ka) societies in West Asia. DOI: https://doi.org/10.5334/oq.163.s1
Notes
[1] We use a background sample that is representative of the entire study area, as opposed to attempting to construct a ‘pseudo-absence’ sample. See Sillero et al. (2021) for the distinction.
Acknowledgements
We are grateful to Alex Weide and the second, anonymous reviewer of our manuscript, whose suggestions have greatly improved this study.
Competing Interests
The authors have no competing interests to declare.
