1. Context
Introduction
Southern Africa (defined here as South Africa, Lesotho, Eswatini and Namibia) is renowned for its rich archaeological record of Stone Age occupation spanning over a million years. Today, this region comprises nine vegetation biomes and experiences a west-to-east seasonal rainfall gradient [1]. Since John Goodwin and Clarence van Riet Lowe’s seminal work ‘The Stone Age Cultures of South Africa’ [2], archaeological sites, specifically lithic assemblages, have been attributed to the three phases of the Earlier (ESA), Middle (MSA) and Later Stone Age (LSA). Until the introduction of radiocarbon dating in the 1960s, the Stone Age sequence was based entirely on relative chronology. While subsequent developments in radiometric dating techniques have extended absolute chronologies further back into the Pleistocene, the broad lithic sequence has, overall, stood the test of time. However, the taxonomy, nomenclature and specific temporal boundaries have undergone continual refinement in subsequent decades [e.g. 3, 4, 5, 6, 7, 8]. The past 20 years have seen major developments in chronological and palaeoenvironmental data, providing new context for lithic assemblages. This has been important in advancing discussion surrounding the nature and tempo of technological and demographic changes. While these trends are well-resolved in some regions (for example, the southern Cape coast), our understanding of the behavioural and settlement histories of other areas remains more limited. To achieve a more representative picture of human development in southern Africa, data from all ecological zones must be incorporated into scientific narratives. Accordingly, there is a need for a spatially comprehensive, multi-period index of Stone Age sites to facilitate research at a macro-regional scale.
Background
Early schemes for Stone Age archaeology in southern Africa drew heavily on European frameworks, using perceived similarities with the Palaeolithic record of France to construct a relative sequence [2, 9, 10]. Visits by European scholars to South Africa reinforced this, directly importing French terminology (e.g. Acheulean, Mousterian, Aurignacian) and diffusionist narratives [11, 12, 13, 14]. As knowledge increased, African-specific terms were introduced (e.g. Capsian, Magosian, Lupemban), and southern Africa’s own nomenclature established (e.g. Fauresmith, Stillbay, Wilton) [4, 15, 16]. With the advent of radiocarbon dating and the development of chronometric techniques extending beyond its ~40,000-year limit, this relative chronology, built on stratigraphic positions and typological markers, was refined [17, 18, 19, 20, 21, 22]. Yet, the utility and application of culture-historic labels or NASTIES (“Named Stone Tool Industries”) has been repeatedly questioned [23, 24, 25, 26].
The acceleration of research in the twenty-first century, with tightly contextualised stratigraphies, ages, and palaeoenvironmental proxies, prompted a re-evaluation of the South African and Lesotho Stone Age sequence (SALSA) [7]. This synthesis considered over 240 dated assemblages and proposed a revised framework of technocomplexes [sensu 27] for the region. Inclusion in SALSA’s accompanying site list required directly dated sequences and well-described lithic assemblages, meaning that undated excavated and surface sites were excluded. While this was entirely justified for producing a unified sequence, there has since been a shift in research emphasis from identifying the origins of behavioural traits [e.g. 28, 29, 30], towards documenting localised adaptations and broader demographic trends [e.g. 31, 32, 33]. Rather than high temporal resolution, this requires broader spatial coverage across southern Africa’s diverse environments.
Southern Africa’s varied physiographic, geomorphic and socio-economic contexts have strongly dictated the trajectory of archaeological study [34]. Areas with deep cave and rock shelter sequences and good organic preservation, such as the southern Cape coast and the Cederberg and Maloti-Drakensberg mountains, have long dominated research. By contrast, interior zones like the Karoo, where deep-sequence shelters are rare, have been studied mainly through surface survey. Notably, work by Peter Beaumont [35, 36] and Garth Sampson [37, 38, 39] has drawn attention to the abundance and significance of open-air occurrences for understanding wider population histories [4, 40, 41, 42, 43].
Various survey programmes, through both Cultural Resource Management (CRM) and research, have generated high-resolution regional datasets. Namaqualand [44, 45, 46] and Lesotho [47, 48] have seen extensive mitigation study ahead of mining or dam construction. Centralised national databases for South Africa include all heritage management cases (South African Heritage Resources Information System, SAHRIS) [49, 50] and rock art data (South African Rock Art Digital Archive, SARADA) [51], though access to some spatial information is restricted to protect the archaeological sites. Large-scale regional studies include Sampson’s interior-zone Orange and Seacow River surveys [37, 38, 39, 41], and Parkington’s Spatial Archaeology Research Unit surveys in the Western Cape west coast and Cederberg mountains [52, 53, 54, 55, 56]. Recent studies provide detailed landscape-scale insights into the Olifants River [57], Doring River [58], Tankwa Karoo [59, 60], Modder River [61], Limpopo River [62], Stormberg Mountains [63], Senqunyane and Phutiatsana Rivers in Lesotho [64, 65], and Zebra River in Namibia [66, 67]. While these illustrate the richness of open-air archaeology, they also present the challenge of defining site boundaries and integrating diffuse surface scatters into spatial syntheses [68].
Existing syntheses and datasets
Southern Africa’s long archaeological and research history has generated several multi-period syntheses [4, 7, 8, 69, 70] as well as temporally-focused overviews which address the ESA, MSA and LSA [e.g. 71, 72, 73, 74, 75, 76, 77]. Additionally, regional reviews have highlighted the interior zone, including Lesotho [78, 79], the Kalahari [80], and the Free State [81]. There are three key existing spatial and chronological datasets for southern African sites: SALSA (recently referred to as the “Southern African Sequence,” now including Eswatini, Namibia, Botswana, Zimbabwe and Mozambique) [7, 8], the Southern African Radiocarbon Database (SARD) [21], and the ROCEEH Out of Africa Database (ROAD) [82]. These are complementary resources, each prioritising different data and aims, and serving distinct analytical purposes (Table 1).
Table 1
Comparison of southern African Stone Age site databases.
| DATA SCOPE | SALSA | SARD | ROAD | SASSI |
|---|---|---|---|---|
| Spatial coverage | South Africa Lesotho Eswatini Namibia Botswana Zimbabwe Mozambique | South Africa Lesotho Eswatini Namibia Botswana Zimbabwe Mozambique | Africa Europe Asia | South Africa Lesotho Eswatini Namibia |
| Temporal coverage | ESA, MSA, LSA | Late MSA, LSA (<50 ka) | ESA, MSA, early LSA (3 Ma–20 ka) | ESA, MSA, LSA |
| Chronological resolution | Dated only | Radiocarbon dated only | Dated and undated | Dated and undated |
| Context | Excavated | Excavated | Excavated and surface | Excavated and surface |
| Location coordinates | None | Approximate (~1 km) | Accurate (1 m–1 km) | Accurate (1 m–1 km) |
| Archaeological information | Technocomplex | Technocomplex | Technocomplex, assemblage and non-lithic materials | Technocomplex |
The SALSA site list contextualises lithic assemblages within a robust temporal and taxonomic scheme, aiming to “serve as a useful resource to both students and professionals, and to fuel research and debate” [8: 172]. While it has been critiqued for homogenising variable assemblages under single labels that perpetuate “type site” naming systems [26, 83, 84], in practical terms archaeologists require a heuristic shorthand to draw comparisons and interpret behavioural patterning [8, 42]. Its detail and consistency make it a highly useful reference tool for identifying large-scale techno-typological trends; however, its format as a listed appendix rather than a searchable database, and its exclusion of undated and surface sites remain significant limitations.
Within a wider practice of compiling standardised radiocarbon datasets [e.g. 85, 86, 87], Loftus et al.’s Southern African Radiocarbon Database (SARD) provides an open-access compilation of ~2500 published radiocarbon dates from over 600 sites across southern Africa [21, 88]. Each entry includes laboratory identifiers, sample types, methods, and uncalibrated ages, which are compatible with open-source OxCal calibration software. Through the IntChron Integration Tool (https://intchron.org), SARD offers integrated modelling and mapping tools such as Kernel Density Estimates and time-slice visualisation of probabilistic ranges for calibrated dates. While highly useful for palaeodemographic modelling within the last 50 ka [89, 90, 91], SARD’s primary limitation is its chronological scope: it excludes undated and non-radiocarbon-dated sites and therefore remains unsuitable for spatially comprehensive or deep-time studies.
Beyond southern Africa alone, the ROCEEH Out of Africa Database (ROAD) documents early human behaviour and dispersals within and out of Africa [82]. With over 2685 sites and 27,732 assemblages from Africa, Europe and Asia, covering a timeframe from 3 million to 20,000 years ago, ROAD provides a relational, GIS-linked online database. This includes approximately 600 African localities, with a strong representation in South Africa (25%), particularly the southern coastal and north-eastern regions. Unlike SALSA or SARD, ROAD incorporates detailed contextual information across multiple archaeological material classes (cultural, faunal, botanical and human remains) and summarises lithic data (raw material, typology, technology and function) at the assemblage scale. Its scale and interoperability make it a powerful resource for macro-scale comparative studies [e.g. 92, 93, 94, 95]. However, its spatial and chronological breadth comes at the expense of local-scale resolution and, as such, its uncritical use can produce a biased picture.
Dataset aims
The Southern African Stone Age Site Index (SASSI) does not aim to supersede these datasets; rather, it works in conjunction with the taxonomic framework of SALSA, radiocarbon chronology of SARD, and contextual detail of ROAD. The primary aim of SASSI is to provide an updated, FAIR (Findable, Accessible, Interoperable, Reusable) dataset for southern Africa’s full Stone Age sequence, integrating both spatial and temporal information. Its novel contributions are its spatial scope, and inclusion of sites which lack radiometric ages but can be securely attributed to a chrono-cultural period based on lithic characteristics. By encompassing open-air, surface, and ‘legacy’ sites, the dataset seeks to highlight contrasting research histories across southern Africa’s diverse ecological zones and stimulate renewed investigation in underrepresented regions (Table 2).
Table 2
Number of sites shared between SASSI and other southern African datasets, by biome (ordered by decreasing area in South Africa).
| BIOME | SALSA | ROAD | SARD | SASSI | SASSI (DATED) | SASSI (UNDATED) |
|---|---|---|---|---|---|---|
| Savanna | 50 | 37 | 46 | 101 | 77 | 24 |
| Grassland | 49 | 22 | 55 | 93 | 77 | 16 |
| Nama-Karoo | 11 | 6 | 42 | 60 | 47 | 13 |
| Fynbos | 52 | 28 | 77 | 122 | 104 | 18 |
| Albany Thicket | 10 | 2 | 7 | 13 | 12 | 1 |
| Succulent Karoo | 13 | 3 | 61 | 77 | 64 | 13 |
| Forest | 1 | 0 | 1 | 1 | 1 | 0 |
| Desert | 9 | 3 | 14 | 20 | 20 | 0 |
| Indian Ocean | 3 | 3 | 3 | 5 | 4 | 1 |
| Total | 198 | 104 | 306 | 492 | 406 | 86 |
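As a quick arithmetic check on Table 2, the dated share of SASSI sites in each biome can be recomputed from the dated and undated columns. The sketch below simply hard-codes the table values; the names `TABLE2` and `dated_share` are illustrative and are not part of the dataset.

```python
# Values transcribed from Table 2: biome -> (SASSI dated, SASSI undated).
TABLE2 = {
    "Savanna": (77, 24),
    "Grassland": (77, 16),
    "Nama-Karoo": (47, 13),
    "Fynbos": (104, 18),
    "Albany Thicket": (12, 1),
    "Succulent Karoo": (64, 13),
    "Forest": (1, 0),
    "Desert": (20, 0),
    "Indian Ocean": (4, 1),
}


def dated_share(dated, undated):
    """Return the percentage of sites that have radiometric ages."""
    return 100.0 * dated / (dated + undated)


total_dated = sum(d for d, _ in TABLE2.values())
total_undated = sum(u for _, u in TABLE2.values())

for biome, (dated, undated) in TABLE2.items():
    share = dated_share(dated, undated)
    print(f"{biome:16s} {dated + undated:4d} sites, {share:5.1f}% dated")

overall = dated_share(total_dated, total_undated)
print(f"{'Total':16s} {total_dated + total_undated:4d} sites, {overall:5.1f}% dated")
```

The column sums reproduce the totals in the table (406 dated and 86 undated sites, 492 overall).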
Spatial coverage
SASSI consists of 492 Stone Age archaeological sites from southern Africa, including South Africa (n = 434), Lesotho (n = 15), Eswatini (n = 8) and Namibia (n = 35) (Figure 1). It aims to be fully comprehensive for sites in South Africa and Lesotho where lithic assemblages have been described. While key (dated) sites in Namibia and Eswatini have been included, data from these regions must be considered incomplete due to less developed research histories and more limited accessibility of the literature. However, recent work is remedying this situation [96, 97, 98, 99].

Figure 1
Map of sites in the SASSI database, by present-day biome. Base map: Natural Earth Data.
Northern boundary: –17.02809 (Ovizorombuku 96)
Southern boundary: –34.81991 (Paapkuil Fontein 7)
Eastern boundary: +31.98889 (Border Cave)
Western boundary: +12.20027 (Cunene River, Cafema)
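The four boundaries above define a simple bounding box (northern and southern values are latitudes, eastern and western values are longitudes). A minimal sketch of how they might be used to sanity-check coordinates, for example to catch a swapped sign, follows; the function name `in_sassi_extent` is hypothetical.

```python
# Extent of the dataset as listed above (decimal degrees, WGS84).
NORTH, SOUTH = -17.02809, -34.81991
EAST, WEST = 31.98889, 12.20027


def in_sassi_extent(lat, lon):
    """Return True if a point lies inside the dataset's bounding box."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


print(in_sassi_extent(-29.0, 24.0))  # a point in the interior of the box
print(in_sassi_extent(29.0, 24.0))   # latitude entered with the wrong sign
```

A bounding-box test only flags gross errors; it says nothing about whether a point falls within one of the four countries covered.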
Temporal coverage
The dataset includes archaeological sites attributed to the ESA (n = 61), MSA (n = 163) and LSA (n = 392). Because multi-period sites are counted under each period represented, these counts sum to more than the 492 sites in the index. The ESA spans from around 2 million years ago (Ma) to 200 thousand years ago (ka), the MSA from 300 to 40 ka, and the LSA from 40 ka to historical times, less than 300 years ago (Figure 2). Of the total LSA sites, 119 (30%) have only Ceramic LSA remains, post-dating 2 ka, while 273 (70%) have pre-ceramic LSA remains.

Figure 2
Map of sites in the SASSI database (focused on South Africa), by time period (ESA, MSA, LSA). Circles denote open sites; triangles denote caves and rock shelters. Base map: Natural Earth Data.
2. Methods
Steps
The SASSI dataset was initially compiled between 2013 and 2018 for a demographic comparison of the southern African Stone Age presented in Chapter 10 of the author’s doctoral dissertation and as Appendix C [100]. This work was framed around new lithic and landscape-use insights gained from the previously unstudied Tankwa Karoo region of the South African interior [59, 60]. The site index was subsequently updated in June 2022 and March 2025, when it was first shared for reuse on the Open Science Framework (SASSI v1, 08/03/2025). Two further rounds of updates and screening for accuracy and consistency preceded its publication in its current form (SASSI v3, 19/01/2026).
Systematic and detailed literature searches aimed to extract the following key information for reported sites: (1) location, (2) lithic technocomplexes/culture-historical units present, and (3) ages where available. Digitally accessible published literature was thoroughly searched using Google Scholar, beginning with existing reviews and syntheses, then pursuing the original sources cited to verify information. In addition to academic journals, sources included unpublished research dissertations available from online repositories, as well as Cultural Resource Management ‘grey literature’, such as Heritage Impact Assessment reports, accessed from the SAHRIS online database (https://sahris.org.za/). Undigitised sources (i.e. books, dissertations, reports) not in the author’s library were accessed at the Department of Archaeology, University of Cape Town (2011–2013), or the Haddon Library, University of Cambridge (2013–2018). Sites included in SALSA, SARD and ROAD were evaluated against SASSI’s criteria (see Sampling strategy for details), with a further 81 sites added.
Sampling strategy
Site type
Stone Age sites were included if lithic assemblages could be attributed at least to the broad period of ESA, MSA or LSA. ‘Sites’ include excavated and surface assemblages in cave/rock shelter and open-air contexts. For sites included in SARD [21, 88], only ‘occupation’ sites are considered; dated burials [101, 102], rock art [103] and pottery [104, 105] without associated assemblages were excluded. Sites where human or hominin fossils occur without associated cultural remains are also not included [but see 106]. Some further archaeological sites in the ROAD database were not included in SASSI, specifically those documented in older (pre-1970s) literature with limited spatial resolution and assemblage information.
Spatial criteria
For landscape-based studies of undated scatters [e.g. 39, 57, 59], only ‘named’ sites with specifically identifiable periods are included. The same applies to Heritage Impact Assessments, since including every reported occurrence would accentuate research bias in certain regions such as the Western and Northern Cape. This bias is particularly evident in Namaqualand; however, because sites there are published in a research format alongside radiocarbon dates, they meet the criteria for inclusion [44, 45, 46].
Chronological criteria
Published, undated occurrences that represent coarse-grained ESA, MSA or LSA occupation with reasonably secure locations are included at the ‘time-period’ scale but not given finer-grained industry-level attributions. This often includes surface sites, but also some cave/shelter sites with information from older excavations. For the final phase of the LSA after 2 ka, sites with hunter-gatherer occupation that show herder or farmer interaction (i.e. Ceramic LSA, Iron Age, Historical) are not fully comprehensive unless there are also pre-contact LSA lithic assemblages at the site. All sites with ages older than 300 years ago that are listed in SALSA and SARD are included to maintain consistency across databases.
Quality Control
Spatial data
Site locations were obtained from the literature (e.g. descriptions, maps, images, GPS coordinates), checked against Google Earth satellite imagery, and adjusted for accuracy where appropriate. Sites were not included if the location was deemed too unreliable. Given the need to protect the specific location of sites from unregulated visits and potential damage, not all publications provide precise coordinates, requiring some approximation based on location descriptions.
Spatial data are assigned an ‘accuracy’ score of 1 to 4 (see Data structure below). Locations scoring 1 or 2 should be accurate enough for sites to be relocated in the field; for sites scoring 3 or 4, the spatial data are sufficient for analyses at the 30 arc-second (~1 km) scale, appropriate for use with modelled palaeoclimatic data such as pastclim [107].
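To illustrate the ~1 km figure: 30 arc-seconds is 1/120 of a degree, which corresponds to roughly 0.93 km of latitude. The sketch below shows one way a coordinate might be snapped to such a grid. The function names are hypothetical, the Earth is treated as a sphere of mean radius 6371 km, and this is not the pastclim interface.

```python
import math

CELLS_PER_DEGREE = 120    # 3600 arc-seconds per degree / 30 arc-seconds
EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)


def cell_size_km(lat):
    """Approximate (north-south, east-west) size in km of a 30 arc-second cell."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = km_per_degree / CELLS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west


def grid_cell(lat, lon):
    """Return (row, col) of the cell containing a point.

    Rows count northward from latitude -90, columns eastward from longitude -180.
    """
    row = math.floor((lat + 90.0) * CELLS_PER_DEGREE)
    col = math.floor((lon + 180.0) * CELLS_PER_DEGREE)
    return row, col


ns, ew = cell_size_km(-30.0)
print(f"cell at 30 degrees S: {ns:.3f} km x {ew:.3f} km")
print(grid_cell(-30.0010, 20.0010) == grid_cell(-30.0050, 20.0050))
```

Two locations that fall in the same cell are indistinguishable at this resolution, which is why approximate coordinates can still be adequate for analyses on such a grid.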
Chronological data
Radiometric ages are included in the dataset where available but were not a requirement for site inclusion. Uncalibrated radiocarbon ages were calibrated using OxCal 4.4 or extracted from SARD through the IntChron tool. Where multiple individual dates were obtained for a single technocomplex or industry, ages with good agreement were generally grouped and reported as ranges. The SALSA review provides chronological ages from radiocarbon and other radiometric techniques with individual ages, error margins, and sample numbers [7, 8]. Some radiocarbon ages from older samples (1970s–1990s) that were not included in SARD or SALSA are provided in SASSI, clearly marked as both calibrated and raw ages.
Constraints
Southern Africa’s archaeological research history spans more than 100 years, with early prospection, excavation and collection occurring prior to GPS, radiometric dating and modern documentation standards. As a result, there are some sites with obvious limitations to their data resolution and quality. Nevertheless, it was important to include these in the dataset where possible, since some key sites remain frequently cited (e.g. Howieson’s Poort Shelter: [108]; Peers Cave: [109]), or could highlight potential for modern reinvestigation (e.g. Holley Shelter: [110, 111]; Olieboomspoort: [112, 113]).
Open-air, especially surface, sites also present certain challenges, since they are often reported only in broad chronological terms (ESA, MSA or LSA) [e.g. 36], or with approximate spatial information [e.g. 114, 115]. Yet, without these sites, interior regions in particular appear almost entirely devoid of Stone Age occupation, which does not accurately reflect their abundant surface archaeology and human settlement history [e.g. 39, 59, 60]. A further limitation lies in using the “site” as the spatial basis for database entries. In areas where detailed surface survey and mapping have taken place, such as the Seacow River [39], Tankwa Karoo [59, 60], and Zebra River in Namibia [67], it is impractical to include large numbers of often low-density scatters with limited technological and chronological information, and therefore the archaeology of these areas remains under-represented in SASSI.
Data structure
The SASSI dataset, deposited on Open Science Framework (SASSI v3, 10.17605/OSF.IO/2HMS4), is organised into four .csv files, also available as a single Microsoft Excel workbook. The first (S1) presents the main site index with data arranged in columns detailed in Table 3 and S4. The second file (S2) provides the data in a simplified format with logical values. The third file (S3) gives full bibliographic references cited in the main site index. The fourth (S4) provides definitions of the data structure (after Tables 3 and 5).
Table 3
Data structure for SASSI (see text and S4. Data dictionary for further details).
| DATA | DESCRIPTION |
|---|---|
| Site | Site name (alternative names given in brackets) |
| Site abbreviation | Abbreviated name as a unique code |
| Latitude, Longitude | Decimal degrees (WGS84), given to 5 decimal places (up to 1 m precision) |
| Country, Region | Modern geopolitical names (following the ISO 3166 standard) |
| Modern Rainfall Zone (RFZ) | Location within the present-day Winter, Year-Round or Summer Rainfall Zone (from BIO15 of WorldClim 2.1) |
| Biome | Location within present-day southern African terrestrial biomes |
| Period | Three individual columns to denote attribution (ESA, MSA, LSA) (see Table 5) |
| Technocomplex | Sixteen individual columns to denote attribution (see Table 5), with alternative names indicated where relevant |
| Context | Shelter (i.e. cave/rock shelter), open or karst infill |
| Open context | (Optional) Surface or excavated |
| Site type | (Optional) Rock art, shell midden, ochre mine, depositional setting (especially for open-air sites) |
| References | Listed in date order (see S3. References for full details) |
| SALSA | Technocomplex information for sites listed in SALSA |
| ROAD | Assemblage information for sites listed in ROAD |
| SARD | Sites listed in SARD |
| Dating | Radiometric dating method where not listed in SARD (e.g. TL, OSL, PM, ESR, C14 etc.), or undated |
| Ages ka | Summary of ages (ka, thousand years ago) with corresponding technocomplex/layer for multi-period sites |
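The column layout in Table 3 lends itself to standard CSV tooling. The following is a minimal sketch using Python's built-in `csv` module. The exact header spellings are assumed to match the labels in Table 3, and the two rows are invented placeholders rather than records from SASSI; a real file would be opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.

```python
import csv
import io

# Invented placeholder rows standing in for the main site index (S1).
# Header names are assumptions based on Table 3.
SAMPLE_S1 = """Site,Site abbreviation,Latitude,Longitude,Country,Context
Example Shelter A,EXA,-32.12345,19.54321,South Africa,Shelter
Example Scatter B,EXB,-29.50000,28.25000,Lesotho,Open
"""


def read_site_index(handle):
    """Read site records and convert coordinates to floats."""
    sites = []
    for row in csv.DictReader(handle):
        row["Latitude"] = float(row["Latitude"])
        row["Longitude"] = float(row["Longitude"])
        sites.append(row)
    return sites


sites = read_site_index(io.StringIO(SAMPLE_S1))
open_sites = [s["Site abbreviation"] for s in sites if s["Context"] == "Open"]
print(open_sites)
```

Keying records on the site abbreviation is convenient because it is described as a unique code.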
Site names
The most commonly used name for a site is given, usually derived from a farm or property name, with alternative names indicated in brackets. Some of these secondary names may be the main reference name in the SARD database. “Cave” or “Shelter” was omitted from site names for brevity except for certain cases where this is well-established in the literature (e.g. Border Cave, Bushman Rock Shelter). Site names with numbered suffixes indicate multiple nearby localities. In general, localities were grouped together where only low-resolution spatial data were available, and there was no variation in individual site chronologies. Separate numbered localities were retained when precise locations were given and/or site chronologies were different. However, this can differ from the approaches of the other databases; for example, Geelbek Dunes and Riverview Estates are given multiple entries in ROAD, but are grouped under a single record in SASSI given their overall proximity and common time periods represented. Site codes were assigned as unique three-character abbreviations (with numbers where required), aiming to adhere to existing naming conventions where possible (e.g. Blombos Cave, BBC; Boomplaas, BPA).
Site location and accuracy
Spatial coordinates (Latitude, Longitude) were taken directly from the literature where available and converted to decimal degrees. Where coordinate data were not provided, or coordinates were of low resolution, published maps and figures were used in conjunction with Google Earth satellite imagery to determine site locations. The quality of the location data (Accuracy) was classified as: 1. Accurate (precise coordinates provided or extracted from a high-resolution map or aerial image) (52% of sites); 2. Good approximation (precise coordinates not provided but location could be determined from published maps and figures and verified on Google Earth) (30%); 3. Reasonable approximation (location determined from a low-resolution published map or figure, or description of distinctive geographic features) (16%); 4. Approximate (only a general area could be identified based on the available information) (2%). Sites in accuracy categories 1 and 2 are suitable for high-resolution spatial analysis (landscape- or local-scale) but those in 3 and 4 are only sufficient for larger-scale regional analysis. The present-day geopolitical location (Country) is identified, together with regional province or district (Region).
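The accuracy classes translate directly into a filter on analysis scale. A minimal sketch follows, using the class labels and percentages given above; the names `ACCURACY_CLASSES` and `supports_local_scale` are hypothetical.

```python
# Accuracy score -> (label, share of sites in %), as reported in the text.
ACCURACY_CLASSES = {
    1: ("Accurate", 52),
    2: ("Good approximation", 30),
    3: ("Reasonable approximation", 16),
    4: ("Approximate", 2),
}


def supports_local_scale(score):
    """Return True if a site is suitable for landscape- or local-scale analysis."""
    if score not in ACCURACY_CLASSES:
        raise ValueError(f"accuracy score must be 1-4, got {score!r}")
    return score <= 2


local_share = sum(
    pct for score, (_, pct) in ACCURACY_CLASSES.items()
    if supports_local_scale(score)
)
print(f"{local_share}% of sites support high-resolution analysis")
```

On the percentages reported above, about 82% of sites fall in categories 1 and 2.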
Rainfall and Biome
Present-day classifications of seasonal rainfall zones (RFZ) and vegetation biomes (Biome) are given, with the caveat that while a west-to-east climate gradient and certain physiographic factors create overall stability through time, modelled palaeoenvironmental data should be used for temporally specific conditions. Rainfall seasonality was determined from WorldClim2 Global Climate precipitation data (BIO15) (https://www.worldclim.org/), divided into Winter (W), Year-round (Y) or Summer (S) [116]. Biomes were identified from the Mucina and Rutherford [1] terrestrial biomes shapefile, covering South Africa, Lesotho and Eswatini. For Namibia, the World Wildlife Fund Terrestrial Ecoregions shapefile was used [117].
Time period and technocomplex
The broad time-period represented at a site is indicated as ESA, MSA or LSA (Tables 4 and 5). To allow the filtering of LSA sites with only pre-ceramic LSA remains, these are marked by an asterisk (“LSA*”). Two further columns (GenMSA, GenLSA) note the presence of generalised MSA or LSA where no specific technocomplex details are given, particularly at multi-period or undated sites.
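The asterisk convention can be handled with a small helper when filtering records. The sketch below assumes that the LSA cell is empty when the period is absent and holds “LSA” or “LSA*” otherwise; that encoding, and the function name `lsa_status`, are assumptions for illustration and should be checked against the data dictionary (S4).

```python
def lsa_status(value):
    """Classify an LSA cell as 'absent', 'pre-ceramic only' or 'present'.

    Assumes an empty cell means no LSA attribution and a trailing
    asterisk marks sites with only pre-ceramic LSA remains.
    """
    cell = (value or "").strip()
    if not cell:
        return "absent"
    if cell.endswith("*"):
        return "pre-ceramic only"
    return "present"


for raw in ["LSA*", "LSA", ""]:
    print(repr(raw), "->", lsa_status(raw))
```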
Table 4
Number of ESA, MSA and LSA sites recorded in SASSI, by site context and biome. Karst infill sites (n = 6) are grouped with shelters.
| BIOME | ESA SHELTER | ESA OPEN | MSA SHELTER | MSA OPEN | LSA SHELTER | LSA OPEN | TOTAL SHELTER | TOTAL OPEN |
|---|---|---|---|---|---|---|---|---|
| Savanna | 4 | 25 | 21 | 22 | 56 | 18 | 59 | 42 |
| Grassland | 4 | 6 | 15 | 20 | 60 | 9 | 68 | 25 |
| Nama-Karoo | 0 | 7 | 7 | 14 | 13 | 37 | 13 | 47 |
| Fynbos | 1 | 7 | 24 | 16 | 41 | 56 | 50 | 72 |
| Albany Thicket | 0 | 4 | 1 | 0 | 6 | 2 | 7 | 6 |
| Succulent Karoo | 0 | 2 | 6 | 9 | 11 | 59 | 12 | 65 |
| Forest | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
| Desert | 0 | 0 | 2 | 0 | 12 | 8 | 12 | 8 |
| Indian Ocean | 0 | 1 | 3 | 0 | 3 | 0 | 4 | 1 |
| Total | 9 | 52 | 82 | 81 | 203 | 189 | 226 | 266 |
Table 5
Period and technocomplex definitions applied in SASSI. Technocomplex age ranges are based on [7, 8] and descriptions based on [5], [8], and other cited literature.
| PERIOD | ASSEMBLAGE CHARACTERISTICS AND ALTERNATIVE TERMS | N SITES |
|---|---|---|
| LSA ~40–0.3 ka | Core technology: bipolar, bladelet, single platform; Tools: scrapers (microlithic and macrolithic), backed artefacts (bladelets, geometrics), adzes, other diverse forms; Non-lithic: bone tools, ostrich eggshell beads/flasks; shell ornaments, grinding stones, bored stones, grooved stones, ochre, rock art, ceramics (post-2 ka) | 392 |
| MSA ~300–40 ka | Core technology: prepared cores: preferential Levallois (flake, point, blade), discoidal/radial, volumetric blade; Tools: unifacial points, bifacial points, backed artefacts, scrapers, denticulates; Non-lithic: bone tools, shell beads, (engraved) ochre, engraved ostrich eggshell, grindstones | 163 |
| ESA ~2 Ma–200 ka | Core technology: cobble choppers, platform, discoidal/radial, blade (late ESA); Tools: bifacial large cutting tools (handaxes, cleavers, picks); Non-lithic: flaked/polished bone tools | 61 |
| Technocomplex | ||
| Ceramic LSA <2 ka | Both microlithic and more informal variants occur regionally. Tools can include long end-scrapers and backed microliths, or tools can be rare. Contemporaneous with Final LSA but assemblages with ceramics may be associated with herders or hunter-gatherers. Alternative terms: Smithfield (interior, informal) [4]; Swartkop (Northern Cape/Bushmanland hunter-gatherer, blades, backed blades, grass-tempered ceramics), Doornfontein (Northern Cape/Bushmanland herder, informal flakes, quartz preference, frequent ceramics) [35]; Group 2 (Namaqualand herder?, informal, no bladelets, single platform cores or retouch), Group 3 (Namaqualand hunter-gatherer?, >95% clear quartz, backed tools outnumber scrapers) [121] | 208 |
| Final LSA 4–0.1 ka | Both microlithic (similar to Wilton) and more informal variants (Smithfield) occur. Adzes/spokeshaves (concave scrapers) are common. Alternative terms: Springbokoog (Northern Cape/Bushmanland, >2 ka, no ceramics, backed bladelets; described as equivalent to the Wilton but falling within 4.3–2.3 ka) [35]; Group 1 (Namaqualand, <2 ka, Wilton-like microlithic) [121] | 169 |
| Wilton 8–4 ka | Microlithic flake and bladelet tradition with numerous formal tools, including standardised backed microliths and small convex scrapers. Wide range of raw materials. | 104 |
| Oakhurst 12–7 ka | Macrolithic flake-based tradition with informal cores. Tools are mostly medium/large scraper forms (D-shaped, Woodlot, naturally backed knives). High use of hornfels and quartzite. Alternative terms: Lockshoek (interior Karoo, emphasis on hornfels) [38], Albany (southern Cape) [122], Kuruman (Northern Cape) | 77 |
| Robberg 20–12 ka | Systematic unretouched bladelets from single platform or bipolar cores. Bipolar scaled pieces/outils écaillés occur but few retouched tools. High use of fine-grained raw materials and quartz. Alternative terms: Late Pleistocene LSA (Namibia, macrolithic Oakhurst-like with Robberg ages) [123] | 42 |
| Early LSA 40–20 ka | Generally informal, combining some MSA (prepared) and LSA (bipolar technique, microlithic) technological characteristics. Alternative terms: MSA/LSA (Lesotho and KwaZulu-Natal where features overlap); generic: MIS 2 | 29 |
| Final MSA 40–30 ka | Late MIS 3 with typical MSA features (e.g. prepared cores, points, blades) but may include bipolar and microlithic technologies. Some regional tool variants (e.g. hollow-based points). Alternative terms: generic: MIS 3 MSA | 19 |
| Late MSA 50–40 ka | Mid-MIS 3 with typical MSA features (e.g. prepared cores, points, blades). May include retouched point forms (unifacial, bifacial). Alternative terms: Orangian (interior Karoo, emphasis on hornfels, points and blades) [37]; generic: MIS 3 MSA | 27 |
| Post-Howiesons Poort 60–45 ka | Early MIS 3 with typical MSA features (e.g. prepared cores, points, blades). Unifacial points, scrapers, rare backed artefacts. High use of silcrete in some regions (e.g. Western Cape). Nubian Levallois point production in the Karoo. Alternative terms: Sibudan (proposed by [7, 124] but low uptake), MSA 3 [125] | 21 |
| Howiesons Poort 70–60 ka | Blade-based technology, prepared blade cores and small blades/bladelets. Standardised backed tools (geometrics, bladelets) in high frequencies, strangulated notches. Points are rare. High use of silcrete. | 37 |
| Still Bay 80–70 ka | Prepared core (radial/Levallois) flake production, some blades. Bifacial foliate/lanceolate points, may involve heat-treatment, pressure-flaking or serration. High use of silcrete. Alternative terms: pre-Howiesons Poort (interior Free State/Lesotho, lacking typical bifacial points) | 30 |
| Pre-Still Bay 130–80 ka | Prepared core (radial/Levallois) flake, point and blade production. Some incipient Still Bay features (bifacial flaking, serration), scrapers. Alternative terms: Mossel Bay [7, 8]; Pietersburg (interior Limpopo/Gauteng) [112, 126]; MSA 2b [125]; generic: MIS 5 MSA | 26 |
| Early MSA 300–130 ka | Prepared Levallois and discoidal/radial cores, flakes, blades from volumetric cores, points (usually unretouched), denticulates, notches. Alternative terms: MSA 1, MSA 2a [125] | 38 |
| Fauresmith 600–>200 ka | Small, symmetrical handaxes, prepared (Levallois) cores, large blades, points. Diverse raw materials, some fine-grained. Alternative terms: Late Acheulean (less emphasis on MSA transitional elements), Victoria West (Northern Cape variant, prepared cores) [127], Sangoan (pan-African term, emphasis on MSA features, picks, denticulates, notched scrapers) [128] | 21 |
| Acheulean 1.5 Ma–300 ka | Bifaces (handaxes and cleavers), scrapers, large flake blanks, some core preparation in late Acheulean. Coarse-grained raw materials, usually local. | 45 |
| Oldowan >2–1.5 Ma | Cobble, core or flake tools, no core preparation and little retouch. Coarse-grained raw materials, usually local. | 5 |
The naming of technocomplexes, together with their chronology, definitions and regional variability, is complex and beyond the scope of discussion here. The chrono-cultural framework applied represents a compromise between dominant and regional terms (Table 5, Figure 3). For example, “pre-Still Bay” or “MIS 5 MSA” is preferred over “Mossel Bay” since typical features of the Still Bay, such as bifacial flaking and serration, are observed at many sites [e.g. 118, 119]. Late and Final MSA phases are distinguished to capture variability in mid- to late MIS 3 [120]. The name “Fauresmith” is used with some caution since its characteristics are regionally specific to the Vaal River and Northern Cape; “ESA-MSA transitional” might therefore be a more appropriate broader term [but see 73, 121].

Figure 3
Frequencies of sites in the SASSI database, by technocomplex (see Table 5 for details) and biome. Biomes are ordered by decreasing area covered.
Technocomplex attributions were derived from assemblage descriptions in the literature rather than assigned based on expectations from ages; they therefore differ from Lombard et al. [8] for some sites. Where publications assign alternative or regional terms to specific assemblages, these are named in the relevant technocomplex field.
Context
Site context (cave/rock shelter, open, or karst infill) is indicated, with optional further details on whether open sites were surface or excavated, and on the specific site type or setting (e.g. rock art, shell midden, river terrace, pan).
References
References are given for the key literature providing site locations, chronologies and assemblage data, but are not exhaustive for more specialised studies (e.g. lithics, fauna). Full references are listed alphabetically in the file S3. References.
Database comparison
To allow direct comparison and cross-referencing with the other three main databases discussed, their contents are summarised in the fields SALSA, ROAD and SARD. This includes the technocomplexes recognised at a site according to SALSA [8], and the types of assemblage data included in ROAD [82]. Radiocarbon ages included in SARD [88] are indicated (see below).
Dating
Information related to chronology is provided in the fields Dating and Ages_ka. Where radiocarbon dating information was obtained from SARD, this is indicated (“C14”) in the SARD field, and ages were calibrated by the author using the IntChron Integration Tool (https://intchron.org/archive/SARD/SARD/index.json). These are summarised as a mean age, or range for multiple dates, in the Ages_ka field with the prefix “C14 cal” (or “AMS C14 cal”). The SARD database should be consulted directly for individual and raw ages. Calibrated ages use the SHCal20 curve [129]. For radiocarbon ages not included in SARD and derived from less accessible literature, raw uncalibrated ages (“C14uncal”) are also given.
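The summarising step described above can be sketched as follows. This is a minimal illustration only, not the author's procedure: the function name `summarise_cal_ages`, the input of one calibrated age per date in years cal BP, and the exact output string format are all assumptions.

```python
def summarise_cal_ages(cal_bp, prefix="C14 cal"):
    """Summarise calibrated ages (years cal BP) as a single string in ka.

    One date gives a single age; several dates give a range, oldest first.
    The output format is assumed, not taken from the SASSI files.
    """
    if not cal_bp:
        raise ValueError("no ages supplied")
    # Convert years to thousands of years and order from oldest to youngest.
    ka = sorted((age / 1000 for age in cal_bp), reverse=True)
    if len(ka) == 1:
        return f"{prefix} {ka[0]:.1f} ka"
    return f"{prefix} {ka[0]:.1f}–{ka[-1]:.1f} ka"


single = summarise_cal_ages([2000])
multiple = summarise_cal_ages([9800, 12300, 11000])
```

Passing `prefix="AMS C14 cal"` would give the alternative prefix mentioned above.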
Where alternative dating methods are used, or sites are undated, these are indicated in the field Dating. Methods include Amino Acid Racemisation (AAR), Cosmogenic Nuclides (CN), Electron Spin Resonance (ESR), Infrared Stimulated Luminescence (IRSL), Optically Stimulated Luminescence (OSL), Palaeomagnetism (PM), Thermoluminescence (TL) and Uranium Series (US; specified as U-Pb or U-Th where possible). Radiocarbon ages not included in SARD are also indicated in this field.
All ages are given in thousands of years ago (Ages_ka). For early Pleistocene ages, million years ago (Ma) is indicated. While calibrated radiocarbon ages for more recent time periods are commonly given as calendar years BC or AD, they are presented here in thousands of years ago to allow for consistency and comparison with older ages. Most ages are accompanied by contextual information including layers, cultural associations and error margins, although for sites with complex dating histories, some details are summarised into age brackets for clarity. Ages are given in chrono-stratigraphic order from oldest to youngest, with the dating method used clearly indicated. All sources for ages are given in the References field.
Simplified dataset
A condensed version of the SASSI dataset (S2. SASSI_simpl) is provided as an additional file, giving an overview of location, context and technocomplex information entered as logical “True/False” values. This facilitates interoperability of key information with other software environments, such as R or QGIS. Queried technocomplex attributions are treated as True values. In the Excel datasheet, cells with positive values are also colour-coded along a gradient according to technocomplex to provide a more visual presentation of occupation trends.
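The conversion to logical values can be illustrated with a short sketch. It assumes that a queried attribution is marked with a trailing “?” and uses a four-name subset of Table 5; the function name `to_logical` and the notation are hypothetical and may not match the SASSI files.

```python
# Subset of the technocomplexes in Table 5, used here for illustration only.
TECHNOCOMPLEXES = ["Oldowan", "Acheulean", "Fauresmith", "Early MSA"]


def to_logical(attributions, technocomplexes=TECHNOCOMPLEXES):
    """Map a site's attribution labels to True/False flags.

    A trailing "?" (assumed notation for a queried attribution) is dropped,
    so queried attributions count as True.
    """
    present = {label.strip().rstrip("?").strip() for label in attributions}
    return {name: name in present for name in technocomplexes}


flags = to_logical(["Acheulean", "Fauresmith?"])
```

Here `flags` marks both Acheulean and the queried Fauresmith attribution as True, and the remaining technocomplexes as False.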
3. Dataset description
Object name
Southern African Stone Age Site Index (SASSI) (SASSI_v3)
S1. SASSI_v3 (492 site entries)
S2. SASSI_simpl (reduced data with logical values)
S3. References (587 literature sources)
S4. Data dictionary (details of field, description, type and values in S1)
Data type
Secondary data from previously published literature.
Format names and versions
.csv, .xlsx
Creation dates
Systematic compilation of the dataset commenced in 2018, with updates in 2022 and 2025, culminating in its current version (v3) in January 2026.
Dataset Creators
Emily Hallinan created the dataset. Emma Loftus [88] compiled the SARD dataset used for calibrated radiocarbon ages through IntChron (https://intchron.org/archive/SARD/SARD/index.json).
All published literature, and associated relevant publications, theses and reports, are referenced in the dataset. Any transcription errors rest with the author. Updates and amendments to site records are encouraged from the research community.
Language
English
License
CC-By Attribution 4.0 International
Repository location
Open Science Framework: Southern African Stone Age Site Index (SASSI)
Publication date
08/03/2025 (v1), 29/10/2025 (v2), 19/01/2026 (v3)
4. Reuse potential
In southern Africa, several resources already provide chronological [21, 88], spatial [82, 88], and cultural [7, 8, 82] information for archaeological sites. However, SASSI presents a novel combination of these dimensions, emphasising broad spatial and temporal coverage across both excavated and surface, dated and undated, sites. As a result, the interior zones of southern Africa, where open sites predominate, are far better represented than in previous datasets. SASSI offers a comprehensive, carefully validated, openly accessible dataset that removes the need for students and researchers to compile geospatial and chrono-cultural site data independently, thereby streamlining map creation, literature review, and cross-regional comparison.
Spatial analytical techniques derived from the biological sciences, such as species distribution and ecological niche modelling [e.g. 95, 130, 131], are increasingly applied in archaeology, though still rarely in southern Africa [33]. These methods depend on a robust spatiotemporal dataset as a starting point, which SASSI provides. The data can be filtered by contextual and chronological criteria, and imported into GIS or R for further analyses, for example, alongside modelled palaeoclimatic data [103, 132]. Planned annual updates will employ date-filtered searches, and users are encouraged to contribute new dates or site records directly, supporting the development and maintenance of this community-oriented resource. Together, these features position SASSI as a foundational tool for regional-scale synthesis [e.g. 60, 120] and future quantitative modelling of the southern African Stone Age.
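As an illustration of filtering by contextual and chronological criteria, the following sketch selects sites from a small stand-in for the simplified dataset. The rows are placeholders rather than real records, and the column names (`Site`, `Context`, and one column per technocomplex holding “True”/“False”) are assumptions that should be checked against S4. Data dictionary.

```python
import csv
import io

# Placeholder rows standing in for S2. SASSI_simpl; not real site records.
SAMPLE = """Site,Context,Howiesons Poort,Still Bay
Site A,cave/rock shelter,True,False
Site B,open,True,True
Site C,open,False,True
"""


def filter_sites(rows, context, technocomplex):
    """Return names of sites with the given context and a True technocomplex flag."""
    return [
        row["Site"]
        for row in rows
        if row["Context"] == context and row[technocomplex] == "True"
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
open_hp = filter_sites(rows, "open", "Howiesons Poort")
```

The same pattern applies to a file on disk by passing an opened .csv file to `csv.DictReader` in place of the in-memory sample.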
Acknowledgements
Osama Samawi offered helpful comments on a draft of the manuscript and two reviewers are thanked for providing valuable feedback.
Competing Interests
The author has no competing interests to declare.
