Have a personal or library account? Click to login
Getting to the Point: Defining, Reconstructing and Investigating Shape Through a Procrustean Protocol Cover

Getting to the Point: Defining, Reconstructing and Investigating Shape Through a Procrustean Protocol

Open Access
|May 2025

Full Article

Background

Procrustean Methods and their Challenges

Investigating complex structures has been facilitated by the development of geometric morphometric (GM) approaches (e.g., Klingenberg and McIntyre 1998; Weisensee and Jantz 2011; Wigley et al. 2024). Coordinate-based approaches are especially attractive when dealing with sets of structures possessing a shared suite of homologous landmarks, curves or surfaces as these can be defined through matrix configurations of Cartesian points (Dryden and Mardia 2016: 3–5).

When objects are represented by coordinate configurations which summarise the positions of homologous points, it is possible to capture shape variation through Procrustean methods. To do this, configurations are superimposed and Procrustes aligned through an iterative least-squares optimisation process. During alignment, differences between objects due to location are initially removed by shifting the centre of each shape configuration to a common origin. Size differences are then reduced through rescaling before matrices are finally rotated to remove differences due to orientation (Dryden and Mardia 2016: 134; Bookstein 1991: 258–269). This process leaves the newly aligned coordinate configurations registered in Kendall’s shape space. As this is a non-Euclidean morphospace, generally shapes are projected into tangent linear space where, despite a degree of distortion, the Euclidean distances between them can be decomposed through standard statistical methods with acceptable accuracy (Bookstein 1991: 269; Dryden and Mardia 2016: 88–95; Kendall 1984). As subtle patterns of variation are often more apparent in shape differences, complex patterns can be discerned with much greater clarity in Procrustes-aligned data and summarised both visually and in easily understood metrics (Bookstein 1991; Klingenberg 2015). Consequently, GM approaches broadly and Procrustean techniques specifically permit a diverse and more nuanced range of hypotheses to be explored than previously possible (e.g., Thulman et al. 2023; Weisensee and Jantz 2011; Wigley et al. 2024).

The application of Procrustean techniques can, however, be complicated by several frequently encountered challenges. Often the first among these is the need to estimate an appropriate density of coordinate points and the potential this creates to either under- or over-sample an object. Determining the optimal number and location of points through which to represent shape is essential as, if a structure is defined by too few or poorly situated points, not enough morphological data will be captured, limiting an investigator’s ability to visually and statistically appreciate differences in shape. Meanwhile, over-sampling increases the length of time spent collecting data, decreases computational efficiency, and reduces statistical power in the final analysis due to the introduction of extraneous information (Dryden and Mardia 2016: 369; Macleod 1999; Wasiljew et al. 2020). Missing data is frequently the second complication encountered. In circumstances where only a few cases contain missing data, it may be convenient to remove them from the analysis, while a specific subset of points may be deleted if missing data more frequently occur in the same region throughout a sample. Both solutions, however, reduce analytical sensitivity and filling gaps in data through imputation is often a better approach (Dryden and Mardia 2016: 338–339). Parametric ‘statistical’ methods of imputation (e.g., linear regression procedures) are, however, constrained in the amount of missing data that they can reliably handle (Adams et al. 2023: 41–43; Bucchi et al. 2022; Gunz et al. 2009: 58; Little and Rubin 2019; van Buuren 2018). For example, while Partial Least Squares regression is popular, to effectively model relationships between variables it requires that m × d + m objects are included in the analysis, where m is the dimensionality of the data and d the number of missing coordinate points (Adams et al. 2023: 41–43; Gunz et al. 2009: 50). This inhibits its application to larger samples composed of objects with only slight or minimal damage, imposing real constraints in an archaeological setting.

In short, GM methods represent an interrogatively attractive option for many research projects. The techniques are theoretically well established and permit the visual and statistical exploration of complex structures. Practical challenges, however, restrict their wider application.

The Os Coxae: An Inseparable but Divided Whole?

The human os coxae (also known as the hip bone or innominate bone) provides an ideal opportunity to explore, and hopefully provide some solutions to, the aforementioned challenges. Ossa coxae initially develop as three distinct skeletal modules – the ilium, ischium and pubis – which eventually fuse to form a single integrated osteological structure over the course of adolescence (Schaefer et al. 2009: 230–255; White et al. 2012: 227–240) (Figure 1). Even after union, they contribute variably to the different functions associated with the pelvis. In locomotor activity, for instance, muscles key to abduction of the thigh take their origin from the ilium while antagonistic adductors are associated with the pubis (Gosling et al. 2002; Standring 2016: 1337–1348). Meanwhile, ischial and pubic shape are influenced by obstetric demands to a greater degree than the ilium which may be impacted more by thermoregulatory requirements (Candelas González et al. 2017; Kurki 2007; Ruff 1993; Warrener 2023). As such, it can be hypothesised that complex patterns of variation and covariation, which reflect idiosyncrasies of development and function, exist within this skeletal structure and that modules may exhibit significant levels of independence – i.e., the structure could be ‘modular’ (Adams 2016: 565; Adams and Collyer 2019: 2352; Zelditch and Swiderski 2023: 147). Currently, however, there exists no ready-made, open-access GM protocol amenable to capturing os coxae morphology. The development of such a Procrustean protocol therefore offers the opportunity to illustrate how coordinate density can be estimated, missing data handled in archaeologically-recovered remains and statistical patterns explored.

jcaa-8-1-161-g1.png
Figure 1

Modules of the os coxae (lateral view of the left side). The ilium, ischium and pubis are illustrated and, in light blue, their secondary ossification centres and the joints which fuse in maturity (Gray 1918: 238).

Materials

To develop a GM protocol, 29 well-preserved, unfragmented and fully-fused ossa coxae were analysed. These originated from two archaeologically-recovered skeletal assemblages, with 17 ossa coxae drawn from the early-medieval Black Gate, Newcastle-upon-Tyne collection and 12 from post-medieval Coronation Street, South Shields collection (Mahoney Swales 2019: 202; Raynor et al. 2011: 76). Both collections are curated by the University of Sheffield’s Archaeology and Heritage Science Facility.

Methods

Scanning and Digitisation

Each os coxae was scanned with an Artec Eva, a portable, non-contact structured-light scanner to produce high-resolution three-dimensional scans. Scans were processed in Artec Studio to create meshes which were saved in Polygon File Format (Artec 2023; Marić et al. 2022). Left side ossa coxae were preferentially assessed, but where this was not possible, the right side was scanned and the scan flipped about a vertical axis so that it would be comparable with the left side. As the human skeleton is bilaterally symmetric (White et al. 2012: 11) and directional and fluctuating asymmetry account for a small proportion of overall morphological variation (e.g., Wigley et al. 2024), this and similar approaches have been employed in studies of human remains, including the os coxae (e.g., Betti et al. 2013; Fischer and Mitteroecker 2017: 699). Digitisation protocols (discussed below) were created in Viewbox 4 (version 4.1.0.12, dHAL Software), enabling os coxae structure to be defined with a template of coordinate points representing homologous landmarks and semi-landmarks (e.g., Betti et al. 2013; Rissech et al. 2001; Standring 2016: 1337–1348; White et al. 2012: 227–240). At the end of digitisation, os coxae were summarised by a k × m matrix configuration (X), where k was the number of points and m = 3 as points along the x, y and z axes were digitised. Thus,

X=(x1y1z1x2y2z2xkykzk).

Point order corresponded exactly between configurations which were saved as comma delimited files so that they could be loaded into the R environment for further analyses (R Core Team 2023). To enable morphological exploration via Procrustean techniques in R, matrices X1, …, Xn were combined into a k × m × n array.

Coordinate Density, Alignment and Structural Modularity

To estimate optimal coordinate point density, a preliminary template was designed to substantially over-sample the element’s structure (Figure 2a); the assumption that the structure had been over-sampled was based on the volume of points used in past research (e.g., Betti et al. 2013; Fischer and Mitteroecker 2017: 699). This comprised 25 fixed landmarks, 159 semi-landmarks along curves, and 425 surface semi-landmarks and was applied to a subsample of five randomly selected ossa coxae from the Black Gate collection. Thus, in this preliminary analysis k = 609 and n = 5.

jcaa-8-1-161-g2.png
Figure 2

Os coxae digitisation templates created in Viewbox 4. The first template (a) substantially over-sampled the os coxae with coordinate points located at landmarks (dark blue), semi-landmarks along curves (light blue at extremal borders and red at borders between modules) and surface semi-landmarks (green). The revised template (b) contained fewer coordinate points and included landmarks (dark blue) and semi-landmarks along extremal borders (light blue). The final template is available at: https://doi.org/10.15131/shef.data.25498276) (Wigley and Blackwell 2024).

Following digitisation, to determine the optimal number of points needed to capture shape variation, configurations were subjected to Watanabe’s (2018) Landmark Sampling Evaluation Curve (LaSEC) procedure, as implemented in R (Watanabe 2018; R Core Team 2023). Configurations were initially Procrustes-aligned. Coordinate points from ‘parent’ configurations (i.e., the complete originals), were then iteratively and randomly subsampled to create ‘new’ subsampled configurations; point selection order corresponded between configurations and was automated through the LaSEC package (Watanabe 2018). The first round selected three points from each parent configuration (the minimum number required to form a shape) and continued until all coordinates were included in the last round. New subsampled shapes generated in each round were aligned to their parent configuration through an Ordinary Procrustes Analysis (OPA) (i.e., between two objects rather than multiple as occurs in a Generalised Procrustes Analysis) (Dryden and Mardia 2016: 126–136; Watanabe 2018: 5). As new and parent configurations contained a differing number of coordinate points in all but the last round, vectors of zeros were added to subsampled configurations so that they matched the dimensionality of their parent configurations; by adding zeros, this equivalence was achieved without the creation and addition of artificial variation. To assess the ‘fit’ of the new subsampled coordinate configurations (i.e., how accurately they summarised their parent configurations), first the sum of squared differences between new and parent configurations was calculated. The result was then subtracted from one; an outcome value of one would suggest a perfect match and that all morphological variation had been captured in the subsampled configuration, while a result closer to zero would imply a poorer fit with the parent configuration. Median fit values were plotted to produce a sampling curve (Bardua et al. 2019: 18–19: Watanabe 2018: 4–5). This process was repeated ten times. As potentially subtle patterns of variation between separate regions of the os coxae were to be investigated, the above process was also applied to the ilium, ischium and pubis in isolation. Results were used to determine coordinate number in a revised template which aimed to balance the need to contain sufficient points to effectively capture shape without over-sampling and constraining analytical power (Figure 2b).

Additionally, Principal Components Analysis (PCA) was employed to investigate where semi-landmarks were best placed. In brief, PCA generates Principal Components (PCs) which sum to the same total variance contained in the original input variables, but are independent linear combinations of them (e.g., Zelditch et al. 2012: 136–146). Thus, PCA simplifies complex data, facilitating exploration. Here, using plots of vector displacements to compare the coordinate configurations that contributed the greatest and least variation to the first four PCs (e.g., Wigley et al. 2024: 1412), it was possible to identify the areas of the ox coxae where variation was most prevalent. A greater density of semi-landmarks could therefore be allocated to regions of highest variability and/or complexity.

Once all os coxae had been digitised (n = 29), configurations were registered in shape space through a Generalised Procrustes Analysis (GPA) and projected into tangent linear space. So that semi-landmarks retained their structural relationships to points within the same configuration during this process, differences between semi-landmarks in separate configurations were reduced by minimising bending energy (Bookstein 1997; Gunz and Mitteroecker 2013; Perez et al. 2006: 770). To identify factors affecting shape, a Procrustes Analysis of Variance (ANOVA) procedure was employed to decompose tangent space distances, found as the root sum of squared differences between coordinate configurations (e.g., to quantify the impact of sexual dimorphism the distances between the mean female and male configuration and the grand mean was totalled). Significance was determined through a residual randomisation permutation procedure (RRPP) (Adams et al. 2023: 118–126; Collyer et al. 2015; Collyer and Adams 2018; Goodall 1991; Zelditch et al. 2012: 228–240).

Modularity, or the independence of morphological features between modules, was quantified through the decomposition of the covariance matrix (S). S summarises the covariances of all the coordinate points and, in a structure composed of two modules, is separated into four blocks. The first of these, S11 and S22, contain the covariances among the coordinate points located in the first module and second module respectively. Meanwhile, S12 contains the covariances of coordinates between the first and second module, while S21 is the transpose of the latter block (Zelditch and Swiderski 2023: 149). A covariance ratio (CR) coefficient expressing the total covariance between the two modules in relation to the within-module covariance – less the variance found in each individual coordinate point – can therefore be found as

CR=trace(S12S21)trace(S11*S11*)trace(S22*S22*)

where S11* and S22* are the covariances of the two modules with the diagonals replaced by zeros (Adams 2016: 568; Zelditch and Swiderski 2023: 149). Under the null hypothesis, CR = 1. Thus, if CR > 1, it is implied that covariation between modules is higher than expected under the null hypothesis and the structure exhibits morphological integration. Conversely, when CR < 1, covariation between modules is less than that found within modules, suggesting the object is composed of parts which are more distinct and independent than would be expected under the null (i.e., it is ‘modular’) (Adams 2016: 565–568; Adams and Collyer 2019: 2352; Zelditch and Swiderski 2023: 147).

Because there were more than two modules (i.e., the ilium, ischium and pubis), the CR coefficient was computed for each possible combination of modules; the mean of these pairwise comparisons gave CRν, an estimate of overall structural modularity (Adams 2016: 568; Adams et al. 2023: 68–70; Klingenberg 2009: 410). Significance was determined through a permutation procedure. To do this, coordinate points were arbitrarily assigned to modules, disassociating relationships between them so that covariation within and between modules was randomised, producing a distribution of CR values centred around 1, as expected under a null hypothesis. Thus, when CRν coefficients were compared to this distribution, the percentage of permuted values below that point functioned as a p-value (Adams 2016: 568; Adams et al. 2023: 69). In addition, configurations were resampled with replacement and tested for modularity to produce a bootstrapped range of plausible CRν coefficients (Adams et al. 2023: 68–69).

Simulating and Imputing Missing Data

To address the issue of how to deal with missing data, it was firstly decided to replicate the full k × m × n data array and introduce varying levels of simulated missing data (i.e., deleting x-y-z coordinate points and defining these missing entries as absent by coding them NA). Although the ‘missingness’ within the replicated data was artificial, the volume and distribution of missing points was designed to reflect real-world patterns that may be encountered when recovering archaeological skeletal materials (e.g., McKinley 2004; White et al. 2012: 319–328). Thus, arrays were organised so that the sequence of points reflected the anatomical relationships of the structures they represented (i.e., the arrangement of coordinates mimicked the physical proximity of landmarks). Missing data was then introduced in two stages. Initially, NAs were seeded randomly. Then, to create blocks of missing data that simulated the tendency for destructive factors to affect adjacent regions, the second stage added further NAs but increased the likelihood of these occurring next to existing missing points. This created ‘patches’ of NAs whose size was constrained by controlling the probabilities associated with NA introduction at each stage of the process. Specifically, three degrees of missingness were introduced into a hundred replicate arrays each (Table 1). Missing data introduced in this way was ‘missing at random’ in that the presence of NAs was not completely random, but the systemic patterns in missingness were related to observed rather than unobserved data (Azur et al. 2011: 41; Mack et al. 2018: 7; Rubin 1976).

Table 1

The levels of ‘missingness’ introduced into simulated datasets.

‘MISSINGNESS’CONFIGURATIONS AFFECTEDVOLUME OF NAS IN DATASETSIMULATED SCENARIO
Low25%10%Small patches of NAs simulating a reasonably complete assemblage with minor cortical erosion
High75%30%Large patches of NAs mimicking assemblages affected by post-mortem disturbance and severe fragmentation
Diffuse100%20%Patches of NAs reflecting an assemblage deposited in a substrate incompatible with cortical preservation and affected by factors promoting moderate fragmentation

In GM contexts, imputation of missing data is often carried out prior to alignment as Procrustean procedures ordinarily require complete configurations (Adams et al. 2023: 41–43; Arbour and Brown 2014: 18). However, configurations of differing sizes can be compared (e.g., Watanabe 2018: 5). There are thus two stages at which imputation can occur: before or after Procrustes alignment. To explore these two options, each replicated array containing missing data was duplicated (so that the simulation study contained six hundred datasets). For the latter approach (i.e., pre-imputation alignment), in duplicated arrays all complete configurations were superimposed and aligned through a GPA and a consensus shape computed. Following this, each incomplete configuration was superimposed and OPA-aligned with the matching points from the previously determined consensus shape (Arbour and Brown 2014: 17–18; Arbour and Brown 2022). Once missing data had been filled, GPA-alignment registered complete and formerly incomplete configurations in the same shape space (Arbour and Brown 2014: 17–18; Dryden and Mardia 2016: 126–136). Obviously, the process of obtaining a consensus configuration from complete cases was not possible in the data arrays with ‘diffuse’ missing data. To investigate the viability of one potential solution, two os coxae casts produced by France Casting (https://www.francecasts.com/) were digitised and their configurations employed to create a consensus shape. As ossa coxae are sexually dimorphic (Ferembach et al. 1980; White et al. 2012: 415–418; Buikstra and Ubelaker 1994: 15–20), a female (RI002) and male (RI003) cast were used in the hope that the consensus configuration represented a population ‘average’. By imputing lost data in arrays of both unaligned and aligned configurations, which contained identical volumes and distributions of missing data, it was possible to discern which approach was optimal.

The volume of simulated missing data and sample size (i.e., frequent missingness combined with small sample) made the use of standard ‘statistical’ methods inadvisable (see Background). This assumption appeared justifiable when various techniques (e.g., predictive mean and regression modelling) were employed in preliminary tests. Therefore, missing coordinate points were filled using one ‘geometric’ and one non-parametric ‘statistical’ method (Gunz et al. 2009: 50). The geometric approach involved firstly computing a reference shape, which was generally found as the consensus configuration of all complete cases. Following this, incomplete target configurations were aligned to the reference and missing points mapped from the former to the latter via a Thin-Plate Spline (TPS) interpolation (Adams et al. 2023: 41–43; Gunz et al. 2009: 50–51). As before, in arrays containing ‘diffuse’ missing data, the two previously described casts were employed to create a reference. The second method tested was Random Forest (RF) imputation using Stekhoven and Bühlmann’s (2012) missForest algorithm. RF is a non-parametric machine learning procedure in which an ensemble of decision/regression trees are employed for classification/predictive purposes (Breiman 2001; Stekhoven and Bühlmann 2012). To estimate missing data, the missForest procedure initially fills missing observations with a synthetic value (e.g., the mean). Variables are then ordered according to the volume of missing data and values imputed for each absent variable sequentially through a trained random forest of t trees constructed from observed data. The process is repeated for i iterations and results aggregated to improve predictions (Stekhoven and Bühlmann 2012; Tang and Ishwaran 2017: 2–3). For efficiency, here t = 100 and i = 10 (Stekhoven 2022: 2–6). These two methodologies were chosen as they represent effective, flexible and distinct approaches which, to the author’s knowledge, have not previously been tested in an archaeological context.

Finally, the efficacy of each approach (alignment pre- and post-imputation) and method (TPS and RF) was evaluated. To do this, firstly all datasets were registered in the same shape space. To access the accuracy and consistency of imputation in reconstructing morphological patterns, the tangent space distance between the consensus configuration for the observed data (i.e., the original data without any missing coordinate points) and each array containing filled missing data was found as the root sum of squared differences between configurations (Adams et al. 2023: 51–57; Bookstein 1991: 268–269; Senck et al. 2015: 829). To investigate how faithfully statistical properties relating to patterns of variation and covariation had been reconstructed, the CRν coefficient was calculated for each simulated dataset and contrasted to the bootstrapped range computed from resampling of the observed dataset (Adams et al. 2023: 68–69). This enabled the impact of each method on CRν coefficient estimation and hypothesis testing to be clearly illustrated. Distances between consensus configurations and differences in coefficient values were tested through standard univariate methods. Code for the implementation of all statistical procedures described here can be found at https://doi.org/10.15131/shef.data.25498276.

Results

Coordinate Point Density and Structural Modularity

As expected, the LaSEC procedure revealed that the template containing 609 coordinate points substantially over-sampled os coxae morphology (Watanabe 2018). Following the calculation of distances between parent and subsampled configurations, median fit values implied that eight coordinate points summarised circa 95% of variation while 30 captured in the region of 99%. However, when individual modules were analysed, between 9 and 33 points were necessary to summarise 95–99% of iliac morphological variation, while the same percentage of variation was captured by 9–35 points for the ischium and 7–16 points for the pubis (Figure 3a-d). This implied that os coxae morphology can be coarsely explored with as few as eight coordinate points, but closer to one hundred is more appropriate to explore finer variation dispersed in several regions of the structure. Interestingly, when the coordinate configurations which contributed the greatest and least variation to PCs were compared, semi-landmarks on surfaces (e.g., the iliac blade) varied relatively little between individuals. In contrast, more pronounced differences were observable between semi-landmarks located on curves at extremal borders (e.g., along the posterior iliac crest). Thus, when a refined, lower-density coordinate template was created in which k = 107, surface semi-landmarks were omitted but 84 semi-landmarks along curves were retained along with the 23 fixed landmarks used to anchor them. The density of points per curve was varied to ensure an even dispersal. Along with the final template, the locations of these points are described at https://doi.org/10.15131/shef.data.25498276.

jcaa-8-1-161-g3.png
Figure 3

LaSEC sampling results. Plots illustrate the ‘fit’ between full and sampled configurations achieved by each of the ten iterations of sampling (grey lines) as well as the overall median fit (black line) for the whole os coxae (a), ilium (b), ischium (c) and pubis (d).

Following the above procedures, the refined template was applied to the complete sample (n = 29). A Procrustes ANOVA found that significant factors in morphological variation included sex (p = 0.001) and site (p = 0.004) (Table 2). The decomposition of the covariation matrix and subsequent permutation procedure implied significant modularity (CRν = 0.746, p = 0.001). The 95% confidence intervals (CI) generated through bootstrap resampling produced a range of plausible CRν coefficients from 0.728 to 0.896, supporting the supposition that os coxae structure exhibits modularity and providing a benchmark for assessing the impact of data imputation methods.

Table 2

Procrustes ANOVA to investigate factors affecting os coxae morphology. Significance determined through RRPP with 1000 permutations.

EFFECTSDFSSMSR2FP
site10.01350.01350.06992.17560.004
sex10.01830.01830.09472.94750.001
residuals260.16120.00620.8354
Total280.1929

Incomplete Cases and Data Imputation

After collating the results from arrays containing a ‘low’ density of NAs, it was clear that in unaligned configurations RF imputation was less accurate and consistent at reconstructing data than TPS. That is, a t-test with Welch’s correction found the distances of consensus shapes containing RF filled values from the observed average were significantly greater (t = 52.97, df = 99.95, p < 0.001) with large effect size (Cohen’s d = 7.5) than those calculated after TPS imputation. A Levene’s test also revealed significant differences in variance between groups (F(1,198) = 129, p < 0.001). Moreover, when CRν values were contrasted, although medians appeared comparable, the range for RF filled datasets was broader, with a substantial proportion falling below the point that bootstrapping suggested was plausible. In contrast, the CRν coefficients generated in TPS filled datasets were congruent with the bootstrapped range. This pattern changed, however, when RF and TPS imputation were employed to fill gaps in Procrustes-aligned data arrays. While TPS imputation performed near-equally under these conditions and the distances between the observed and simulated consensus configurations for RF imputed datasets were still significantly greater (t = 27.40, df = 171.38, p < 0.001, Cohen’s d = 3.9), the disparity in distances had lessened. Crucially, CRν values associated with both TPS and RF imputation on Procrustes-aligned data fell within the bootstrapped range (Table 3 and Figure 4a).

Table 3

Numerical summary of the distances between consensus configurations from datasets with imputed coordinate points and the consensus of the observed data.

MISSING VALUESALIGNMENTMETHODMINIMA1ST QUARTILEMEDIANMEAN3RD QUARTILEMAXIMA
LowunalignedRF0.0094520.0120470.0135770.0136640.0149650.021430
TPS0.0012090.0013870.0014980.0015190.0016420.001906
alignedRF0.0015930.0021490.0023200.0023070.0024650.002744
TPS0.0012140.0013900.0015030.0015210.0016390.001909
HighunalignedRF0.0504200.0664900.0739600.0748800.0832800.100650
TPS0.0053110.0065650.0070050.0070460.0075200.009426
alignedRF0.0055850.0072050.0077870.0077980.0083450.009446
TPS0.0052280.0064640.0069080.0069400.0073650.009446
DiffuseunalignedRF0.0470700.0622500.0662900.0678400.0739400.087280
TPS0.0061120.0068540.0073830.0073200.0077170.008631
alignedRF0.0059730.0066550.0069730.0070910.0073640.009446
TPS0.0061960.0069210.0075060.0074420.0078390.008845
jcaa-8-1-161-g4.png
Figure 4

Boxplots evaluating the performanace of imputation methods. These represent the distance between the observed consensus configuration and the consensus configurations of unaligned and Procrustes-aligned simulated datasets with ‘low’ (a), ‘high’ (b) and ‘diffuse’ (c) missing data after NAs had been filled by either RF or TPS imputation. On the righthand, the CRν coefficients associated with unaligned RF (1), aligned RF (2), unaligned TPS (3) and aligned TPS datasets (4) are also contrasted to the observed coefficient (blue line) and the upper and lower 95% CIs suggested by boostrapping (red lines).

A subtly different pattern emerged in simulated datasets with a ‘high’ density of NAs. As before, when compared to TPS imputation, the distances between the observed shape and the consensus shapes of RF filled datasets remained significantly and substantially greater in unaligned configurations (t = 59.32, df = 99.98, p < 0.001, Cohen’s d = 8.4) and the associated range of CRν coefficients was comparatively broad. However, even though still significant (t = 7.62, df = 198.0, p < 0.001) with large effect size (Cohen’s d = 1.1), the differences between methods in distances between the observed and aligned configurations were much reduced. Importantly, however, RF imputation on aligned data produced CRν values that resampling procedures suggested were plausible. Meanwhile, although TPS generated imputed coordinates still minimised the distance between simulated and observed consensus configurations, a proportion of CRν coefficients from arrays with ‘high’ frequencies of missing data fell below the bootstrapped range established in the observed data (Figure 4b).

More pronounced differences emerged in datasets with ‘diffuse’ missingness. While RF imputation in unaligned datasets performed as before, previous trends were reversed for aligned datasets. In these, distances between simulated and the observed consensus configuration were smaller for aligned datasets filled through the machine learning procedure. A Welch’s t-test revealed these differences were significant (t = –3.49, df = 195.67, p < 0.001) with moderate effect size (Cohen’s d = 0.5). Differences in the variance in distances between methods was also no longer significant (F(1,198) = 0.34, p = 0.850). Moreover, the full spectrum of CRν values associated with RF imputation fell within the 95% CI values suggested was plausible. In contrast, in addition to being less accurate in aligned datasets, TPS imputation was again associated with CRν coefficients which fell beyond the benchmarks established through bootstrapping (Figure 4c). Plots of the first two PCs also highlighted differences between methods and implied that although TPS imputation was generally more accurate and consistent, when ‘diffuse’ NAs were dealt with, reconstructed configurations likely occupied a different location in tangent linear space (Figure 5).

jcaa-8-1-161-g5.png
Figure 5

PCA visualisations. Plots illustrate the relationship between the observed consensus configuration (green) and consensus configurations associated with aligned and unaligned datasets with ‘low’ (a), ‘high’ (b) and ‘diffuse’ (c) levels of missing values filled by RF (blue) and TPS (red) imputation.

Discussion

This project explored solutions to commonly encountered challenges in GM investigations and developed a three-dimensional digitisation template (available at: https://doi.org/10.15131/shef.data.25498276) (Wigley and Blackwell 2024). The model of the os coxae was sufficiently sensitive to detect significant modularity and differences in morphology associated with sex and site, even with a relatively modest sample size. Narrowly, it can be surmised the template can be employed to generate further insights concerning os coxae shape, possibly relating to the impact that relatively independent development in specific regions of the structure has on final morphology. Future research may, however, find that the template presented here is not suitable for testing other hypotheses and, even if usable, further reliability testing would be necessary as previous work has demonstrated that while well-trained, individual observers may be highly consistent in their application of GM methods, inter-observer reliability can be low (e.g., Kenyhercz et al. 2014; Robinson et al. 2002). More broadly, the general approach (i.e., determining coordinate point density, reconstruction of missing values, and statistical procedures) could be employed to develop morphometric protocols suitable to many situations. The extent to which the structural properties of separate components of a putatively integrated object vary independently is, for example, of interest in various archaeological subfields, especially those which deal with how objects change over time. Thulman et al. (2023), for instance, found that covariation between ‘haft’ and ‘point’ modules in Clovis stone tools was less than that found within modules and implied that the lithics were composed of distinct segments whose shape was defined by technological rather than arbitrary factors. Moreover, the equipment and software utilised here to create scans and digitisation templates are highly compatible with other archaeological subdisciplines (i.e., due to portability, free access, etc) (e.g., Marić et al. 2022; Marcy et al. 2018; Shott and Trail 2010; Thulman et al. 2023). Meanwhile, the analytical process, though sensitive, is sufficiently streamlined that it can be conducted without the need of specialist computing equipment and is manageable within the freely available R environment (R Core Team 2023).

The balancing of statistical sensitivity and practical convenience was possible in large part through consideration of coordinate point density. Specifically, the LaSEC procedure in which the distance between a configuration and a sampling of its coordinates is calculated to gauge when the addition of further points to capture morphological variation becomes redundant was critical in preventing over-sampling (Watanabe 2018). This ensured the data collection and analytical processes were efficient and not unnecessarily cumbersome. A degree of caution should, however, be exercised with this procedure. When evaluated as a single structure, it appeared as though os coxae morphology could be summarised with very few coordinate points – potentially only eight. However, when the ilium, ischium and pubis were assessed in isolation it became apparent that this initial estimate would severely under-sample shape and likely confound many types of analysis. In short, the number of coordinates needed to capture a shape coarsely will differ greatly from that required for an exploration of potentially subtle patterns and, while quantitative procedures may mitigate against over-sampling, consideration must also be given to the specific hypotheses in question. Moreover, while LaSEC is useful for exploring how many coordinate points are necessary to define shape, it does not indicate which landmarks/semi-landmarks vary the most/least between shapes and therefore should be retained/omitted. Here this was determined empirically through the comparison of os coxae which contributed the most and least variation to each PC and with reference to past research (e.g., Betti et al. 2013; Candelas Gonzalez et al. 2017; Rissech et al. 2001; Standring 2016: 1337–1348; White et al. 2012: 227–240). The identification that certain semi-landmarks varied either more (i.e., semi-landmarks on curves at extremal borders) or less (i.e., surface semi-landmarks) between individuals is congruent with past research. Macleod (1999: 110), for example, posited that morphologically complex regions required more points to adequately capture shape variation, highlighting that it is not just the volume of points that need to be considered but also their placement.

The investigation of imputation methods and approaches generated useful as well as unexpected insights. The first salient point was the generally significantly smaller differences between the consensus configurations of simulated datasets containing TPS filled NAs and the observed consensus configuration. TPS imputed datasets also exhibited clustering along the first and second PCs in all permutations of the simulation study. Together, this implies that TPS interpolation accurately and consistently reconstructs coordinate matrices. Given past research, this was unanticipated. In a comparative GM study, Arbour and Brown (2014), for example, found that standard statistical techniques performed better (i.e., achieved greater accuracy). Similarly, Neeser et al. (2009) reported large error rates for TPS imputation, especially when estimating coordinate points over large areas, and suggested that configuration reconstruction was best accomplished in anatomical structures through linear regression models, even when those models were developed on distantly related species. Other work has, however, proposed that TPS imputation can be successful, if reference configurations are chosen carefully (e.g., Gunz et al. 2009; Senck et al. 2015). The outcomes of this study not only reiterate the latter point, but also highlight how difficult the selection of an appropriate reference shape may be. While TPS generally performed well here by using the consensus of all complete configurations as a reference, in the simulated dataset in which all configurations contained NAs and two casts were used to construct a reference, PC plots show that while imputed consensus configurations still cluster (i.e., imputation had been consistent) they do so away from the observed consensus shape. Moreover, distances between these configurations proved to be significantly greater than for RF imputation. This suggests that the use of reference shapes from outside of the study sample has the capacity to skew results. From a bioarchaeological perspective, at least, this may mean that intra-specific variation may be sufficiently pronounced in certain skeletal elements that it may not be appropriate to use even reference shapes derived from the same species if they occupy a different sociocultural or environmental niche. This supposition is supported by the Procrustes ANOVA which indicated significant differences in shape between ossa coxae from the Black Gate and South Shields assemblages.

Moreover, it was discovered that TPS imputation may interfere with patterns of covariation. This is evidenced in the simulated datasets containing ‘high’ and ‘diffuse’ NAs in which a portion of CRν coefficients fell outside the CIs generated from bootstrap resampling of the observed data. Covariation was thus either elevated within modules or reduced between them as a result of TPS imputation. It is speculated that when the interpolation function maps coordinate points observed in the reference shape onto regions of missing data in the target, the local morphology of the reference configuration distorts overall patterns of covariation in the target shape; this could substantially decrease covariation between modules in configurations with large patches of missing data and account for the increase in modular signal. In contrast, the regression trees that are used to fill missing values in RF imputation consider relationships between points throughout configurations. Thus, in Procrustes-aligned arrays in which extraneous data had been removed, due to the more holistic frame of reference this method generally appeared to preserve patterns of covariation within and between modules reasonably well – even when shapes from outside the sample were used to facilitate the initial configuration alignment in morphospace. Therefore, when determining how to deal with missing data, the volume and distribution of missing values as well as the hypotheses being tested must be considered. For example, if working with a small number of missing values, TPS imputation will likely be optimal. However, when missing data is more extensive or the primary focus of the investigation is the exploration of statistical patterns, machine learning approaches can be used as an effective way of reconstructing morphological data (Stekhoven and Bühlmann 2012; Stekhoven 2022). Future work may wish to consider the use of out-of-bag error estimates, a measure of prediction accuracy in machine learning applications, as well as number of trees and iterations used in the implementation of this procedure and the interplay of imputation accuracy and computational efficiency in order to achieve best results. One caveat regarding the RF approach from a GM perspective is that configurations should be aligned beforehand, removing redundant information relating to location, size and orientation so that algorithm can assess morphological relationships between points more effectively (Arbour and Brown 2014: 17–18; Arbour and Brown 2022).

Conclusion

This paper presents an open-access, three-dimensional template of the human os coxae as well as the Procrustean protocols used in model design (Wigley and Blackwell 2024). Regarding the protocol’s wider application, the equipment, software and procedures employed in template creation and the data collection process are accessible and can be used effectively without specialist training. Specifically documented were techniques to estimate optimal coordinate point density (Bardua et al. 2019; Watanabe 2018); their use led to consideration of the differences between the number of points needed to define shape broadly compared to the number required to reveal subtle patterns of variation/covariation. Furthermore, two methods of data imputation were contrasted through a simulation study (Gunz et al. 2009; Stekhoven and Bühlmann 2012). It emerged, that while both are capable of effectively handling missing values when parametric statistical methods cannot, their use is associated with caveats. While imputation through a geometric method may produce more consistent results, reference selection impacts accuracy. Moreover, where large patches or high volumes of data are missing (as may be expected in archaeological investigations), this method may distort statistical patterns. In contrast, non-parametric machine learning approaches provide a robust and flexible means of imputing missing data which better preserve inherent patterns of morphological covariation. In summary, the findings discussed here can be utilised to guide future research design and the methodologies adapted to diverse archaeological subfields (e.g., bioarchaeology, lithics analysis, geoarchaeology).

Ethics and Consent

Permission to access to human skeletal collections for non-destructive research purposes was provided by the University of Sheffield’s Department of Archaeology.

Acknowledgements

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC). For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.

Competing Interests

The authors have no competing interests to declare.

DOI: https://doi.org/10.5334/jcaa.161 | Journal eISSN: 2514-8362
Language: English
Submitted on: Apr 3, 2024
Accepted on: Mar 25, 2025
Published on: May 6, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Ben R. Wigley, Paul G. Blackwell, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.