
Figure 1
Illustration of the inherent ambiguity of clustering. The data contain one large cluster and a set of outliers that could be considered either a separate cluster or noise.
Table 1
Comparisons of Clusterability Methods. Method/Ref refers to the method and its citation.
| METHOD/REF | DATA TYPE | TEST | PRE CLUST | TYPE I | NOTES | LANG | PACKAGE |
|---|---|---|---|---|---|---|---|
| PCA dip [4] | Num | Y | Y | Y | Good perf: robust | R | Y |
| PCA Silv [4] | Num | Y | Y | Y | Good perf: small clust | R | Y |
| SPCA dip [15] | Num | Y | Y | Y | Good perf: robust | R | Y |
| SPCA Silv [15] | Num | Y | Y | Y | Good perf: small clust | R | Y |
| Dist. dip [4] | Num | Y | Y | Y | Robust; power varies | R | Y |
| Dist. Silv [4] | Num | Y | Y | Y | Good perf: small clust | R | Y |
| Hopkins [24] | Num | Y | Y | N | poor perf | R, py | N |
| PC dip [4] | Num | Y | Y | N | poor perf | R, py, mat | N |
| PC Silv [4] | Num | Y | Y | N | poor perf | R, py, mat | N |
| GMM [21] | Num | Y | M¹ | NT² | Assume Gauss | R | N |
| Epter [16] | Num | N | Y | N/A | Historical | None | N |
| Klopotek [9] | Num | N | N | N/A | Validation | None | N |
| TestCat [28] | Cat | Y | Y | Y³ | cat. data | R, Mat | N |
| PHI/PSI [14] | Num/Mix | N | Y | N/A | score only | Py | N |
| VAT [17] | Num/Cat⁴/Mix⁴ | N | Y | N/A | visual only | Mat, Py, R | N |
| iVAT [18] | Num/Cat⁴/Mix⁴ | N | Y | N/A | visual only | Mat, Py, R | N |
| aVAT [18] | Num/Cat⁴/Mix⁴ | N | Y | N/A | visual only | Manual⁵ | N |
| Ultramet [20] | Num | N | Y | N/A | score only | None | N |
| Miasnikof [32] | Graph | Y | Y | Y | Graph test, good perf | None | N |
| Gao/Zhang [30] | Graph | N⁶ | N | N/A | Y/N decision; no p-value | None | N |
| FOCS 2018 [33] | Graph | N⁶ | Y | N/A | Y/N decision; no p-value | None | N |
| Li. 2025 [31] | Graph | N⁶ | Y | N/A | Y/N decision; no p-value | None | N |
| FCN [34] | Graph | Y | Y | Unclear⁷ | Graph test, No T1E | None | N |
| PHIClust [22] | RNA-seq | TH⁶ | Y | N/A | App spec | R | N |
| Build Clust [23] | Spatial | N | Y | N/A | App spec | None | N |
Data type includes numeric (Num), categorical (Cat), or mixed (Mix).
Test refers to whether the method conducts a formal statistical test of the null hypothesis that the data are not clusterable.
Pre Clust refers to whether the test is conducted prior to clustering, without explicitly requiring a clustering algorithm to be chosen first.
Type I refers to whether the method's type I error rate has been evaluated and found to be close to the nominal level.
Lang describes the languages available for immediate implementation of the methodology. Options are R, py (Python), mat (MATLAB), or None.
Package Y/N denotes whether or not the method is included in our clusterability package on CRAN. Methods with Package=Y are in bold.
Notes summarizes the method framework, with performance notes or major limitations.
¹ M: Gaussian Mixture Models (GMM), while fitted prior to clustering, do assume that a GMM is a reasonable fit to the data. We consider this model-based and therefore exclude it from our package.
² NT: We could not find simulations testing type I error in their paper.
³ Based on the chi-square distribution, the test is theoretically controlled when assumptions are met. Type I error evaluations were mostly favorable, with a few minor exceptions.
⁴ Existing software is readily available for numeric data. Categorical and mixed-type data are theoretically possible but may require some preprocessing before implementation.
⁵ Similarly, aVAT may require preprocessing to run through VAT.
⁶ These tests provide binary decisions (clusterable or not clusterable), but do not provide p-values and have not been type I error tested.
⁷ For FCN, simulations reported average p-values. The type I error rate, i.e., the proportion of the time the p-value was below the nominal level, was not reported.
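The distinction drawn in the FCN footnote can be made concrete with a minimal Python sketch. The p-value lists below are hypothetical, constructed only to show that an average p-value near 0.5 does not by itself establish a type I error rate near the nominal level; the function names and the 0.05 level are assumptions for illustration and do not come from any of the cited papers.

```python
def type_i_error_rate(p_values, alpha=0.05):
    """Proportion of null-simulation p-values that fall below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


def mean_p_value(p_values):
    """Average p-value across simulations."""
    return sum(p_values) / len(p_values)


# Hypothetical well-calibrated test: p-values evenly spread over (0, 1).
calibrated = [(i + 0.5) / 100 for i in range(100)]

# Hypothetical miscalibrated test: 20% of null p-values are tiny,
# yet the average p-value still looks unremarkable.
inflated = [0.01] * 20 + [0.6] * 80

for name, pvals in [("calibrated", calibrated), ("inflated", inflated)]:
    print(name, round(mean_p_value(pvals), 3), type_i_error_rate(pvals))
```

Both lists have a mean p-value close to 0.5, but only the first rejects at roughly the nominal 5% rate; the second rejects 20% of the time.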

Figure 2
Flowchart for options of data reduction methods and multimodality tests.
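As a rough illustration of one branch of this flowchart, the sketch below reduces a multivariate dataset to its set of pairwise Euclidean distances, the one-dimensional quantity to which a multimodality test would then be applied. It is written in plain Python for illustration only and is not the package's implementation; the function name and the toy points are assumptions, and the multimodality test itself is not implemented here.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def pairwise_distances(points):
    """Return the n(n-1)/2 Euclidean distances between all pairs of points."""
    return [dist(a, b) for a, b in combinations(points, 2)]


# Hypothetical toy data: two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
distances = sorted(pairwise_distances(points))
print(distances)
```

For well-separated groups like these, the distances split into short within-group values and long between-group values, which is the kind of multimodality the dip and Silverman tests are meant to detect.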

Figure 3
Three plots showing the normals1, normals2, and normals3 datasets, which are sampled from mixtures of one or more normal distributions.

Figure 4
Plots displaying the normals4 and normals5 datasets.
Table 2
Results from the clusterabilitytest() function applied to five normal mixtures and the iris and cars datasets. Values are the p-values reported by the test. Results are rounded to 6 digits as specified by function parameter values (s_adjust=TRUE and s_digits=6).
| DATA SET | DIP PCA | DIP SPARSE PCA (EN) | DIP DISTANCE | SILVERMAN PCA | SILVERMAN SPARSE PCA (EN) | SILVERMAN DISTANCE |
|---|---|---|---|---|---|---|
| normals1 | 0.98522 | 0.97978 | 0.99361 | 0.80364 | 0.78757 | 0.10218 |
| normals2 | 0 | 0 | 0 | 0 | 0 | 0 |
| normals3 | 0.04596 | 0.01817 | 3.291×10⁻⁵ | 0.00101 | 0.00401 | 9.0×10⁻⁶ |
| normals4 | 6.468×10⁻⁶ | 0 | 0.01969 | 0 | 0 | 0 |
| normals5 | 1.380×10⁻⁵ | 0.00181 | 4.726×10⁻⁵ | 1.571×10⁻⁵ | 0.00057 | 0 |
| iris | 0 | 0 | 0 | 9.0×10⁻⁶ | 0 | 0 |
| cars | 0.85818 | 0.83202 | 0.66042 | 0.52577 | 0.41324 | 0.99425 |
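One way to read Table 2 is to check, for each dataset, whether every test and reduction combination rejects at a chosen significance level. The Python sketch below does this with the p-values copied from the table; the 0.05 level and the helper name are assumptions made for illustration, not choices stated in the caption.

```python
ALPHA = 0.05  # assumed significance level, not specified in the table

# p-values from Table 2, in column order: dip PCA, dip sparse PCA,
# dip distance, Silverman PCA, Silverman sparse PCA, Silverman distance.
p_values = {
    "normals1": [0.98522, 0.97978, 0.99361, 0.80364, 0.78757, 0.10218],
    "normals2": [0, 0, 0, 0, 0, 0],
    "normals3": [0.04596, 0.01817, 3.291e-5, 0.00101, 0.00401, 9.0e-6],
    "normals4": [6.468e-6, 0, 0.01969, 0, 0, 0],
    "normals5": [1.380e-5, 0.00181, 4.726e-5, 1.571e-5, 0.00057, 0],
    "iris": [0, 0, 0, 9.0e-6, 0, 0],
    "cars": [0.85818, 0.83202, 0.66042, 0.52577, 0.41324, 0.99425],
}


def rejected_by_all(pvals, alpha=ALPHA):
    """True if every method's p-value falls below alpha."""
    return all(p < alpha for p in pvals)


summary = {name: rejected_by_all(pvals) for name, pvals in p_values.items()}
for name, flag in summary.items():
    print(f"{name}: {'all reject' if flag else 'not all reject'}")
```

At this assumed level the six methods agree on every dataset: none rejects for normals1 or cars, and all reject for the remaining five.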

Figure 5
Histograms with scores from the first principal component, pairwise distances, and first sparse principal component for the iris dataset.

Figure 6
A plot showing the original cars dataset and histograms with pairwise distances and scores from the first principal component and first sparse principal component.
Table 3
Median execution time of the clusterabilitytest() function for each dataset and test/dimension reduction combination. Time is measured in milliseconds.
| DATA SET | DIP PCA | DIP SPARSE PCA (EN) | DIP DISTANCE | SILVERMAN PCA | SILVERMAN SPARSE PCA (EN) | SILVERMAN DISTANCE |
|---|---|---|---|---|---|---|
| normals1 | 2.35 | 3.35 | 4.45 | 948 | 698 | 3,130 |
| normals2 | 2.62 | 4.04 | 5.17 | 751 | 688 | 2,920 |
| normals3 | 2.23 | 5.54 | 3.79 | 732 | 827 | 2,920 |
| normals4 | 2.43 | 4.51 | 4.40 | 724 | 697 | 2,880 |
| normals5 | 2.42 | 4.64 | 5.53 | 663 | 712 | 2,840 |
| iris | 2.51 | 4.50 | 4.40 | 827 | 702 | 2,830 |
| cars | 2.49 | 3.95 | 2.16 | 756 | 702 | 847 |
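A median execution time of the kind reported in Table 3 can be obtained by timing repeated calls and taking the median of the elapsed times. The sketch below shows the idea in Python with a stand-in workload; the helper name, the number of repeats, and the workload are assumptions, and the benchmarking procedure actually used for the table is not described in the caption.

```python
import statistics
import time


def median_runtime_ms(func, *args, repeats=25):
    """Call func(*args) `repeats` times and return the median time in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload (hypothetical): sorting a reversed list of integers.
workload = list(range(10_000, 0, -1))
elapsed = median_runtime_ms(sorted, workload, repeats=5)
print(f"median runtime: {elapsed:.3f} ms")
```

The median is less sensitive than the mean to occasional slow runs caused by other activity on the machine, which is a common reason to report it for timings.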
