Sources and patterns of uncertainty in construction MSMEs: A machine learning study in southwestern Colombia

Open Access | May 2026

Figures & Tables

Fig. 1:

Classification of uncertainty sources in construction projects. Source: Hazır and Ulusoy (2020).

Fig. 2:

Data processing pipeline. Source: Own, 2025. CT, classification tree.

Fig. 3:

Bootstrap-based feature importances (1,000 iterations) for internal sources of uncertainty in construction projects: (a) organisational uncertainty; (b) activity durations uncertainty; (c) resource use uncertainty; (d) requirement changes and quality issues uncertainty; (e) resource availability uncertainty. Source: Own, 2025.

Fig. 4:

CT with the highest frequency across bootstrap samples for internal uncertainty sources in construction projects: (a) organisational uncertainty (frequency: 200/1,000); (b) activity durations uncertainty (frequency: 230/1,000); (c) resource use uncertainty (frequency: 42/1,000); (d) requirement changes and quality issues uncertainty (frequency: 139/1,000); (e) resource availability uncertainty (frequency: 35/1,000). Source: Own, 2025. CT, classification tree.

Fig. 5:

Bootstrap-based feature importances (1,000 iterations) for external sources of uncertainty in construction projects: (a) logistics uncertainty; (b) environmental uncertainty; (c) sociopolitical uncertainty; (d) market uncertainty; (e) technological uncertainty. Source: Own, 2025.

Fig. 6:

CT with the highest frequency across bootstrap samples for external uncertainty sources in construction projects: (a) logistics uncertainty (frequency: 268/1,000); (b) environmental uncertainty (frequency: 83/1,000); (c) sociopolitical uncertainty (frequency: 71/1,000); (d) market uncertainty (frequency: 626/1,000); (e) technological uncertainty (frequency: 241/1,000). Source: Own, 2025. CT, classification tree.
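The "frequency: k/1,000" values in the Fig. 4 and Fig. 6 captions count how often one tree structure recurs across bootstrap resamples. The page does not say how the authors decide that two trees are the same, so the following is only a minimal sketch of the general idea under stated assumptions: a one-split decision stump stands in for a full CT, the `(feature, threshold)` pair is used as the structure signature, and the toy data (`TOY_ROWS`, `TOY_LABELS`) are hypothetical values invented for illustration, not the survey data.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(rows, labels):
    """Return (feature_index, threshold) of the best single split, or None
    when no split lowers the weighted Gini impurity."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best


def most_frequent_structure(rows, labels, n_boot=1000, seed=0):
    """Fit one stump per bootstrap resample and tally identical structures."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        counts[fit_stump([rows[i] for i in idx], [labels[i] for i in idx])] += 1
    return counts.most_common(1)[0], counts


# Hypothetical toy data: (months in service, number of activities) per company,
# with label 1 meaning "higher perceived uncertainty".
TOY_ROWS = [(12, 1), (24, 2), (36, 1), (60, 3), (90, 2), (120, 4), (200, 2), (300, 3)]
TOY_LABELS = [0, 0, 0, 1, 1, 1, 1, 1]

(top_structure, top_count), all_counts = most_frequent_structure(TOY_ROWS, TOY_LABELS)
print(f"most frequent structure: {top_structure} (frequency: {top_count}/1000)")
```

A low count for the winning structure, as in panels such as resource availability (35/1,000), would suggest that the fitted tree is sensitive to which observations are resampled, so the corresponding rules are best read with more caution than those of a structure that recurs often.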

Perceived frequency of the sources of uncertainty

| Source of uncertainty | Variable | Mean* | SD |
|---|---|---|---|
| Organisational | Inherent complexity of the construction project | 4.62 | 1.30 |
| | Ambiguity in selection criteria | 3.85 | 1.59 |
| | Experts’ consultation | 4.31 | 1.26 |
| | Risk-taking willingness of decision makers | 3.96 | 1.18 |
| Activity durations | Activity duration differing from actual duration | 4.12 | 1.58 |
| Resource use | Inaccurate resource estimation | 3.81 | 1.52 |
| Requirement changes and quality issues | Changes in project requirements | 3.77 | 1.27 |
| Resource availability | Inflexible resource availability | 3.54 | 1.36 |
| Logistics | Safety issues | 3.77 | 1.21 |
| | Site access conditions | 2.88 | 1.18 |
| | Supply availability fluctuations | 3.58 | 1.21 |
| Environmental | Inconsistent weather | 4.00 | 1.41 |
| | Adverse geographic conditions | 3.65 | 1.60 |
| Sociopolitical | Policies and regulations | 3.50 | 1.39 |
| | Social conditions | 3.27 | 1.48 |
| Market | Market conditions | 4.27 | 1.56 |
| Technological | Equipment reliability and construction methods | 3.42 | 1.33 |

Summary of the most influential features and dominant classification rules associated with higher perceived uncertainty across domains

Columns 2 to 5 report feature importance; columns 6 to 10 report the CT rules for a higher level of uncertainty. A dash marks a cell with no rule.

| Uncertainty source | Most important feature | MDI (feature) | Most important signal | MDI (signal) | CT rule: number of activities | CT rule: number of state projects | CT rule: months in service | CT rule: origin | CT rule: signal | Highest share of companies classified in a leaf node (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Organisational | Number of activities | 0.217 | Subjective expert information | 0.185 | - | - | ≤327 | - | Subjective expert information | 70.7 |
| Activity durations | Months in service | 0.388 | Leader decision timing | 0.044 | - | - | - | Outside of Valle del Cauca and Huila | - | 50 |
| Resource use | Months in service | 0.341 | Inflexible cost estimation | 0.087 | ≤3 | - | ≥60 | Outside of Cauca | - | 43.9 |
| Requirement changes and quality issues | Months in service | 0.379 | Design changes | 0.042 | ≥1 | - | ≥67 and ≤276 | - | - | 34.1 |
| Resource availability | Months in service | 0.315 | Limited availability of capable workers in the area | 0.066 | ≤3 | - | ≤141 | Outside of Valle del Cauca | - | 37.8 |
| Logistics | Material acquisition | 0.212 | Material acquisition | 0.212 | - | - | - | - | Signals other than material acquisition and supply chain structure | 57.3 |
| Environmental | Months in service | 0.406 | Heavy rains | 0.030 | ≤2 | - | ≤327 | - | Heavy rains | 42.7 |
| Sociopolitical | Months in service | 0.202 | Worker social discontent | 0.127 | ≤2 | - | - | - | Signals other than worker social discontent and non-working days granted | 62.2 |
| Market | Months in service | 0.414 | Supply prices | 0.082 | - | - | ≥29 | - | - | 82.9 |
| Technological | Number of states where the company has projects | 0.314 | Renewable resource efficiency | 0.048 | ≤2 | ≤3 | - | - | - | 74.4 |

Descriptive statistics of the surveyed companies

| Variable meaning | Sub-variables | Mean | SD | Min | Max |
|---|---|---|---|---|---|
| Origin state of the company | Cauca | 0.35 | 0.49 | 0 | 1 |
| | Nariño | 0.19 | 0.40 | 0 | 1 |
| | Valle del Cauca | 0.35 | 0.49 | 0 | 1 |
| | Huila | 0.12 | 0.33 | 0 | 1 |
| Number of states where the company has projects | - | 1.62 | 0.98 | 1 | 4 |
| Number of months since the commercial registration date of the company | - | 92.31 | 81.71 | 5 | 363 |
| Size of the company | Micro | 0.92 | 0.27 | 0 | 1 |
| | Small | 0.04 | 0.20 | 0 | 1 |
| | Medium | 0.04 | 0.20 | 0 | 1 |
| Number of ISIC activities the company executes | - | 2.00 | 0.94 | 1 | 4 |
| Economic activities carried out by companies | Construction of residential buildings | 0.35 | 0.49 | 0 | 1 |
| | Construction of non-residential buildings | 0.23 | 0.43 | 0 | 1 |
| | Construction of roads and railways | 0.12 | 0.33 | 0 | 1 |
| | Construction of utility projects | 0.23 | 0.43 | 0 | 1 |
| | Construction of other civil engineering works | 0.42 | 0.50 | 0 | 1 |
| | Other specialised activities | 0.27 | 0.45 | 0 | 1 |
| | Real estate activities | 0.04 | 0.20 | 0 | 1 |
| | Architectural activities | 0.15 | 0.37 | 0 | 1 |
| | Technical consultancy | 0.19 | 0.40 | 0 | 1 |

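Validity of controlled data expansion strategy across internal and external sources of uncertainty

| Uncertainty source | Lower uncertainty class: original sample number | Lower uncertainty class: augmented sample number | Lower uncertainty class: lowest p-value across all variables | Higher uncertainty class: original sample number | Higher uncertainty class: augmented sample number | Higher uncertainty class: lowest p-value across all variables | Total observations |
|---|---|---|---|---|---|---|---|
| Organisational | 2 | 8 | 0.62 | 23 | 95 | 0.45 | 103 |
| Activity durations | 5 | 21 | 0.29 | 20 | 82 | 0.68 | 103 |
| Resource use | 7 | 29 | 0.69 | 18 | 74 | 0.52 | 103 |
| Requirement changes and quality issues | 13 | 54 | 0.54 | 12 | 49 | 0.56 | 103 |
| Resource availability | 8 | 33 | 0.47 | 17 | 70 | 0.40 | 103 |
| Logistics | 2 | 8 | 0.62 | 23 | 95 | 0.29 | 103 |
| Environmental | 4 | 16 | 0.32 | 21 | 87 | 0.52 | 103 |
| Sociopolitical | 5 | 21 | 0.29 | 20 | 82 | 0.47 | 103 |
| Market | 1 | 4 | 1.00 | 24 | 99 | 0.56 | 103 |
| Technological | 4 | 16 | 0.37 | 21 | 87 | 0.27 | 103 |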
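
The class counts in the table above can be checked with a little arithmetic. The sketch below is not part of the authors' pipeline; it only re-adds the reported counts and compares the share of the lower uncertainty class before and after augmentation. The names `AUGMENTATION_ROWS` and `class_shares` are introduced here for illustration.

```python
# (source, lower original, lower augmented, higher original, higher augmented)
# Counts transcribed from the table above.
AUGMENTATION_ROWS = [
    ("Organisational", 2, 8, 23, 95),
    ("Activity durations", 5, 21, 20, 82),
    ("Resource use", 7, 29, 18, 74),
    ("Requirement changes and quality issues", 13, 54, 12, 49),
    ("Resource availability", 8, 33, 17, 70),
    ("Logistics", 2, 8, 23, 95),
    ("Environmental", 4, 16, 21, 87),
    ("Sociopolitical", 5, 21, 20, 82),
    ("Market", 1, 4, 24, 99),
    ("Technological", 4, 16, 21, 87),
]


def class_shares(rows):
    """Return (source, original total, augmented total,
    lower-class share before, lower-class share after) for each row."""
    out = []
    for name, lo_orig, lo_aug, hi_orig, hi_aug in rows:
        total_orig = lo_orig + hi_orig
        total_aug = lo_aug + hi_aug
        out.append((name, total_orig, total_aug,
                    lo_orig / total_orig, lo_aug / total_aug))
    return out


summary = class_shares(AUGMENTATION_ROWS)
for name, total_orig, total_aug, share_orig, share_aug in summary:
    print(f"{name:40s} n = {total_orig:3d} -> {total_aug:3d}   "
          f"lower-class share {share_orig:.3f} -> {share_aug:.3f}")
```

For every source the original counts add up to 25 and the augmented counts to 103, and the lower-class share moves by less than 0.01, which is consistent with the augmentation being described as class-preserving.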

Perceived magnitude of the sources of uncertainty (Likert scale: 1–4)

| Source of uncertainty | Variable | Mean* | SD |
|---|---|---|---|
| Organisational | Inherent complexity of the construction project | 3.15 | 0.78 |
| | Ambiguity in selection criteria | 2.77 | 0.95 |
| | Experts’ consultation | 2.69 | 0.88 |
| | Risk-taking willingness of decision makers | 2.81 | 1.06 |
| Activity durations | Activity duration differing from actual duration | 2.92 | 0.80 |
| Resource use | Inaccurate resource estimation | 2.92 | 1.06 |
| Requirement changes and quality issues | Changes in project requirements | 2.38 | 1.02 |
| Resource availability | Inflexible resource availability | 2.85 | 0.97 |
| Logistics | Safety issues | 3.15 | 0.88 |
| | Site access conditions | 2.19 | 0.98 |
| | Supply availability fluctuations | 3.12 | 0.95 |
| Environmental | Inconsistent weather | 3.08 | 0.74 |
| | Adverse geographic conditions | 2.92 | 0.98 |
| Sociopolitical | Policies and regulations | 2.85 | 0.88 |
| | Social conditions | 2.81 | 1.02 |
| Market | Market conditions | 3.35 | 0.75 |
| Technological | Equipment reliability and construction methods | 3.04 | 0.82 |

Comparison with previous approaches to uncertainty assessment in construction projects and this study’s contribution

| Study | Approach/method | Project scale | Data nature | Limitation | Study contribution |
|---|---|---|---|---|---|
| Ali et al. (2018) | Expert-based RII | Public infrastructure | Five-item Likert scale from domain experts | Subjective weighting; limited empirical validation | Provides baseline prioritisation for public-sector risk budgeting; relies on expert weighting rather than empirical inference |
| Shabani et al. (2023) | Narrative search and semi-structured expert interviews | Public road projects | Expert-informed categorisation | Subjective categorisation; limited replicability across contexts | Enhanced understanding of contextual, operational and strategic uncertainty through expert narratives |
| Erol et al. (2022) | ANP model with a two-round Delphi process | Mega construction projects | Domain experts weighting | Subjective weighting and limited applicability to MSMEs | Risk quantification model for mega construction projects |
| Ulupui et al. (2024) | Partial Least Squares and RA for ARI | Multisector MSMEs from Indonesia | Five-item Likert scale from MSME representatives | Applied to MSMEs, but not the construction industry explicitly | Framework for quantifying the interactions of technological, organisational and environmental risk dimensions among MSMEs |
| Our approach | RF feature importance and CTs | Construction MSMEs from Colombia | Empirical survey data combined with class-preserving synthetic augmentation for small-sample modelling | Strategic-level focus; does not capture operational dynamics | First interpretable, machine-learning framework modelling ten internal and external uncertainty sources in construction MSMEs |

DOI: https://doi.org/10.2478/otmcj-2026-0005 | Journal eISSN: 1847-6228 | Journal ISSN: 1847-5450
Language: English
Page range: 64 - 81
Submitted on: Sep 9, 2025
Accepted on: Jan 7, 2026
Published on: May 26, 2026
Published by: University of Zagreb
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Cristian David Tobar Montilla, Mariela Muñoz-Añasco, Adriana M. Nieto-Muñoz, Elvia Ruiz-Beltran, published by University of Zagreb
This work is licensed under the Creative Commons Attribution 4.0 License.