
Expert-Based Risk Assessment for Flight Safety Using Kendall's W and Pearson's Chi-Square

Open Access
|Oct 2025


1. INTRODUCTION

Studies examining patterns and trends in aviation accidents (AA) and incidents indicate that in-flight technical malfunctions of aviation equipment occur about four times as frequently as errors committed by flight crews. Yet, when such incidents escalate into catastrophic events, human error is found to be roughly four times more likely to be the cause than technical failure. Most accidents and incidents in which an airworthy aircraft collides with the ground during controlled flight result primarily from violations of flight procedures and shortcomings in the professional performance of aviation personnel at various levels of the air transport system. This issue has become sufficiently serious and widespread that in various countries, airlines, along with international aviation organizations and professional associations, have intensified efforts to address it.

2. CURRENT APPROACH TO THE PROBLEM

The primary objective of flight safety (FS) management in air transport is to develop measures that prevent the ongoing trend toward hazardous situations in civil aviation by establishing a continuously operating FS monitoring and control system. This system should be based on the principles of international quality standards (ISO 9000) and the ICAO Safety Management System (SMS), particularly emphasizing the process-oriented approach to aviation enterprise operations. Such a system must be equipped with tools for technical and economic analysis and make use of scientific advances in the field of goal-oriented management of complex systems that ensure flight safety [1].

Key Data from 2024

In 2024, there were 14 fatal airline accidents worldwide, resulting in 304 fatalities. In Europe, three fatal accidents occurred in commercial air transport, causing three deaths. In general aviation, 27 fatal accidents involving non-complex airplanes led to 44 fatalities. These figures are consistent with historical data [2].

Fig. 1.

Structure of the Annual Safety Review 2024 [2].

This also highlights the relevance of the topic under investigation. The process of flight safety monitoring based on the assessment of risk levels enables the identification of adverse factors occurring during flight and the prediction of their potential consequences.

The nature of flight crew errors in flight remains insufficiently studied, and the numerous unsystematic preventive measures currently in use have proven to be of limited effectiveness. Consequently, reliable methods for preventing aircraft accidents caused by crew errors have yet to be developed. One of the main reasons for this persistent problem lies in the inadequate approach taken by airline management to address it. While significant time and resources are spent analyzing detected errors, flight operations specialists often lack the ability to assess risk levels effectively—an assessment that would allow them to:

  • identify potentially hazardous situations;

  • evaluate the probability of danger occurrence;

  • select alternative measures to reduce the level of risk; and

  • assess the effectiveness of the implemented solutions.

Therefore, to improve flight safety management, it is essential to develop new methods that enable flight services to operate efficiently according to the assessed risk levels of adverse factors. These methods should be integrated into an Automated Control System (ACS) capable of collecting, storing, and analyzing relevant operational data and processing events across all hierarchical levels through a unified algorithmic framework.

Data and management

The proposed approach complies with the ICAO requirements set out in the Safety Management Manual (SMM). According to these guidelines, introducing the concept of an acceptable level of flight safety (ALoS) requires not only adherence to the standard safety principles and requirements already in place, but also the application of an approach based on measurable flight safety indicators.

An acceptable level of flight safety represents the goals defined by supervisory authorities, operators, and customers, which must be achieved and maintained within the field of flight safety. This level serves as a benchmark against which regulatory bodies can assess flight safety performance. When determining the acceptable level of safety, several factors must be considered: the existing level of operational risk, the cost–benefit ratio of improving the risk assessment system, and public expectations regarding safety in the aviation sector.

One possible method of doing so involves assessing the hazard coefficient of adverse factors based on the number of recorded incidents. The key principle in monitoring flight safety levels is therefore the evaluation of hazardous in-flight events.

In practice, airlines most often measure flight safety performance using data on severe adverse events. By continually reducing the number of such incidents, specialists can in turn decrease the overall frequency of aviation accidents. This relationship can be expressed as:

$$\Delta f_{\text{possible.min}} \le \Delta f_{\text{fact}} = \Delta f_{\text{possible.max}}$$

where Δf represents the frequency of events.

A major limitation of this method, however, lies in its reliance on “high-severity” events (accidents and serious incidents) when analyzing risk levels and identifying adverse in-flight factors. This approach tends to be coarse and lacks precision in defining intermediate levels of risk. To minimize these shortcomings, the author proposes expanding the analysis to include not only dangerous occurrences and deviations, but also “other negative events” (see Fig. 2) [3].

Figure 2.

Pyramid of negative events (illustrating the “1:10:30:600 rule”) [3].

Following the ICAO recommendations, “other negative events” refer to less critical cases that pose potential safety threats. Although these events may seem minor, they can serve as early indicators of latent problems in flight safety. Ignoring such underlying issues may lead to an increased number of more serious incidents. Recurrent events are particularly significant, as the data they provide are valuable for statistical evaluation.

The main difficulty of the current method lies in defining the weighting factor for each negative event, due to inherent inequalities [1]. Therefore, it is crucial to develop a system for the quantitative assessment of these weighting factors, based on the theory of risk evaluation.

It can thus be concluded that unifying the theoretical foundations of flight safety within the existing ICAO risk-based models remains incomplete for several reasons:

  • The concept and modelling of risk vary across disciplines (finance, ecology, engineering, etc.), and sector-specific features are often incorrectly taken as a foundation for new risk models.

  • From the general standpoint of mathematical formalization and the theory of random processes, only two models or formulas of risk can realistically be used: those based on the accident rate or the uncertainty of the studied phenomena. Moreover, within the broader theoretical framework, no significant difficulties arise in defining risk or assessing system safety based on risk interactions across various systems.

  • The direct transfer of methods from reliability theory to the evaluation of hazards caused by system failures does not yield satisfactory or unambiguous results. In particular, it fails to explain adequately the causes of accidents as rare and improbable events.

Method for Quantitative Evaluation of the Hazard Level of Adverse In-Flight Factors

Risk assessment makes it possible to classify identified events into groups of similar occurrences according to decreasing levels of risk. The resulting quantitative values can then be used to establish a priority order for implementing flight safety measures.

To determine risk levels based on operational monitoring data, the evaluation follows the rules of flight airworthiness that regulate the probabilities of special in-flight situations, as illustrated in Figure 3.

Figure 3.

Airworthiness requirements for the functional reliability of aviation equipment.

Here, CFC denotes a situation with complication of flight conditions, DS a difficult situation, E an emergency, and CS a catastrophic situation; P_x(O) is the probability of a special situation caused by a functional failure, and P_x(Σ) is the total probability of special situations caused by functional failures.

In this case, the total risk assessment is expressed as the sum:

$$P_x + P_{CS} + P_E + P_{DS} + P_{CFC}.$$

For the purpose of achieving a higher level of flight safety, particular attention should be paid to seemingly minor events with limited immediate consequences. The risk level of such events is best assessed through expert evaluation, since the use of purely mathematical methods is often not feasible for this purpose. The main challenges in developing mathematical rules for the quantitative evaluation of adverse in-flight factors include:

  • difficulties in ranking such negative events;

  • difficulties in defining their potential consequences;

  • challenges in analyzing flight development within the entire chain of adverse factors;

  • difficulties in determining flight outcomes involving several event ranges within short time intervals; and

  • difficulties in assessing how one adverse factor may trigger others and contribute to the cascading development of events.

Figure 4.

Application of airworthiness criteria in risk evaluation [3].

A graphical representation of the method for the quantitative evaluation of negative event frequency was shown above in Fig. 2, according to which the rule 1:10:30:600 (a conditional ratio of the recurrence of negative events) and the rule 1:10:1000:10000:(>10000) (a conditional ratio of the recurrence of special in-flight situations) can be applied. Formally, the first relationship is expressed as:

$$q_A : q_F : q_{SI} : q_I = 1 : 10 : 30 : 600,$$

where q_A is the number of accidents, q_F the number of failures, q_SI the number of serious aviation incidents, and q_I the number of aviation incidents.

In the updated model of the flight conditions pyramid, the relationship between event categories is expressed as:

$$q_{CS} : q_E : q_{DS} : q_{CFC} : q_{WCFC} = 1 : 10 : 10^3 : 10^4 : (> 10^4),$$

where q_CS is the number of catastrophic situations, q_E the number of emergencies, q_DS the number of difficult situations, q_CFC the number of situations with complication of flight conditions, and q_WCFC the number of situations without complication of flight conditions.

This classification method is straightforward to apply and allows for continuous monitoring of the current risk level during flight safety management. The assessment is conducted in accordance with the risk classification presented in Table 1 and Table 2.

Table 1.

Probability of an accident for different event types.

| i (index of event type) | Event type (special situation in flight) | Q_i (accident probability) | n_i (number of controllable events of type i) | T |
|---|---|---|---|---|
| 1 | WCFC (situation without complication of flight conditions) | Q₁ = 10⁻⁵ | q₁, number of controllable events of WCFC type | Flight hours during the flight safety monitoring |
| 2 | CFC (situation with complication of flight conditions) | Q₂ = 10⁻⁴ | q₂, number of controllable events of CFC type | |
| 3 | DS (difficult situation) | Q₃ = 10⁻³ | q₃, number of controllable events of DS type | |
| 4 | E (emergency) | Q₄ = 10⁻¹ | q₄, number of controllable events of E type | |
| 5 | CS (catastrophic situation) | Q₅ = 10⁰ | q₅, number of controllable events of CS type | |

Table 2.

Risk classification.

| Probability of event \ Rank of consequences | Insignificant (WCFC) | Insignificant (CFC) | Significant (DS) | Dangerous (E) | Catastrophic (CS) |
|---|---|---|---|---|---|
| Frequent, 10⁻³ < Q ≤ 10⁰ | Subject to analysis | Unacceptable | Unacceptable | Unacceptable | Unacceptable |
| Quite probable, Q ≤ 10⁻³ | Subject to analysis | Subject to analysis | Unacceptable | Unacceptable | Unacceptable |
| Probable, Q ≤ 10⁻⁵ | Acceptable | Subject to analysis | Subject to analysis | Unacceptable | Unacceptable |
| Improbable, Q ≤ 10⁻⁶ | Acceptable | Acceptable | Subject to analysis | Subject to analysis | Unacceptable |
| Extremely improbable, Q ≤ 10⁻⁷ | Acceptable | Acceptable | Subject to analysis | Subject to analysis | Subject to analysis |
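
The lookup in Table 2 can be sketched in Python as follows. This is an illustrative sketch rather than part of the described system: the names `probability_band` and `classify` are hypothetical, and the lower edge of each probability band is assumed to be the upper edge of the next rarer band, which Table 2 does not state explicitly.

```python
# Consequence ranks, in the column order of Table 2.
CONSEQUENCES = ["WCFC", "CFC", "DS", "E", "CS"]

# Upper bounds on Q for each band, rarest first.
# Assumption: each band starts where the previous (rarer) one ends.
BANDS = [
    (1e-7, "Extremely improbable"),
    (1e-6, "Improbable"),
    (1e-5, "Probable"),
    (1e-3, "Quite probable"),
]

# Table 2 cells: A = acceptable, S = subject to analysis, U = unacceptable.
MATRIX = {
    "Frequent":             "SUUUU",
    "Quite probable":       "SSUUU",
    "Probable":             "ASSUU",
    "Improbable":           "AASSU",
    "Extremely improbable": "AASSS",
}
VERDICT = {"A": "Acceptable", "S": "Subject to analysis", "U": "Unacceptable"}


def probability_band(q):
    """Return the Table 2 probability band for a risk level q in [0, 1]."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("Q must be a probability between 0 and 1")
    for upper, label in BANDS:
        if q <= upper:
            return label
    return "Frequent"  # 1e-3 < Q <= 1


def classify(q, consequence):
    """Return the Table 2 verdict for risk level q and a consequence rank."""
    column = CONSEQUENCES.index(consequence)
    return VERDICT[MATRIX[probability_band(q)][column]]


print(classify(5e-4, "DS"))    # Unacceptable
print(classify(5e-6, "WCFC"))  # Acceptable
print(classify(5e-8, "CS"))    # Subject to analysis
```

Storing the matrix as one short string per row keeps the code visually close to the table, which makes it easy to check cell by cell.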

It is necessary to use expert evaluation to assign an event class, since other mathematical methods are not applicable. The manifestation of an adverse factor, the crew's actions to mitigate its consequences, and the resulting flight outcome are random events. Therefore, as an objective, integral measure for assessing flight safety, we adopt the probability of an unsuccessful flight outcome (failure or accident). This indicator is hereafter referred to as the flight risk level, Q. The mitigation-complexity factors together with the manifested adverse factors form a likelihood matrix:

$$\{\alpha_{ij}\} = \begin{vmatrix} \alpha_{11} & \ldots & \alpha_{1j} & \ldots & \alpha_{1n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ \alpha_{i1} & \ldots & \alpha_{ij} & \ldots & \alpha_{in} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ \alpha_{m1} & \ldots & \alpha_{mj} & \ldots & \alpha_{mn} \end{vmatrix} \tag{1}$$

where n denotes the number of flight stages and m the number of adverse factors.

The outcome hazard factor of a special situation is defined as β_ij, the probability of an aviation incident resulting from an unmitigated i-type adverse factor occurring at the j-th stage of flight. The outcome hazard factors of special situations in flight form a likelihood matrix:

$$\{\beta_{ij}\} = \begin{vmatrix} \beta_{11} & \ldots & \beta_{1j} & \ldots & \beta_{1n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ \beta_{i1} & \ldots & \beta_{ij} & \ldots & \beta_{in} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ \beta_{m1} & \ldots & \beta_{mj} & \ldots & \beta_{mn} \end{vmatrix} \tag{2}$$

Introducing the concepts of the complexity factor for mitigating an adverse factor and the outcome hazard factor of a special flight situation makes it possible to define the overall danger factor of a special situation caused by the occurrence of an adverse factor, expressed as:

$$\{OP_{ij}\} = \{\alpha_{ij}\} \otimes \{\beta_{ij}\} \tag{3}$$

where the symbol ⊗ denotes element-wise (step-by-step) matrix multiplication. Accordingly, the value OP_ij represents the degree of hazard associated with an adverse factor. Given the known a priori probability of occurrence q_ij, the flight risk level Q_ij can be evaluated as:

$$Q_{ij} = OP_{ij} \times q_{ij} \tag{4}$$
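
As a minimal sketch of Eqs. (3) and (4), the element-wise products can be written in plain Python. All numerical values below are invented purely for illustration (two adverse factors, three flight stages), and the helper name `elementwise` is hypothetical.

```python
# Illustrative values only: rows are adverse factors, columns are flight stages.
alpha = [[0.2, 0.5, 0.1],
         [0.4, 0.3, 0.6]]     # mitigation-complexity factors (assumed)
beta = [[0.01, 0.10, 0.05],
        [0.02, 0.20, 0.01]]   # outcome hazard factors (assumed)
q = [[1e-3, 1e-4, 1e-3],
     [1e-2, 1e-3, 1e-4]]      # a priori probabilities of occurrence (assumed)


def elementwise(a, b):
    """Multiply two equally shaped matrices cell by cell."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


op = elementwise(alpha, beta)   # Eq. (3): OP_ij = alpha_ij * beta_ij
risk = elementwise(op, q)       # Eq. (4): Q_ij = OP_ij * q_ij

for i, row in enumerate(risk, start=1):
    print(f"factor {i}:", ["%.2e" % value for value in row])
```

Each resulting Q_ij can then be compared against the probability bands of the risk classification to decide whether a factor and flight stage combination needs attention.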

By using the quantitative values of α and β, it becomes possible to determine OP based on the level of danger of the flight's special situation. This, in turn, enables the development of a rational strategy for crew actions to mitigate the consequences of adverse factors, the formulation of requirements for flight safety systems, and the establishment of appropriate training standards for flight crews.

This approach allows for the evaluation of the current level of flight safety during aircraft operation and the forecasting of the effectiveness of planned preventive measures.

3. RISK FACTOR AND MODELLING

The first flight operations risk category to be analyzed here is Controlled Flight into Terrain (CFIT). To identify the contributory risk factors and their interrelationships, human experts from various domains of flight operations were interviewed. Each general risk factor was subsequently decomposed into its component elements [4].

This process of decomposition and mapping of interrelationships is referred to as the risk structure. The structured knowledge elicitation method employed here is loosely based on the Analytical Hierarchy Process (AHP) [4].

To ensure that the experts' opinions were consistent and not random, Kendall's coefficient of concordance (W) was used as a coordination criterion. W is a statistical measure of the degree of agreement among multiple experts or judges evaluating the same set of items. It is a non-parametric statistic, particularly suitable when data do not meet the assumptions of parametric tests such as normality. The coefficient ranges from 0 to 1, where 0 indicates no agreement among experts and 1 represents perfect agreement [5].

Note that the W statistic should only be applied when measuring concordance between variables that evaluate the same general property of objects. If both positive and negative correlations are considered equally meaningful, this test would not be appropriate.

The ranking of adverse events was performed according to the classification presented in Table 3.

Table 3.

Ranking of adverse events by independent experts.

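| Event \ Expert | 1 | 2 | … | i | … | m |
|---|---|---|---|---|---|---|
| 1 | c11 | c21 | … | ci1 | … | cm1 |
| 2 | c12 | c22 | … | ci2 | … | cm2 |
| … | … | … | … | … | … | … |
| j | c1j | c2j | … | cij | … | cmj |
| … | … | … | … | … | … | … |
| n | c1n | c2n | … | cin | … | cmn |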

The number of columns (m) in the table corresponds to the number of experts who participated in the survey, while the number of rows (n) corresponds to the number of adverse events evaluated by those experts. At the intersection of the i-th column and the j-th row is the element Cij, representing the rank (or position) assigned by the i-th expert to the j-th event.

Based on the data obtained from the expert survey table, both the hazard indicators of the evaluated events and the degree of agreement among expert judgments are assessed. These results make it possible to develop a hazard assessment scale, that is, to determine which type of situation (E, DS, CS, CFC, or WCFC) a given adverse risk factor is likely to produce.

First, the events are grouped into five categories: CS (catastrophic situation), E (emergency), DS (difficult situation), CFC (complication of flight conditions), and WCFC (without complication of flight conditions). The first group includes the most dangerous events, the second group somewhat less dangerous ones, and so on, with the fifth group containing the least dangerous events.

Within each group, events are ranked according to their degree of hazard – the most dangerous event occupies the first position, the next less dangerous event the second, and so forth. In this way, all events are systematically ranked by hazard level. In addition to these five main groups, two supplementary categories may also be identified: IC (indifferent condition) – events in which no danger arises, and RRC (risk-reducing condition) – events that may partially mitigate the likelihood of hazardous situations.

Kendall's coefficient of concordance (W) was then applied to the observations from all experts within each category independently. Kendall's W is calculated as follows:

$$W = \frac{12S}{m^2\left(n^3 - n\right) - mT} \tag{5}$$

where S is the sum-of-squares from the row sums of ranks R_i, n is the number of objects, m is the number of experts (observers) and T is a correction factor accounting for tied ranks.

For each row of ranks assigned by a given expert, the sum of ranks is calculated. This sum represents a random variable that, in the general case, provides an estimate of the variance relative to the maximum possible variance of ranks, effectively yielding a measure of rank correlation.

$$D_x = \frac{1}{n - 1}\sum_{i=1}^{n} \left(R_i - \overline{R}\right)^2 = \frac{1}{n - 1} S \tag{6}$$

$$D_{\max} = \frac{m^2\left(n^3 - n\right)}{12(n - 1)} \tag{7}$$

$$\overline{R} = \frac{1}{n}\sum_{i=1}^{n} R_i \tag{8}$$

$$S = \sum_{j=1}^{m}\left(\sum_{i=1}^{n}\left(R_{ij} - \overline{R}\right)^2\right) \tag{9}$$

$$T = \sum_{g=1}^{q}\left(t_g^3 - t_g\right) \tag{10}$$

where D_x and D_max denote the calculated and the maximum possible variance, R_i and R̄ represent the rank and the mean rank, q is the number of groups of tied ranks, and t_g is the number of tied ranks in group g of the q groups.
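
As a short worked check, assuming no tied ranks (T = 0), dividing Eq. (6) by Eq. (7) recovers the form of Eq. (5), which is the sense in which W compares the observed variance of the rank sums with its maximum possible value:

```latex
\frac{D_x}{D_{\max}}
  = \frac{\dfrac{S}{n - 1}}{\dfrac{m^2\left(n^3 - n\right)}{12(n - 1)}}
  = \frac{12S}{m^2\left(n^3 - n\right)}
  = W \quad (T = 0).
```

With ties present, the denominator is reduced by mT as in Eq. (5).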

The obtained value is evaluated for significance using the Pearson chi-square (χ²) test, by multiplying this coefficient by the number of experts and by the number of degrees of freedom (m − 1). The resulting criterion is then compared with the corresponding tabular value. If the calculated χ² exceeds the latter, the concordance coefficient under study is considered statistically significant.

$$\chi^2 = n(m - 1)W \tag{11}$$

The calculated χ² values are compared with the tabulated (critical) values $\chi_T^2$ for the corresponding number of degrees of freedom. This comparison determines the probability that the observed (calculated) value exceeds the tabular value, that is:

$$P(\chi^2 > \chi_v^2) = \alpha \tag{12}$$

If the obtained χ² values are statistically significant at a high confidence level (α > 0.95), this indicates that the concordance among the m experts is not due to chance. The developed mathematical model of integrated flight risk evaluation can therefore be applied to obtain quantitative estimates of risk levels for special flight situations using the expert evaluation method.

Critical χ2 distribution values can be found in Table 8 of the Pearson distribution reference [6]. Thus, the coefficient of concordance (W) reflects the degree of agreement among multiple experts: the closer its value is to 1 (and the further from 0), the greater the consistency of expert opinions.

4. NUMERICAL EXAMPLE FOR ASSESSING CONSISTENCY

Experts were asked to express their opinions through a questionnaire and to rank the most significant factors contributing to serious flight events, assigning each a rank from 1 to 7. A higher number indicates a greater perceived significance of the factor.

Tables 4 and 5 present the results of this ranking exercise. The highest rank corresponds to the factor with the largest relative weight. Consequently, when selecting elements of a Safety Management System (SMS) for developing measures to enhance flight safety, special attention should be paid to the factor receiving the highest weight, as it plays the most influential role. The relative weights of all factors are shown in the last row of Table 5.

Table 4.

Factors affecting flight safety and assessment of their significance based on expert survey.

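| Expert | No. | 1: Weather conditions | 2: Radio nav. system failures | 3: Birds and foreign obstacles | 4: Collisions on the ground | 5: Financial conditions and economy | 6: Psychological factor | 7: Organization and control |
|---|---|---|---|---|---|---|---|---|
| Technician staff | 1 | 7 | 3 | 2 | 4 | 5 | 6 | 1 |
| Captain | 2 | 3 | 5 | 4 | 5 | 3 | 6 | 2 |
| Pilot 2 | 3 | 2 | 6 | 1 | 3 | 7 | 4 | 5 |
| Dispatcher TC | 4 | 4 | 6 | 5 | 3 | 3 | 5 | 2 |
| Quality manager | 5 | 4 | 3 | 4 | 5 | 4 | 6 | 7 |
| Airport services | 6 | 4 | 2 | 3 | 5 | 6 | 5 | 7 |

Table 5.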

Weight of the factors.

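| Expert \ Factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Σ |
|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 3 | 2 | 4 | 5 | 6 | 1 | |
| 2 | 3 | 5 | 4 | 5 | 3 | 6 | 2 | |
| 3 | 2 | 6 | 1 | 3 | 7 | 4 | 5 | |
| 4 | 4 | 6 | 5 | 3 | 3 | 5 | 2 | |
| 5 | 4 | 3 | 4 | 5 | 4 | 6 | 7 | |
| 6 | 4 | 2 | 3 | 5 | 6 | 5 | 7 | |
| Weights (column sums) | 24 | 25 | 19 | 25 | 28 | 32 | 24 | ΣΣ Xij = 177 |
| Relative weights | 0.135 | 0.141 | 0.107 | 0.141 | 0.1581 | 0.1807 | 0.1355 | |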

To calculate the weights, the sums of all expert rankings across columns and rows were obtained. The total sum of all rankings is:

$$7 + 3 + 2 + 4 + 4 + 4 + \ldots + 1 + 2 + 5 + 2 + 7 + 7 = 177.$$

Next, the values for each column were summed to determine their relative weights. For example, in the first column the weight is 24/177, in the second column 25/177, and so forth. Based on these calculations, it was found that the psychological factor has the greatest weight, indicating that, according to the experts, it is the most significant contributor to flight safety risk.
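
The weighting step can be reproduced with a few lines of Python. The sketch below uses the rankings from Table 4 (one row per expert, one column per factor); the variable names are illustrative only.

```python
# Rankings from Table 4: rows are experts 1..6, columns are factors 1..7.
rankings = [
    [7, 3, 2, 4, 5, 6, 1],
    [3, 5, 4, 5, 3, 6, 2],
    [2, 6, 1, 3, 7, 4, 5],
    [4, 6, 5, 3, 3, 5, 2],
    [4, 3, 4, 5, 4, 6, 7],
    [4, 2, 3, 5, 6, 5, 7],
]

# Sum each factor's column, then normalise by the grand total.
column_sums = [sum(column) for column in zip(*rankings)]
total = sum(column_sums)
weights = [s / total for s in column_sums]

# Factor numbers are 1-based in the article.
top_factor = weights.index(max(weights)) + 1

print(column_sums)  # [24, 25, 19, 25, 28, 32, 24]
print(total)        # 177
print(top_factor)   # 6
```

Factor 6 (the psychological factor) has the largest column sum, 32 out of 177, which is where its leading relative weight comes from.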

The next step is to verify the consistency of expert opinions using Kendall's coefficient of concordance (W). If all ranks are identical – that is, if the experts are in complete agreement – then W = 1. In practice, however, the coefficient usually falls within the range 0 ≤ W ≤ 1. When W = 0 or takes a very small value, it indicates that there is little or no agreement among the experts.

Let us now calculate Kendall's W for the given data and test the degree of concordance among the experts.

$$S = \sum S_i = 27.5$$

We calculate the sum of the event ranks for each expert, for example, in the first row 7 + 3 + 2 + 4 + 5 + 6 + 1 = 28, in the second row 3 + 5 + 4 + 5 + 3 + 6 + 2 = 28, and so on, then we find the average rank value R̄ = 29.5.

It should also be noted that the sum of ranks in each row was checked to ensure that the dataset indeed contains properly ranked data. Since there are seven subjects (events), the sum of rankings in each row should equal 1 + 2 + ⋯ + 7 = 7 × (7 + 1) / 2 = 28. This holds for Experts 1 to 4, whereas the rows of Experts 5 and 6 sum to 33 and 32, respectively (see Table 6). When there are multiple tied ranks, a revised definition of Kendall's W is used. For each rater j, define t_g, where each g represents a group of tied ranks for that rater, and t_g is the number of tied ranks in that group. An example of the distribution of tied ranks is presented in Table 7.

Table 6.

Rank of the experts.

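| Expert \ Factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 | R_i | R_i − R̄ | S_i |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 3 | 2 | 4 | 5 | 6 | 1 | 28 | −1.5 | 2.25 |
| 2 | 3 | 5 | 4 | 5 | 3 | 6 | 2 | 28 | −1.5 | 2.25 |
| 3 | 2 | 6 | 1 | 3 | 7 | 4 | 5 | 28 | −1.5 | 2.25 |
| 4 | 4 | 6 | 5 | 3 | 3 | 5 | 2 | 28 | −1.5 | 2.25 |
| 5 | 4 | 3 | 4 | 5 | 4 | 6 | 7 | 33 | 3.5 | 12.25 |
| 6 | 4 | 2 | 3 | 5 | 6 | 5 | 7 | 32 | 2.5 | 6.25 |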
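
The R_i, deviation, and S_i columns of Table 6 can be recomputed as sketched below. The code follows the article's tabulation, which sums ranks along each expert's row; note that the usual textbook construction of Kendall's W instead sums the ranks each object receives across experts, so this is a reproduction of the table rather than a general-purpose routine.

```python
# Rankings from Table 4: rows are experts 1..6, columns are factors 1..7.
rankings = [
    [7, 3, 2, 4, 5, 6, 1],
    [3, 5, 4, 5, 3, 6, 2],
    [2, 6, 1, 3, 7, 4, 5],
    [4, 6, 5, 3, 3, 5, 2],
    [4, 3, 4, 5, 4, 6, 7],
    [4, 2, 3, 5, 6, 5, 7],
]

row_sums = [sum(row) for row in rankings]          # R_i column
mean_rank_sum = sum(row_sums) / len(row_sums)      # mean of the R_i
deviations = [r - mean_rank_sum for r in row_sums] # R_i minus the mean
squares = [d * d for d in deviations]              # S_i column
S = sum(squares)

print(row_sums)       # [28, 28, 28, 28, 33, 32]
print(mean_rank_sum)  # 29.5
print(S)              # 27.5
```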

For Expert 1 in the example, there are no tied ranks, and therefore T₁ = 0. Similarly, T₃ = 0. For Expert 6, there is one group of two tied ranks, and so T₆ = 2³ − 2 = 6. For Expert 2, there are two groups of tied ranks, giving T₂ = (2³ − 2) + (2³ − 2) = 6 + 6 = 12. Similarly, T₄ = 12. For Expert 5, there is one group of three tied ranks (assigned to factors 1, 3, and 5), giving T₅ = 3³ − 3 = 24. Hence, the total correction factor for ties is T = 0 + 12 + 0 + 12 + 24 + 6 = 54 (see Table 7).

Table 7.

Ties in expert rankings.

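| Expert \ Factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 | T_i |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 2 | 0 | 2 | 1 | 0 | 0 | 12 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 1 | 2 | 2 | 1 | 0 | 12 |
| 5 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 24 |
| 6 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 6 |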
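
The T_i column of Table 7 follows from Eq. (10) applied to each expert's row. A minimal sketch, with the hypothetical helper name `tie_correction`:

```python
from collections import Counter

# Rankings from Table 4: rows are experts 1..6, columns are factors 1..7.
rankings = [
    [7, 3, 2, 4, 5, 6, 1],
    [3, 5, 4, 5, 3, 6, 2],
    [2, 6, 1, 3, 7, 4, 5],
    [4, 6, 5, 3, 3, 5, 2],
    [4, 3, 4, 5, 4, 6, 7],
    [4, 2, 3, 5, 6, 5, 7],
]


def tie_correction(ranks):
    """Eq. (10): sum of t**3 - t over groups of t equal ranks (t >= 2)."""
    group_sizes = Counter(ranks).values()
    return sum(t ** 3 - t for t in group_sizes if t > 1)


per_expert = [tie_correction(row) for row in rankings]
T = sum(per_expert)

print(per_expert)  # [0, 12, 0, 12, 24, 6]
print(T)           # 54
```

Counting how often each rank value occurs in a row identifies the tie groups directly, without having to label the groups by hand as in Table 7.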

The coefficient of concordance is W = 12 × 27.5 / (6² × (7³ − 7) − 54 × 6) = 0.028. The resulting value W = 0.028 indicates a non-zero but very low level of agreement among the experts. To determine whether this result is still statistically acceptable, and to verify whether Factor 6 can indeed be considered the most significant, we test its significance using Pearson's chi-square criterion: χ² = 7 × (6 − 1) × 0.028 = 0.981. The number of degrees of freedom in our case is n − 1 = 7 − 1 = 6, where n = 7 is the number of factors.
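
The arithmetic above can be checked with the short sketch below, which plugs the article's values of S, T, m, and n into Eq. (5) and Eq. (11) as printed. For reference, the commonly tabulated form of the statistic is m(n − 1)W with n − 1 degrees of freedom, which would give a slightly different value; the sketch deliberately follows the article's formula. The dictionary holds the row for six degrees of freedom from Table 8.

```python
m, n = 6, 7        # number of experts, number of factors
S, T = 27.5, 54    # sum of squares and tie correction from the example

W = 12 * S / (m ** 2 * (n ** 3 - n) - m * T)  # Eq. (5)
chi2 = n * (m - 1) * W                        # Eq. (11), as printed

print(round(W, 3))     # 0.028
print(round(chi2, 3))  # 0.981

# Table 8, row for 6 degrees of freedom: significance level -> critical value.
critical_df6 = {0.99: 0.872, 0.975: 1.24, 0.95: 1.64, 0.9: 2.2,
                0.1: 10.65, 0.05: 12.6, 0.025: 14.4, 0.01: 16.8}

# Levels whose tabulated value is exceeded by the calculated chi-square.
exceeded = [level for level, value in critical_df6.items() if chi2 > value]
print(exceeded)        # [0.99]
```

The calculated value lies between the tabulated entries for the 0.99 and 0.975 columns of that row.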

At a significance level of 98%, the calculated coefficient of concordance can be considered statistically significant. This means that, overall, the experts' assessments show a low but non-random level of agreement. Although general consistency among the experts is weak, the importance of one particular factor – identified as the most influential – remains statistically meaningful and should not be excluded when evaluating the performance of the flight safety management system (see Table 8).

Table 8.

Chi-square (χ2) distribution reference table listing critical values [6].

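| Degrees of freedom | α = 0.99 | α = 0.975 | α = 0.95 | α = 0.9 | α = 0.1 | α = 0.05 | α = 0.025 | α = 0.01 |
|---|---|---|---|---|---|---|---|---|
| 1 | - | 0.001 | 0.004 | 0.02 | 2.7 | 3.8 | 5 | 6.6 |
| 2 | 0.02 | 0.051 | 0.103 | 0.211 | 4.6 | 6 | 7.4 | 9.2 |
| 3 | 0.115 | 0.216 | 0.352 | 0.584 | 6.25 | 7.8 | 9.4 | 11.3 |
| 4 | 0.297 | 0.484 | 0.711 | 1.064 | 7.78 | 9.5 | 11.1 | 13.3 |
| 5 | 0.554 | 0.831 | 1.15 | 1.61 | 9.24 | 11.1 | 12.8 | 15.1 |
| 6 | 0.872 | 1.24 | 1.64 | 2.2 | 10.65 | 12.6 | 14.4 | 16.8 |
| 7 | 1.24 | 1.69 | 2.17 | 2.83 | 12.02 | 14.1 | 16 | 18.5 |
| 8 | 1.65 | 2.18 | 2.73 | 3.49 | 13.36 | 15.5 | 17.5 | 20.1 |
| 9 | 2.09 | 2.7 | 3.33 | 4.17 | 14.68 | 16.9 | 19 | 21.7 |

5. CONCLUSIONS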

The example presented herein demonstrates that this method of calculation is straightforward and accessible to anyone familiar with the basics of mathematics. To simplify the process, electronic spreadsheet templates can be used for the computations.

The obtained value of the risk level (Q) characterizes the degree of hazard associated with the operation of an aviation system over a given period and allows for the determination of its safety rating. Similar calculations can be performed for individual airlines, aircraft types, or other operational categories. The results can be widely applied for monitoring and assessing flight safety status, and consequently, for managing flight safety more effectively.

Based on the results of the presented example, Factor 6 was identified as the most significant among those evaluated in the expert survey. This conclusion follows directly from the weighting results. Most experts agreed that the psychological factor has the greatest influence and can, in many cases, lead to catastrophic situations. However, although expert opinions indicate this factor's importance, Kendall's coefficient (W) showed a low level of agreement, suggesting a lack of strong consensus. The Pearson chi-square test confirmed the statistical significance of the factor but also revealed divergence among expert judgments.

This method is particularly recommended in cases where expert responses show moderate similarity, enabling the identification of key influences in complex control systems. Currently, probability estimates of specific events have not been included, as they are not yet required.

The W statistic should only be used to assess consistency among variables that measure the same general properties of objects or events. For instance, if both positive and negative correlations carry equal meaning within the same dataset, Kendall's W would not be appropriate. In flight safety analyses, such cases may occur when one parameter increases while another decreases within the same time frame, creating clear but opposite correlations.

If W = 0, this means that experts ranked the list of factors entirely differently or at random. Conversely, W = 1 indicates that all experts ranked the list identically, following a predetermined order. However, this leads to a limitation: the method does not assess probabilities or absolute parameter values but only examines the structure of expert responses, specifically whether their rankings coincide. If any participant imitates another's responses, the reliability of the results is compromised.

When resolving expert disagreements or when probability estimates are required, more advanced statistical techniques such as Cohen's Kappa or Spearman's rank correlation coefficient should be applied. These methods, however, were beyond the scope of the present study. For practical applications that require simplicity and rapid evaluation of expert survey results, the approach presented here remains the most suitable and efficient.

Language: English
Page range: 75 - 91
Submitted on: Jul 25, 2025
Accepted on: Oct 13, 2025
Published on: Oct 31, 2025
Published by: Sciendo
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Igors Petuhovs, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.