Wearable devices with inertial measurement units (IMUs) are increasingly used in health care [6, 17, 23], health monitoring [12, 32, 33], and sports [4, 13, 14]. Companies, such as Garmin (1), Fitbit (2), and Suunto (3), sell affordable wearable devices for general sporting activities; however, athletes involved in competitive sports exist at the cutting edge of health care, physiological optimization, and technology. One important sport-science metric used to assess the explosive power in an athlete’s legs is the countermovement jump (CMJ). Measurement of CMJ is important because it involves the detection of key physiological variables such as flight time, peak force, and average loading rate. Because this particular mixture of variables encompasses so many athletic and physiological properties, CMJ (or the variables) can be used to assess the general physical condition and neuromuscular function of an athlete.
Conventionally, logging the variables required to assess a CMJ requires an athlete to perform the jump several times while standing on a force plate. The corresponding force-time data are measured by the force plate, and the variables are computed from those observations. Sport professionals can evaluate the quality of a CMJ by utilizing these variables. The primary shortcoming of this system is the size, weight, and expense of the equipment involved. Laboratory-quality force plate systems range from £30,000 to £70,000, and (somewhat) portable equivalents range from £10,000 to £15,000 [20]. Conversely, IMUs for consumer or industrial use cost from US$10 to US$1,000 [2]. Compared to professional-tier systems, a wearable device that can detect similar variables utilizing IMUs would have the benefit of being considerably cheaper, lighter, and more mobile. Given these considerations, the motivation of this study is to design a wearable, yet economical, CMJ performance variable prediction model, using off-the-shelf IMUs. The performance variables of interest to assess the CMJ include peak force, second peak force, flight time, average loading rate, net impulse, peak power, and rate of force development (RFD). To show that this system can be deployed in a real-world setting, this study also creates a prototype using a wearable IMU that can automatically predict the seven CMJ performance variables, after a test subject has performed a series of CMJs.
As shown in Figure 1A, there are four key CMJ phases: eccentric, concentric, flight, and contact. In the eccentric phase, the jumper is standing on a force plate with their hands on their waist, squatting until their knees are maximally bent. The concentric phase begins when the body has bounced, passing its lowest point, but has not yet left the ground. Once the feet leave contact with the ground, the flight phase begins, and the process concludes with the contact phase as the jumper lands. Conventionally, the related force-time data are recorded by the force plate, as shown in Figure 1B. With that force-time data, the seven aforementioned performance variables related to the jumper’s physical condition can be computed, as shown in Figure 2.

Figure 1. The process and corresponding force-time data of a CMJ. (A) The entire CMJ process. (B) Force-time data corresponding to the CMJ shown in (A). CMJ, countermovement jump.

Figure 2. Seven performance variables used to assess CMJ, derived from the force-time curve and the power-time curve. CMJ, countermovement jump.
To accomplish our goal, the major challenge is how to use the IMU sensing data (e.g., acceleration-time data) to accurately predict the CMJ performance variables calculated from the force-time data measured by a force plate. To address this problem, we tie a wearable device with IMUs around a jumper’s waist and ask her or him to stand on a force plate and perform a CMJ several times. After a series of CMJs is performed, we collect the acceleration-time and force-time records measured by the wearable device and the force plate, respectively. The seven performance variables are then derived from the force-time record of the force plate as the ground truth values. We then extract jump performance-related features from the acceleration-time data measured by the IMUs for training a model to fit the performance variables. Since the seven performance variables we want to estimate are numerical, we model the prediction of each performance variable as an independent regression problem that relies only on the features extracted from the acceleration-time data of the wearable IMUs. That is, for each performance variable we train a regression model whose target is the ground truth value obtained from the force-time data and whose input attributes (independent variables) are the features extracted from the acceleration-time data of the IMUs. Five regression models, namely linear regression (LR), linear support vector regression (LSVR), decision tree regression (DTR), light gradient boosting machine regression (LGBMR), and eXtreme gradient boosting regression (XGBR), are adopted and compared in our experiments. We collect 870 CMJs from 19 jumpers for experiments and compare these regression models in terms of prediction errors. Experimental results demonstrate that LGBMR outperforms the other models in predicting most performance variables.
Furthermore, to demonstrate the practicality of our method, we develop a demonstration system that receives the acceleration-time data from a wearable device with IMUs tied to a jumper’s waist via Bluetooth in real-time. After a series of CMJs is completed by the jumper, the system will immediately estimate the value of each of the seven performance variables, only relying on the acceleration-time data based on the underlying machine learning model (i.e., LGBMR in this paper). With this system, a jumper can easily obtain his/her jump performance variables through wearable IMUs instead of using an expensive and hard-to-carry force plate.
The rest of this paper is organized as follows. Section II reviews the related work. Section III describes the proposed method, while Section IV shows the experimental results. The demonstration system is given in Section V. Finally, Section VI concludes this paper.
The vertical jump is a specific action often utilized in sports science to assess the general physical condition of an athlete. CMJ is one of the most common types of exercise used for general physical assessment. To measure the physical condition of a subject, a researcher first asks the subject to perform a vertical jump (e.g., CMJ) while standing on a force plate. From this, a stream of force-time data is captured and converted into a series of variables (usually 7), which reflect the subject’s jumping performance and the overall physical condition. Swinton et al. [31] described how to apply the force-time curve of a CMJ and squat jump (SJ) to resistance training. They established a connection between resistance training activity, SJ, and CMJ, demonstrating a way of improving training efficiency. Rice et al. [29] examined the power-time and force-time curves for male and female basketball players to determine whether there was a significant difference between the genders. Pupo et al. [28] measured the CMJ and SJ performance of volleyball players and sprint runners, and then discussed the relation between the jump performance variables. They showed that the maximum force and the peak velocity were the main determinants of jump height and power, and that the sprint runners’ jump performance was superior to volleyball players. Kirby et al. [19] discussed the relationship between jump performance variables, including net impulse, peak force, peak power, peak velocity, and jump height. The result showed that relative net vertical impulse has a significant correlation to jump height, and peak force is negatively correlated with jump height. Lake et al. [21] used push press and loaded CMJ to quantify load and its effect on resistance training, by evaluating peak and mean power. McMahon et al. [26] researched senior and academy rugby league players as they performed CMJs. They found that senior players had superior jump height and impulse performance. McLellan et al. [25] discussed how RFD affects vertical jump performance. They studied the relationship between RFD and other vertical jump variables such as peak force and peak power in the eccentric phase. The result showed that RFD has a significant association with some of those variables.
Some studies have already applied machine learning to vertical jump data. Zhou et al. [34] used machine learning methods to predict football players’ CMJ jump height. Their models utilized LR, random forest, and decision-tree regression. The result showed that the Pearson product-moment correlation coefficient of all their models was >0.85. Kipp et al. [18] used a machine learning approach to predict the CMJ performance of track and field athletes after resistance training. Chiang et al. [11] proposed using wearable IMU devices instead of a force plate, to evaluate CMJ performance. They collected CMJ data from the IMU device and the force plate, then converted the acceleration data into features, and finally used the force plate data as the ground truth to train the machine learning models so that they could predict the CMJ performance. The models they utilized were decision-tree regression, LSVR, and LR. The experiment results showed that the machine learning models made good predictions for CMJ peak force, net impulse, and flight time.
Recently, there have been many research projects relating to the cooperation of sensors and machine learning. These research papers cover a wide variety of fields, and most researchers hope to replace expensive instruments or complex measurement systems with simple, low-cost sensors and machine learning. Bauman et al. [5] did a preliminary investigation of predicting time-to-next heel strike with acceleration data and machine learning. They used two accelerometers, and used their x- and y-axis data as the raw data. The acceleration data was first processed through a second-order 10 Hz cutoff Butterworth filter. They used the sliding window to process the acceleration data to obtain its primary features. These features were subsequently fed into artificial neural networks (ANN) and a teacher forcing ANN for training, and a grid search could then be used to find optimal model hyperparameters. Chaaban et al. [7] proposed a predictive method based on one or multiple IMU sensors to predict vertical ground force and knee biomechanics. They tied various numbers of sensors to a test subject’s leg to collect the gyro and acceleration data measured by the IMUs as their model training data. In their experiments, they compared the performance difference between single and various multi-sensor setups. They determined that their setup was a valid way to replace expensive and complex laboratory force plate measurement systems.
Johnson et al. [15] proposed a method of collecting acceleration data via five sensors tied to different physical locations on a test subject’s body, then using deep learning to assess the predictive quality of these data concerning multidimensional ground reaction forces while running, and sidestepping. Lim et al. [24] proposed the sacrum as the key measurement point for single IMU setups. Being near the center of mass (CoM), the sacrum can be used to effectively capture and predict dynamic data (e.g., torque of the knee, ground reaction force of the knee) of the lower limbs, by an ANN. They also showed that the CoM is a dynamic determinant of walking. Through biomechanical analysis, it is possible to approximate the subject’s ground reaction force, and even the segment-angle of their lower limbs, by the weighted sum of CoM dynamics. Acceleration data were converted to velocity and displacement, and this was used to create features that can be used to predict lower limb kinetics and kinematics.
This paper follows a similar system architecture to the one described in references [10, 11]. The architecture predicts each of the seven performance variables of a CMJ and operates in three phases, as shown in Figure 3.
Phase 1 – Acceleration-data preprocessing: Segment the acceleration-time data from the IMU into multiple individual jumps, and then extract important features from each jump.
Phase 2 – Force data preprocessing: Normalize the force-time data from the force plate, and then segment the force-time data into multiple jumps, and calculate each value of the seven performance variables, for each CMJ.
Phase 3 – Machine-learning model training: Select several established and recognizable features from the data features observed in Phase 1. For each of the seven performance variables, train an individual regression model to fit the value of the corresponding performance variable obtained from the force plate data.

Figure 3. Proposed architecture and application workflow for a machine learning model that can predict CMJ performance variables from IMU sensor data. CMJ, countermovement jump; IMU, inertial measurement unit.
The test subject performs several CMJs, as shown in Figure 1A. The raw acceleration-time data from the wearable IMU are logged, and Figure 4A is a plot of that data. Figure 4B shows the same data after a sixth order 30 Hz Butterworth filter has been applied to reduce noise. It is also necessary to subtract gravity (1 g) from the acceleration-time data to bring the y-axis steady-state to 0 g (Figure 4C). Finally, the normalized acceleration-time data can be divided into individual jumps (Figure 4D).

Figure 4. Data preprocessing of CMJ acceleration data. (A) Raw IMU acceleration data. (B) Raw IMU data post-Butterworth filter. (C) Post-Butterworth filter data, after elimination of the gravity component. (D) Acceleration data after jump segmentation. CMJ, countermovement jump; IMU, inertial measurement unit.
Feature extraction from the preprocessed acceleration-time data for each CMJ is possible following the method proposed in references [10, 11]. This study extracts statistical features, time intervals, and critical point values, and then computes averages. The critical values from the raw acceleration-time data look like those presented in Figure 5, where point A represents the peak acceleration value, before the jumper leaves the ground, points B and D are where the acceleration value in the original acceleration-time series is zero, before normalization (i.e., 1 g in the normalized acceleration-time data), and point C is the lowest acceleration value before landing. Point E is the maximum acceleration value during the contact phase, and Point F is the second-highest acceleration value, also occurring in the contact phase, but after Point E. All of the 31 extracted features and their descriptions are listed in Table 1.

Figure 5. Critical point extraction from post-processed IMU acceleration-time data. IMU, inertial measurement unit.
Table 1. The 31 features extracted from the acceleration data
| Feature | Description |
|---|---|
| Mean | The mean of the acceleration data |
| SD | The standard deviation of the acceleration data |
| IQR | The interquartile range of the acceleration data |
| Skewness | The skewness of the acceleration data |
| Kurtosis | The kurtosis of the acceleration data |
| Frequency | The dominant frequency of the acceleration data |
| Entropy | The sample entropy of the acceleration data |
| value_A | The value of point A |
| value_C | The value of point C |
| value_E | The value of point E |
| value_F | The value of point F |
| duration_AC | The time interval from point A to point C |
| duration_BD | The time interval from point B to point D |
| duration_CE | The time interval from point C to point E |
| duration_EF | The time interval from point E to point F |
| slope_AL | The slope from (point A − 20 ms) to point A |
| slope_AR | The slope from point A to (point A + 20 ms) |
| slope_CL | The slope from (point C − 20 ms) to point C |
| slope_CR | The slope from point C to (point C + 20 ms) |
| slope_EL | The slope from (point E − 10 ms) to point E |
| slope_ER | The slope from point E to (point E + 10 ms) |
| slope_FL | The slope from (point F − 10 ms) to point F |
| slope_FR | The slope from point F to (point F + 10 ms) |
| slope_AC | The slope from point A to point C |
| slope_CE | The slope from point C to point E |
| slope_DE | The slope from point D to point E |
| slope_EF | The slope from point E to point F |
| Integration | The integration value before point C |
| average_A | The average value between (point A − 80 ms) and (point A + 20 ms) |
| average_C | The average value between (point C − 20 ms) and (point C + 180 ms) |
| average_E | The average value between (point E − 20 ms) and (point E + 180 ms) |
IQR, Interquartile range.
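A subset of the Table 1 features can be computed along the following lines. This is a hedged sketch only: the function names, the choice of sample standard deviation, SciPy's default (excess) kurtosis, and the slope unit of g per second are assumptions, and the indices of points A and C are passed in by the caller because the rule for locating them is not restated here.

```python
import numpy as np
from scipy.stats import iqr, kurtosis, skew

FS_HZ = 1000  # sampling rate stated in the text


def slope(acc, i_start, i_end, fs_hz=FS_HZ):
    """Slope (g per second) of the straight line joining two samples."""
    return (acc[i_end] - acc[i_start]) * fs_hz / (i_end - i_start)


def extract_features(acc, idx_a, idx_c, fs_hz=FS_HZ):
    """Compute some Table 1 features for one segmented jump.

    idx_a and idx_c are the sample indices of critical points A and C
    (hypothetical inputs; their detection is outside this sketch).
    """
    acc = np.asarray(acc, dtype=float)
    w20 = int(round(0.020 * fs_hz))  # 20 ms expressed in samples
    w80 = int(round(0.080 * fs_hz))  # 80 ms expressed in samples
    return {
        "Mean": acc.mean(),
        "SD": acc.std(ddof=1),
        "IQR": iqr(acc),
        "Skewness": skew(acc),
        "Kurtosis": kurtosis(acc),
        "value_A": acc[idx_a],
        "value_C": acc[idx_c],
        "duration_AC": (idx_c - idx_a) / fs_hz,
        "slope_AL": slope(acc, idx_a - w20, idx_a, fs_hz),
        "slope_AR": slope(acc, idx_a, idx_a + w20, fs_hz),
        "slope_AC": slope(acc, idx_a, idx_c, fs_hz),
        "average_A": acc[idx_a - w80: idx_a + w20 + 1].mean(),
    }
```

The remaining features (those tied to points B, D, E, and F, plus frequency, entropy, and integration) would follow the same pattern once those points are located.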
A ground truth value is required for all CMJ performance variables to train the regression models. This ground truth is acquired by getting the test subject to perform CMJs while simultaneously wearing an IMU-enabled device and standing on a calibrated force plate. It is necessary to acquire the IMU acceleration-time data and the corresponding force-time data from the force plate simultaneously. After a subject performs one or more CMJs, the force-time data can be plotted as shown in Figure 6A. It is intuitive that the heavier a subject is, the greater the force of their CMJ. To make the machine learning model reliable for a variety of users with different body masses, the force-time data have to be normalized to eliminate the body mass effects. Normalization is completed by dividing the raw force data by the subject’s weight, as shown in Figure 6B.

Figure 6. Data preprocessing of CMJ force data. (A) Raw force data. (B) Force data, divided by the subject weight. CMJ, countermovement jump.
Next, it is necessary to compute the seven performance variables to derive ground truth values for training the machine learning models. Six of the performance variables are derived from the normalized force-time data, as shown in Figure 2. These variables include net impulse, rate of force development (RFD), peak force, flight time, average loading rate, and second peak force; only the peak power variable is derived from the power-time data.
Because power-time data are not measured directly, they must be derived from the raw force-time data using Eqs (1)–(3), where t represents time, a(t) represents acceleration at time t, F(t) represents force at time t, m is mass, v(t) is the velocity at time t, and P(t) represents power at time t, measured in watts. Since the subject’s weight affects the power value, the maximal peak power value is divided by the subject’s weight (in kg) to give a value in W/kg. This eliminates the effect of subject weight on peak power, and Figure 7 shows the power-time data before and after division by the subject weight.

Figure 7. Data preprocessing of CMJ power data. (A) Raw power data. (B) Power data, divided by the subject weight. CMJ, countermovement jump.
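One common way to go from force to power is sketched below. It assumes the standard impulse-momentum formulation, in which acceleration is net force divided by mass, velocity is the time integral of acceleration from a stationary start, and power is force times velocity; this may differ in detail from Eqs (1)–(3). The function names and the value of g are illustrative assumptions.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def normalize_force(force_n, mass_kg):
    """Express force in multiples of body weight."""
    return np.asarray(force_n, dtype=float) / (mass_kg * G)


def power_from_force(force_n, mass_kg, fs_hz=1000):
    """Return (velocity in m/s, power in W) from vertical ground reaction force."""
    force_n = np.asarray(force_n, dtype=float)
    # Acceleration of the centre of mass from the net vertical force.
    acc = (force_n - mass_kg * G) / mass_kg
    # Trapezoidal integration, assuming the jumper starts at rest.
    steps = 0.5 * (acc[1:] + acc[:-1]) / fs_hz
    vel = np.concatenate(([0.0], np.cumsum(steps)))
    # Instantaneous power is force times velocity.
    return vel, force_n * vel


def normalized_peak_power(force_n, mass_kg, fs_hz=1000):
    """Peak power divided by body mass, in W/kg."""
    _, power = power_from_force(force_n, mass_kg, fs_hz)
    return power.max() / mass_kg
```

The integration is only meaningful if the recording starts with the subject standing still, since any initial velocity would be carried through the whole trace.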
Before the extracted features are used for training, important features are selected from the 31 features extracted from the acceleration data. Feature selection can simplify a regression model, improve model performance, and reduce the risk of overfitting. The method of feature selection is filter-based ranking [8], chosen for its simplicity and proven performance record across a number of applications. The filter-based ranking approach used is SelectKBest, which is available in the scikit-learn software package [1]. For each CMJ performance variable, every feature is assessed to determine the significance of its relationship with that performance variable. The score function of SelectKBest is set to f_regression, so the significance of the relationship between a feature and a performance variable is given by the F-score. Once the F-score of each of the 31 features is calculated, the features are ranked in decreasing order of F-score. For example, according to Newton’s laws of motion, the absolute value of slope_AC is positively correlated with peak force, and consequently, it is also positively correlated with flight time. In our experiment, the F-scores for slope_AC with respect to both peak force and flight time are high. The higher the score, the stronger the relationship between a feature and a performance variable; thus, following the ranking, the k features with the highest F-scores are selected for training.
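The ranking step can be illustrated with scikit-learn's `SelectKBest` and `f_regression`. The data below are synthetic placeholders, not the study's features: the target is constructed to depend on only two of 31 columns so that the effect of the F-score ranking is visible.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
n_jumps, n_features = 200, 31
X = rng.normal(size=(n_jumps, n_features))  # placeholder feature matrix
# Synthetic target that depends only on features 0 and 5.
y = 3.0 * X[:, 0] - 2.0 * X[:, 5] + 0.1 * rng.normal(size=n_jumps)

# Score every feature against the target and keep the k best.
selector = SelectKBest(score_func=f_regression, k=2).fit(X, y)

ranking = np.argsort(selector.scores_)[::-1]        # decreasing F-score
selected = np.flatnonzero(selector.get_support())   # indices of kept features
X_top_k = selector.transform(X)                     # reduced feature matrix
```

Because `f_regression` scores each feature separately, the ranking reflects univariate linear association only; interactions between features are not captured by this filter.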
The regression models used in this study include LSVR, DTR, LR, and two popular ensemble methods, XGBR and LGBMR. A support vector machine (SVM) is a popular classification method used for machine learning. Unlike deep learning, SVM does not require a large amount of data and offers good performance in many binary classification tasks. LSVR extends the linear SVM to regression. A decision tree (DT) is a conventional and widely used machine learning model, which constructs a tree-like data structure of decision rules and then infers a result from those rules. The advantage of DT is its simplicity and explicability, and DTR is the extension of DT to regression. LR is a basic regression model used in both machine learning and statistics. Both eXtreme gradient boosting (XGB) and light gradient-boosting machine (LGBM) are well-known ensemble methods, and ensemble methods offer superior predictive ability for many tasks [30]. XGB is a powerful, scalable ensemble method that can fit a range of scenarios [9]. It has also been used in many Kaggle competitions. LGBM is a Microsoft ensemble method that benefits from a shorter training time and higher accuracy, compared to XGB [16]. LGBM can speed up the training process of a conventional gradient boosting decision tree (GBDT) by more than 20 times, while achieving almost the same level of accuracy as XGB.
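Training one independent regressor per performance variable can be sketched as follows. To keep the example dependent on scikit-learn alone, only LR, LSVR, and DTR are instantiated; `xgboost.XGBRegressor` and `lightgbm.LGBMRegressor` expose the same `fit`/`predict` interface and could be added to the dictionary in the same way. The target names and helper functions are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import LinearSVR
from sklearn.tree import DecisionTreeRegressor

# Hypothetical column names for the seven ground-truth variables.
TARGETS = ["peak_force", "second_peak_force", "flight_time",
           "average_loading_rate", "net_impulse", "peak_power", "rfd"]


def make_models():
    """Fresh instances of three of the five compared regressors."""
    return {
        "LR": LinearRegression(),
        "LSVR": LinearSVR(max_iter=100_000),
        "DTR": DecisionTreeRegressor(random_state=0),
    }


def fit_per_target(X, Y):
    """Fit every model type independently for each column of Y."""
    fitted = {}
    for j, target in enumerate(TARGETS):
        for name, model in make_models().items():
            # Each (variable, model) pair is its own regression problem.
            fitted[(target, name)] = model.fit(X, Y[:, j])
    return fitted
```

Keeping the seven problems separate means each variable can later use its own feature subset and its own best-performing model.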
This study simultaneously collects acceleration-time and force-time data. The acceleration-time data are collected using a NAXSEN IMU sensor, produced by SIPPLink Technology (4). The size (50 mm × 46 mm × 9 mm) and appearance of the device are shown in Figure 8A. The small dimensions of the device mean that it is easily worn by the test subject, close to their sacrum, as shown in Figure 8B. The force-time data are collected by a 3D portable force plate produced by Kistler (model 9260AA6), and the appearance and dimensions (600 mm × 500 mm × 50 mm) of the device are shown in Figure 8C. The sample rate of both the IMU sensor and force plate is set at 1,000 Hz.

Figure 8. A NAXSEN IMU tied on a jumper and a Kistler 9260AA6 force plate. (A) Appearance of the SIPPLink Technology NAXSEN IMU. (B) Placement of the IMU on the test subject. (C) Kistler 9260AA6 force plate used during this study. IMU, inertial measurement unit.
A total of 19 healthy test subjects aged 20–40 years provided a total of 870 CMJ observations. The IMU is attached to the test subject’s clothing, close to the sacrum, as shown in Figure 8B. This placement is close to the subject’s CoM and consequently captures the body dynamics more effectively. Specifically, it provides a lower error rate than when the IMU is attached to the subject’s ankle, according to Lee et al. [22].
Previous studies [3] report that different types of shoes create a sufficiently large margin of error in the results, which requires compensation, especially affecting the average loading rate and the second peak force when landing. Therefore, each subject performs 10 CMJs barefoot on the force plate, with a short interval between jumps.
The application is configured to capture and log data from the devices at 1,000 Hz, which directly corresponds to the devices’ sampling rates. To obtain a data resolution of 1,000 Hz from the IMU, the device must be directly wired to a computer. In this system, a standard universal serial bus (USB) cable is used; however, it should be noted that using a wired connection would be impractical for real-life applications. The only wireless communications protocol offered by the IMU is Bluetooth, which limits the maximum streaming sample rate to 50 Hz. This is inconsistent with the native sampling rate of the force plate, which is 1,000 Hz. To address this issue, the acceleration-time data are up-sampled from 50 Hz to 1,000 Hz, by both duplication and linear interpolation. The features extracted from the original acceleration-time data (at 1,000 Hz) and the two sets of up-sampled acceleration-time data (now at 1,000 Hz) are used to train the regression models and determine the model fit. Each of the CMJ performance variables is obtained from the force-time data. In total, 80% of the observations are used for model training and 20% are used for testing.
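The two up-sampling strategies can be sketched with NumPy. Going from 50 Hz to 1,000 Hz corresponds to a factor of 20; the function names are illustrative, and holding the last sample constant at the end of the interpolated signal is an assumption made so that both outputs have the same length.

```python
import numpy as np


def upsample_duplicate(x, factor=20):
    """Repeat every sample `factor` times (zero-order hold)."""
    return np.repeat(np.asarray(x, dtype=float), factor)


def upsample_linear(x, factor=20):
    """Linearly interpolate between consecutive low-rate samples."""
    x = np.asarray(x, dtype=float)
    t_low = np.arange(len(x)) * factor   # positions of the original samples
    t_high = np.arange(len(x) * factor)  # positions on the high-rate grid
    # np.interp holds the last value for positions past the final sample.
    return np.interp(t_high, t_low, x)
```

Duplication produces a staircase signal, whereas interpolation gives a piecewise-linear one, so slope-based features computed over short windows can differ noticeably between the two.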
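The metric used to determine model performance is the mean absolute percentage error (MAPE), shown in Eq. (4), where $y_k$ is the actual value of a performance variable, derived from the force-time data; $\hat{y}_k$ is the corresponding value predicted by the regression model; and $n$ is the number of test observations:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{k=1}^{n} \left| \frac{y_k - \hat{y}_k}{y_k} \right| \tag{4}$$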
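As a concrete reference, the standard definition of MAPE can be written in a few lines of NumPy. The function name `mape` is illustrative, and the result is expressed in percent.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Each error is scaled by the magnitude of the true value.
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```

For example, predictions of 110 and 180 against true values of 100 and 200 are each off by 10%, so the MAPE is 10. The metric is undefined when a true value is zero, which is not a concern for strictly positive quantities such as flight time or peak force.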
Table 2. Machine learning parameter settings for the models used in this study
| Model | Parameter | Setting value |
|---|---|---|
| LSVR [11] | epsilon | 0.0 |
| | tol | 0.0001 |
| | C | 1.0 |
| | loss | epsilon_insensitive |
| | intercept_scaling | 1.0 |
| | max_iter | 100,000 |
| DTR [11] | max_depth | None |
| | min_samples_split | 2 |
| | min_samples_leaf | 1 |
| | min_weight_fraction_leaf | 0.0 |
| | max_leaf_nodes | None |
| | min_impurity_decrease | 0.0 |
| XGBR | learning_rate | 0.3 |
| | subsample | 1 |
| | n_estimators | 100 |
| | gamma | 0.0 |
| | max_depth | 6 |
| | min_child_weight | 1 |
| | max_delta_step | 0 |
| LGBMR | num_leaves | 31 |
| | max_depth | −1 (no limit) |
| | subsample | 1 |
| | learning_rate | 0.1 |
| | n_estimators | 100 |
| | min_split_gain | 0.0 |
| | min_child_weight | 0.001 |
| | min_child_samples | 20 |
DTR, decision tree regression; LGBMR, light gradient boosting machine regression; LSVR, linear support vector regression; XGBR, eXtreme gradient boosting regression.
Since using unsuitable features may significantly affect the predictive performance of the selected models, a feature selection task is performed to validate and, where necessary, remedy the model(s). Since there are $2^{31} - 1$ non-empty subsets of the 31 extracted features, it is too time-consuming to exhaustively train different regression models using each of these subsets to obtain the model with the lowest prediction error (e.g., MAPE). Therefore, a feature selection analysis is performed using the scikit-learn SelectKBest tool [1] to reduce the exhaustive training effort. For each of the seven performance variables, k is varied from 1 to 31, to select the k most significant features of the related performance variable. The predictive performance for each variable under MAPE is examined using each of the five regression models, with a different k value. The experimental results of each of the seven performance variables are shown in Figures 9A–9G. Figures 9A–9G show that, in terms of MAPE, the performance of most regression models improves as k increases. This indicates that most of the 31 extracted features are relevant to the prediction of the variables.

Figure 9. Prediction results from varying the number (k) of selected features for each performance variable, under each of the five regression models. (A) Peak force. (B) Second peak force. (C) Flight time. (D) Average loading rate. (E) Net impulse. (F) Peak power. (G) RFD. DTR, decision tree regression; LGBMR, light gradient boosting machine regression; LR, linear regression; LSVR, linear support vector regression; MAPE, mean absolute percentage error; RFD, rate of force development; XGBR, eXtreme gradient boosting regression.
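The sweep over k can be sketched as a loop over scikit-learn pipelines. This is an illustration under stated assumptions: the helper name `sweep_k`, the default model, and the fixed random seed are hypothetical, and fitting the selector on the training split only is a design choice made here to avoid leaking test data into the feature ranking.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def sweep_k(X, y, make_model=LinearRegression, seed=0):
    """Return {k: test MAPE in percent} for k = 1..n_features."""
    # 80/20 split, matching the proportions stated in the text.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    scores = {}
    for k in range(1, X.shape[1] + 1):
        # Selection and regression are refitted together for every k.
        pipe = make_pipeline(SelectKBest(f_regression, k=k), make_model())
        pipe.fit(X_tr, y_tr)
        err = np.abs((y_te - pipe.predict(X_te)) / y_te)
        scores[k] = 100.0 * err.mean()
    return scores
```

Calling `min(scores, key=scores.get)` on the result gives the k with the lowest test MAPE for one model and one performance variable; repeating this for each model and variable yields curves of the kind plotted in Figure 9.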
The experimental results in this section show the comparative success of the five regression models, assessed by MAPE, when estimating the seven CMJ performance variables. The models are tested on three types of acceleration data: (1) the original 1,000 Hz data, (2) the 50 Hz data up-sampled by duplication, and (3) the 50 Hz data up-sampled by interpolation. These results are shown in Tables 3A–3C, respectively. In terms of MAPE, LGBMR is the best of the compared regression models at predicting almost all of the seven performance variables on all three types of datasets. Although LGBMR is not the best model at predicting the average loading rate on the original 1,000 Hz data, its MAPE is only slightly inferior to that of XGBR. Therefore, we still select LGBMR as our final prediction model for its excellent ability to predict almost all of the performance variables of a CMJ. Interestingly, the MAPE for the second peak force is higher than the MAPE for most other variables. After consulting with domain experts, we believe that the errors in the second peak force originate from the contact phase, as different subjects may employ varying landing strategies. For each performance variable of a CMJ, the selected features achieving the best prediction results of LGBMR are listed in Table 4.
Table 3. Comparison of MAPE (%) for the original data and the up-sampled data.

(a) MAPE of different models utilizing the original data (1,000 Hz native).

| Variable | LR [11] | LSVR [11] | DTR [11] | XGBR | LGBMR |
|---|---|---|---|---|---|
| Peak force | 7.97 | 10.84 | 7.54 | 5.50 | **5.00** |
| Second peak force | 24.88 | 28.31 | 23.84 | 19.16 | **18.07** |
| Flight time | 2.59 | 4.63 | 2.85 | 2.09 | **2.04** |
| Average loading rate | 34.32 | 33.54 | 30.68 | **23.50** | 23.59 |
| Net impulse | 2.99 | 4.63 | 3.28 | 2.61 | **2.42** |
| Peak power | 5.73 | 6.86 | 7.21 | 5.60 | **5.43** |
| RFD | 21.12 | 23.16 | 21.73 | 17.50 | **16.56** |

(b) MAPE of different models utilizing up-sampled data (50 Hz to 1,000 Hz, by duplication).

| Variable | LR [11] | LSVR [11] | DTR [11] | XGBR | LGBMR |
|---|---|---|---|---|---|
| Peak force | 8.06 | 11.18 | 7.60 | 5.63 | **5.29** |
| Second peak force | 25.84 | 26.52 | 28.20 | 20.96 | **20.04** |
| Flight time | 3.52 | 4.87 | 3.22 | 2.52 | **2.38** |
| Average loading rate | 34.82 | 33.08 | 36.44 | 26.89 | **26.37** |
| Net impulse | 3.82 | 4.83 | 3.82 | 2.89 | **2.76** |
| Peak power | 6.36 | 7.28 | 7.84 | 6.11 | **5.91** |
| RFD | 22.02 | 23.22 | 23.65 | 17.85 | **17.55** |

(c) MAPE of different models utilizing up-sampled data (50 Hz to 1,000 Hz, by interpolation).

| Variable | LR [11] | LSVR [11] | DTR [11] | XGBR | LGBMR |
|---|---|---|---|---|---|
| Peak force | 7.50 | 10.27 | 7.32 | 5.59 | **5.18** |
| Second peak force | 24.69 | 26.43 | 29.32 | 21.04 | **19.89** |
| Flight time | 3.25 | 4.80 | 3.31 | 2.40 | **2.28** |
| Average loading rate | 33.80 | 31.72 | 34.35 | 26.38 | **24.99** |
| Net impulse | 3.58 | 4.47 | 3.71 | 2.94 | **2.60** |
| Peak power | 6.13 | 6.67 | 7.25 | 5.83 | **5.76** |
| RFD | 21.84 | 22.94 | 23.70 | 18.00 | **17.57** |

The bold values indicate the smallest MAPE among the five models.
DTR, decision tree regression; LGBMR, light gradient boosting machine regression; LR, linear regression; LSVR, linear support vector regression; MAPE, mean absolute percentage error; RFD, rate of force development; XGBR, eXtreme gradient boosting regression.
So far, in this study, LGBMR has proven to be the superior machine learning-based method for predicting flight time. However, there are also conventional ways of using acceleration data to estimate vertical jump variables. The LGBMR model is therefore compared with the integration-based method proposed in reference [27]. As shown in Table 5, in terms of MAPE, LGBMR outperforms the integration-based method when predicting CMJ flight time.
Table 4. Data features used to predict values for the seven CMJ performance variables
| Feature | Peak force | Second peak force | Flight time | Average loading rate | Net impulse | Peak power | RFD |
|---|---|---|---|---|---|---|---|
| Mean | V | V | V | V | V | V | |
| SD | V | V | V | V | V | V | |
| IQR | V | V | V | V | V | V | V |
| Skewness | V | V | V | V | V | V | V |
| Kurtosis | V | V | V | V | |||
| Frequency | V | V | V | V | V | ||
| Entropy | V | V | V | V | V | V | |
| value_A | V | V | V | V | V | V | V |
| value_C | V | V | V | V | V | V | |
| value_E | V | V | V | V | V | V | V |
| value_F | V | V | V | V | V | V | |
| duration_AC | V | V | V | V | V | V | |
| duration_BD | V | V | V | V | V | V | V |
| duration_CE | V | V | V | V | V | V | V |
| duration_EF | V | V | V | V | V | V | |
| slope_AL | V | V | V | V | V | V | V |
| slope_AR | V | V | V | V | V | V | V |
| slope_CL | V | V | V | V | V | V | V |
| slope_CR | V | V | V | V | |||
| slope_EL | V | V | V | V | V | ||
| slope_ER | V | V | V | V | V | V | V |
| slope_FL | V | V | V | V | V | V | V |
| slope_FR | V | V | V | V | |||
| slope_AC | V | V | V | V | V | V | |
| slope_CE | V | V | V | V | V | V | |
| slope_DE | V | V | V | V | V | ||
| slope_EF | V | V | V | V | V | V | |
| Integration | V | V | V | V | V | V | V |
| average_A | V | V | V | V | V | V | V |
| average_C | V | V | V | V | V | V | V |
| average_E | V | V | V | V | V | V | V |
CMJ, countermovement jump; RFD, rate of force development; SD, standard deviation.
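As an illustration of the statistical features listed in the table above (mean, SD, IQR, skewness, and kurtosis), the following minimal sketch computes them for one window of acceleration samples. The function name `window_features` is hypothetical, and the specific estimators (population SD, moment-based skewness, and excess kurtosis) are assumptions, since the exact definitions used in the study are not stated here. The landmark-based features (such as value_A or slope_CE) depend on points defined elsewhere in the paper and are not covered.

```python
import statistics


def window_features(acc):
    """Basic statistical features of one acceleration window (hypothetical helper)."""
    n = len(acc)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = statistics.fmean(acc)
    sd = statistics.pstdev(acc)  # population SD (assumed estimator)
    q1, _, q3 = statistics.quantiles(acc, n=4)  # quartile cut points
    if sd == 0:
        skew = kurt = 0.0  # constant signal: higher moments set to zero by convention
    else:
        skew = sum(((x - mean) / sd) ** 3 for x in acc) / n
        kurt = sum(((x - mean) / sd) ** 4 for x in acc) / n - 3.0  # excess kurtosis
    return {"mean": mean, "sd": sd, "iqr": q3 - q1, "skewness": skew, "kurtosis": kurt}


# Hypothetical samples, for illustration only.
print(window_features([1.0, 2.0, 3.0, 4.0, 5.0]))
```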
Relative predictive performance of the CMJ flight time variable, measured using MAPE
| | Integration-based method | LGBMR |
|---|---|---|
| CMJ | 11.75 | **2.04** |
The bold value indicates the smaller MAPE of the two methods.
CMJ, countermovement jump; LGBMR, light gradient boosting machine regression; MAPE, mean absolute percentage error.
The predictive ability of the system is demonstrated with a test subject wearing a commercially available IMU sensor connected to a computer via Bluetooth. The sensor is a Rabboni (5) IMU produced by SIPPLink Technology. Its dimensions (44 mm × 44 mm × 15 mm) and appearance are shown in Figure 10A. As described previously, the device is affixed to the test subject’s sacrum area, as shown in Figure 10B. Over Bluetooth, the IMU samples at 50 Hz, so the data are up-sampled to 1,000 Hz by interpolation, in line with the experimental results in Section IV.

Figure 10. Rabboni IMU attached to a jumper. (A) Commercially available Rabboni IMU device. (B) Placement of the device on a test subject. IMU, inertial measurement unit.
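The up-sampling step described above can be sketched as follows. This is a minimal example that assumes linear interpolation between evenly spaced samples; the text states only that interpolation is used, so the interpolation type, the function name `upsample_linear`, and the sample values are assumptions.

```python
def upsample_linear(samples, src_hz=50, dst_hz=1000):
    """Linearly interpolate evenly spaced samples from src_hz to dst_hz."""
    if dst_hz % src_hz != 0:
        raise ValueError("this sketch assumes an integer up-sampling factor")
    if len(samples) < 2:
        return list(samples)
    factor = dst_hz // src_hz  # 20 new intervals per original interval for 50 -> 1,000 Hz
    out = []
    for left, right in zip(samples, samples[1:]):
        step = (right - left) / factor
        # Emit the left sample and the interpolated points up to (not including) the right one.
        out.extend(left + step * k for k in range(factor))
    out.append(samples[-1])  # close the final interval
    return out


# Hypothetical vertical acceleration values, for illustration only.
acc_50hz = [0.0, 1.0, 0.5]
acc_1000hz = upsample_linear(acc_50hz)
print(len(acc_1000hz))
```

Interpolation does not add information beyond what the 50 Hz signal contains; it only places the signal on the same time grid as the 1,000 Hz data.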
The system architecture and application workflow of the demonstration system are shown in Figure 11. First, a file name is created under which the acceleration data are saved. The computer then connects to the IMU sensor via Bluetooth. Once the connection succeeds, the application displays real-time acceleration-time data from the IMU (Figure 12, bottom), and the test subject can perform a jump. The system determines whether the jump is a CMJ; if it is, the system applies LGBMR to predict each of the seven CMJ performance variables and displays the predicted values (Figure 12, top). Note that in all tests the IMU is positioned close to the sacrum, following references [3, 22], and the test subjects are barefoot. If the IMU is placed elsewhere, or if the test subjects wear shoes, the algorithm may need to be redesigned to limit the prediction error.

Figure 11. System architecture and application workflow. CMJ, countermovement jump.

Figure 12. Prototype system predicting CMJ variable performance in a live test. CMJ, countermovement jump.
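The prediction stage of the workflow described above can be sketched as follows. This is only an outline under stated assumptions: the names `predict_cmj_variables`, `is_cmj`, `extract_features`, and `models` are hypothetical, one regressor per performance variable is assumed, and the stand-in callables below replace the jump classifier, the feature extraction, and the trained LGBMR models, none of which are reproduced here.

```python
CMJ_VARIABLES = (
    "peak_force", "second_peak_force", "flight_time", "average_loading_rate",
    "net_impulse", "peak_power", "rfd",
)


def predict_cmj_variables(acc, is_cmj, extract_features, models):
    """Return {variable: prediction} for a CMJ, or None for any other movement."""
    if not is_cmj(acc):
        return None  # not a CMJ: nothing is predicted or displayed
    features = extract_features(acc)
    # One regressor per performance variable (an assumption of this sketch).
    return {name: models[name](features) for name in CMJ_VARIABLES}


# Stand-ins for illustration only; a real system would plug in trained models.
dummy_models = {name: (lambda feats: 0.0) for name in CMJ_VARIABLES}
result = predict_cmj_variables(
    [0.0, 0.1, 0.2],
    is_cmj=lambda acc: True,
    extract_features=lambda acc: {"mean": sum(acc) / len(acc)},
    models=dummy_models,
)
print(sorted(result))
```

Passing the classifier, feature extractor, and models as arguments keeps the sketch independent of any particular library.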
This paper demonstrates a method of estimating each of the seven performance variables used to assess a CMJ, using an economical, wearable IMU device combined with a machine learning system. For the everyday consumer, this is a viable and (in many cases) preferable alternative to expensive and cumbersome laboratory equipment and force plates. However, the system demonstrated here has certain limitations.
The experiments show that four of the seven performance variables (peak force, flight time, net impulse, and peak power) are predicted reliably from IMU data with LGBMR, with MAPE values below 6 on up-sampled data; second peak force, average loading rate, and RFD are predicted less accurately, with MAPE values between roughly 17 and 25. Moreover, this study develops a system that simplifies the measurement process and is easy to use for persons involved in sports-related sciences. Most importantly, the results indicate that key performance variables of a CMJ can be predicted accurately by lightweight, wearable devices when an appropriate machine learning model is used, in this case, LGBMR.
Despite the progress made in this study, there is considerable scope for further research. For example, rather than relying on up-sampling the acceleration-time observations, further informative features could likely be extracted from the available data, although identifying such patterns and features will take time.
Furthermore, obtaining a larger sample of test subjects would be beneficial, as this would provide a more comprehensive evaluation of the system across a broader range of body weights, fitness levels, and demographic groups. In particular, it will be necessary to collaborate with athletes and coaches to assess key aspects, such as latency, accuracy, usability, and overall applicability in real-world settings. Such collaboration is also essential to deepen our understanding of the precise relationships among the variables investigated in this study.