Over the last few decades, computer science has become a significant multidisciplinary support for enhancing sports (Ráthonyi et al., 2018). Virtual reality (VR) applications used in sports are an example of how new technological methods are applied to the sporting world. These technologies support top-level sports and athletes; e-learning features in the training process and biomechanical analyses with VR modelling and simulations have become possible (Link and Lames, 2014).
There are limitless possibilities for collaboration between computer science and sports. Opportunities exist for sports researchers in a variety of technological domains, including data processing and software development for training, sensor control, and data visualisation (Link and Lames, 2014; Ráthonyi et al., 2018). Video assistant refereeing (VAR) is a proven real-world example of this collaboration. The goal of this camera-based system is to enhance referee decision-making and reduce unjust disparities in sports (Dufner et al., 2023; Spitz et al., 2021).
Motion capture systems, a movement-recording technology, enable athlete evaluation techniques to be performed (Rybnikár et al., 2022). These technologies make it possible to gather enormous volumes of data that facilitate sophisticated statistical and predictive analyses and help coaches make timely decisions (Southgate et al., 2016). Their applications include injury risk mitigation (Patton et al., 2020) and performance enhancement (Macadam et al., 2019). Human motion capture measurement systems fall into five branches depending on their working principles: the optoelectronic measurement system (OMS), inertial measurement unit (IMU), image processing system (IMS), electromagnetic measurement system (EMS), and ultrasonic localisation system (UMS) (Van der Kruk and Reijne, 2018). At this point, it is worth mentioning the differences between 2D cameras (used in OMSs) and depth cameras (used in OMSs and IMSs). In the literature, the OMS is regarded as the gold standard for motion capture (Corazza et al., 2010). This system uses light detection to estimate the 3D position of a marker attached to a human body. OMSs utilise two types of markers: passive and active. Active markers emit light, whereas passive markers merely reflect light back to the cameras. Both marker types have their own advantages and disadvantages. Active markers are more robust than passive markers but require power cables, which can disrupt athlete movements (Stancic et al., 2013). The biggest drawback of passive markers is that bright sunlight can interfere with the measurement (Spörri et al., 2016). Marker occlusion can occur with both types. Beyond a flat image, depth cameras can capture the distance between the camera and an object in their field of view. Currently, data from depth cameras are still less accurate than those from OMSs such as Vicon or OptiTrack (Van der Kruk and Reijne, 2018).
Depth-sensing methods have problems detecting dark surfaces (light absorption), shiny surfaces (light reflection), and rough surfaces (unfavourable incoming light angles) (Dutta, 2012). The advantage of depth cameras is that they capture human kinematics without attaching markers to the athlete's body; the system captures human motion directly from the body geometry. Markerless systems are also less time-consuming and non-intrusive (no physical contact with the athlete's body is needed), and they reduce errors related to skin motion (Scataglini et al., 2024).
The unique technology of each type of depth camera can support markerless kinematic analysis. Time-of-flight (ToF) cameras emit infrared light pulses and then apply image-processing algorithms to interpret the return times and produce a depth map (Langmann et al., 2012). One application of ToF cameras is, for example, real-time performance assessment of taekwondo athletes (Malawski, 2020). Cameras that use ToF include the Microsoft Kinect, Orbbec Femto, and Intel RealSense. Structured light cameras project a known pattern of infrared light onto a scene. The optical component performs the projection, whereas the electronic component processes the pattern deformation to determine the depth (Langmann et al., 2012). Such a camera can determine the accuracy of free throws in basketball or track 3D joint positions during manoeuvres (Sri-Iesaranusorn et al., 2021).
This technology is used by the first-generation Microsoft Kinect, the Intel RealSense SR300, and the Orbbec Astra (the Kinect v2 for Xbox One uses ToF instead). Stereovision cameras rely on two or more lenses to capture images from slightly different perspectives. The optical component captures the images through the lenses, while the electronic component triangulates corresponding points in the two images to compute the depth (Kytö et al., 2011). Examples of stereovision cameras are the Intel RealSense D415 and D435 (the first is ideal for static capture and the second for dynamic motion). In addition, two or three Microsoft Azure Kinect devices can be combined for better measurements and to avoid occlusion. ToF, structured light, and stereovision cameras are the primary types used for markerless kinematic analysis, each with different strengths depending on the environmental conditions and the specifics of the motion analysis task.
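The two depth principles described above can be illustrated with a toy sketch. This is a simplified illustration only: real ToF cameras measure the phase shift of modulated infrared light, real stereo cameras operate on calibrated, rectified image pairs, and all numeric values below are hypothetical rather than specifications of any listed device.

```python
# Toy versions of the time-of-flight and stereovision depth principles.
# All values are illustrative, not device specifications.
C = 299_792_458.0  # speed of light, m/s

def tof_depth(round_trip_time_s: float) -> float:
    """Time of flight: depth is half the pulse's round trip."""
    return C * round_trip_time_s / 2.0

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Stereovision: depth Z = f * B / d for a rectified pair."""
    return focal_px * baseline_m / disparity_px

# A ~13.3 ns round trip corresponds to about 2 m of depth.
print(round(tof_depth(13.34e-9), 2))  # -> 2.0
# A 33 px disparity with a 600 px focal length and a 55 mm baseline
# corresponds to about 1 m of depth.
print(round(stereo_depth(600.0, 0.055, 33.0), 2))  # -> 1.0
```

The two formulas make the hardware trade-offs tangible: ToF accuracy depends on timing resolution, whereas stereo accuracy degrades with distance as disparity shrinks.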
Considering that this technology has developed exponentially in recent years and that up-to-date information is essential for envisioning its prospects, the novel aim of this study is to provide an overview of research published in the last decade on depth camera technology in sports biomechanics, reflect on past interventions, and envision the path and implications of these innovative devices in the world of physical activity and sports. Human motion analysis is one of the most accurate and best-validated methods available today for assessing performance in professional sports (Pueo and Jimenez-Olmedo, 2017).
This systematic review was created following the Preferred Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist (Page et al., 2021).
The search elements used in this review were based on the population, intervention, outcome, and time (PIOT) framework, a modified version of the PICOT model (Amir-Behghadami and Janati, 2020). In this framework, (1) the population is depth camera technology in the context of sports biomechanics; (2) the intervention is the application of depth cameras in sports biomechanics; (3) the outcomes are the utilisation and advancements of depth camera technology in sports biomechanics; and (4) the time scope covers papers published between 2014 and 2024. The main aim of this systematic review was to include only articles related to a sports discipline, not merely simple movements such as moving the legs, raising the arms, or hand gestures. Eligible articles had to be retrieved by the search strategy and be original papers written in English, Polish, Portuguese, or Spanish.
On 4 January 2024, three electronic databases (Web of Science, Scopus, and PubMed) were searched for articles regarding the use of depth cameras in sports biomechanics. Eligible papers were primary sources from peer-reviewed scientific journals published in the ten years preceding 4 January 2024. Keywords were selected based on the PIOT framework and the purpose of this systematic review, which was to collect all articles that used depth camera technology to assess biomechanical parameters in sports. Therefore, the four most relevant keywords (depth camera technology, biomechanics in sports, motion capture, and application in sports) were selected. Related search terms were identified for each keyword to make the search as accurate and comprehensive as possible and to avoid missing articles. Because the core of the search was the use of depth camera technology, this group of terms was connected to all the others with the Boolean operator 'AND', while the remaining keyword groups and their related terms were connected with 'OR'. The operator 'NOT' was not used. Table 1 summarises the search terms used in Web of Science, Scopus, and PubMed, combined with the Boolean operators 'OR' and 'AND'.
Key and related search terms used as the search strategy
| Key Search Terms | Related Search Terms |
|---|---|
| depth camera technology | depth camera$ OR 3D camera$ OR depth sensor$ OR time-of-flight camera$ OR range camera$ OR stereo camera$ OR structured light camera$ OR depth-sensing camera$ OR RGB-D camera$ OR infrared camera$ |
| AND | |
| biomechanics in sport | sport$ biomechanics OR biomechanics of sport OR exercise$ biomechanics OR physical activity* analysis OR sport movement$ analysis OR sport$ motion analysis OR athletic performance$ analysis |
| OR | |
| motion capture | motion capture OR motion tracking OR movement$ capture OR kinematic$ tracking OR kinematic$ analysis OR motion$ analysis OR motion sensing OR movement$ tracking |
| OR | |
| application in sport | application$ in sport OR sports technology application$ OR practical use in sport$ OR implementation$ in athletics OR sports-related application$ OR applied sports science OR practical application in sport$ OR integration in athletic training OR “sport-specific application” |
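The Boolean structure of Table 1 can be sketched programmatically. The term lists below are abbreviated for brevity (the full lists are in the table), and "$" is the databases' truncation wildcard, as in the table.

```python
# Sketch of how the Boolean search string in Table 1 is assembled.
# Term lists are abbreviated; "$" is the truncation wildcard.
depth_terms = [
    "depth camera$", "3D camera$", "depth sensor$", "time-of-flight camera$",
]
topic_groups = [
    ["sport$ biomechanics", "biomechanics of sport"],             # biomechanics in sport
    ["motion capture", "motion tracking"],                        # motion capture
    ["application$ in sport", "sports technology application$"],  # application in sport
]

# Related terms within a group, and the topic groups themselves, are
# joined with OR; the depth camera block is joined to the rest with AND.
depth_block = "(" + " OR ".join(depth_terms) + ")"
topic_block = "(" + " OR ".join(" OR ".join(group) for group in topic_groups) + ")"
query = depth_block + " AND " + topic_block
print(query)
```

The same structure maps directly onto the advanced-search syntax of Web of Science, Scopus, and PubMed.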
After the studies were located, the records were imported into EndNote X20, a reference management application. Once duplicates were eliminated from the search results, two of the authors (K.M. and F.M.) independently evaluated the titles and abstracts for relevance. The same authors then thoroughly examined all eligible records to find the studies that met the inclusion criteria. A third, more experienced author (K.P.) was consulted whenever there were questions about inclusion or exclusion judgments.
K.M. and F.M., two of the authors, extracted and harmonised the data using a standardised methodology agreed upon by consensus, consisting of seven criteria: (1) study, (2) year of publication, (3) purpose, (4) sport, (5) depth camera technology, (6) what was measured, and (7) application.
The COSMIN (Consensus-based Standards for the selection of health Measurement Instruments) risk-of-bias checklist for PROMs was used to assess study quality (Terwee et al., 2018). This systematic review focused mainly on depth camera technology used in sports biomechanics. Two authors working independently (K.M. and F.M.) evaluated the 14 included articles using the COSMIN risk-of-bias checklist to determine their quality, relying on the COSMIN user manual (February 2018 version, available on the COSMIN website (https://pubmed.ncbi.nlm.nih.gov/29435801/)). Because this review focused on performance-based outcomes, the researchers used the COSMIN risk-of-bias online spreadsheet and the 2018 user manual to assess eight measurement property categories in the 14 selected articles. A four-point rating scale ('very good', 'adequate', 'doubtful', or 'inadequate') was applied by the two independent reviewers (K.M. and F.M.). Only the measurement properties addressed in each article were included in the risk-of-bias assessment; if a property was not evaluated, it was marked 'not applicable' (N). According to the COSMIN guidelines, the lowest score received by an article determined its overall quality. Further details on the risk of bias are provided in Appendix 1.
Next, using the subsequent table in the COSMIN spreadsheet, the analysis of each measurement property in the 14 articles was collated and compared against the standards for 'good measurement properties'. According to the definitions in the user manual, each property with data was rated as 'sufficient' (+), 'insufficient' (−), or 'indeterminate' (?). Appendix 2 describes the scoring method. Lastly, the authors used four COSMIN categories to evaluate the overall quality of the evidence: risk of bias, inconsistency, imprecision, and indirectness. Within each category, points were subtracted from a default 'high' quality rating based on the most recent ratings. A more thorough description of this procedure and the specific definitions employed can be found in the COSMIN 1.0 user manual. The overall scores for each measurement property and the quality-of-evidence ratings are summarised in Table 2. When the two reviewers (K.M. and F.M.) disagreed on the scoring of any category, a third independent reviewer (K.P.) was consulted to make the final decision.
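COSMIN's rule that the lowest score determines an article's overall quality (the "worst score counts" principle) can be sketched as follows; the example ratings are hypothetical, not those of any reviewed article.

```python
# Sketch of COSMIN's "worst score counts" rule: the lowest rating among
# an article's assessed measurement properties determines its overall
# quality. 'N' (not applicable) entries are ignored.
ORDER = ["very good", "adequate", "doubtful", "inadequate"]  # best to worst

def overall_quality(property_ratings: dict) -> str:
    rated = [r for r in property_ratings.values() if r != "N"]
    return max(rated, key=ORDER.index)  # the worst rating wins

# Hypothetical example article:
example = {
    "reliability": "very good",
    "measurement error": "adequate",
    "criterion validity": "doubtful",
    "structural validity": "N",  # not evaluated in this article
}
print(overall_quality(example))  # -> doubtful
```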
Summary of COSMIN quality of evidence scoring consensus
| Parameter | Quality Rating¹: Rater 1 | Quality Rating¹: Rater 2 | Quality Rating¹: Cons.* | Quality of Evidence²: Rater 1 | Quality of Evidence²: Rater 2 | Quality of Evidence²: Cons.* |
|---|---|---|---|---|---|---|
| Structural validity | N | N | | | | |
| Internal consistency | N | N | | | | |
| Cross-cultural validity | N | N | | | | |
| Measurement invariance | N | N | | | | |
| Reliability | + | + | + | Moderate | Low | Low |
| Measurement error | + | + | + | Moderate | Moderate | Moderate |
| Criterion validity | ? | + | + | Very Low | Very Low | Very Low |
| Construct validity | + | + | + | Moderate | Low | Moderate |
| Responsiveness | + | ? | + | Moderate | Moderate | Moderate |
\* Cons. = consensus; ¹ + = sufficient, − = insufficient, ? = indeterminate, N = not applicable; ² quality of evidence rated as high, moderate, low, or very low.
The flowchart in Figure 1 illustrates the study selection process. Through searches in Web of Science, Scopus, and PubMed, 2740 articles were initially identified. There were 1815 articles left for further analysis after removing 925 duplicates. Subsequently, articles were screened based on their titles and abstracts, resulting in the elimination of 1742 articles. Finally, 73 articles were subjected to a full-text assessment, of which 14 were deemed relevant for inclusion. The exclusion criteria were based on various factors, including the type of technology used not being a depth camera (17), exercise or movement not related to sports (14), recreational activities (8), gait analysis (6), different languages (6), patents (3), reviews (2), technical aspects of motion capture technology (1), inaccessibility (1), and not an article (1).
After the selection phase, the 14 included articles were subjected to a more thorough analysis, which is presented below.
The 14 articles included in the systematic review were assessed in terms of quality rating and quality of evidence (Table 2).

Flowchart of the study selection process based on the PRISMA guidelines
The analysis showed versatile applications of depth cameras for assessment and training support in sports. This technology observes kinematic parameters, analyses techniques, and assesses the effectiveness of movements in sports disciplines. Table 3 presents the purposes of the research, sports that used the technology, types of depth cameras, parameters measured, main applications, and methodological evaluation of each study.
Summary of the studies
| Study | Year Published | Purpose | Sport | Depth Camera | What Was Measured | Application | COSMIN Grade |
|---|---|---|---|---|---|---|---|
| An et al., 2017 | 2017 | To obtain a numerical estimate of the feasibility of a dual depth camera system | Running | Two Microsoft Kinect sensors | The left hip and knee analysis included motion (flexion, extension), velocity, acceleration, angular velocity, linear and angular kinematics (displacement), and the centre of mass (COM). | The dual Kinect system allows better tracking of an athlete’s movements due to a better viewing angle. | I |
| Babayan et al., 2015 | 2015 | To introduce an innovative, cost-effective apparatus utilising electronic and electrical components for simulated swimming exercises | Swimming | Microsoft Kinect | The skeletal stream was captured from the depth sensor. The data were used to obtain the X, Y, and Z position coordinates of the joint data points. | Users could improve their swimming abilities and get exercise with this system. | I |
| Bittar et al., 2017 | 2017 | To develop an augmented basketball hoop by integrating sensors, actuators, and computers | Basketball | Microsoft Kinect | Knee flexion, spine tilt during free throws. Detection of throws. The overall score of throws was calculated with the help of Arduino. | Quality assessment of the free throws with real-time feedback | D |
| Chang et al., 2022 | 2022 | To improve Kinect data quality using deep neural networks and achieve precise motion capture | Swimming | Microsoft Kinect | The swimming postures were tracked and detected. The deep-intercepted images were processed to capture postures and movements effectively and to obtain improved outcomes. | Recognising and fixing the bad postures of swimmers in real-time, enhancing training, and assisting coaches during training | I |
| Cunha et al., 2021 | 2021 | To showcase research aimed at designing and implementing a user-friendly, cheap system for real-time performance assessment | Taekwondo | 3D Orbbec Astra Camera | The taekwondo movements of the athlete were measured and collected. | Feedback could refine or enhance an athlete’s techniques, leading to improved performance in less time. Advancing training methods and technological progress during practice | D |
| Cunha et al., 2023 | 2023 | To create a user-friendly, low-cost system and design a technology tool to assess taekwondo athletes’ performance in real time | Taekwondo | 3D Orbbec Astra Camera | Taekwondo-specific movements like Ap Tachagui, Miro Tachagui, and Jirugui were measured. | Evaluating athlete performance through movement analysis during practice | D |
| Emad et al., 2020 | 2020 | To evaluate and improve seven movements of Karate Kata 1 (Heian Shodan) | Karate | Microsoft Kinect sensor | Seven movements from Heian Shodan were recorded (perfect trials and with typical mistakes). | Detection and evaluation of 7 movements in karate kata. In future work, deep learning methods could be used to achieve better results. | I |
| Hwang et al., 2021 | 2021 | To replicate actual motion by gathering subject motion data | Golf | Microsoft Kinect sensor | Measuring and entering data about human bodies to create a human body model that reflected individual body characteristics | A system that uses data from 15 inertial sensors and data from a depth camera to create a motion identical to the real one | I |
| Kaharuddin et al., 2017 | 2017 | To perform a biomechanics analysis on multiple people executing the ‘Jurus Satu’ manoeuvre using the Kinect device | Silat (combat sport) | Kinect sensor | Identification of twenty bodily joints and data extraction during the execution of ‘Juru Satu’ by experienced and inexperienced athletes. | Use of a motion capture system to learn a relatively unknown movement and to provide data on joint kinematic parameters | I |
| Malawski, 2020 | 2020 | To identify lunge actions (footwork training in fencing) and then extract the qualitative characteristics of those actions | Fencing | Microsoft Kinect sensor, Two Point Grey | Real-time analysis of footwork in fencing was considered. Fencers stood and moved sideways with their hand facing their opponent. | Tracking lunges in continuous training (hand timing, duration, and acceleration) and providing real-time feedback | I |
| Nam et al., 2013 | 2013 | To demonstrate the feasibility of golf swing motion tracking using two depth sensors | Golf | FMVU-03MTM-CS cameras | Tracking the movement of the golf club | The system enables the analysis of golf shots using inertial sensors and visual data from a depth camera. | D |
| Pandurevic et al., 2019 | 2019 | To assess movements in sport climbing, understanding technique and performance. This could potentially result in training injury prevention. | Sport climbing | Intel RealSense D435 | Registration and analysis of motion patterns. The possibility to automatically determine joint angles | The data from force sensors and motion capture systems will allow the improvement of technique, performance, and injury prevention. | I |
| Reily et al., 2017 | 2017 | To employ the Microsoft Kinect 2 for the automated evaluation of men’s gymnastics performance on the pommel horse and to validate its efficacy as a training tool for gymnastics coaches | Gymnastics | Microsoft Kinect 2 | The gymnasts’ position and movements during routines were tracked. Analysis of the spin time consistency and an evaluation of the body angle consistency | Automatic performance assessment in men’s pommel horse gymnastics has been confirmed as an effective coaching tool. | I |
| Sri-Iesaranusorn et al., 2021 | 2021 | To determine the factors that help novice players improve their strokes (differences in stroke consistency between players at various skill levels) | Table tennis | Infrared (IR) depth sensor | The participants executed ten strokes from each side (forehand and backhand). Three joints (the wrist, the elbow, and the shoulder) and the racket centre were monitored by the depth camera. | Video tracking and accelerometers were used to evaluate stroke consistency in table tennis (monitoring and enhancing beginners' skills by comparing them to standard players). | I |
Among the 14 articles, depth cameras were applied to individual sports, including running (An et al., 2017), swimming (Babayan et al., 2015; Chang et al., 2022), taekwondo (Cunha et al., 2021, 2023), karate (Emad et al., 2020), golf (Hwang et al., 2021; Nam et al., 2013), silat (Kaharuddin et al., 2017), fencing (Malawski, 2020), sport climbing (Pandurevic et al., 2019), gymnastics (Reily et al., 2017), and table tennis (Sri-Iesaranusorn et al., 2021). In addition, a team sport (basketball) has utilised this technology for capturing movement (Bittar et al., 2017).
The most popular depth camera technology in all articles was Microsoft Kinect, which was used in nine studies (An et al., 2017; Babayan et al., 2015; Bittar et al., 2017; Chang et al., 2022; Emad et al., 2020; Hwang et al., 2021; Kaharuddin et al., 2017; Malawski, 2020; Reily et al., 2017). Two articles used this technology to track swimmers’ movements and postures and identify incorrect techniques (Babayan et al., 2015; Chang et al., 2022). In basketball, the Microsoft Kinect sensor analyses the position of the spine and knees during free throws in real time (Bittar et al., 2017). In karate, depth cameras have been employed to detect and evaluate seven movements from kata (Emad et al., 2020), and they have been applied in silat to identify and assess body joint movements during specific movements (Kaharuddin et al., 2017). The Microsoft Kinect sensor has also been utilised in fencing to track footwork in real time (Malawski, 2020) and in golf, in combination with IMUs, for movement model creation (Hwang et al., 2021). The Microsoft Kinect 2 sensor tracked gymnasts’ positions, analysed their spin time, and evaluated body angle consistency (Reily et al., 2017). Unfortunately, not all the authors mentioned a specific model of the Kinect device. An et al. (2017) measured angles in the knee and hip joints, angular and linear kinematics, and the centre of mass during running using two Microsoft Kinect sensors. Another dual-depth camera system has been used in golf (Nam et al., 2013). In this study, two Point Grey cameras tracked the golf club. This was the only study on this specific technology. The third depth camera technology is Intel RealSense (Pandurevic et al., 2019). This device registered and analysed the movement patterns during sport climbing. In two other studies (Cunha et al., 2021, 2023), a 3D Orbbec Astra camera was used for taekwondo. The first study (Cunha et al., 2021) measured and collected movements in taekwondo. 
The second one (Cunha et al., 2023) focused on specific techniques such as Ap Tachagui, Miro Tachagui, and Jirugui. Table tennis players were recorded executing forehand and backhand strokes using an infrared depth sensor (Sri-Iesaranusorn et al., 2021). The authors did not mention more specific details about the device that captured the body joints (wrist, elbow, and shoulder) and the middle of the racket.
Table 3 shows that depth camera technology has been widely implemented to enhance performance analysis, real-time feedback, and the accuracy of movement monitoring. In the study by An et al. (2017), the dual-sensor setup improved the tracking of athlete movements, offering a more precise and multidimensional view that could be used to refine technique and optimise training. Babayan et al. (2015) used a Kinect sensor in an innovative system designed for simulated exercises, collecting data on joint coordinates in three dimensions. This application highlights the accessibility of motion analysis tools, providing an approach that allows practical improvement of technique through skeletal mapping and real-time positioning feedback. Bittar et al. (2017) introduced an augmented basketball hoop equipped with sensors to evaluate throw quality and accuracy in real time with the help of Arduino technology. This system, which enabled immediate feedback on throw mechanics, illustrated the value of integrated technology for enhancing training in sports that require precise motion. Chang et al. (2022) further expanded the capabilities of Kinect in motion analysis by integrating deep neural networks to improve tracking accuracy in swimming. This motion-tracking system allows the real-time identification and correction of improper postures; using it, coaches can continuously support daily training routines with precise feedback, monitoring, and technique adjustment. Similarly, Cunha et al. (2021) designed a user-friendly and cost-effective system, presenting a new avenue for training that could potentially be adapted to other sports with similar movement patterns. Cunha et al. (2023) further developed this concept by proposing a multi-sport adaptable system based on the 3D Orbbec Astra Camera, which enhanced training by refining athletes' techniques through detailed motion capture and feedback.
This application suggests that such systems could be beneficial in sports with comparable requirements for high-speed, complex motions, demonstrating the versatility of depth cameras in training. Emad et al. (2020) leveraged the Kinect sensor to capture correct forms and common errors in karate kata, and the Kinect system facilitated movement correction. Meanwhile, in golf, Hwang et al. (2021) utilised Kinect with 15 IMUs to model golfer motion by replicating movement with high accuracy. This dual-system approach creates a body model that reflects individual body characteristics, enabling players to correct errors based on realistic simulations of their movements. Moreover, Kaharuddin et al. (2017) utilised a Kinect depth camera to perform biomechanical analysis, demonstrating how to facilitate the learning of complex, unfamiliar movements by offering detailed feedback on joint positioning and movement quality. Malawski (2020) demonstrated the capacity of Kinect and inertial sensors; this capability for real-time feedback was particularly advantageous in supporting continuous training by identifying and correcting form in ongoing drills and ensuring that each movement approximated the desired technique. In gymnastics, Reily et al. (2017) facilitated automatic analysis for coaches, who could monitor multiple gymnasts' performances and receive synchronised and consistent feedback on each athlete's routine. Finally, in table tennis, Sri-Iesaranusorn et al. (2021) compared the strokes of novice players with those of experienced athletes, highlighting areas for improvement in technique, with depth sensors providing a robust framework for assessing consistency and skill level. Together, these studies underline the broad applicability of depth camera systems in sports, revealing their effectiveness in monitoring performance, providing precise motion data, and contributing to enhanced athletic training methodologies.
The objective of this study was to gather the applications of depth cameras in sports biomechanics and to show the limitations and paths for further development of this technology. On this basis, two important conclusions can be drawn: (1) depth cameras are more often used in individual sports (the occlusion risk increases with the number of people in the field of view); and (2) there is a need to standardise procedures and develop methodological guidelines for their use (to improve the consistency of results and allow more objective comparisons of training effects across disciplines and studies). Currently, one of the limitations of depth camera technology is the maximum sampling frequency; for example, Kinect devices offer 30 frames per second (fps) (Chang et al., 2022), and the Intel RealSense D435 up to 90 fps (Pandurevic et al., 2019). The Femto Bolt camera offers the widest sampling-frequency range, from 30 fps to 120 fps (Laaraibi, 2024), but increasing the sampling frequency decreases the camera resolution. In this situation, the movement of a high-speed athlete will not be captured accurately (Van der Kruk and Reijne, 2018).

The past and future of depth cameras
The applications of depth cameras are diverse among sports disciplines. Sri-Iesaranusorn et al. (2021) achieved satisfactory results in table tennis by creating a template against which beginner strokes were compared with those of experienced players; the application of smartwatches and mobile phones is proposed as the next step. In basketball, two systems (a depth camera and Arduino technology) were combined to provide real-time feedback for the player; 92% of users found the system useful, and 94% described the experience as pleasant (Bittar et al., 2017). Babayan et al. (2015) proposed a dry swimming machine that allows swimming techniques to be practiced with realistic feedback (lifting and lowering depending on the correctness of the movement). The results were promising; however, further testing is required to assess effectiveness. One limitation is the lack of a natural environment: conducting research during real training sessions (in water) could provide more reliable and development-relevant data, although such an analysis may be difficult to perform. Depth camera usage is most frequent in combat sports and martial arts. Although the authors of these articles used different depth cameras (Kinect and 3D Orbbec Astra), they arrived at similar conclusions. Kaharuddin et al. (2017) evaluated Kinect as a tool for the biomechanical analysis of basic movement patterns in silat; despite its effectiveness, accurate analysis requires additional processing, particularly for a detailed assessment of joint angles and movement quality. Cunha et al. (2023) used the Orbbec Astra camera in taekwondo, which, when combined with machine learning (a project developed in their previous research; Cunha et al., 2021), reduced noise and improved the capture of movements such as kicks and punches. The results and further research in this project are promising. Emad et al. (2020) created iKarate, which uses Kinect to analyse kata movements.
The system achieved 91% accuracy in movement classification (using the F-DTW algorithm); however, the researchers suggested implementing more advanced technologies, such as deep learning, for better results. Chang et al. (2022) presented an algorithm that corrects the starting posture in swimming using Kinect. Low resolution and noise in the data require processing with advanced neural networks, which improve the accuracy and smoothness of the data. The results showed that high precision can be achieved even with low-accuracy equipment; however, depth camera technology must be fused with algorithms that enhance movement capture. Pandurevic et al. (2019) focused on improving the technique, performance, and injury prevention of competitive climbing athletes. They used not only the Intel RealSense camera to capture motion but also a force sensor mounted on the climbing wall to better understand movement patterns. The researchers did not mention any disadvantages of motion capture; however, in future studies, an additional camera capable of side observations could be considered, making it possible to examine the catch or the athlete's distance from the wall. Malawski (2020) compared IMUs with depth camera data to analyse fencing technique. Depth camera data were slightly more accurate and provided additional information, such as stride length and speed (footwork analysis), whereas the IMUs performed better in analysing rotational movements (no occlusion). The author indicated that research in this direction will be developed further. Reily et al. (2017) used Kinect to analyse the movements of gymnasts, achieving 97.8% accuracy in identifying an athlete's location and 93.8% accuracy in detecting rotation. The small research group (three standard players and seven beginners) was one limitation.
Further work should focus on developing the algorithm by significantly increasing the number of standard players. Studies on golf have been limited by the frame rate of only 30 fps (Hwang et al., 2021; Nam et al., 2013). Hwang et al. (2018) pointed out problems with the accuracy of Kinect under dynamic conditions, such as structured light noise and errors resulting from body movements; more data and advanced error correction algorithms were proposed. Nam et al. (2013) obtained better accuracy, but they used inertial sensors and two Point Grey cameras. An et al. (2017) verified that a dual Kinect system is capable of performing biomechanical analysis of athletes. A dual Kinect system covers a wider range of the athlete's motion than a single Kinect system. Moreover, the presented motion capture system is markerless, so it is more comfortable and saves a large amount of setup time compared with conventional marker-based motion capture systems. In principle, adding further depth cameras should improve the performance of such systems still more.
The practical benefits of depth cameras include the following: (1) providing detailed feedback in real time (Babayan et al., 2015; Bittar et al., 2017; Cunha et al., 2021; Emad et al., 2020; Malawski, 2020; Reily et al., 2017), (2) identifying subtle technical errors that may be difficult to detect visually (Chang et al., 2022; Kaharuddin et al., 2017; Nam et al., 2013; Pandurevic et al., 2019; Reily et al., 2017), and (3) enabling an individual approach to improving athlete technique (An et al., 2017; Sri-Iesaranusorn et al., 2021).
One limitation of depth camera technology is the number of frames per second (fps). High-speed movements (racket head speed in tennis, golf club swings, baseball bat hits, and jumps) require a very high sampling frequency. Unfortunately, Microsoft Kinect and the Orbbec Astra operate at 30 fps (Cunha et al., 2021; Malawski, 2020), which is insufficient for capturing high-speed movements (Hwang et al., 2021). In contrast, OMSs have frame rates beginning at 120 fps (Pueo and Jimenez-Olmedo, 2017), four times higher. Other depth cameras can reach up to 90 fps (Intel RealSense) (Pandurevic et al., 2019) or 120 fps (Femto Bolt) (Laaraibi, 2024), but these are maximum values. All of these motion capture systems involve a trade-off between sampling frequency and camera resolution (Van der Kruk and Reijne, 2018).
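The practical consequence of a low frame rate is easy to quantify: the distance an object travels between consecutive frames grows in inverse proportion to the fps. A small sketch, using an illustrative club-head speed of 40 m/s (an assumed round figure, not a value from the cited studies):

```python
def displacement_per_frame(speed_m_s: float, fps: float) -> float:
    """Distance an object travels between two consecutive frames."""
    return speed_m_s / fps

# At an assumed club-head speed of 40 m/s:
at_30_fps = displacement_per_frame(40.0, 30)    # over a metre between Kinect frames
at_120_fps = displacement_per_frame(40.0, 120)  # a third of a metre at 120 fps
at_240_fps = displacement_per_frame(40.0, 240)  # roughly 17 cm at 240 fps
```

Even at the 120 fps of the fastest depth cameras, the club head moves tens of centimetres between samples, which illustrates why such movements cannot be resolved reliably.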
The second limitation, and a difference between depth cameras and OMSs, is the range of view (Van der Kruk and Reijne, 2018). The Microsoft Kinect camera captures objects at up to 5 m, whereas an OMS such as Vicon covers capture volumes of at least 10 m; every device thus has its own operating range. In addition, the accuracy of OMSs (error of 0.001 m) is better than that of depth cameras (errors from 0.004 to 0.02 m) (Hwang et al., 2021; Van der Kruk and Reijne, 2018). Regarding system accuracy, Bilesan et al. (2021) analysed lower body angles during gait trials and achieved promising results: the average difference between Kinect and Optitrack was 2 degrees, although pelvic angle estimates from the depth camera system still need improvement (Bilesan et al., 2021). Another difference between these systems is the presence or absence of markers on the body, which is significant in sports biomechanics research. Although depth cameras are less accurate, they allow full freedom of movement and are non-invasive (athletes may feel uncomfortable when their skin is touched to place markers) (Scataglini et al., 2024).
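The reported positional errors translate directly into joint-angle error. As a rough illustration (the joint positions below are invented, not data from Bilesan et al.), shifting a knee-joint estimate by roughly a centimetre, within the reported depth camera error band, changes the computed knee angle by a few degrees:

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) between segments b->a and b->c,
    computed from 3-D joint positions such as hip, knee, ankle."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # clamp guards against floating point drift just outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Illustrative positions (metres): two systems place the knee about
# 1.4 cm apart, comparable to the reported depth camera error.
hip, ankle = (0.0, 1.0, 0.0), (0.05, 0.0, 0.0)
knee_oms   = (0.10, 0.50, 0.0)
knee_depth = (0.11, 0.50, 0.01)
```

With these assumed positions, the two knee-angle estimates differ by a couple of degrees, of the same order as the average Kinect/Optitrack difference reported above.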
The next problem with depth camera systems is occlusion, which affects both marker-based and markerless camera-based systems. With markers, the problem is simpler, though not trivial: marker-based motion capture systems include algorithms that compensate for occlusion, based primarily on movement models or features (Retinger et al., 2023). Depth cameras still lack sufficiently robust capture algorithms and require improvement in this area (Choppin and Wheat, 2013). The simplest way to mitigate occlusion is to increase the number of cameras, which allows objects to be viewed from different angles simultaneously (Regazzoni et al., 2014). One of the best solutions for avoiding occlusion is the use of IMUs; inertial sensors combined with a 3D motion capture system have shown the best results (Van der Kruk and Reijne, 2018).
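The multi-camera strategy can be sketched as a confidence-weighted fusion of per-joint estimates, in which an occluded view simply contributes nothing. This is an illustrative scheme under assumed data, not the algorithm of any cited system:

```python
def fuse_joint(est_a, est_b):
    """Confidence-weighted fusion of one joint's 3-D position from two
    depth cameras; an occluded view (confidence 0) is ignored.
    Each estimate is (x, y, z, confidence in [0, 1])."""
    (xa, ya, za, ca), (xb, yb, zb, cb) = est_a, est_b
    total = ca + cb
    if total == 0:
        return None  # joint hidden from both cameras
    w = ca / total
    return (w * xa + (1 - w) * xb,
            w * ya + (1 - w) * yb,
            w * za + (1 - w) * zb)

# Camera A sees the wrist clearly; camera B's view is blocked by the torso,
# so the fused estimate falls back entirely on camera A:
front = (0.40, 1.10, 2.00, 0.9)
side  = (0.10, 0.95, 1.50, 0.0)  # confidence 0: occluded
```

In a real system the confidence values would come from the camera's own body-tracking output, and the per-camera coordinates would first be transformed into a shared reference frame via extrinsic calibration.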
Other limitations of the technology include the use of alternative devices and a lack of standardisation. Although research has increasingly used cameras such as Intel RealSense (Pandurevic et al., 2019), Point Grey (Nam et al., 2013), or Orbbec Astra (Cunha et al., 2023), their effectiveness compared with Kinect remains under-researched. The small number of technology comparisons limits the selection of the best solutions. Currently, there is no uniform standard for evaluating the effectiveness of depth camera technology, which makes it difficult to compare research results. Developing such standards would allow a more consistent and transparent evaluation and could support future research on human movement analysis (Bernardina et al., 2019).
Depth cameras use technologies such as time of flight, structured light, and stereovision, which give them a wide array of applications beyond sports performance analysis. These cameras are integral to various fields, including machine vision, autonomous driving, healthcare, augmented reality (AR), and virtual reality (VR). In robotics, they help guide robotic arms (Garcia et al., 2013), help unmanned vehicles navigate (Gai et al., 2021), and manage logistics (Yoshimura et al., 2023) by enabling precise object recognition (Bo et al., 2011) and spatial analysis. In healthcare, depth cameras facilitate enhanced patient monitoring (Siddiqi et al., 2021; Addison et al., 2021) and support advanced interaction systems for patients with mobility problems (Xu et al., 2021). They are also used in physical therapy to track and analyse patient movements (Omelina et al., 2016) and assist in accurate diagnostics (Belić et al., 2019) and treatment planning. In autonomous driving, depth cameras are critical for obstacle detection (Iacono and Sgorbissa, 2018), providing data that help vehicles understand and navigate their environments safely, and they contribute significantly to the decision-making algorithms of self-driving cars (Al-Nuaimi et al., 2021). Depth cameras also enhance immersion in VR and AR by accurately mapping the real world and integrating digital elements in real time (Alexiadis et al., 2012), allowing more interactive and engaging user experiences in gaming (Yahav et al., 2007), education (Syed et al., 2022), and virtual tourism (Siddiqui et al., 2022). In agriculture, depth cameras have been employed for tasks such as monitoring crop growth (Andujar et al., 2016), assessing livestock health (Monteiro et al., 2021), and the in-field estimation of fruit quality (Neupane et al., 2021); these applications benefit from the ability of depth cameras to analyse spatial data and enhance precision agriculture practices. 
The flexibility and range of applications of depth cameras illustrate their importance in different sectors. Their ability to provide detailed spatial awareness and real-time data processing makes them invaluable tools for advancing technology and automation in many areas of daily life.
Based on the COSMIN risk-of-bias checklist (Terwee et al., 2018), a reliable description and standardisation of research methodologies using new technologies would allow a better understanding of research topics: the flow of information would be clearer, allowing problems in specific areas to be described more precisely. Attention should also be paid to the need for more studies comparing the performance of depth cameras with gold-standard measurement devices (Helten et al., 2013; Piche et al., 2022). The tracking algorithms used in depth camera technology currently represent the main development direction. Based on the available literature, it is necessary to examine combinations of different technologies: adjusting appropriate filters (Min et al., 2011), improving tracking algorithms (Ma and Wu, 2014), and combining depth camera technologies with machine learning (Chatzitofis et al., 2019), neural networks (Chang et al., 2022), or artificial intelligence in general (Jacobsson et al., 2023) to further develop these systems. The fusion of innovative technologies may prove particularly fruitful in the future. In addition, error correction algorithms, such as mathematical models that account for data noise or fluctuations resulting from motion dynamics, can improve the reliability of the measurements (Hwang et al., 2021). Existing algorithms allow one, two, or three Microsoft Kinect devices to be connected; An et al. (2017) used two depth cameras to obtain views from different angles, with promising results, and OMSs use the same solution. Increasing the number of cameras decreases the risk of occlusion, but for dynamic movement, depth cameras currently remain inadequate (Nam et al., 2013). One limitation of this method is the maximum frame rate: accuracy cannot be sufficiently high for high-speed movements because there are too few frames per second. 
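As a minimal illustration of the filtering step mentioned above, an exponential moving average can suppress frame-to-frame jitter in a joint coordinate streamed from a depth camera. This is a deliberately simple sketch; production pipelines would typically use Kalman or Savitzky–Golay filters instead:

```python
def ema_smooth(samples, alpha=0.3):
    """Exponential moving average over one joint coordinate.
    Lower alpha gives a smoother but laggier signal, i.e. the usual
    trade-off between noise suppression and responsiveness."""
    out = []
    prev = None
    for x in samples:
        prev = x if prev is None else alpha * x + (1 - alpha) * prev
        out.append(prev)
    return out

# Alternating jitter around a stationary joint is strongly attenuated:
raw = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = ema_smooth(raw, alpha=0.3)
```

The lag introduced by such smoothing is another reason why high-speed movements need high sampling frequencies: at 30 fps, even a few frames of lag correspond to a large fraction of the movement.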
Sports such as badminton, squash, and golf require approximately 1000 fps; the next threshold is 480 fps (tennis, baseball, or table tennis); for most sports, 240 fps is sufficient (Pueo, 2016). Unfortunately, the best depth cameras can track at only up to 120 fps (the most common, the Kinect, operates at 30 fps). Increasing the number of frames per second is therefore one of the greatest challenges facing depth cameras. Many studies have used Microsoft Kinect sensors to capture the motion of athletes (Chang et al., 2022; Emad et al., 2020; Malawski, 2020; Reily et al., 2017). Further research is needed on cameras such as Intel RealSense (Pandurevic et al., 2019), Orbbec Astra (Cunha et al., 2023), or Femto Bolt; for the latter, no studies have been published to date. More studies would make it possible to verify the potential of these cameras and to compare them, either across studies concerning a specific movement or within a single study.
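The frame-rate thresholds above can be summarised in a small lookup that checks which depth cameras meet a given sport's sampling requirement (requirements from Pueo, 2016; maximum frame rates as cited in this review; both lists illustrative rather than exhaustive):

```python
# Approximate sampling requirements (fps) per Pueo (2016)
REQUIRED_FPS = {"badminton": 1000, "squash": 1000, "golf": 1000,
                "tennis": 480, "baseball": 480, "table tennis": 480,
                "most sports": 240}

# Maximum frame rates of the depth cameras discussed in this review
DEPTH_CAMERA_FPS = {"Microsoft Kinect": 30,
                    "Intel RealSense": 90,
                    "Femto Bolt": 120}

def suitable_cameras(sport):
    """Depth cameras whose maximum frame rate meets the sport's requirement."""
    need = REQUIRED_FPS[sport]
    return [name for name, fps in DEPTH_CAMERA_FPS.items() if fps >= need]
```

Running `suitable_cameras` for any of these sports returns an empty list: no current depth camera reaches even the general 240 fps threshold, which is the central frame-rate limitation of the technology.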
Depth cameras are effective at analysing techniques in individual sports; however, their use in team sports is limited by the problem of occlusion. They have the potential for injury prevention, progress monitoring, and technical error correction. However, the low frame rate and noise in the data make it difficult to fully exploit their capabilities. The advantages of this technology are markerless motion capture, ease of use, and low cost; however, the lack of methodological standards makes it difficult to compare different devices.
The number of frames per second needs to be increased (target 240–480 fps); this improvement in temporal resolution would allow better-quality recording of dynamic movements. The development of noise-reduction and error-correction algorithms and the integration of artificial intelligence and IMU systems could eliminate occlusion problems. It is also crucial to standardise measurement procedures and research methodologies, as well as to conduct tests under non-laboratory conditions. Exploring less popular cameras, such as Intel RealSense, could open new technological possibilities.
Depth cameras have significant potential, but their development requires investment in equipment, algorithms, and standardisation. This technology is a promising tool that, with appropriate improvements, could become a key element of sports biomechanics.