Adaptive Upper Limb Robot-Assisted Rehabilitation: Learn-from-Therapist Demonstrations

Ismail Auta; Ahmed Fares; Hiroyasu Iwata; Haitham El-Hussieny

doi:10.14313/jamris-2026-004

Introduction

Globally, stroke is a major health problem and is among the leading causes of death and long-term disability [1, 2]. A 2019 estimate indicates that about 101 million people live with the effects of stroke, and about 12.2 million new cases are reported yearly [3]. As the population ages, the incidence is projected to rise [4, 5]. After a stroke, more than two-thirds of patients experience impaired motor function in their upper limbs [6], and at six months post-stroke, 50% of patients continue to experience a significant loss of functionality [7]. This impairment is particularly challenging for stroke survivors because it interferes with their ability to perform daily tasks independently [8]. Therefore, as the number of survivors increases, the demand for effective rehabilitation services grows significantly [9, 10]. These services are essential for assessing and improving motor functions by providing targeted training during the vital early recovery phase, which can last up to three months post-stroke [11, 12]. Additionally, this area remains a top research priority in stroke rehabilitation [13, 14].

Conventional rehabilitation techniques, including physical and occupational therapy, and various forms of electrical and sensory stimulation, often fail to restore full motor function in the upper limb of stroke patients [15]. This is often attributed to insufficient therapy doses, poor patient engagement, and lack of objective feedback [16]. These approaches do not induce the neuroplastic changes necessary for maximum motor recovery, suggesting a need for more intensive rehabilitation strategies [17]. Robotic technology integrated into rehabilitation has shown remarkable potential in enhancing recovery following stroke [18]. For instance, technologies such as exoskeletons and end-effector robots address these limitations by offering accurate support and precise movement of the limbs, significantly improving functional outcomes [7, 19]. Systematic reviews and meta-analyses verify that robot-assisted therapy enhances arm function, muscle strength, and activities of daily living by promoting intensive, repetitive exercises [20-22].

Problem Statement: Despite the advancement in robotic rehabilitation, some critical areas still require further exploration to enhance the efficacy of these interventions. Currently, robotic-assisted rehabilitation faces significant challenges, including the need for personalization to address individual differences among patients [23], and the necessity for adaptability to ensure that therapies can adjust to patients’ progress and specific needs in real-time [24]. This adaptability provides assistance when it is most needed to optimize recovery and to facilitate patient engagement during therapy sessions, particularly during the critical first three months post-stroke, which are important for recovery [6, 25].

A common limitation of current rehabilitation robots is their limited generalizability to a broader patient population [26]. This is because standardized treatments are not optimally effective for everyone, and each patient may require distinct levels of assistance and types of movement therapy.

This makes it difficult for a single robotic design to effectively serve all potential users. Additionally, a therapist is required to provide demonstrations for each task; where therapist time is scarce, this dependency limits the practicality and scalability of robotic rehabilitation systems. Finally, patient feedback is often not taken into account, which limits the system’s ability to adapt to the patient’s needs. Addressing these challenges requires the development of rehabilitation robots that are capable of blending patient needs and robot functionalities to provide effective and prompt responses to therapeutic interventions.

A variety of methods and technologies have been developed to enhance the functionality of the upper limb. Among these are exoskeleton robots, designed to mimic the skeletal structure of the patient’s limb, and end-effector robots, which interact directly with the limb through the movement of their end-effector.

These robotic technologies are categorized based on the types of assistance they provide, which include active, passive, and haptic [27]. Active assistance is employed to support patients who retain some motor functionality; in this mode, patients actively engage in rehabilitation exercises alongside the robot. In contrast, passive assistance does not require any effort from the patient, as the robot performs the movements independently. Haptic assistance, meanwhile, involves sensory stimulation, often through touch, that can be either active or passive. Recently, haptic devices have been integrated with virtual reality (VR) to create rehabilitation scenarios that provide sensory feedback, enhancing the therapeutic experience [1, 10, 15].

A typical approach to modeling rehabilitation exercises involves the use of probabilistic models. For instance, [28] and [29] present a rehabilitation technique that incorporates both force and impedance-based behaviours from the patient; they utilize Gaussian Mixture Models (GMM) with Gaussian Mixture Regression (GMR) to develop a generalized model based on real-time patient responses. In [28], a dynamic bicycle cranking model is used to adjust the level of assistance provided according to therapist performance in demonstrating different sub-tasks on a patient-specific basis. [30] employed GMM and GMR for robotic assistance in play activities for children with cerebral palsy, applying them to a two-dimensional pick-and-place task using a master-slave teleoperation system. The major drawback of these approaches is their inability to adapt to patient variability and to external disturbances.

Recent developments have incorporated machine learning techniques. In their study, [31] developed an Intelligent Assistant for Robotic Therapy (iART) using Long Short-Term Memory (LSTM) networks, which replicates therapist behaviours in assistive tasks that involve tracking a complex three-dimensional trajectory.

In [32], a feedforward neural network combined with VR-based haptics is used to model and regenerate therapeutic rehabilitation strategies initially demonstrated by therapists, enabling them to remotely demonstrate and manage rehabilitation through teleoperation and facilitating ongoing therapy even when they are not physically present. However, adapting these systems to the wide variability in patient conditions poses another challenge.

Moving beyond these techniques, Dynamic Movement Primitives (DMP) offer another approach in robotic rehabilitation, widely recognized for their ability to model complex movement patterns. They have been applied across various fields, showing significant results in enhancing robotic functionality and interaction. [33] investigated the use of DMP by providing a method to teach robots how to replicate observed actions and adapt to new goals simply by adjusting start and goal parameters, making the system highly adaptable to different scenarios. The results showcased include a successful implementation on a Sarcos robot arm, where the robot performed tasks such as pick-and-place and water-serving, demonstrating significant generalization capabilities in novel situations.

[34] proposes an upper limb rehabilitation robot that performs task-oriented exercises by recognizing objects and generating trajectories for reaching them. The DMP-based motion planner used in this system successfully replicates the motion style of healthy subjects and achieves target positions with minimal error. Similarly, [35] developed a motion planning system based on the DMP and validated it using the Kuka LWR4+ robotic arm. Eight healthy subjects were recruited to perform a series of tasks such as: pouring, drinking, and eating while their wrists were attached to the robot’s end effector. The system accurately mirrors and generalizes the personal motion style of users across different task scenarios. Compared to previous work by [36], this study achieved improvements, including minimized trajectory errors, augmented capacity, and optimized memory utilization for the DMP database. However, it has a drawback of executing each movement separately, which restricts its application in scenarios that require integration of several movements. [37] addresses this problem by using hierarchical deep reinforcement learning to integrate Dynamic Movement Primitives (DMPs) with actor-critic algorithms. This approach enables robots to perform sequences of movement primitives, thereby enhancing efficiency in task execution. It was successfully evaluated on a 6 DOF robot arm through a pick-and-place operation.

These studies have demonstrated the capability of DMPs to perform complex, real-world tasks by dynamically adapting to changes and ensuring precise trajectory planning. Building on this, a novel approach involves learning directly from therapist demonstrations. In this method, a therapist performs rehabilitation exercises which are observed by a robot, which then replicates these movements with a patient.

This technique eliminates the need for deep technical knowledge in robotics from the therapist’s side, as a simple demonstration to the robot suffices. While various algorithms can be used for learning from therapist demonstrations, the integration of DMPs with Learning from Demonstrations (LfD) presents a particularly effective solution for representing complex human movements [38].

In this research we propose a Learn-from-Therapist Demonstrations (LfTD) framework that integrates DMP with Model Reference Adaptive Control (MRAC) to enhance robot-assisted rehabilitation. The DMP component captures demonstrations by formulating them as a non-linear differential equation. These are then modelled using Locally Weighted Regression (LWR), which learns a dynamic forcing term, thereby giving the DMP the ability to adapt the learned trajectories to various scenarios. The MRAC component is used to dynamically adapt the level of assistance provided to patients. It employs a Jacobian-based controller to convert task velocities into joint velocities, coupled with an adaptive control mechanism that adjusts the assistance in real time, based on ongoing patient performance. This framework provides the flexibility to generalize the observed demonstration by adjusting the parameters of the DMP. It also enhances the adaptation to patient performance through the MRAC, thereby implementing an ’assist-as-needed’ rehabilitation strategy, which gives patients the freedom to actively participate in rehabilitation.

Thus, the main objective of this research is to develop a Learn-from-Therapist Demonstrations (LfTD) framework that addresses several aspects of robot-assisted rehabilitation. The framework aims to: (i) enable therapists with no technical skills to train the system in generating rehabilitation exercises; (ii) learn and generalize the motion style from therapist demonstrations; (iii) accurately replicate these exercises with a robot for patient treatment; and (iv) dynamically adapt the level of robotic assistance based on patient deviations, promoting an ’assist-as-needed’ approach. We therefore hypothesize that the LfTD framework will significantly enhance motor function rehabilitation by facilitating more accurate replication of therapist-led movements, thereby increasing the adaptability and personalization of therapy sessions. We also expect this framework to improve patient outcomes, with faster recovery times and greater improvement in motor function compared to conventional robot-assisted therapies.

The LfTD framework is expected to significantly improve the effectiveness and personalization of rehabilitation therapies by merging the expertise of human therapists with the precision and adaptability of robotic systems. Such an approach bridges the gap between sophisticated robotic programming and practical therapeutic interventions, making advanced rehabilitation technology accessible and effective for a broader range of patients by providing tailored and responsive therapy.

The expected benefits are extensive: therapists can leverage this technology to enhance their capabilities, offering interventions that were previously unfeasible due to physical or time constraints.

The rest of this paper is organized as follows: After the introduction, Section II details the LfTD framework. This section describes the process of demonstration collection, DMP computation, exercise generation, and feedback control implementation. Based on this framework, Section III presents its application on a simulated robotic manipulator, along with a discussion of the results obtained. Lastly, Section IV concludes with our main findings and directions for future research.

Learn-from-Therapist Demonstrations

2.1. System Overview

The proposed LfTD framework is designed to enable a two-link robotic manipulator to mimic upper limb rehabilitation exercises by learning from demonstrations provided by therapists. This process is structured into four main steps, as outlined in Figure 1:

Collecting demonstrations from therapists,
Learning the movement primitives using Dynamic Movement Primitives (DMP),
Generating personalized exercise routines,
Implementing adaptive feedback control to ensure precise execution of the exercises.

The subsequent sections will provide a detailed explanation of each of these steps.

2.2. Demonstrations Collection

The process of collecting rehabilitation demonstrations involves capturing the therapist’s Cartesian motion using a visual tracking system. In this setup, a healthy human subject simulates the therapist by demonstrating the rehabilitative exercise intended for the patient, with the tracking system capturing this demonstration for later replication in robotic-assisted rehabilitation. The subject’s trajectory is recorded through a colored marker placed on their wrist, which is tracked by an RGB camera mounted on an L-shaped stand aimed at the wrist, as shown in the collection of demonstrations block in Figure 1.

This demonstration establishes the desired motion pattern, which is then utilized within our LfTD framework in a simulated environment. To simplify data collection, the captured data is represented in two-dimensional Cartesian coordinates within the image space. The decision to use visual tracking was driven by the need for a straightforward yet effective method to precisely capture dynamic movements.

The motion is captured as a series of pixel coordinates, p ∈ ℝ², and then converted into coordinates, y ∈ ℝ², that the robot can use in its operational space. The conversion is expressed by Equation (1), which maps the camera’s field of view, defined by the range [C_min, C_max], to the robot’s workspace, defined by the range [R_min, R_max]:

$\mathbf{y} = \frac{(\mathbf{p} - C_{\min}) \times (R_{\max} - R_{\min})}{C_{\max} - C_{\min}} + R_{\min}$ (1)
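As an illustration of Equation (1), the following minimal sketch applies the mapping independently to each axis. The image size (640 × 480 pixels) and the ±50 cm workspace are hypothetical values chosen only for this example; they are not taken from the experimental setup.

```python
def pixel_to_workspace(p, c_min, c_max, r_min, r_max):
    """Map one pixel coordinate pair to workspace coordinates (Equation 1).

    Every argument is an (x, y) pair; the scaling is applied per axis.
    """
    return tuple(
        (p_k - cmin_k) * (rmax_k - rmin_k) / (cmax_k - cmin_k) + rmin_k
        for p_k, cmin_k, cmax_k, rmin_k, rmax_k in zip(p, c_min, c_max, r_min, r_max)
    )


# Hypothetical camera range (pixels) and robot workspace range (cm).
C_MIN, C_MAX = (0.0, 0.0), (640.0, 480.0)
R_MIN, R_MAX = (-50.0, -50.0), (50.0, 50.0)

# The image center should land on the workspace center.
center = pixel_to_workspace((320.0, 240.0), C_MIN, C_MAX, R_MIN, R_MAX)
print(center)
```

Because the mapping is affine, the corners of the image map to the corners of the workspace and the aspect ratio is rescaled whenever the two ranges have different proportions.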

This conversion is essential to ensure that the captured trajectories align with the constraints and capabilities of the robot. It scales the motion data to fit within the dimensions of the robot’s operational area. After capturing the motion, the next step is to calculate the velocity and acceleration, which are critical inputs for the DMP algorithm. Due to the discrete nature of the data, these calculations are performed using the finite difference method, as detailed in Equations (2) and (3):

$\mathbf{v}(t) = \frac{\mathbf{y}(t + 1) - \mathbf{y}(t)}{\Delta t}$ (2)

$\mathbf{a}(t) = \frac{\mathbf{v}(t + 1) - \mathbf{v}(t)}{\Delta t}$ (3)

In this context, y(t), v(t), and a(t) represent the position, velocity, and acceleration at time t, respectively. The time interval between samples is denoted by Δt. To reduce noise in the captured motion data, a moving average filter is applied to the position data before calculating velocity and acceleration. This smoothing process is defined by Equation (4), where the moving average at time t is calculated as:

$\bar{\mathbf{y}}(t) = \frac{1}{N} \sum_{i = 0}^{N - 1} \mathbf{y}(t - i)$ (4)

where N is the number of samples used in the moving average. This filter smooths the data by averaging adjacent points, effectively minimizing random fluctuations and ensuring that subsequent calculations for velocity and acceleration are based on cleaner input. Once captured and calculated, the data is normalized, preparing it for further processing.
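The preprocessing chain of Equations (2) to (4) can be sketched for a single axis as follows. Two details are assumptions of this sketch rather than statements from the paper: the first N − 1 samples are averaged over a shorter window (Equation (4) is undefined there), and the forward difference simply returns one sample fewer than its input. The sample values are invented for illustration.

```python
def moving_average(y, n):
    """Causal moving average (Equation 4); early samples use a shorter window."""
    smoothed = []
    for t in range(len(y)):
        window = y[max(0, t - n + 1): t + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def forward_difference(y, dt):
    """Forward finite difference (Equations 2 and 3); returns len(y) - 1 samples."""
    return [(y[t + 1] - y[t]) / dt for t in range(len(y) - 1)]


# Invented noisy positions (cm) sampled every 0.1 s.
dt = 0.1
raw = [0.0, 1.1, 3.9, 9.2, 15.8, 25.1]

position = moving_average(raw, n=3)            # smooth first, as described above
velocity = forward_difference(position, dt)    # Equation (2)
acceleration = forward_difference(velocity, dt)  # Equation (3)
print(len(position), len(velocity), len(acceleration))
```

Smoothing before differentiating matters because each finite difference divides by Δt and therefore amplifies sample-to-sample noise.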

2.3. Dynamic Movement Primitives (DMP)

The fundamental principle of Dynamic Movement Primitives (DMP) is to model complex motions using differential equations that describe the temporal evolution of motion [38, 39]. Each degree of freedom (DoF) in the demonstrated trajectory is modeled by a second-order differential equation. This equation is analogous to a point mass attached to a spring-damper mechanism, combined with a non-linear forcing term f, with an acceleration $\ddot{\mathbf{y}} = \mathbf{a}$, and is expressed as [40]:

$\tau \ddot{\mathbf{y}} = \alpha (\beta (\mathbf{g} - \mathbf{y}) - \dot{\mathbf{y}}) + \mathbf{f}$ (5)

In this model, y represents the position and ẏ = v the velocity of the therapist’s wrist collected during demonstrations. The variable g denotes the end-target position of the movement. The constants τ, α, and β are positive parameters that adjust the spatial and temporal scales of the demonstration. Meanwhile, f is the non-linear forcing term that encapsulates the unique movement patterns observed in the demonstrations.

The primary goal of the DMP is to learn the forcing term f, as specified in Equation (5), based on the demonstrations that provide data for ÿ, ẏ, and y [41]. This forcing term f is expressed as a weighted sum of n Gaussian basis functions, as follows:

$f(x, \mathbf{g}) = \frac{\sum_{i = 1}^{n} \psi_{i} \omega_{i}}{\sum_{i = 1}^{n} \psi_{i}} \, x \, (\mathbf{g} - \mathbf{y}_{0})$ (6)

In this formulation, ω_i represents the weights learned from demonstrations, while ψ_i denotes the Gaussian basis functions. The variable g is the end-target position, and y₀ is the starting position of the demonstration. The phase variable x modulates the influence of the term x(g – y₀), which acts as both a diminishing and a scaling factor. This ensures that the system converges towards the goal, aiding the forcing term in stabilizing the system to a steady state of rest. The weights ω_i are determined through locally weighted regression [42], allowing for the accurate reproduction of the required trajectory when necessary.

Because the forcing term would otherwise depend explicitly on time, autonomy is achieved by introducing a phase variable through a canonical system. The phase variable starts at an arbitrary value, usually 1, and decays to 0 over the course of the motion, which keeps the motion temporally aligned and lets it start and end smoothly at the desired positions [43]. This phase variable typically follows a simple decaying exponential, expressed in terms of first-order dynamics as follows:

$\dot{x} = - \alpha_{x} x$ (7)

where α_x is a constant that determines how fast the system decays. Meanwhile, the Gaussian basis functions are represented as:

$\psi_{i} = \exp \left( - h_{i} (x - c_{i})^{2} \right)$ (8)

where c_i is the center of each Gaussian function and h_i sets its width (a larger h_i gives a narrower function); together they define how the influence of each basis function varies with the phase variable x.
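Equations (7) and (8) can be sketched as below. The closed-form phase follows directly from Equation (7); the placement of the centers (equally spaced in time, then mapped to phase) and the width heuristic are assumptions of this sketch, since the text only states that the basis functions are equally spaced. The 1 s duration is likewise a hypothetical value.

```python
import math


def phase(t, alpha_x, x0=1.0):
    """Closed-form solution of Equation (7): x(t) = x0 * exp(-alpha_x * t)."""
    return x0 * math.exp(-alpha_x * t)


def basis_centers(n, alpha_x, duration):
    """Centers equally spaced in time and mapped to phase (assumed placement)."""
    return [phase(i * duration / (n - 1), alpha_x) for i in range(n)]


def basis_widths(centers):
    """Assumed heuristic h_i = 1 / (c_{i+1} - c_i)^2; the last width is repeated."""
    h = [1.0 / (centers[i + 1] - centers[i]) ** 2 for i in range(len(centers) - 1)]
    return h + [h[-1]]


def basis_activations(x, centers, widths):
    """Equation (8): psi_i = exp(-h_i * (x - c_i)^2)."""
    return [math.exp(-h * (x - c) ** 2) for c, h in zip(centers, widths)]


# n = 60 and alpha_x = 0.8 follow Table 1; the duration is hypothetical.
centers = basis_centers(60, alpha_x=0.8, duration=1.0)
widths = basis_widths(centers)
psi = basis_activations(phase(0.5, 0.8), centers, widths)
print(len(psi), max(psi) <= 1.0)
```

With this width choice, neighboring basis functions overlap at roughly exp(−1) of their peak, which keeps the normalizing sum in Equation (6) away from zero along the motion.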

As depicted in Figure 2, to tailor the DMP to efficiently learn the therapist demonstrations y = [y_x, y_y], we use the captured data from the demonstrated trajectory to compute the forcing term f = [f₁, f₂] in Equation (5). The canonical system outlined in Equation (7) is then integrated, with a specific temporal scaling, to derive the phase variable. For each basis function in f, locally weighted regression is applied to determine the weights ω_i that minimize the error between the force inferred from demonstrations in Equation (5) and the estimated force in Equation (6).
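A minimal single-axis sketch of this learning step is given below. It first solves Equation (5) for the forcing term implied by each demonstration sample, then fits every weight with an independent weighted least-squares problem, which is the usual locally weighted regression solution for DMPs; whether the authors use exactly this closed form is an assumption. The centers, widths, and synthetic target force are invented for illustration.

```python
import math


def target_force(y, yd, ydd, g, alpha, beta, tau):
    """Forcing term implied by one sample (Equation 5 solved for f)."""
    return tau * ydd - alpha * (beta * (g - y) - yd)


def learn_weights(f_target, x, y0, g, centers, widths):
    """Fit one weight per basis function by weighted least squares."""
    scale = [x_t * (g - y0) for x_t in x]  # the factor x (g - y0) in Equation (6)
    weights = []
    for c, h in zip(centers, widths):
        psi = [math.exp(-h * (x_t - c) ** 2) for x_t in x]  # Equation (8)
        num = sum(p * s * f for p, s, f in zip(psi, scale, f_target))
        den = sum(p * s * s for p, s in zip(psi, scale))
        weights.append(num / den if den > 1e-12 else 0.0)
    return weights


# Synthetic check: a target force that is exactly 3 * x * (g - y0)
# should give every basis function the weight 3.
times = [k * 0.01 for k in range(101)]
x = [math.exp(-0.8 * t) for t in times]  # phase from Equation (7), alpha_x = 0.8
y0, g = 0.0, 2.0
f_target = [3.0 * x_t * (g - y0) for x_t in x]
weights = learn_weights(f_target, x, y0, g, centers=[1.0, 0.75, 0.5], widths=[16.0] * 3)
print(weights)
```

Each weight is fitted independently of the others, which is what makes the procedure cheap even with sixty basis functions per degree of freedom.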

2.4. Generation of Rehabilitation Exercises

In robot-assisted rehabilitation, the generalization capabilities of DMP are particularly valuable [33]. Rehabilitation robots often need to assist patients with a variety of movement exercises that may change as the patient’s condition evolves. By learning the fundamental patterns of movement exercises through therapist-led demonstrations, DMPs can generate new, patient-specific trajectories [39]. This adaptive quality allows the rehabilitation robot to cater to the unique recovery needs and progress of each patient. For example, as a patient regains more motor control, the DMP can adjust the trajectories to be more challenging or target different aspects of movement.

DMPs offer the flexibility to adapt learned demonstrations to new goals or conditions by modifying their temporal and spatial attributes. Once the weights for the forcing term (as shown in Equation (6)) are established, new motions can be generated in terms of position (y), velocity (ẏ), and acceleration (ÿ) [36]. This adaptation involves adjusting the initial position (y₀), goal position (g), or the temporal scaling factor (τ) as detailed in Equations (5) and (6). These adjustments can be made without the need to retrain the model from scratch. After determining the new acceleration from Equation (5), integration is used to compute the corresponding velocity and position profiles over time for the newly generated trajectories.
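The generation step can be sketched for one degree of freedom by integrating Equations (5) to (7) numerically. The semi-implicit Euler scheme, the 3 s duration, the 1 ms step, and the three basis functions with their weights are all assumptions made for this illustration; α, β, τ, and α_x take the Table 1 values, and the start and goal positions are the x-axis values reported for the original and adapted trajectories.

```python
import math


def rollout(weights, centers, widths, y0, g, alpha, beta, tau, alpha_x, duration, dt):
    """Integrate Equations (5)-(7) for one degree of freedom."""
    y, yd, x = y0, 0.0, 1.0
    positions = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centers, widths)]
        total = sum(psi)
        mix = sum(p * w for p, w in zip(psi, weights)) / total if total > 1e-12 else 0.0
        f = mix * x * (g - y0)                           # Equation (6)
        ydd = (alpha * (beta * (g - y) - yd) + f) / tau  # Equation (5)
        yd += ydd * dt                                   # velocity update
        y += yd * dt                                     # position update
        x += -alpha_x * x * dt                           # Equation (7)
        positions.append(y)
    return positions


CENTERS = [1.0, 0.7, 0.4]        # hypothetical basis centers
WIDTHS = [11.0, 11.0, 11.0]      # hypothetical basis widths
WEIGHTS = [20.0, -40.0, 30.0]    # hypothetical learned weights
PARAMS = dict(alpha=8.5, beta=4.0, tau=0.218, alpha_x=0.8, duration=3.0, dt=0.001)

# Same weights, different start and goal: no retraining is involved.
original = rollout(WEIGHTS, CENTERS, WIDTHS, y0=0.0, g=-26.5, **PARAMS)
adapted = rollout(WEIGHTS, CENTERS, WIDTHS, y0=2.8, g=-32.3, **PARAMS)
print(len(original), len(adapted))
```

Since the forcing term is multiplied by (g − y₀), a trajectory that starts at zero scales linearly with the goal, which is the mechanism behind reusing one demonstration for movements of different sizes.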

This capability is important for therapists as it can enable them to progressively increase the complexity and range of exercises according to the patient’s improving condition, helping to ensure that the rehabilitation process remains both appropriately challenging and engaging throughout [44].

The ability of the proposed LfTD framework to accurately reproduce the therapist’s motion is assessed using Mean Absolute Error (MAE), as defined by Equation (9):

$MAE = \frac{1}{N} \sum_{t = 0}^{N} \left| \mathbf{y}_{j}^{d}(t) - \mathbf{y}_{j}^{r}(t) \right|$ (9)

This equation provides a single MAE value over the total number of time instants (N) by taking the absolute value of the difference between the demonstrated trajectory $\mathbf{y}_{j}^{d}(t)$ and the replayed trajectory $\mathbf{y}_{j}^{r}(t)$ at each time instant t along the j-th Cartesian axis.
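A minimal per-axis sketch of Equation (9) follows. As an assumption of this sketch, the sum is divided by the number of samples actually supplied, which sidesteps the question of whether the index in Equation (9) runs over N or N + 1 instants. The sample values are invented.

```python
def mean_absolute_error(demonstrated, replayed):
    """Per-axis MAE between two equally long position sequences (Equation 9)."""
    if len(demonstrated) != len(replayed) or not demonstrated:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(d - r) for d, r in zip(demonstrated, replayed)) / len(demonstrated)


# Invented positions (cm) along one Cartesian axis.
demo = [0.0, 1.0, 2.0]
replay = [0.5, 1.0, 1.5]
print(mean_absolute_error(demo, replay))
```

The metric is evaluated separately for each axis, so a two-dimensional trajectory yields one value for y_x and one for y_y.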

2.5. Model Reference Adaptive Control (MRAC) for Kinematic Control

After generating new rehabilitation exercises learned from therapist demonstrations, a two-link manipulator is used to ensure that the patient’s hand, which is attached to the robot’s end-effector, follows the desired trajectory accurately. This setup allows for precise control of the robot’s movements, aligning them with the specific rehabilitation goals set based on the therapist’s input.

The DMP framework generates motion by defining a sequence of position, velocity, and acceleration samples that outline the desired trajectory for the robot’s end-effector. To translate this trajectory from the task space to the robot’s joint space, it is essential to solve the robot’s inverse kinematics. This research adopts a Jacobian-based kinematic control method to iteratively calculate the required joint configurations (q) to guide the robot along the specified trajectory. These calculated configurations then act as reference inputs within the robot’s joint space, to ensure precise adherence to the trajectories [45].

Consequently, a mechanism for adaptation is required to dynamically adjust the assistance provided to the patient while following the trajectory, ensuring that the assistance is provided based on the performance of the patient. The designed control scheme addresses these requirements and is explained in relation to the schematic in Figure 3, as follows,

1) Jacobian-based controller: The relationship between end-effector velocities and joint velocities, which depends non-linearly on the configuration q, is given by:

$\dot{\mathbf{y}}_{r} = \mathbf{J}(\mathbf{q}) \dot{\mathbf{q}}$ (10)

where ẏ_r is the velocity of the robot’s end-effector, $\dot{\mathbf{q}}$ are the joint velocities, and J(q) is the robot’s analytical Jacobian at configuration q, defined as ∂y/∂q [46].

This relationship allows for the determination of joint velocities by computing the inverse of the Jacobian matrix as follows [47]:

$\dot{\mathbf{q}} = \mathbf{J}^{-1}(\mathbf{q}) \dot{\mathbf{y}}_{r}$ (11)

The joint configuration q can then be computed by integrating these joint velocities over time. It is worth noting that in our robot’s case, the Jacobian matrix is square, which allows for the direct computation of its inverse away from singular configurations. To avoid numerical drift during the tracking of the desired trajectory y_d, the joint velocities are redefined as follows, where J_A denotes the analytical Jacobian J(q):

$\dot{\mathbf{q}} = \mathbf{J}_{A}^{-1}(\mathbf{q}) \left( \dot{\mathbf{y}}_{d} + \mathbf{K} \mathbf{e} \right)$ (12)

Here, ẏ_d represents the time derivative of our desired trajectory, while the error e represents the difference between the desired and actual end-effector positions (y_d – y_r). Additionally, a diagonal positive definite matrix $\mathbf{K} \in \mathbb{R}^{2 \times 2}$ is introduced to ensure stability and convergence.
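The closed-loop scheme of Equation (12) can be sketched for a planar two-link arm as follows. The link lengths and the diagonal gain entries follow Table 1; the forward kinematics and Jacobian are the standard planar two-link expressions, while the explicit Euler integration, the 1 ms step, the initial configuration, and the straight-line test trajectory are assumptions of this sketch and do not reproduce the Simscape model.

```python
import math

L1, L2 = 50.0, 50.0   # link lengths in cm (Table 1)
K = (10.0, 16.0)      # diagonal entries of the gain matrix (Table 1)


def forward_kinematics(q):
    """End-effector position of a planar two-link arm."""
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))


def jacobian(q):
    """Analytical Jacobian dy/dq of the planar two-link arm."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def control_step(q, y_d, yd_d, dt):
    """One Euler step of Equation (12): qdot = J^-1 (yd_d + K e)."""
    y_r = forward_kinematics(q)
    e = (y_d[0] - y_r[0], y_d[1] - y_r[1])
    u = (yd_d[0] + K[0] * e[0], yd_d[1] + K[1] * e[1])
    (a, b), (c, d) = jacobian(q)
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("Jacobian is singular at this configuration")
    qd = ((d * u[0] - b * u[1]) / det, (-c * u[0] + a * u[1]) / det)
    return (q[0] + qd[0] * dt, q[1] + qd[1] * dt)


def track_line(q0, delta, duration=2.0, dt=0.001):
    """Track a straight line of displacement delta; return the final error in cm."""
    start = forward_kinematics(q0)
    vel = (delta[0] / duration, delta[1] / duration)
    q = q0
    for k in range(int(round(duration / dt))):
        t = k * dt
        y_d = (start[0] + vel[0] * t, start[1] + vel[1] * t)
        q = control_step(q, y_d, vel, dt)
    y_r = forward_kinematics(q)
    return math.hypot(start[0] + delta[0] - y_r[0], start[1] + delta[1] - y_r[1])


final_error = track_line((0.5, 1.5), (-10.0, -15.0))
print(final_error)
```

Without the feedback term Ke, the integration error of each step would accumulate in task space; the feedback term pulls the end-effector back toward the reference at a rate set by K.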

An alternative approach utilizes the transpose of the Jacobian matrix, expressed as:

$\dot{\mathbf{q}} = \mathbf{J}_{A}^{T}(\mathbf{q}) \mathbf{K} \mathbf{e}$ (13)

This alternative representation not only reduces tracking errors but also eliminates steady-state errors [48].

2) Adaptation Mechanism for Personalized Assistance: To encourage active involvement of patients in the rehabilitation process, we have designed the gain K to be adaptive, allowing the level of assistance to be modified according to the patient’s performance. The new adapted gain, denoted as $\tilde{\mathbf{K}}$, is computed based on the current error e, the original gain K, and a predetermined rate parameter r, as follows:

$\tilde{\mathbf{K}} = \mathbf{K} (1 + |\mathbf{e}| r)$ (14)

This gain grows with both the magnitude of the error and the rate parameter, so that as either increases, the assistance provided to the patient increases, with higher rate values leading to more substantial assistance. Because the gain adapts to the measured error, the assistance provided by the robotic system is directly driven by the patient’s deviation from the desired trajectory. By continuously monitoring the error, the system adjusts the level of support to meet each patient’s needs: those who require more assistance receive it, while more capable patients engage with reduced support. To further refine the level of assistance, an additional condition is applied: if the error magnitude |e| falls below a small threshold δ, the rate parameter r is replaced by a value much smaller than its nominal setting. This conditional adjustment encourages patients to take more control over their movements as they improve, thereby fostering active participation in the rehabilitation process.
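The adaptation rule of Equation (14), together with the small-error condition, can be sketched as below. K and r take the Table 1 values; the threshold δ, the reduced rate, the use of the Euclidean norm for |e|, and the error values themselves are hypothetical choices made for this illustration, since the text does not report them.

```python
import math


def adapted_gain(K, e, r, delta, r_small):
    """Equation (14) with the reduced rate applied when |e| < delta."""
    err = math.hypot(e[0], e[1])          # |e| taken as the Euclidean norm (assumed)
    rate = r_small if err < delta else r  # small errors use the reduced rate
    return tuple(k * (1.0 + err * rate) for k in K)


K_NOMINAL = (10.0, 16.0)   # diagonal gain entries (Table 1)
R_NOMINAL = 100.0          # adaptation rate (Table 1)
DELTA = 0.01               # hypothetical error threshold
R_SMALL = 1.0              # hypothetical reduced rate

large_deviation = adapted_gain(K_NOMINAL, (0.03, 0.04), R_NOMINAL, DELTA, R_SMALL)
small_deviation = adapted_gain(K_NOMINAL, (0.003, 0.004), R_NOMINAL, DELTA, R_SMALL)
print(large_deviation, small_deviation)
```

In this example the larger deviation multiplies the nominal gain by about six, while the deviation below the threshold leaves it almost unchanged, which is the assist-as-needed behavior described above.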

Results and Discussion

The proposed LfTD framework is implemented using the parameters provided in Table 1 to teach a two-link robot how to learn and adapt therapeutic motions for upper limb rehabilitation, with experiments conducted in both real and simulated environments.

Table 1. Parameters for the LfTD Model

Parameters	Description	Value
n	Number of Gaussian basis functions	60
α	DMP parameter	8.5
β	DMP parameter	4
α_x	Canonical system constant	0.8
τ	Time scaling constant	0.218
r	Controller adaptation rate	100
K	Adaptive controller gain	[10 16]
l_i	Robot link lengths	[50, 50] cm

In the initial phase, we captured the rehabilitation motions through visual tracking, using an RGB camera mounted on an L-shaped tripod as explained in the LfTD framework. Two different demonstrations were recorded as illustrated in Figure 4, each involving a human subject moving his wrist (with a label) in the camera’s field of view to generate the therapeutic motion, while the captured motion was simultaneously converted from pixel coordinates to the robot’s Cartesian coordinates using Equation (1). We chose motions suitable for a two-link manipulator by considering movements like the figure ’S’ illustrated in Figure 4a. This exercise involves moving the arm to trace an S-shaped pattern in the horizontal plane. The analysis in this paper will focus on this demonstration.

After capturing the trajectory, the generated motion in Cartesian coordinates was fed to the DMP for calculating the target forcing term f, with the aim of learning, reproducing, and generalizing the captured motion. Sixty basis functions were used to balance accuracy and computational efficiency, allowing the system to effectively capture motion details without overfitting. This configuration was refined through careful evaluation. Utilizing locally weighted regression and the sixty equally spaced basis functions, as depicted in Figure 5, the system successfully learned the weights for each degree of freedom (DOF), which are crucial for reproducing the captured motion. Particle Swarm Optimization (PSO) was used to optimize the DMP parameters α and β. A swarm of 16 particles was run for 40 iterations, resulting in convergence within 51.61 seconds and an objective function value of 0.0041. The search space for the parameters (α, β) was bounded between [0, 0] and [30, 30]. The optimal values found were α = 22.23 and β = 6.64, which improved the model’s trajectory accuracy. Figure 6 illustrates the Trajectory Error Function, showing the reduction in error over time. By efficiently learning the weights and optimizing the parameters, the DMP was able to learn and reproduce the trajectory accurately.

One example of the reproduced demonstration is depicted in Figure 7, where it is compared with the original demonstration, which has starting and final points at [0 0]^T and [–26.5 –22.6]^T cm, respectively. The DMP successfully reproduced the motion with negligible error in both x and y positions, with the error variation depicted in Figure 8. This error, though negligible, is primarily attributed to the approximation of the non-linear forcing term and the tuning of the DMP parameters.

Detailed MAE values are presented in Table 2, highlighting the accuracy of the reproduced trajectory, with a total position error across both axes of 0.51 cm. These low values indicate a high degree of similarity between the demonstrated and reproduced trajectories.

Table 2. Mean Absolute Error (MAE) metrics

Position (cm)
y_x	y_y
0.34	0.17

The ability of the DMP to generalize to new goals was examined by modifying both the initial and final points of the demonstrated trajectory while maintaining the learned weights, as illustrated in Figure 9. The original trajectory, with initial and final points at [0 0]^T and [–26.5 –22.6]^T cm, respectively, was adapted to a new trajectory with initial and final points at [2.8 2.0]^T and [–32.3 –29.6]^T cm, respectively.
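As a worked check of this adaptation, assuming the spatial scaling enters only through the factor $(\mathbf{g} - \mathbf{y}_{0})$ in Equation (6), the per-axis scaling of the forcing term for the reported points would be as follows. The primed symbols denote the adapted start and goal and are introduced here only for illustration.

$\mathbf{g} - \mathbf{y}_{0} = [-26.5 \;\; -22.6]^{T} - [0 \;\; 0]^{T} = [-26.5 \;\; -22.6]^{T}$ cm

$\mathbf{g}' - \mathbf{y}_{0}' = [-32.3 \;\; -29.6]^{T} - [2.8 \;\; 2.0]^{T} = [-35.1 \;\; -31.6]^{T}$ cm

$\frac{-35.1}{-26.5} \approx 1.325, \qquad \frac{-31.6}{-22.6} \approx 1.398$

Under that assumption, the learned forcing term is amplified by roughly 32% along x and 40% along y, so the learned pattern is rescaled per axis rather than relearned.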

The generated trajectory successfully adapted to the new goal while maintaining the exact shape of the demonstrated trajectory pattern without altering the fundamental motion style. This capability is advantageous for performing repetitive therapeutic exercises with varying goals, as it eliminates the need to retrain the robot for each distinct goal. For instance, the same trajectory pattern can be applied to both adults and children, accommodating the different sizes of their movements. Additionally, many activities of daily living, such as picking and placing objects, do not have fixed initial and final points; thus, by demonstrating the motion pattern once, the DMP can generalize the movement to suit various applications [30, 33, 34].

In adapting the trajectory to new initial and final points, the parameters of the DMP defined in Table 1 were kept constant. This ensured that the shape and dynamics of the original motion pattern were preserved.
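The reason the motion pattern survives a change of endpoints can be made explicit with a short derivation. The sketch below assumes the standard discrete DMP form, in which the forcing term is scaled by (g − y_0); the symbols τ, z, x, ψ_i, and w_i are assumed notation and may differ from the definitions used earlier in this paper.

```latex
% Assumed standard DMP form for one DOF
\tau \dot{z} = \alpha\bigl(\beta (g - y) - z\bigr) + f(x), \qquad
\tau \dot{y} = z, \qquad
f(x) = \frac{\sum_i \psi_i(x)\, w_i}{\sum_i \psi_i(x)}\; x\, (g - y_0)

% New endpoints (y_0', g') with the same weights w_i, \alpha, \beta and \tau.
% Define the scale factor and the candidate solution
k = \frac{g' - y_0'}{g - y_0}, \qquad
y'(t) = y_0' + k\,\bigl(y(t) - y_0\bigr), \qquad z'(t) = k\, z(t)

% Substituting: g' - y' = k\,(g - y) and f' = k f, hence
\tau \dot{z}' = k\,\tau \dot{z}
             = \alpha\bigl(\beta (g' - y') - z'\bigr) + f'(x), \qquad
\tau \dot{y}' = z'

% Scale factors for the endpoints reported above
k_x = \frac{-32.3 - 2.8}{-26.5 - 0} \approx 1.32, \qquad
k_y = \frac{-29.6 - 2.0}{-22.6 - 0} \approx 1.40
```

Under these assumptions, the generalized trajectory is the demonstrated one shifted to the new start point and scaled along each axis by the ratio of the new to the original displacement, which is why no relearning of the weights is required.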

A) Robot Simulation and Design Considerations: The simulation process, illustrated in Figure 3, utilized MATLAB’s Simscape with a kinematic model of the two-link manipulator, focusing solely on the positions and trajectories of the robot. The tracking performance of the developed LfTD framework was analysed by configuring the robot to follow trajectories defined by the DMP. This was achieved by integrating the reproduced trajectory into our MRAC, which converted these trajectories into joint-space commands for the robot to execute. This integration links task-space modelling to actual robotic motion. Initially, the patient is assumed to require assistance to complete a given task.

The deviation between the actual and desired end-effector trajectories executed by the robot is illustrated in Figure 10. This figure shows that the robot accurately tracks the desired trajectory throughout its motion. The simulation results further highlight the robot’s capability to maintain precise trajectory tracking in a 2D setting, providing an evaluation of the MRAC’s performance.

B) Testing Adaptability and Future Projections: To assess the adaptability of the MRAC, disturbances were introduced to the end-effector’s position during the robot’s motion, as illustrated in Figure 11. These disturbances simulate patient impairment behaviors, such as sudden changes in movement or deviations from the intended path [49]. The disturbances are generated as waveforms with an amplitude of 2 cm and a duration of 20 s for the x-axis, and an amplitude of 3 cm with the same duration for the y-axis. Each waveform is filtered through a first-order transfer function, $\frac{1}{\mu s + 1}$, where the parameter μ is set to 6 based on criteria such as overshoot and settling time, to smooth the signal and ensure the disturbances exhibit a realistic, gradual effect. The controller’s adaptation mechanism adjusts the gain in real time based on feedback from the patient’s performance. This feature allows the patient to participate actively in the rehabilitation process, as the controller’s assistance corresponds dynamically to the patient’s deviation from the desired trajectory. The controller’s response to these disturbances, also shown in Figure 11, demonstrates its robustness and adaptability to changing conditions. Examining the MRAC’s ability to manage these filtered disturbances gives insight into its potential to handle patient-induced variations, supporting the effectiveness of therapeutic exercises despite unpredictable movement patterns. These responses were recorded after tuning different gain values and rate parameters and analysing their effect on system behaviour. The configurations that yielded the most favourable outcomes are detailed in Table 1.
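The filtered disturbance signals described above can be generated as in the following sketch. Several details are assumptions rather than facts from this work: the waveform is taken to be a rectangular pulse, μ is interpreted in seconds, the onset time `t_on`, the 60 s horizon, and the 0.01 s sample time are arbitrary, and the function name is hypothetical. The filter $\frac{1}{\mu s + 1}$ is discretised exactly under a zero-order hold.

```python
import numpy as np

MU = 6.0  # filter parameter from the text (interpreted as seconds, assumed)


def filtered_pulse(amplitude, duration, t, t_on=5.0, mu=MU):
    """Rectangular pulse (assumed waveform) passed through 1 / (mu*s + 1)."""
    dt = t[1] - t[0]
    # Raw disturbance: constant amplitude during [t_on, t_on + duration).
    u = np.where((t >= t_on) & (t < t_on + duration), amplitude, 0.0)
    a = np.exp(-dt / mu)  # exact discrete pole for a zero-order-hold input
    d = np.zeros_like(t)
    for k in range(len(t) - 1):
        d[k + 1] = a * d[k] + (1.0 - a) * u[k]
    return d


t = np.arange(0.0, 60.0, 0.01)          # time grid in seconds (assumed)
dist_x = filtered_pulse(2.0, 20.0, t)   # 2 cm amplitude, 20 s duration
dist_y = filtered_pulse(3.0, 20.0, t)   # 3 cm amplitude, 20 s duration
```

With μ = 6 and a 20 s pulse, the filtered signal rises gradually and approaches, without quite reaching, the nominal amplitude before decaying, which matches the gradual effect the filter is intended to produce.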

In evaluating the system’s performance, three key criteria were prioritized: accuracy of motion reproduction, generalizability, and adaptability to dynamic changes. Our results indicate high motion reproduction accuracy, with MAE values of 0.34 cm and 0.17 cm in the x and y positional coordinates, respectively, highlighting the system’s ability to closely mimic and generalize desired movements. Furthermore, tests under varying conditions, with disturbances mimicking human deviation, demonstrated the system’s adaptability to dynamic changes. These findings support the efficacy of our approach in addressing these aspects of performance, underscoring its potential for real-world applications. Finally, our technique captures a demonstration through one-shot learning and can generalize and adapt it.

Conclusion

In this paper, we presented an approach that uses learning from therapist demonstrations to model therapeutic exercises for the upper limb. Through this approach, a Learn-from-Therapist Demonstration framework was developed that employs Dynamic Movement Primitives (DMP) for modelling complex motor functions and a Model Reference Adaptive Controller (MRAC) for adaptive learning. This framework enables a 2-link robot to replicate rehabilitation exercises within a simulated 2D environment. Additionally, the system’s adaptability was validated through disturbance tests, underscoring the potential of our robotic rehabilitation solution to offer personalized and effective therapy and to support future developments in robot-assisted rehabilitation.

By modeling and generalizing therapeutic exercises and providing assistance based on the subject’s performance, we have demonstrated the potential feasibility of our approach. Although the results are promising, the reliance on simulations presents some limitations. Future work will focus on advancing to practical experiments with actual robots to assess the effectiveness of this approach in improving patient outcomes. Additionally, incorporating a dynamic model could be explored to evaluate the influence of dynamic parameters in scenarios where dynamic responses play a more prominent role.
