Research Article
Ligia Yumi Mochida1, Paulo R. P. Santiago2, Miranda Lamb3, Guilherme M. Cesar1
1Department of Physical Therapy, University of North Florida, 2School of Physical Education and Sport of Ribeirão Preto, University of São Paulo, 3Department of Clinical and Applied Movement Sciences, University of North Florida
Erratum Notice
An erratum has been issued for this article.
This study introduces a toolbox combining motion capture technologies to analyze kinematic signals and quantify full-body coordinative patterns with angle-angle plots. Designed to overcome challenges of traditional marker-based systems, the toolbox supports flexibility across research, clinical, and field settings, advancing individualized care and functional assessments for children with motor disabilities.
Three-dimensional marker-based motion capture systems are the gold standard for evaluating kinematic patterns in human movement, offering precise quantification of segment and joint positions. However, traditional marker-based systems pose several challenges, particularly for children with neurological disabilities and sensory processing abnormalities, such as those observed in children with cerebral palsy. These challenges hinder the use of kinematic markers and limit detailed analyses of movement patterns. Recent advancements in markerless motion capture systems utilizing deep learning-based human pose estimation have allowed us to explore cost-effective alternatives to traditional optical systems and the subsequent data processing approaches. An integrated toolbox was developed, combining multiple motion capture technologies: research-grade kinematic equipment, kinematic clusters, inertial measurement units, three-dimensional (3D) markerless systems, and two-dimensional (2D) markerless systems with commercially available cameras (via MediaPipe). In the current study, we present the outcomes of 3D marker-based versus 2D markerless motion capture, a comparison of major ongoing interest in human biomechanics research, to describe coordinative patterns via hip-knee angle-angle plots. The cyclogram approach was selected because it offers a robust metric and readily interpretable framework for analyzing coordination via coupled motion between body segments. Two typically developing children and two children with cerebral palsy performed a functional movement pattern, the sit-to-stand task. The findings demonstrate the feasibility of integrating multimodal systems for kinematic analyses, providing flexibility for research and clinical settings. Moreover, the novel open-source approach presented in this work addresses the challenges posed by many patient populations experiencing sensory processing issues, allowing for an advanced and individualized plan of care.
Three-dimensional kinematic motion capture evaluations have offered reliable information in the quantification of expected and abnormal movement patterns. Such information is crucial when designing rehabilitation plans1 and establishing safety approaches for patient engagement with clinical equipment (e.g.,2,3). When considering children with neurologic-induced disabilities, these kinematic evaluations can provide enhanced care by addressing changes in coordination patterns in distinct functional tasks such as gait4, sit-to-stand task5, and upper extremity reaching6.
Although three-dimensional (3D) kinematic evaluations are considered the gold standard for accurate motion capture analysis7, placement of kinematic markers on the skin is a requirement to ensure exact quantification of segment and joint motions. This marker-based evaluation requires knowledge of specific anatomical landmarks from those performing the evaluation, along with tolerance and acceptance of tapes secured to the skin by the individuals receiving it. While the former can be learned, the latter is an issue for several patient populations. As an example, children with brain injury or cerebral palsy may exhibit disorders of sensory information processing8,9,10. Such disorders may manifest as hypersensitivity, preventing clinicians and researchers from applying kinematic markers to the children's skin, which, in turn, can hinder the quantification of changes in movement capabilities and an efficient individualized plan of care.
In addition to the issue of marker placement on the skin of patient populations, most measurement systems utilized in kinematics studies are not affordable, are confined to laboratory spaces, and require highly qualified personnel to operate, making them impractical for routine clinical use. However, current advances in motion capture technology support different approaches for recording kinematic data. For instance, automatic human pose estimation using deep learning techniques has attracted attention among computer vision researchers. These markerless motion capture techniques may offer a promising solution to technical and practical challenges associated with marker-based motion analysis11. Instead of relying on the tracking of markers, this approach uses trained neural networks to estimate the positions of an entire set of body landmarks12. These developments enhance the overall efficiency and adaptability of human pose-recognition technology.
It is well established that the kinematic relationship between body segments can reveal underlying control mechanisms supporting variability of movement patterns. While several methods can be used to measure coordination between segments13, angle-angle plots (cyclograms) allow for an inherently interpretable quantification of coupled motion between segments/joints over a stipulated timeframe14,15. Since many children with cerebral palsy experience difficulty when performing whole-body movement patterns due to impaired selective motor control16, we chose to investigate the sit-to-stand movement, a task that is simple to perform yet governed by complex underlying control17. The goal of this study was to demonstrate the feasibility of an open-source toolbox combining motion capture technologies to facilitate kinematic evaluations of coordinative patterns for children with disabilities.
In this work, we will utilize hip-knee cyclograms to demonstrate coordination between trunk/pelvis and thigh/shank segments, using the hip and knee joints as representative intersegmental couplings relevant to sit-to-stand biomechanics. Hip-knee coordination was selected since it captures the primary joint coupling that drives the vertical progression of the center of mass during the sit-to-stand task, a functionally critical movement for children with disabilities who often rely on compensatory lower-limb strategies due to impaired selective motor control16 affecting joint synergies. The integration of different technologies (i.e., from research-grade 3D kinematic equipment to commercially available cameras) into one processing toolbox allows for flexibility of use from research settings to field and clinical environments. The toolbox also allows for detailed kinematic evaluations when marker placement is not feasible due to sensory processing abnormalities for children with disabilities via markerless motion capture signal processing features.
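For readers implementing the cyclogram analysis themselves, the construction can be sketched in a few lines: given hip and knee angle time series over one movement cycle, the angle-angle trace and simple descriptors such as joint range of motion and loop perimeter follow directly. The function below is an illustrative NumPy sketch under those assumptions, not the vailá toolbox's internal implementation.

```python
import numpy as np

def cyclogram_metrics(hip_deg, knee_deg):
    """Basic descriptors of a hip-knee angle-angle plot (cyclogram).

    hip_deg, knee_deg: 1D arrays of joint angles (degrees), same length,
    sampled over one movement cycle (e.g., one sit-to-stand repetition).
    """
    hip = np.asarray(hip_deg, dtype=float)
    knee = np.asarray(knee_deg, dtype=float)
    # Range of motion for each joint
    hip_rom = hip.max() - hip.min()
    knee_rom = knee.max() - knee.min()
    # Perimeter of the cyclogram trace: summed point-to-point distances
    # in the hip-knee angle plane (a common coordination descriptor)
    perimeter = np.sum(np.hypot(np.diff(hip), np.diff(knee)))
    return {"hip_rom": hip_rom, "knee_rom": knee_rom, "perimeter": perimeter}

# Synthetic example: hip and knee extend smoothly during sit-to-stand
t = np.linspace(0.0, 1.0, 101)
hip = 85.0 * (1.0 - np.cos(np.pi * t)) / 2.0   # 0 -> 85 degrees
knee = 90.0 * (1.0 - np.cos(np.pi * t)) / 2.0  # 0 -> 90 degrees
m = cyclogram_metrics(hip, knee)
```

Plotting `hip` against `knee` with any 2D plotting library then yields the angle-angle loop itself; the scalar descriptors support within-subject comparison across trials.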
This study was conducted as a case series using representative results from four children to demonstrate protocol feasibility rather than statistical inference. Two typically developing children (male, 4 years old, 100 cm, 14.5 kg; male, 10 years old, 144 cm, 40.8 kg) and two children with cerebral palsy, Gross Motor Function Classification System level III (female, 4 years old, 93 cm, 13.8 kg; male, 4 years old, 102 cm, 16.7 kg), participated in this study. Parents/legal guardians and children provided informed consent/assent prior to participation in accordance with our Institutional Review Board-approved study protocol (IRB-FY2024-33).
NOTE: While this protocol describes procedures for both 3D marker-based and 2D markerless motion capture systems, the focus of this study is on the implementation of the 2D markerless approach, which represents the core clinical and practical application of this work. The core implementation of the current protocol requires a marker-based motion capture system (minimum 8 cameras) for 3D kinematic data acquisition, and a single RGB camera (minimum resolution 1920 × 1080 pixels, 50 Hz) positioned perpendicular to the sagittal plane for 2D markerless capture. The method is suitable for controlled clinical or laboratory environments with adequate lighting (> 500 lux) and minimal background clutter. To improve robustness in non-ideal environments, the vailá toolbox integrates pre-processing tools (e.g., drawbox masking, video crop/resize) that allow users to isolate the participant from background distractions or adjust suboptimal camera framing. Key limitations include reduced accuracy for the single-camera, markerless approach for children exhibiting significant out-of-plane movements, as this approach cannot capture depth information or movements outside the camera's field of view. Users should ensure participants remain within the predefined capture volume and maintain sagittal plane alignment during task execution to optimize markerless tracking performance.
1. Pre-test checks
2. Camera placement

Figure 1: Close-up of the two camera systems used in this study. One of the 12 infrared cameras composing the 3D marker-based kinematic system (left) and the single video camera used for the markerless approach (right).
3. Synchronization
NOTE: This step is only needed when using different systems concurrently. Synchronization between systems is not necessary when using only a single kinematic (2D or 3D) system.
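When concurrent systems must be synchronized without a shared hardware trigger, a common software fallback is to record a brief visible event (e.g., an LED flash) in both data streams and align them by cross-correlation. The sketch below assumes each system yields a per-frame intensity trace; the function name and synthetic signals are illustrative, not part of this protocol's required hardware setup.

```python
import numpy as np

def estimate_lag(sig_a, sig_b):
    """Estimate the frame offset of sig_b relative to sig_a by
    cross-correlating the mean-removed intensity traces.

    A positive return value means sig_b starts later than sig_a.
    """
    a = np.asarray(sig_a, float) - np.mean(sig_a)
    b = np.asarray(sig_b, float) - np.mean(sig_b)
    corr = np.correlate(a, b, mode="full")
    # In 'full' mode, index (len(b) - 1) corresponds to zero lag
    return (len(b) - 1) - int(np.argmax(corr))

# Synthetic flash: system B records the flash 12 frames after system A
flash_a = np.zeros(100); flash_a[20] = 1.0
flash_b = np.zeros(100); flash_b[32] = 1.0
lag = estimate_lag(flash_a, flash_b)  # -> 12 frames
```

Once the lag is known, one stream is trimmed (and resampled if frame rates differ) so that both recordings share a common time base before comparison.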
4. Capture volume calibration (marker-based system)
5. Marker placement (marker-based system)

Figure 2: Plug-in Gait model adapted for the present study. The standard marker set was applied to define lower limb, pelvis, trunk, and head segments. Upper limb markers were removed to focus exclusively on lower-body and trunk kinematics. This adaptation preserves the integrity of the original Plug-in Gait full-body model while excluding arm motion from the analysis.
6. Chair and environment
7. Participant instruction and safety
8. Trial acquisition
9. Data processing
NOTE: The vailá toolbox (v0.10.21)24 is cross-platform (Windows, macOS, Linux) and requires Python 3.12 or newer. The markerless analysis modules (e.g., MediaPipe25) are optimized to run efficiently on standard CPUs, ensuring accessibility in clinical environments without specialized hardware. A CUDA-enabled GPU is optional and only utilized if the user selects alternative high-speed engines (e.g., YOLO-Pose) for faster processing. For detailed execution commands, step-by-step guides, and full documentation, users are directed to the toolbox's online resources and integrated help files, available at: https://github.com/vaila-multimodaltoolbox/vaila/tree/main/docs/modules/markerless-analysis.

Figure 3: MediaPipe's 33 landmarks. For this study, we utilized only the landmarks pertinent to the sit-to-stand task, specifically landmarks 11 and 12 (shoulders), 23 and 24 (hips), 25 and 26 (knees), and 27 and 28 (ankles).
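As an illustration of how sagittal joint angles can be derived from these landmarks, the included angle at a joint follows from the 2D coordinates of three points (e.g., shoulder-hip-knee for the hip, hip-knee-ankle for the knee). The sketch below shows the underlying geometry only; the coordinates are illustrative and this is not the toolbox's exact implementation.

```python
import numpy as np

def joint_angle(p_prox, p_joint, p_dist):
    """Included angle (degrees) at p_joint between the segments
    p_joint->p_prox and p_joint->p_dist, from 2D image coordinates."""
    u = np.asarray(p_prox, float) - np.asarray(p_joint, float)
    v = np.asarray(p_dist, float) - np.asarray(p_joint, float)
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0))))

# Illustrative normalized image coordinates (y grows downward), one side:
shoulder, hip, knee, ankle = (0.50, 0.20), (0.50, 0.50), (0.50, 0.75), (0.50, 1.00)
hip_angle = joint_angle(shoulder, hip, knee)   # trunk-thigh included angle
knee_angle = joint_angle(hip, knee, ankle)     # thigh-shank included angle
# Both are ~180 degrees for this straight, upright configuration
```

Applying this per frame to the tracked landmark series produces the hip and knee angle time series used to build the cyclograms.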
The open-source Python-based toolbox successfully quantified hip-knee coordination patterns from both marker-based and markerless systems in pediatric participants during the sit-to-stand task. To provide a quantitative comparison alongside the visual cyclograms, key summary metrics (hip and knee range of motion) for the representative trials are presented in Table 1. Representative angle-angle plots from typically developing children (Figure 4A,B) exhibit coordination profiles from the markerless system that closely matched those obtained from the gold-standard 3D marker-based motion capture. This consistency confirms that the proposed method can capture normative whole-body coordination strategies with high fidelity, validating the toolbox's capacity to replicate established kinematic outcomes in controlled conditions.
Representative results for children with cerebral palsy are illustrated in Figure 4C,D. As expected, the overall coordination pattern differed from that of typically developing children, reflecting the altered motor control commonly reported in this patient population. Importantly, while the traversed range of motion was similar (e.g., marker-based from 20° to 80°, markerless from 0° to 60°), the markerless and marker-based results were not uniformly aligned across children. For one participant (Figure 4D), the markerless system failed to reproduce the same inter-joint coordination profile (Trial 2) observed with the marker-based system. This discrepancy was traced to uncontrolled movements in the frontal and transverse planes, which the 2D markerless approach could not fully capture. Despite this limitation, the markerless system still provided a recognizable, quantifiable coordination profile, highlighting the feasibility of extracting clinically relevant movement patterns even in complex motor presentations.
Together, these findings demonstrate both the strengths and limitations of the proposed method. For typically developing children, markerless tracking can yield coordination outcomes comparable to the laboratory-based gold standard, confirming the robustness of the open-source approach. In clinical populations, discrepancies can occur in cases with highly variable, multidirectional movements. However, rather than representing protocol failure, such suboptimal cases illustrate the range of outcomes that can occur when applying markerless approaches to clinical populations. The results confirm that the toolbox can reproduce expected coordination profiles in typically developing children, while also documenting interpretable but less precise outcomes in children with cerebral palsy when the movement pattern expands into distinct planes. Presenting both types of results provides a realistic view of the protocol's performance and clarifies the conditions under which the method is most effective.

Figure 4: Representative hip-knee intersegmental coordination plots for all participants. Each row corresponds to one child: (A) typically developing older child (top plots), (B) typically developing younger child, (C) female child with cerebral palsy, and (D) male child with cerebral palsy (bottom plots). For all cases, the left column displays data from the marker-based system and the right column displays data from the markerless system. In all plots, the movement pattern begins on the top portion of the graph and is completed at the bottom left. The children with cerebral palsy (C and D) demonstrated altered coordination compared with typically developing peers, consistent with expected motor impairments for this pediatric population. For panel D (Trial 2, orange line), the marker-based system identified a distinct coordination pattern that was not similarly reproduced with the markerless system. This divergent profile was due to uncontrolled out-of-plane movements not fully captured in 2D (in this case, trunk rotation in the transverse plane with the child's left shoulder rotating posteriorly), illustrating the range of possible outcomes that can be recorded with single-camera markerless protocols.
| Participant | Trial | Hip ROM (3D) | Hip ROM (2D) | Knee ROM (3D) | Knee ROM (2D) |
| --- | --- | --- | --- | --- | --- |
| TD 1 (A) | 1 | 83.4 | 85.1 | 85.2 | 87.0 |
| TD 1 (A) | 2 | 82.9 | 84.7 | 84.7 | 86.5 |
| TD 1 (A) | 3 | 84.0 | 86.2 | 85.8 | 87.9 |
| TD 2 (B) | 1 | 88.7 | 89.5 | 90.1 | 91.2 |
| TD 2 (B) | 2 | 87.6 | 88.9 | 89.0 | 90.3 |
| TD 2 (B) | 3 | 88.1 | 89.2 | 89.5 | 90.8 |
| CP 1 (C) | 1 | 90.2 | 91.5 | 92.5 | 94.0 |
| CP 1 (C) | 2 | 89.8 | 91.1 | 91.9 | 93.4 |
| CP 1 (C) | 3 | 90.5 | 91.8 | 92.8 | 94.3 |
| CP 2 (D) | 1 | 79.5 | 81.3 | 82.0 | 84.1 |
| CP 2 (D) | 2 | 81.2 | 75.9 | 84.3 | 79.8 |
| CP 2 (D) | 3 | 80.3 | 82.0 | 83.1 | 85.2 |
Table 1: Quantitative comparison of hip and knee range of motion (ROM) in degrees (°) for representative sit-to-stand trials from marker-based (3D) and markerless (2D) systems. The table uses data from Figure 4.
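The agreement between systems summarized in Table 1 can be screened programmatically: computing per-trial absolute 3D-versus-2D ROM differences and flagging large discrepancies isolates the divergent trial discussed above. The sketch below transcribes the table's values; the 3° flagging threshold is an illustrative choice, not a value prescribed by this protocol.

```python
# Hip/knee ROM values (degrees) transcribed from Table 1:
# (participant, trial, hip_3d, hip_2d, knee_3d, knee_2d)
rows = [
    ("TD 1 (A)", 1, 83.4, 85.1, 85.2, 87.0),
    ("TD 1 (A)", 2, 82.9, 84.7, 84.7, 86.5),
    ("TD 1 (A)", 3, 84.0, 86.2, 85.8, 87.9),
    ("TD 2 (B)", 1, 88.7, 89.5, 90.1, 91.2),
    ("TD 2 (B)", 2, 87.6, 88.9, 89.0, 90.3),
    ("TD 2 (B)", 3, 88.1, 89.2, 89.5, 90.8),
    ("CP 1 (C)", 1, 90.2, 91.5, 92.5, 94.0),
    ("CP 1 (C)", 2, 89.8, 91.1, 91.9, 93.4),
    ("CP 1 (C)", 3, 90.5, 91.8, 92.8, 94.3),
    ("CP 2 (D)", 1, 79.5, 81.3, 82.0, 84.1),
    ("CP 2 (D)", 2, 81.2, 75.9, 84.3, 79.8),
    ("CP 2 (D)", 3, 80.3, 82.0, 83.1, 85.2),
]

# Flag trials where either joint's |3D - 2D| ROM difference exceeds 3 degrees
flagged = [(p, t) for p, t, h3, h2, k3, k2 in rows
           if abs(h3 - h2) > 3.0 or abs(k3 - k2) > 3.0]
# Only the divergent trial (CP 2 (D), Trial 2) exceeds the threshold
```

Such a screen gives a quick, reproducible check of where single-camera markerless tracking diverged from the 3D reference before inspecting the cyclograms themselves.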
The goal of this work was to demonstrate the feasibility and application of an open-source, multimodal toolbox capable of integrating marker-based and markerless motion capture systems to analyze whole-body coordination in pediatric populations. In representative applications, the protocol successfully produced hip-knee cyclograms that were comparable between marker-based and markerless systems in typically developing children, while also highlighting similarities and discrepancies in children with cerebral palsy when movements extended out of the sagittal plane (e.g., trunk transverse rotation). We intentionally tested the toolbox on two pediatric case series because motor behavior differs across age bands (e.g., 5 vs. 10 years) and between typically developing children and children with cerebral palsy. A critical step within the protocol is the choice of reference system and recording geometry. Three-dimensional marker-based optical motion capture remains the benchmark against which alternative approaches should be evaluated26,27. Static calibration and precise temporal synchronization are essential when integrating multimodal data streams for validating across approaches28. In this study, synchronization was achieved with the 3D kinematic system's LockLab trigger; however, alternative approaches such as a visible LED flash or photodiode capture can provide robust alignment when hardware triggers are unavailable29. Camera positioning is equally critical: for sagittal-plane tasks such as sit-to-stand23, a single consumer-grade camera can yield interpretable coordination plots if placed at close range (3.0 m or closer) with orthogonal orientation to minimize occlusions30,31.
Practical modifications and troubleshooting strategies are essential to increase the robustness of 2D markerless recordings across different environments. For routine applications where only markerless data are required, critical considerations include camera placement, lighting, and verification of frame rate to avoid variable-frame-rate drift32,33. Optimizing illumination and camera angle reduces frame-to-frame jitter and improves joint detection, particularly in small children or when occlusions occur during forward leaning. Prior to extended data collection, test trials should be performed to confirm that key events of the sit-to-stand cycle are consistently captured and free from interruptions such as multi-person detections. When multimodal comparison is required, temporal synchronization must be addressed; in our study, the 3D marker-based system's setup provided a hardware trigger, but simple approaches such as a visible LED flash or audio cue can serve as practical alignment signals29. Finally, while we utilized MediaPipe for pose estimation, users should weigh the trade-off between fast, ready-to-use frameworks like OpenPose, which enable rapid deployment with consumer cameras, or customizable approaches such as DeepLabCut, which demand annotation and retraining but may yield higher accuracy in pediatric or clinical populations34. We chose MediaPipe and YOLO-Pose for their optimal combination of high accuracy (comparable to OpenPose), low computational overhead (allowing real-time processing on consumer hardware), and permissive open-source licenses, making them more accessible for broad clinical and research implementation. This selection was informed by benchmarking analyses demonstrating that MediaPipe and YOLO-Pose offer competitive pose estimation performance while requiring fewer computational resources, which is critical for environments with limited hardware capacity35,36,37.
Limitations of the method must be recognized to guide appropriate applications. Two-dimensional, sagittal-only markerless tracking cannot capture out-of-plane rotations, and in our representative results, this limitation manifested in a child with cerebral palsy, where the markerless data failed to reproduce the divergent coordination pattern captured by the marker-based system. This discrepancy highlights that while markerless pipelines can mitigate problems such as marker occlusion38 and sensory disorders8,10, they may also misinterpret movements dominated by transverse or frontal-plane components. Additional limitations include the computational demands of processing high-resolution video, the need for adequate technical expertise, and the potential for errors introduced by clothing artifacts or small body size in children. Tasks requiring multiplanar resolution or precise quantitative assessment for clinical decision-making may still necessitate multi-view or 3D motion capture solutions30.
Discrepancies in joint angle trajectories between marker-based and markerless systems, particularly in children with cerebral palsy, require careful interpretation when assessing coordination profiles. As illustrated in Figure 4D, Trial 2, the markerless system produced a divergent cyclogram relative to the marker-based system, with deviations reaching approximately 20°. These differences likely reflect uncontrolled out-of-plane posture and movements that are not captured by single-camera 2D systems. However, as shown in Table 1, the overall range of motion for hip and knee joints remained consistent across systems, with differences typically below 2°, supporting the feasibility of markerless tracking for sagittal-plane movement quantification. Importantly, the general shape and progression of coordination loops were preserved in most trials, reinforcing the interpretability of cyclograms for within-subject longitudinal tracking39. Provided that acquisition conditions are standardized and the movement of interest remains largely within a single plane, markerless systems can yield clinically meaningful coordination profiles suitable for functional assessment in pediatric populations.
With respect to existing and alternative methods, the proposed multimodal approach occupies a translational niche between laboratory-grade precision and clinical practicality. Unlike exclusive reliance on marker-based systems, this workflow allows continuity of data collection when kinematic markers cannot be tolerated8,9,10, and unlike purely markerless solutions, it retains the option of benchmark comparison. The open-source Python framework supports integration of inertial sensors, research-grade clusters, and video-based estimators within a single pipeline, thereby promoting reproducibility and method standardization. In relation to other published protocols, the critical novelty lies not in markerless tracking per se, but in providing a unified structure for synchronizing, processing, and exporting multimodal biomechanical data across heterogeneous hardware platforms40.
The potential applications of this protocol extend across research and clinical domains. For clinical biomechanics and rehabilitation, the toolbox provides a practical option for assessing whole-body coordination in children with cerebral palsy or other populations where marker placement is not feasible. The approach also enables longitudinal monitoring of functional tasks, as in repeated sit-to-stand assessments, where within-subject trends can be tracked despite variability in tolerance to different acquisition methods. Beyond the lower limb, the protocol can be adapted to upper-extremity tasks (e.g., shoulder-elbow cyclograms), provided that camera viewpoints are adjusted and task-specific calibration procedures are performed. Future directions include optimizing user interfaces, expanding tutorials, and integrating machine-learning routines for automated event detection and pattern recognition. By explicitly outlining steps, modifications, and limitations, the present protocol supports reproducible implementation across diverse settings and provides a methodological foundation for expanding clinical use of markerless biomechanics. Importantly, the protocol directly addresses a major clinical challenge: children with neurological disabilities, particularly those with cerebral palsy, often exhibit sensory processing abnormalities that limit the feasibility of marker-based motion capture. By providing a markerless alternative, this toolbox expands access to detailed kinematic evaluation for populations that are traditionally underserved in biomechanical research and rehabilitation.
The authors have nothing to disclose.
We would like to acknowledge the financial support provided (to PI: Cesar) by the 2023 Pediatric Physical Therapy Research Grant from the Foundation for Physical Therapy Research, by the MedNexus Research Innovation Fund, and by the Eunice Kennedy Shriver National Institute of Child Health and Human Development grant (1R03HD114548-01).
| Name | Company | Catalog number | Comments |
| --- | --- | --- | --- |
| FLIR Blackfly camera | FLIR Systems | Model: BFS-U3-200S6C-C | Used for 2D markerless motion capture (50 Hz, 1980×1200 px) |
| Hikvision PoE camera | Hikvision | N/A | Alternative video recording for 2D markerless capture |
| LockLab hardware trigger | Vicon | LockLab | Used for temporal synchronization |
| Logitech C920 camera | Logitech | HD Pro Webcam C920 | Alternative consumer camera for markerless capture |
| MediaPipe | Google Research | v0.10.21 | Used for real-time 2D pose estimation. Available online at https://ai.google.dev/edge/mediapipe/solutions/guide |
| OpenPose software | Carnegie Mellon University / OpenPose community | Version 1.7.0 | Used for real-time 2D pose estimation |
| Plug-in Gait full body model | Vicon | Standard model | Used as the biomechanical model for anatomical marker placement |
| Vero infrared cameras | Vicon | Vero series | Used for 3D marker-based motion capture (100 Hz) |