Head movement dynamics in dystonia: a multi-centre retrospective study using visual perceptive deep learning

Dystonia is a neurological movement disorder characterised by abnormal involuntary movements and postures, particularly affecting the head and neck. However, current clinical assessment methods for dystonia rely on simplified rating scales which lack the ability to capture the intricate spatiotemporal features of dystonic phenomena, hindering clinical management and limiting understanding of the underlying neurobiology. To address this, we developed a visual perceptive deep learning framework that utilizes standard clinical videos to comprehensively evaluate and quantify disease states and the impact of therapeutic interventions, specifically deep brain stimulation. This framework overcomes the limitations of traditional rating scales and offers an efficient and accurate method that is rater-independent for evaluating and monitoring dystonia patients. To evaluate the framework, we leveraged semi-standardized clinical video data collected in three retrospective, longitudinal cohort studies across seven academic centres. We extracted static head angle excursions for clinical validation and derived kinematic variables reflecting naturalistic head dynamics to predict dystonia severity, subtype, and neuromodulation effects. The framework was also applied to a fully independent cohort of generalised dystonia patients for comparison between dystonia sub-types. Computer vision-derived measurements of head angle excursions showed a strong correlation with clinically assigned scores. Across comparisons, we identified consistent kinematic features from full video assessments encoding information critical to disease severity, subtype, and effects of neural circuit interventions, independent of static head angle deviations used in scoring. Our visual perceptive machine learning framework reveals kinematic pathosignatures of dystonia, potentially augmenting clinical management, facilitating scientific translation, and informing personalized precision neurology approaches.


Supplementary Discussion
Evidence before this study
Clinical assessment of dystonia, a neurological movement disorder, has traditionally relied on rating scales that aim to simplify complex phenomenology into lower-dimensional rating items. However, these score-based assessments have significant clinimetric limitations and do not fully capture the rich spatiotemporal dynamics of dystonic phenomena, which are crucial for clinical judgment and pathophysiological understanding. In contrast, recent investigations in animal models of dystonia have already demonstrated the utility and relevance of quantitative methods for phenotyping, which are gradually superseding previous observer-dependent behavioural analyses. Taken together, this has led to a need for more objective and detailed clinical evaluation methods for dystonia.
We performed a PubMed search up to July 2023 combining the terms "dystonia" AND ("deep learning" OR "machine learning" OR "computer vision" OR "vision-based" OR "video-based") AND ("angle" OR "kinematic" OR "rating" OR "scoring" OR "movement analysis"), including abstracts in English or German. The search yielded three studies that validated vision-based frameworks for automating the assessment of cervical dystonia severity against clinician-annotated ratings. Two of these studies focused on deriving head angle deviations from specialised camera setups, while the third study utilised computer vision in a retrospective video dataset recorded using conventional equipment. These studies reported fair to moderately strong correlations between vision-based head angle measurements and clinical scores. Additionally, three studies investigated computer vision for assessing head tremor in the context of cervical dystonia: one case report demonstrated the clinical validity of computer vision-derived head angle and head tremor metrics, a retrospective cross-sectional study reported moderate agreement between computer vision-derived metrics and scores, and another retrospective study used computer vision to analyze the interplay of static and dynamic components of cervical dystonia. Two additional studies used computer vision-based kinematics to quantify dystonia-like phenomena in rodent models of monogenetic dystonia, demonstrating utility in both phenotype and genotype predictions.
However, most of the clinical studies were limited to static task conditions, where patients attempted to hold a neutral position of the head, thus not providing a naturalistic account of dystonia. Moreover, beyond head angular deviations and oscillation metrics, no study explored a broader kinematic feature space that reflects the true spatiotemporal complexity of dystonic movements. Additionally, the studies assessed patients at single time points without considering different therapy conditions, particularly the effects of deep brain stimulation, which is a highly effective intervention targeting brain circuits. Nor did they compare dystonia sub-types, such as cervical and generalised dystonia.

Added value of this study
In this study, we present a comprehensive visual perceptive deep learning framework that addresses the gaps in current dystonia assessments. We use this framework to retrospectively analyse a unique dataset from three multi-centre studies encompassing video examinations of patients along the dystonic severity continuum, including different deep brain stimulation states. Our framework goes beyond the automation of suboptimal symptom severity assessments by reverse engineering a set of clinically inspired kinematic features. The resulting high-dimensional yet intuitively interpretable kinematic feature space enabled us to explore disease states and effects of brain circuit therapies at a level of detail comparable to experimental neuroscientific investigations. Through a data-driven approach, we identified a consistent set of only four dynamic parameters that encode dystonia severity, subtype, and the efficacy of brain circuit interventions. Notably, these features are independent of static head angle deviations, which play a central role in dystonia severity scores, pointing to the involvement of partially distinct neurobiological processes not captured by these scores. Our findings align with emerging concepts of symptom-specific brain circuits and findings in rodent models of dystonia, thereby exemplifying the visual perceptive framework's potential to augment clinical management and bridge translational gaps in movement disorders research. By providing a more comprehensive and precise assessment of the disorder, our study offers valuable insights for improved treatment strategies and further understanding of dystonia's complex neurobiology.

Implications of all the available evidence
The available evidence collectively underscores the limitations of traditional rating scales in capturing the informative spatiotemporal dynamics of dystonic movements, emphasizing the need for more objective and granular evaluation methods. In line with recent animal studies using computer vision for dystonia quantification, recent clinical studies have shown the potential of computer vision-based frameworks in automating cervical dystonia severity assessment and capturing head tremor metrics.
However, their underlying study designs may inadvertently reinforce limitations associated with the clinical scoring process.
In this study, we introduce a comprehensive visual perceptive deep learning framework that serves as a powerful platform to augment clinical judgement and generate valuable pathophysiological insights by extracting a set of clinically inspired, interpretable kinematic features. Our findings have implications beyond dystonia, showcasing the utility of visual perceptive frameworks in enhancing clinical management and fostering integration with advanced neuroimaging and neurotechnological methods. This study opens doors for future translational research to explore the broader application of computer vision and deep learning techniques to derive kinematic signatures of movement disorders across species and experimental conditions, promising more precise and personalised assessments that can significantly improve therapeutic strategies and patient outcomes.

Model variable interpretation
We remind the reader that Mediapipe was used to detect a face mesh, enabling us to determine the head angles along three axes of motion: torticollis (rotation), laterocollis (tilt), and antero-retrocollis (forward and backward). The custom-trained CNN was used to predict the movement state, e.g., rotation left or tilt right. Predictions were made for each frame of the video, after which we engineered features that capture the patient's movement dynamics. Below we provide more details on the interpretation of the features derived from the two deep learning tools.
Head angles (derived from Mediapipe face-mesh tracking):
• Angle torticollis: Head-angle deviation from face-forward in the yaw axis when a patient is sitting in a neutral position. Positive angle = right rotation, negative angle = left rotation.
• Angle laterocollis: Head-angle deviation from face-forward in the tilt axis when a patient is sitting in a neutral position. Positive angle = right tilt, negative angle = left tilt.
• Angle antero/retrocollis: Head-angle deviation from face-forward in the antero/retrocollis axis when a patient is sitting in a neutral position. Positive angle = anterocollis, negative angle = retrocollis.
Correlation features (derived from custom CNN movement state prediction):
• Correlation movement mean: Mean correlation coefficient between all predicted movement states.
• Correlation mean face forward: Mean correlation coefficient of each movement state with the face-forward movement state.
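The two correlation features can be illustrated with a minimal sketch. It assumes that the CNN yields one score per movement state for every frame, arranged as an array of shape (frames, states), and that Pearson correlation is used; neither detail is specified above, and the function name, the argument names, and the column index of the face-forward state are hypothetical.

```python
import numpy as np

def correlation_features(state_scores, face_forward_idx=0):
    """Summarise pairwise correlations between per-frame movement-state scores.

    state_scores: array of shape (n_frames, n_states); hypothetical layout.
    face_forward_idx: column holding the face-forward state (assumption).
    """
    scores = np.asarray(state_scores, dtype=float)
    # Pearson correlation matrix between state columns, shape (n_states, n_states)
    corr = np.corrcoef(scores, rowvar=False)
    n_states = corr.shape[0]
    # Mean over all distinct state pairs (diagonal self-correlations excluded)
    off_diagonal = corr[~np.eye(n_states, dtype=bool)]
    # Correlation of every other state with the face-forward state
    with_face_forward = np.delete(corr[face_forward_idx], face_forward_idx)
    return {
        "correlation_movement_mean": float(off_diagonal.mean()),
        "correlation_mean_face_forward": float(with_face_forward.mean()),
    }

# Toy example: three states over 50 frames
x = np.linspace(0.1, 0.9, 50)
toy_scores = np.column_stack([x, 1.0 - x, x])
toy_features = correlation_features(toy_scores, face_forward_idx=0)
```

A state whose score is constant over the whole video has zero variance, so its correlations are undefined (NaN); a real pipeline would need to handle such states explicitly.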
Head oscillations (derived from head angles, i.e., Mediapipe):
• Oscillation amplitude: The amplitude of the largest peak in a Fourier transform of the angles, computed for each axis separately.
• Oscillation frequency: The frequency of the largest peak in a Fourier transform of the angles, computed for each axis separately.
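As an illustration of the oscillation features, the following sketch picks the largest spectral peak of a single-axis head-angle trace. Subtracting the mean before the transform (so that a static head deviation does not register as the largest peak), the amplitude scaling, and the names `dominant_oscillation`, `angles_deg`, and `fps` are assumptions made for this example rather than details reported above.

```python
import numpy as np

def dominant_oscillation(angles_deg, fps):
    """Return (frequency in Hz, amplitude in degrees) of the largest spectral peak.

    angles_deg: head angle per frame for one axis; fps: video frame rate.
    """
    x = np.asarray(angles_deg, dtype=float)
    x = x - x.mean()                       # remove the static offset (assumption)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    # Scale so that a pure sinusoid of amplitude A gives a peak of height A
    amplitude = 2.0 * np.abs(spectrum) / x.size
    peak = 1 + int(np.argmax(amplitude[1:]))   # skip the zero-frequency bin
    return float(freqs[peak]), float(amplitude[peak])

# Toy example: 10 s at 30 fps, 5 degree static offset plus a 4 Hz, 3 degree oscillation
fps = 30
t = np.arange(300) / fps
yaw = 5.0 + 3.0 * np.sin(2.0 * np.pi * 4.0 * t)
peak_freq, peak_amp = dominant_oscillation(yaw, fps)
```

The function would be called once per axis (torticollis, laterocollis, antero/retrocollis). The frequency resolution is the frame rate divided by the number of frames, so short clips yield coarse frequency estimates.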
Symmetry features (derived from custom CNN movement state prediction):
• Symmetry rotation: Proportion of time the head was oriented in one direction compared to the opposite direction for the rotation states (left or right).
• Symmetry tilt: Proportion of time the head was oriented in one direction compared to the opposite direction for the tilt states (left or right).
• Symmetry antero/retrocollis: Proportion of time the head was oriented in one direction compared to the opposite direction for the antero/retrocollis states (forward or backward).
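The symmetry features can be sketched from per-frame state labels as follows. The definition above does not fix whether the proportion is a ratio or a fraction; this example assumes the fraction of frames in one direction out of all frames in either direction, so that 0.5 indicates a symmetric distribution. The label strings and the function name are hypothetical.

```python
from collections import Counter

def symmetry(frame_states, state_a, state_b):
    """Fraction of frames in state_a among frames in state_a or state_b.

    frame_states: sequence with one predicted movement-state label per frame.
    Returns None when neither state occurs in the video.
    """
    counts = Counter(frame_states)
    n_a, n_b = counts[state_a], counts[state_b]
    if n_a + n_b == 0:
        return None
    return n_a / (n_a + n_b)

# Toy example: 100 frames with hypothetical label names
labels = ["rotation_left"] * 30 + ["rotation_right"] * 10 + ["face_forward"] * 60
symmetry_rotation = symmetry(labels, "rotation_left", "rotation_right")
symmetry_tilt = symmetry(labels, "tilt_left", "tilt_right")
```

Under this convention, the toy video spends three quarters of its rotation time turned to the left, and the tilt symmetry is undefined because no tilt state was predicted.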
Harmonics (derived from head angles, i.e., Mediapipe):
The harmonic strengths were determined using the head angles for each axis of motion separately. The harmonic strength was defined as the distance correlation between the fundamental tremor frequency and its first harmonic (double the fundamental frequency).
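The description of the harmonic strength leaves open how the two frequency components are extracted before the distance correlation is computed. The sketch below shows one possible reading, not necessarily the procedure used in this study: narrow-band components around the fundamental frequency and around twice that frequency are isolated by masking the Fourier spectrum, and the sample distance correlation of the two resulting time series is returned. The band half-width and all function names are assumptions.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D series of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])    # pairwise distances within x
    b = np.abs(y[:, None] - y[None, :])    # pairwise distances within y
    # Double-centre both distance matrices
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()                 # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0                         # at least one series is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def band_component(x, fps, centre_hz, half_width_hz=0.5):
    """Keep only spectral content within half_width_hz of centre_hz (assumed filter)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    spectrum[np.abs(freqs - centre_hz) > half_width_hz] = 0.0
    return np.fft.irfft(spectrum, n=x.size)

def harmonic_strength(angles_deg, fps, fundamental_hz):
    """Distance correlation between the fundamental and double-frequency components."""
    fundamental = band_component(angles_deg, fps, fundamental_hz)
    harmonic = band_component(angles_deg, fps, 2.0 * fundamental_hz)
    return distance_correlation(fundamental, harmonic)

# Toy example: 4 Hz tremor with a weaker 8 Hz component, 10 s at 30 fps
fps = 30
t = np.arange(300) / fps
trace = 3.0 * np.sin(2 * np.pi * 4.0 * t) + 1.0 * np.sin(2 * np.pi * 8.0 * t)
strength = harmonic_strength(trace, fps, fundamental_hz=4.0)
self_dcor = distance_correlation(trace, trace)
```

Distance correlation lies between 0 and 1 and is unaffected by rescaling either series. One consequence is that, when a trace contains no real content at the double frequency, the value reflects numerical noise rather than signal, so a practical implementation would likely need a minimum-power check. The pairwise distance matrices grow quadratically with the number of frames, which is manageable for short clinical videos but may require subsampling for long recordings.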