Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis

Stegmann, Gabriela M.; Hahn, Shira; Liss, Julie; Shefner, Jeremy; Rutkove, Seward; Shelton, Kerisa; Duncan, Cayla Jessica; Berisha, Visar

doi:10.1038/s41746-020-00335-x

Brief Communication
Open access
Published: 13 October 2020

Gabriela M. Stegmann ORCID: orcid.org/0000-0002-0542-1109^1,2,
Shira Hahn^1,2,
Julie Liss^1,2,
Jeremy Shefner³,
Seward Rutkove⁴,
Kerisa Shelton³,
Cayla Jessica Duncan³ &
…
Visar Berisha^1,2

npj Digital Medicine volume 3, Article number: 132 (2020)

An Author Correction to this article was published on 20 November 2020

This article has been updated

Abstract

Bulbar deterioration in amyotrophic lateral sclerosis (ALS) is a devastating characteristic that impairs patients’ ability to communicate, and is linked to shorter survival. The existing clinical instruments for assessing bulbar function lack sensitivity to early changes. In this paper, using a cohort of N = 65 ALS patients who provided regular speech samples for 3–9 months, we demonstrated that it is possible to remotely detect early speech changes and track speech progression in ALS via automated algorithmic assessment of speech collected digitally.

Introduction

Amyotrophic lateral sclerosis (ALS) is characterized by a progressive loss of motor function due to central nervous system damage and loss of spinal and bulbar motor neurons. ALS causes individuals to become progressively weaker and lose motor function, eventually resulting in death. Social and economic consequences of ALS include cost of care for the patients, loss of employment, and cost of treatment, medications, and orthopedic devices^1,2,3. Bulbar deterioration is particularly devastating, impairing the ability to communicate, leading to faster decline, shorter survival (less than 2 years from diagnosis), and reduced quality of life^4,5,6. Studies have found that while 30% of individuals in the population present with bulbar symptoms at the onset of ALS, most ALS patients eventually develop them and lose their ability to speak and swallow safely⁷.

The standard ways of assessing bulbar dysfunction are the ALS functional rating scale-revised (ALSFRS-R) and, less commonly, the Center for Neurologic Study Bulbar Function Scale (CNS-BFS)⁸. Both instruments, however, lack sensitivity to early bulbar changes⁹. Several studies have found that speech features, such as jitter, shimmer, articulatory rate, speaking rate, and pause rate, are affected in ALS^10,11, and that these can be measured from remotely-collected speech samples^12,13. However, no study has assessed the sensitivity of remote speech analysis in detecting and tracking bulbar change. In this study, we assessed speech features digitally and evaluated their sensitivity to detecting early changes and tracking progression.

We defined early changes as speech changes that occurred before any changes in the ALSFRS-R bulbar subscales. We defined sensitive tracking as the ability to detect longitudinal within-person changes in speech. We used a cohort of healthy controls and ALS patients from ALS at Home¹⁴, a longitudinal, observational study that was conducted entirely remotely. Participants were recruited, screened, enrolled, and assessed daily from home. Speech was collected via a mobile application and assessed through automated speech analysis. Although it is possible to analyze a large number of speech features, we focused on articulatory precision (AP) and speaking rate (SR) as they relate to articulation and rate, both of which are known to decline in dysarthria¹⁵ secondary to ALS. We evaluated whether the automatic analysis of remotely-collected speech could (1) detect early speech changes and (2) sensitively track speech changes longitudinally.

The ALS sample was divided according to the following categories:

Impairment category: We identified participants who had normal function according to ALSFRS-R bulbar subscales (speech, salivation, and swallowing subscales with score = 4) at the beginning of the study. Twelve participants had normal bulbar function and the other participants had impaired bulbar function. This sample was used to test whether AP and SR significantly differed between the normal bulbar function group and the healthy controls, thus evaluating their ability to detect early changes.
Onset category: Type of onset was collected from participants. Twelve ALS participants initially presented with bulbar onset, while the other 52 participants presented with other types of onset (nonbulbar onset). The nonbulbar-onset group included participants with axial, limb, and generalized onset. This sample was used to compare the SR and AP longitudinal trajectories of individuals according to their type of onset (bulbar and nonbulbar onset). We expected that bulbar-onset participants would exhibit faster speech decline, and thus used onset type to evaluate whether AP and SR were sensitive to these differences in speech decline.
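The two groupings above can be illustrated with a minimal Python sketch. It is not the authors' code: the `Participant` record, its field names, and the onset labels are hypothetical, and the only rules taken from the text are that normal bulbar function means a score of 4 on the speech, salivation, and swallowing subscales at the start of the study, and that every onset type other than bulbar counts as nonbulbar.

```python
from dataclasses import dataclass

NORMAL_SCORE = 4  # ALSFRS-R subscale score treated as normal function in the text


@dataclass
class Participant:
    # Hypothetical record layout, used only for illustration.
    pid: str
    speech: int      # ALSFRS-R speech subscale at study start
    salivation: int  # ALSFRS-R salivation subscale at study start
    swallowing: int  # ALSFRS-R swallowing subscale at study start
    onset: str       # reported onset type, e.g. "bulbar", "limb", "axial", "generalized"


def impairment_category(p: Participant) -> str:
    """Return 'normal' only if all three bulbar subscales equal 4, else 'impaired'."""
    items = (p.speech, p.salivation, p.swallowing)
    return "normal" if all(score == NORMAL_SCORE for score in items) else "impaired"


def onset_category(p: Participant) -> str:
    """Collapse every onset type other than bulbar into 'nonbulbar'."""
    return "bulbar" if p.onset.strip().lower() == "bulbar" else "nonbulbar"
```

A participant can fall into any combination of the two categories, for example nonbulbar onset with impaired bulbar function.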

Results

Description of sample

Tables 1 and 2 show the descriptive statistics of the sample, including their demographics and ALS severity. The ALSFRS-R speech, ALSFRS-R bulbar, SR, and AP scores all indicate that the most severe group in terms of bulbar symptoms was the ALS participants with bulbar onset, followed by ALS participants with bulbar impairment. Overall, lower scores in AP and SR were associated with greater impairment in speech (mixed-effects correlations¹⁶ between the ALSFRS-R speech subscale and AP and SR were r = 0.73 and r = 0.64, respectively). Figure 1 shows the distributions of the AP and SR scores for healthy, ALS with normal bulbar function, impaired bulbar function, bulbar onset, and nonbulbar onset participants.

Table 1 Sample description (enrollment).

Table 2 Sample Description (by group).

Fig. 1: Boxplots for healthy controls, ALS participants with no bulbar impairment (normal ALSFRS-R scores for speech, swallowing, and salivation), ALS participants with bulbar impairment (at least one score below 4 in the ALSFRS-R scores for speech, swallowing, and salivation), and ALS participants with bulbar-onset (type of ALS onset).

Analyses

Three sets of analyses were conducted. First, to evaluate whether declines in AP and SR occurred earlier than declines on the ALSFRS-R bulbar subscale, we compared the healthy individuals to ALS individuals with normal bulbar function. If participants started the study with normal bulbar function but their ALSFRS-R bulbar scores declined throughout the study due to ALS progression, we only used their data before the decline began. Both AP and SR were significantly higher in the healthy individuals than in the ALS individuals with normal bulbar function (see top section of Table 3), indicating that AP and SR decline was detected earlier than declines on the ALSFRS-R bulbar subscale.
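The rule of using only data collected before a participant's ALSFRS-R bulbar scores began to decline can be sketched as follows. This is an illustrative reading rather than the study's implementation: it assumes that "decline" means the first weekly rating in which any of the three bulbar subscales falls below 4, and the tuple layouts are hypothetical.

```python
def first_decline_day(bulbar_scores):
    """Return the day of the first rating with any bulbar subscale below 4.

    bulbar_scores: iterable of (day, speech, salivation, swallowing) tuples.
    Returns None if no rating shows a decline (assumed definition of decline).
    """
    for day, *items in sorted(bulbar_scores):
        if any(score < 4 for score in items):
            return day
    return None


def pre_decline_samples(speech_samples, bulbar_scores):
    """Keep only speech samples recorded strictly before the first decline.

    speech_samples: list of (day, value) tuples, e.g. AP or SR scores.
    """
    cutoff = first_decline_day(bulbar_scores)
    if cutoff is None:
        return list(speech_samples)
    return [(day, value) for day, value in speech_samples if day < cutoff]
```

With weekly ratings that first drop on day 14, only speech samples from days 0 to 13 would enter the comparison against healthy controls.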

Table 3 Results from all analyses.

Second, we further evaluated the validity of AP and SR as a measure of speech decline in ALS by comparing the scores in healthy controls and all ALS participants regardless of bulbar impairment or onset. AP and SR were significantly higher in healthy participants than all ALS participants regardless of onset or impairment (middle section of Table 3), strengthening the evidence that these two measures can detect ALS speech impairment.

Third, we evaluated the sensitivity of AP and SR to detect longitudinal within-person changes in speech. We used a growth curve model¹⁷ (GCM), which is a mixed-effects model that estimates the longitudinal trajectory of an outcome for a sample of participants with multiple observations over time. We compared the rates of decline between the bulbar-onset and nonbulbar-onset participants, expecting that bulbar-onset participants would have steeper speech decline than nonbulbar-onset participants. The time variable was the number of days since the onset of the first symptom. For both features, the final GCM¹⁷ followed a linear trajectory, had a random intercept and random slope, and had distinct mean slopes for bulbar-onset and nonbulbar-onset participants. For AP, both groups had significantly negative mean slopes, such that AP decreased as ALS progressed. However, bulbar-onset participants declined more rapidly, as their mean slope was significantly more negative than the mean slope for nonbulbar-onset participants. For SR, the decline over time in nonbulbar-onset participants was nonsignificant (mean slope not significantly different from 0), whereas the bulbar-onset group showed significant decline (the mean slope was negative and significantly lower than that of the nonbulbar-onset group). The longitudinal plots are shown in Fig. 2, and the GCM parameters are in the bottom section of Table 3.
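One way to write down a linear growth curve model with a random intercept, a random slope, and group-specific mean slopes is given below. The notation is an assumption introduced here for illustration and does not come from the paper: $y_{ij}$ is the AP or SR score of participant $i$ at observation $j$, $t_{ij}$ is the number of days since first symptom onset, and $B_i$ is 1 for bulbar onset and 0 otherwise. The text does not state whether the mean intercept also differed by group, so that term is omitted.

```latex
\begin{aligned}
y_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + \beta_2 B_i + b_{1i})\, t_{ij} + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0},\, \Sigma), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma^2)
\end{aligned}
```

Under this parameterization, $\beta_1$ is the mean slope for nonbulbar-onset participants, $\beta_1 + \beta_2$ is the mean slope for bulbar-onset participants, and a significantly negative $\beta_2$ corresponds to faster decline in the bulbar-onset group.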

**Fig. 2: Articulatory precision (AP) and speaking rate (SR) scores as a function of number of days since date of ALS onset.**

Discussion

In this study, we have identified two objective speech metrics that detected bulbar impairment before the ALSFRS-R bulbar subscale, sensitively tracked longitudinal decline, and could be assessed from remotely collected speech samples via a mobile app. They were consistent with both cross-sectional and longitudinal expectations: cross-sectionally, healthy participants had the highest SR and AP, followed by ALS participants with no bulbar impairment, and finally followed by all ALS participants, including those with bulbar impairment. Furthermore, the analyses were repeated controlling for time of day, age, and gender, and the results remained consistent. Longitudinally, bulbar-onset ALS participants declined faster in SR and AP than nonbulbar-onset ALS participants. This represents a unique opportunity for earlier and more sensitive identification and remote tracking of bulbar impairment than is currently available.

The ability to digitally detect early changes and sensitively track progression has important implications for personal planning and for research. Such information is valuable to the patient, family, and medical staff to inform life planning decisions, such as making necessary work and family decisions while speech is still intelligible, deciding on the timing of therapeutic interventions, and obtaining augmentative and alternative communication technology¹⁸. These objective measures are also useful for ALS clinical trials as they can be used to provide valuable information about disease progression, determine enrollment, stratify participants, and appropriately power a study¹⁹. Furthermore, the ability to remotely assess participants in a study has the additional benefit of reducing participant burden, reducing attrition, and enrolling individuals who would otherwise not be able to participate, such as those with transportation or ambulation challenges.

One limitation of the study was that participant information such as cognitive function, drinking, smoking, vision problems, medications, ability to read, or other health problems was not available, and therefore we were not able to explore these as potential confounders. However, given the consistency of the results, we do not expect that controlling for these additional variables would lead to a different conclusion, although a prospective study is needed to confirm this. Other limitations of remote assessment include misperformance of tasks, for example, reading a sentence incorrectly. We screened for this by automatic QA on all samples and random manual QA on a subset of samples.

Methods

Sample

The study was approved by the institutional review board at Barrow Neurological Institute. All participants provided written informed consent to participate in the study. Participants from ALS at Home provided daily speech samples for 3 months, twice weekly for an additional 6 months, and ALSFRS-R scores on a weekly basis. Participants were allowed to receive assistance from their caregivers if needed. In the current analysis, we included only participants who were enrolled for at least 45 days, so that the sample comprised participants who were engaged in the study and excluded those who dropped out early. This resulted in 21 healthy participants and 65 participants with ALS.

Speech collection and analysis

Speech samples were collected remotely via a mobile application²⁰, where participants were requested to complete a series of speech elicitation tasks, including readings of five sentences. The instructions, including the sentences, were provided in the application, and participants read the text from the application. The same text was shown each day to all participants. Figure 3 shows a screenshot of the app. Speech was recorded locally on the participants’ phones, uploaded to a separate cloud-based repository, saved as a .wav file, and algorithmically analyzed on the cloud. Participants were requested to make the recordings from a quiet room, and ambient noise was recorded for 5 s and used in the speech analysis.

**Fig. 3: Screenshot of the mobile application.**

The speech obtained from the five sentences was used to extract SR and AP^14,20. SR is a measure of how fast participants read the sentences. It is computed by dividing the number of syllables in the target sentences by the automatically estimated total speech time. To determine the speech onset and offset times, we used a statistical model-based voice activity detector similar to the one described in Sohn et al.²¹. This model uses spectral and energy features extracted from the collected background noise sample to identify an optimal speech detection threshold. The total speech time is the time elapsed from speech onset to speech offset. The number of syllables is known because the participant is asked to read specific sentences.

AP is a measure of the match between the expected and observed acoustic features for each phoneme. The algorithm, an extension of existing work²², takes as input connected speech, elicited from the speaker via the mobile app, and the corresponding transcript. The algorithm assesses how well the acoustics of each phoneme correspond to the acoustics of the expected phoneme in spoken English. This assessment is made by creating a distribution of acoustic features for every English phoneme from a large corpus of read speech (~1000 h) in American English. We then calculate a likelihood ratio from a comparison between the acoustic features extracted from each phoneme in the speech collected by the app and the normative distribution for the expected phoneme. For ease of interpretation, articulatory precision was projected onto a 0–10 scale (higher scores are indicative of more precise articulation).
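The speaking-rate calculation described above (known syllable count divided by the time from speech onset to speech offset) can be sketched in a few lines of Python. The onset and offset detector here is a deliberately simple frame-energy threshold derived from the background-noise recording. It is a stand-in for illustration only and is not the statistical model-based detector of Sohn et al.; the function names, the 20 ms frame length, and the `margin` factor are assumptions.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_bounds(samples, noise, sample_rate, frame_s=0.02, margin=4.0):
    """Estimate (onset_s, offset_s) with a noise-derived energy threshold.

    Simplified stand-in for a voice activity detector: a frame counts as
    speech if its energy exceeds `margin` times the mean noise-frame energy.
    Returns None if no frame exceeds the threshold.
    """
    frame_len = max(1, round(sample_rate * frame_s))
    noise_e = frame_energies(noise, frame_len)
    threshold = margin * sum(noise_e) / len(noise_e)
    active = [i for i, e in enumerate(frame_energies(samples, frame_len))
              if e > threshold]
    if not active:
        return None
    onset_s = active[0] * frame_len / sample_rate
    offset_s = (active[-1] + 1) * frame_len / sample_rate
    return onset_s, offset_s


def speaking_rate(n_syllables, onset_s, offset_s):
    """Syllables per second between speech onset and speech offset."""
    duration = offset_s - onset_s
    if duration <= 0:
        raise ValueError("speech offset must come after speech onset")
    return n_syllables / duration
```

Because the sentences are fixed, `n_syllables` is a constant per task, so day-to-day variation in the rate comes entirely from the estimated speech duration, which includes any pauses between onset and offset.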

Statistical analyses

Given that each participant had repeated observations, the analysis necessitated mixed-effects models, where fixed-effects parameters were used for estimating the mean difference between the two groups and the mean trajectories. All analyses were performed in R. The packages lme4²³ and nlme²⁴ were used, since these two are widely used R packages to estimate mixed-effects models.
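As an illustration of how a fixed-effects parameter captures the mean difference between two groups when each participant contributes repeated observations, one common specification is a random-intercept model. The notation is an assumption made here, not taken from the paper, and the random-effects structure actually used for the group comparisons is not stated in the text: $y_{ij}$ is the AP or SR score of participant $i$ at observation $j$, and $G_i$ is 1 for one group (for example, ALS) and 0 for the other (for example, healthy controls).

```latex
y_{ij} = \beta_0 + \beta_1 G_i + b_i + \varepsilon_{ij}, \qquad
b_i \sim \mathcal{N}(0,\, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma^2)
```

Here $\beta_1$ is the mean difference between the groups, and the participant-level term $b_i$ accounts for the correlation among repeated observations from the same person.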

Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Code availability

All analyses were conducted in R. The code is available from the corresponding author upon request.

Change history

20 November 2020
A Correction to this paper has been published: https://doi.org/10.1038/s41746-020-00364-6.

References

1. López-Bastida, J., Perestelo-Pérez, L., Montón-Álvarez, F., Serrano-Aguilar, P. & Alfonso-Sanchez, J. L. Social economic costs and health-related quality of life in patients with amyotrophic lateral sclerosis in Spain. Amyotroph. Lateral Scler. 10, 237–243 (2009).
2. Jennum, P., Ibsen, R., Pedersen, S. W. & Kjellberg, J. Mortality, health, social and economic consequences of amyotrophic lateral sclerosis: a controlled national study. J. Neurol. 260, 785–793 (2013).
3. Oh, J. et al. Socioeconomic costs of amyotrophic lateral sclerosis according to staging system. Amyotroph. Lateral Scler. Frontotemporal Degener. 16, 202–208 (2015).
4. Shellikeri, S. et al. The neuropathological signature of bulbar-onset ALS: a systematic review. Neurosci. Biobehav. Rev. 75, 378–392 (2017).
5. del Aguila, M. A., Longstreth, W. T., McGuire, V., Koepsell, T. D. & van Belle, G. Prognosis in amyotrophic lateral sclerosis: a population-based study. Neurology 60, 813–819 (2003).
6. Makkonen, T., Ruottinen, H., Puhto, R., Helminen, M. & Palmio, J. Speech deterioration in amyotrophic lateral sclerosis (ALS) after manifestation of bulbar symptoms. Int. J. Lang. Commun. Disord. 53, 385–392 (2018).
7. Green, J. R. et al. Bulbar and speech motor assessment in ALS: challenges and future directions. Amyotroph. Lateral Scler. Frontotemporal Degener. 14, 494–500 (2013).
8. Smith, R. A. et al. Assessment of bulbar function in amyotrophic lateral sclerosis: validation of a self-report scale (Center for Neurologic Study Bulbar Function Scale). Eur. J. Neurol. 25, 907–e66 (2018).
9. Yunusova, Y., Plowman, E. K., Green, J. R., Barnett, C. & Bede, P. Clinical measures of bulbar dysfunction in ALS. Front. Neurol. 10, 1–11 (2019).
10. Chiaramonte, M. & Bonfiglio, M. Acoustic analysis of voice in bulbar amyotrophic lateral sclerosis: a systematic review and meta-analysis of studies. Logop. Phoniatr. Vocol. 22, 1–13 (2019).
11. Vieira, H., Costa, N., Sousa, T., Reis, S. & Coelho, L. Voice-based classification of amyotrophic lateral sclerosis: where are we and where are we going? A systematic review. Neurodegener. Disord. 19, 163–170 (2019).
12. Connaghan, K. P. et al. Use of Beiwe smartphone app to identify and track speech decline in amyotrophic lateral sclerosis (ALS). In: Interspeech 2019, ISCA 4504–4508 (2019).
13. Arora, S. et al. Detecting and monitoring the symptoms of Parkinson’s disease using smartphones: a pilot study. Parkinsonism Relat. Disord. 21, 650–653 (2015).
14. Rutkove, S. B. et al. ALS longitudinal studies with frequent data collection at home: study design and baseline data. Amyotroph. Lateral Scler. Frontotemporal Degener. 20, 61–67 (2019).
15. Enderby, P. Handbook of Clinical Neurology, Vol. 110. 273–281 (Elsevier, Amsterdam, 2013).
16. Lorah, J. Effect size measures for multilevel models: definition, interpretation, TIMSS example. Large Scale Assess. Educ. 6, 1–11 (2018).
17. Grimm, K. J., Ram, N. & Estabrook, R. Growth Modeling: Structural Equation and Multilevel Modeling Approaches (Guilford, New York, 2017).
18. Ball, L., Beukelman, D. & Pattee, G. Timing of speech deterioration in people with amyotrophic lateral sclerosis. J. Med. Speech Lang. Pathol. 10, 231–235 (2002).
19. Chiò, A. et al. Prognostic factors in ALS: a critical review. Amyotroph. Lateral Scler. 10, 310–323 (2009).
20. Aural Analytics. ALS at Home—Speech. 2016. https://apps.apple.com/in/app/als-at-home-speech/id1169813257 (2016).
21. Sohn, J., Kim, N. & Sung, W. A statistical model-based voice activity detection. IEEE Signal Process. Lett. 6, 1–3 (1999).
22. Jiao, Y. et al. Articulation entropy: an unsupervised measure of articulatory precision. IEEE Signal Process. Lett. 24, 485–489 (2017).
23. Bates, D., Maechler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
24. Pinheiro, J., Bates, D., DebRoy, S., Sarkar, D. & R Core Team. nlme: Linear and Nonlinear Mixed Effects Models. https://CRAN.R-project.org/package=nlme (2019).

Acknowledgements

This work was supported by NIH SBIR (1R43DC017625-01), NSF SBIR (1853247), NIH R01 (5R01DC006859-13), and ALS Finding a Cure Grant.

Author information

Authors and Affiliations

Arizona State University, Phoenix, AZ, USA
Gabriela M. Stegmann, Shira Hahn, Julie Liss & Visar Berisha
Aural Analytics, Scottsdale, AZ, USA
Gabriela M. Stegmann, Shira Hahn, Julie Liss & Visar Berisha
Barrow Neurological Institute, Phoenix, AZ, USA
Jeremy Shefner, Kerisa Shelton & Cayla Jessica Duncan
Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Seward Rutkove

Contributions

G.S.: statistical analyses; led the writing of the paper. S.H.: provided expertise in which speech features to measure, helped with writing and editing, provided input in statistical analyses. J.L. and V.B.: speech study design, helped with writing and editing, provided input in statistical analyses. J.S. and S.R.: ALS study conception and supervision of the study. K.S. and C.J.D.: study execution, data management, and preliminary analysis.

Corresponding author

Correspondence to Gabriela M. Stegmann.

Ethics declarations

Competing interests

V.B. and J.L. are co-founders of Aural Analytics. J.S. is a scientific advisor to Aural Analytics. G.S. and S.H. are employed by Aural Analytics. This work was supported by NIH SBIR (1R43DC017625-01), NSF SBIR (1853247), and NIH R01 (5R01DC006859-13).

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Stegmann, G.M., Hahn, S., Liss, J. et al. Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis. npj Digit. Med. 3, 132 (2020). https://doi.org/10.1038/s41746-020-00335-x

Received: 19 May 2020
Accepted: 17 September 2020
Published: 13 October 2020
DOI: https://doi.org/10.1038/s41746-020-00335-x
