Automated screening for Fragile X premutation carriers based on linguistic and cognitive computational phenotypes

Millions of people globally are at high risk for neurodegenerative disorders, infertility or having children with a disability as a result of the Fragile X (FX) premutation, a genetic abnormality in FMR1 that is underdiagnosed. Despite the high prevalence of the FX premutation and its effect on public health and family planning, most FX premutation carriers are unaware of their condition. Since genetic testing for the premutation is resource intensive, it is not practical to screen individuals for FX premutation status using genetic testing. In a novel approach to phenotyping, we have utilized audio recordings and cognitive profiling assessed via self-administered questionnaires on 200 females. Machine-learning methods were developed to discriminate FX premutation carriers from mothers of children with autism spectrum disorders, the comparison group. By using a random forest classifier, FX premutation carriers could be identified in an automated fashion with high precision and recall (0.81 F1 score). Linguistic and cognitive phenotypes that were highly associated with FX premutation carriers were high language dysfluency, poor ability to organize material, and low self-monitoring. Our framework sets the foundation for computational phenotyping strategies to pre-screen large populations for this genetic variant with nominal costs.

The supplementary materials include:
Fig. S1. Traditional workflow to create linguistic profiles.
Fig. S2. Distribution of linguistic features for FX premutation carriers and the comparison group.
Fig. S3. Distribution of cognitive features for FX premutation carriers and the comparison group.
Fig. S4. Performance of random forest classifier using transcripts of different lengths as the input.
Table S1. Description of some standard linguistic features that had zero information gain.
Table S2. Linguistic features for the language sample in Supplementary Text.
Table S3. Comparison of automated feature extraction module and manual SALT methods.
Table S4. Group differences in the linguistic profile between FX premutation carriers and comparison group.
Table S5. Description of cognitive features (BRIEF-A).
Table S6. Features within in-person and over-the-phone language samples of FX premutation carriers.
Table S7. Performance of different classifiers.
Table S8. Random forest classifier performance for different sets of input features.
Table S9. Group differences between linguistic features in time segment 4 and time segment 5 in FX premutation carriers.
Table S10. Mean decrease in accuracy of fitted model after dropping each variable.
Table S11. Performance metrics of random forest classifiers for females in the US population.

Fig. S1. Traditional workflow to create linguistic profiles. The workflow starts with data collection via phone or in-person interviews. The audio recordings were transcribed, and the resulting transcripts were manually segmented and SALT coded. The coded transcripts were later analyzed to create linguistic profiles. Manual coding of each transcript (red box) takes about one hour, which is time-consuming and expensive and requires personnel with SALT expertise. In our proposed framework, we have eliminated these steps by developing an automated text-processing module that processes the raw transcripts and extracts linguistic features directly.

Exclamations
Number of all the exclamations (utterances ending with "!")

Number of words
Number of all the words

Number of words per minute

Utterance with dysfluency
Number of utterances containing at least one dysfluency.

Average word per dysfluency
Average number of words per dysfluency (average dysfluency length)

Number of dysfluency words
Total number of words occurring in a dysfluency. For this study it refers to the number of repeated words.

Long utterances
Utterances with more than ten morphemes

Negative utterances
Number of utterances with negative verbs

Table S5. Description of cognitive features (BRIEF-A). Standard definitions are employed as described in (2-5).

Feature
Description

Inhibit
The Inhibit scale assesses inhibitory control and impulsivity. This can be described as the ability to resist impulses and the ability to stop one's own behavior at the appropriate time.

Shift
The Shift scale assesses the ability to move with ease from one situation, activity, or aspect of a problem to another as the circumstances demand. Key aspects of shifting include the ability to (a) make transitions; (b) tolerate change; (c) problem-solve flexibly; (d) switch or alternate attention; and (e) change focus from one mindset or topic to another.

Emotional control
The Emotional Control scale measures the impact of executive function problems on emotional expression and assesses an individual's ability to modulate or control his or her emotional responses.

Initiation
The Initiate scale reflects an individual's ability to begin a task or activity and to independently generate ideas, responses, or problem-solving strategies.

Working memory
The Working Memory scale measures "on-line representational memory;" that is, the capacity to hold information in mind for the purpose of completing a task, encoding information, or generating goals, plans, and sequential steps to achieving goals. Working memory is essential to carry out multistep activities, complete mental manipulations such as mental arithmetic, and follow complex instructions.

Planning/Organization
The Plan/Organize scale measures an individual's ability to manage current and future-oriented task demands. The scale consists of two components: plan and organize. The Plan component captures the ability to anticipate future events, to set goals, and to develop appropriate sequential steps ahead of time in order to carry out a task or activity. The Organize component refers to the ability to bring order to information and to appreciate main ideas or key concepts when learning or communicating information.

Task monitoring
The Task Monitor scale reflects the ability to keep track of one's problem-solving success or failure, and to identify and correct mistakes during behaviors.

Self-monitoring
The Self-Monitor scale assesses aspects of social or interpersonal awareness. It captures the degree to which an individual perceives himself or herself as aware of the effect that his or her behavior has on others.

Negative
The Negativity scale measures the extent to which the respondent answered selected BRIEF-A items in an unusually negative manner.

Infrequency
Scores on the Infrequency scale indicate the extent to which the respondent endorsed items in an atypical fashion relative to the combined normative and clinical samples.

Inconsistency
Scores on the Inconsistency scale indicate the extent to which similar BRIEF-A items were endorsed in an inconsistent manner relative to the combined normative and mixed clinical/healthy adult samples.

Organization of material
The Organization of Materials scale measures orderliness of work, living, and storage spaces (e.g., desks, rooms).

Global Executive Composite
The Global Executive Composite (GEC) is an overarching summary score that incorporates all of the BRIEF-A clinical scales.

Behavioral Regulation Index
The Behavioral Regulation Index (BRI) captures the ability to maintain appropriate regulatory control of one's own behavior and emotional responses. This includes appropriate inhibition of thoughts and actions, flexibility in shifting problem-solving set, modulation of emotional response, and monitoring of one's actions. It is composed of the Inhibit, Shift, Emotional Control, and Self-Monitor scales.

Metacognition Index
The Metacognition Index (MI) reflects the individual's ability to initiate activity and generate problem-solving ideas, to sustain working memory, to plan and organize problem-solving approaches, to monitor success and failure in problem solving, and to organize one's materials and environment. It is composed of the Initiate, Working Memory, Plan/Organize, Task Monitor, and Organization of Materials scales.

Language sample
The following example demonstrates the linguistic features in a manually coded transcript.
Dysfluencies including repetitions and filled pauses are shown. Table S2 lists all the feature values for this example.
Matthew is um sixteen. He has Fragile X Syndrome. He is uh warm, happy, cooperative, sweet, um handsome, athletic in his own way. He is a very talented artist. He is um a wonderful part of our family. I think we we have a really wonderful relationship. It is it is a little like acquiescent. Um so I guess the next piece would be his Fragile X behavior. So he is um, which I probably like over look half the time but he hand slaps and postures. He is uh cognitively very challenged. Um, he is very dependent on us um but he is really a joy. He gives back more than he takes. Um what more can I say? I think I said everything that comes to mind.

Text processing module
We have developed a text-processing module to extract language characteristics from the transcripts (Fig. S1). The module batch processes all samples and produces a comprehensive dataset as its refined output. The resulting dataset also includes combined attributes such as total dysfluencies, verbal information flow, and frequency of dysfluencies. We have also extracted the distributions of utterances and dysfluencies.
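To illustrate the kind of processing such a module performs, the following is a minimal sketch rather than the module used in this study. It assumes that utterances end at sentence-final punctuation, that "um" and "uh" are the only filled pauses, and that repetitions are immediate repeats of one word or of a two-word phrase. The function names and the filled-pause list are hypothetical.

```python
import re

# Assumption: only these tokens count as filled pauses; the study's list is not given.
FILLED_PAUSES = {"um", "uh"}


def split_utterances(transcript):
    """Split raw text into utterances at sentence-final punctuation (a simplification)."""
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [p for p in parts if p]


def tokenize(utterance):
    """Return lowercase word tokens; punctuation is discarded."""
    return re.findall(r"[a-z']+", utterance.lower())


def extract_features(transcript):
    """Count a few transcript-level linguistic features."""
    utterances = split_utterances(transcript)
    features = {
        "utterances": len(utterances),
        "words": 0,
        "filled_pauses": 0,
        "word_repetitions": 0,
        "phrase_repetitions": 0,
        "utterances_with_dysfluency": 0,
        "exclamations": 0,
    }
    for utt in utterances:
        words = tokenize(utt)
        pauses = sum(w in FILLED_PAUSES for w in words)
        # Immediate single-word repeats such as "we we".
        word_reps = sum(a == b for a, b in zip(words, words[1:]))
        # Immediate two-word repeats such as "it is it is".
        phrase_reps = sum(
            words[i:i + 2] == words[i + 2:i + 4] for i in range(len(words) - 3)
        )
        features["words"] += len(words)
        features["filled_pauses"] += pauses
        features["word_repetitions"] += word_reps
        features["phrase_repetitions"] += phrase_reps
        features["utterances_with_dysfluency"] += (pauses + word_reps + phrase_reps) > 0
        features["exclamations"] += utt.endswith("!")
    return features
```

Because these rules are simplified, counts produced by this sketch need not agree with manually coded SALT values.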
To validate the feature extraction module, we compared its output with that of the Systematic Analysis of Language Transcripts (SALT) software (7). An F1 score was calculated for each feature and each cluster, and the average F1 score for each feature across all transcripts is reported in Table S3. The text-processing outputs closely match the SALT outputs.
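The exact unit over which agreement was scored is not specified here. As one possible reading, the sketch below assumes that, for a given feature, both the automated module and SALT flag a set of utterance indices in each transcript, and that the F1 score is computed between the two sets and then averaged over transcripts. The function names and the set-based formulation are assumptions made for illustration.

```python
def f1_score(predicted, reference):
    """F1 between the items flagged automatically and those flagged manually."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        # Assumption: two empty sets are treated as perfect agreement.
        return 1.0
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs):
    """Average F1 over (predicted, reference) pairs, one pair per transcript."""
    scores = [f1_score(pred, ref) for pred, ref in pairs]
    return sum(scores) / len(scores)
```

For example, `f1_score({1, 2, 3}, {2, 3, 4})` gives precision and recall of 2/3 each, hence an F1 of 2/3.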

Comparison of linguistic profiles
We have developed an exploratory data analysis module to evaluate normality and variance homogeneity in the FX premutation carrier and comparison group samples. Independent-samples t-tests were used to identify features that differed significantly between the two groups. A p-value of less than 0.05 was considered statistically significant. The features found to be significant are reported in Table S4. FX premutation carrier samples were significantly different from the comparison group in terms of dysfluency variables. A higher frequency of dysfluency patterns was observed in FX premutation carriers.
When comparing the groups, we found that filled pauses occurred more frequently in FX premutation carriers (average ~36) than in the comparison group (average ~22). Overall, more repetitions per transcript were observed in FX premutation carriers (11.15 on average) than in the comparison group (5.38).
On average, about 52% of utterances in FX premutation carriers contained at least one repetition or filled pause, compared with 45% in the comparison group.
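For reference, the test statistic of an independent-samples t-test can be sketched as follows. This minimal version assumes equal variances in the two groups (the pooled-variance form) and returns only the statistic and its degrees of freedom. The p-value would then be read from a t distribution, for instance with a statistics library such as SciPy (`scipy.stats.ttest_ind`). The function name is hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (assumes variance homogeneity).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    return t_stat, df
```

When variance homogeneity does not hold, Welch's unequal-variance form would be the usual alternative.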

In-person vs. phone interviews
The measures used in this study were collected either in person or over the phone. We compared the features obtained with the two interview methods in order to assess the reliability of the data in our sample.
Five-minute language samples were obtained from the same group of FX premutation carriers in two separate interviews. Each participant completed both an in-person and an over-the-phone interview. The order of the two interviews was varied: some participants completed the in-person interview first and were then called over the phone, whereas others completed the phone interview first, followed by the in-person interview.
We performed both dependent and independent two-sample t-tests to compare the linguistic features. The results are listed in Table S6. There were no significant differences between features measured over the phone and those obtained in in-person interviews. The method can therefore be used reliably in either context, yielding the same pattern of results.
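Because the same participants contributed both samples, the dependent (paired) test operates on within-participant differences. A minimal sketch of that statistic follows. It assumes the two lists are aligned by participant, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(in_person, phone):
    """Paired t statistic and degrees of freedom for one feature measured twice."""
    if len(in_person) != len(phone):
        raise ValueError("paired samples must have the same length")
    # One difference per participant; values must be aligned by participant.
    diffs = [a - b for a, b in zip(in_person, phone)]
    n = len(diffs)
    # Undefined if every difference is identical (zero standard deviation).
    std_err = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_err, n - 1
```

A statistic near zero, as expected when the interview method has no effect, corresponds to a large p-value under a t distribution with n - 1 degrees of freedom.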

Informative linguistic features
The results from the feature selection module indicate that many linguistic features are informative, and hence we analyzed the highly ranked features for unusual patterns and significant group differences. Fig. S2 shows a selected set of these features to demonstrate the differences between the two groups.
According to Fig. S2A, the FX premutation carriers had significantly (p<0.0001) more repetitions overall than the comparison group. We observed an average of 11.15 repetitions per transcript in FX premutation carriers and about 5.38 repetitions in the comparison group.
In terms of the specific types of repetition, the distributions of word repetitions in FX premutation carriers and the comparison group differed significantly according to independent-samples t-tests (p<0.01, Fig. S2B). Similarly, the distributions of phrase repetitions for the comparison group and FX premutation carriers were significantly different (p<0.01, Fig. S2C).
On average, FX premutation carriers' transcripts contained more than 36 filled pauses, while the comparison group's transcripts contained about 22 filled pauses (Fig. S2D). FX premutation carriers used more one-word utterances than the comparison group (Fig. S2E), and, in general, the FX premutation carriers tended to use more short utterances (fewer than 5 words per utterance) than the comparison group (p<0.01). These differences are important for developing robust classifiers, as described below.

Length of the interviews
The interviews were analyzed at 5 milestones, and the classifier performance was evaluated for each data profile. Each transcript was divided into 5 segments based on the number of utterances. Milestone 1 contains the first 20 percent of the transcript, milestone 2 the first 40 percent, and so on. The result of this analysis is shown in Fig. S4. Longer interviews provide more reliable data.
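The milestone construction can be sketched as cumulative prefixes of the utterance list. The rounding rule for transcripts whose length is not a multiple of 5 is not stated, so the sketch assumes rounding up. The function name is hypothetical.

```python
def cumulative_milestones(utterances, n_milestones=5):
    """Return prefixes holding 20%, 40%, ..., 100% of the utterances."""
    total = len(utterances)
    prefixes = []
    for k in range(1, n_milestones + 1):
        # Integer ceiling of total * k / n_milestones (assumed rounding rule).
        cutoff = (total * k + n_milestones - 1) // n_milestones
        prefixes.append(utterances[:cutoff])
    return prefixes
```

Features would then be extracted from each prefix and passed to the classifier, so that performance can be compared across transcript lengths.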

Segment differences
The results from parsing the linguistic features suggest that the last part of the interviews contains information that is valuable for discriminating FX premutation carriers from the comparison group. Table S9 shows the linguistic features that differ significantly between time segments 4 and 5 in FX premutation carriers. More repeated words were observed in the last segment of the interviews.

'FX-PM Test' mobile app
We created a demo version of an FX-PM Test mobile app. The app is a data collection platform developed using Apple's ResearchKit library. FX-PM Test provides an interactive interface to guide participants through the different stages of enrollment. The process begins with an introduction to the research study and its impact on public health. Prior to joining the study, participants learn about the research question, the required data, and the data-gathering methodology, as well as the necessary time commitment and privacy policies. Each participant voluntarily signs a digital informed consent form confirming his or her interest in being involved in the study. In each data collection session, users are asked to answer a few simple questions, which will be used to