Abstract
This work proposes a new class of explainable prognostic models for longitudinal data classification using triclusters. A new temporally constrained triclustering algorithm, termed TCtriCluster, is proposed to comprehensively find informative temporal patterns common to a subset of patients in a subset of features (triclusters), and use them as discriminative features within a state-of-the-art classifier with guarantees of interpretability. The proposed approach further couples prediction with model explainability by revealing clinically relevant disease progression patterns underlying prognostics and describing the features used for classification. The proposed methodology is applied to the Amyotrophic Lateral Sclerosis (ALS) Portuguese cohort (N = 1321), providing the first comprehensive assessment of the prognostic limits of five notable clinical endpoints: need for non-invasive ventilation (NIV); need for an auxiliary communication device; need for percutaneous endoscopic gastrostomy (PEG); need for a caregiver; and need for a wheelchair. Triclustering-based predictors outperform state-of-the-art alternatives, being able to predict the need for an auxiliary communication device (within 180 days) and the need for PEG (within 90 days) with an AUC above 90%. The approach was validated in clinical practice, supporting healthcare professionals in understanding the link between the highly heterogeneous patterns of ALS disease progression and the prognosis.
Introduction
Considering longitudinal data, also referred to as multivariate time series data, three-way data, or multivariate trajectory data, triclustering aims to discover patterns that satisfy specific homogeneity and statistical significance criteria. Given the increasing prevalence of three-way data across biomedical and social domains, triclustering—the discovery of patterns (triclusters) within three-way data—is becoming a reference technique to enhance the understanding of complex biological, individual, and societal systems1. Clustering is limited to this end since objects (patients) in three-way data domains are typically only meaningfully correlated on subspaces of the overall space (subsets of features), and although biclustering is able to find correlated objects in a subspace of features, or temporal patterns for one feature, it cannot consider both time and multiple features2.
In clinical domains, triclustering has been successfully applied for different ends: health record data analysis, where triclusters can identify groups of patients with correlated clinical features along time; neuroimaging data analysis in which triclusters correspond to enhanced hemodynamic or electrophysiological responses and connectivity patterns between brain regions; multi-omics, where triclusters capture putative regulatory patterns within omic series data; and multivariate physiological signal data analysis, where triclusters capture coherent physiological responses for a group of individuals1,3,4. In spite of triclustering relevance for descriptive tasks (knowledge acquisition), its potential in predictive tasks (medical decision) remains considerably untapped1.
In this context, grounded on the potentialities of triclustering approaches, we propose a triclustering-based classifier to learn prognostic models from three-way clinical data, which takes advantage of the temporal dependence between the monitored features, and further enhances model explainability by learning an associative model grounded on local temporal patterns (subsets of features with specific values for a subset of patients in a contiguous set of temporal observations during follow-up). To this end, we propose TCtriCluster, a temporally constrained triclustering algorithm able to mine time-contiguous triclusters that extends the state-of-the-art triCluster algorithm5, originally proposed by Zhao and Zaki to mine patterns in three-way gene expression data, to cope with three-way heterogeneous clinical data (patient-feature-time data).
As a case study, we target prognostic prediction in Amyotrophic Lateral Sclerosis (ALS) using a large cohort of Portuguese patients, where the triclusters learned from patients’ follow-up data can be interpreted as disease progression patterns. The patterns identifying groups of patients with coherent temporal evolution on a subset of features are then used for prognostic prediction as features in a state-of-the-art classifier. The prognostic models learned using the proposed triclustering-based classifier predict whether a patient will evolve to a target clinical endpoint within a certain time window. We target five clinically relevant endpoints in ALS: (1) need for non-invasive ventilation (NIV), (2) need for an auxiliary communication device, (3) need for percutaneous endoscopic gastrostomy (PEG), (4) need for a caregiver, and (5) need for a wheelchair.
The major contributions of this work are the following:
-
A new pattern-centric data transformation from longitudinal data into multivariate temporal features, the triclusters, yielding both descriptive and discriminative qualities for subsequent learning tasks;
-
First study in ALS that comprehensively assesses the state-of-the-art predictability limits of different clinical endpoints of interest, using time windows;
-
A new triclustering algorithm, termed TCTriCluster, able to find time-contiguous triclusters with constant and additive forms of homogeneity;
-
Discriminative patterns of (ALS) disease progression used for prognostic prediction and whose inspection can putatively help to explain prognostics, aiding medical research and practice.
The gathered results are promising, highlighting the potential of the proposed methodology regarding both predictability (outperforming state-of-the-art alternatives) and interpretability. Some limitations should, however, be pinpointed. First, our results primarily focus on the predictive value of follow-up assessments. Nevertheless, the proposed predictors can straightforwardly combine static features with triclustering-based features (as we show at the end). Second, in spite of the large ALS cohort size (N = 1321), collected at the Portuguese ALS center, data from other ALS centers can be used for further validation.
The proposed triclustering-based classifier can be used to learn prognostic models from follow-up data in other diseases, as well as predictive models from three-way data in other domains. The TCtriCluster algorithm can be further used as a standalone tool to mine arbitrarily positioned, overlapping, and temporally constrained triclusters with constant, scaling, and shifting patterns from three-way heterogeneous data.
Background and related work
ALS is a neurodegenerative disease characterized by weakness and functional disability, with patients presenting different phenotypes and progression rates. Most patients with ALS die from respiratory complications within the first 3–5 years after disease onset. Nevertheless, some patients survive up to 10 years, while in more severe circumstances, survival can be shortened to 1 year6. Recent studies have reported a prevalence of 8–9 cases per 100,000 inhabitants worldwide7; in Portugal, the reported prevalence is similar8.
In the absence of curative treatment, it is essential to promote timely interventions to prolong survival and improve quality of life. The most important interventions are NIV, with a major positive impact on survival; augmentative communication, to prevent social isolation; PEG, to maintain appropriate nutrition; routine caregiver support for daily life activities; and a wheelchair for regular outings, e.g. for medical appointments6,9,10. Clinicians have been using a well-established scale to determine disease progression: the revised ALS Functional Rating Scale (ALSFRS-R)11. This scale has specific questions regarding respiratory symptoms, speaking, swallowing, self-care and walking, which are essential to determine the timing of the several interventions. Regarding respiratory function, a number of tests are used to support the decision of NIV initiation.
Due to the high heterogeneity of this disease, the individual prognosis of an ALS patient is challenging. Therefore, it is of utmost importance to develop explainable machine learning models, pinpointing the need for approaches to learn explainable disease progression models that clinicians can effectively use for prognostic prediction and timely interventions, with a possible positive impact on survival and quality of life. Recent years have witnessed an increasing awareness of the potentialities of machine learning amongst ALS researchers, leading to several applications to ALS cohort data12,13,14,15,16,17,18,19,20,21. The great potential of learning stratification models has also shown opportunities for future clinical trials, besides promoting more accurate and trustworthy predictions by learning group-specific prognostic models13,22,23,24.
In this context, Carreiro et al.12 conducted a pioneering study proposing prognostic models to predict the need for NIV in ALS based on clinically defined time windows. More recently, Pires et al.22 stratified patients according to their state of disease progression, obtaining three groups of progressors (slow, neutral and fast), and proposed specialized learning models for these groups. They further used patient and clinical profiles with promising results23. However, none of their studies took into account the temporal progression of the features. Recently, Martins et al. proposed coupling itemset mining with sequential pattern mining to unravel disease presentation and disease progression patterns, and used these patterns to predict the need for NIV in ALS patients25. Despite their relevant results, they did not consider the contiguity constraint imposed by the temporality of the patients' follow-up data. Matos et al.26 proposed a biclustering-based classifier, in which biclustering was used to find groups of patients with coherent values in subsets of clinical features (biclusters), then used as features together with static data. Although promising, this approach also did not take the temporal dependence between features into account.
In previous work, a preliminary assessment of the role of classic triclustering approaches for predicting ventilation support needs in ALS was undertaken27, and biclusters discovered in the static dimension of the data were considered to predict the need for NIV within specific time windows28. Unlike these earlier works, our research proposes a novel triclustering approach grounded on temporal contiguity constraints that yields both higher predictability and better explainability.
Complementarily to the above pattern-centric stances, Pancotti et al.29 recently applied state-of-the-art deep learning methods to study disease progression in ALS using a publicly available database (PRO-ACT), showing competitive performance.
Despite the extent of research on ALS prognostic ends, most of the existing works focus on survival prediction, NIV needs, or general changes to the ALS functional rating scale (ALSFRS-R), generally neglecting specific clinical endpoints of interest. Specific clinical endpoints, such as the need for a wheelchair or percutaneous endoscopic gastrostomy, have been primarily studied under descriptive stances, including the analysis of cumulative time-dependent risks30. To our knowledge, their predictability under the machine learning stance using time windows and explainable progression patterns remains unassessed.
Methods
This section describes the proposed methodology to learn a triclustering-based classifier from three-way data, from preprocessing (including creating learning examples) to classifier performance evaluation. It further describes TCtriCluster, the proposed triclustering algorithm to mine temporally constrained triclusters. Figure 1 depicts the overall workflow.
In what follows, consider that a three-way dataset, D, is defined by n objects \(X = \{x_1,\ldots ,x_n\}\), m features \(Y = \{y_1,\ldots ,y_m\}\), and p contexts \(Z = \{z_1,\ldots ,z_p\}\), where the elements \(d_{ijk}\) relate object \(x_i\), feature \(y_j\), and context \(z_k\). Consider also that a bicluster \(B = (I, J)\) is a subspace given by a subset of objects, \(I \subseteq X\), and a subset of features, \(J \subseteq Y\)2. Similarly, a tricluster \({\mathscr {T}} = (I, J, K)\) contains \(I \subseteq X\) objects, \(J \subseteq Y\) features and \(K \subseteq Z\) contexts, and \(t_{ijk}\) denotes the elements of \({\mathscr {T}}\), where \(1 \le i \le |I|\), \(1 \le j \le |J|\) and \(1 \le k \le |K|\)1. In this context, each tricluster \({\mathscr {T}}\) can be represented as a set of biclusters \({\mathscr {T}} = \{{\mathscr {B}}_1, {\mathscr {B}}_2, \ldots , {\mathscr {B}}_s\}\), one per context.
Preprocessing data
The three-way dataset, composed of several heterogeneous features measured over a number of time points, is first preprocessed to obtain learning examples. Depending on the dataset, dealing with missing values and class imbalance might also be needed. Some triclustering searches, such as the one proposed in this work, can ignore missing values, avoiding the need for imputation.
TCtriCluster: a new temporal triclustering algorithm
triCluster5, a pioneering and highly cited triclustering approach proposed and implemented by Zhao and Zaki, is selected as the starting point. It is a quasi-exhaustive approach, able to mine arbitrarily positioned and overlapping triclusters with constant, scaling, and shifting patterns from three-way data. Given that triCluster was proposed to mine coherent triclusters in three-way gene expression data (gene-sample-time), at this point, it is important to understand that clinical data can be preprocessed to have a similar structure, in which patient-feature-time data resembles the gene-sample-time data considered in earlier works. triCluster is composed of 3 main steps: (1) constructs a multigraph with similar value ranges between all pairs of samples; (2) mines maximal biclusters from the multigraph formed for each time point (slices of the 3D dataset); and (3) extracts triclusters by merging similar biclusters from different time points. Optionally, it can delete or merge triclusters according to user-defined overlapping criteria.
As our goal is to mine temporal three-way data, meaning the Z context dimension corresponds to time, we borrow a pivotal idea behind CCC-Biclustering31, a state-of-the-art and highly efficient temporal biclustering algorithm, and introduce a temporal constraint in triclustering to promote interpretability, predictive accuracy, and efficiency. The goal thus becomes to mine Time-Contiguous Triclusters (TCTriclusters), triclusters with consecutive time points. In this context, we re-implemented triCluster in Python and extended it to cope with a time constraint. The new TCtriCluster algorithm implements this time constraint on its 3rd phase, as shown in Algorithm 1 (line 9).
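As a minimal illustration of the temporal constraint introduced in the third phase (a hedged sketch, not the authors' implementation: the function names, the dictionary-based tricluster representation, and the similarity callback are ours), a bicluster mined at time point t is only allowed to extend a growing tricluster whose run of time points is adjacent to t, so every tricluster keeps a contiguous set of time points:

```python
# Hypothetical sketch of the time-contiguity constraint in TCtriCluster's
# third phase (illustrative names and data structures, not the paper's code).

def is_time_contiguous(time_points, candidate):
    """A candidate time slice may only extend a tricluster at either end
    of its contiguous run of time points."""
    return candidate == min(time_points) - 1 or candidate == max(time_points) + 1

def merge_time_contiguous(biclusters_by_time, are_similar):
    """Greedily chain similar biclusters across *consecutive* time points.

    biclusters_by_time: dict {time_point: [bicluster, ...]}
    are_similar: callable(bic_a, bic_b) -> bool deciding whether two slices
                 share enough objects/features to belong to one tricluster.
    """
    triclusters = []
    for t in sorted(biclusters_by_time):
        for bic in biclusters_by_time[t]:
            extended = False
            for tri in triclusters:
                if is_time_contiguous(tri["times"], t) and are_similar(tri["slices"][-1], bic):
                    tri["times"].append(t)
                    tri["slices"].append(bic)
                    extended = True
                    break
            if not extended:
                # start a new candidate tricluster from this slice
                triclusters.append({"times": [t], "slices": [bic]})
    return triclusters
```

In this sketch, a bicluster at time point 3 can never join a tricluster spanning time points 0–1, which is exactly the contiguity property that distinguishes TCTriclusters from the unconstrained merging in the original triCluster.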
TCtriCluster accepts different combinations of input parameters (inherited from triCluster5), which should be explored in order to discover the best parameters with which the final classifier is learned. The input parameters are \(\varepsilon , mx, my, mz, \delta ^x, \delta ^y, \delta ^z, \eta\) and \(\gamma\), corresponding to the maximum ratio value, the minimum tricluster size along dimensions x, y and z, the maximum range threshold along dimensions x, y and z, and the overlapping and merging thresholds, respectively. Further details on the input parameters can be found in5.
Hyperparameterizing the triclustering search
In this step, we find the best hyperparameters used as input by the triclustering algorithm (described above) in order to optimize predictive performance. The workflow, depicted in Fig. 2, starts by performing triclustering on the preprocessed data to obtain triclusters. Next, and since our triclustering-based classifier uses the triclusters as features, we compute a 3D virtual pattern for each tricluster.
The proposed 3D virtual pattern corresponds to the tricluster's most representative pattern, extending the 2D version defined in32, and is computed as follows.
Definition 1
(3D virtual pattern). Given a tricluster \({\mathscr {T}}\), its virtual pattern \({\mathscr {P}}\) is defined as a set of elements \({\mathscr {P}} = \{ \rho _1, \rho _2,\ldots , \rho _{|I|}\}\), where \(\rho _i, 1 \le i \le |I|\) is defined as the mean (or the mode, in case of categorical features) of values in the \(i^{th}\) row for each context:
Consider, as an example, a tricluster \({\mathscr {T}}\)=(I, J, K), mined from three-way data (X, Y, Z), composed of 3 objects, 3 features (\(y_1\) and \(y_7\) are categorical features) and 3 contexts, such that \(I = \{ x_1, x_3, x_7 \},\; J = \{ y_1, y_3, y_7 \},\; K=\{ z_2, z_3, z_4 \}\). For simplicity, consider \({\mathscr {T}} = \{ B_2, B_3, B_4 \}\):
and an object (patient) \(P(X_p, I, K)\) defined as \(P = \{ C_2, C_3, C_4 \}: C_2 = \begin{bmatrix} 1&2.22&5 \end{bmatrix}; \; C_3 = \begin{bmatrix} 1&2.26&7 \end{bmatrix}; \; C_4 = \begin{bmatrix} 2&2.35&8 \end{bmatrix}\). In this setting, the virtual patterns are: \(\rho (B_2) = \begin{bmatrix} 1&2.6667&5 \end{bmatrix}\); \(\rho (B_3) = \begin{bmatrix} 3&2.9&3 \end{bmatrix}\); \(\rho (B_4) = \begin{bmatrix} 3&2.7333&3 \end{bmatrix}\); and \(\rho ({\mathscr {T}}) = \begin{bmatrix} 3&2.7667&3 \end{bmatrix}\).
Note that, optionally, in cases where triclustering captures heterogeneous triclusters, we can detach the biclusters that compose the tricluster and use those biclusters as features (computing 2D virtual patterns) instead of the single pattern that describes the whole tricluster. Notice that, in the previous example, if we detached the tricluster, we would use three patterns—\(\rho (B_2)\), \(\rho (B_3)\) and \(\rho (B_4)\)—in which the first is markedly different from the other two. This optional step gives more information to the classifier, promoting its predictive performance.
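The per-slice computation behind Definition 1 can be sketched as follows (a hedged illustration: the function name, the boolean categorical mask, and the test values are ours; the original bicluster matrices are not reproduced in the text). Each feature column gets its mean, or its mode when the feature is categorical:

```python
# Illustrative sketch of a 2D virtual pattern for one bicluster slice:
# column-wise mean for numeric features, mode for categorical ones.
from collections import Counter

import numpy as np

def virtual_pattern_2d(slice_values, categorical_mask):
    """slice_values: |I| x |J| array of one time slice of a tricluster.
    categorical_mask: one boolean per feature (True = categorical)."""
    pattern = []
    for j, is_cat in enumerate(categorical_mask):
        col = slice_values[:, j]
        if is_cat:
            # mode: most frequent value in the column
            pattern.append(Counter(col.tolist()).most_common(1)[0][0])
        else:
            pattern.append(col.mean())
    return np.array(pattern)
```

With values chosen so that the second column averages to 2.6667 (consistent with \(\rho (B_2)\) above, though the original matrices are not shown here), the helper returns the expected mode/mean/mode vector.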
With the virtual patterns computed, to assess how well a specific object (patient), \(p_i\), follows the general tendency of a given tricluster \({\mathscr {T}}\), we compare \(p_i\) with the 3D virtual pattern, \({\mathscr {P}}\), the most representative pattern of the tricluster \({\mathscr {T}}\). To do this, we propose two approaches: (1) compute the Euclidean distance; or (2) compute the Pearson correlation between the 3D virtual pattern \({\mathscr {P}}\) and the equivalent pattern (same features and contexts) of \(p_i\).
We denote these assessments as Virtual Distance 3D and Virtual Correlation 3D, and define them as follows:
Definition 2
(Virtual distance 3D). The virtual Euclidean distance between an observation \(p_i\) and a tricluster \({\mathscr {T}}\) is defined as
Definition 3
(Virtual correlation 3D). The virtual linear correlation between an object \(p_i\) and a tricluster \({\mathscr {T}}\) is defined as
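Both similarity criteria can be sketched in a few lines (a hedged reading of Definitions 2 and 3, since the closed-form equations are not reproduced here: we flatten the patient's sub-matrix, restricted to the tricluster's features and contexts, and compare it against the equally shaped virtual pattern; names are illustrative):

```python
# Sketch of Virtual Distance 3D and Virtual Correlation 3D
# (one plausible reading of the definitions, with illustrative names).
import numpy as np

def virtual_distance_3d(patient_values, virtual_pattern):
    """Euclidean distance between the flattened patient sub-matrix and the
    equally shaped virtual pattern."""
    p = np.asarray(patient_values, dtype=float).ravel()
    v = np.asarray(virtual_pattern, dtype=float).ravel()
    return float(np.linalg.norm(p - v))

def virtual_correlation_3d(patient_values, virtual_pattern):
    """Pearson correlation between the same two flattened vectors."""
    p = np.asarray(patient_values, dtype=float).ravel()
    v = np.asarray(virtual_pattern, dtype=float).ravel()
    return float(np.corrcoef(p, v)[0, 1])
```

A patient whose sub-matrix matches the pattern exactly has distance 0, and a patient whose values are a linear rescaling of the pattern has correlation 1, which is why the correlation variant can also reward scaled progression patterns.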
After computing similarity matrices based on the virtual patterns (using distances or correlations), these matrices are used as learning examples by the classifier (with the triclusters as features) and evaluated with a 5\(\times\)10-fold stratified cross-validation in order to find the best triclustering parameters, using classification performance as the metric. The best parameters are then fed to the next step.
Learning the final classifier
Figure 3 depicts the steps involved in learning the final model. With the best parameters found in the previous step, an additional iteration is performed in order to obtain the final triclusters. The final triclusters are then used to create a classic multivariate data space by creating one variable per tricluster and computing the virtual distance/correlation between each training object and the given tricluster to produce the transformed data. Using this multivariate data space, a traditional classifier can be learned and used to make predictions in the next step.
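The transformation into a classic multivariate space can be sketched on synthetic data (everything below is an assumption for illustration: the shapes, the random distances, and the choice of Random Forests as the traditional classifier):

```python
# Minimal sketch: one feature per tricluster holding the patient-to-pattern
# similarity, then a standard classifier on the transformed space.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_patients, n_triclusters = 40, 5

# X[i, t] = virtual distance between patient i and tricluster t's pattern
X = rng.random((n_patients, n_triclusters))
# synthetic target: patients close to tricluster 0 tend to evolve
y = (X[:, 0] < 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```

In the real pipeline the matrix X would come from the virtual distance/correlation computations of the previous step, and the classifier would be evaluated under cross-validation rather than on its training data.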
Testing stage
After learning the target triclustering-based predictive model, new three-way objects can be classified. To do this, it is necessary to first calculate the array of similarities between the new object and the triclusters (virtual patterns) obtained in the previous steps. This array is fed to the classifier, which will, in turn, return the classification for the new object together with a confidence score. Figure 4 depicts an example using clinical three-way data (case study described in the next section).
Ethics approval and consent to participate
The study was conducted in accordance with the Declaration of Helsinki and was approved by the local (Faculty of Medicine, University of Lisbon) ethics committee. Informed consent to participate in the study was obtained from all participants. Data access was granted in the context of project AIpALS (PTDC/CCI-CIF/4613/2020), where the authors’ institutions participate.
Case study: prognostic prediction in ALS
In this study, we want to predict whether a given patient will evolve to a critical endpoint within k days (time window) since the last clinical appointment using data from the patients’ follow-up. The target endpoints considered and validated by the clinicians are the following:
-
C1—need for non-invasive ventilation (NIV), as decided by the international guidelines11
-
C2—need for an auxiliary communication device (question 1 of the ALSFRS-R with a score of 1 or lower)
-
C3—need for percutaneous endoscopic gastrostomy (PEG) (question 3 of the ALSFRS-R with a score of 2 or lower)
-
C4—need for a caregiver (question 5 or 6 of the ALSFRS-R with a score of 1 or lower)
-
C5—need for a wheelchair (question 8 of the ALSFRS-R with a score of 1 or lower)
In order to apply the triclustering-based classification approach, the three-way data corresponds to longitudinal data collected at the patient’s follow-up, and in particular, the dimensions X, Y, and Z correspond to patients, features, and time, as shown in Fig. 5.
Cohort data
Our study is conducted using the Lisbon ALS clinic dataset containing Electronic Health Records from ALS patients regularly followed at the local ALS clinic since 1995 and last updated in October 2021. Its current version contains 1321 patients (740 males and 581 females) with a mean age at onset of \(63 \pm 13\) years. Each patient record includes a set of static features (demographics, disease severity, co-morbidities, medication, genetic information, exercise and smoking habits, past trauma/surgery, and occupations) along with temporal features (collected repeatedly at follow-up), such as disease progression tests (ALSFRS-R scale, respiratory tests, etc.). Table 1 shows the patient cohort characterization.
As the proposed methodology is focused on three-way clinical data analysis, and in order to test its potential, we first restrict our data to temporal data only, discarding static data (described in Table 1). We considered 7 features per time point: the Functional Scores (ALSFRS-R), briefly described next, and a respiratory test, Forced Vital Capacity (FVC). Following recent studies33,34, we computed an extra temporal feature based on the ALSFRS-R scale: the MITOS stage33. The values for this feature range from 0 to 5 and provide information about the patient's disease stage at the moment of the assessment. Concretely, the value represents the number of compromised ALSFRS-R domains33,34,35, with the value 5 representing death.
ALSFRS-R scores for disease progression rating are an aggregation of integers on a scale of 0 to 4 (where 0 is the worst and 4 is the best), providing different evaluations of the patient functional abilities at a given time point35. This functional evaluation is based on 12 questions, explained in Table 2. Different functional scores are then computed using subsets of scores, as shown in Table 3.
Preprocessing
Data were preprocessed in accordance with the approach proposed by Carreiro et al.12, which assumes the patients are followed up regularly and perform a normative set of tests after each appointment. As patients may not be able to perform all tests in a single day, the method takes their temporal distribution into account when learning from the available clinical records, computing snapshots of the patient's condition that group tests performed within a clinically accepted time window.
Following these assumptions, we performed a hierarchical (agglomerative) clustering with constraints to compute the patients' snapshots, a state-of-the-art procedure to perform alignments along a follow-up12. The constraints applied when grouping the sets of evaluations followed well-established principles as in12: (1) the evaluations that compose a snapshot cannot belong to the same test, as clinicians do not prescribe the same test twice; and (2) all the evaluations considered in the same snapshot should be consistent regarding the critical features of interest (i.e., the patient should be either in the critical endpoint or not in all the records composing the snapshot). For this study, the cutting point for creating the snapshots was set to 100 days, in line with Carreiro et al.12.
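The date-based part of the snapshot computation can be sketched as follows (a simplified illustration under stated assumptions: single-linkage clustering on evaluation dates with the 100-day cut, omitting the two clinical constraints described above; names are ours):

```python
# Sketch: grouping one patient's evaluations into snapshots by date,
# via agglomerative (single-linkage) clustering with a 100-day cut.
# The clinical constraints (same-test exclusion, consistent critical
# status) from the text are omitted for brevity.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def snapshot_labels(days, cut=100):
    """days: 1-D sequence of evaluation dates (e.g. days since onset).
    Returns one cluster label per evaluation; same label = same snapshot."""
    days = np.asarray(days, dtype=float).reshape(-1, 1)
    if len(days) == 1:
        return np.array([1])
    return fcluster(linkage(days, method="single"), t=cut, criterion="distance")
```

For example, evaluations at days 0 and 10 fall into one snapshot, while evaluations at days 400 and 410 form a separate one, since the two groups are more than 100 days apart.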
At this stage, we compute five datasets (one for each of the critical endpoints) with the patients' snapshots, establishing, for each snapshot, whether the patient is in the critical endpoint (binary critical feature). The critical feature value (target to be learned by the classifier) was computed for each critical endpoint based on the date on which a patient's critical status was detected. For each one, the critical date considered and validated by the clinicians was the date of the first evaluation with the following ALSFRS-R conditions (see Table 2):
-
C1: critical when Q12 \(\le\) 3
-
C2: critical when Q1 \(\le\) 1
-
C3: critical when Q3 \(\le\) 2
-
C4: critical when Q5 \(\le\) 1 \(\vee\) Q6 \(\le\) 1
-
C5: critical when Q8 \(\le\) 1
As an example, for the target endpoint C1 (need for NIV), the critical feature identifies whether a patient will evolve to a critical status (need for NIV), occurring when the patient has a date within the defined interval where the Q12 score is 3 or lower. Figure 6 depicts an example of the computation of patient snapshots.
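The five criticality conditions above translate directly into a small helper (the dictionary-based question scores and the function name are our illustrative choices):

```python
# Mapping ALSFRS-R question scores to the five critical-endpoint flags
# (C1-C5 conditions as listed above; input format is an assumption).
def critical_flags(q):
    """q: dict of ALSFRS-R question scores, e.g. {'Q1': 4, ..., 'Q12': 4}."""
    return {
        "C1": q["Q12"] <= 3,                 # need for NIV
        "C2": q["Q1"] <= 1,                  # need for communication device
        "C3": q["Q3"] <= 2,                  # need for PEG
        "C4": q["Q5"] <= 1 or q["Q6"] <= 1,  # need for a caregiver
        "C5": q["Q8"] <= 1,                  # need for a wheelchair
    }
```

For instance, a patient with Q3 = 2 and Q6 = 1 but otherwise preserved scores would be flagged as critical for C3 and C4 only.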
After creating the patients' snapshots, we have to compute the learning examples used by the predictive models. For its critical endpoint of interest, each dataset must encode whether the patient evolves to the critical state within k days of the snapshot. We create the binary target class Evolution (E), where 1 represents an evolution to a critical status within k days from the snapshot, and 0 represents an unchanged critical status within the same time window.
The process of labelling the snapshots is performed based on the date on which a critical status was detected12. A patient's snapshot (with date i) for which the patient reached a critical state between i and \(i + k\) is labelled as E=1 (situation A). Snapshots dated more than k days before the critical status date (outside the time window) are labelled as E=0 (situation B). For patients for whom a critical status has never been detected, the snapshots are labelled as E=0, provided at least one snapshot exists after \(i + k\) days (situation C). Snapshots with no critical status information after \(i + k\) days are not eligible for the analysis, since it is impossible to ensure whether the patient evolves to a critical status in the considered time window (situation D). Snapshots in which the patient is already in a critical status are also not eligible, since we aim to predict the evolution from a non-critical state to a critical one (situation E). Figure 7 shows examples of the Evolution computing process.
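The five labelling situations can be sketched as one function (a hedged reading of the rules above; the argument names and the use of `None` for "not eligible" are our conventions):

```python
# Sketch of the Evolution labelling rules (situations A-E) for a snapshot
# dated `i`, given the patient's critical date (None if never critical),
# the window k, and the date of the patient's last available snapshot.
def label_snapshot(i, critical_date, k, last_snapshot_date):
    """Returns 1 (E=1), 0 (E=0), or None (snapshot not eligible)."""
    if critical_date is not None and i >= critical_date:
        return None                   # E: patient already critical
    if critical_date is not None and critical_date <= i + k:
        return 1                      # A: evolves within the window
    if critical_date is not None:
        return 0                      # B: evolves, but after the window
    if last_snapshot_date >= i + k:
        return 0                      # C: never critical, enough follow-up
    return None                       # D: insufficient follow-up
```

A snapshot at day 0 with a critical date at day 50 and k = 90 is labelled E=1, whereas the same snapshot with a critical date at day 200 is labelled E=0, matching situations A and B.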
We chose 3 clinically relevant time windows for this study: 90, 180 and 365 days (3, 6 and 12 months). Therefore, the process resulted in 3 datasets for each target endpoint and time window (15 in total). The number of snapshots in each dataset (discriminated by the classes) is documented in Table 4.
Finally, since the underlying triclustering algorithm is quasi-exhaustive1 and we want to make predictions based on current and recent clinical evaluations, we defined a maximum length of historical data to assist the prognostic tasks. With this assumption, we need to transform our datasets by coupling snapshots to create the final learning instances which will be fed to the model. The process of grouping snapshots is depicted in Fig. 8 and consists of defining a maximum size L and grouping consecutive snapshots for each patient. The size of each set (number of snapshots) is defined by \(\min (L, nP)\), where nP is the number of available snapshots for a given patient.
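One way to realize this coupling (our interpretation of Fig. 8, under the assumption that each learning instance is the window of up to L consecutive snapshots ending at a given snapshot) is a trailing window per patient:

```python
# Sketch of coupling consecutive snapshots: for each patient, each learning
# instance holds the last min(L, nP) snapshots up to that point.
# (One plausible reading of the grouping in Fig. 8, not the paper's code.)
def couple_snapshots(snapshots, L):
    """snapshots: chronologically ordered list for one patient.
    Returns one learning instance per snapshot, each with up to L
    consecutive snapshots ending at that snapshot."""
    return [snapshots[max(0, i + 1 - L): i + 1] for i in range(len(snapshots))]
```

A patient with 4 snapshots and L = 3 thus yields instances of sizes 1, 2, 3 and 3, with the Evolution label of each instance taken from its last snapshot.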
The final learning examples, used in the experiments, considered 3, 4, and 5 consecutive snapshots (CS) per patient, corresponding to clinical evaluations at 3, 4, and 5 consecutive appointments, respectively. The Evolution (Y or N) label of the last snapshot is considered as the target class. The new class distributions and the coupled snapshots are depicted in Table 5.
Table 5 shows we face considerable class imbalance. In some time windows considered in this case study, the number of non-evolution patients (class N) far exceeds that of evolution patients (class Y). To tackle this imbalance and prevent its drawbacks in the classification process, when the proportion of examples belonging to the majority class (N instances) is higher than 2/3, we first perform Random Undersampling (RU) until the majority class represents 2/3 of the dataset and then use SMOTE36 to oversample the minority class examples, achieving an equal number of examples in both classes.
Baseline results: prognostic models based on patient snapshots
Reproducing the methodology based only on patient snapshots and time windows presented by Carreiro et al.12, we performed experiments to predict the evolution of a given patient to a critical status for each of the critical endpoints of interest. Predicting the progression to assisted ventilation (need for NIV) is further included. The experiments were conducted with the datasets preprocessed as explained in the previous sections. As a result of the snapshot creation, missing values are observed (ranging between 8 and 15% prevalence). To address this problem, and since we are dealing with temporal data, we imputed missing values using the values in the previous snapshot (Last Observation Carried Forward). After this, for the snapshots that had no earlier snapshot (which were residual in number), we imputed missing values with the mean/mode of the specific feature.
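The two-step imputation can be sketched with pandas (the column names and toy values are illustrative assumptions; the mean fallback shown applies to a numeric feature, with mode used analogously for categorical ones):

```python
# Sketch of the imputation: Last Observation Carried Forward within each
# patient's ordered snapshots, then feature mean for residual leading gaps.
import pandas as pd

df = pd.DataFrame({
    "patient": [1, 1, 1, 2, 2],
    "FVC":     [80.0, None, 75.0, None, 90.0],  # % predicted, illustrative
})

# LOCF within each patient (rows assumed in chronological order)
df["FVC"] = df.groupby("patient")["FVC"].ffill()
# snapshots with no earlier value fall back to the feature mean
df["FVC"] = df["FVC"].fillna(df["FVC"].mean())
```

Patient 1's middle snapshot inherits the previous FVC value, while patient 2's first snapshot, having no earlier observation, receives the column mean.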
We evaluated four classifiers: Naive Bayes (NB), SVM with Gaussian kernel, XGBoost (XGB), and Random Forests (RF) due to their state-of-the-art performance in this kind of predictive task23,25.
The evaluation was made using a 5 \(\times\) 10-fold stratified cross-validation scheme, where we ensured that all the assessments from a given patient were kept within the same training/test fold. Moreover, to improve model performance, we tackled the class imbalance within cross-validation, applying the same steps explained in the previous section to the training folds only.
Tables 6 and 7 show the benchmark results. Superior results are observed against the reference state-of-the-art results gathered in a previous study (need for NIV)12. As observed in the original study12, the results for Sensitivity are lower than for Specificity, understandable as positive cases (Evolution = Y) are the minority class.
Triclustering-based classification results
To assess whether historical clinical evaluations improve model predictions, using triclusters as features, we applied our triclustering-based classification approach in accordance with the principles introduced in section “Methods”. For this case study, we opted to detach the triclusters into biclusters and then use them as features. Note that these biclusters are slices of the mined triclusters representing the temporal disease progression. As introduced, each slice is used individually to better represent the state of patients at a specific time point, given the expected differences across the temporal dimension.
As for the baseline, we performed experiments using four classifiers: Naive Bayes, SVM with Gaussian kernel, XGBoost, and Random Forests. The full results are documented in Supplementary Information File SI1, corresponding to the prognostic models for predicting the progression to the critical status C1, need for NIV; C2, need for an auxiliary communication device; C3, need for PEG; C4, need for a caregiver; and C5, need for a wheelchair, respectively. We present the results for AUC, Sensitivity, and Specificity obtained with the models for time windows of 90, 180, and 365 days, identified by the clinicians as clinically relevant. We considered different numbers of historical assessments, creating datasets with 3, 4, and 5 consecutive snapshots (CS). Note that for each dataset (each one with examples with different history sizes) we applied the proposed approach using distances (D) and correlations (C) as the similarity criteria between the patients and the detached biclusters (from triclusters). Table 8 presents a summary of the best results obtained for each target endpoint according to the three time windows considered.
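The feature-space transformation described above can be sketched as follows. Each tricluster is detached into one bicluster slice per time point, and each patient snapshot is mapped to a similarity score against each slice, using either Euclidean distance (the D criterion) or Pearson correlation (the C criterion). Summarizing a slice by the column-wise mean of its rows is an assumption made here for illustration; the original method may match patients against the bicluster pattern differently.

```python
import math

def slice_pattern(bicluster_rows):
    """Representative pattern of a bicluster slice: column-wise mean of its rows."""
    n = len(bicluster_rows)
    return [sum(row[j] for row in bicluster_rows) / n
            for j in range(len(bicluster_rows[0]))]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u)
                    * sum((b - mv) ** 2 for b in v))
    return num / den if den else 0.0

def patient_features(snapshot, slices, criterion="D"):
    """One new feature per detached bicluster slice (D: distance, C: correlation)."""
    sim = euclidean if criterion == "D" else pearson
    return [sim(snapshot, slice_pattern(s)) for s in slices]
```

A classifier is then trained on this new feature space, so each learned feature remains traceable to an interpretable progression pattern (a tricluster slice) rather than to a raw clinical variable.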
Comparing the gathered results with the baseline obtained by the state-of-the-art approach proposed by Carreiro et al.12 (see Fig. 9), we highlight the following:
-
triclustering-based classification obtained promising results, predicting all the target endpoints with solid accuracy. The best models achieved AUC results up to 90% predicting the progression for the target endpoints;
-
overall, triclustering-based predictors using patients’ current and past assessments are better than baseline models using only one evaluation (each snapshot individually) in predicting the progression to a critical status in ALS;
-
prognostic models of progression to C5 (wheelchair need) were those with the smallest differences in results relative to the baseline;
-
predicting progression to C1 – C4 states yields distinctly higher predictive accuracy using the proposed triclustering-based approach against baselines. Mid- and long-term predictions show differences of up to 10 pp;
-
prognostic models achieved AUC above 90% when predicting the need for an auxiliary communication device (C2), PEG (C3) and caregiver (C4). Most of the best predictions needed 5 appointments, but mid-term prediction for the need for PEG (C3) and short-term prediction for the need for a caregiver (C4) only required 3;
-
overall, the distance criterion between patients and triclusters yields better predictive results than the peer correlation criterion. The models with the best results were typically learned from a patient history with 5 follow-ups. However, for the C2 and C4 needs, short-term prognostics (90 days) yielded better results using only the 3 latest snapshots from patient follow-up;
-
the high standard deviation of sensitivity estimates shows the inherent difficulty of predicting the positive class (Evolution=Y);
-
the triclustering-based approach allows discriminative patterns of disease progression to be collected, promoting better model interpretability in clinical domains.
Some limitations should be noted. First, the approach is focused on dynamic features. Note, nevertheless, that static features can be straightforwardly combined with the triclustering-based features in the classification training step. Appendix 1 shows the results of using the static features described in Table 10 together with the triclustering features, using the best model parameters and classifiers as shown in Table 8. Second, the triclustering algorithm’s ability to deal with the heterogeneity inherent to this type of data is limited, since categorical variables need to undergo a denormalization step (nominal variables) or numeric encoding (ordinal variables). Finally, despite the considerably large size of the studied cohort in light of ALS prevalence, the validation of the predictors in international populations is highlighted as a relevant next step.
Model interpretability
The relevance of a prognostic methodology should be evaluated not only by its predictive performance but also by its guarantees of interpretability. The proposed triclustering-based approach allows us to collect essential patterns of disease progression (used as features of the new space), promoting better model interpretability in clinical domains. In addition, the importance of the input patterns/features for the predictive model can be further recovered to rank the discriminative relevance of the underlying patterns.
To explain the models and identify the most relevant patterns used by them, the unified SHAP approach37 was applied. In particular, we selected the KernelExplainer and TreeExplainer methods, which introduce the possibility of directly measuring local feature interaction effects38. The goal is to understand which features are most relevant, which features appear together, and whether the patterns found are clinically relevant for understanding the patient’s progression to the critical endpoints: C1, need for NIV; C2, need for an auxiliary communication device; C3, need for PEG; C4, need for a caregiver; and C5, need for a wheelchair.
We chose to analyze three target endpoints for three different time windows. All the outputs of the remaining endpoints and time windows are made available in a repository (see section “Data availability”). Figure 10 and Table 9 illustrate the top patterns found by TCtriCluster and selected by the classifiers to make the predictions. For the sake of simplicity, we reproduce only the outputs for Random Forest models.
An overall analysis reveals that the majority of the selected patterns refer to the last snapshot/time point of the triclusters. This makes sense, since this is the snapshot closest to the target. However, patterns corresponding to previous snapshots remain relevant, since they can reveal other meaningful properties, including the underlying disease progression rate.
Conclusions
A new methodology was proposed to learn predictive models from longitudinal data using a novel triclustering-based classifier. To this end, TCtriCluster, an extension of triCluster, is proposed to handle heterogeneous clinical data under a temporal contiguity constraint. This constraint was shown to be effective in improving the efficacy of the target predictive models, highlighting its relevance for triclustering three-way time series data. We further show that triclustering-based classification enhances prognostic tasks with the potentialities of model interpretability, enabling the discovery of domain-relevant temporal patterns, which are then used as features in the predictive models.
As the central case study, we targeted the problem of predicting the clinical progression of ALS patients towards disease endpoints within clinically relevant time windows (90, 180 and 365 days). In particular, we focus on the prognosis of five relevant endpoints (need for non-invasive ventilation, auxiliary communication device, PEG, caregiver, and wheelchair) and assess predictability limits using different lengths of patient historical assessments.
The triclustering-based models achieved good results in short-term predictions (AUC higher than 90%) for the need for an auxiliary communication device and the need for PEG. Short-term prognostics of the need for NIV, caregiver, and wheelchair also yield good predictive performance (AUC around 85%). Some of these models improved their performance when predicting in the mid and long term. The proposed methodology shows general improvements against the state of the art in the capacity to predict the target endpoints, confirming the relevance of using triclusters to perform data transformations sensitive to local patterns of disease progression. The possibility of extracting group-specific patterns along time frames of arbitrary length offers a higher degree of feature expressiveness, which is generally lacking in peer approaches. Another relevant property of the proposed transformation is the preserved interpretability of the produced features, as they reveal informative progression patterns that discriminate a given outcome of interest. The inspection of those patterns unravels groups of individuals with coherent temporal variations on a subset of the clinical assessments throughout the follow-up.
This study represents a significant advance in prognostic prediction in ALS, showing generalized improvements in the predictability of degenerative progression towards critical states that imply clinical interventions. This offers the unique opportunity to better prepare families for the next illness stages, and further enables individualized management aimed at optimizing independence, function, and safety, thereby reducing symptom burden and improving the quality of life of the patients.
The proposed triclustering-based methodology can further be used to learn predictive models with different types of three-way data, encompassing prognostic problems in other diseases with available longitudinal cohort studies.
Data availability
The data acquired from the undertaken cohort study are not publicly available to ensure the patients’ rights to privacy and anonymity. Contact the corresponding author for further data access queries. The proposed triclustering-based classifier was coded in Python and is available at https://github.com/dfmsoares/triclustering-based-classifier together with a demo example. The notebooks with model interpretability for all the target endpoints are available in the same repository.
References
Henriques, R. & Madeira, S. C. Triclustering algorithms for three-dimensional data analysis: A comprehensive survey. ACM Comput. Surv. 51, 95 (2019).
Madeira, S. C. & Oliveira, A. L. Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinf. 1, 24–45 (2004).
Amar, D., Yekutieli, D., Maron-Katz, A., Hendler, T. & Shamir, R. A hierarchical bayesian model for flexible module discovery in three-way time-series data. Bioinformatics 31, i17–i26 (2015).
Kakati, T., Ahmed, H. A., Bhattacharyya, D. K. & Kalita, J. K. Thd-tricluster: A robust triclustering technique and its application in condition specific change analysis in hiv-1 progression data. Comput. Biol. Chem. 75, 154–167 (2018).
Zhao, L. & Zaki, M. J. Tricluster: An effective algorithm for mining coherent clusters in 3d microarray data. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, SIGMOD ’05, 694–705 (ACM, 2005).
Heffernan, C. et al. Management of respiration in mnd/als patients: An evidence based review. Amyotroph. Lateral Scler. 7, 5–15 (2006).
Chiò, A. et al. Global epidemiology of amyotrophic lateral sclerosis: A systematic review of the published literature. Neuroepidemiology 41, 118–130 (2013).
Conde, B., Winck, J. C. & Azevedo, L. F. Estimating amyotrophic lateral sclerosis and motor neuron disease prevalence in portugal using a pharmaco-epidemiological approach and a bayesian multiparameter evidence synthesis model. Neuroepidemiology 53, 73–83 (2019).
Paganoni, S., Karam, C., Joyce, N., Bedlack, R. & Carter, G. T. Comprehensive rehabilitative care across the spectrum of amyotrophic lateral sclerosis. NeuroRehabilitation 37, 53–68 (2015).
Londral, A., Pinto, A., Pinto, S., Azevedo, L. & De Carvalho, M. Quality of life in amyotrophic lateral sclerosis patients and caregivers: Impact of assistive communication from early stages. Muscle Nerve 52, 933–941 (2015).
Andersen, S. A. et al. Efns guidelines on the clinical management of amyotrophic lateral sclerosis (mals)-revised report of an efns task force. Eur. J. Neurol. 19, 360–375 (2011).
Carreiro, A. V. et al. Prognostic models based on patient snapshots and time windows: Predicting disease progression to assisted ventilation in amyotrophic lateral sclerosis. J. Biomed. Inform. 58, 133–144 (2015).
van der Burgh, H. K. et al. Deep learning predictions of survival based on mri in amyotrophic lateral sclerosis. NeuroImage: Clin. 13, 361–369 (2017).
Pfohl, S. R., Kim, R. B., Coan, G. S. & Mitchell, C. S. Unraveling the complexity of amyotrophic lateral sclerosis survival prediction. Front. Neuroinform. 12, 36 (2018).
Grollemund, V. et al. Machine learning in amyotrophic lateral sclerosis: Achievements, pitfalls, and future directions. Front. Neurosci. 13, 135 (2019).
Zandonà, A., Vasta, R., Chiò, A. & Di Camillo, B. A dynamic bayesian network model for the simulation of amyotrophic lateral sclerosis progression. BMC Bioinform. 20, 118 (2019).
Tavazzi, E. et al. Leveraging process mining for modeling progression trajectories in amyotrophic lateral sclerosis. BMC Med. Inform. Decis. Mak. 22, 1–17 (2022).
Tavazzi, E. et al. Predicting functional impairment trajectories in amyotrophic lateral sclerosis: A probabilistic, multifactorial model of disease progression. J. Neurol. 269, 3858–3878 (2022).
Leão, T., Madeira, S. C., Gromicho, M., de Carvalho, M. & Carvalho, A. M. Learning dynamic bayesian networks from time-dependent and time-independent data: Unraveling disease progression in amyotrophic lateral sclerosis. J. Biomed. Inform. 117, 103730 (2021).
Papaiz, F., Dourado, M. E. T., Valentim, R. A. D. M., Morais, A. H. F. D. & Arrais, J. P. Machine learning solutions applied to amyotrophic lateral sclerosis prognosis: A review. Front. Comput. Sci. 47, 58 (2022).
Müller, M., Gromicho, M., de Carvalho, M. & Madeira, S. C. Explainable models of disease progression in ALS: Learning from longitudinal clinical data with recurrent neural networks and deep model explanation. Comput. Methods Progr. Biomed. Update 1, 100018 (2021).
Pires, S., Gromicho, M., Pinto, S., Carvalho, M. & Madeira, S. C. Predicting non-invasive ventilation in als patients using stratified disease progression groups. In 2018 IEEE International Conference on Data Mining Workshops (ICDMW) 748–757 (IEEE, 2018).
Pires, S., Gromicho, M., Pinto, S., de Carvalho, M. & Madeira, S. C. Patient stratification using clinical and patient profiles: Targeting personalized prognostic prediction in als. In International Work-Conference on Bioinformatics and Biomedical Engineering 529–541 (Springer, 2020).
Gromicho, M. et al. Dynamic bayesian networks for stratification of disease progression in amyotrophic lateral sclerosis. Eur. J. Neurol. 29, 2201–2210 (2022).
Martins, A. S., Gromicho, M., Pinto, S., de Carvalho, M. & Madeira, S. C. Learning prognostic models using diseaseprogression patterns: Predicting the need fornon-invasive ventilation in amyotrophic lateralsclerosis. IEEE/ACM Transactions on Computational Biology and Bioinformatics (2021).
Matos, J. et al. Unravelling disease presentation patterns in als using biclustering for discriminative meta-features discovery. In International Work-Conference on Bioinformatics and Biomedical Engineering 517–528 (Springer, 2020).
Soares, D. et al. Towards triclustering-based classification of three-way clinical data: A case study on predicting non-invasive ventilation in als. In International Conference on Practical Applications of Computational Biology & Bioinformatics 112–122 (Springer, 2020).
Soares, D. F., Henriques, R., Gromicho, M., de Carvalho, M. & Madeira, S. C. Learning prognostic models using a mixture of biclustering and triclustering: Predicting the need for non-invasive ventilation in amyotrophic lateral sclerosis. J. Biomed. Inform. 134, 104172 (2022).
Pancotti, C. et al. Deep learning methods to predict amyotrophic lateral sclerosis disease progression. Sci. Rep. 12, 1–10 (2022).
Beghi, E. et al. Outcome measures and prognostic indicators in patients with amyotrophic lateral sclerosis. Amyotroph. Lateral Scler. 9, 163–167 (2008).
Madeira, S. C., Teixeira, M. C., Sa-Correia, I. & Oliveira, A. L. Identification of regulatory modules in time series gene expression data using a linear time biclustering algorithm. IEEE/ACM Trans. Comput. Biol. Bioinf. 7, 153–165 (2008).
Divina, F., Pontes, B., Giráldez, R. & Aguilar-Ruiz, J. S. An effective measure for assessing the quality of biclusters. Comput. Biol. Med. 42, 245–256 (2012).
Chiò, A., Hammond, E. R., Mora, G., Bonito, V. & Filippini, G. Development and evaluation of a clinical staging system for amyotrophic lateral sclerosis. J. Neurol. Neurosurg. Psychiatry 86, 38–44 (2015).
Fang, T. et al. Comparison of the king’s and mitos staging systems for als. Amyotrophic Lateral Scleros. Frontotemporal Degener. 18, 227–232 (2017).
ENCALS. Als Functional Rating Scale Revised (als-frs-r). version: May 2015 (2015).
Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. Smote: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems 4768–4777 (2017).
Lundberg, S. M. et al. From local explanations to global understanding with explainable ai for trees. Nature Mach. Intell. 2, 2522–5839 (2020).
Acknowledgements
This work was partially supported by Fundação para a Ciência e a Tecnologia (FCT), the Portuguese public agency for science, technology and innovation, funding to projects AIpALS (PTDC/CCI-CIF/4613/2020), LASIGE Research Unit (UIDB/00408/2020 and UIDP/00408/2020) and INESC-ID Research Unit (UIDB/50021/2020) and a PhD research scholarship (2020.05100.BD) to DFS; and by the BRAINTEASER project which has received funding from the European Union’s Horizon 2020 research and innovation programme under the grant agreement No 101017598.
Author information
Authors and Affiliations
Contributions
DFS: methodology, implemented the approach, analysed the data and results, writing—original draft, writing—review & editing. RH: methodology, writing—review & editing, supervision, revised the results critically. MG: data collection and preprocessing, definition of ALS case study, critically analyzed the results from a clinical point of view. MdC: performed the clinical follow-up of the patients’ cohort in Lisbon, data collection and preprocessing, definition of ALS case study, critically analyzed the results from a clinical point of view. SCM: methodology, writing—review, supervision, revised the results critically.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Appendix 1
Adding static features
Although the proposed triclustering-based classifier does not itself consider static features, we decided to add them to the learning matrices to understand whether they improve the triclustering-based classifier's performance. Table 10 depicts the results obtained with the same parameters that proved to be the best, as depicted in Table 8. In fact, static features improved the prognostic prediction for some critical endpoints, while the others remained similar. Table 1 shows the static features used.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Soares, D.F., Henriques, R., Gromicho, M. et al. Triclustering-based classification of longitudinal data for prognostic prediction: targeting relevant clinical endpoints in amyotrophic lateral sclerosis. Sci Rep 13, 6182 (2023). https://doi.org/10.1038/s41598-023-33223-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-023-33223-x