Unraveling cradle-to-grave disease trajectories from multilayer comorbidity networks

We aim to comprehensively identify typical life-spanning trajectories and critical events that impact patients’ hospital utilization and mortality. We use a unique dataset containing 44 million records of almost all inpatient stays from 2003 to 2014 in Austria to investigate disease trajectories. We develop a new, multilayer disease network approach to quantitatively analyze how cooccurrences of two or more diagnoses form and evolve over the life course of patients. Nodes represent diagnoses in age groups of ten years; each age group makes up a layer of the comorbidity multilayer network. Inter-layer links encode a significant correlation between diagnoses (p < 0.001, relative risk > 1.5), while intra-layers links encode correlations between diagnoses across different age groups. We use an unsupervised clustering algorithm for detecting typical disease trajectories as overlapping clusters in the multilayer comorbidity network. We identify critical events in a patient’s career as points where initially overlapping trajectories start to diverge towards different states. We identified 1260 distinct disease trajectories (618 for females, 642 for males) that on average contain 9 (IQR 2–6) different diagnoses that cover over up to 70 years (mean 23 years). We found 70 pairs of diverging trajectories that share some diagnoses at younger ages but develop into markedly different groups of diagnoses at older ages. The disease trajectory framework can help us to identify critical events as specific combinations of risk factors that put patients at high risk for different diagnoses decades later. Our findings enable a data-driven integration of personalized life-course perspectives into clinical decision-making.


Introduction
Multimorbidity, the occurrence of two or more diseases in one patient, is a frequent phenomenon 1, 2 . Today's reality of a 100-year lifespan brings a shifting multimorbidity burden and increased healthcare and long-term care costs 3,4 . It was estimated that mnore than 50 million people in Europe show more than one chronic condition 5 . In 6 , authors estimated that 16-57% of adults in developed countries are diagnosed with more than one chronic disease and predicted a dramatic rise of multimorbidity rates in the next years. The WHO World Report on Ageing and Health emphasizes the importance of research to better understand the dynamics and consequences of ageing 7 . Studies on multimorbidity patterns may contribute to successful ageing by the prevention of disease progression by identifying critical events which lead to a rapid deterioration of health 8, 9 . As diseases tend to co-occur and interact with each other (in a way that can worsen the course of both), they cannot be studied separately from each other 10 . The analysis of multimorbidity has recently been catalyzed by the massive collection of patient health information on diagnoses, medication, results of laboratory tests in electronic health records (EHR), and other clinical registries. Comorbidity networks have been established as tools to analyse multimorbidity in such datasets 11,12 . Age and sex-specific analyses can further be conducted to address age-and sex-dependent associations between diagnoses 13,14 . These works confirm that patients mainly develop diseases in close network proximity to disorders they already suffer.
The concept of disease trajectories has been proposed to formally describe the progression of multimorbidity over time. Disease trajectories are frequently occurring patterns or sequences of diagnoses at particular times and are typically extracted from the medical history of millions of patients. Thus, apart from the pairwise disease associations, uncovering complex disease patterns and assessing their temporal directionality is crucial for estimating disease progression, developing prediction models 15, 16 , analysing trajectories 17, 18 and their temporal patterns using clustering algorithms 19,20 . Many studies used data on in-hospital stays to construct such trajectories. A summary of applications of machine learning tools to understand the structure and formation of multimorbidity in the population was given in 21 . However, studies of multimorbidity patterns over the full life span of patients, from cradle to grave, remain scarce, as studies frequently take crosssectional approaches 2,22 .
Longitudinal analysis of multimorbidity requires large population-wide disease registries which span over multiple years, if not decades. Such analyses are challenging as they require custom-made methods and that are often computationally challenging 17 . Taken together, a life span perspective on multimorbidities addressing the need for more comprehensive knowledge on disease trajectories and their critical events is largely missing to date 23 .
Here, we propose a novel approach to dynamical comorbidity networks from longitudinal population-wide healthcare data to comprehensively identify disease trajectories in an entire population. A multilayer comorbidity network is constructed where nodes correspond to diagnoses, layers to age groups, intralayer links to disease co-occurrence relations, and interlayer links encode the directionality of disease pairs (which diagnosis tends to occur first). We identify temporal disease trajectories as communities in this multilayer network. In some cases, these tightly connected communities share some nodes and be referred to as overlapping communities.

Cohort extraction
All patients with at least one hospital visit in Austria during period 2003-2014 n = 7,239,710 Patients without hospital stay during 1997-2003 n = 3,955,028

Multilayer Network construction
Filtering of the network Overlapping community detection in multilayer network

Analysis of detected communities
Data on all hospital visits in Austria from 1997 to 2014 Stratification by age and 6 time windows of 2 years Cochran-Mantel-Haenszel method

Network
Clustering

Interlayer links Intralayer links
Data split it in two time frames T1 = [2003, 2008] and T2 = [2009,2014] Link weights calculation The central assumption of our approach is that communities of nodes in the comorbidity network represent patients' diseases trajectories. We identify overlapping communities rather than exclusive clusters as the same diseases (nodes) can naturally be part of different disease trajectories, i.e. sleep disorders in patients with and without obesity and diabetes mellitus type 2. We further try to identify critical events as points along trajectories, where two initially identical trajectories start to diverge and will lead to different outcomes in terms of disease burden (hospital utilization) and mortality. Figure 1 illustrates the suggested methodology of this large-scale disease trajectory study. We analysed data from an electronic health registry covering almost all of 8.9 million Austrians with more than 44 million in-hospital stays over 17 years, from 1997-2014. To ensure the comparability of the health status of our study population, we restricted the analysis to patients who were "healthy" at the beginning of the observed period between 2003 and 2014. Therefore, in the first step of the analysis, we identified as the study population all patients with at least one hospital stay between 1997 and 2002 with a diagnosis from the range A00-N99 (in total 1,081 diagnoses). Moreover, in the early 2000s, Austria transitioned from the previous ICD coding system to ICD-10 2001. It was crucial to avoid combining various classification systems as it would have compromised the reliability of the analysis, Figure 1 (blue box).
In a next step we then constructed a multilayer comorbidity network to explore how different disease conditions co-occur and develop over time. We separated our data into 10-year age groups. For every age group we introduced a layer in the multilayer comorbidity network. In this network, two types of link can be found, links that connect nodes in the same layer (intralayer links) and links that connect nodes from different layers (interlayer links). All identified significant correlations of diagnoses in the same age group are defined as intralayer links, while interlayer links represent the correlation between diagnoses in different age groups, Figure 1 (green box). Nodes without any intralayer links were removed, Figure 1 (red box).
We used an algorithm based on the local optimization of a fitness function presented in 24 to identify overlapping communities in the multilayer network, Figure 1 (orange box). Note that the detected communities typically encompass more than one age layer. We analysed the age structure of the detected overlapping communities and the number of chapters of diagnoses inside the communities. More concrete, we conceptualize disease trajectories as groups of diagnoses that occur at different age groups (layers in the network) and that are more closely connected to other diagnoses in the same community compared to diagnoses outside of the community.
As disease trajectories can overlap, this enables us to comprehensively study relationships between disease trajectories across more than one age group. We defined pairs of trajectories as converging if they do not overlap (no shared diagnoses) in younger age groups while they have a nonzero overlap in older age groups. Additionally, diverging pairs of trajectories overlap at the beginning, in younger age groups, but have different pathways in older age groups.
From this we can identify critical events in patient careers. Critical events are defined as combinations of diagnoses in a specific age group, mainly chronic conditions, that signal that the disease trajectories are about to diverge towards paths that lead to different levels of mortality or lengths of hospital stays in the following age groups. Critical events can be thought of as bifurcation points of disease trajectories that can lead to trajectories associated with strongly varying outcomes. These events can support the identification of patients at risk for more severe multimorbidity trajectories and associated adverse outcomes in the next decade and thereby provide leverage points for targeted preventive actions.

Data
The analysed dataset spans 17 years of nationwide in-hospital data from all hospitals in Austria. Each hospital stay is recorded with primary and secondary diagnoses, age in the resolution of 5 years, sex, admission and release date, release type (i.e., release, transfer, death...). This dataset covers the period from 1997 until 2014 and the vast majority of Austria's population with 8.9 unique patients. Diagnoses are coded with the three-digit International Classification of Diseases, 10th Revision (ICD-10) codes. We restricted our analysis to 1,081 codes from A00 to N99, excluding codes describing health encounters that can not be directly related to diseases (i.e., O00-O9A -Pregnancy, childbirth, and the puerperium, S00-T88 -Injury, poisoning and certain other consequences of external causes...). The data always reports a primary diagnosis as the main reason for hospitalization, along with a variable number of secondary diagnoses.
In this study, we assigned equal importance to both primary and secondary diagnoses 19,25 . To ensure that our study population's health state was comparable at the beginning of the observation period and not in the middle of connected hospitalization episodes, we introduced a wash-out period and limit the analysis to patients who had no hospital visits between 1997 and 2002. Consequently, excluding these patients also ensured that analysed data has only one ICD coding system, as in the early 2000s, Austria updated its ICD coding system to ICD-10 2001 19, 25, 26 .

Multilayer Comorbidity Network
Formally, we construct the multilayer comorbidity network given by the tensor M α,β i,j where i and j refer to nodes (diagnoses) on layers (age groups) α and β, respectively. We refer to entries in M with α = β as intralayer links and with α ̸ = β as interlayer links. The analysis was performed separately for male and female patients.
Intralayer links Intralayer links give the correlation between diagnoses within the same age group. The analysed dataset was stratified by six time windows of two years each, from 2003 to 2014. A contingency table is created for each pair of diagnoses in each stratum (for each sex and age group, the intralayer analysis includes six strata, each covering two calendar years). We used all contingency tables with more than four patients in each subgroup to compute relative risks (RR) and the p-value for rejecting the null hypothesis that the co-occurrence of two analysed diagnoses is statistically independent 26 . A weighted average of the estimates of the risk ratios and odds ratios across the stratified data were calculated by using the Cochran-Mantel-Haenszel method 27 .
Subsequently, all correlations with RR higher than 1.5 and p-value smaller than 0.05 were extracted and presented as intralayer links 14 . These links are bidirectional, and we use a normalized RR as the link weight. The normalization of RR was done such that the sum of all total weights of all intralayer links with the same target was one.
Interlayer links To estimate directionality or time order in pairs of diagnoses, we split the observation period in two time frames T 1 = [2003, 2008] and T 2 = [2009,2014]. We investigate if a patient diagnosed with i in T 1 elevates the risk of being diagnosed with j in T 2 and compute the interlayer link weight as . (1) Overlapping community detection in multilayer network We deleted all nodes without at least one inbound and one outbound link. Further, we normalized all link weights to range from 0 to 1 by dividing each link's weight by the sum of all links of the same type of a target node The algorithm for detecting the overlapping and hierarchical community structure in complex networks proposed in 24 was applied. This unsupervised clustering algorithm does not have a predefined number of communities. The detection procedure is initiated starting with a random node, which represents one community by itself.
where k G in are the total internal degrees of the nodes in the community G and k G out are the total external degrees of the nodes in the community G.
As long as the f G improves, neighboring nodes are added, or nodes already community members are removed.
The entailed resolution parameter a enables us to uncover different hierarchical levels of a system, natural choice is a = 1. Fitness is calculated at each step. Once the fitness cannot be increased anymore by a node removal or addition step, that community is "completed" and "closed." The community detection process ends when all nodes have been assigned to at least one community. To parallelize and optimize this computationally costly process, we identify the community of every node and delete duplicates among the discovered communities.
Identified communities usually consist of diseases in different age groups that tend to co-occur more frequently among themselves than diseases that are not part of the community. Hence, these communities represent typical disease trajectories; we denote a trajectory X as a set of diagnosis-age tuples, X = {(i 1 , α 1 ), (i 2 , α 2 ), (i 3 , α 3 )...}, where i is an ICD10 code ranging from [A00, N99] and α is the age group from [1,8].
We measure the similarity of trajectories by the Jaccard coefficient between two trajectories consisting of tuples with diagnoses i and age groups α, (i, α). That is, two trajectories have a non-zero overlap if they share diagnoses within the same age groups.
Identifying converging and diverging trajectories We performed a comprehensive classification with respect to all pairwise relations between every pair of trajectories. Provided that two trajectories share at least one diagnosis, they can be related in one of four different ways, namely (i) diverging, (ii) converging, (iii) nested, or (iv) persistent, Figure 2.
Diverging trajectories have some overlapping elements at younger ages, but they develop into markedly different sets of diagnoses at older ages.
Converging trajectories overlap at older ages but are clearly different at younger ages. Trajectories X and Y are converging if it holds that where Two trajectories are nested if one of them is a subset of another one, X ⊂ Y or Y ⊂ X. Persistent trajectories X and Y can overlap in the highest age group of X and lowest age group of Y , or vice versa.

Diverging
Converging Nested Persistent Identifying critical events We define critical events by one or a combination of diagnoses and age groups where two trajectories begin to diverge and where one of the diverging trajectories has patients with considerably higher number of diagnoses, higher mortality or more extended hospital stays in the subsequent age group(s) compared to the other diverging trajectory. Mortality of a trajectory for a certain age group is calculated as where m is the in-hospital mortality of a diagnosis (defined as the percentage of patient diagnosed with the diagnose in a specific age group who die in-hospital) which is a member of a trajectory. Length of hospital stay of a trajectory in a certain age group is defined as the average number of days spent in hospital for patients who are diagnosed with at least half of all diagnoses from a trajectory.

Multilayer Comorbidity Network
We constructed the multilayer comorbidity network based on hospital data, basic characteristics of the database are shown in Figure S1. We used all 3-digits ICD10 codes from the range A00-N99 and one more newly introduced code for patients without any diagnosis, in total 1,082 codes. Nodes in the constructed network are ICD10 codes appearing in one of eight different age groups, i.e. E66-0-9, E66-10-19, etc. Hence, we used 8,648 nodes to construct a multilayer comorbidity network with eight layers (one for each ten years age group, 0-9, 10-19,... 70-79 years old). We filtered the network by removing nodes without any intralayer links. This reduced the network from 8,648 nodes to 4,923 nodes for males and 4,764 nodes for females. The average degree in the filtered male network is 11.6 SD 39.7, for the female network the average degree is 15.8 SD 46. The number of hospital stays, Figure 3(A), and nodes N Figure 3(B) increases with age, reaches a peak at ages 60 to 69 and decreases for older ages. We see similar age trends in Figure 3(C) the total number of links and Figure 3(D) the average degree for intralayer as well as in-or outbound interlayer links for males and females.

Trajectories
The unsupervised community detection algorithm discovered 642 distinct disease trajectories in the male and 618 in the females network; they are listed in SI, Tables S1-S2, and shown in Figure 4. These trajectories contain on average 9 (IQR 2-6) different diagnoses that range over up to 7 age groups (mean: 2.3 age groups), meaning that these trajectories range on average over 20-30 years and in some cases over up to 70 years of life. Besides trivial examples like a trajectory with the only diagnosis being K51 (ulcerative colitis) in each age group in males, we also found more complex trajectories spanning 70 years. For instance, for female patients there is a trajectory that starts with personality disorder (F61) at the age of 20-29y. Over the following decades there is an accumulation of mental disorders including depression (F33), post-traumatic stress disorder (F43) and eating disorders (F50) in 50-59y, followed by anxiety disorders (F40) and a few more non-chronic diagnoses in 60-69y. Figure 4: (A) Multilayer comorbidity network of female patients. Nodes represent diagnoses in age groups of ten years; each age group makes up a layer of the comorbidity multilayer network. The node color indicates the diagnose chapter; the diagnosis prevalence of the node in the network scales node size. All identified trajectories of female (B) and male (C) patients. The members of the trajectories are nodes of a multilayer comorbidity network (diagnose + 10 years age group). The Y-axis approximates the age groups of nodes; the node color indicates the diagnose chapter, and each grey area around nodes is one trajectory. More detailed visualization of these plots can be found in the interactive web application: (A) https://vis.csh.ac.at/netviewer/ and (B) & (C) https://vis.csh.ac.at/comorbidity_network_graphics/ The distribution of the size of the trajectories (number of diagnoses-age tuples) is presented in Figure 5 (A). Most trajectories contain between 3 and 5 diagnoses-age combinations; while a few trajectories contain more than hundred elements. We split trajecto-ries into seven groups based on the number of age groups in the trajectory and analysed the number of different disease chapters in one trajectory Figure 5 (B). This shows that trajectories typically span heterogeneous chapters of ICD codes, meaning that they often span diagnoses affecting quite different organ systems. We calculated the Jaccard index to inspect the pairwise similarity and dissimilarity of trajectories; see the distribution of this index in Figure 5 (C). Jaccard indices range between zero and one, indicating varying degrees of similarity between two trajectories. The most common relationship among pairs of trajectories is nested, which explains the peak at one in the Jaccard index. Figure 5 (D) shows frequency statistics of different types of trajectory pairs. In Figure 6 we show a more detailed view of some of the trajectories from Figure  4. We show two examples for trajectories (grey areas) departing from (A) hypertension (I10) at an age of 10-19y in females and (B) sleep disorders (G47) at an age of 20-29y in males. In both cases, different combinations of other diagnoses appear in subsequent decades. The hypertension trajectory diverged into chronic kidney diseases (2,289 patients) or (1,027 patients) a combination of metabolic (obesity, disorders of lipoprotein metabolism) and digestive disorders (liver diseases, cholelithiasis) with nicotine abuse. The sleep disorder trajectory diverged either toward the metabolic syndrome (including obesity and type 2 diabetes) in 115 patients or towards a combination of movement disorders, hernia, obesity and diseases of the middle ear (316 patients).
In total, we identified 35 pairs of such diverging trajectories in females and 35 in males; see Figure 5 D). On average, diverging trajectories have 2.9 SD 0.8 age groups, 3.5 SD 1.8 different diagnoses chapters, and 8.1 SD 4.7 different diseases for females, and for males 3.0 SD 1 age groups, 3.5 SD 2.9 different diagnoses chapters, and 11 SD 11 different diseases. While there are 64 pairs of converging trajectories in females and 95 in males, converging trajectories in females have: 2.8 SD 0.9 age groups, 4.2 SD 3.2 different diagnoses chapters, and 26 SD 79 different diseases, in males: 3 SD 1 age groups, 3.8 SD 3.5 different diagnoses chapters, and 22 SD 68 different diseases. Some of the trajectories are persistent (16 pairs of trajectories in females, 14 in males). These can be combined as they overlap at the end of X trajectory and the beginning of the Y trajectory.
The most frequent relationship between trajectories was the complete overlap of shorter and longer trajectories, which we defined as nested. We found 314 pairs of nested trajectories among female trajectories and 266 in males trajectories.
We designed and implemented an online visualization tool that allows a user to interactively explore the comorbidity network structure and the underlying diagnose data, https://vis.csh.ac.at/netviewer/.

Outcomes of trajectories
For every trajectory, we calculated (in-hospital) mortality and the number of days spent in the hospital for each age group, Figure 7. In-hospital mortality for each trajectory is shown in the yellow outer circle. The analysis reveals notable variations in mortality rates across trajectories, with younger age groups generally exhibiting lower mortality. Moreover, it is evident that certain trajectories undergo significant shifts in mortality as they progress into older age groups. The green circle represents the average duration of hospitalization for trajectories, while the blue circle denotes the number of diagnoses, and the purple inner circle signifies the count of patients who followed at least 50% of a given trajectory. Notably, the green circle highlights discernible differences in the number of hospital days Figure 6: Two examples of diverging trajectories, (A) departing from hypertension (I10) at an age of 10-19y in females diverges to the "kidney-trajectory" and the "metabolic trajectory" (B) departing from sleep disorders (G47) at an age of 20-29y in males diverges to metabolic trajectory with diabetes mellitus type 2 (E11), obesity (E66), lipid disorders (E78) and hyperuricemia (E79) and path with movement disorders or otitis media (G25), obesity (E66) and abdominal hernia (K46).
among different trajectories. Some trajectories have a clearly higher number of hospital days compared to other trajectories; these trajectories mainly consist of mental and behavioral disorders (F chapter) and infectious and parasitic diseases (B chapter) in males, while in females, besides these we see diseases of musculoskeletal and connective tissue (M chapter) and diseases of the nervous system (G chapter).
We also compared outcomes of diverging trajectories; some examples are shown in Table 1 (extended tables in SI, Table S8 and S9). We calculated an average number of hospital diagnoses, hospital days, and hospital stays for each age group in each trajectory over all patients following these trajectories. We calculated the ratio of each outcome of trajectories in each diverging pair to check if these trajectories develop into different outcomes in terms of disease burden and mortality. For example, both trajectories from the pair starting with N81 in 50s are characterised with a similar average number of hospital diagnoses in the 20s, while in the 30s patients of the second trajectory have, on average, 24% more hospital diagnoses. In the same example, we see that patients of the first trajectory, on average have more days spent in hospital and hospital stays in the 20s (Ratio of average number of / number of days spent in hospital = 1.547 / hospital stays = 1.548), but in 30s patients of the second trajectory, have remarkably more days spent in hospital and hospital stays (Ratio of average number of / number of days spent in hospital = 0.331 / hospital stays = 0.551), Table 1. : Outcomes of trajectories that span more than one age group (A) Females (B) Males, the outer yellow circle shows mortality of each trajectory in each age group, the green circle presents an estimation of the average number of hospital days of patients of the trajectory, while the blue circle presents an estimation of the average number of diagnoses, the inner purple circle shows a number of patients who are following each trajectory in each age group.Each trajectory is represented by a single line within each circle, which is further divided based on the age groups the trajectory encompasses.

Discussion and conclusion
In this work we introduced a novel method to identify life-course disease trajectories, in some cases spanning up to 70 years of life, in terms of sequences and combinations of hospital diagnoses that form and change over time. Our comprehensive analysis identified 642 disease trajectories in males and 618 in females ranging over the entire diagnostic spectrum (41% of males and 42% of female trajectories contained diagnoses from more than one ICD chapter). While the most common length of these trajectories was two diagnoses for both sexes, on average they contained 5.3 SD 5.1 and 5.4 SD 5.5 diagnoses for males and females, respectively, emphasizing the heterogeneous and widespread nature of multimorbidity in the general population.
There is a substantial variation in the number of patients that follow a trajectory. We count patients for each trajectory for each age group if they have at least 50% of diagnoses from a trajectory. In general, shorter trajectories tend to be followed by more patients (more than 10,000 patients per trajectory per age group) than longer, more specific ones that typically contain approximately a hundred patients. The number of patients in a trajectory typically increases with age.
The trajectories foster the rapid identification of critical events. These can take the form of bifurcation points where a trajectory "splits up" into multiple diverging trajectories at a specific age group. More concrete, we found 35 pairs of diverging trajectories for females and 35 pairs for males. For example, in females diagnosed with arterial hypertension (I10) between 10 and 19 years, two major trajectories were identified by the model. The first trajectory lead to the additional diagnosis of chronic kidney disease (N18) at an age of 20-29 years. This is clinically relevant as the number of pediatric arterial hypertension is increasing worldwide 28 and it is well known that aHTN is closely related to chronic kidney disease. So our results point out that from a clinical point of view, a strict monitoring for arterial hypertension should be established especially in children at high risk, such as obese children or children with the metabolic syndrome. Arterial hypertension does not only mean increased risk for chronic kidney disease, but also other complications such as cardiovascular disease. The second trajectory was characterized by patients with the metabolic syndrome; these patients were disproportionally diagnosed with obesity (E66), lipid disorders (E78), steatosis hepatis (K76), cholelithiasis (K80) and nicotine abuse (N17) in their further life. In general, we therefore have two trajectories in females initially diagnosed with arterial hypertension, which are in principal dangerous conditions -the "kidney-trajectory" and the "metabolic trajectory". We found that approximately 2,289 patients follow the "kidney-" and 1,027 patients follow the metabolic trajectory. These trajectories are mostly important as metabolic diseases belong to the most common diseases worldwide and also chronic kidney disease is a disease which is related to multi morbidity and increased mortality rate.
In a different example we found that sleeping disorders (G47) in males diagnosed in the age groups between 20-39 years were also followed by a metabolic trajectory which was defined by an over-representation of later diagnoses of diabetes mellitus type 2 (E11), obesity (E66), lipid disorders (E78) and hyperuricemia (E79). The other trajectory, diverging from sleeping disorders, is characterized by a higher chance of being diagnosed with movement disorders or otitis media (G25), obesity (E66) and abdominal hernia (K46). We found substantial differences in the average number of diagnoses and hospital days between patients of different branches of these diverging trajectories. While patients who followed these two trajectories showed similar average numbers of diagnoses at age 20-29 (3.3 diagnoses in both cases), patients who followed a metabolic trajectory had, on average, 3.9 diagnoses ten years later while patients who followed the other trajectory had, on average, 5.1 diagnoses. The number of sleeping disorders is on the rise and these results show that patients with sleeping disorders have to be monitored for several diseases in different trajectories. Our analysis also identified several instances were diverging trajectories differed substantially in their mortality, in some cases of up to 18 times.
In terms of mortality we identify trajectories that develop into a combination of diagnoses with high mortality in older age groups. For instance, a trajectory consisting of chronic bronchitis and COPD at an age of 40-49y, bronchiectasis and intraoperative and postprocedural complications at 50-59y and finally in sequelae of tuberculosis, inflammatory polyneuropathy, conjunctivitis, bronchitis, bronchiectasis, eosinophilia and again intraoperative and postprocedural complications in 60-69y in males had eight times higher mortality in the age group 60-69y compared to its' mortality ten years earlier (mortality increased from 0.089 in 40-49y to 0.013 before jumping to 0.11 in 60-69y). Trajectories with the highest mortality usually contain cancer diagnoses, but cardiovascular or respiratory diseases also feature in the trajectories with high mortality.

Strengths and Limitations
Strengths of this study include its comprehensive population-wide in-hospital database, containing information on about 9 million individuals. Non-systematic errors, such as randomly missing diagnoses, have little impact on our research because of the volume of the data set. However, this study has some limitations caused by data quality and limitations in data availability, in particular, the lack of information on outpatient visits, medication and lifestyle. Consequently, we cannot evaluate the outcomes of outpatient visits, blood tests, examinations, or imaging because primary care diagnoses are not recorded in this dataset; only hospital diagnoses coded with ICD10 codes were available for analysis.
Another drawback is that the database was designed for billing purposes, so diagnoses that did not result in financial compensation were frequently not reported. Therefore, we have to point out that some diseases, such as alcohol related disorders or nicotine dependence, are often not recorded correctly in our data. Further, socio-economic indicators for individual patients were also not available in the dataset, leaving it yet to be explored how socio-economic status impacts on these trajectories. An additional constraint associated with the dataset is the exclusive availability of in-hospital mortality data. On a methodological level, it is also important to bear in mind that a constructed multilayer comorbidity network has two types of links (with normalized links weights); but these types are not distinguishable by the used community detection algorithms.
In summary, we presented a novel and statistically grounded way of studying disease progression over time based on a population-wide and decade-spanning data set of hospital diagnoses. We proposed an age multilayer comorbidity network as a base for our modelling approach. We showed that this kind of network is a promising approach for better understanding disease trajectories and their dynamics as patients age. While some of the identified trajectories in this study have been described in previously published studies, many novel disease trajectories and their decades-long time dynamics have been revealed. A better understanding of diseases, their correlations and the sequences in they occur has the potential to improve the prevention of focal diseases. Early detection and identification of a patient's projected disease trajectory might enable prompt and timely treatments next to targeted preventive action. Consequently, that will help transition health systems from single-disease models to more effective life-spanning and individualized multimorbidity models 29 .

Author Contributions
ED and PK conceived the study and devised the analytic methods. ED wrote the manuscript with contributions from PK, ST, ML and JS. ED carried out the analysis and produced the plots and graphics. JS and LY designed and implemented the visualisation too. AK-W, AK and ML contributed medical expertise regarding the medical interpretation of the findings and in developing medical hypotheses. ED and PK researched and prepared the data. All authors reviewed and contributed to the manuscript.
Unravelling cradle-to-grave disease trajectories from multilayer comorbidity networks  We designed and implemented an online visualization tool that allows a user to interactively explore the comorbidity network structure and the underlying diagnose data, https://vis.csh.ac.at/netviewer/. The tool positions the comorbidity network in a hierarchical fashion -one hierarchy layer for each age group; within each layer the nodes represent diagnoses, labelled by ICD10 code. A layer places nodes on a 2D plane within 3D space and signifies the associated age group by a label. The node color indicates the diagnose chapter; node size is scaled by the diagnosis prevalence of the node in the network. Connections between nodes indicate the likelihood that a pair of diseases is co-occurring in the link color (from white (unlikely) to orange (very likely)). A multitude of options to explore network properties and structure are at the user's disposal. A user can search for diagnoses (top left), or select one of the pre-computed communities (Select Group Button). When selecting a node or a community the camera in the main view automatically transforms to a position that puts all selected network elements into view. A two dimensional representation of the current community is displayed in the bottom right corner.

Suplementary Information
When selecting a node in the main 3D view or in the 2D community view, information on detailed node attributes are displayed in the info box on the left. The info box also displays a list of a node's community memberships as well as incoming and outgoing edges of the currently selected node, along with the edge weights. Selecting a community or an edge automatically transforms the view to the selected element.