Patients [1] with rare diseases are a major challenge for healthcare systems. Rare diseases affect ~350 million people worldwide [2]. According to the World Health Organization (WHO), ~7000 rare diseases affect 7% of the world’s population [3]. It is estimated that rare diseases affect 5–8% of the global population and account for 4.5–4.5% of all hospitalization costs [3,4,5,6]. In the United States, the number of patients living with rare diseases is estimated to exceed 30 million. Up to 40% of these patients are misdiagnosed, resulting in an exacerbation of their underlying disease with a concomitant incidence of comorbidities and decreased quality of life (QoL). More than 90% of rare diseases do not currently have effective therapies. Among patients who are correctly diagnosed, the majority lack access to appropriate drugs. A significant proportion of these patients do not respond to therapies or suffer partial or complete loss of effect over time, which leads to multiple admissions and long-term complications. In the US alone, patients with rare diseases account for direct medical costs in excess of $418 billion per year. These “diagnosis gaps” and “drug gaps” create an opportunity to develop better methods for the diagnosis, treatment, and monitoring of these patients [7]. In the present paper, we describe a second-generation artificial intelligence (AI) system that provides an inclusive answer to the unmet needs of patients with rare diseases.

Current problems in diagnosis, therapy, and follow-up of patients with rare diseases

A review of recently published studies suggests that there are three major gaps in providing adequate care for patients with rare diseases: (a) failure to diagnose, late diagnosis, and misdiagnosis, which are highly common in these patients; (b) lack of a satisfactory response to therapies, with an estimated 20% to 40% of patients with rare diseases not responding well to current therapies; and (c) lack of proper monitoring tools. All three gaps are associated with poor outcomes and are major sources of economic burden on healthcare systems [8].

Many patients with rare diseases experience a diagnostic odyssey, that is, extensive and prolonged serial tests and clinical visits, sometimes over many years, all in the hope of identifying the etiology of their disease [3]. The diagnostic process in rare diseases is usually based on classic clinical practices, such as physical examination, personal and family history, laboratory tests, and imaging studies. Delayed diagnosis is frequent because most clinicians have limited knowledge of these diseases and expert centers are few [4].

The search for pathogenic variants in rare human genetic diseases has involved significant efforts to sequence coding regions or the entire genome. The expanding catalog of variants in ~4000 genes recently listed ~6500 diseases and their annotated phenotypes [5]. However, the approximate current diagnostic rate is <50% using these approaches, and many rare genetic diseases remain undiagnosed [9].

For patients with rare diseases, obtaining a genetic diagnosis can mean the end of the diagnostic odyssey, and the beginning of another, the therapeutic odyssey. Most rare diseases still lack approved treatments despite major advances in research providing tools to understand their molecular basis, as well as legislation providing regulatory and economic incentives to catalyze the development of specific therapies known as orphan drugs [8]. Even though innovations in diagnosis and treatment strategies are helping some patients achieve clinical improvements, their limited efficacy, the high cost of drugs, and limited health care budgets restrict access to therapies. Both direct and indirect high costs of treatments create an economic burden for patients, families, and health care systems.

There are three major problems associated with the development of therapies for rare diseases. First, among the over 7000 rare diseases, the vast majority are caused by genetic defects and/or include neurodegeneration, making them difficult to treat. Second, drugs for rare diseases, termed orphan drugs, are expensive because each is needed by only a small number of patients. Third, a significant proportion of these patients do not respond to available therapies because of partial or complete loss of response (LOR). This results in an economic trap of rare diseases: despite the high biomedical, pharmaceutical, and technological potential, the development of new drugs is blocked by the economic reality and the lack of effectiveness of the drugs [10].

Recent data indicate that substantial improvements are needed to achieve patient-centricity in the management of patients with rare diseases [11]. The current paradigm for developing and using patient-reported outcome (PRO) measures in clinical research is failing to meet the needs of patients with rare diseases. Less than half of pivotal trials of orphan drugs have a PRO measure as a primary or secondary endpoint. Only 17% of orphan drug labels contain PRO measures [11]. Traditional randomized clinical trials may not always be feasible for patients with rare diseases. Following the release of the framework for the real-world evidence (RWE) program, the US Food and Drug Administration (FDA) and European authorities are exploring ways to optimize the utility of real-world data (RWD) and RWE to support decision-making for these patients. This highlights the need to improve the quality of data by using RWD and RWE beyond “regular hard endpoints” [12].

Examples of the unmet needs in the diagnosis, treatment, and monitoring of rare diseases

Gaucher disease (GD) is an autosomal recessive disease caused by variants in the GBA1 gene located on chromosome 1 (1q21) [13]. The effects of this disease highlight the need for better diagnostic and therapeutic tools for treating rare diseases. Its phenotype is highly variable [14]. Although the disease is mainly diagnosed in childhood, its manifestation in adults is often missed or identified late due to the failure to recognize the heterogeneous clinical presentation. Current therapeutic options for GD type 1 (GD1) are intravenous enzyme replacement therapy (ERT) and oral substrate reduction therapy (SRT) [15]; however, high inter-individual variability in the clinical response to ERT has been associated with a high risk of long-term complications. One-quarter of patients still have thrombocytopenia after four years of therapy [16]. Hepatomegaly and splenomegaly decrease by up to 60%; however, spleen volume may remain more than five times the normal volume in some patients [17]. Furthermore, the response of bone manifestations is slower and more variable.

Cerebellar ataxia (CA) is a group of conditions resulting from cerebellar damage that may be acute or chronic, hereditary or acquired [18]. The effects of CA emphasize the need for improved therapeutic and monitoring tools in patients with rare diseases. Impairment in cerebellar function presents mainly as dyssynergia, dysmetria, tremor, poor balance, gait instability, dysarthria, and cognitive impairment. Gait abnormality is common in most patients with CA, leading to recurrent falls and subsequent injuries [19]. Disability and the inability to resume work affect many patients and increase the economic burden of the disease [20]. Because of its heterogeneous etiology, treatment is aimed at alleviating symptoms and improving functional capacity through physical therapy and rehabilitation [21]. The economic burden is impacted by disease severity, which contributes to direct and indirect costs. Significant expenses are due to the need for rehabilitation, inpatient care, caregiver salary, and loss of productivity, signifying the need to improve treatment strategies [22, 23].

Huntington’s disease (HD) is an autosomal dominant progressive neurodegenerative disease. The effects of HD emphasize the unmet need to improve the response to available therapies for rare diseases. HD is caused by an expanded trinucleotide (CAG) repeat in the huntingtin gene on chromosome 4. The symptoms of HD include chorea, psychiatric disturbances, and dementia. The most prominent symptom at presentation is usually chorea; however, psychiatric symptoms such as depression and problems with social relationships may appear years earlier [24]. Patients with advanced HD require 24-h care for all activities of daily living. This stage can last for 10 years, thereby contributing to the high economic burden associated with HD [25]. HD treatment is supportive due to the limited effectiveness of pharmacological therapies. The current first-line treatments are the vesicular monoamine transporter type 2 (VMAT2) inhibitors tetrabenazine and deutetrabenazine [26, 27]. Second-line treatments include antipsychotics.

Neuromyelitis optica spectrum disorder (NMOSD) belongs to a spectrum of neuroinflammatory disorders characterized by demyelination and axonal damage in multiple spinal cord segments and optic nerves. The effects of this disease highlight the need for better diagnostic and therapeutic tools for treating rare diseases. The prominent antibody in NMOSD is an aquaporin-4 (AQP4) antibody, whose titers correlate with disease activity [28, 29]. NMOSD most commonly manifests as optic neuritis and transverse myelitis with a relapsing course. Diagnosis is based on clinical characteristics, MRI, and AQP4 antibodies [30]. If not treated, stepwise deterioration occurs [31]. Treatment for acute attacks consists of glucocorticoids and, for severe symptoms or unresponsive patients, plasma exchange [32]. To prevent attacks, immunomodulatory drugs are recommended, including azathioprine, mycophenolate mofetil, methotrexate, and rituximab [32]. Additional treatment options include the anti-C5 antibody eculizumab and the anti-CD19 antibody inebilizumab. A survey conducted among physicians in 60 countries showed that the average annual cost of treatment for each patient is extremely high [33]. Difficulties associated with selecting the appropriate therapy and LOR are major obstacles to improving the outcome of these patients.

These examples highlight major problems in the management of rare diseases. Late diagnosis and misdiagnosis are common in patients with these disorders. A tool for early diagnosis and for early identification of patients who are likely to incur high health care costs, and of those with a low QoL, can lead to early intervention and prevention of deterioration. There is also a need for a tool to improve adherence to interventions, including non-pharmacological therapies (e.g., physiotherapy), and to increase the efficacy of chronic drugs. Monitoring tools for these patients may also improve care and decrease costs.

Currently, there is no way to determine which patients will respond to targeted treatments. A major advance in rare disease treatment would be a way to diagnose patients within a short period of time, to predict a patient’s ability to respond to therapies, and to improve the effectiveness of therapies by reducing the LOR, which is common in many of these patients. This will require recognizing elements that predict and maintain patients’ responsiveness, so that treatment can be customized for each individual patient [34].

First-generation AI platforms: advantages and disadvantages

Rare diseases, which are severely underrepresented in basic and clinical research, can benefit from AI technologies. AI in medicine is applied mainly through machine learning, a family of mathematical algorithms (including deep learning) that improve through experience. There are three types of machine learning algorithms: (i) unsupervised (ability to find patterns), (ii) supervised (classification and prediction algorithms based on previous examples), and (iii) reinforcement learning (use of sequences of rewards and punishments to form a strategy for operation in a specific problem space) [35]. First-generation AI systems largely focus on clinical decision-making through big data analysis and on developing algorithms for diagnosis and treatment [36]. While these systems were shown to be beneficial in some cases, they failed to improve clinical responses, leading to a low adoption rate by patients. AI systems trained on big data are not suitable for many rare diseases, where data are limited to a small number of patients [37, 38].

Multiple factors are involved in the successful diagnosis and treatment of patients with rare diseases. These factors include disease phenotype and severity, adherence to medications, and pharmacogenomic and pharmacokinetic factors. Several computerized diagnostic support systems have been developed for the diagnosis and treatment of these patients. The ability of AI technologies to integrate and analyze data from different sources (e.g., multi-omics and patient registries) has been proposed to overcome the challenges encountered during the management of patients with rare diseases. These challenges include low diagnostic rates, low number of patients, geographical dispersion, and LOR to therapies [39, 40].

Strategies that involve dataset analysis in combination with modeling and simulation to optimize clinical drug development have been applied in the study of rare diseases. Clinical decision support systems (CDSSs) for rare diseases are software systems that support clinicians in diagnosing patients. Nineteen different CDSSs that are clinically important have been identified. Of these CDSSs, 12 used phenotypic and genetic data, followed by clinical data, literature databases, and patient questionnaires. Fourteen are fully developed systems and are, therefore, publicly available. Data can be entered or uploaded manually in six CDSSs, whereas for four CDSSs, no information on data integration was available. Only seven CDSSs allow for further data integration. Thirteen CDSSs do not provide information on their clinical usage [41].

In a review of 211 studies that investigated 74 different rare diseases, ensemble methods, support vector machines, and artificial neural networks were the most commonly applied algorithms. Only a small proportion of studies evaluated their algorithms using external data or against human experts. As input data, images, demographic data, and “omics” data were most frequently used. Most studies used machine learning for diagnosis or prognosis, whereas studies that aim to improve treatment are relatively scarce. Patient numbers in these studies were relatively small [42].

Cliniface is a digital tool that is currently being implemented in the global commission data ecosystem. This three-dimensional facial visualization and analysis software enables collaboration between clinicians and researchers to advance the understanding of facial characteristics and their relationship with rare diseases and their treatment [43]. Tools such as PhenoTips and Dx29 [44] have been developed to facilitate phenotyping, primarily for diagnostic support. Track health enables the measuring, monitoring, and tracking of a patient’s journey in the health system [45]. The Marfan Foundation has unveiled a mobile health app that allows patients with Marfan syndrome to collect health information from disparate mHealth devices and medical records and create a personal health record. The app also helps providers to manage patients who often see several specialists. A trial conducted in patients with cystic fibrosis showed that the use of a smartphone app facilitates earlier detection of respiratory exacerbations and treatment using oral antibiotics [46].

Concerns regarding unacceptable results, problems associated with data appropriateness, and the risk of bias are some of the challenges encountered by these first-generation systems [47]. First-generation systems usually do not use understandable decision-making algorithms, which further decreases their adoption by clinicians. The improvement in accuracy during data analysis that these systems seek to achieve does not necessarily imply better clinical efficacy [48]. None of these systems showed significant market penetration, mainly because of their relatively low added value to patient outcomes [49]. Some of the available systems either focus on data collection or are simple reminders for taking drugs. These systems, therefore, cannot improve the effectiveness of therapy and adherence.

Establishing a second-generation AI system for improving the treatment of patients with rare diseases

The second-generation AI system was designed to provide an inclusive solution to the three major problems encountered during the management of patients with rare diseases, namely gaps in the diagnosis, treatment, and monitoring of patients. Second-generation systems focus on individuals and on improving the clinical outcomes of patients [37]. The personalized closed-loop approach used by these systems is designed to improve end-organ function, overcome tolerance and loss of effectiveness, and improve the responses of patients to chronic drugs.

Table 1 summarizes some of the current gaps in the diagnosis, treatment, and follow-up of patients with rare diseases and the solutions offered by the second-generation digital system.

  1. Using a second-generation system for proper diagnosis of patients with rare diseases.

    The second-generation digital system offers a tool for early and proper diagnosis of rare diseases. It works by searching datasets based on “clinical hints,” which are commonly missed when patients present early in the course of their disease with initial disease manifestations. In addition, systems are being developed to enable the early identification of patients who are likely to incur high health care costs [50]. This enables early intervention and prevention of deterioration, which reduces costs in the long run for these patients. The system provides physicians and health systems with a tool that helps to associate disease manifestations or unexplained laboratory results with a differential diagnosis, based on their probabilities of occurrence in different rare diseases.

    Table 1 Overcoming major barriers in diagnosis, treatment, and monitoring of patients with rare diseases using second-generation artificial intelligence systems.
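
The probability-based association of manifestations with a differential diagnosis, as described in this subsection, can be illustrated with a minimal sketch. It is not the system's algorithm: it assumes a naive Bayes score (findings treated as conditionally independent), and every probability, prior, and function name in it is a hypothetical placeholder rather than clinical data.

```python
from math import exp, log

# Hypothetical illustration values only: P(finding | disease). Not clinical data.
LIKELIHOODS = {
    "Gaucher disease": {"thrombocytopenia": 0.60, "splenomegaly": 0.90,
                        "gait instability": 0.05},
    "Cerebellar ataxia": {"thrombocytopenia": 0.02, "splenomegaly": 0.02,
                          "gait instability": 0.90},
}
# Hypothetical priors; equal here so that the findings alone drive the ranking.
PRIORS = {"Gaucher disease": 0.5, "Cerebellar ataxia": 0.5}
DEFAULT_LIKELIHOOD = 0.01  # assumed floor for findings not listed for a disease


def rank_differential(findings):
    """Return (disease, posterior) pairs sorted from most to least probable."""
    log_scores = {}
    for disease, prior in PRIORS.items():
        score = log(prior)
        for finding in findings:
            score += log(LIKELIHOODS[disease].get(finding, DEFAULT_LIKELIHOOD))
        log_scores[disease] = score
    # Normalize so that the posteriors over the candidate list sum to 1.
    total = sum(exp(s) for s in log_scores.values())
    posteriors = {d: exp(s) / total for d, s in log_scores.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


ranking = rank_differential(["thrombocytopenia", "splenomegaly"])
```

With these placeholder numbers, the two findings place Gaucher disease first in the ranking. A real tool would need validated likelihoods and would have to handle correlated findings, which the independence assumption ignores.
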
  2. Improving the effectiveness of therapy for rare diseases using a second-generation system.

    The digital pill comprises original or generic drugs used to treat patients with rare diseases, the administration of which is regulated by a second-generation algorithm. It was designed to improve the response to currently available therapies [37, 51, 52]. It may offer a tool for treatment modifications early in the course of the disease. Personalized induction regimens and treatment-to-target dose intensification, which improve outcomes, are embedded in the algorithm. In contrast to first-generation platforms, the second-generation system is being developed as a method that uses a continuous dynamic feedback loop to account for changes in patient status, disease progression, response to therapy, and environmental factors. Figure 1 shows the use of the system for improved management of patients with rare diseases.

    Fig. 1: A second-generation artificial intelligence system for improving the management of patients with rare diseases.

    The clinical data of a patient suspected of having a rare disease are evaluated by the system, enabling accurate early diagnosis and treatment.

    For biomedical therapies to be designed successfully, the effects of inherent variabilities must be considered in an individualized way [51, 53,54,55]. A high degree of intra- and inter-patient variability in drug metabolism, pharmacodynamics, and drug responsiveness has been described [51, 52, 55]. The variability inherent to biological systems and to the response to medications underlies the loss of drug effectiveness. Regular administration of a constant daily dose is more likely to be associated with drug resistance [56]. The process of drug tolerance development can sometimes be reversed by implementing a “drug holiday” [57]. Treatment regimens based on an aperiodic routine, in which the medication is taken at irregular intervals and doses, have been suggested to improve the overall effect of the medication and reduce the likelihood of the development of resistance [51, 56, 58,59,60,61].

    The digital pill system is being developed at three levels [54]. In the first level, the system provides an app that reminds the patient of the dose and time of administration, as well as non-pharmacological interventions, including physiotherapy. The caregiver is asked to input the ranges of dosages and times of administration of each drug in the app. The system comprises a random number generator that introduces variability in dosing and times within the approved range [37, 54].

    The second level of the system comprises a closed-loop system that alters the variability in dosing and times of administration based on the patient’s response to therapy. The system learns from each patient and customizes the patient’s chronobiology-based regimen. It also imports data collected from all other patients into the algorithm. Endpoints are based on clinically meaningful parameters of efficacy as determined by the physician, along with personal parameters that are based on patient reports [54]. The algorithm alters the variability in dosing and times of administration of the drug based on the patient’s response. While the system continuously learns from all patients, it personalizes the therapeutic regimen for each subject [37, 54].

    In the third level of the system, signatures of variability that are relevant to the disease have been incorporated into the treatment algorithm [37, 54]. These signatures are relatively obvious for chronic diseases: heart rate variability can be used for patients with chronic heart diseases [51], electroencephalogram-derived variability data can be used for patients with epilepsy [62], and variability in cytokine profiles can be used for patients with inflammatory disorders [61]. For patients with rare diseases, the algorithm can identify the patterns that are relevant to the disease by determining variability in laboratory results (e.g., platelets, hemoglobin levels, and bone scores, for patients with GD), or clinical parameters (e.g., gait variability in patients with CA or HD). The algorithm can select the appropriate variability patterns in a patient-tailored manner by continuously comparing the inputs, in the form of various signatures, with the output, in the form of the therapeutic regimen, in a way that adapts itself to the selected outcome measures.

    The digital pill may provide a simple method to overcome drug resistance in patients with chronic diseases. These chronic diseases include cancer [58], epilepsy [62], inflammatory bowel diseases [61], arthritis [61], metabolic diseases [63], obesity [64], pancreatitis [60], infections [59, 65], microtubules-linked disorders [66,67,68], microbiome-based disorders [69], and chronic inflammation [70].

    The second-generation system may overcome several of the obstacles faced by first-generation systems. The system is based on the n = 1 concept, which is ideal for patients with rare diseases for whom large validated datasets are not always available. Moreover, the phenotypic differences among patients with rare diseases make it inappropriate to use averages that are based on limited datasets. While the system learns from PRO and electronic medical records, the final therapeutic regimen generated for each patient is based on that patient’s response to the therapeutic regimen. The system is dynamic and continuously adapts the therapeutic schedule to changes in the patient’s status, concomitant diseases, medications, environmental factors, and any additional parameters that affect the patient’s condition and/or response to therapy [54].
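
The three levels of the digital pill system described above can be made concrete with a short sketch. It is a toy model under stated assumptions, not the published algorithm: the class and function names, the `spread` parameter, the hill-climbing update rule, the use of the coefficient of variation as a variability signature, and all numerical values are hypothetical choices made for illustration.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class ApprovedRange:
    """Ranges entered by the caregiver (hypothetical field names)."""
    min_dose_mg: float
    max_dose_mg: float
    earliest_hour: float  # 7.0 means 07:00
    latest_hour: float


def next_administration(approved, spread, rng):
    """Level 1: draw a dose and a time at random inside the approved range.

    `spread` in [0, 1] scales the draw around the midpoint of each range:
    0 reproduces a fixed regimen, 1 uses the full approved range.
    """
    def draw(low, high):
        mid = (low + high) / 2
        half_width = (high - low) / 2 * spread
        return rng.uniform(mid - half_width, mid + half_width)

    return (draw(approved.min_dose_mg, approved.max_dose_mg),
            draw(approved.earliest_hour, approved.latest_hour))


def update_spread(spread, direction, previous_score, current_score, step=0.1):
    """Level 2 (assumed hill-climbing rule, not the published algorithm).

    Keep moving `spread` in the same direction while the outcome score
    (higher is better) does not worsen; reverse direction when it worsens.
    """
    if current_score < previous_score:
        direction = -direction
    spread = min(1.0, max(0.0, spread + direction * step))
    return spread, direction


def variability_signature(values):
    """Level 3 input (assumed measure): coefficient of variation of a series,
    e.g. stride times for gait or repeated platelet counts."""
    return statistics.stdev(values) / statistics.mean(values)


rng = random.Random(0)  # seeded only so that the example is reproducible
approved = ApprovedRange(min_dose_mg=40.0, max_dose_mg=60.0,
                         earliest_hour=7.0, latest_hour=10.0)
spread, direction = 0.5, 1
schedule = [next_administration(approved, spread, rng) for _ in range(5)]
# The outcome score dropped from 6.0 to 5.0, so the search direction reverses.
spread, direction = update_spread(spread, direction,
                                  previous_score=6.0, current_score=5.0)
```

Every draw stays inside the range entered by the caregiver, so the randomization never leaves the approved limits. How a variability signature should feed into the regimen is not specified in the text, so the sketch only computes it.
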

  3. A monitoring system for patients with rare diseases that uses a second-generation AI system

The digital system can also provide a monitoring tool for patients with rare diseases, leading to improved care and reduced cost. The app may provide physicians with a tool to monitor adherence to therapy and collect data on the efficacy of therapy and its side effects. As a monitoring tool, the system may enable caregivers and patients to adjust the parameters being monitored to suit their needs. Patients are followed up based on parameters that are clinically meaningful to them and to physicians. These parameters can be modified based on big data analysis of clinical and laboratory outcomes that are relevant to the disease (e.g., bone pain and platelet counts in patients with GD) and on individual patient-based parameters of QoL. The algorithm is designed to identify parameters that are relevant to disease outcomes and to each subject.
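
A minimal sketch of such monitoring is given below, assuming that doses are logged with identifiers and that each monitored parameter has adjustable limits. The function names, parameter names, and limits are hypothetical placeholders that a physician and patient would set; they are not taken from the system itself.

```python
def adherence_rate(scheduled_ids, taken_ids):
    """Fraction of scheduled doses that were logged as taken."""
    scheduled = set(scheduled_ids)
    if not scheduled:
        raise ValueError("at least one scheduled dose is required")
    return len(scheduled & set(taken_ids)) / len(scheduled)


def flag_parameters(readings, limits):
    """Return the monitored parameters whose reading lies outside its limits."""
    return {name: value for name, value in readings.items()
            if not (limits[name][0] <= value <= limits[name][1])}


# Placeholder limits that the physician and patient would choose and adjust.
limits = {"platelets_10e9_per_L": (150, 450), "bone_pain_0_to_10": (0, 3)}
readings = {"platelets_10e9_per_L": 95, "bone_pain_0_to_10": 2}

rate = adherence_rate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d4"])
flags = flag_parameters(readings, limits)
```

Here three of four scheduled doses were logged, and only the platelet reading falls outside its placeholder limits. Changing the `limits` dictionary corresponds to adjusting the monitored parameters to the needs of a given patient.
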

A second-generation system for the management of patients with rare diseases can provide an inclusive solution for better diagnosis, improved therapy, and monitoring. It can add value to physicians by offering them a tool for accurate diagnosis, better treatment, and monitoring. It also benefits healthcare systems by reducing costs due to better diagnosis and therapy [20]. The sustainability of the effects of treatment is ensured by the dynamic continuous closed-loop machine learning system, which adapts itself to changes in disease status and response to therapy in an individualized way. While first-generation digital platforms are associated with a lack of adherence by patients and caregivers, the second-generation system contributes to clinically meaningful outcomes, ensuring a continuous motivation for using the app.

The distribution of public spending on health depends on a variety of factors, ranging from disease burden and system priorities to organizational aspects and costs. Healthcare systems face serious sustainability challenges. This is particularly true for rare diseases, where priority setting involves value-laden choices [71].


Improving the diagnosis, treatment, and follow-up of patients with rare diseases remains a major challenge. Solutions provided by first-generation AI systems are insufficient. Second-generation systems, which are designed to improve diagnosis, provide methods for overcoming the LOR to therapies, and improve monitoring, may provide added value to patients and caregivers. Improving clinically meaningful outcomes through early and accurate diagnosis and better response to treatment ensures the use of these systems by all players in the health care system. Ongoing clinical trials are expected to show, in the near future, that these systems can be used for patients with rare diseases.