To the Editor:

We have read with great interest the manuscript by Nguengang Wakap et al. estimating the cumulative point prevalence of rare diseases (RDs) [1].

Undoubtedly, it must be considered a landmark analysis and the first robust estimation of the cumulative point prevalence of RDs, since previously reported figures (6–8%) did not come from any solid study, and the original citations and bibliographic references all quoted one another circularly [2]. However, we believe that its Orphanet-based results should be widely discussed by the scientific community.

As acknowledged by the authors, one of the major obstacles to knowing how many people suffer from an RD is the lack of a standardised definition. Although a low prevalence of disease is shared by all definitions, two main issues make them different: first, the cut-off value that each country or region uses as a threshold to specify that “low prevalence”; and second, the inclusion of criteria other than prevalence, such as chronicity, severity and risk of premature death [3]. It should not be forgotten that the RD concept arose from the pharmaceutical industry’s demands for help in researching and marketing “orphan drugs”, so its definition depends on the rules governing the designation of a substance as such [4]. Thus, while in 1983 the US regulation defined an RD as one that “occurs so infrequently […] that there is no reasonable expectation that the cost of developing and making available in the United States a drug for such disease or condition will be recovered from sales in the United States of such drug.” [5], in 1999 the EU considered that RDs “are life-threatening or chronically debilitating diseases which are of such low prevalence that special combined efforts are needed to address them so as to prevent significant morbidity or perinatal or early mortality or a considerable reduction in an individual’s quality of life or socio-economic potential.” [6] Consequently, the US sets a threshold of prevalent cases in its territory below which a disease is designated as rare (<200,000), while the EU considers criteria of severity and chronicity together with a maximum prevalence (5 cases per 10,000 inhabitants).

This is especially important because it raises the following questions:

  1. If prevalence is an appropriate measure only for relatively stable conditions and is unsuitable for acute disorders, and all official definitions of RDs are based on maximum point prevalence values (“burden of disease”), could diseases whose epidemiological indicator is another measure of frequency (e.g. incidence) be defined by this term? Could neglected tropical diseases (NTDs) [7] or rare cancers (incidence of less than 6 (EU) or 15 (US) per 100,000 people per year) [8] then be considered RDs per se, or do they simply share certain characteristics without fitting the prevalence-based definition? Would we not be mixing the concept of “rare” with that of “orphan” disease? [9]

  2. As already acknowledged in the Inserm work [1], there is an evidently strong geographical disparity for most RDs. Therefore, even if we were to apply a universal standardised RD definition, what would be done with those diseases that meet the definition in some places but not in others (such as NTDs)?

  3. And finally, if we consider the remaining characteristics (besides low prevalence), would it be right to “label” as RDs some diseases that are not serious or life-threatening?

The Orphanet database, whose information is based on several scientific studies, presents some limitations, as already stated by the Institute of Medicine [5], among which we would like to highlight:

  a. “the use of single numbers for conditions with widely varying estimates of prevalence in the literature and the lack of bibliographic citations and explanatory details”

  b. “It is likely that there is an overestimation for most diseases as the few published prevalence surveys are usually done in regions of higher prevalence and are usually based on hospital data. Therefore, these estimates are an indication of the assumed prevalence but may not be accurate.”

In addition to these limitations, and those acknowledged by the authors in their manuscript, another issue should be noted that leads to an overestimation of the proposed range boundaries, mainly the minimum threshold. The authors assigned RDs with point prevalence data to one of the prevalence classes listed on Orphanet (<1/1,000,000, 1–9/1,000,000, 1–9/100,000, and 1–5/10,000) and then calculated the minimum and maximum boundaries independently by summing the results within each class. In this way, diseases whose point prevalence is close to the minimum boundary of their class contribute to the maximum boundary with values that are quite far from the real one. This overestimation can be exacerbated for the most prevalent RD class (1–5/10,000), which, although it represents a small percentage of all RDs analysed (4.2%), contributes about 80% of the overall figure [1]. On the other hand, most of the RDs analysed (84.5%) belong to the least frequent class (<1/1,000,000), and for them “minimum and maximum values were assigned as 1/1,000,000” [1], which consequently raises the minimum boundary of the estimate.
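The class-based summation described above can be illustrated with a minimal sketch. The class limits are those named in the text; the number of diseases per class is hypothetical, chosen only to resemble the reported proportions (84.5% in the lowest class, 4.2% in the highest), and the function names are ours. It is not a reproduction of the calculation in [1].

```python
# Prevalence classes as (lower, upper) limits, in cases per person.
# For the "<1/1,000,000" class both limits are set to 1/1,000,000,
# following the assignment quoted from [1].
CLASS_LIMITS = {
    "<1/1,000,000": (1 / 1_000_000, 1 / 1_000_000),
    "1-9/1,000,000": (1 / 1_000_000, 9 / 1_000_000),
    "1-9/100,000": (1 / 100_000, 9 / 100_000),
    "1-5/10,000": (1 / 10_000, 5 / 10_000),
}

# HYPOTHETICAL disease counts per class (1,000 diseases in total).
# Only the first and last proportions echo figures cited in the text.
HYPOTHETICAL_COUNTS = {
    "<1/1,000,000": 845,
    "1-9/1,000,000": 60,
    "1-9/100,000": 53,
    "1-5/10,000": 42,
}


def cumulative_bounds(counts):
    """Sum class limits over all diseases to get (minimum, maximum)."""
    low = sum(n * CLASS_LIMITS[c][0] for c, n in counts.items())
    high = sum(n * CLASS_LIMITS[c][1] for c, n in counts.items())
    return low, high


def share_of_maximum(counts, class_name):
    """Fraction of the maximum boundary contributed by one class."""
    _, high = cumulative_bounds(counts)
    return counts[class_name] * CLASS_LIMITS[class_name][1] / high


def upper_overshoot(true_prevalence, class_name):
    """Ratio between the class upper limit and a disease's true prevalence."""
    return CLASS_LIMITS[class_name][1] / true_prevalence


if __name__ == "__main__":
    low, high = cumulative_bounds(HYPOTHETICAL_COUNTS)
    print(f"minimum boundary: {low:.4%}, maximum boundary: {high:.4%}")
    top = share_of_maximum(HYPOTHETICAL_COUNTS, "1-5/10,000")
    print(f"share of maximum from the 1-5/10,000 class: {top:.1%}")
    # A disease at 1.1/10,000 is still counted at 5/10,000 in the maximum.
    print(f"overshoot: {upper_overshoot(1.1 / 10_000, '1-5/10,000'):.1f}x")
```

Under these assumed counts, the few diseases in the 1–5/10,000 class account for most of the maximum boundary, and a disease near the bottom of that class enters the maximum at several times its real prevalence, which is the mechanism argued above.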

Therefore, from our point of view, the figure yielded by the authors’ comprehensive approach (~3.5–5.9%) is not underestimated (as suggested in the manuscript) but rather seems to be overestimated. It is considerably higher than, for example, the figure provided by the Italian population registry of the Veneto region (with more than 15 years of experience), which estimated that 0.61% of its population suffers from one of the RDs it was monitoring (58% of the total included in Orphanet) [10].