The molecular spectrum and distribution of haemoglobinopathies in Cyprus: a 20-year retrospective study

Haemoglobinopathies are the most common monogenic diseases, posing a major public health challenge worldwide. Cyprus has one of the highest prevalences of thalassaemia in the world and was the first country to introduce a successful population-wide prevention programme, based on premarital screening. In this study, we report the most significant and comprehensive update on the status of haemoglobinopathies in Cyprus for at least two decades. First, we identified and analysed all 592 known β-thalassaemia patients and 595 known Hb H disease patients in Cyprus. Moreover, we report the molecular spectrum of α-, β- and δ-globin gene mutations in the population and their geographic distribution, using a set of 13824 carriers genotyped from 1995 to 2015, and estimate relative allele frequencies in carriers of β- and δ-globin gene mutations. Notably, several mutations are reported for the first time in the Cypriot population, and important differences are observed in the distribution of mutations across different districts of the island.

the range of 15-18% 6 , one of the highest in the world. In addition, the α -thalassaemia carrier rate was estimated to be around 20% 7 , although earlier studies using electrophoresis estimated it to be about 10-12% of the population 8,9 . Cyprus was the first country to introduce a successful population-wide prevention programme for β -thalassaemia, based on premarital screening, and, as a result, the annual birth rate has decreased to less than five cases from an expected 30-50 6,10 .
α -thalassaemia is characterised by a decrease or complete absence of expression from one or more of the four α -globin genes and may be brought about by a deletion or a nondeletion mutation in the α -globin genes. Mutations are divided into two major classes: (a) α 0 -thalassaemia mutations, which delete both α -globin genes on the same chromosome, and (b) α + -thalassaemia mutations, which delete or deactivate only one of the α -globin genes. As a result, four clinical conditions of increased severity are recognised, based on the number of α -globin genes affected 11,12 : (a) α -thalassaemia silent carrier, in which only one α -globin gene is affected by a deletion (-α /α α ) or a nondeletion (α ND α /α α ) and which is mainly asymptomatic, (b) α -thalassaemia trait, in which only two α -globin genes are functional, either in cis (--/α α ) or in trans (-α /-α or α ND α /α ND α ), and which usually results in mild anaemia, (c) Hb H disease, in which there is only one functional α -globin gene (--/-α or --/α ND α ), and (d) Hb Bart's hydrops fetalis, in which all α -globin genes are deleted (--/--), leading to severe anaemia and, usually, death in utero or shortly after birth. The spectrum of α -thalassaemia mutations has been well-documented over the last decades 13 , with more than 230 mutations currently reported in the public IthaGenes database 1 . In Cyprus, the α -thalassaemia carrier rate and the relative allele frequencies were previously determined by screening 495 random cord blood samples 7 . The -α 3.7 deletion is the most common α -globin mutation, accounting for 72.8% of all α -globin mutations, while the most common α 0 mutations, specifically -α 20.5 and --MED I, account for a combined 7.8% of all α -globin mutations and about 0.8% of the population. 
Consequently and despite the high prevalence of α -thalassaemia carriers in Cyprus, the risk for Hb Bart's hydrops fetalis is relatively low, due to the low prevalence of α 0 mutations in the population. In contrast, the number of Hb H disease patients is higher, but no study has ever determined the Hb H disease prevalence in the population, an omission that is all the more striking given that a wide range of severe phenotypic characteristics have been observed, particularly in the nondeletional type of the disease 14,15 .
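The four clinical conditions described above follow directly from the number of functional α-globin genes in a genotype. The following minimal Python sketch illustrates that mapping; the names `FUNCTIONAL_GENES`, `CONDITION_BY_GENE_COUNT` and `classify` are hypothetical and introduced here for illustration only, and the sketch assumes the four haplotype notations used in the text (it does not handle triplicated α genes or other rearrangements).

```python
# Functional alpha-globin genes contributed by each haplotype notation
# used in the text (illustrative assumption: only these four notations).
FUNCTIONAL_GENES = {
    "αα": 2,    # normal haplotype
    "-α": 1,    # single-gene deletion (α+)
    "αNDα": 1,  # nondeletional α+ mutation
    "--": 0,    # both genes on the chromosome deleted (α0)
}

# Clinical condition by total number of functional genes (0 to 4).
CONDITION_BY_GENE_COUNT = {
    4: "normal",
    3: "silent carrier",
    2: "α-thalassaemia trait",
    1: "Hb H disease",
    0: "Hb Bart's hydrops fetalis",
}


def classify(genotype: str) -> str:
    """Map a genotype such as '--/-α' to the condition described in the text."""
    haplotypes = genotype.replace(" ", "").split("/")
    if len(haplotypes) != 2:
        raise ValueError(f"expected two haplotypes separated by '/': {genotype!r}")
    functional = sum(FUNCTIONAL_GENES[h] for h in haplotypes)
    return CONDITION_BY_GENE_COUNT[functional]


if __name__ == "__main__":
    for g in ["-α/αα", "--/αα", "-α/-α", "--/-α", "--/αNDα", "--/--"]:
        print(g, "->", classify(g))
```

The sketch classifies by gene count only; it does not capture the observation that nondeletional genotypes (for example --/α ND α) are often clinically more severe than deletional ones with the same count.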
β -thalassaemia is characterised by the reduced synthesis (β + ) or absence (β o ) of the β -globin chains in the Hb molecule, resulting in accumulation of unbound α -globin chains that precipitate in erythroid precursors in the bone marrow and in the mature erythrocytes, leading to ineffective erythropoiesis and peripheral haemolysis 16 . It is mainly caused by single nucleotide substitutions, small deletions or insertions within the β -globin gene or its immediate flanking sequence and, rarely, by large deletions 17,18 . To date, more than 350 β -thalassaemia mutations have been reported in the IthaGenes database 1 . In Cyprus, the most common β -thalassaemia mutation is IVS I-110 (G> A), with a frequency of 74-80%, followed by three other alleles, specifically IVS II-745 (C> G), IVS I-6 (T> C), IVS I-1 (G> A), with frequencies of 5-8% 19 .
In adult life, Hb A (α 2 β 2 ) is the major Hb component with Hb A 2 (α 2 δ 2 ) represented in a fraction of 2.5-3.5% and with traces (< 1%) of Hb F (α 2 γ 2 ). The percentage of Hb A 2 is usually higher in β -thalassaemia carriers, because of the reduced production of Hb A, while it is usually lower for Hb H disease patients. Therefore and although δ -thalassaemia has no clinical significance, mutations in the δ -globin gene interfere with typical thalassaemia phenotypic characteristics, affecting population screening for thalassaemia. Thus, investigation of δ -globin gene mutations in the Cypriot population is important to avoid misdiagnosis for thalassaemia carriers 20 . In a previous study 21 , the carrier frequency for mutant δ -globin chromosomes was estimated to be around 1.26% and the spectrum of observed δ -globin gene mutations was reported for the Cypriot population.
In addition to the thalassaemias, several less prevalent structural Hb variants have been identified in the Cypriot population 22 . Nine structural variants concerning the β -globin chains and three concerning the α -globin chains have been identified, with the most common being Hb S (0.2%), Hb D-Punjab (0.02%), Hb Lepore Boston-Washington (0.03%) for the β -globin chain and Hb Setif (0.1%) for the α -globin chain. Recently, a novel δ -globin chain variant (Hb A 2 -Famagusta) was discovered in four distinct families in Cyprus 23 , while other δ -globin variants have been observed in the past 21 , with Hb A 2 -Yialousa being the most prevalent.
This article reports (a) the molecular spectrum and geographic distribution of all known β -thalassaemia patients and Hb H disease patients in Cyprus, (b) the molecular spectrum of α -, β -and δ -globin gene mutations in the population and their geographic distribution, and (c) an updated, more precise estimation of the relative allele frequencies in carriers of β -and δ -globin gene mutations. In this retrospective investigation, we retrieved and analysed genotypic characteristics of samples isolated during the last 20 years from 592 β -thalassaemia patients, 595 Hb H disease patients and 13824 carriers of α -, β -and δ -globin gene mutations. Therefore, this study represents the most significant and comprehensive update on the status of haemoglobinopathies in Cyprus for at least two decades, providing comprehensive evidence for the success and critical information for the improvement of the population screening programme.

Materials and Methods
Ethics Statement. The study is in accordance with the guidelines and regulations of the Cyprus legislation and National Bioethics Committee. All genetic and personal information used throughout this study was collected as part of the routine diagnostic services at the Cyprus Institute of Neurology and Genetics (CING) from 1995 to 2015, at the request of the participants and in accordance with the CING regulations, and no additional data were collected or stored for this research investigation. All subjects were de-identified in compliance with the FDA Guidance Document "Informed Consent for In Vitro Diagnostic Device Studies Using Leftover Human Specimens that are Not Individually Identifiable" issued in April 2006, and the study is therefore exempt from IRB review, as also confirmed by the Cyprus National Bioethics Committee. To preserve the anonymity of these subjects, demographic data were limited to geographic distribution at the low-resolution district level, age and sex, and only summary data are reported, without detailed descriptions of individual cases.
Study design and subjects. Since 1978, population screening for haemoglobinopathies in Cyprus has been performed by the Thalassaemia Screening Laboratory of the Cyprus Thalassaemia Centre at Nicosia. Selected samples with abnormal haematological indices have been referred to the Molecular Genetics Thalassaemia department in the CING for molecular characterisation and identification of haemoglobinopathy carriers, whereas additional individuals have been referred for molecular analysis as part of family studies. Moreover, genetic analysis was performed as part of prenatal diagnosis for couples at risk of an affected thalassaemia birth. In addition and through its role as the reference centre for genetic testing for haemoglobinopathies, the CING performed genetic analysis for all known β -thalassaemia patients and Hb H disease patients in Cyprus.
This study includes data for Greek Cypriots, who, according to the latest population census in 2011 (http://www.mof.gov.cy/mof/cystat/statistics.nsf/census-2011_cystat_en/census-2011_cystat_en), represent about 98.8% of the inhabitants with Cypriot citizenship. Owing to the political situation, only sporadic samples of Turkish Cypriots were analysed as part of the national control programme of the Republic of Cyprus, even though they would normally account for a significant fraction of the population. To allow a clear definition of the sample population and of the scope of this study, those sporadic samples were not included in our analyses.
This study includes data for all β -thalassaemia patients and Hb H patients managed by the dedicated Thalassaemia Clinics in all four cities, specifically Nicosia, Limassol, Larnaca (merged with the smaller Famagusta district) and Paphos, as well as carrier data obtained by the CING from 1995 to 2015, as part of the routine genetic analysis for haemoglobinopathies. After removing transplanted and deceased patients, we compiled two datasets of thalassaemia patients, specifically (a) 592 β -thalassaemia patients, and (b) 595 Hb H disease patients. In addition, we compiled a dataset of all carriers genotyped at the CING from 1995 to 2015, resulting in 13824 individuals, more specifically overlapping sets of (a) 9287 carriers of α -globin gene mutations, namely individuals with silent α -thalassaemia and α -thalassaemia trait, (b) 4700 carriers of a β -globin gene mutation, and (c) 504 individuals with one or both δ -globin genes mutated. To avoid redundancies in the datasets, we selected only unrelated individuals for the analysis of α -and δ -globin genes, resulting in final datasets of 8412 carriers of α -globin gene mutations and 428 carriers of δ -globin gene mutations. In the case of β -thalassaemia carriers, we compiled a random dataset from couples at risk of having an affected birth that participated in prenatal diagnosis, giving a final dataset of 2335 unrelated β -thalassaemia carriers. Sample sizes, age and sex distribution for all datasets are summarised in Table 1.
DNA isolation and genotyping of α-, β- and δ-globin genes. Genomic DNA was extracted from peripheral blood using the Gentra Puregene Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's protocol and kept at − 80 °C for long-term storage. For genotyping of the causative pathological β -globin mutations, we used the single-tube amplification refractory mutation system-Polymerase Chain Reaction (ARMS-PCR) methodology 24 . 
Seven mutations common to the Cypriot population were tested, specifically IVS I-110 (G> A), IVS I-6 (T> C), IVS I-1 (G> A), IVS II-745 (C> G), CD 39 (C> T), -87 (C> G) and CD 5 (-CT). When no mutation was detected by the ARMS-PCR, the β -globin gene was examined by performing PCR and Sanger sequencing using an ABI 3130xl Genetic Analyzer (Applied Biosystems-Life Technologies, USA).
Deletion-type mutations were tested using the multiplex ligation-dependent probe amplification method (MLPA) by utilising the SALSA MLPA probemix P102 HBB protocol (MRC, Holland).
The samples of β -thalassaemia patients, Hb H disease patients and carriers of α -globin gene mutations were investigated for α -globin deletions and/or point mutations by gap-PCR genotyping assays. The α -thalassaemia screening panel consisted of deletions and point mutations representing the common α -thalassaemia determinants encountered previously in Cyprus, specifically -α 3.7, triplicated α (α α α or anti −α3.7 ) 25 , --MED I, -α 20.5, IVS I-1 (-5 bp), Poly(A) AATAAA > AATGAA and Hb Agrinio 15,26 . In the case of Hb H disease patients and when no mutation was detected by the gap-PCRs, the α -globin genes were studied by sequencing, as detailed  above for β -globin gene mutations. Deletion-type mutations were tested using the MLPA method by utilising the SALSA MLPA probemix P140-B3 HBA protocol (MRC, Holland). Protocols for these procedures are described in detail in the IthaPedia wiki at the ITHANET Portal (www.ithanet.eu/ithapedia) 27 . In addition, possible carriers of δ -globin gene mutations were tested by performing PCR and Sanger sequencing as described elsewhere 21 .
Statistical analysis. The data were analysed using the R programming language (version 3.2.4). The analysis included data manipulation, filtering and plotting using the R packages tidyr, dplyr and ggplot2. Descriptive statistics were utilised for the analysis, particularly to calculate mutation frequencies for haemoglobinopathies in Cyprus and in individual districts. In addition, the one-sample proportions test (Wilson score) with Yates' continuity correction was utilised to calculate 95% confidence intervals (95% CI), using the prop.test function in R.
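To make the interval calculation concrete, the following Python sketch computes a Wilson score interval with a continuity correction for a single proportion. It is an illustration under stated assumptions, not the code used in the study: the function name `wilson_ci_cc` is hypothetical, the formula is our reading of the continuity-corrected Wilson interval as commonly implemented for one-sample proportion tests (with the correction capped by the distance of the observed count from the null expectation, `null_p` = 0.5), and the example count of 1845 is back-calculated from the reported 79.01% of 2335 carriers rather than taken from the study tables.

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci_cc(x, n, conf_level=0.95, null_p=0.5):
    """Wilson score interval with continuity correction for x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    z = NormalDist().inv_cdf((1 + conf_level) / 2)  # two-sided critical value
    p_hat = x / n
    # Continuity correction of 0.5, capped as described in the lead-in (assumption).
    yates = min(0.5, abs(x - n * null_p))
    z22n = z * z / (2 * n)

    def bound(p_c, sign):
        root = sqrt(p_c * (1 - p_c) / n + z22n / (2 * n))
        return (p_c + z22n + sign * z * root) / (1 + 2 * z22n)

    p_up = p_hat + yates / n
    p_lo = p_hat - yates / n
    upper = 1.0 if p_up >= 1 else bound(p_up, +1)
    lower = 0.0 if p_lo <= 0 else bound(p_lo, -1)
    return lower, upper


if __name__ == "__main__":
    # Illustrative count: 1845 of 2335 carriers (back-calculated, see lead-in).
    lo, hi = wilson_ci_cc(1845, 2335)
    print(f"estimate = {1845 / 2335:.4f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

Unlike the simple normal-approximation interval, the Wilson interval is not symmetric around the point estimate and stays within 0 and 1, which matters for the rare alleles with frequencies well below 1%.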

Results and Discussion
Carriers of α-globin gene mutations. From 1995 to 2015, 9287 carriers of α -globin gene mutations were genotyped at the CING. After removing related individuals with identical genotype, the remaining 8412 carriers were used to analyse the molecular spectrum and distribution of α -globin gene mutations, shown in Table 2. However, we do not report the relative allele frequencies, because the dataset of α -globin gene mutation carriers referred to the CING for genotyping is not a random representation of the carrier population, particularly owing to the underrepresentation of silent α -thalassaemia carriers in the dataset, such as individuals with genotypes α α /-α 3.7 or α α /α α α , which often remain undetected through the population screening programme. Importantly, the -α 3.7 deletion is the most common α -thalassaemia allele in Cyprus with a relative frequency of 72.8%, as reported in an earlier study that utilised a random dataset 7 . Nevertheless, the present study, through the compilation of a large dataset, reports the widest spectrum of α -globin gene mutations observed in the population since 1995, with several α -globin gene mutations reported for the first time in Cyprus. More specifically, three α -thalassaemia mutations, namely --SEA, -α 4.2 and CD 108 (-C), and two α -chain structural variants, namely Hb Icaria and Hb Stanleyville-II, are reported for the first time through this study, whilst three cases with α 0 deletions in the erythroid-specific DNAse I hypersensitive site MCS-R2 (HS40) were detected and are currently under investigation to determine the precise breakpoints. Moreover, four cases with unknown duplications of the α -locus and three cases with unknown deletions involving both genes, α 2 and α 1, were detected and are currently under investigation. In addition, the geographic distribution of α -globin gene mutations reveals statistically significant differences between districts. 
Notably, the --MED I deletion is more prevalent in Larnaca/Famagusta, where the -α 3.7 allele is less prevalent. Moreover, the severe IVS I-1 (-5 bp) allele is more prevalent in Larnaca/Famagusta and Paphos, while a higher prevalence of the -α 20.5 deletion is observed in Nicosia.
Carriers of β-globin gene mutations. Population screening in Cyprus was established mainly to prevent β -thalassaemia, which is usually a more severe disorder than Hb H disease. For this reason, prenatal diagnosis has been offered to couples at risk of an affected β -thalassaemia birth. The list of all couples participating in prenatal testing is a random representation, with regard to genotype, of the β -thalassaemia carrier population in Cyprus, because it comprises unrelated individuals without any bias for mild or severe mutations. Thus, from the 4700 carriers of β -globin gene mutations genotyped at the CING during 1995-2015, we selected all couples participating in prenatal testing for the calculation of the carrier frequencies, resulting in 2335 individuals, and the results are shown in Table 3. The most common β -thalassaemia mutation in the population is IVS I-110 (G> A), with a frequency of 79.01%, followed by mutations IVS I-6 (T> C), IVS I-1 (G> A) and IVS II-745 (C> G) with frequencies of 6.34%, 6.00% and 4.11%, respectively. These frequencies are in agreement with values reported in an earlier study 19 , but the present study provides both a more precise estimate through its much larger sample size and a critical update to numbers that date back 23 years. Most geographical differences observed in β -globin gene mutation frequencies are small and not statistically significant, as demonstrated by the 95% CI shown in Table 3. The exception is the frequency of the IVS I-110 (G> A) mutation, which is significantly higher in Paphos, albeit based on a smaller sample size. 
In addition, small differences are observed in Limassol, with slightly higher frequencies for alleles IVS I-1 (G> A) and IVS II-745 (C> G), and Nicosia, with a slightly higher frequency for the IVS I-6 (T> C) allele.
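The relative allele frequencies discussed above, both for the whole island and per district, are simple proportions of allele counts. The following Python sketch shows one way such a table could be derived from per-carrier records; the function name `relative_allele_frequencies`, the record layout and the toy records are hypothetical and do not reproduce the study data.

```python
from collections import Counter, defaultdict

ALL_DISTRICTS = "All districts"  # hypothetical label for the island-wide row


def relative_allele_frequencies(records):
    """Return {district: {mutation: percentage}} from (district, mutation) pairs.

    Each record represents one mutant allele observed in one carrier.
    An island-wide row is added under the ALL_DISTRICTS key.
    """
    counts = defaultdict(Counter)
    for district, mutation in records:
        counts[district][mutation] += 1
        counts[ALL_DISTRICTS][mutation] += 1
    return {
        district: {m: 100 * c / sum(ctr.values()) for m, c in ctr.items()}
        for district, ctr in counts.items()
    }


if __name__ == "__main__":
    # Toy records for illustration only (not study data).
    toy = [
        ("Nicosia", "IVS I-110 (G>A)"),
        ("Nicosia", "IVS I-110 (G>A)"),
        ("Nicosia", "IVS I-110 (G>A)"),
        ("Nicosia", "IVS I-6 (T>C)"),
        ("Paphos", "IVS I-110 (G>A)"),
    ]
    for district, freqs in relative_allele_frequencies(toy).items():
        print(district, {m: round(f, 2) for m, f in freqs.items()})
```

Comparing districts then amounts to attaching a confidence interval to each percentage and checking whether the intervals overlap, which is why small districts with few carriers rarely yield significant differences.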
Notably, several mutations are reported here for the first time in the Cypriot population. More specifically, five β -thalassaemia mutations, namely -87 (C> G), CD 44 (-C), IVS II-1 (G> A), -101 (C> T) and (δ β ) 0 Sicilian, and the β -chain structural variant Hb City of Hope have not been observed in earlier studies 19,28 , whereas the mutation IVS II-848 (C> A) has been reported only in the Turkish Cypriot population in the past 29 . Apart from the mutations listed in Table 3, other mutations were observed in individuals genotyped at the CING, but not included in the final dataset that was based on individuals participating in prenatal diagnosis. These mutations are rare in the population, with frequencies of 0.5% or lower, and include CD 8 (-AA), Hb Beirut, Hb Serres, Hb Limassol, Hb Nicosia, Hb O-Arab, Hb G-Accra that have been observed in the population in the past 19,22 , but also mutations that are reported in the Cypriot population for the first time, namely Hb C, Hb E, -30 (T> A), CD 36/37 (-T) and -92 (C> T).
Carriers of δ-globin gene mutations. 504 individuals carrying at least one δ -globin gene mutation were genotyped at the CING from 1995 to 2015. After removing related individuals, a dataset of 428 carriers was available to calculate the carrier frequencies shown in Table 4. The most common δ -globin gene mutation in Cyprus was CD 27 (GCC> TCC), resulting in Hb A 2 -Yialousa, with a frequency of 48.12%, followed by CD 4 (ACT> ATT), Hb A 2 -Yokoshima, Hb A 2 -Pelendri and IVS II-897 (A> G) with frequencies of 19.87%, 11.26%, 6.84% and 5.96%, respectively. Notably, three δ -globin chain variants were observed in the Cypriot population for the first time, namely Hb A 2 -NYU, Hb A 2 ′ and Hb A 2 -Etolia.
Differences in the geographic distribution of δ -globin gene mutations are observed across different districts. Hb A 2 -Yialousa is particularly prevalent in the Larnaca/Famagusta district, while the CD 4 (ACT> ATT) allele is
β-thalassaemia patients. All 592 known living β -thalassaemia patients in Cyprus were genotyped at the CING and, with a population of 659115 Greek Cypriots (accounting for 98.8% of the total population), the prevalence of β -thalassaemia patients is estimated to be around 0.9 cases per 1000 people. Despite the application of a successful prevention programme, the prevalence is only slightly lower than earlier estimates of 1 in 1000 4 , which can be mainly attributed to the better survival of β -thalassaemia patients 30 . In addition, it was recently reported that the prevalence of β -thalassaemia carriers has been decreasing over the past 25 years, from an estimated 15-18% to around 12% of the population 5 , so that the present population of β -thalassaemia patients (average age of 41 years) is representative of a previously higher carrier rate. Table 5 shows the relative allele and genotype frequencies in β -thalassaemia patients, including frequencies for each individual district. Notably, five mutations account for more than 95% of all β -thalassaemia alleles in the patient population. As expected, the most common β -globin gene mutation is IVS I-110 (G> A), with a frequency of 72.72%, followed by alleles IVS I-6 (T> C) and IVS I-1 (G> A), with frequencies of 12.42% and 5.15%, respectively. Consequently, homozygosity for IVS I-110 (G> A) is the most common genotype in β -thalassaemia patients with a frequency of 52.7%, while 18.92% of the patients are compound heterozygous for IVS I-110 (G> A) and IVS I-6 (T> C), with the latter allele usually associated with a milder phenotype than IVS I-110 (G> A) 16 . 
Notably, the frequency of IVS I-110 (G> A) is lower in patients than in carriers, while the frequency of IVS I-6 (T> C) is much higher, possibly due to a better survival rate in patients carrying the milder IVS I-6 (T> C) mutation.
Hb H disease patients. For the first time, we have identified all 595 known Hb H disease patients in Cyprus and, thus, the prevalence of Hb H disease in the population is around 0.9 cases per 1000 persons. Hb H disease is caused by a combination of α 0 -and α + -thalassaemia alleles, and Table 6 shows the frequencies of α 0 and α + mutations and the observed genotype frequencies in Hb H disease patients in Cyprus and in individual districts. Two common alleles, namely -α 3.7 and IVS I-1 (-5 bp), account for more than 95% of all α + mutations and, similarly, deletions --MED I and -α 20.5 account for more than 98% of all α 0 mutations. Notably, the --MED I deletion and the IVS I-1 (-5 bp) allele are particularly prevalent in the eastern part of the island, namely in the Larnaca/Famagusta districts, while the -α 20.5 deletion is more common in Nicosia than in other districts, mirroring similar observations in the α -thalassaemia carriers dataset (Table 2). Compound heterozygosity of α 0 mutations with the -α 3.7 deletion is the most common genotype, with a frequency of 82.35%. Importantly, compound heterozygosity of --MED I or -α 20.5 with the IVS I-1 (-5 bp) α + -thalassaemia mutation, a nondeletional form of Hb H disease that is often associated with a more severe phenotype, is observed in 13.11% of the patients.
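The prevalence estimates quoted for β-thalassaemia and Hb H disease follow from the patient counts and the Greek Cypriot population figure given in the text. A short Python check of that arithmetic is shown below; the function and constant names are hypothetical, while the three numbers are those stated in the article.

```python
GREEK_CYPRIOT_POPULATION = 659_115  # population figure quoted in the text


def prevalence_per_1000(cases, population=GREEK_CYPRIOT_POPULATION):
    """Number of cases per 1000 people in the reference population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * cases / population


if __name__ == "__main__":
    for label, cases in [("β-thalassaemia", 592), ("Hb H disease", 595)]:
        print(f"{label}: {prevalence_per_1000(cases):.2f} per 1000")
```

Both values round to 0.9 cases per 1000 people, consistent with the figures reported for the two patient populations.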
Geographic distribution of haemoglobinopathies. Important differences are observed in the molecular spectrum of haemoglobinopathies across different districts, as discussed in previous sections, even though Cyprus is a small island with a total area of 9251 km 2 . In addition to the district-specific molecular spectra, differences are observed in the prevalence of haemoglobinopathies in different districts, as illustrated in Fig. 1. The figure shows the fraction of the population living in different districts, according to the latest population census, compared to the fraction of carriers of α -, β -and δ -globin gene mutations in the same districts, as reflected through the datasets used in this study, thus indicating under-and over-representation of different types of  haemoglobinopathies relative to the share of the total population. Prevalence of haemoglobinopathies is generally higher than the island average in the Larnaca/Famagusta district, particularly for carriers of α -and δ -globin gene mutations, while an overrepresentation of carriers for α -globin gene mutations, only, is observed in Nicosia.
In contrast, Limassol, the second-most populous district in Cyprus, has a lower prevalence of carriers of α -and δ -globin gene mutations and the same is observed in Paphos, the least populated district. As demonstrated in an earlier study for structural Hb variants 22 , a more detailed analysis of the geographic distribution of haemoglobinopathies would provide a valuable insight into the prevalence of thalassaemia mutations in Cyprus.
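The comparison underlying Fig. 1 can be expressed as a ratio between a district's share of carriers and its share of the total population, with values above 1 indicating over-representation. The Python sketch below illustrates this calculation; the function name `representation_ratio` and the toy numbers are hypothetical and are not taken from the census or from the study datasets.

```python
def representation_ratio(population_by_district, carriers_by_district):
    """Ratio of carrier share to population share for each district.

    A value above 1 means the district contributes more carriers than
    expected from its population share; below 1 means fewer.
    """
    total_population = sum(population_by_district.values())
    total_carriers = sum(carriers_by_district.values())
    if total_population <= 0 or total_carriers <= 0:
        raise ValueError("totals must be positive")
    ratios = {}
    for district, population in population_by_district.items():
        carrier_share = carriers_by_district.get(district, 0) / total_carriers
        population_share = population / total_population
        ratios[district] = carrier_share / population_share
    return ratios


if __name__ == "__main__":
    # Toy numbers for illustration only (not census or study data).
    population = {"District A": 600, "District B": 400}
    carriers = {"District A": 30, "District B": 70}
    for district, ratio in representation_ratio(population, carriers).items():
        print(f"{district}: {ratio:.2f}")
```

Because the carrier datasets reflect referrals for genotyping rather than a random sample, such ratios describe representation in the datasets and should be read as indicative of, rather than equal to, true differences in prevalence.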

Conclusions
This study represents the most significant update on the status of haemoglobinopathies in Cyprus for over two decades and reports the analysis of molecular data collected during 1995-2015 at the Molecular Genetics Thalassaemia department at the CING, the reference laboratory for genetic analysis in Cyprus. Notably, the total sample size of around 15000 individuals (patients and carriers) used in this study is much larger than the sample sizes used in previous studies 7,19,21,26 , making the current report the most reliable investigation on the status and distribution of haemoglobinopathies in Cyprus. The estimated β -thalassaemia carrier rate is around 12-15% 4,5 of the population, i.e. 80-100 thousand individuals, and the α -thalassaemia carrier rate was estimated at around 20% 7 , i.e. around 130 thousand individuals. Thus, the initial datasets of 9287 and 4700 carriers of α -and β -globin gene mutations, respectively, utilised in this study represent a significant fraction of the carrier population. Furthermore, this study demonstrates important differences in the prevalence and distribution of haemoglobinopathy alleles across different districts in Cyprus, with differences observed between the eastern and western parts of the island. The analysis of the molecular spectrum and distribution of haemoglobinopathies provides valuable information to the population screening programme and facilitates effective prenatal diagnosis. A limitation of this study is the lack of a random sample for the calculation of α -thalassaemia carrier frequencies, because a number of silent carriers are not detected by the population screening programme and, thus, are not referred to the CING for genetic analysis. Hence and because we analysed data collected through the routine molecular analysis at the CING since 1995, we could not compile a random dataset for the calculation of relative allele frequencies. 
The same limitation, however, is true for other studies of α -thalassaemia allele frequencies, even where this is not explicitly stated, mainly due to challenges involved in diagnosis 11,12 . A separate investigation should be performed on a random dataset to precisely determine the α -thalassaemia allele frequencies.
In addition and for the first time, we have identified and genotyped all thalassaemia patients in Cyprus, specifically β -thalassaemia and Hb H disease patients. To our knowledge, this is the first comprehensive analysis of the national β -thalassaemia population and Hb H disease population in any country. Hence, this study represents the first major step towards the challenging task to analyse and correlate genotype and phenotype for thalassaemia patients in Cyprus, which will be our future direction.