The major genetic risk factor for severe COVID-19 does not show any association among South Asian populations

Singh, Prajjval Pratap; Srivastava, Anshika; Sultana, Gazi Nurun Nahar; Khanam, Nargis; Pathak, Abhishek; Suravajhala, Prashanth; Singh, Royana; Shrivastava, Pankaj; van Driem, George; Thangaraj, Kumarasamy; Chaubey, Gyaneshwer

doi:10.1038/s41598-021-91711-4

Article
Open access
Published: 11 June 2021

Prajjval Pratap Singh¹,
Anshika Srivastava¹,
Gazi Nurun Nahar Sultana²,
Nargis Khanam¹,
Abhishek Pathak³,
Prashanth Suravajhala⁴,
Royana Singh⁵,
Pankaj Shrivastava⁶,
George van Driem⁷,
Kumarasamy Thangaraj⁸,⁹ &
Gyaneshwer Chaubey¹

Scientific Reports volume 11, Article number: 12346 (2021)

A Publisher Correction to this article was published on 05 August 2021

This article has been updated

Abstract

With growing evidence of variable human susceptibility to COVID-19, it is evident that some genetic loci modulate the severity of the infection. Recent studies have identified several loci associated with greater severity. More recently, a study identified a 50 kb genomic segment introgressed from Neanderthals that confers risk for COVID-19; this segment is present among 16% and 50% of people of European and South Asian descent, respectively. Our studies on ACE2 identified a haplotype present among 20% and 60% of European and South Asian populations, respectively, which appears to be responsible for the low case fatality rate among South Asian populations. This result was also consistent with the real-time infection rate and case fatality rate among various states of India. We readdressed this issue using both of the contrasting datasets and compared them with the real-time infection rates and case fatality rates in India. We found that the polymorphism present in the 50 kb introgressed genomic segment (rs10490770) did not show any significant correlation with the infection rate or case fatality rate in India.

Introduction

Since the beginning of the COVID-19 pandemic, it has been observed that people of different ethnic backgrounds and countries or continents of origin show variable degrees of susceptibility¹,². Though there are a few well-known factors for higher susceptibility, e.g. age and comorbidity³,⁴, the hospitalisation of younger healthy people has also been reported⁵. A recent genome-wide association study identified a gene cluster on chromosome 3 as well as the ABO gene on chromosome 9 as associated with severe COVID-19 among Europeans⁶. Subsequently, the COVID-19 Host Genetics Initiative corroborated this result⁷. The worldwide meta-analysis of the COVID-19 Host Genetics Initiative has identified 13 genetic loci associated with higher susceptibility or higher severity⁸.

Zeberg and Pääbo⁹ identified a 50 kb risk haplotype introgressed from Neanderthals, which they called the ‘Neanderthal core haplotype’. This risk haplotype was found at an allele frequency of 30% among South Asians, 8% among Europeans and 4% among admixed Americans. The peak carrier frequency was estimated in the Bangladeshi population, where 63% carried at least one copy of the haplotype. The study also cited a twofold higher risk of mortality among people of Bangladeshi origin living in the UK compared with the population of British origin¹⁰.
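The allele frequencies and carrier frequencies quoted here are different quantities, and a short calculation shows how they can be related. The sketch below assumes Hardy-Weinberg proportions, which the text does not state; under that assumption the fraction of people carrying at least one copy is 1 − (1 − p)² for an allele frequency p. The function names are illustrative only.

```python
import math


def carrier_frequency(allele_freq: float) -> float:
    """Fraction carrying at least one copy, ASSUMING Hardy-Weinberg proportions."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    # Non-carriers are homozygous for the other allele: (1 - p)^2.
    return 1.0 - (1.0 - allele_freq) ** 2


def allele_frequency_from_carriers(carrier_freq: float) -> float:
    """Inverse of carrier_frequency under the same assumption."""
    if not 0.0 <= carrier_freq <= 1.0:
        raise ValueError("carrier frequency must lie in [0, 1]")
    return 1.0 - math.sqrt(1.0 - carrier_freq)


# Allele frequencies quoted in the text for the risk haplotype.
for label, p in [("South Asians", 0.30), ("Europeans", 0.08), ("Admixed Americans", 0.04)]:
    print(f"{label}: allele {p:.0%} -> carriers {carrier_frequency(p):.1%}")

# Carrier frequency quoted in the text for the Bangladeshi population.
print(f"Bangladesh: carriers 63% -> allele {allele_frequency_from_carriers(0.63):.1%}")
```

Under this assumption, allele frequencies of 30% and 8% correspond to carrier frequencies of about 51% and 15%, close to the 50% and 16% carrier figures given in the abstract.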

Conversely, three of our studies on ACE2, the gateway of SARS-CoV-2, identified a haplotype shared among South Asians and East Eurasians that likely protects them from severe disease¹¹,¹²,¹³. Additionally, the spatial distribution of this haplotype showed a strong association with low infection rates as well as a low case fatality rate (CFR)¹³. To resolve this discrepancy between the two sets of findings and the associated claims, we extracted a SNP (rs10490770) reported to be associated with high risk for COVID-19⁹ from our published and unpublished genome-wide datasets (Supplementary Table S1), and tested for association with the state-wise COVID-19 data of India.

Materials and methods

Zeberg and Pääbo⁹ mainly discussed the SNP rs35044562. However, they reported 12 other SNPs in the ‘Neanderthal core haplotype’ that are in high linkage disequilibrium (r² > 0.98) (Supplementary Table S2). SNP rs10490770 showed high LD (r² = 0.99) with rs35044562. The Illumina genome-wide genotyping panels include both rs2285666 and rs10490770. Therefore, we searched the genotype datasets generated on this platform. The frequency data for both SNPs from various Indian populations were extracted using PLINK 1.9¹⁴ from the 1000 Genomes Project phase 3 data¹⁵, data published by the Estonian Biocentre¹⁶,¹⁷,¹⁸,¹⁹ and our newly genotyped samples from various Indian states and Bangladesh (Supplementary Table S1). Compared with our previous study¹³, more samples were added for the SNP rs2285666. The state-wise COVID-19 infection and CFR datasets were extracted from https://www.covid19india.org/. The regression estimates and plots were generated with https://www.graphpad.com/quickcalcs/linear1/ and verified with Microsoft Excel regression calculations. We also used Pearson’s correlation coefficient test²⁰ to evaluate the effect of both SNPs. The spatial distribution maps of both SNPs were drawn using the web tool available at https://www.datawrapper.de/.

Results and discussion

In contrast to the conclusions drawn by Zeberg and Pääbo⁹, our studies on ACE2 identified a haplotype that is frequent among South Asians and East Eurasians¹¹,¹²,¹³. This haplotype is defined by the polymorphism rs2285666, which is responsible for elevated expression of ACE2. We found a strong inverse correlation of this haplotype with the state-wise number of cases as well as the case fatality rate (CFR) among Indian populations¹³. This correlation was significant at various time points of the pandemic in India (Table 1). We verified the statistical tests with updated data up to December 2020 and found these data to be consistent with previous observations (Fig. 1 and Supplementary Fig. S1). Thus, it is likely that the ACE2 SNP rs2285666 has played a significant role in modulating susceptibility to the disease among Indian populations.

Table 1 Estimates of Pearson correlation coefficient for the rs2285666 and rs10490770 with the real-time COVID-19 cases as well as case fatality rate among Indian populations. The significant values are shown in bold letters.

In our search for the SNPs reported by Zeberg and Pääbo⁹ to be associated with high risk, we found rs10490770 in genome-wide datasets¹⁷,¹⁸,²¹,²²,²³. We applied the same tests as for the ACE2 SNP (Fig. 1). The state-wise frequency variation of this SNP did not show any association with either the number of cases or the CFR (Table 1 and Supplementary Fig. S1). We repeated these regression tests for the number of cases as well as the CFR data obtained during all three months. However, none of them showed any association with rs10490770 (p > 0.3) (Table 1). It is interesting to note that this SNP (rs10490770) has been found to be associated with disease severity in the global data⁸. The lack of association of rs10490770 with COVID-19 cases or CFR in India is therefore striking and suggests a complex susceptibility response among Indian populations. Along with the complex genetic structure¹¹,²⁴, socio-economic status²⁵ and hygiene²⁶ may have contributed to such a complex scenario. A detailed clinical and genome-wide association study of Indian COVID-19 patients would be useful to resolve this complexity.

Zeberg and Pääbo⁹ used the data on higher susceptibility to the disease among the Bangladeshi population living in the UK¹⁰ to support their findings. After accounting for the effects of sex, age, socio-economic deprivation and region, this report found that people of Bangladeshi origin had double the risk of mortality compared with people of British origin. However, the higher mortality rate for the Bangladeshi population in the UK needs more detailed investigation of comorbidity, genetic admixture, local environment and socio-economic circumstances in their particular British context. More importantly, a similar trend has also been observed among admixed Americans, where some of the same qualifications may apply mutatis mutandis²⁷,²⁸,²⁹. Furthermore, it is notable that among the Bangladeshi samples analysed by us, the tribal populations of Bangladesh showed an almost threefold lower frequency of rs10490770 (Supplementary Table S1). This is likely due to the different population histories of the caste and tribal populations of Bangladesh³⁰,³¹. The Tibeto-Burman speakers of Bangladesh show a closer genetic affinity with East and Southeast Asian populations, whereas the Indo-European speaking caste populations are closer to the Indian populations. Therefore, we advise explicitly differentiating between the caste and tribal populations when making any statement about Bangladeshi populations. Significantly, our data also show that the rs2285666 allele reaches its highest frequency, 100%, in Indian populations such as the Nishi and Kokborok (Tripuri), who represent Trans-Himalayan language communities (Supplementary Fig. S1 and Supplementary Table S1). As a linguistic phylum, the Trans-Himalayan language family is widespread in parts of eastern Eurasia and includes languages such as Tibetan, Burmese, Mandarin, Cantonese and Hokkien.

Thus, our extensive analyses of real-time data did not show any association of rs10490770 with state-wise infection rates or CFRs, suggesting that the risk allele for COVID-19 in Europe does not play a significant role in COVID-19 severity in South Asia.

Data availability

All datasets generated for this study are included in the article/Supplementary Material.

Change history

05 August 2021
A Correction to this paper has been published: https://doi.org/10.1038/s41598-021-94864-4

References

1. Mackey, K. et al. Racial and ethnic disparities in COVID-19–related infections, hospitalizations, and deaths: A systematic review. Ann. Intern. Med. 174, 362–373 (2021).
2. Shelton, J. F. et al. Trans-ethnic analysis reveals genetic and non-genetic associations with COVID-19 susceptibility and severity. medRxiv https://doi.org/10.1101/2020.09.04.20188318 (2020).
3. Alberca, R. W., de Oliveira, L. M., Branco, A. C. C. C., Pereira, N. Z. & Sato, M. N. Obesity as a risk factor for COVID-19: An overview. Crit. Rev. Food Sci. Nutr. 1, 15. https://doi.org/10.1080/10408398.2020.1775546 (2020).
4. Fang, L., Karakiulakis, G. & Roth, M. Are patients with hypertension and diabetes mellitus at increased risk for COVID-19 infection? Lancet Respir. Med. 8, 21 (2020).
5. Godri Pollitt, K. J. et al. COVID-19 vulnerability: The potential impact of genetic susceptibility and airborne transmission. Hum. Genom. 14, 1–7 (2020).
6. Ellinghaus, D. et al. Genomewide association study of severe Covid-19 with respiratory failure. N. Engl. J. Med. https://doi.org/10.20944/preprints202007.0178.v2 (2020).
7. COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715 (2020).
8. Ganna, A. Mapping the human genetic architecture of COVID-19 by worldwide meta-analysis. medRxiv https://doi.org/10.1101/2021.03.10.21252820 (2021).
9. Zeberg, H. & Pääbo, S. The major genetic risk factor for severe COVID-19 is inherited from Neanderthals. Nature 587, 610–612 (2020).
10. Public Health England. COVID-19: Review of disparities in risks and outcomes (2020). https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/908434/Disparities_in_the_risk_and_outcomes_of_COVID_August_2020_update.pdf
11. Srivastava, A. et al. Most frequent South Asian haplotypes of ACE2 share identity by descent with East Eurasian populations. PLoS ONE 15, e0238255 (2020).
12. Singh, K. K., Chaubey, G., Chen, J. Y. & Suravajhala, P. Decoding SARS-CoV-2 hijacking of host mitochondria in pathogenesis of COVID-19. Am. J. Physiol. https://doi.org/10.1152/ajpcell.00224.2020 (2020).
13. Srivastava, A. et al. Genetic association of ACE2 rs2285666 polymorphism with COVID-19 spatial distribution in India. Front. Genet. 11, 1163 (2020).
14. Chang, C. C. et al. Second-generation PLINK: Rising to the challenge of larger and richer data sets. Gigascience 4, S13742-015 (2015).
15. 1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
16. Estonian Biocentre Public_Data. https://evolbio.ut.ee/.
17. Chaubey, G. et al. “Like sugar in milk”: Reconstructing the genetic history of the Parsi population. Genome Biol. 18, 110 (2017).
18. Pathak, A. K. et al. The genetic ancestry of modern Indus valley populations from Northwest India. Am. J. Hum. Genet. 103, 918–929 (2018).
19. Tätte, K. et al. The genetic legacy of continental scale admixture in Indian Austroasiatic speakers. Sci. Rep. 9, 3818 (2019).
20. Benesty, J., Chen, J., Huang, Y. & Cohen, I. Pearson correlation coefficient. In Noise Reduction in Speech Processing, Springer Topics in Signal Processing, vol. 2 (Springer, Berlin, Heidelberg, 2009). https://doi.org/10.1007/978-3-642-00296-0_5
21. Chaubey, G. et al. Population genetic structure in Indian austroasiatic speakers: The role of landscape barriers and sex-specific admixture. Mol. Biol. Evol. 28, 1013–1024 (2011).
22. Metspalu, M. et al. Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia. Am. J. Hum. Genet. 89, 731–744 (2011).
23. Tamang, R. et al. Reconstructing the demographic history of the Himalayan and adjoining populations. Hum. Genet. 137, 129–139 (2018).
24. Metspalu, M., Mondal, M. & Chaubey, G. The genetic makings of South Asia. Genet. Hum. Orig. 53, 128–133 (2018).
25. Rajkumar, R. P. The relationship between demographic, socioeconomic, and health-related parameters and the impact of COVID-19 on 24 regions in India: Exploratory cross-sectional study. JMIR Public Health Surveill. 6, e23083 (2020).
26. Finlay, B. B. et al. The hygiene hypothesis, the COVID pandemic, and consequences for the human microbiome. Proc. Natl. Acad. Sci. USA 118, e2010217118 (2021).
27. Kullar, R. et al. Racial disparity of coronavirus disease 2019 in African American communities. J. Infect. Dis. 222, 890–893 (2020).
28. Yancy, C. W. COVID-19 and African Americans. JAMA 323, 1891–1892 (2020).
29. Hooper, M. W., Nápoles, A. M. & Pérez-Stable, E. J. COVID-19 and racial/ethnic disparities. JAMA 323, 2466–2467 (2020).
30. Sultana, G. N. N., Sharif, M. I., Asaduzzaman, M. & Chaubey, G. Evaluating the genetic impact of South and Southeast Asia on the peopling of Bangladesh. Leg. Med. Tokyo Jpn. 17, 446–450 (2015).
31. Gazi, N. N. et al. Genetic structure of Tibeto-Burman populations of Bangladesh: Evaluating the gene flow along the Sides of Bay-of-Bengal. PLoS ONE 8, e75064 (2013).

Acknowledgements

KT was supported by Council of Scientific and Industrial Research (CSIR) and J C Bose Fellowship from Science and Engineering Research Board (SERB), Department of Science and Technology, Government of India. PPS was supported by the CSIR-JRF fellowship from CSIR, India.

Author information

Authors and Affiliations

1. Cytogenetics Laboratory, Department of Zoology, Banaras Hindu University, Varanasi, Uttar Pradesh, 221005, India
Prajjval Pratap Singh, Anshika Srivastava, Nargis Khanam & Gyaneshwer Chaubey
2. Centre for Advanced Research in Sciences (CARS), Genetic Engineering and Biotechnology Research Laboratory, University of Dhaka, Dhaka, 1000, Bangladesh
Gazi Nurun Nahar Sultana
3. Department of Neurology, Institute of Medical Sciences, Banaras Hindu University, Varanasi, India
Abhishek Pathak
4. Department of Biotechnology and Bioinformatics, Birla Institute of Scientific Research, Statue Circle, Jaipur, Rajasthan, India
Prashanth Suravajhala
5. Department of Anatomy, Institute of Science, Banaras Hindu University, Varanasi, Uttar Pradesh, 221005, India
Royana Singh
6. Department of Home (Police), DNA Fingerprinting Unit, State Forensic Science Laboratory, Government of MP, Sagar, India
Pankaj Shrivastava
7. Institut für Sprachwissenschaft, Universität Bern, Länggassstrasse 49, 3012, Bern, Switzerland
George van Driem
8. CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
Kumarasamy Thangaraj
9. Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India
Kumarasamy Thangaraj

Contributions

G.C., K.T., R.S. and P.S. conceived and designed this study. P.P.S., A.S., G.S., N.K., A.P., Pr.S., R.S., and P.S. collected the data for alleles and COVID-19. P.P.S., A.S., Pr.S., and G.C. analyzed the data. P.P.S., Pr.S., K.T., Gv.D. and G.C. wrote the manuscript from the inputs of other co-authors. All authors contributed to the article and approved the submitted version.

Corresponding authors

Correspondence to Kumarasamy Thangaraj or Gyaneshwer Chaubey.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this Article was revised: In the original version of this Article, Kumarasamy Thangaraj was omitted as a corresponding author. Correspondence and request for materials should also be addressed to thangs@ccmb.res.in . In addition, the email address of the co-corresponding author Gyaneshwer Chaubey was incorrectly given as thangs@ccmb.res.in. Correspondence and request for materials should also be addressed to gyaneshwer.chaubey@bhu.ac.in .

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Singh, P.P., Srivastava, A., Sultana, G.N.N. et al. The major genetic risk factor for severe COVID-19 does not show any association among South Asian populations. Sci Rep 11, 12346 (2021). https://doi.org/10.1038/s41598-021-91711-4

Received: 23 February 2021
Accepted: 24 May 2021
Published: 11 June 2021
DOI: https://doi.org/10.1038/s41598-021-91711-4
