Evaluation of coverage variation of SNP chips for genome-wide association studies

Li, Mingyao; Li, Chun; Guan, Weihua

doi:10.1038/sj.ejhg.5202007

Published: 06 February 2008

European Journal of Human Genetics, volume 16, pages 635–643 (2008)

Abstract

Genome-wide association (GWA) studies for complex human diseases are now feasible. Many GWA studies rely on commercial SNP chips, for which a common evaluation criterion is global coverage of the genome. Although providing an overall evaluation of an SNP chip, the global coverage does not tell us how the coverage varies across the genome, an important feature that should be taken into consideration, as coverage variation often results in power variation and potentially biased search in subsequent association analysis. To achieve a fuller understanding of SNP chip coverage, we conducted detailed evaluation of coverage, including (1) a map of local coverage – calculated over small consecutive genomic regions and (2) gene coverage – calculated for each known gene in the genome. These evaluations can reveal the degree of variation of each SNP chip in covering the genome and can facilitate SNP chip comparisons at a finer scale.

Introduction

Genome-wide association (GWA) studies for complex human diseases have become increasingly popular owing to the rapid decrease in genotyping costs and the recent completion of the International HapMap Project.¹⁻⁴ By interrogating hundreds of thousands of SNPs in a large collection of human subjects, GWA studies allow a comprehensive scan of the genome and have the potential to identify novel disease-related genes. GWA studies have already led to the discovery of susceptibility genes for age-related macular degeneration,⁵ cardiac repolarization,⁶ obesity,⁷ inflammatory bowel disease,⁸ and type II diabetes.⁹

However, many issues in designing and analyzing GWA studies remain unclear. For example, when designing a GWA study, an investigator has to choose among several SNP chips. Ideally, one would choose the SNP chip that provides the best genomic coverage for the studied population. However, given the higher cost of a denser chip, one would also want to know how much power a denser chip gains over a less dense one. The decision depends largely on a comparison of different SNP chips, which makes systematic and thorough evaluation essential.

The most commonly used criterion for SNP chip evaluation is global coverage, defined as the fraction of common SNPs that are tagged by the SNPs on the chip.¹⁰,¹¹ The global coverage is clearly the most relevant criterion, as it represents the average level of coverage of all common SNPs. However, the HapMap data showed in great detail the extent of local variation in linkage disequilibrium (LD) across the genome. Since coverage is calculated based on LD, one would expect variation in coverage as well. Although the global coverage provides an overall evaluation of an SNP chip, it does not tell us how the coverage varies across the genome, an important feature that should be taken into consideration because coverage variation often results in power variation in subsequent association analysis.

To achieve a fuller understanding of the coverage of SNP chips, we propose carrying out more detailed coverage evaluations, including a map of local coverage over small consecutive genomic regions, and gene coverage that is calculated for each known gene in the genome. These evaluations reveal the degree of variation of each SNP chip in covering the genome and can facilitate SNP chip comparisons at a finer scale. We evaluate both the local coverage and gene coverage for six currently available SNP chips, including Affymetrix SNP Array 5.0 and SNP Array 6.0, and Illumina HumanHap300, HumanHap550, HumanHap650Y, and Human1M. Since the power for regions or genes of low coverage is likely to be lower than that for regions or genes of high coverage, information on local coverage and gene coverage can help determine if supplementary genotyping is necessary for the success of a GWA study.

Methods

Data sets

We considered six of the most commonly used SNP chips in GWA studies: Affymetrix SNP Array 5.0 (500 568 SNPs) and SNP Array 6.0 (934 968 SNPs), and Illumina HumanHap300 (317 511 SNPs), HumanHap550 (555 352 SNPs), HumanHap650Y (660 917 SNPs), and Human1M (1 072 820 SNPs). The Illumina SNP chips include tag SNPs derived from over two million common SNPs (minor allele frequency, MAF ≥0.05) in the HapMap data. The SNPs on the Affymetrix SNP Array 5.0 were selected on the basis of sequence constraints when choosing the probes, and thus represent a set of quasi-random SNPs that ignores LD patterns.¹⁰ The additional SNPs in the SNP Array 6.0 are mostly tag SNPs. Allele frequency and LD data for the four HapMap populations (CEU, CHB, JPT, and YRI) were obtained from HapMap release no. 21.

Local coverage

We estimated the coverage of the six SNP chips for chromosomal regions of size 1 Mb throughout the genome. We adapted the formula of Barrett and Cardon¹⁰ to estimate the local coverage rate for each of the four HapMap populations. Briefly, for each 1 Mb region, we obtained R, the number of common SNPs in the HapMap; T, the number of common SNPs on the SNP chip; and L, the number of common SNPs that are not on the SNP chip but are tagged at r²≥0.8 by at least one SNP on the chip within 250 kb. Let G denote the total number of common SNPs in the region under consideration, including those that have already been discovered and those that have yet to be discovered. Following Barrett and Cardon,¹⁰ the local coverage rate is estimated by

$$\text{coverage} = \frac{\dfrac{L}{R-T}\,(G-T) + T}{G} \tag{1}$$

Here L/(R−T) is the fraction of HapMap common SNPs not on the chip that are tagged by SNPs on the chip. Multiplying this fraction by G−T yields the estimated number of common SNPs in the region that are not on the chip but can be tagged by SNPs on the chip. Adding T to this number gives an estimate of the total number of SNPs that are captured either by LD tagging or by inclusion on the chip. Compared to a naïve estimate of coverage, (L+T)/R, this formula corrects for overestimation of coverage.¹⁰
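As an illustration only, the estimator described above can be sketched in a few lines of Python. This is not the code used for the analysis: the function names and the example counts are hypothetical, and raising an error when R ≤ T is an assumption, since the text does not say how such regions were treated.

```python
def local_coverage(R: int, T: int, L: int, G: float) -> float:
    """Estimated coverage rate of one region.

    R: common HapMap SNPs in the region
    T: common SNPs in the region that are on the chip
    L: common HapMap SNPs not on the chip but tagged at r^2 >= 0.8
    G: assumed total number of common SNPs in the region
    """
    if R <= T:
        # Assumption: with no off-chip HapMap SNPs the tagged fraction is undefined.
        raise ValueError("need R > T to estimate the tagged fraction")
    tagged_fraction = L / (R - T)             # share of off-chip SNPs that are tagged
    captured = tagged_fraction * (G - T) + T  # tagged off-chip SNPs plus on-chip SNPs
    return captured / G


def naive_coverage(R: int, T: int, L: int) -> float:
    """Naive estimate that treats the HapMap SNPs as the full set of common SNPs."""
    return (L + T) / R


# Hypothetical counts for a single 1 Mb region.
R, T, L, G = 1000, 100, 450, 2631
corrected = local_coverage(R, T, L, G)
naive = naive_coverage(R, T, L)
print(f"corrected: {corrected:.3f}, naive: {naive:.3f}")
```

With these made-up counts the corrected estimate (about 0.519) falls below the naïve one (0.550), which is the direction of correction described in the text.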

The value of G is unknown and needs to be estimated. For a 1 Mb region, the average number of common SNPs is estimated to be about 2631 based on the estimated numbers of common SNPs (7.5 × 10⁶) and euchromatic base pairs (2.85 × 10⁹) in the human genome.¹⁰,¹¹ We recognize that different estimates of G may lead to different values of local coverage rate. However, the above formula can be rewritten as L/(R−T) + [1 − L/(R−T)] × T/G, which indicates that the value of G has little effect on the final estimate as long as the fraction of common SNPs included in the SNP chip, T/G, is small, which is true for the six SNP chips we evaluated.
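The arithmetic behind G, and the weak dependence of the estimate on it, can be checked with a short Python sketch. The genome-wide numbers are those quoted above; the region counts (R, T, L) and the halving and doubling of G are hypothetical choices made purely for illustration.

```python
COMMON_SNPS = 7.5e6        # estimated common SNPs in the genome
EUCHROMATIC_BP = 2.85e9    # estimated euchromatic base pairs
WINDOW_BP = 1_000_000      # 1 Mb region

# Average number of common SNPs expected in a 1 Mb region.
G_PER_MB = COMMON_SNPS / EUCHROMATIC_BP * WINDOW_BP


def coverage_direct(R, T, L, G):
    """Tagged fraction scaled to the G - T off-chip SNPs, plus the T on-chip SNPs."""
    return (L / (R - T) * (G - T) + T) / G


def coverage_rewritten(R, T, L, G):
    """Algebraically equivalent form: f + (1 - f) * T / G with f = L / (R - T)."""
    f = L / (R - T)
    return f + (1 - f) * T / G


R, T, L = 1000, 100, 450   # hypothetical counts for one region
for scale in (0.5, 1.0, 2.0):
    G = G_PER_MB * scale
    print(f"G = {G:7.1f}: coverage = {coverage_rewritten(R, T, L, G):.3f}")
```

In this toy case, changing G by a factor of four moves the estimate by less than 0.03, because G enters only through the small term T/G.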

To calculate the local coverage rate across the genome, we moved the 1 Mb window in steps of 200 kb and repeated the calculation until the end of the chromosome. We did not calculate the values for a window if (1) the number of common SNPs in the HapMap was <20, (2) all common SNPs were located in the left or right half of the window, or (3) the common SNPs were clustered at the ends of the window with a big gap (≥500 kb) in between. As a result, coverage was not calculated for about 7% of the genome, most of which lies in heterochromatic regions and has effectively no coverage from the current SNP chips.
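The window scan and its three exclusion rules might be sketched as follows. The function names are hypothetical, and two details are assumptions rather than statements from the text: rule (3) is read as "any gap of at least 500 kb between neighbouring SNPs in the window", and only full-length windows are generated at the chromosome end.

```python
from bisect import bisect_left

WINDOW = 1_000_000   # window size in bp
STEP = 200_000       # step between window starts in bp
MIN_SNPS = 20        # rule (1): minimum number of common SNPs
MAX_GAP = 500_000    # rule (3): largest tolerated gap between SNPs


def window_is_usable(positions, start, window=WINDOW):
    """positions: sorted common-SNP positions inside [start, start + window)."""
    if len(positions) < MIN_SNPS:
        return False                      # rule (1): too few SNPs
    mid = start + window // 2
    if all(p < mid for p in positions) or all(p >= mid for p in positions):
        return False                      # rule (2): SNPs in one half only
    largest_gap = max(b - a for a, b in zip(positions, positions[1:]))
    if largest_gap >= MAX_GAP:
        return False                      # rule (3): big gap between clusters
    return True


def usable_windows(snp_positions, chrom_length):
    """Return the start coordinates of windows that pass all three rules."""
    snps = sorted(snp_positions)
    starts = []
    for start in range(0, chrom_length - WINDOW + 1, STEP):
        lo = bisect_left(snps, start)
        hi = bisect_left(snps, start + WINDOW)
        if window_is_usable(snps[lo:hi], start):
            starts.append(start)
    return starts


# Toy chromosome of 3 Mb with one SNP every 10 kb in the first 1.5 Mb only.
toy_snps = range(0, 1_500_000, 10_000)
print(usable_windows(toy_snps, 3_000_000))
```

On the toy data, windows stop being usable once all their SNPs fall in the left half, so only the first five window starts are returned.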

Gene coverage

The local coverage calculation procedure can also be applied to calculate the coverage for each gene in the genome. To obtain the starting and ending positions of genes, we downloaded the knownGene table (which contains positions of transcripts for known protein-coding genes) and the kgXref table (which contains cross-references between transcript IDs and gene symbols) from the UCSC human genome release hg17. A gene region is defined as the region from the transcriptional start position to the transcriptional end position, including both exons and introns. For a gene that has more than one transcript, the gene region is defined as the union of the regions for all its transcripts. By merging the knownGene and kgXref tables and eliminating genes that map onto different chromosomes, we obtained 29 815 autosomal and X-linked gene regions. Gene regions vary greatly in size, and those containing very few HapMap common SNPs may have unreliable or inflated coverage results because the design of most current SNP chips relied on the HapMap data. We therefore considered only gene regions containing five or more HapMap common SNPs, resulting in 19 913 gene regions for the CEU sample in the final analysis (19 299 for CHB, 19 211 for JPT, and 20 694 for YRI).
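A minimal sketch of how gene regions could be assembled from transcript records is given below. It assumes the two tables have already been joined into (symbol, chromosome, start, end) rows; the row layout, function names, gene symbols and coordinates are all hypothetical, and coordinates are treated as closed intervals for simplicity.

```python
from collections import defaultdict

MIN_COMMON_SNPS = 5   # genes with fewer HapMap common SNPs are not analysed


def gene_regions(transcripts):
    """Map gene symbol -> (chromosome, merged intervals).

    transcripts: iterable of (symbol, chrom, tx_start, tx_end) rows.
    """
    by_gene = defaultdict(list)
    for symbol, chrom, start, end in transcripts:
        by_gene[symbol].append((chrom, start, end))

    regions = {}
    for symbol, rows in by_gene.items():
        chroms = {chrom for chrom, _, _ in rows}
        if len(chroms) > 1:
            continue  # symbol maps onto different chromosomes: excluded
        merged = []
        for start, end in sorted((s, e) for _, s, e in rows):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)  # overlap: extend interval
            else:
                merged.append([start, end])              # disjoint: new interval
        regions[symbol] = (chroms.pop(), [tuple(iv) for iv in merged])
    return regions


def has_enough_snps(intervals, snp_positions, minimum=MIN_COMMON_SNPS):
    """True if at least `minimum` SNP positions fall inside the intervals."""
    inside = sum(any(s <= p <= e for s, e in intervals) for p in snp_positions)
    return inside >= minimum


rows = [
    ("GENE_A", "chr1", 100, 500),
    ("GENE_A", "chr1", 400, 900),    # overlaps the first GENE_A transcript
    ("GENE_B", "chr1", 1000, 2000),
    ("GENE_B", "chr2", 50, 80),      # second chromosome: GENE_B is dropped
]
regions = gene_regions(rows)
print(regions)
```

Keeping the union as a list of intervals, rather than a single span, avoids counting SNPs that lie between non-overlapping transcripts of the same gene; whether the original analysis made that distinction is not stated.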

Coverage calculation for SNP Array 6.0 and Human1M

The local coverage and gene coverage were calculated based on the HapMap data. However, about 10% of the SNPs on each of the two latest chips, SNP Array 6.0 and Human1M, are not in the HapMap. According to Affymetrix, the SNP Array 6.0 has 934 968 SNPs, of which 99 854 (10.7%) are not in the HapMap, including 72 379 common SNPs for CEU, 76 016 for CHB, 70 356 for JPT, and 83 412 for YRI. According to Illumina, the Human1M has 1 072 820 SNPs, of which 125 688 (11.7%) are not in the HapMap, including 70 995 common SNPs for CEU, 67 453 for CHB/JPT, and 77 729 for YRI. Consequently, their local coverage and gene coverage may be underestimated if only the HapMap SNPs are considered in the coverage calculation. To address this problem, we calculated an alternative coverage estimate as follows, using the SNP Array 6.0 as an example. Suppose there is an ‘updated HapMap data set’ that consists of the current HapMap SNPs and the SNPs on the SNP Array 6.0. Based on this ‘updated data’, for each region, we could estimate the number of common SNPs, denoted R₁, and the number of common SNPs on the chip, denoted T₁. For example, if the region contains m non-HapMap common SNPs on the SNP Array 6.0, then R₁=R+m and T₁=T+m. Because LD information between the ‘new’ SNPs and the other HapMap SNPs is lacking, we do not know how many additional HapMap SNPs are tagged by these ‘new’ SNPs, so L₁ cannot be estimated directly. However, if we assume that the number of tagged common SNPs that are not on the chip increases proportionally with the number of common SNPs on the chip, that is, T₁/T=L₁/L, then L₁ can be estimated as (T₁/T) × L. Therefore, based on the ‘updated HapMap data’, we could calculate the local/gene coverage of the SNP Array 6.0 as

$$\text{coverage} = \frac{\dfrac{L_1}{R_1-T_1}\,(G-T_1) + T_1}{G}, \qquad L_1 = \frac{T_1}{T}\,L \tag{2}$$

The original estimate of genomic coverage in (1) ignores the SNPs that are on the SNP Array 6.0 but not in the HapMap, and thus can be viewed as a ‘lower bound’ of the coverage. On the other hand, the estimate in (2) might overestimate the coverage when T₁>T and T is small. In our analysis, we took the average of the coverage calculated using (1) and (2), which we believe provides a more appropriate estimate of the coverage of the SNP Array 6.0. The coverage estimate for the Human1M was calculated in the same way.
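The averaging of estimates (1) and (2) can be illustrated with the Python sketch below. It is an interpretation of the description above rather than the authors' code: the function names and example counts are hypothetical, and it assumes that G is left unchanged in the second estimate (since G is defined to include SNPs yet to be discovered) and that T > 0.

```python
def coverage(R, T, L, G):
    """Tagged fraction scaled to the G - T off-chip SNPs, plus the T on-chip SNPs."""
    return (L / (R - T) * (G - T) + T) / G


def averaged_coverage(R, T, L, m, G):
    """Return (estimate 1, estimate 2, their average) for one region.

    m: number of common chip SNPs in the region that are not in the HapMap.
    """
    estimate1 = coverage(R, T, L, G)      # HapMap SNPs only
    R1, T1 = R + m, T + m                 # counts in the 'updated' data set
    L1 = (T1 / T) * L                     # proportionality assumption T1/T = L1/L
    estimate2 = coverage(R1, T1, L1, G)   # assumption: same G as in estimate 1
    return estimate1, estimate2, (estimate1 + estimate2) / 2


# Hypothetical region: 10 common chip SNPs are absent from the HapMap.
e1, e2, avg = averaged_coverage(R=1000, T=100, L=450, m=10, G=2631)
print(f"(1): {e1:.3f}  (2): {e2:.3f}  average: {avg:.3f}")
```

Because R₁−T₁ equals R−T, the second estimate differs from the first only through the inflated L₁ and T₁, which is why it grows quickly when T is small relative to m.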

Results

A map of local coverage

We estimated the local coverage rate for Affymetrix SNP Array 5.0 and SNP Array 6.0, and for Illumina HumanHap300, HumanHap550, HumanHap650Y, and Human1M. As an example, Figure 1 displays the local coverage rate for chromosome 17 for the four HapMap populations. Detailed, high-resolution results for all chromosomes can be downloaded from http://www.biostat.mc.vanderbilt.edu/SNPChipCoverage. Not surprisingly, the Human1M has universally better coverage than the other five chips for all four populations. For the CEU sample, the coverage of the HumanHap550 is almost always better than that of the SNP Array 6.0, even though the latter chip has substantially more SNPs; moreover, the HumanHap300 is almost always better than the SNP Array 5.0. As expected, the coverage of the HumanHap650Y is significantly improved over that of the HumanHap550 for the YRI sample. For comparison, the global coverage of the six SNP chips is summarized in Table 1.

Table 1 Global coverage (%) by SNP chips

Figure 2 shows a wide range of local coverage across the genome, with some regions receiving low to moderate coverage. For Human1M, the percentage of the euchromatic genome that has ≥80% local coverage rate is 98% for the CEU sample and 97% for the CHB+JPT samples. For HumanHap650Y, the corresponding percentages are 90 and 77%, respectively; for HumanHap550, the percentages are 88 and 73%; for HumanHap300, the percentages are only 41 and 11%. For Affymetrix chips, the percentages are 69 and 74% for SNP Array 6.0, and only 9 and 12% for SNP Array 5.0. All six SNP chips have low coverage rate for the YRI sample. Figure 2 indicates that evaluation of local coverage provides complementary information of an SNP chip in addition to global coverage.

We next evaluated the variation of coverage across chromosomes by calculating the average local coverage rates for all 1 Mb intervals on each chromosome. The coverage of different chromosomes is largely similar, except for chromosome 19, which appears to have lower coverage by all six SNP chips across all HapMap populations (Figure 3). For example, for the CEU sample and SNP Array 6.0, the coverage for chromosome 19 is 67%, whereas the coverage for the other chromosomes ranges from 75 to 86%. The lower coverage for chromosome 19 is presumably due to SNP ascertainment bias in the HapMap¹² or the unusually high density of repeat sequences and high prevalence of large segmental duplications on this chromosome.¹³

Gene coverage

Figure 4 displays the number of gene regions with coverage exceeding certain thresholds for all six SNP chips. For the CEU sample, among the 19 913 genes with at least five common SNPs in the HapMap, 17 730 (89.1%) genes have ≥80% coverage by the Human1M, while the numbers are 16 210 (81.4%), 15 873 (79.7%), 11 207 (56.3%), 12 613 (63.3%), and 6820 (34.2%), respectively, for the HumanHap650Y, HumanHap550, HumanHap300, SNP Array 6.0, and SNP Array 5.0. The numbers are slightly smaller for the CHB+JPT samples, but drop substantially for the YRI sample. We also note that a noticeable fraction of genes are not well covered by any of the six SNP chips (Figure 5). For example, for the CEU sample, 1897 (9.5%) genes have coverage of <80% on every one of the six SNP chips. The numbers of such genes are even greater for the CHB (2457, 12.7%), JPT (2295, 11.9%), and YRI (10 722, 51.8%) samples. Moreover, for each SNP chip, there are some genes that have zero coverage at r²=0.8, even though they contain five or more HapMap common SNPs (Table 2).

Table 2 Number of genes with 0% coverage by SNP chips

As in the analysis of local coverage, we also calculated the average coverage for genes on each chromosome (Figure 6). Again, we observed that the average coverage for genes on chromosome 19 is significantly lower than that for genes on other chromosomes. For example, for the CEU sample and SNP Array 6.0, the average coverage for genes on chromosome 19 is 61%, whereas the average coverage for genes on other chromosomes ranges from 73 to 85%. Since chromosome 19 has the highest density of genes among all human chromosomes, more than double the genome-wide average,¹³ improving its coverage is especially important.

Table 3 lists genes that have <30% coverage for the CEU sample by all six SNP chips and that are known to be associated with pathways in the KEGG and BioCarta databases (lists for other samples can be obtained from http://www.biostat.mc.vanderbilt.edu/SNPChipCoverage). This list includes several genes that have previously been reported to be associated with human diseases. For example, Long et al.¹⁴ noted that increased expression and a polymorphism of TGFB1 are associated with abdominal obesity and body mass index in humans. TGFB1 has also been reported to play a role in many other diseases, including Duchenne muscular dystrophy,¹⁵ kidney disease,¹⁶ cancer,¹⁷ scleroderma,¹⁸ lung disease,¹⁹ and herpes simplex virus-1 infection.²⁰ We recognize that these findings need to be replicated by future studies. However, despite the potentially important role of TGFB1 in many diseases, all six SNP chips we evaluated have poor coverage for this gene. If an investigator is mainly interested in studying these diseases, then it is likely that TGFB1 will be missed in the initial scan. Understanding how well different SNP chips cover known genes will help investigators determine whether supplementary genotyping is needed for certain genes of high interest.

Table 3 Genes with coverage less than 30% by all six SNP chips for the CEU sample

We next evaluated whether genes with poor coverage are more likely to be located in copy number variation (CNV) regions.²¹,²² We obtained the CNV annotation file from Affymetrix, which assembles information on all known CNV regions. For a given coverage threshold, the genes were categorized into two groups, one with coverage higher than the threshold and the other with coverage lower than the threshold. Within each group, we calculated the fraction of genes that are located in known CNV regions. Not surprisingly, a higher fraction of low coverage genes than of high coverage genes fall into known CNV regions, and the difference is greater for smaller coverage threshold values (Figure 7). This indicates that genes with poorer coverage are more likely to be located in known CNV regions. We also note that for the CEU sample, the fraction of low coverage genes in known CNV regions is slightly higher for the Illumina chips than for the Affymetrix chips. This is presumably because Illumina designed its products based on tag SNPs derived from the HapMap CEU sample, whereas Affymetrix designed its chips on the basis of sequence constraints when choosing the probes, which may result in better coverage of CNV regions.
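The grouping by coverage threshold can be sketched as follows. The gene list is made up for illustration, the function name is hypothetical, and assigning genes whose coverage equals the threshold to the high coverage group is an assumption, since the text does not say how ties were handled.

```python
import math


def cnv_fractions(genes, threshold):
    """Fraction of genes in known CNV regions, for low and high coverage groups.

    genes: list of (coverage, in_cnv) pairs, with coverage between 0 and 1.
    Returns (fraction among low coverage genes, fraction among high coverage genes).
    """
    low = [in_cnv for cov, in_cnv in genes if cov < threshold]
    high = [in_cnv for cov, in_cnv in genes if cov >= threshold]  # ties go here

    def fraction(flags):
        return sum(flags) / len(flags) if flags else math.nan

    return fraction(low), fraction(high)


# Hypothetical genes: (coverage on one chip, located in a known CNV region?)
genes = [
    (0.10, True), (0.20, True), (0.25, False), (0.60, False),
    (0.85, True), (0.90, False), (0.95, False), (1.00, False),
]
for threshold in (0.3, 0.8):
    low, high = cnv_fractions(genes, threshold)
    print(f"threshold {threshold:.1f}: low {low:.2f}, high {high:.2f}")
```

Repeating the call over a grid of thresholds gives the two curves that a plot such as Figure 7 would compare.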

Another possible reason for poor coverage is weak LD, as regions of weak LD require inclusion of the majority of their SNPs on the chip in order to achieve satisfactory coverage. For genes that are not located in known CNV regions, we calculated the average r² over all common SNP pairs that are 30 kb apart. As expected, genes with poor coverage tend to have significantly lower levels of LD than genes with high coverage (data not shown).

Discussion

For six currently available SNP chips, we calculated a map of local coverage across the genome as well as the coverage of all known genes. All six SNP chips have demonstrated variation in their coverage. As GWA studies are becoming a major approach toward disease gene discovery, such explicit evaluation of coverage variation will give a full picture of the genotyping products. We believe that our results can facilitate several aspects in GWA studies.

First, it will be of interest to investigators who have specific prior interest in certain regions in the genome (e.g. candidate genes, linkage peaks, conserved elements and so on). Knowing the extent of coverage for these regions or genes can help determine whether supplementary genotyping is needed in addition to the whole-genome SNP chip.

Second, evaluation of local coverage and gene coverage can ease interpretation and comparison of inconsistent results from GWA studies using different SNP chips. Inconsistency of results in a region or gene across studies might be partly due to differences in coverage. Our results on local coverage (Supplementary Figure 1) and gene coverage (Supplementary Table 1) provide a clear visualization of coverage across the genome for several widely used SNP chips. With such information, an investigator can easily compare local coverage of different SNP chips, aiding interpretation of different results.

Third, knowledge of local and gene coverage can help design new SNP chips. We recognize that the selection of SNPs to be included in a chip will depend on practical constraints; for example, it may be difficult to improve coverage for certain regions due to structural variations such as CNVs or other segmental repeats.²¹,²² However, our results indicate that many genes in the genome have low coverage simply due to weak LD. Previous studies have shown that some genes are preferentially located in such regions, for example, genes that are involved in immune response and sensory perception.²³ Low coverage of a gene will often result in low power to detect genetic association if the disease variant falls in the gene. Evaluation of local and gene coverage can provide guidance on which regions or genes should receive denser coverage in the new chip.

When calculating gene coverage, we used the transcriptional start and end positions to define gene regions. We recognize that functional variants may exist in the 5′ or 3′ UTRs. However, the UTR information is not available for all the known genes and there is no consensus on how large the UTRs should be. Indeed, we repeated our calculation by expanding each region by 5 kb on each end, and observed similar results (data not shown).

It is commonly believed that GWA studies offer an unbiased approach for identification of susceptibility variants for complex diseases. However, even if the investigator does not impose any prior information on a GWA study, the analysis results will still be biased toward regions and genes that are better covered by the SNP chip used in the study. Thus, for current SNP chips, it is desirable to carry out supplementary genotyping where necessary and to employ more flexible data analysis approaches that can take prior information into account.

In summary, we have evaluated coverage variation of different SNP chips for GWA studies at a finer scale. Although we focused on six SNP chips in this paper, the procedures that we employed are general and are not restricted to a particular product. As whole-genome SNP chips continue to evolve, we believe that detailed coverage evaluation will be valuable for comparing different genotyping products and designing future GWA studies. All results presented in this paper can be downloaded from http://www.biostat.mc.vanderbilt.edu/SNPChipCoverage.

References

1. Hirschhorn JN, Daly MJ: Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005; 6: 95–108.
2. Wang WY, Barratt BJ, Clayton DG, Todd JA: Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 2005; 6: 109–118.
3. Hinds DA, Stuve LL, Nilsen GB et al: Whole-genome patterns of common DNA variation in three human populations. Science 2005; 307: 1072–1079.
4. International HapMap Consortium: A haplotype map of the human genome. Nature 2005; 437: 1299–1320.
5. Klein RJ, Zeiss C, Chew EY et al: Complement factor H polymorphism in age-related macular degeneration. Science 2005; 308: 385–389.
6. Arking DE, Pfeufer A, Post W et al: A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization. Nat Genet 2006; 38: 644–651.
7. Herbert A, Gerry NP, McQueen MB et al: A common genetic variant is associated with adult and childhood obesity. Science 2006; 312: 279–283.
8. Duerr RH, Taylor KD, Brant SR et al: A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 2006; 314: 1461–1463.
9. Sladek R, Rocheleau G, Rung J et al: A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 2007; 445: 881–885.
10. Barrett JC, Cardon LR: Evaluating coverage of genome-wide association studies. Nat Genet 2006; 38: 659–662.
11. International Human Genome Sequencing Consortium: Finishing the euchromatic sequence of the human genome. Nature 2004; 431: 931–945.
12. Clark AG, Hubisz MJ, Bustamante CD, Williamson SH, Nielsen R: Ascertainment bias in studies of human genome-wide polymorphism. Genome Res 2005; 15: 1496–1502.
13. Grimwood J, Gordon LA, Olsen A et al: The DNA sequence and biology of human chromosome 19. Nature 2004; 428: 529–535.
14. Long JR, Liu PY, Liu YJ et al: APOE and TGF-beta-1 genes are associated with obesity phenotypes. J Med Genet 2003; 40: 918–924.
15. Bernasconi P, Torchiana E, Confalonieri P et al: Expression of transforming growth factor-beta-1 in dystrophic patient muscles correlates with fibrosis: pathogenetic role of a fibrogenic cytokine. J Clin Invest 1995; 96: 1137–1144.
16. Ziyadeh FN, Hoffman BB, Han DC et al: Long-term prevention of renal insufficiency, excess matrix gene expression, and glomerular mesangial matrix expansion by treatment with monoclonal antitransforming growth factor-beta antibody in db/db diabetic mice. Proc Nat Acad Sci 2000; 97: 8015–8020.
17. Derynck R, Rhee L, Chen EY, Van Tilburg A: Intron–exon structure of the human transforming growth factor-beta precursor gene. Nucleic Acids Res 1987; 15: 3188–3189.
18. Dong C, Zhu S, Wang T et al: Deficient Smad7 expression: a putative molecular defect in scleroderma. Proc Nat Acad Sci 2002; 99: 3908–3913.
19. Pittet JF, Griffiths MJ, Geiser T et al: TGF-beta is a critical mediator of acute lung injury. J Clin Invest 2001; 107: 1537–1544.
20. Gupta A, Gartner JJ, Sethupathy P, Hatzigeorgiou AG, Fraser NW: Anti-apoptotic function of a microRNA encoded by the HSV-1 latency-associated transcript. Nature 2006; 442: 82–85.
21. Feuk L, Carson AR, Scherer SW: Structural variation in the human genome. Nat Rev Genet 2006; 7: 85–97.
22. Redon R, Ishikawa S, Fitch KR et al: Global variation in copy number in the human genome. Nature 2006; 444: 444–454.
23. Smith AV, Thomas DJ, Munro HM, Abecasis GR: Sequence features in regions of weak and strong linkage disequilibrium. Genome Res 2005; 15: 1519–1534.

Acknowledgements

We thank Drs Goncalo Abecasis, Michael Boehnke, Vivian Cheung, and Richard Spielman for discussion and critical reading of an earlier version of the paper, and Dr Kai Wang for providing the KEGG and BioCarta pathway information. This work was supported by an internal grant from the Center for Human Genetics Research at Vanderbilt University (to CL), and by the University Research Foundation grant and the McCabe Pilot Award from the University of Pennsylvania (to ML).

Author information

Mingyao Li and Chun Li: These authors contributed equally to this work.

Authors and Affiliations

Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
Mingyao Li
Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, TN, USA
Chun Li
Center for Human Genetics Research, Vanderbilt University School of Medicine, Nashville, TN, USA
Chun Li
Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
Weihua Guan

Corresponding author

Correspondence to Mingyao Li.

Additional information

Supplementary Information accompanies the paper on European Journal of Human Genetics website (http://www.nature.com/ejhg)

Supplementary information

Supplementary Figure 1 (PDF 24449 kb)

Supplementary Table 1 (XLS 31935 kb)

About this article

Cite this article

Li, M., Li, C. & Guan, W. Evaluation of coverage variation of SNP chips for genome-wide association studies. Eur J Hum Genet 16, 635–643 (2008). https://doi.org/10.1038/sj.ejhg.5202007

Received: 19 April 2007
Revised: 30 November 2007
Accepted: 20 December 2007
Published: 06 February 2008
Issue Date: May 2008
DOI: https://doi.org/10.1038/sj.ejhg.5202007
