Letter
Nature Genetics 26, 233–236 (2000)
doi:10.1038/79981

Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences

Kris Irizarry (1), Vlad Kustanovich (2), Cheng Li (3), Nik Brown (5), Stanley Nelson (2, 4), Wing Wong (3) & Christopher J. Lee (1)

1  Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California, USA.

2  Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, USA.

3  Department of Statistics, University of California, Los Angeles, Los Angeles, California, USA.

4  Department of Pediatrics, University of California, Los Angeles, Los Angeles, California, USA.

5  Graduate Program in Computer Science, University of California, Los Angeles, Los Angeles, California, USA.

Correspondence should be addressed to Christopher J. Lee (leec@mbi.ucla.edu).

Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes (refs. 1–11). Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh the evidence for true polymorphism against sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations, namely comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing, verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).
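
The Bayesian weighing of polymorphism against sequencing error described above can be illustrated with a deliberately simplified sketch. It is not the model used in this work: it considers only one alignment column, one alternative allele and per-base error probabilities, and ignores misalignment, misclustering, chimaeras and library origin. The function names, the prior of 0.001, the grid of minor-allele frequencies and the Phred-style conversion from quality score to error probability are all assumptions made for illustration.

```python
import math

def phred_to_error(q):
    """Convert a Phred-style quality score to an error probability (assumed scale)."""
    return 10 ** (-q / 10.0)

def base_likelihood(observed, true_base, err):
    """P(observed call | true base); errors are spread evenly over the 3 other bases."""
    err = min(max(err, 1e-6), 0.75)  # clamp so every likelihood stays strictly positive
    return 1.0 - err if observed == true_base else err / 3.0

def snp_posterior(calls, major, minor, prior_snp=0.001,
                  freqs=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Posterior probability that a column holds a major/minor SNP.

    calls: list of (base, error_probability) pairs, one per EST read.
    prior_snp and freqs are illustrative assumptions, not values from the paper.
    """
    # H0: every read comes from the major allele; mismatches are sequencing errors.
    like_h0 = math.exp(sum(math.log(base_likelihood(b, major, e)) for b, e in calls))

    # H1: each read comes from the minor allele with frequency f, else the major
    # allele; the likelihood is averaged over a small grid of candidate f values.
    like_h1 = 0.0
    for f in freqs:
        log_like = sum(
            math.log(f * base_likelihood(b, minor, e)
                     + (1.0 - f) * base_likelihood(b, major, e))
            for b, e in calls
        )
        like_h1 += math.exp(log_like) / len(freqs)

    numerator = prior_snp * like_h1
    return numerator / (numerator + (1.0 - prior_snp) * like_h0)

# Four high-quality A calls and three high-quality G calls: strong SNP evidence.
strong = [("A", phred_to_error(30))] * 4 + [("G", phred_to_error(30))] * 3
# Six high-quality A calls and one low-quality G call: more likely a sequencing error.
weak = [("A", phred_to_error(30))] * 6 + [("G", 0.2)]

print(snp_posterior(strong, "A", "G"))
print(snp_posterior(weak, "A", "G"))
```

In this sketch a mismatch supported by several high-quality reads pushes the posterior towards 1, whereas a single low-quality mismatch leaves it below the prior, which is the qualitative behaviour the Bayesian approach is meant to capture.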


Nature Genetics
ISSN: 1061-4036
EISSN: 1546-1718
© 2000 Nature Publishing Group