European Journal of Human Genetics (2010) 18, 909–914; doi:10.1038/ejhg.2010.53; published online 21 April 2010

Risk of breast and prostate cancer is not associated with increased homozygosity in outbred populations

Victor Enciso-Mora1, Fay J Hosking1 and Richard S Houlston1

1Section of Cancer Genetics, Institute of Cancer Research, Sutton, Surrey, UK

Correspondence: Dr RS Houlston, Section of Cancer Genetics, Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey SM2 5NG, UK. Tel: +44 208 722 4175; Fax: +44 208 722 4365; E-mail:

Received 14 December 2009; Revised 23 February 2010; Accepted 10 March 2010; Published online 21 April 2010.

Abstract

Regions of restricted genetic heterogeneity due to identity by descent (autozygosity) are known to confer susceptibility to a number of diseases. Regions of germline homozygosity (ROHs) of 1–2Mb, the result of autozygosity, are detectable at high frequency in outbred populations. Recent studies have reported that ROHs, possibly through exposing recessive disease-causing alleles or alternative mechanisms, are associated with an increased cancer risk. To examine whether homozygosity is associated with breast or prostate cancer risk, we analysed 500K single-nucleotide polymorphism data from two genome-wide association studies conducted by the Cancer Genetic Markers of Susceptibility initiative. Six common ROHs were associated with breast cancer risk and four with prostate cancer (P<0.01). Intriguingly, one of the breast cancer ROHs maps to 6q22.31–6q22.3, a region that has been previously shown to confer breast cancer risk. Although none of the ROHs remained significantly associated with cancer risk after adjustment for multiple testing, a number of ROHs merit further interrogation. However, our findings provide no strong evidence that levels of measured homozygosity, whatever their aetiology (autozygosity, uniparental isodisomy or hemizygosity), confer an increased risk of developing breast or prostate cancer in predominantly outbred populations.


Keywords: homozygosity; risk; prostate; breast; cancer

Introduction

Breast and prostate cancers are among the most common cancers in Western populations. Twin studies indicate that inherited genetic factors contribute to the development of ~27% of breast and 42% of prostate cancer,1 and both cancers show two-fold elevated risks of the same disease in first-degree relatives of patients.2

To date, no high penetrance susceptibility loci have been identified for prostate cancer and most of the inherited risk is considered to be a consequence of the co-inheritance of multiple low-risk variants. Although high-penetrance susceptibility to breast cancer in the form of BRCA1 and BRCA2 mutations is well recognised, classical mutations in these genes only account for ~20% of familial disease.2

Recent genome-wide association (GWA) studies of breast and prostate cancers have identified several common single-nucleotide polymorphisms (SNPs) at different loci that influence the risk of these tumours.3, 4, 5 These risk variants are common in the general population (minor allele frequency (MAF)>5%), and are associated with odds ratios typically <1.3. Despite the relatively small predisposing effects that they individually confer on tumour risk, they can have more profound effects by functioning in concert. Moreover, their identification has provided important and novel insights into the biology of both breast and prostate cancer.

The majority of cancer predisposition genes that have to date been identified through GWA studies function in a co-dominant manner, and studies have found no good evidence for recessively functioning disease loci. Although this may be reflective of the biology, it may also be a consequence of GWA studies having suboptimal ability to detect recessively functioning disease alleles. Clues that tumour susceptibility may have a recessive basis come from some, but not all, studies showing an increased incidence associated with consanguinity and in populations characterised by a high degree of inbreeding.6, 7, 8, 9, 10, 11 Further evidence for the role of homozygosity in cancer predisposition is provided by experimental animal inbreeding (eg, backcrossing mice) being associated with increasing tumour incidence,12 although such animal model systems are impoverished of variation compared with natural populations, their alleles being selected for survival in the homozygous state. Specific situations of homozygosity have also been directly associated with cancer, such as uniparental disomy through altered imprinting.13

Common regions of homozygosity (ROHs), the result of autozygosity, have recently been shown to occur at a high frequency in outbred populations as a result of selection.14 However, such selected regions are not likely to have been selected in relation to cancer, which is generally late onset relative to past human life expectancy. Therefore, searching for ROHs on a genome-wide basis provides a means of potentially exposing recessively functioning disease genes. Recently, Assié et al15 studied breast, prostate and head/neck cancer patients of Northern/Western European ancestry by whole-genome loss of heterozygosity analysis using a series of microsatellite markers. A significant increase in the frequency of homozygosity in cases compared with controls was reported. In a separate study of colorectal cancer using Affymetrix XbaI 50K SNP arrays, Bacolod et al16 showed that cases harboured significantly more homozygous regions than did healthy individuals. Collectively, these data provide support to the hypothesis that multiple, recessive, cancer-predisposing loci exist, which are not readily detected using a conventional GWA approach based on analysis of individual SNPs. A possible explanation for this is that relative risks per locus are too low and/or that the disease-associated variants are not in strong linkage disequilibrium (LD) with tag SNPs, perhaps because of low allele frequencies.

Although GWA studies have limited ability to identify recessive disease-causing alleles through single SNP analyses, these data sets can potentially be exploited to search for this class of susceptibility allele through whole-genome homozygosity analysis (WGHA). Hence, to examine whether homozygosity is associated with an increased risk of developing breast or prostate cancer and to search for novel recessively acting disease loci, we conducted a WGHA of breast and prostate cancer GWA study data generated by the Cancer Genetic Markers of Susceptibility (CGEMS) initiative.


Materials and methods

Data set

This study was based on GWA study data on breast and prostate cancer generated by the CGEMS initiative, which are publicly accessible. Full details of the studies are provided in previously published material. Briefly, using Illumina Infinium HD Human550 Duo BeadChips (Illumina, San Diego, CA, USA), the GWA study of breast cancer involved genotyping 1183 breast cancer cases and 1185 controls.18 In the prostate GWA study, 1177 prostate cancer cases and 1149 controls were genotyped using Illumina Infinium Human217 and Human 317 BeadChips (Illumina).17 Of the 1177 prostate cancer cases analysed, 689 had aggressive disease, as defined by a Gleason score ≥7 and disease stage ≥III, and 488 prostate cancer cases had non-aggressive disease.

Quality control

We restricted our analysis to autosomal SNPs. We considered that a DNA sample had failed if it did not generate a genotype for >95% of loci. Similarly, an SNP was considered a failure if <90% of DNA samples generated a genotype at the locus. To identify samples showing relatedness, identity by state (IBS) values were calculated for pairs of individuals; for any pair with >80% identical SNP genotypes, we removed the sample with the lower call rate from the analysis. We excluded SNPs on the basis of deviation from Hardy–Weinberg equilibrium (HWE) using a threshold of P<1 × 10⁻³ in either cases or controls. We also removed SNPs with MAF<0.01. To identify and exclude individuals with non-Western European ancestry, case and control data were merged with data from individuals of different ethnicities from the International HapMap Project. Genome-wide IBS distances were then calculated for markers shared between HapMap and our SNP panel, and the resulting dissimilarity measures were used to perform principal component analysis.
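The call-rate and allele-frequency filters described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function names, the genotype encoding (0/1/2 copies of one allele, `None` for a missing call) and the order of the filters (samples first, then SNPs) are assumptions, and the relatedness, HWE and ancestry steps are omitted.

```python
def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF from allele counts coded 0/1/2, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1.0 - freq)


def qc_filter(genotypes, min_sample_call=0.95, min_snp_call=0.90, min_maf=0.01):
    """Return (kept sample indices, kept SNP indices).

    genotypes[i][j] is the call for sample i at SNP j.
    A sample is kept if it has calls at more than min_sample_call of loci;
    an SNP is kept if at least min_snp_call of the kept samples have a call
    and its MAF is at least min_maf. Filter order is an assumption.
    """
    kept_samples = [i for i, row in enumerate(genotypes)
                    if call_rate(row) > min_sample_call]
    kept_snps = []
    for j in range(len(genotypes[0])):
        column = [genotypes[i][j] for i in kept_samples]
        if call_rate(column) >= min_snp_call and minor_allele_frequency(column) >= min_maf:
            kept_snps.append(j)
    return kept_samples, kept_snps


# Toy data: 4 samples x 4 SNPs; sample 1 has a missing call, SNP 0 is monomorphic.
toy = [
    [0, 1, 2, 0],
    [0, 1, 1, None],
    [0, 2, 1, 1],
    [0, 0, 1, 2],
]
# A looser sample threshold is used because the toy panel has only 4 SNPs.
toy_samples, toy_snps = qc_filter(toy, min_sample_call=0.8)
print(toy_samples, toy_snps)
```

In the toy call, the sample with a missing genotype falls below the loosened sample threshold and the monomorphic SNP fails the MAF filter; with real data the default thresholds mirror those stated in the text.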

After imposing these stringent quality control measures for the breast cancer GWA study, 512159 SNP genotypes were available on 1144 cases and 1141 controls. For the prostate cancer GWA study, 509008 SNP genotypes were available on 1168 cases (685 with aggressive disease) and 1093 controls. These two data sets formed the basis of our WGHA.

Statistical and bioinformatics analysis

We detected ROHs using PLINK19 (v1.06) software. The ROH tool moves a sliding window of SNPs across the entire genome. To allow for genotyping error or other sources of artificial heterozygosity, such as paralogous sequences, within a stretch of truly homozygous SNPs and, hence, to prevent underestimating the number and size of ROHs, 2% heterozygous SNPs were permitted in each window. We set the remaining options to default values (including allowing five missing calls per window, thereby ensuring >90% positive-predictive value of each ROH), except that we varied the parameters for homozyg-snp according to our heuristic preferences for defining ROH as detailed below. Subsequent statistical analyses were performed using packages available in R (version 2.7.0; R Foundation for Statistical Computing, Vienna, Austria) and specifically written Perl code. Comparison of the distribution of categorical variables was performed using the χ²-test. To compare the difference in the average number of ROHs between cases and controls, we used the Student t-test. Naive adjustment for multiple testing was based on the Bonferroni correction.

We used three metrics to investigate the selection pressure on each ROH. The integrated haplotype score (iHS) is based on LD surrounding a positively selected allele compared with background, providing evidence of recent positive selection at a locus.20 An iHS score >2.0 reflects the fact that haplotypes on the ancestral background are longer compared with those on the derived allelic background. Episodes of selection tend to skew SNP frequencies in different directions and Tajima's D is based on the frequencies of SNPs segregating in the region of interest.21 Fixation index (Fst) measures the degree of population differentiation at a locus, taking values from 0 to 1.0.22 iHS, D and Fst metrics were obtained from Haplotter Software (University of Chicago, Chicago, IL, USA).

Identification of runs of homozygosity

To examine commonly occurring ROH and to detect statistically significant associations, we only considered ROH that occurred in ≥10 individuals, thereby ensuring that our study had 80% or better power to identify statistically significant differences in ROH between cases and controls. The initial search for ROH was performed using PLINK,19 with a specified length of 80 consecutive SNPs. This ROH length was chosen to be more than an order of magnitude larger than the mean haploblock size in the human genome without being so large as to be very rare. The likelihood of observing 80 consecutive chance events can be calculated as follows.14 In the breast cancer series, mean heterozygosity in controls was calculated to be 34%. Thus, given 512159 SNPs and 2285 individuals, a minimum length of 58 would be required to produce <5% randomly generated ROHs across all subjects ($(1-0.34)^{58} \times 512159 \times 2285 = 0.040$). A consequence of LD is that SNP genotypes are not always independent, thereby inflating the probability of chance occurrences of biologically meaningless ROHs. Analysis based on the pairwise LD SNP pruning function of PLINK, with a default value of r²>0.8 being necessary to declare that one SNP tags another and restricting the search of tags to within 250kb, showed 370611 separable tag groups, representing a 27.6% reduction of information compared with the original number of SNPs. Thus, ROHs of length 80 were used to approximate the degrees of freedom of 58 independent SNP calls. Corresponding calculation in the prostate cancer series resulted in the same criterion being adopted.
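The arithmetic behind the 58-SNP and 80-SNP figures can be checked with a short Python sketch. The function names are hypothetical, and the final step (dividing the independent-SNP length by the fraction of SNPs retained as tag groups) is one plausible reading of how 58 independent calls map to roughly 80 genotyped SNPs; the text does not spell that step out.

```python
def expected_chance_rohs(het, run_len, n_snps, n_individuals):
    """Expected number of runs of run_len homozygous calls arising by chance,
    treating SNP calls as independent with mean heterozygosity het."""
    return (1.0 - het) ** run_len * n_snps * n_individuals


def min_run_length(het, n_snps, n_individuals, alpha=0.05):
    """Smallest run length whose expected chance count falls below alpha."""
    run_len = 1
    while expected_chance_rohs(het, run_len, n_snps, n_individuals) >= alpha:
        run_len += 1
    return run_len


# Values reported for the breast cancer series.
HET = 0.34             # mean heterozygosity in controls
N_SNPS = 512159        # SNPs passing quality control
N_INDIVIDUALS = 2285   # 1144 cases + 1141 controls
N_TAG_GROUPS = 370611  # separable tag groups after LD pruning

independent_len = min_run_length(HET, N_SNPS, N_INDIVIDUALS)
chance_at_min = expected_chance_rohs(HET, independent_len, N_SNPS, N_INDIVIDUALS)

# Assumed adjustment: scale by the fraction of SNPs that remain independent.
retained_fraction = N_TAG_GROUPS / N_SNPS
ld_adjusted_len = round(independent_len / retained_fraction)

print(independent_len, round(chance_at_min, 3), ld_adjusted_len)
```

Under these assumptions the search returns the 58-SNP minimum quoted in the text, and the LD-adjusted length rounds to the 80-SNP criterion.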

Once all ROHs of at least 80 SNPs in length were identified, these were pruned to only those ROHs that occurred in ≥10 individuals. To ensure that a minimum length and minimum number of SNPs in each ROH were maintained, each individual's SNP data were recoded as one if the SNP was in an ROH for that individual and zero otherwise. Thereafter, any SNP with <10 individuals coded as one was recoded to zero in all individuals, and any ROHs that, because of this recoding, were now <80 SNPs in length were removed. This process therefore resulted in a list of ‘common’ ROHs having a minimum of 80 consecutive ROH calls across ≥10 samples and with each ROH having identical start and end locations across all individuals in whom that ROH is observed.
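The recoding and pruning procedure can be illustrated with a minimal Python sketch. It is a simplified reading of the description, not the Perl code used in the study: the function names are hypothetical, SNPs are assumed to be ordered along a single chromosome, and an individual is counted as carrying a common ROH only if their own calls still contain a long enough run inside it.

```python
def runs_of_ones(flags, min_len):
    """Return (start, end) index pairs of maximal runs of 1s of at least min_len."""
    runs, start = [], None
    for i, flag in enumerate(list(flags) + [0]):  # sentinel 0 closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def common_roh_regions(calls, min_individuals=10, min_snps=80):
    """calls[i][j] is 1 if SNP j lies in an ROH for individual i, else 0.

    SNPs called in fewer than min_individuals are masked to zero, and only
    runs of at least min_snps unmasked SNPs are kept as common ROHs.
    """
    n_snps = len(calls[0])
    counts = [sum(row[j] for row in calls) for j in range(n_snps)]
    mask = [1 if c >= min_individuals else 0 for c in counts]
    return runs_of_ones(mask, min_snps)


def carriers(calls, region, min_snps=80):
    """Indices of individuals with a run of at least min_snps calls inside region."""
    start, end = region
    return [i for i, row in enumerate(calls)
            if runs_of_ones(row[start:end + 1], min_snps)]


# Toy example with small thresholds (2 individuals, 3 SNPs) for readability.
toy_calls = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
toy_regions = common_roh_regions(toy_calls, min_individuals=2, min_snps=3)
toy_carriers = carriers(toy_calls, toy_regions[0], min_snps=3)
print(toy_regions, toy_carriers)
```

Because the region boundaries come from the per-SNP mask rather than from any one individual's run, every carrier shares the same start and end positions, which matches the property described in the text.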

Results

Before conducting a WGHA of the two GWA data sets, we subjected cases and controls to rigorous quality control in terms of excluding samples and SNPs with poor call rates. Furthermore, we excluded SNPs showing significant departure from HWE. We then critically evaluated data sets for ancestral differences by principal component analysis. Figure 1 shows that all sample series were ancestrally comparable and were representative of European ancestry after excluding individuals of different ethnicity.

Figure 1.

Comparison of ethnicity in each of the sample series. The first two principal components of the analysis were plotted. HapMap data are plotted in grey; CEU individuals as ‘•’; CHB+JPT individuals as ‘’; and YRI individuals as ‘+’. Plotted in black, as ‘X’, are (a) breast cancer cases, (b) prostate cancer cases, (c) breast cancer controls and (d) prostate cancer controls.


A total of 415 and 426 ROHs were identified in breast cancer and prostate cancer series, respectively (Supplementary Tables 1 and 2), encompassing ~35% of the genome, as measured by both the total chromosomal length and the number of included SNPs in both series. Figure 2 shows the similarity between genome-wide plots of the location of each ROH among the genomes of both cancer cases and controls from the two studies.

Figure 2.

Genome-wide plots showing the location of each run of homozygosity among the genomes in (a) breast cancer cases and (b) controls; (c) prostate cancer cases and (d) controls. The threshold limit was set to a minimum of at least 80 consecutive homozygous SNPs.


In both data sets, six ROHs exceeded 12Mb in length and included ROHs encompassing the centromeric regions of chromosomes 3, 5, 6, 8, 11 and 16 (Supplementary Tables 1 and 2). Five other ROHs exceeded 12Mb in one of the series and were all >8.7Mb in the other series. Although the length of these large ROHs across centromeres is partly a consequence of extended regions that are not annotated by SNPs, this is unlikely to be the sole explanation, as these centromeric regions are flanked by large homozygous regions. One of these centromeric regions (8p11.21–8q11.23) has been previously highlighted in several genome-wide studies of selective sweeps, thus validating our methodology.20, 23, 24, 25 Five non-centromeric regions harbouring ROHs >12Mb in length in one of the series and >8.5Mb in the other series were also identified: 3q25.31–26.1, 4q13.1–13.3, 6q13.5–15, 7q31.1–32.1 and 14q13.3–22.1 (Supplementary Tables 1 and 2).

In both series, the ROH covering the largest genomic region (28Mb) spanned the centromere of chromosome 3 (3p12.3–3q13.11), a region previously shown to be characterised by a high frequency of ROH in the European population.24 This ROH was also defined by the largest number of SNPs in a single ROH in both studies.

There were 15 ROHs that had a frequency >25% in the breast cancer control series and 16 in the prostate cancer control series (Table 1). In all, 13 of the ROHs were common to both data sets. Of those common to both data sets, five have previously been reported to have high frequency and to harbour several gene categories that have been identified to be influenced by a high degree of selective pressure.20, 23, 24, 25 Publicly available data from HapMap do not indicate that these regions have excessive copy number variation or segmental duplication, nor do they have very low recombination rates.23 However, the high iHS, D and Fst metrics for each region are compatible with positive selection in the Caucasian samples on which the two case series are based (Table 1).

The total number of common ROHs observed in each individual was calculated to permit genome-wide comparison between case and control groups in each of the two data sets. Therefore, each individual was assigned a value between 0 and 415 in the breast cancer series and a value between 0 and 426 in the prostate cancer series. Overall, patients with breast cancer (mean=21.10, SD=4.75) and controls (mean=20.97, SD=4.75) showed no significant difference in the average number of ROHs (t(2277)=0.6135, P=0.54). Similarly, for prostate cancer, overall statistics showed no significant difference (t(2255)=0.5434, P=0.59) in the average number of ROHs between cases (mean=21.07, SD=4.79) and controls (mean=20.96, SD=4.71).
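As a rough illustration of this comparison, a pooled-variance Student t-test can be computed from the summary statistics alone. This Python sketch is an approximation under stated assumptions: it uses the post-QC sample sizes (1144 cases, 1141 controls), the rounded means and SDs quoted above, and a normal approximation to the t distribution for the P-value, so it will not reproduce the reported statistics or degrees of freedom exactly.

```python
import math


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student t-test from summary statistics.

    Returns (t, degrees of freedom, two-sided P). The P-value uses a normal
    approximation, which is adequate only when the degrees of freedom are large.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean1 - mean2) / se
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, df, p


# Breast cancer series: cases versus controls (rounded summary values).
t_breast, df_breast, p_breast = pooled_t_test(21.10, 4.75, 1144, 20.97, 4.75, 1141)
print(round(t_breast, 2), df_breast, round(p_breast, 2))
```

The sketch gives a small t statistic and a P-value far above 0.05, in line with the conclusion that cases and controls carry a similar average number of ROHs.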

In addition, to examine whether there were differences in the distributions of ROH in the genomes of cases and controls, we computed cumulative distributions for both series (Figure 3). This analysis also provided no support for a difference in autozygosity profiles between cases and controls in either data set on a genome-wide basis. Furthermore, the distributions for cases and controls were directly comparable between the two series.

Figure 3.

Cumulative distributions of ROH in breast cancer and prostate cancer series. The graph is presented in such a way that each data point represents the cumulative fraction (y axis) of samples with the corresponding minimum cumulative run of homozygosity (x axis).


At an individual level, six ROHs differed significantly (P<0.01) between breast cancer cases and controls (Table 2). Five of these six ROHs were more common in cases than in controls. ROH286 was identified in 1.4% of cases (n=16) compared with 0.2% of controls (n=2) (P=0.0009). ROH202 mapping to 6q22.31–22.3 was identified in 4.3% (n=49) of cases compared with 2.1% of controls (n=24). Intriguingly, this region of association, to which ECHDC1 (enoyl coenzyme A hydratase domain-containing protein 1; MIM 612136) and RNF146 (ring-finger protein 146; MIM 612137) map, has previously been linked to breast cancer susceptibility in a GWA study of Ashkenazi Jews.26 After adjusting for multiple testing, however, these associations, as well as others, were not statistically significant.
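The per-ROH comparison and the effect of multiple-testing adjustment can be sketched in Python for ROH286. Two assumptions are made here and are not confirmed by the text: the χ²-test is taken to be the standard 2×2 Pearson test without continuity correction, and the Bonferroni threshold is taken to be 0.05 divided by the 415 common ROHs in the breast cancer series.

```python
import math


def chi2_2x2(case_with, case_total, ctrl_with, ctrl_total):
    """Pearson chi-square test (1 d.f., no continuity correction) on a 2x2 table.

    Returns (chi-square statistic, P-value).
    """
    a, b = case_with, case_total - case_with
    c, d = ctrl_with, ctrl_total - ctrl_with
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 d.f. equals the two-sided normal tail.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# ROH286: 16 of 1144 cases versus 2 of 1141 controls.
chi2_roh286, p_roh286 = chi2_2x2(16, 1144, 2, 1141)

# Assumed Bonferroni threshold: one test per common ROH in the series.
N_TESTS = 415
bonferroni_alpha = 0.05 / N_TESTS

print(round(chi2_roh286, 2), p_roh286, p_roh286 < bonferroni_alpha)
```

Under these assumptions the nominal P-value is close to the reported 0.0009 but remains above the Bonferroni threshold, which is consistent with the statement that no association survived adjustment for multiple testing.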

Only one of the four ROHs that differed significantly (P<0.01) between prostate cancer cases and controls had a higher prevalence in cases (ROH86), occurring in 19.2% (n=224) of cases and in 14.7% (n=161) of controls (P=0.0049; Table 2). Interestingly, PCGEM1 (prostate-specific gene 1; MIM 605443) maps to the centromeric region of ROH86. PCGEM1 is a prostate tissue-specific and prostate cancer-associated non-coding RNA (ncRNA) gene that has a role in the regulation of apoptosis.27, 28 PCGEM1 displays androgen-dependent expression and is overexpressed in ~80% of prostate cancer, suggesting specific functions of PCGEM1 in the biology and tumourigenesis of the prostate gland.27, 28 ROH149, which maps to 4q28.2–28.3, was identified in 7.0% (n=82) of cases compared with 10.1% (n=110) of controls (P=0.0095). An association between 4q28.2 defined by microsatellite D4S2394 and prostate cancer was identified by Assié et al15 (P<0.001). The clinical behaviour of prostate cancer is heterogeneous and is likely to reflect differences in tumour biology and possibly molecular aetiology. We therefore conducted a subgroup analysis of prostate cancer cases stratified by aggressive phenotype. However, this analysis did not provide any further support for a relationship between ROH and prostate cancer risk (data not shown).

Discussion

Two recent studies of cancer have reported that signatures of autozygosity correlate with tumour incidence, and it has been proposed that these regions showing identity by descent (IBD) may be the locations of genes contributing to tumour heritability.15, 16 Moreover, these data have been interpreted as providing an explanation for the increased risk of various cancers that has often been reported in inbred groups.

In this study, we have used high-density genomic scan data to compare the structure of genetic variation in patients with breast and prostate cancer with that in healthy controls. By imposing stringent quality control, we have ensured that individuals in our study were from an apparently panmictic population with no evidence of stratification. Our data provide further evidence that ROHs, ranging in size from 1 to 28Mb, are common in individuals from an outbred population.29, 30, 31, 32 As documented in Table 1, the common ROHs that we have identified in controls are representative of autozygosity due to distant consanguinity rather than due to chromosomal abnormalities or common copy number variants. Moreover, these homozygous regions are too common and too small to be caused by recent consanguinity; instead, they are consistent with regions under selective pressure.33 On the basis of our analysis, there was, however, no evidence of an association between homozygosity and risk of breast or prostate cancer, either on the basis of total ROH size per individual or for any specific ROH.

The assertion that increased autozygosity correlates with cancer incidence provides an attractive explanation for reported increased cancer risk in inbred populations. However, as recently articulated, several criticisms can be levelled at this assertion. The observation of an increased cancer risk associated with consanguinity has often been based on studies of a small number of individuals in an isolated community or a single large family with a high level of inbreeding.34 Thus, the relevance of inbreeding to the population risk of cancer is unclear, as inbreeding and founder effects may be confounded. Sample sizes in molecular studies15, 16 that have sought to establish a relationship between ROH and cancer risk have generally been small and, crucially, case and control groups have been ethnically heterogeneous or unmatched. Furthermore, the study of breast, prostate and head/neck cancer reported by Assié et al15 made use of relatively sparse microsatellite data. In this study of breast and prostate cancer, we have addressed these shortcomings by analysing a large set of cases and controls genotyped for several hundred thousand SNPs and by imposing a high level of quality control in terms of both genotyping and sample ancestry.

Although we do not advocate the use of uncorrected P-values, it is intriguing that our analysis identified two regions previously implicated in susceptibility to these cancers. Although this suggests that further interrogation of other regions may prove profitable, there is limited support from our analysis that WGHA of outbred populations provides a robust methodology for identifying novel cancer susceptibility loci.

In conclusion, our findings do not provide evidence that levels of measured homozygosity, whatever their aetiology (autozygosity, uniparental isodisomy or hemizygosity), confer an increased risk of developing either breast or prostate cancer in a predominantly outbred population. Moreover, it is unlikely that large numbers of recessive alleles exist that predispose to breast or prostate cancer and are unmasked by autozygosity in most European populations. This analysis does not exclude the possibility that recessively functioning disease alleles exist for these cancers, whatever their penetrances, or that autozygosity may operate in populations characterised by a high level of inbreeding.








Conflict of interest

The authors declare no conflict of interest.

References

  1. Lichtenstein P, Holm NV, Verkasalo PK et al: Environmental and heritable factors in the causation of cancer–analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 2000; 343: 78–85.
  2. Balmain A, Gray J, Ponder B: The genetics and genomics of cancer. Nat Genet 2003; 33: 238–244.
  3. Easton DF, Pooley KA, Dunning AM et al: Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 2007; 447: 1087–1093.
  4. Easton DF, Eeles RA: Genome-wide association studies in cancer. Hum Mol Genet 2008; 17: R109–R115.
  5. Eeles RA, Kote-Jarai Z, Giles GG et al: Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet 2008; 40: 316–321.
  6. Abramson JH, Pridan H, Sacks MI, Avitzour M, Peritz E: A case-control study of Hodgkin's disease in Israel. J Natl Cancer Inst 1978; 61: 307–314.
  7. Bener A, El Ayoubi HR, Chouchane L et al: Impact of consanguinity on cancer in a highly endogamous population. Asian Pac J Cancer Prev 2009; 10: 35–40.
  8. Feldman JG, Lee SL, Seligman B: Occurrence of acute leukemia in females in a genetically isolated population. Cancer 1976; 38: 2548–2550.
  9. Lebel RR, Gallagher WB: Wisconsin consanguinity studies. II: Familial adenocarcinomatosis. Am J Med Genet 1989; 33: 1–6.
  10. Shami SA, Qaisar R, Bittles AH: Consanguinity and adult morbidity in Pakistan. Lancet 1991; 338: 954.
  11. Simpson JL, Martin AO, Elias S, Sarto GE, Dunn JK: Cancers of the breast and female genital system: search for recessive genetic factors through analysis of human isolate. Am J Obstet Gynecol 1981; 141: 629–636.
  12. Demant P: Cancer susceptibility in the mouse: genetics, biology and implications for human cancer. Nat Rev Genet 2003; 4: 721–734.
  13. Henry I, Bonaiti-Pellie C, Chehensse V et al: Uniparental paternal disomy in a genetic cancer-predisposing syndrome. Nature 1991; 351: 665–667.
  14. Lencz T, Lambert C, DeRosse P et al: Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proc Natl Acad Sci USA 2007; 104: 19942–19947.
  15. Assie G, LaFramboise T, Platzer P, Eng C: Frequency of germline genomic homozygosity associated with cancer cases. JAMA 2008; 299: 1437–1445.
  16. Bacolod MD, Schemmann GS, Wang S et al: The signatures of autozygosity among patients with colorectal cancer. Cancer Res 2008; 68: 2610–2621.
  17. Yeager M, Orr N, Hayes RB et al: Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet 2007; 39: 645–649.
  18. Hunter DJ, Kraft P, Jacobs KB et al: A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet 2007; 39: 870–874.
  19. Purcell S, Neale B, Todd-Brown K et al: PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  20. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol 2006; 4: e72.
  21. Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989; 123: 585–595.
  22. Holsinger KE, Weir BS: Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nat Rev Genet 2009; 10: 639–650.
  23. The International HapMap Consortium: A haplotype map of the human genome. Nature 2005; 437: 1299–1320.
  24. Wang ET, Kodama G, Baldi P, Moyzis RK: Global landscape of recent inferred Darwinian selection for Homo sapiens. Proc Natl Acad Sci USA 2006; 103: 135–140.
  25. Williamson SH, Hubisz MJ, Clark AG, Payseur BA, Bustamante CD, Nielsen R: Localizing recent adaptive evolution in the human genome. PLoS Genet 2007; 3: e90.
  26. Gold B, Kirchhoff T, Stefanov S et al: Genome-wide association study provides evidence for a breast cancer risk locus at 6q22.33. Proc Natl Acad Sci USA 2008; 105: 4340–4345.
  27. Fu X, Ravindranath L, Tran N, Petrovics G, Srivastava S: Regulation of apoptosis by a prostate-specific and prostate cancer-associated noncoding gene, PCGEM1. DNA Cell Biol 2006; 25: 135–141.
  28. Srikantan V, Zou Z, Petrovics G et al: PCGEM1, a prostate-specific gene, is overexpressed in prostate cancer. Proc Natl Acad Sci USA 2000; 97: 12216–12221.
  29. Gibson J, Morton NE, Collins A: Extended tracts of homozygosity in outbred human populations. Hum Mol Genet 2006; 15: 789–795.
  30. Li LH, Ho SF, Chen CH et al: Long contiguous stretches of homozygosity in the human genome. Hum Mutat 2006; 27: 1115–1121.
  31. Simon-Sanchez J, Scholz S, Fung HC et al: Genome-wide SNP assay reveals structural genomic variation, extended homozygosity and cell-line induced alterations in normal individuals. Hum Mol Genet 2007; 16: 1–14.
  32. Broman KW, Weber JL: Long homozygous chromosomal segments in reference families from the centre d’Etude du polymorphisme humain. Am J Hum Genet 1999; 65: 1493–1500.
  33. Woods CG, Cox J, Springell K et al: Quantification of homozygosity in consanguineous individuals with autosomal recessive disease. Am J Hum Genet 2006; 78: 889–896.
  34. Akasaka T, Balasas T, Russell LJ et al: Five members of the CEBP transcription factor family are targeted by recurrent IGH translocations in B-cell precursor acute lymphoblastic leukemia (BCP-ALL). Blood 2007; 109: 3451–3461.
Acknowledgements

This study was supported by a grant from Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund). Our analysis made use of data from the Cancer Genetic Markers of Susceptibility (CGEMS) prostate and breast cancer studies. A full list of the investigators who contributed to the generation of data is available from

Supplementary Information accompanies the paper on the European Journal of Human Genetics website