Inbreeding and homozygosity in breast cancer survival

Thomsen, Hauke; Filho, Miguel Inacio da Silva; Woltmann, Andrea; Johansson, Robert; Eyfjörd, Jorunn E.; Hamann, Ute; Manjer, Jonas; Enquist-Olsson, Kerstin; Henriksson, Roger; Herms, Stefan; Hoffmann, Per; Chen, Bowang; Huhn, Stefanie; Hemminki, Kari; Lenner, Per; Försti, Asta

doi:10.1038/srep16467


Article
Open access
Published: 12 November 2015


Scientific Reports volume 5, Article number: 16467 (2015)


Abstract

Genome-wide association studies (GWASs) help to understand the effects of single nucleotide polymorphisms (SNPs) on breast cancer (BC) progression and survival. We performed multiple analyses on data from a previously conducted GWAS for the influence of individual SNPs, runs of homozygosity (ROHs) and inbreeding on BC survival. (I.) The association analysis of individual SNPs indicated no differences in the proportions of homozygous individuals between short-time survivors (STSs) and long-time survivors (LTSs). (II.) The analysis revealed differences among the populations in the number of ROHs per person and the total and average length of ROHs per person, and between LTSs and STSs in the number of ROHs per person. (III.) Common ROHs at particular genomic positions were nominally more frequent among LTSs than among STSs. Common ROHs showed significant evidence for natural selection (iHS, Tajima’s D, Fay and Wu’s H). Most regions could be linked to genes related to BC progression or treatment. (IV.) Results were supported by a higher level of inbreeding among LTSs. Our results showed that an increased level of homozygosity may give individuals an advantage during BC treatment. Although common ROHs were short, variants within ROHs might favor survival of BC and may function in a recessive manner.


Introduction

Breast cancer (BC) is the most common cancer among women, comprising about 23% of all female cancers. Each year, nearly 1.67 million new cases are diagnosed and almost 522 000 women die of this disease¹. It has been shown that survival of BC is partly heritable due to yet unknown genetic factors². Further knowledge about the effects of genetic variants on BC survival will help to predict the patient’s individual risk for disease progression and survival probabilities and to develop new and better therapies and preventive strategies. Within the last six years 34 genome-wide association studies (GWASs) on BC have been performed identifying 194 new susceptibility loci (http://www.genome.gov/gwastudies). Their identification has provided important and novel insights into the biology of BC³. In addition, three GWASs have been conducted on BC survival but they only led to the discovery of three prognostic loci^4,5,6.

A more global view on the GWAS data can reveal new insights into cancer formation and progression and give new clues for further investigations. The majority of cancer predisposition genes that have been identified through GWASs function in a co-dominant manner and the studies have not found evidence for recessively functioning disease loci. From the biological point of view it is reasonable to assume that tumors may also appear as an autosomal recessive disease. This is supported by a study that shows an increased cancer incidence associated with consanguinity and higher risk in populations characterized by a higher degree of inbreeding and corresponding homozygosity⁷. As a result, affected individuals are more often homozygous for sequence variants that underlie the disease⁸.

Unfortunately, conventional methods to analyze GWASs and whole exome or whole genome sequencing studies are prone to overlook variants which might exert a recessive effect on the risk of a disease, either as homozygotes or compound heterozygotes⁹. Therefore, a variety of studies have been performed to identify regions with runs of homozygosity (ROHs) and to prove their recessive effects on the risk of complex diseases and traits^10,11,12,13,14,15. Several studies have even investigated whether ROHs are associated with an increased risk of developing cancers such as breast, colorectal, lung, prostate and head/neck^3,16,17,18. While Assie et al. showed increased germline homozygosity at specific loci in cancer cases, Orloff et al. and Spain et al. reported a significantly increased frequency of homozygous regions in cases compared with controls^16,17,18. However, Enciso-Mora et al. provided no strong evidence for homozygosity as a risk factor for breast or prostate cancer³.

We conducted a whole-genome homozygosity analysis on BC survival based on our GWAS data¹⁹. The aim of our study was to examine whether extended homozygosity is associated with an increased or decreased survival of BC and to search for novel recessively acting disease loci.

Results

The GWAS data were subjected to rigorous quality control based on standard protocols²⁰. The data set was then critically evaluated for ancestral differences by principal component analysis. Figure 1(A,B) show plots of the first two principal components for the study samples and the corresponding HapMap data before and after exclusion of outliers. There was a good match with the samples of European ancestry. After quality control association between homozygosity and BC survival was tested in three ways.

Genome-wide assessment of associations between homozygosity at single SNPs and BC survival

The mean of the overall proportion of homozygosity for the complete SNP set was significantly lower in STSs than in LTSs (P = 0.05). Subsequently, a genome-wide test for association between homozygosity and BC survival was performed on a SNP-by-SNP basis. The corresponding QQ-plot of the P-values is shown in supplemental Figure 1. Results for the best SNPs with P < 1*10⁻⁴ are shown in Table 1. The most strongly associated SNP was rs9754606 (chr3: 192 220 488 bp; P_homoz = 2.2*10⁻⁶; chi² = 22.41). After false discovery rate (FDR) control, no SNP fell below q* < 0.05, so none of the associations was globally significant.

Table 1 Association between homozygosity and time of survival for individual SNPs.


Identification of individual ROHs per person and association between ROHs and BC survival

Within our sample set we identified a total of 7646 individual ROHs larger than 1000 kb across all 675 individuals (3608 in the 340 STSs and 4038 in the 335 LTSs). The average length of these ROHs was 2598.59 kb. For each individual, an average of 11.32 ROH segments were detected, which covered in total 8.1% of the human genome. An overview of the distribution of ROHs in the different populations is represented in Fig. 2, showing that most of the individuals of the Umeå population had about 10 to 20 ROHs per person whereas the German population had its mean at eight ROHs per person. Figure 3 shows the individual numbers of ROHs per person in relation to the total length of the ROHs in Mb (all ROHs above 1 Mb differentiated by population). Data points for the German and Malmö subgroups were generally narrowly distributed along both axes, indicating that these individuals had few, relatively short ROHs per person. The two other sample groups were much more widely spread along both axes, reflecting the presence of many and much longer ROHs per person.

Overall, the mean ROH size per person as well as the total length of ROHs per person was not different between STSs and LTSs (Table 2). However, the number of ROHs per person was significantly higher in LTSs than in STSs (P = 0.0001). Even though the population identifier used as a covariate in a generalized linear model had a strong effect (P < 4.55*10⁻⁶), the difference in the number of ROHs between STSs and LTSs was still significant at P = 0.049. After applying a permutation test the number of ROHs per person remained significant (P = 0.049), but the origin of the different populations also stayed significant (P < 1*10⁻⁶), indicating the population as a confounder.

Table 2 Burden analysis of ROH for the entire data set and each subset.


Due to the observed differences in the number of ROHs per person, the burden analysis was extended to the population subgroups (Table 2). In none of the subgroups did any of the calculated parameters differ significantly between STSs and LTSs, even though LTSs of the Icelandic subpopulation showed marginally higher numbers of ROHs per person (P = 0.08). Table 2 also gives an overview of the differences among the populations in general. The means of the number of ROHs per person and of the total and average length of the ROHs per person were significantly smaller in the German subset than in the other three subpopulations (P = 0.0003). Compared to the Umeå subset, the Malmö and Icelandic subsets showed significantly fewer ROHs per person and smaller total and average ROH size (P = 0.003 and P = 0.0001, respectively).

Common ROH regions and association with BC survival

For a more powerful association analysis between BC survival and ROHs, all individuals of the different populations were pooled. A total of 2287 groups of overlapping regions of homozygosity were formed, of which 143 ROHs fulfilled the criteria for common ROHs (a consensus SNP set representing a minimal overlap of 75 SNPs in ≥5 samples, or pools being homozygous in either STSs only or LTSs only). None of the common ROH regions were associated with BC survival after correction for multiple testing. However, seven regions were associated at a suggestive level (P < 0.05). Another four regions with a P-value <0.05 were present in only four individuals but followed the same general pattern: the ROH regions were present exclusively in LTSs and absent in STSs, and were thus associated with longer survival of BC. As shown in Table 3, the LTSs with longer ROHs were mainly members of the Icelandic and Umeå subgroups, whereas among the STSs only one German woman carried ROH3 and ROH7. None of the overlapping ROHs shown in Table 3 encompassed the centromeric regions. The accompanying inspection of the data for copy number variants (CNVs) yielded 10 800 CNVs. On average, 16 CNVs were discovered per sample. The average CNV size was 107 kb. After a detailed scan, no CNVs were detected within the overlapping ROHs.

Table 3 List of ROHs associated with BC survival.


All common ROH regions were tested for differences between all STSs and LTSs of our sample with respect to the proportions of SNPs being homozygous. Table 3 shows the corresponding P-values of the one-tailed t-test for each ROH. Six ROHs showed highly significant differences. The right column of Table 3 shows that H₀ could be rejected for all common ROHs except ROH6. FDR_ROH values were significantly smaller than FDR_GWAS values, indicating that ROHs are not inferior to GWAS results. None of the SNPs from the SNP-by-SNP test (P < 1*10⁻⁴) overlapped with any of the common ROH regions.

Natural selection as a cause of ROHs

ROHs have been suggested to derive from three possible mechanisms: relatedness due to demographic events (e.g. bottleneck events, founder effects or population isolation), natural selection or recent parental relatedness (inbreeding)²¹. In order to assess the influence of selection on the most promising ROH regions, three estimates were used: Tajima’s D, iHS and Fay and Wu’s H^22,23,24. Every ROH of interest showed highly significant values for all three estimates (iHS > 2.0, Tajima’s D > 2.0 and Fay and Wu’s H ≪ −10; Table 3), indicating that each of the eleven most promising ROH regions might be the result of a selective sweep.

Inbreeding and association between homozygosity and BC survival

Next, we calculated the inbreeding coefficients for all samples using the SNP data, i.e. the relationship between haplotypes within an individual. Three estimates were used: one based on the variance of additive genetic values (F I), the second based on SNP homozygosity (F II) and the third based upon the correlation between uniting gametes (F III)²⁵. The means and standard deviations (SDs) for F II in STSs and LTSs were 0.004 (SD 0.016) and 0.006 (SD 0.012), respectively, and significantly different from each other (P = 0.03, by t test and by regression of F II on survival as a binary trait (0/1) in a generalized linear model using glm() in R). This suggests that LTSs were in general more inbred than STSs. Inbreeding coefficients F I and F III did not differ significantly between STSs and LTSs for the overall data set, but the mean for F III was still lower in STSs, at 0.005 (SD 0.015), than in LTSs, at 0.006 (SD 0.011), which supports the differences shown above. Breaking down the analysis of the overall genome to single chromosomes revealed that the primary source of differences in inbreeding was chromosomes 9 and 15, for which we detected significantly higher values for all three inbreeding coefficients in LTSs at P = 0.01 (data not shown).

Testing each population subgroup for any differences of the inbreeding coefficients between STSs and LTSs did not show any significant results.

To illustrate the relationship between inbreeding and ROHs we assessed correlations between different consanguinity measures as shown in Fig. 4. Due to extreme values in the total number of ROHs one outlier of the German cases was excluded. The total length of individual ROHs was highly correlated with the total number of ROHs per individual (r = 0.79, P < 0.0001). A similar correlation was estimated between the total number of ROHs per individual and the individual inbreeding coefficient F II (r = 0.66, P < 0.0001). The highest correlation was detected between the total length of ROHs per individual and the individual inbreeding coefficient (r = 0.81, P < 0.0001). The results show that the number of ROHs and their corresponding length is associated with the level of inbreeding of each individual.

Finally, we checked for an association between homozygosity represented by the genomic inbreeding coefficient F_ROH and survival of BC. The overall means and SDs for F_ROH in STSs and LTSs were 0.0112 (SD 0.015) and 0.0128 (SD 0.010). The true difference in means was greater than zero at P = 0.05. For the subpopulations no significant differences were observed; only the Icelandic group showed a trend, with a mean of 0.010 (SD 0.008) for STSs and 0.012 (SD 0.011) for LTSs (P = 0.07). On a chromosome-wise level, inbreeding coefficients for chromosome 15 were also significantly higher in LTSs, at 0.045 (SD 0.06), than in STSs, at 0.029 (SD 0.03) (P = 0.04). For chromosome 9 the trend was similar, with a mean F_ROH of 0.025 (SD 0.03) for STSs and 0.029 (SD 0.05) for LTSs (P = 0.24).

Discussion

To our knowledge the current work is the first analysis of the influence of genomic homozygosity on the survival of BC patients. Homozygosity can be caused by demographic events, consanguinity/inbreeding or selective pressure. In our study, most of the ROHs were relatively short excluding consanguinity as the cause of inbreeding, although inbreeding coefficients point to a certain level of relatedness. On the other hand, all of the ROHs of interest showed highly significant evidence for natural selection (iHS, Tajima’s D, Fay-Wu’s H)²³. Thus, the influence of selective pressure on the ROH length cannot be excluded either.

We show some evidence that survival of BC may be associated with increased homozygosity and an increased level of inbreeding. Our stringent quality control prior to the analysis provided the required certainty of no bias due to population stratification for the analysis on a SNP-by-SNP basis. No significant differences in the proportion of homozygous individuals among STSs and LTSs were observed in the SNP-by-SNP analysis.

Further downstream analysis indicated significant differences among the populations in terms of the number of ROHs per person and the total and average length of ROHs per person. These differences are well known and have been used as a resource for studying human genetic diversity and evolutionary history²¹. The origin of the different populations had a significant impact on the differences of the number of ROHs per person and the total and average length of ROHs per person. However, the difference in the number of ROHs per person between STSs and LTSs remained significant (P = 0.049) by using a generalized linear model with population identifier as a covariate and it was confirmed by a permutation test.

As a consequence of the significant differences, the total number of ROHs, the total length of ROHs and the mean ROH size per person were analyzed separately for each subpopulation. Although the overall analysis showed an increased number of ROHs among LTSs, the stratified analysis did not show any significant differences. A possible reason might be the relatively small number of individuals per subgroup. However, the patterns followed the same trend in the Icelandic and Malmö subgroups.

Most importantly, several of the ROHs were significantly more homozygous among LTSs than among STSs and the FDR was also significantly lower. Some of the common ROHs identified in our analysis also overlap with long contiguous stretches of homozygosity from another study but are not due to chromosomal abnormalities or common copy number variants²⁶. Intriguingly, several regions identified as suggestive ROHs harbor genes that are associated with progression and metastasis in BC, such as the GPATCH2 gene on 1q41²⁷. This region (ROH3, Table 3) was homozygous in eight LTSs but only in one STS.

Another important region with influence on BC survival was identified on chromosome 15 (ROH2). Within this region the GRINL1A complex transcription unit (CTU) represents a naturally occurring read through transcription between the neighboring genes MYZAP (GCOM1) and POLR2M (GRINL1A)²⁸. Interestingly, GCOM1 has been identified as an estrogen receptor β (ERβ) target gene²⁹.

The second homozygous region on chromosome 15 (ROH4) hosts two genes of the gamma-aminobutyric acid A receptor family (GABRB3 and GABRB5), which are related to chemokinesis and chemotaxis in MDA-MB-468 human breast carcinoma cells³⁰.

For several other homozygous regions such as ROH5, ROH6, ROH7, ROH9 and ROH10, genes have been identified with an association for BC or BC progression. These genes may modify disease risk or tumor progression, or they may work as markers of protection, transcription co-activators, or oxidative stress-modifying genes^31,32,33,34,35,36.

One of the most striking results of our investigation was the higher degree of homozygosity among LTSs of BC, which is represented by an increased measure of the inbreeding coefficient. These results are in good agreement with the detection of more LTSs within individuals of higher number of ROHs or increased length of homozygous stretches. Further analysis of common ROHs did not result in genome-wide significant differences in survival, but all the regions reflected the same pattern of showing more or solely LTSs being homozygous for specific regions. Most of these regions could even be linked to genes related to progression or treatment in BC. Thus, there seems to be evidence for an association between homozygosity and survival of BC.

The remaining question is whether increased homozygosity in certain regions of the genome supports longer survival of BC or in a reverse way whether increased homozygosity has originated from the fact that patients being homozygous for certain loci respond better to treatment and therefore have better survival.

A possible explanation for the results of increased homozygosity among LTSs may be a relative preference of regions carrying no mutation at all compared with those that carry deleterious mutations in a homozygous or heterozygous status. As such, the regions of homozygosity may reflect a certain degree of genomic resistance against the challenges of chemotherapeutic treatment as compared with heterozygous genotypes. An example of a similar pattern is provided by the CHEK2 locus, where CHEK2*1100delC heterozygosity was associated with a 1.4-fold risk of early death in BC patients compared to noncarriers³⁷. It is one of the most recent and well-documented examples of a genetic factor influencing long-term prognosis of women with BC. An earlier publication also showed that heterozygote carriers of the NBN founder mutation are at higher risk of developing BC and of dying earlier³⁸. Overall, there seems to be some variation of genotypes within patients that will help them to survive the applied treatment better than others. Such genotypes, either alone, in interaction with each other or in combination with specific drugs or treatments, may result in better treatment outcome, decreased side effects or improved survival. Therefore, the discovery and understanding of such genotypes may be vital for the improvement of cancer therapy.

Material and Methods

The GWAS on BC survival was a population based case-only study, in which the BC patients were divided into two groups based on their survival time¹⁹. A group of 369 women with short-time survival (STS, less than 6 years after BC diagnosis) was compared with a group of 369 women with long-time survival (LTS, ≥11 years after BC diagnosis). The cases with STS and LTS were selected from four cohorts and matched for age (<40, 40–49, 50–59 and ≥60 years), period of diagnosis (1985–1989, 1990–1994 and 1995 onward) and the corresponding cohort: 1) 96 STSs and 96 LTSs from the Västerbotten intervention project, the mammary screening project and from the Department of Oncology, Norrlands University Hospital, Umeå, Sweden³⁹; 2) 44 STSs and 44 LTSs from Malmö Diet and Cancer Study, Malmö, Sweden⁴⁰; 3) 82 STSs and 14 LTSs from the Städtisches Klinikum Karlsruhe and Deutsches Krebsforschungszentrum Breast Cancer Study (SKKDKFZS), consisting of women between 21 and 93 years of age at diagnosis with pathologically confirmed BC recruited at the Städtisches Klinikum Karlsruhe, Karlsruhe, Germany from 1993 to 2005⁴¹, and another 68 LTSs from the Umeå cohort; 4) 147 STSs and 147 LTSs from the Icelandic Cancer Society and University of Iceland Biobank⁴². The STSs and LTSs were identified from the cohorts by record linkage to the regional cancer registries. Follow-up was performed until 2008 and the data were available for every patient. Disease stage of the patients was categorized from 0 to IV. STSs tended to have tumors of higher stage than LTSs¹⁹.

Ethics statement

The studies were coordinated at the German Cancer Research Center (DKFZ) with samples and information obtained with full informed consent and national ethical review board approval [Dnr 07-14IM] in accordance with the Declaration of Helsinki.

Genotyping and quality control

For all samples ~300 000 tagging single nucleotide polymorphisms (SNPs) were genotyped using the Illumina HumanCytoSNP-12v1. Quality control procedures were based on standard protocols using PLINK software (v1.07) and R, v3.0.2 (R Foundation for Statistical Computing, Vienna, Austria)^19,20,43.

To exclude individuals with non-Western European ancestry, data of the STSs and LTSs were merged with data obtained from the International HapMap Project⁴⁴. Principal component analysis was used to identify population outliers. The remaining individuals matched genetically well to the HapMap samples with northern and western European ancestry (CEU). After stringent quality control the final data set consisted of 340 STSs and 335 LTSs with genotyping information for 232 478 autosomal SNPs.

Genome-wide assessment of homozygosity at individual SNPs and BC survival

Motivated by the observation of high frequencies of germline homozygosity at specific markers in cancer cases by Assie et al. an initial test as described by Spain et al. was performed for any association between homozygosity (whether for the major or minor allele) and BC survival on a SNP-by-SNP basis in our entire sample series based on a chi²-test with the number of homozygotes and heterozygotes at each SNP in STSs and LTSs^16,17. To control the problem of multiple testing the false discovery rate (FDR) was calculated and controlled at an arbitrary level q*⁴⁵.
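The per-SNP test described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a Pearson chi²-test without continuity correction on a 2×2 table of homozygote and heterozygote counts, and it assumes the Benjamini-Hochberg procedure for the FDR step, which the text does not name. All function names are hypothetical.

```python
import math


def chi2_sf_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def homozygosity_chi2(hom_sts, het_sts, hom_lts, het_lts):
    """Pearson chi-square (no continuity correction) for homozygote and
    heterozygote counts in STSs versus LTSs at a single SNP.
    Assumes every row and column total is non-zero."""
    table = [[hom_sts, het_sts], [hom_lts, het_lts]]
    total = hom_sts + het_sts + hom_lts + het_lts
    row_sums = [hom_sts + het_sts, hom_lts + het_lts]
    col_sums = [hom_sts + hom_lts, het_sts + het_lts]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, chi2_sf_1df(chi2)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values
```

As a consistency check, `chi2_sf_1df(22.41)` evaluates to roughly 2.2*10⁻⁶, which agrees with the P-value reported for rs9754606 in the Results.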

Identification of runs of homozygosity

We defined ROHs following recommendations in Howrigan et al.⁴⁶. ROHs were detected using PLINK (v1.07) software. The ROH tool moves a sliding window of SNPs across the entire individual genome. To guard against genotyping errors and other sources of artificial heterozygosity, such as paralogous sequences within a stretch of truly homozygous SNPs, and hence to balance the number and size of ROHs, no heterozygous SNPs were permitted in any window. We set the remaining options to default values (including at most three missing calls per window, thereby ensuring >90% positive-predictive value of each ROH), except that we varied the parameters for the “homozyg-snp” option according to our heuristic preferences for defining ROHs as detailed below. Subsequent statistical analyses were performed using packages available in R (version 3.0.2; R Foundation for Statistical Computing, Vienna, Austria). Comparison of the distribution of categorical variables was performed using the chi²-test with P-values based on Monte Carlo simulations as implemented in the R statistics package. To compare the difference in the average number of ROHs between STSs and LTSs, we used the Student t-test. To account for any confounding due to the different population background of the samples, a generalized linear model was applied with the population identifier as a covariate. A permutation test based on the permutation of the regressor residuals in the R package “glmperm” was used to confirm the results^47,48.
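The idea of a run of homozygosity with no heterozygous calls allowed can be illustrated with a simplified sketch. It is not PLINK's sliding-window algorithm: it simply scans one chromosome for maximal stretches of consecutive homozygous genotypes, and it ignores missing calls rather than capping them per window. The genotype coding (0, 1, 2 minor-allele copies, `None` for missing) and the function name are assumptions made for illustration.

```python
def find_homozygous_runs(genotypes, positions_bp, min_snps=75, min_length_bp=1_000_000):
    """Return (start_bp, end_bp, n_snps) for each run of consecutive homozygous SNPs.

    genotypes: minor-allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous,
               None = missing), ordered along one chromosome.
    positions_bp: base-pair position of each SNP, in the same order.
    """
    runs = []
    run_positions = []  # positions of the homozygous SNPs in the current run

    def close_run():
        # Keep the run only if it meets both the SNP-count and length criteria.
        if len(run_positions) >= min_snps:
            length_bp = run_positions[-1] - run_positions[0]
            if length_bp >= min_length_bp:
                runs.append((run_positions[0], run_positions[-1], len(run_positions)))
        run_positions.clear()

    for genotype, position in zip(genotypes, positions_bp):
        if genotype is None:
            continue  # missing call: neither extends nor breaks the run here
        if genotype == 1:
            close_run()  # a single heterozygous call ends the run
        else:
            run_positions.append(position)
    close_run()
    return runs
```

The defaults mirror the 75-SNP and 1000 kb thresholds used in this study; smaller values are convenient for toy data.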

Criteria for the detection of runs of homozygosity

The initial search for ROHs along each individual’s genome was performed using PLINK with a specified length of 75 consecutive SNPs. The choice of 75 SNPs is based on the likelihood of observing a run of consecutive homozygous calls by chance, which can be calculated as follows¹⁴: in our BC data mean heterozygosity was calculated to be around 35%. Thus, given 232 478 SNPs and 675 individuals, a minimum length of 51 SNPs would be required to produce <5% randomly generated ROHs across all subjects ((1–0.35)⁵¹ × 232 478 × 675 = 0.04; ~4%). A consequence of linkage disequilibrium (LD) is that SNP genotypes are not always independent, which inflates the probability of chance occurrences of biologically meaningless ROHs. To account for this, we used the pairwise LD SNP pruning function of PLINK with the default threshold of r² > 0.8 for declaring that one SNP tags another. Restricting the search of tags to within 250 kb showed 164 484 separable tag groups, representing a 30% reduction of information compared with the original number of SNPs. Thus, ROHs of length 75 were used to approximate the degrees of freedom of 51 independent SNP calls.
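The arithmetic behind the 51-SNP and 75-SNP thresholds can be reproduced with a short sketch. It uses only the numbers given above (mean heterozygosity 0.35, 232 478 SNPs, 675 individuals, 164 484 tag groups); the function names are hypothetical, and the final LD scaling step is one plausible reading of how 51 independent calls translate into roughly 75 genotyped SNPs.

```python
import math


def expected_chance_rohs(n_consecutive, mean_het=0.35, n_snps=232_478, n_individuals=675):
    """Expected number of runs of n_consecutive homozygous calls arising by chance,
    assuming independent SNPs."""
    return (1.0 - mean_het) ** n_consecutive * n_snps * n_individuals


def min_independent_snps(alpha=0.05, mean_het=0.35, n_snps=232_478, n_individuals=675):
    """Smallest run length whose expected chance count falls below alpha."""
    n = 1
    while expected_chance_rohs(n, mean_het, n_snps, n_individuals) >= alpha:
        n += 1
    return n


n_independent = min_independent_snps()          # 51 with the values above
ld_retained = 164_484 / 232_478                 # about 0.71 of SNPs remain after tagging
n_genotyped = math.ceil(n_independent / ld_retained)  # 73, close to the 75 used
```

With these inputs the expected chance count at 51 SNPs is about 0.045, which the text reports as 0.04, and at 50 SNPs it is about 0.069, so 51 is the first length below the 5% level.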

In the next step PLINK software and packages in R were used to identify a list of ‘common’ ROHs with a minimum of 75 consecutive SNPs for at least two individuals and with each ROH having identical start and end location across the individuals in whom that ROH was observed. The “homozyg-group” option of the PLINK package produced a file of the ROH regions separated into pools containing the number of STSs and LTSs carrying the same ROH. Corresponding information from the PLINK output file was used to assist with the interpretation of the results. We defined pools with more than five individuals and at least 75 identical SNPs being homozygous among the individuals in the same genomic region as common ROHs. In addition, pools being homozygous in either STSs only or LTSs only were included in the list of common ROHs. Copy number variants were detected for each individual using R with no restriction on the number of SNPs or the length of the CNVs and compared with common ROHs.
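The selection of common ROHs from a pool can be sketched as below. This is an assumption-laden illustration rather than PLINK's “homozyg-group” logic: ROHs are represented as inclusive SNP index ranges, the consensus is taken as the intersection of all ranges in the pool, and the 75-SNP minimum is assumed to apply to single-group pools as well. The text gives the size criterion once as ≥5 samples and once as more than five individuals, so `min_individuals` is left as a parameter.

```python
def consensus_region(pool):
    """Intersection of all ROHs in a pool.

    pool maps an individual ID to (first_snp_index, last_snp_index), inclusive.
    Returns (start, end) or None if the ROHs share no SNP.
    """
    start = max(first for first, _ in pool.values())
    end = min(last for _, last in pool.values())
    return (start, end) if end >= start else None


def is_common_roh(pool, survival_group, min_individuals=5, min_snps=75):
    """Apply the common-ROH criteria to one pool.

    survival_group maps an individual ID to 'STS' or 'LTS'.
    """
    region = consensus_region(pool)
    if region is None:
        return False
    n_shared_snps = region[1] - region[0] + 1
    if n_shared_snps < min_snps:
        return False
    groups_present = {survival_group[individual] for individual in pool}
    # Either enough carriers overall, or carriers drawn from one survival group only.
    return len(pool) >= min_individuals or len(groups_present) == 1
```

For example, three LTSs whose ROHs share 81 consecutive SNPs would qualify under the single-group rule, whereas a mixed pool of three individuals would not.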

An additional test looked for differences in the average proportion of homozygous genotypes between STSs and LTSs. For common ROH regions the proportion of homozygous genotypes was calculated for all STSs and LTSs separately and the significance of the difference was tested by a one-tailed t-test. Likewise, for each common ROH region the p-values of the above stated SNP-by-SNP test were compared with those obtained from the prior standard single-SNP GWAS. According to the concept of non-inferiority trials, the false discovery rate (FDR) was computed for both sets of p-values and tested for equivalence by a paired t test⁴⁹. The null hypothesis states that the FDR of the ROHs will be equal to the FDR from GWAS for the same region:

H₀: FDR_ROH = FDR_GWAS

As an alternative hypothesis the FDR was smaller for ROH:

H₁: FDR_ROH < FDR_GWAS

This would imply that ROHs are superior to GWAS.

Testing of natural selection as a cause of ROHs

For common ROH regions we used three metrics to investigate the selective pressure on each ROH. The integrated haplotype score (iHS) is based on linkage disequilibrium (LD) surrounding a positively selected allele compared with background, providing evidence of recent positive selection at a locus²³. An iHS score ≥2.0 reflects the fact that haplotypes on the ancestral background are longer compared with those on the derived allelic background. Episodes of selection tend to skew SNP frequencies in different directions. We estimated values for Tajima’s D and Fay and Wu’s H based on the frequencies of SNPs segregating in the region of interest^50,51. iHS, Tajima’s D and Fay and Wu’s H metrics were obtained from Haplotter Software (University of Chicago, Chicago, IL, USA; http://haplotter.uchicago.edu/selection/)²³.

Testing the effect of inbreeding on survival

To test whether inbreeding influenced the survival of BC patients, the three inbreeding measures F I, F II and F III were estimated for each individual using the package Genome-wide Complex Trait Analysis (GCTA) and then tested for correlation with survival of BC²⁵. As the covariate age at diagnosis did not show significant influence in prior tests, it was omitted from the analysis. In addition, a genomic measure of individual homozygosity (F_ROH) was calculated as proposed by McQuillan et al.⁵², in which L_ROH is the sum of ROHs per individual above a certain criterion length (i.e. 1000 kb as in the publication) and L_AUTO is the total SNP-mappable autosomal genome length (2.67 × 10⁹ bp): F_ROH = ∑ L_ROH/L_AUTO. For this calculation centromeres were excluded, because they are characterized as long genomic stretches devoid of SNPs and tend to inflate estimates of autozygosity⁵².
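The F_ROH formula above translates directly into code. The sketch below assumes ROH lengths are supplied in kb for one individual and uses the 1000 kb criterion and the autosomal length of 2.67 × 10⁹ bp stated in the text; the function and constant names are hypothetical, and centromere exclusion is assumed to have been handled upstream.

```python
L_AUTO_BP = 2.67e9  # SNP-mappable autosomal genome length given in the text


def f_roh(roh_lengths_kb, min_length_kb=1000.0, autosome_length_bp=L_AUTO_BP):
    """Genomic inbreeding coefficient: summed length of ROHs at or above the
    criterion length, divided by the autosomal genome length."""
    total_roh_bp = sum(
        length_kb * 1000.0 for length_kb in roh_lengths_kb if length_kb >= min_length_kb
    )
    return total_roh_bp / autosome_length_bp


# An individual with the study-wide averages (11.32 ROHs of 2598.59 kb each):
average_individual = f_roh([2598.59] * 11) + 0.32 * 2598.59 * 1000.0 / L_AUTO_BP
```

The value for this average individual is about 0.011, which is in line with the mean F_ROH of 0.0112 to 0.0128 reported in the Results.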

Additional Information

How to cite this article: Thomsen, H. et al. Inbreeding and homozygosity in breast cancer survival. Sci. Rep. 5, 16467; doi: 10.1038/srep16467 (2015).

References

1. Ferlay, J. et al. Cancer incidence and mortality worldwide: Sources, methods and major patterns in GLOBOCAN 2012. International journal of cancer. Journal international du cancer 136, E359–86, 10.1002/ijc.29210 (2014).
2. Hemminki, K., Ji, J., Forsti, A., Sundquist, J. & Lenner, P. Survival in breast cancer is familial. Breast cancer research and treatment 110, 177–182, 10.1007/s10549-007-9692-7 (2008).
3. Enciso-Mora, V., Hosking, F. J. & Houlston, R. S. Risk of breast and prostate cancer is not associated with increased homozygosity in outbred populations. European journal of human genetics: EJHG 18, 909–914, 10.1038/ejhg.2010.53 (2010).
4. Shu, X. O. et al. Novel genetic markers of breast cancer survival identified by a genome-wide association study. Cancer research 72, 1182–1189, 10.1158/0008-5472.CAN-11-2561 (2012).
5. Azzato, E. M. et al. Association between a germline OCA2 polymorphism at chromosome 15q13.1 and estrogen receptor-negative breast cancer survival. Journal of the National Cancer Institute 102, 650–662, 10.1093/jnci/djq057 (2010).
6. Azzato, E. M. et al. A genome-wide association study of prognosis in breast cancer. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 19, 1140–1143, 10.1158/1055-9965.EPI-10-0085 (2010).
7. Feldman, J. G., Lee, S. L. & Seligman, B. Occurrence of acute leukemia in females in a genetically isolated population. Cancer 38, 2548–2550 (1976).
8. Lander, E. S. & Botstein, D. Homozygosity mapping: a way to map human recessive traits with the DNA of inbred children. Science 236, 1567–1570 (1987).
9. Curtis, D. Approaches to the detection of recessive effects using next generation sequencing data from outbred populations. Advances and applications in bioinformatics and chemistry: AABC 6, 29–35, 10.2147/AABC.S44332 (2013).
10. Mok, K. et al. Homozygosity analysis in amyotrophic lateral sclerosis. European journal of human genetics: EJHG 21, 1429–1435, 10.1038/ejhg.2013.59 (2013).
11. Ghani, M. et al. Evidence of Recessive Alzheimer Disease Loci in a Caribbean Hispanic Data Set: Genome-wide Survey of Runs of Homozygosity. JAMA neurology 70, 1261–7, 10.1001/jamaneurol.2013.3545 (2013).
12. Yang, T. L. et al. Runs of homozygosity identify a recessive locus 12q21.31 for human adult height. The Journal of clinical endocrinology and metabolism 95, 3777–3782, 10.1210/jc.2009-1715 (2010).
13. Nalls, M. A. et al. Extended tracts of homozygosity identify novel candidate genes associated with late-onset Alzheimer’s disease. Neurogenetics 10, 183–190, 10.1007/s10048-009-0182-4 (2009).
14. Lencz, T. et al. Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proceedings of the National Academy of Sciences of the United States of America 104, 19942–19947, 10.1073/pnas.0710021104 (2007).
15. Gamsiz, E. D. et al. Intellectual disability is associated with increased runs of homozygosity in simplex autism. American journal of human genetics 93, 103–109, 10.1016/j.ajhg.2013.06.004 (2013).
16. Spain, S. L. et al. Colorectal cancer risk is not associated with increased levels of homozygosity in a population from the United Kingdom. Cancer research 69, 7422–7429, 10.1158/0008-5472.CAN-09-0659 (2009).
17. Assie, G., LaFramboise, T., Platzer, P. & Eng, C. Frequency of germline genomic homozygosity associated with cancer cases. JAMA: the journal of the American Medical Association 299, 1437–1445, 10.1001/jama.299.12.1437 (2008).
18. Orloff, M. S., Zhang, L., Bebek, G. & Eng, C. Integrative genomic analysis reveals extended germline homozygosity with lung cancer risk in the PLCO cohort. PloS one 7, e31975, 10.1371/journal.pone.0031975 (2012).
19. Woltmann, A. et al. Systematic pathway enrichment analysis of a genome-wide association study on breast cancer survival reveals an influence of genes involved in cell adhesion and calcium signaling on the patients’ clinical outcome. PloS one 9, e98229, 10.1371/journal.pone.0098229 (2014).
20. Turner, S. et al. Quality control procedures for genome-wide association studies. Current protocols in human genetics/editorial board, Jonathan L. Haines … [et al.] Chapter 1, Unit1 19, 10.1002/0471142905.hg0119s68 (2011).
21. Pemberton, T. J. et al. Genomic patterns of homozygosity in worldwide human populations. American journal of human genetics 91, 275–292, 10.1016/j.ajhg.2012.06.014 (2012).
22. Coop, G. et al. The role of geography in human adaptation. PLoS Genet 5, e1000500, 10.1371/journal.pgen.1000500 (2009).
23. Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS biology 4, e72, 10.1371/journal.pbio.0040072 (2006).
24. Oleksyk, T. K., Smith, M. W. & O’Brien, S. J. Genome-wide scans for footprints of natural selection. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 365, 185–205, 10.1098/rstb.2009.0219 (2010).
25. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. American journal of human genetics 88, 76–82, 10.1016/j.ajhg.2010.11.011 (2011).
26. Li, L. H. et al. Long contiguous stretches of homozygosity in the human genome. Hum Mutat 27, 1115–1121, 10.1002/humu.20399 (2006).
27. Lin, M. L. et al. Involvement of G-patch domain containing 2 overexpression in breast carcinogenesis. Cancer science 100, 1443–1450, 10.1111/j.1349-7006.2009.01185.x (2009).
28. Roginski, R. S., Mohan Raj, B. K., Birditt, B. & Rowen, L. The human GRINL1A gene defines a complex transcription unit, an unusual form of gene organization in eukaryotes. Genomics 84, 265–276, 10.1016/j.ygeno.2004.04.004 (2004).
29. Le, T. P., Sun, M., Luo, X., Kraus, W. L. & Greene, G. L. Mapping ERbeta genomic binding sites reveals unique genomic features and identifies EBF1 as an ERbeta interactor. PloS one 8, e71355, 10.1371/journal.pone.0071355 (2013).
30. Drell, T. L. T. et al. Effects of neurotransmitters on the chemokinesis and chemotaxis of MDA-MB-468 human breast carcinoma cells. Breast cancer research and treatment 80, 63–70, 10.1023/A:1024491219366 (2003).
31. Talmadge, J. E. Follistatin as an inhibitor of experimental metastasis. Clinical cancer research: an official journal of the American Association for Cancer Research 14, 624–626, 10.1158/1078-0432.CCR-07-2216 (2008).
32. Martin, E. S. et al. The BCSC-1 locus at chromosome 11q23-q24 is a candidate tumor suppressor gene. Proceedings of the National Academy of Sciences of the United States of America 100, 11517–11522, 10.1073/pnas.1934602100 (2003).
33. Mrazek, F. et al. Functional variant ANXA11 R230C: true marker of protection and candidate disease modifier in sarcoidosis. Genes and immunity 12, 490–494, 10.1038/gene.2011.27 (2011).
34. Dragoumis, D. M., Tsiftsoglou, A. P. & Assimaki, A. S. Pulmonary sarcoidosis simulating metastatic breast cancer. Journal of cancer research and therapeutics 4, 134–136 (2008).
35. Huang, W. et al. The N-terminal phosphodegron targets TAZ/WWTR1 protein for SCFbeta-TrCP-dependent degradation in response to phosphatidylinositol 3-kinase inhibition. The Journal of biological chemistry 287, 26245–26253, 10.1074/jbc.M112.382036 (2012).
36. Hubackova, M. et al. Association of superoxide dismutases and NAD(P)H quinone oxidoreductases with prognosis of patients with breast carcinomas. International journal of cancer. Journal international du cancer 130, 338–348, 10.1002/ijc.26006 (2012).
37. Weischer, M. et al. CHEK2*1100delC heterozygosity in women with breast cancer associated with early death, breast cancer-specific death and increased risk of a second breast cancer. Journal of clinical oncology: official journal of the American Society of Clinical Oncology 30, 4308–4316, 10.1200/JCO.2012.42.7336 (2012).
38. Seemanova, E. et al. Cancer risk of heterozygotes with the NBN founder mutation. Journal of the National Cancer Institute 99, 1875–1880, 10.1093/jnci/djm251 (2007).
39. Kaaks, R. et al. Prospective study of IGF-I, IGF-binding proteins and breast cancer risk, in northern and southern Sweden. Cancer causes & control: CCC 13, 307–316 (2002).
40. Manjer, J. et al. The Malmo Diet and Cancer Study: representativity, cancer incidence and mortality in participants and non-participants. European journal of cancer prevention: the official journal of the European Cancer Prevention Organisation 10, 489–499 (2001).
41. Stevens, K. N. et al. 19p13.1 is a triple-negative-specific breast cancer susceptibility locus. Cancer research 72, 1795–1803, 10.1158/0008-5472.CAN-11-3364 (2012).
42. Tryggvadottir, L. et al. Population-based study of changing breast cancer risk in Icelandic BRCA2 mutation carriers, 1920-2000. Journal of the National Cancer Institute 98, 116–122, 10.1093/jnci/djj012 (2006).
43. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. American journal of human genetics 81, 559–575, 10.1086/519795 (2007).
44. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58, 10.1038/nature09298 (2010).
45. Weller, J. I., Song, J. Z., Heyen, D. W., Lewin, H. A. & Ron, M. A new approach to the problem of multiple comparisons in the genetic dissection of complex traits. Genetics 150, 1699–1706 (1998).
46. Howrigan, D. P., Simonson, M. A. & Keller, M. C. Detecting autozygosity through runs of homozygosity: a comparison of three autozygosity detection algorithms. BMC genomics 12, 460, 10.1186/1471-2164-12-460 (2011).
47. Potter, D. M. A permutation test for inference in logistic regression with small- and moderate-sized data sets. Statistics in medicine 24, 693–708, 10.1002/sim.1931 (2005).
48. Werft, W. & Benner, A. glmperm: A Permutation of Regressor Residuals Test for Inference in Generalized Linear Models. R J 2, 39–43 (2010).
49. D’Agostino, R. B., Sr., Massaro, J. M. & Sullivan, L. M. Non-inferiority trials: design concepts and issues - the encounters of academic consultants in statistics. Statistics in medicine 22, 169–186, 10.1002/sim.1425 (2003).
50. Tajima, F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595 (1989).
51. Fay, J. C. & Wu, C. I. Hitchhiking under positive Darwinian selection. Genetics 155, 1405–1413 (2000).
52. McQuillan, R. et al. Runs of homozygosity in European populations. American journal of human genetics 83, 359–372, 10.1016/j.ajhg.2008.08.007 (2008).

Acknowledgements

We thank Åsa Ågren (Department of Public Health and Clinical Medicine/Nutritional Research, Umeå university, Sweden), the Northern Sweden Breast Cancer group and the Icelandic Cancer Registry for data management. Joakim Dillner is acknowledged for helpful discussions and initiating the collaboration.

Author information

Authors and Affiliations

Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
Hauke Thomsen, Miguel Inacio da Silva Filho, Andrea Woltmann, Bowang Chen, Stefanie Huhn, Kari Hemminki & Asta Försti
Department of Radiation Sciences & Oncology, Umeå University, Umeå, Sweden
Robert Johansson, Roger Henriksson & Per Lenner
Cancer Research Laboratory, Faculty of Medicine, University of Iceland, Reykjavik, Iceland
Jorunn E. Eyfjörd
Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
Ute Hamann
The Malmö Diet and Cancer Study, Lund University, Malmö, Sweden
Jonas Manjer
Department of Plastic Surgery, Skåne University Hospital, Malmö, Lund University, Malmö, Sweden
Jonas Manjer
Department of Public Health and Clinical Medicine/Nutritional Research, Umeå University, Umeå, Sweden
Kerstin Enquist-Olsson
Cancer Center Stockholm Gotland, Stockholm, Sweden
Roger Henriksson
Institute of Human Genetics, Department of Genomics, University of Bonn, Bonn, Germany
Stefan Herms & Per Hoffmann
Division of Medical Genetics and Department of Biomedicine, University of Basel, Basel, Switzerland
Stefan Herms & Per Hoffmann
Center for Primary Health Care Research, Clinical Research Center, Lund University, Malmö, Sweden
Kari Hemminki & Asta Försti

Contributions

R.J., J.E.E., U.H., J.M., K.E.-O., R.H. and P.L. provided the patient data. S.HE. and P.H. performed genotyping. H.T., M.F., A.W. and B.C. performed the GWAS. H.T. performed the specific analyses for the study and drafted the manuscript. H.T., S.HU., K.H., P.L. and A.F. interpreted the results and critically reviewed the manuscript. All authors read and approved the final version of the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Thomsen, H., Filho, M., Woltmann, A. et al. Inbreeding and homozygosity in breast cancer survival. Sci Rep 5, 16467 (2015). https://doi.org/10.1038/srep16467

Received: 15 June 2015
Accepted: 14 October 2015
Published: 12 November 2015
DOI: https://doi.org/10.1038/srep16467
