Main

In recent years, many large-scale consortia have been initiated to identify susceptibility genes of complex disorders.1–6 There are several perceived advantages to the consortium approach for studying genetic associations. Consortia often lead to larger sample sizes than do meta-analyses of published studies and thus to sufficient statistical power to demonstrate significant effects of weak susceptibility genes. They provide access to unpublished “negative” data, which reduces problems of publication or selective reporting bias.7 Moreover, they can facilitate the harmonization of criteria for the selection of cases and controls and the standardization of genotyping technology, which can reduce between-study heterogeneity in effect estimates by eliminating methodological differences between studies. Although these are straightforward arguments supporting the building of large consortia,6,8 it remains to be investigated whether the results of analyses from consortia yield the claimed advantages over meta-analyses of available published data and whether there are differences in the results obtained by these two approaches.

To empirically investigate differences between the analyses of consortia and meta-analyses of published data, we performed meta-analyses of published data for 16 genetic polymorphisms that were investigated by the Breast Cancer Association Consortium (BCAC).9,10 The consortium reported genetic association analyses on 16 genetic polymorphisms, including individual-level published and unpublished data contributed by up to 20 research groups. We compared the amount of collected data and the results between the two approaches and examined the presence of potential biases in the meta-analyses of published studies. We also applied the Venice criteria, a set of criteria recently developed by a consensus of the Human Genome Epidemiology Network for grading the epidemiological strength of cumulative evidence on genetic associations.11 We wanted to determine whether the inferences derived with the two approaches were similar or divergent.

METHODS

Data sources

PubMed, Web of Science, and Human Genome Epidemiology Network literature databases were searched for breast cancer case-control studies on 16 polymorphisms investigated by the BCAC and published in the Journal of the National Cancer Institute and Nature Genetics (Table 1).9,10 For 12 polymorphisms, the consortium collected, cleaned, and synthesized the available data from the participating teams and found no evidence for any significant association.9 For another four polymorphisms, which had a P value of <0.10 upon the collection and synthesis of available data from consortium-participating teams (CASP8 D302H, IGFBP3 C(-202)A, SOD2 V16A, TGFB1 L10P), samples were further genotyped in other studies of the consortium.10 For these four polymorphisms, we considered the results from the more comprehensive latter article.

Table 1 Meta-analyses of published data for polymorphisms investigated by the Breast Cancer Association Consortium

The search strategy was based on the keywords breast cancer combined with the name of the gene without specifying the polymorphism. In addition, we searched for other articles on breast cancer written by the researchers of the consortium as listed in the appendix of their article.9 Reference lists of all retrieved publications were screened for additional studies. Finally, previously published meta-analyses on these associations (including meta-analyses on AURKA F31I,12 BRCA2 N372H,13 PGR V660L,14,15 XRCC1 R399Q,16 and XRCC3 T241M13,17,18) were also scrutinized to ensure that we had missed no studies. These meta-analyses were updated and reanalyzed with data from the primary studies. Searches were updated until July 2007.

Study selection

Genetic association studies were selected if they compared female breast cancer patients with controls from the general population in a case-control design and were reported in English. Studies were excluded when (1) the data were reused in a larger study on the same polymorphism; (2) genotype distributions in controls showed nominally statistically significant (P < 0.05) violations of the Hardy-Weinberg equilibrium (HWE); or (3) the article had incomplete reporting of genotype frequencies. HWE was recalculated from the original data using the χ2 test. HWE testing is known to have low power, even with considerable sample sizes;19 therefore, using lower P-value thresholds would miss many considerable deviations from equilibrium. Two investigators (A.M.G.-Z.L. and S.L.-L.) performed the study selection and information extraction independently in duplicate, and discrepancies were discussed with a third researcher (A.C.J.W.J.).
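The HWE recalculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `hwe_chi2` and the genotype counts in the usage lines are hypothetical, and the P value uses the closed-form survival function of the χ2 distribution with 1 degree of freedom.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Arguments are the observed genotype counts in controls
    (two homozygote groups and the heterozygote group).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("locus is not polymorphic in this sample")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts, for illustration only.
chi2, p_value = hwe_chi2(40, 20, 40)
violates_hwe = p_value < 0.05  # exclusion threshold stated in the text
```

A dataset would be flagged for exclusion when `violates_hwe` is true; the monomorphic case is rejected explicitly because the expected counts would be zero.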

Data synthesis

Summary odds ratios (ORs) and 95% confidence intervals (CIs) were calculated with random-effects models. Summary ORs under the random-effects models were estimated for three genetic models (homozygotes, heterozygotes, and per-allele comparisons), using the method of DerSimonian and Laird.20 In the absence of between-study heterogeneity, the random and fixed-effects models coincide, whereas in the presence of heterogeneity, the basic assumption of fixed effects is violated. For this reason, we gave precedence to random-effects estimates. The statistical significance of the between-study heterogeneity was evaluated with the χ2-based Q statistic, which was considered significant at P < 0.10. The degree of heterogeneity between the study results was assessed by the I2 statistic,21 which measures the amount of inconsistency between studies that is beyond chance, taking values from 0 to 100%. Values of <25% suggest little heterogeneity, values of 25–50% suggest moderate heterogeneity, and values of >50% suggest large heterogeneity. In the presence of few studies, both Q and I2 have large uncertainty and thus should be evaluated with caution.22
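As a rough sketch of the synthesis step, the code below pools study-level log odds ratios with the DerSimonian and Laird moment estimator and reports Q and I2. It is illustrative only: the function names, the returned dictionary keys, the Woolf variance for a 2×2 table, and the normal-approximation 95% CI (z = 1.96) are assumptions made here rather than details taken from the article or from the software the authors used.

```python
import math


def log_or_and_variance(a, b, c, d):
    """Log odds ratio and its Woolf variance from a 2x2 table.

    a, b: exposed and unexposed counts in cases;
    c, d: exposed and unexposed counts in controls (all must be > 0).
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(log_ors, variances):
    """Random-effects summary OR with Cochran's Q and I2 (in percent)."""
    k = len(log_ors)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    c = sw - sum(wi * wi for wi in w) / sw
    # Moment estimate of the between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, which is the sense in which the two models coincide in the absence of heterogeneity.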

Comparison of meta-analyses of published data versus consortium analyses

The two approaches were compared with regard to the amount of available data (number of cases and controls combined) and the results obtained. Meta-analyses were performed according to the three genetic models. All comparisons of the published versus the consortium analyses were based on comparison of the same genetic models.

We estimated Spearman correlation coefficients of the ORs and recorded whether the two ORs differed by more than 1.10-fold in any meta-analysis. Similarly, we estimated Spearman correlation coefficients for the I2 estimates of the two approaches. We also report on whether there was discrepancy in the presence of nominal statistical significance (P < 0.05) with the two approaches for any meta-analyses. Such discrepancies should be interpreted cautiously, because a difference in the level of statistical significance does not mean that the difference is beyond chance. This has been pointed out repeatedly in meta-epidemiological research.23,24 Moreover, we did not formally estimate the statistical significance of the difference in the effect sizes with the two approaches, because a large amount of the data are shared by the published and the consortium datasets, thereby overtly violating the assumption of independence.
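The comparison described above can be sketched in a few lines. This is an illustration under assumptions: the helper names are hypothetical, a "1.10-fold" difference is read here as the ratio of the larger to the smaller OR exceeding 1.10, and the Spearman coefficient is computed as the Pearson correlation of average ranks (no P value is computed).

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation coefficient of two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def differs_by_fold(or_a, or_b, fold=1.10):
    """True when the two ORs differ by more than `fold` in either direction."""
    return max(or_a, or_b) / min(or_a, or_b) > fold
```

Working on the ratio rather than the absolute difference treats ORs above and below 1 symmetrically, which seems the natural reading of a fold difference.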

Bias diagnostics

We also performed a number of tests to assess whether there is demonstrable potential for bias in the meta-analyses of published data that yielded nominally statistically significant results (P < 0.05) by random-effects calculations and that also did not have large between-study heterogeneity. First, we evaluated whether the results lost their nominal significance when the first-published study was excluded.25 Second, we evaluated whether small studies yielded larger estimates of genetic effects than larger studies. When this is the case, it could signal publication bias, other selective reporting biases, or other biases, or it could be due to genuine heterogeneity between small and large studies. We used a modified regression test as proposed by Harbord et al.26 that has an appropriate Type I error at P = 0.10. Last, we applied a test that determines whether the number of studies with nominally statistically significant results exceeds the expected number of such “positive” studies under plausible assumptions for the effect size in each meta-analysis.27 Such an excess indicates bias in favor of publishing studies that show formally statistically significant results. The test was considered significant at P < 0.10. Data were analyzed using Review Manager, version 4.2 (Cochrane Collaboration, Oxford, UK) and Intercooled STATA 8.2 (College Station, TX).
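The first of these diagnostics, exclusion of the first-published study, can be sketched as below. The sketch is hypothetical throughout: the record layout (`year`, `log_or`, `var`), the function names, and the example values are assumptions for illustration, and the pooling step is a compact DerSimonian and Laird estimator with a normal-approximation 95% CI rather than the authors' software. The regression test and the excess-significance test are not shown.

```python
import math


def pool_random_effects(studies):
    """DerSimonian-Laird summary for (log_or, variance) pairs.

    Returns (summary_or, ci_low, ci_high).
    """
    w = [1.0 / v for _, v in studies]
    sw = sum(w)
    fixed = sum(wi * y for wi, (y, _) in zip(w, studies)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, (y, _) in zip(w, studies))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for _, v in studies]
    pooled = sum(wi * y for wi, (y, _) in zip(w_re, studies)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))


def nominally_significant(ci_low, ci_high):
    """The 95% CI excludes an OR of 1, i.e. P < 0.05 two-sided."""
    return ci_low > 1.0 or ci_high < 1.0


def loses_significance_without_first(records):
    """True if a significant summary becomes nonsignificant once the
    earliest-published study is dropped."""
    ordered = sorted(records, key=lambda r: r["year"])
    pairs = [(r["log_or"], r["var"]) for r in ordered]
    _, lo_all, hi_all = pool_random_effects(pairs)
    _, lo_rest, hi_rest = pool_random_effects(pairs[1:])
    return (nominally_significant(lo_all, hi_all)
            and not nominally_significant(lo_rest, hi_rest))


# Hypothetical records: an early study with a large effect, then three
# later studies with a small effect.
example = [
    {"year": 2000, "log_or": 0.35, "var": 0.01},
    {"year": 2002, "log_or": 0.08, "var": 0.01},
    {"year": 2003, "log_or": 0.08, "var": 0.01},
    {"year": 2004, "log_or": 0.08, "var": 0.01},
]
flag = loses_significance_without_first(example)
```

In this hypothetical example the summary is nominally significant only because of the earliest study, so `flag` is true.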

Grading the strength of the cumulative epidemiological evidence

For all meta-analyses, we applied a grading system, the Venice criteria, for the strength of the epidemiological evidence. Details are published elsewhere.11 Briefly, each meta-analyzed association was graded on the basis of amount of evidence, consistency of replication, and protection from bias. For amount of evidence, the grade was A when the total number of minor alleles of cases and controls combined in the meta-analyses exceeded 1000, B when it was between 100 and 1000, and C when it was <100. For consistency of replication, point estimates of I2 exceeding 50% received Grade C, I2 of 25–50% received Grade B, and I2 <25% received Grade A when the result of the meta-analysis was nominally statistically significant; nonsignificant meta-analyses always received C for this criterion.11

For protection from bias, the guidelines propose to consider potential sources of bias at the level of individual studies, including errors in phenotypes, genotypes, and confounding (population stratification), and at the level of meta-analysis, including publication and other selective reporting biases.11 Grade A is given when there is probably no bias that can affect the presence of the association, B when there is no demonstrable bias but important information is missing for its appraisal, and C when there is demonstrable potential or clear bias that can invalidate the association. Protection from bias was investigated only for those meta-analyses that had not already received C grades for amount of data or replication consistency. We considered that meta-analyses based on consortium data were adequately protected from bias since publication and selective reporting are not a concern, and we considered that efforts made by the consortium should also considerably alleviate concerns for other major biases, even if bias cannot be totally excluded; thus, the grade of consortium meta-analyses was A in this regard. Meta-analyses of published data received Grade A when the OR deviated more than 1.15-fold from the null (>1.15 or <0.87) and no evidence for bias was demonstrated; a C grade was given when the summary genetic effect was of smaller magnitude (such small effects may easily be generated by even modest selective reporting biases or other biases) or other signs of bias were present, as described above.11

As suggested by the Venice criteria,11 epidemiological evidence for significant association was rated as strong if the meta-analysis received three A grades, moderate if it received any B grade but no C grade, and weak if it received a C grade in any of the three criteria.
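The grading rules described in this section can be written compactly as below. This is a simplified sketch of the thresholds as stated in the text, not an implementation of the full Venice guidelines: the function names are hypothetical, the handling of values exactly at a boundary (100 or 1000 minor alleles, I2 of 25% or 50%) is an assumption, and the protection-from-bias grade is passed in as an input because it rests on judgment rather than a formula.

```python
def grade_amount(n_minor_alleles):
    """Amount of evidence: total minor alleles in cases and controls combined."""
    if n_minor_alleles > 1000:
        return "A"
    if n_minor_alleles >= 100:
        return "B"
    return "C"


def grade_consistency(i2_percent, nominally_significant):
    """Consistency of replication, based on the I2 point estimate."""
    if not nominally_significant:
        return "C"  # nonsignificant meta-analyses always receive C
    if i2_percent > 50:
        return "C"
    if i2_percent >= 25:
        return "B"
    return "A"


def overall_strength(amount, consistency, bias):
    """Combine the three grades into strong, moderate, or weak evidence."""
    grades = (amount, consistency, bias)
    if "C" in grades:
        return "weak"
    if all(g == "A" for g in grades):
        return "strong"
    return "moderate"


# Hypothetical meta-analysis: many minor alleles, I2 of 12%, nominally
# significant, and judged adequately protected from bias.
rating = overall_strength(grade_amount(5000), grade_consistency(12, True), "A")
```

Checking for a C grade first mirrors the rule that a single C in any criterion is enough for a "weak" rating.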

RESULTS

Meta-analyses of published data

Of the 115 potentially eligible publications, five were excluded because they had incomplete reporting of genotype frequencies and one was excluded because the genotyping was performed on tumor tissue. The remaining 109 publications comprised 168 datasets addressing the 16 polymorphisms. Four datasets were excluded because the data had been reused in a larger study on the same polymorphism, two were excluded because the gene was not polymorphic in the specific dataset, and 11 were excluded because the distributions of genotypes in controls violated HWE. None of these excluded datasets were included in the analyses of the consortium. A total of 151 datasets were thus considered in our 16 meta-analyses.

The meta-analyses showed nominally statistically significant lower risk of breast cancer for CASP8 302H carriers, the VV genotype of ADH1C I350V, and the GG genotype of XRCC3 IVS7-14 (Table 1). A nominally statistically significant higher risk of breast cancer was found for homozygous carriers of the AURKA F31I and XRCC3 T241M minor alleles and for heterozygous carriers of the XRCC3 5′UTR G allele. (For the respective results of the consortium analyses, see Table 4 in Ref. 9 and Table 1 in Ref. 10.)

Comparisons between meta-analyses and BCAC analyses

Included studies and sample size

The mean number of patients and controls was 18,289 (range 4,520–28,452) in the meta-analyses of published data and 23,140 (range 12,013–37,633) in the consortium analyses (Fig. 1A; Fig. 1B shows the respective data limited to breast cancer cases only). Figure 2 shows that, by and large, the subjects in the meta-analyses were different from those in the consortium analyses. The consortium had access to unpublished data of, on average, 13,797 persons per polymorphism (range 2,213–30,774), which amounted to 42.7% of all available data (range 9.0–76.2%). Data were most often unpublished for ADH1C I350V (76.2% of all available data were unpublished) and most often published for PGR V660L (9.0% unpublished). On average, the consortium did not use data of 8,945 cases and controls (range 1,031–19,177) that were published by others. The latter comprised 47.8% of all published data (range 11.3–91.5%) and 26.9% of all available data (range 3.0–50.0%). For two polymorphisms (AURKA F31I and XRCC1 R399Q), almost half of all available data were published by research groups that were not part of the consortium. For example, for XRCC1 R399Q, the combined studies of the consortium included 18,339 cases and controls, whereas an additional 18,354 were available from other studies.

Fig. 1

Total number of cases and controls, odds ratios, and heterogeneity of meta-analyses of published data compared with meta-analyses of the Breast Cancer Association Consortium.9 Summary odds ratios of the consortium were obtained from random-effects meta-analyses.

Fig. 2

Total number of cases and controls included in the meta-analyses of published data and in the analyses of the Breast Cancer Association Consortium.9

Estimates of effect

Differences between the ORs of the meta-analyses and the consortium analyses are presented in Figure 1C. Eleven of the meta-analyses of published data and eight of the consortium analyses reached nominal statistical significance. The three meta-analyses for CASP8 were the only ones that reached statistical significance in both approaches. For 39 of the 48 comparisons, the ORs of the meta-analyses of published data were <10% higher or lower than the ORs of the consortium analyses. The largest differences were found for the analyses of homozygous carriers of the TGFB1 P alleles (OR 1.16 [95% CI: 1.08–1.25] in the consortium analysis vs. OR 0.98 [95% CI: 0.86–1.12] in the meta-analysis), ADH1C V alleles (OR 1.00 [95% CI: 0.83–1.21] vs. OR 0.83 [95% CI: 0.69–1.00]), and AURKA I alleles (OR 0.99 [95% CI: 0.78–1.27] vs. OR 1.28 [95% CI: 1.06–1.54]). However, in all of these analyses, the 95% CIs overlapped. The OR of AURKA homozygotes was not statistically significant when the analysis was restricted to whites (OR 1.30 [95% CI: 0.98–1.72]). The Spearman correlation coefficient for the OR estimates between the two approaches was 0.44 (P = 0.002).

Heterogeneity

Of the 48 meta-analyses of published data, 18 (37.5%) showed no or low heterogeneity (I2 ≤ 25%), 19 (39.6%) showed moderate heterogeneity, and 11 (22.9%) had large estimated between-study heterogeneity (Fig. 1D). The latter 11 meta-analyses concerned six genes. Figure 1D compares I2 estimates of meta-analyses of published data with those available for the consortium analyses. The Spearman correlation coefficient for I2 estimates was 0.47 (P = 0.001). Large estimates of between-study heterogeneity were observed in 14 (31%) of the 45 meta-analyses of the consortium (I2 statistics could not be calculated for the PGR analyses of the consortium because the data were not provided). In nine of the 45 meta-analyses, the consortium had large estimates of between-study heterogeneity where this was not seen in the meta-analysis of published data, whereas the opposite situation was observed in four meta-analyses.

Bias diagnostics

The consortium checked for CASP8 and TGFB1 whether the first positive study could have contributed to the statistically significant finding and found that the results remained the same.10 Exclusion of the first-published study resulted in loss of the nominal significance in five of the nine meta-analyses of published data that had nominally statistically significant results. For the remaining four meta-analyses (comparison of homozygotes for AURKA and the three CASP8 meta-analyses), the modified regression test showed no significant difference between small and larger studies and the excess test showed no statistically significant excess of “positive” studies.

Grading of cumulative epidemiological evidence

Table 2 shows the grading for amount of data, replication consistency, and protection from bias in each of the meta-analyses of published data versus the respective consortium meta-analyses. Forty-four of the 48 meta-analyses of published data and 40 of the 48 consortium analyses had at least one C grade and were therefore considered “weak” evidence for association. Eight meta-analyses of the consortium were evaluated as showing “strong” evidence for association (homozygous and per-allele meta-analyses of IGFBP3 and all meta-analyses for CASP8 and TGFB1); of those, only the three meta-analyses of CASP8 were also deemed to show “strong” evidence in the analyses of published data. Conversely, one meta-analysis of published data (homozygotes comparison for AURKA) was deemed to show “moderate” evidence but was “weak” in the consortium data.

Table 2 Grading of epidemiological evidence of association using the criteria of the Human Genome Epidemiology Network (HuGENet)11

Combining meta-analyses of published data with consortium analyses

The association between CASP8 and breast cancer risk was still statistically significant when all available data were combined. The OR for homozygous carriers of the rare allele was 0.74 (95% CI: 0.61–0.89, I2 12%), for heterozygous carriers was 0.89 (95% CI: 0.85–0.94, I2 0%), and for the per-allele effect was 0.88 (95% CI: 0.84–0.92, I2 0%). The meta-analysis of TGFB1 heterozygotes yielded a nominally statistically significant effect (OR 1.05, 95% CI: 1.00–1.11, I2 22%), but the homozygotes and per-allele meta-analyses were no longer statistically significant (homozygotes OR 1.05, 95% CI: 0.96–1.15, I2 43% and per-allele OR 1.03, 95% CI: 0.99–1.07, I2 40%). In addition, the IGFBP3 meta-analyses did not reach nominal statistical significance when the consortium data were combined with data published by nonmembers. The meta-analyses for the AURKA homozygotes yielded borderline significance with large between-study heterogeneity (OR 1.15, 95% CI: 0.99–1.34, I2 52%). None of the other meta-analyses were statistically significant.

DISCUSSION

Our meta-analyses of published studies and the consortium analyses both identified CASP8 as a breast cancer susceptibility gene and found insufficient evidence that the other genes are associated with risk of breast cancer. Some discrepancies did occur, however, especially for whether nominal statistical significance is claimed for specific comparisons. Summary effect sizes tended to correlate, between-study heterogeneity was not larger or smaller on average with either of the two approaches, and final inferences on the grading of the cumulative evidence agreed in the large majority of the meta-analyses. Exceptions to these statements did exist, and in these cases, the consideration of all data (both from consortia and other published evidence) may be most useful.

Before interpreting the results of this study, one methodological issue needs to be addressed. We should acknowledge that we are using statistical significance thresholds that are lenient, as traditionally used for candidate gene variants, but it is possible that even for candidate gene variants much lower P values may be needed to reach high credibility levels. Moreover, we adopt an approach where we examine statistical significance separately for each particular OR (i.e., separately for heterozygotes and for homozygotes), rather than global testing of the null across all genotypes. Although this approach is suitable for comparing the results of the meta-analyses and the consortium analyses, it is not the most conservative for the identification of susceptibility variants with high credibility.

The differences between the two approaches that we found may be explained by the fact that the meta-analyses and the consortium analyses, by and large, were based on different data collected in different populations. Research groups that did not participate in the consortium contributed about half of all published data and more than a quarter of all available data. Large-scale studies, such as the Shanghai Breast Cancer Study (n = 1193),28 the Carolina Breast Cancer Study (n = 2045),29 a nested case-control study within the Nurses' Health Study (n = 1004),30 and the Long Island Breast Cancer Study Project (n = 1052),31 as well as many smaller studies with approximately 500 patients each, did not participate in the consortium. For example, the AURKA F31I meta-analyses had only one overlapping study in the consortium versus published data.12 Although this study showed a decreased breast cancer risk for carriers of the I allele, all eight other published studies found increased risks, yielding a nominally significant per-allele summary OR of 1.10 (95% CI: 1.01–1.19). For ADH1C I350V, all four published studies showed nonsignificant lower breast cancer risks for carriers of the V allele, whereas the consortium included two unpublished studies that demonstrated increased risks, the largest of which (4317 patients) was statistically significant. Given the diversity of studies included by the consortium, there is no methodological reason why these published studies were not included.

The meta-analyses of published data and the BCAC analyses led to different conclusions about the association with breast cancer risk for seven of the 16 polymorphisms (12 of 48 meta-analyses) when based on nominal significance testing. However, the differences in the summary ORs were generally minor. When ORs are in the order of 1.1, a minor difference can make the difference between a finding being statistically significant or not. However, with proper attention to issues of heterogeneity and protection from bias, the final inferences may end up being similar with both approaches. On the other hand, interpreting meta-analyses without attention to these issues and looking only at the statistical significance may lead to erroneous inferences. Substantial differences in breast cancer risk were found in the comparison of homozygous carriers of the TGFB1, ADH1C, and AURKA alleles, but even in these, the 95% CIs overlapped with the two approaches. For four polymorphisms (ADH1C and the three polymorphisms in XRCC3), meta-analyses of published data were no longer nominally statistically significant after exclusion of the first-published study; hence, three differences remain. We found a nominally significantly increased risk of breast cancer for homozygous carriers of the AURKA I alleles, and the consortium found nominally significantly increased risks associated with TGFB1 (all genetic models) and borderline association with IGFBP3 (per-allele model only), but the consortium investigators did not exclude the possibility that these might be false-positive findings.10 The AURKA meta-analysis was not statistically significant in whites. When consortium data were combined with data published by others, none of these meta-analyses retained nominal statistical significance.

Contrary to our expectations, we found no systematic differences in the degree of between-study heterogeneity between the two approaches. The consortium approach did not consistently reduce between-study heterogeneity. This may be explained by the fact that research groups in the BCAC had not applied the same study methodology, that is, used the same criteria and definitions for the selection of cases and controls and the same technology for the genotyping. The consortium was established retrospectively after the individual studies were designed and conducted. Given the diversity in studies included by the consortium, we opted, by analogy, to be all-inclusive in our analyses and to consider studies regardless of menopausal status, unilateral versus bilateral breast cancer, familial versus sporadic cases, screened versus unscreened populations, hospital-based versus population controls, type of genotyping platform, and different ethnicities. However, in some settings, consortia may wish to restrict participation to teams that fulfill certain criteria or even exclude teams that fail to pass some quality checks (e.g., in genotyping or phenotyping accuracy). For example, in a consortium analysis in Parkinson disease genetics, 7 of 18 teams were excluded from the analyses because of significant deviation from Hardy-Weinberg equilibrium or a genotyping error of 10% or higher upon central genotyping check.32 It remains to be demonstrated whether consortia show reduced heterogeneity when their members have harmonized and standardized their methodology on a prospective basis,33 such as is done by the Consortium on the Genetics of Schizophrenia and the Type 1 Diabetes Genetics Consortium.2,3

Previous studies have tried to compare the results of meta-analyses based on consortia and individual-level data with meta-analyses of published data. Most of these studies pertain to data from clinical trials and refer to single topics for which the evidence has been synthesized with these two different approaches.34–37 Data from genetic association studies are sparse.38,39 Comparisons for meta-analyses involving clinical trials exhibit some additional issues that are uncommon for genetic associations, such as the intricacies of time-to-event analyses, in which individual-level consortia have a clear advantage.40,41 Consortia also have a clear advantage for the more efficient and reliable investigation of subgroup analyses and effect modifications,42 but this was not an issue in the data that we analyzed.

It is important to revisit the relative advantages and limitations of consortium-based approaches to genetic associations compared with meta-analyses of published literature. A major advantage of a consortium approach is the access to unpublished data, which reduces problems of publication and selective reporting bias that may affect the integration of published literature.8,38 On the other hand, meta-analysis of published literature may provide a relatively inexpensive and readily available approach to the synthesis of a large amount of data on genetic associations. Our study showed that sample size and between-study heterogeneity may not differ between the two approaches. The value of consortia may lie predominantly in the improvement of the design and conduct of gene-disease association studies through greater consensus about definitions, criteria, techniques, and procedures in the data collection and analyses, which prospectively facilitates future meta-analyses.8 Variability between techniques and procedures may remain even after standardization and training.43 Most important, differences in the summary ORs and inferences between the two approaches may largely depend on which studies are included in the analyses. Finally, with the rapid growth of genetic association investigations, it will become increasingly common to have several different consortia working on the genetics of the same disease. This is already the case in various diseases, including Parkinson disease, breast cancer, and Type 2 diabetes.6 Collaboration will be particularly important for rare tumors and less common diseases, where it may not be possible to amass many thousands of participants with the disease of interest; in such settings, uncertainty in the effects will be considerably larger, and minor methodological biases could more easily create false positives and false negatives. In general, meta-analyses of data from all consortia44,45 and from additional investigators outside their confines will remain useful for integrating the total evidence. Transparent availability of all consortium and other data is essential in this regard.46

In conclusion, in the rapidly emerging field of genetic associations for common diseases, both meta-analyses of published literature and consortium-based analyses may provide important information on genetic associations. We performed meta-analyses of published studies for polymorphisms that were investigated by one consortium that in retrospect combined the data collected by consortium members. Our analyses need to be replicated for the findings of other consortia and for other diseases beyond breast cancer to make a more generalizable statement about all genetic association studies. Nevertheless, despite the methodological issues concerning published studies of genetic associations, we have shown here that consortia and meta-analyses of published data may offer complementary insights.