Main

Translational research to bridge the chasm between basic science, clinical practice, and health decision making has increasingly gained attention (Zerhouni, 2005; Woolf, 2008; Dzau et al, 2010). Personalised prevention and treatment of breast cancer, one of the most important diseases to combat worldwide, is one such task that is urgently needed.

Population-based screening for breast cancer using mammography was shown to be effective in randomised controlled trials conducted between 1970 and 1990 and has been widely applied in developed and developing countries. However, health-policy makers are concerned that the harm (false negative and false positive cases) and cost of screening should be minimised and the benefits, mainly measured by the reduction of mortality from breast cancer, maximised. These concerns may be eased by individually tailored screening (Esserman et al, 2009), with emphasis on the optimal age at the start of screening, the inter-screening interval, and the expedient use of alternative image techniques. These issues are related to individual variation in the temporal natural history of breast cancer, from free-of-breast-cancer, through the pre-clinical detectable phase (PCDP), and finally to the clinical phase (CP), all three of which have been well defined by using various detection methods (Shapiro et al, 1974; Prorok, 1976; Walter and Day, 1983; van Oortmarssen et al, 1990). Screen-detected breast cancer represents the PCDP, whereas clinically detected cancer (such as interval cancer) represents the CP (Duffy et al, 1995; Chen et al, 1997a, 1997b).

With the advent of genetic and biological markers for breast cancer, individually tailored screening can now be achieved by making use of information on genes, conventional risk factors, clinical attributes, and relevant tumour phenotypes such as HER-2/neu. However, few studies have linked these scientific discoveries with the temporal natural history to assess which factors (initiators) affect the onset of breast cancer, that is, the transition from free-of-breast-cancer to the PCDP, and which factors (promoters) are responsible for the subsequent progression from the PCDP to the CP. The two roles can be distinguished by comparing the distribution of each factor between screen-detected cases and clinically detected cases, on the premise that the distribution of initiators should be identical between the two detection modes.

To this end, we conducted a literature review on state-of-the-art genomic discovery, tumour phenotypes, clinical correlates, and conventional risk factors. These factors were then superimposed onto the three-state temporal natural history of breast cancer, in the light of the roles of initiators and promoters mentioned above, to build a multi-variable three-state Markov regression model. We then estimated the effects of initiators and promoters by applying the proposed Markov regression model to the data set generated from this review, and used these estimates to construct composite risk scores. We applied this risk-score-based approach to emulate a Taiwanese population-based periodic screening programme for breast cancer, to demonstrate how such a novel quantitative approach can be used to develop an individually tailored screening strategy that is efficient in terms of inter-screening interval, age at the start of screening, and the expedient use of costly but accurate alternative image techniques.

Materials and methods

Study framework on initiators and promoters

As shown in Figure 1, our study was designed to quantify the effects of initiators on the occurrence of breast cancer in the PCDP and those of promoters on the transition from the PCDP to the CP. The three states (free of breast cancer (state 1), breast cancer in the PCDP (state 2, sometimes called pre-symptomatic cases), and breast cancer in the CP (state 3, sometimes called symptomatic cases)) are realised by three corresponding outcomes obtained from population-based screening data: free of breast cancer, screen-detected cases, and interval cancer, as defined in previous studies (Shapiro et al, 1974; Prorok, 1976; Walter and Day, 1983; van Oortmarssen et al, 1990; Duffy et al, 1995; Chen et al, 1997a, 1997b). The initiators in Figure 1 included BRCA1, BRCA2 (Peto et al, 1999; Chen and Parmigiani, 2007), single-nucleotide polymorphisms (Cox et al, 2007; Easton et al, 2007; Hunter et al, 2007; Stacey et al, 2007), breast density (del Carmen et al, 2007; Martin and Boyd, 2008; Cummings et al, 2009), and conventional risk factors (body mass index (BMI) and age at first full-term pregnancy (AP)) (Hsieh et al, 2002; Chen et al, 2004). The promoters were identified by comparing the distributions of relevant tumour phenotypes between screen-detected cases and interval cancer without making any further assumption, because initiators, which affect only the occurrence of breast cancer, would not lead to a disparity between screen-detected cases and interval cancer (Dong et al, 2008). According to previous studies (Hsieh et al, 2002; Dong et al, 2008), the promoters responsible for the progression from the PCDP to the CP in the current model included BMI, AP, oestrogen receptor (ER), HER-2 immunohistochemical score, and Ki-67 expression.

Figure 1

Study design ascertaining outcomes from three rounds of screens based on the temporal natural history of breast cancer. BC, breast cancer; Ki-67, Ki-67 proliferation; HER-2/neu, HER-2/neu immunohistochemistry score.

Data sources

We emulated an organised, periodic service screening programme of 1 million Taiwanese women aged 50 years or older, assuming that the distributions of covariates by detection method are identical to those in the first box of Figure 1. To fit this scenario, we tuned the baseline pre-clinical incidence rate to follow the age-specific incidence rate, which ranged from 0.0017 to 0.021 per woman-year during 2001–2005 in Taiwan (Taiwan Cancer Registry Annual Report, 2009). Because no comprehensive empirical data set is available to estimate the parameters underpinning the model directly, we used simulation to turn the empirical tabular data obtained from the literature into individual data, as if they had been collected directly, according to the concept of sufficient statistics (Casella and Berger, 2001). For example, the sufficient statistics for generating data on breast density are the total numbers in each subgroup, which is equivalent to collecting individual data when these are drawn from a multinomial distribution following the percentage of each category. The data on other covariates were generated in a similar manner. After generating the data set with covariates, we further applied a three-state Markov regression model to add information on detection mode (including screen-detected cases and interval cancer) and to capture the time to entering the PCDP and the time to the CP by using the relative risks specified in Tables 1 and 2. Data on three rounds of screening were generated as illustrated in Figure 1. This completes the creation of the data set for parameter estimation (the second box of Figure 2).
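The step of turning tabular percentages into individual records can be sketched as follows. This is a minimal illustration only: the category labels and percentages are hypothetical placeholders rather than the values in Table 1, and the function name `simulate_covariate` is ours.

```python
import random
from collections import Counter

# Hypothetical category percentages for one covariate (NOT the Table 1 values).
BREAST_DENSITY_PCT = {"<25%": 0.40, "25-50%": 0.30, "50-75%": 0.20, ">=75%": 0.10}

def simulate_covariate(pct_by_category, n, rng):
    """Draw n individual category labels from a multinomial with the given percentages."""
    categories = list(pct_by_category)
    weights = [pct_by_category[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n)

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
cohort = simulate_covariate(BREAST_DENSITY_PCT, 100_000, rng)

# The simulated shares should be close to the tabulated percentages.
counts = Counter(cohort)
shares = {c: counts[c] / len(cohort) for c in BREAST_DENSITY_PCT}
```

With a large simulated cohort, the category totals (the sufficient statistics) reproduce the tabulated percentages closely, which is the sense in which the individual records stand in for the published table.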

Table 1 Percentage and relative risks for initiators affecting the transition rate from free-of-breast cancer to the pre-clinical screen-detectable phase (PCDP)
Table 2 Percentage and relative risks for promoters affecting the transition rate from the pre-clinical screen-detectable phase (PCDP) to the clinical phase (CP)
Figure 2

Data sources, parameter estimation, and validation of three-state Markov regression model.

Statistical analysis

Estimation of parameters

The data created above were used to build up a series of likelihood functions based on our proposed Markov regression model (the third box of Figure 2). The details on the formula of likelihood function (the fourth box of Figure 2) are similar to those used in the previous studies (Hsieh et al, 2002).
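The likelihood formula itself is not reproduced here, so the following is only a sketch of the building blocks such a likelihood typically uses: the transition probabilities of a progressive three-state model. It assumes constant (time-homogeneous) transition rates and a proportional-hazards form for covariates; both are simplifying assumptions of ours, and the function names and numerical values are hypothetical.

```python
import math

def transition_probs(lam1, lam2, t):
    """Return (P11, P12, P13) after time t for a progressive three-state model
    with constant rates lam1 (free of cancer -> PCDP) and lam2 (PCDP -> CP)."""
    p11 = math.exp(-lam1 * t)
    if math.isclose(lam1, lam2):
        # Limiting form when the two rates coincide.
        p12 = lam1 * t * math.exp(-lam1 * t)
    else:
        p12 = lam1 * (math.exp(-lam1 * t) - math.exp(-lam2 * t)) / (lam2 - lam1)
    p13 = 1.0 - p11 - p12
    return p11, p12, p13

def covariate_rate(baseline, log_rr, x):
    """Proportional-hazards form: baseline * exp(sum(beta_k * x_k))."""
    return baseline * math.exp(sum(b * v for b, v in zip(log_rr, x)))

# Hypothetical rates: 0.002 per year into the PCDP, 0.3 per year from PCDP to CP.
p_free, p_pcdp, p_cp = transition_probs(0.002, 0.3, 3.0)
```

In a screening likelihood, `p_pcdp` would relate to cases detectable at a screen held after time `t`, and `p_cp` to interval cancers arising within that time; how these terms are combined across screening rounds follows the cited studies and is not shown here.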

Composite risk scores

The maximum likelihood estimation method was used to calculate risk scores (Chen et al, 2000; Hsieh et al, 2002) (the fourth box of Figure 2). The risk scores are the natural logarithms of the estimated transition rates for initiators and promoters. The risk stratification by deciles of risk score was constructed to demonstrate the following individually tailored screening strategies. The median value of 10-year breast cancer risk for the average-risk population at age 50 was used as a criterion to recommend the suitable starting screening age for other deciles, similar to Pharoah et al (2008).
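As an illustration of how a log-scale risk score and its decile might be computed, the sketch below sums the log relative risks of the factors a woman carries (the score relative to baseline) and locates that score within a reference population. The relative risks and the names `risk_score` and `decile` are hypothetical and are not the estimates or code of this study.

```python
import math

# Hypothetical relative risks for illustration only (NOT the Table 1 estimates).
RELATIVE_RISK = {"brca1": 20.0, "dense_breast": 2.0, "high_bmi": 1.3}

def risk_score(profile):
    """Sum of log relative risks for the factors that are present in the profile."""
    return sum(math.log(RELATIVE_RISK[f]) for f, present in profile.items() if present)

def decile(score, population_scores):
    """Decile (1 to 10) of score within a reference population of scores."""
    n = len(population_scores)
    below = sum(s <= score for s in population_scores)
    d = (10 * below + n - 1) // n  # integer ceiling of 10 * below / n
    return min(10, max(1, d))

woman = {"brca1": False, "dense_breast": True, "high_bmi": True}
score = risk_score(woman)  # log(2.0) + log(1.3)
```

Because the score is additive on the log scale, exponentiating it gives the woman's combined relative risk under the multiplicative assumption of the regression model.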

Model validation

Figure 2 also shows model validation, including cross-validation and external validation. For cross-validation, the full data set was divided into two parts: two-thirds as training data for estimating parameters and one-third as validation data. Predicted numbers for the validation data were calculated by applying the parameters estimated from the training data to the covariate distributions of the validation data. We then compared the observed with the predicted numbers of breast cancers by detection mode (first screen, subsequent screen, and interval cancer).
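The split-and-compare procedure can be sketched as below. The Pearson statistic is used here as one common way of comparing observed with predicted counts; the exact test statistic and cell structure used in the study are not specified in the text, and the counts shown are hypothetical.

```python
import random

def split_two_thirds(records, rng):
    """Shuffle and split records into 2/3 training and 1/3 validation subsets."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

def pearson_chi_square(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

records = list(range(9))
train, valid = split_two_thirds(records, random.Random(1))

# Hypothetical counts by detection mode: first screen, subsequent screen, interval cancer.
observed = [120, 85, 40]
predicted = [110.0, 90.0, 45.0]
stat = pearson_chi_square(observed, predicted)
```

A small statistic relative to its reference distribution indicates that the predicted counts are compatible with the observed ones.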

For external validation, we applied the parameters estimated from the full data set, which was simulated from the empirical tabular data published in the literature, to the Taiwanese multi-centre screening programme (TAMCAS). The study design and results have been described in previous studies (Lai et al, 1998; Chen et al, 2000; Wu et al, 2008). A total of 4867 women with a family history of breast cancer were invited to and attended the first screen. We ascertained 76 breast cancer cases among these 4867 women. The observed numbers of breast cancer cases and of women free of breast cancer identified at subsequent screens, together with the interval cancers, are listed in Table 3. Because different populations have different underlying incidence rates of breast cancer, the baseline pre-clinical incidence rate (in the absence of relevant covariates) should be tuned to the underlying incidence rate of the target population when the proposed Markov regression model is applied. In our TAMCAS study, we let the pre-clinical incidence rate follow a Weibull distribution with shape parameter 1.94 and scale parameter 0.00015, on the basis of the underlying Taiwanese incidence rate of breast cancer among women with a family history of breast cancer (Taiwan Cancer Registry Annual Report, 2009). The distributions of factors such as BMI and AP were derived from TAMCAS data, whereas those of other attributes were based on Table 1, obtained from the literature, assuming that, after adjustment for the baseline incidence rate, the relative risks across studies do not vary between ethnic groups.

Table 3 Observed and predicted results of cross-validation and external validation

Applications to individually tailored screening

To determine the optimal inter-screening interval in different high-risk groups, we adopted a triennial screening interval as a reference to identify the number of interval cancer for other risk score percentiles that did not exceed the median-risk score based on a triennial programme for high-risk screening. This is called a risk-score-based approach. We also applied estimates for improving sensitivity by using a combination of ultrasound or magnetic resonance imaging (MRI) with mammography compared with mammography alone borrowing information from previous studies (Berg et al, 2008; Kuhl et al, 2010) to identify the threshold of high-risk categories requiring alternative image techniques. According to Berg et al (2008), sensitivity improved 27.5% when mammography was combined with ultrasound. The sensitivity improved 14.8% when mammography was combined with ultrasound or 66.7% when it was combined with MRI in the light of Kuhl et al (2010). We calculated the incidence rate of interval cancer at which the threshold of risk percentiles combining ultrasound with mammography would be the same as the median population rate using mammography only.

We further extended the study period to 12 years. We calculated the proportion of interval cancer and the incidence rate of interval cancer as a percentage of the expected incidence rate (0.00185 per year) in the absence of screening (I/E ratio), a reflection of the sensitivity of the screening method, by different inter-screening intervals and for the risk-score-based strategy. The higher the I/E ratio, the poorer the performance of the screening programme. This also allowed us to assess how a risk-score-based approach contributes to reducing false positive cases arising from the low-risk group and false negative cases from the high-risk group compared with a universal screening policy. The sensitivity for each risk decile was estimated as 1−I/E ratio, and the corresponding false-positive fraction from Berg et al (2008) was applied to calculate the number of false positive cases.
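The I/E arithmetic can be illustrated with a short sketch. The expected rate of 0.00185 per year is taken from the text; the interval cancer count and person-years below are hypothetical, and the function names are ours.

```python
EXPECTED_RATE = 0.00185  # expected incidence per year without screening (from the text)

def ie_ratio(interval_cancers, person_years, expected_rate=EXPECTED_RATE):
    """Interval cancer incidence rate divided by the expected rate without screening."""
    interval_rate = interval_cancers / person_years
    return interval_rate / expected_rate

def sensitivity_from_ie(ratio):
    """Programme sensitivity approximated as 1 - I/E."""
    return 1.0 - ratio

# Hypothetical example: 555 interval cancers over 1,000,000 person-years.
ratio = ie_ratio(555, 1_000_000)
sensitivity = sensitivity_from_ie(ratio)
```

In this hypothetical example the interval cancer rate is 30% of the expected rate, which corresponds to an estimated sensitivity of 70%.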

Results

Composite score for multi-state risk prediction

To identify the roles of initiators and promoters, we compared the frequencies of the risk factors across women free of breast cancer, screen-detected cases, and clinically detected cases (such as interval cancers). The percentage (prevalence) and relative risk of having breast cancer in the PCDP (pre-symptomatic cases) for each initiator, extracted from the literature, are shown in Table 1. The transition rates from free-of-breast-cancer to the PCDP among carriers were estimated from the results of a meta-analysis (Chen and Parmigiani, 2007). The law of total probability was used to calculate the baseline rate for non-carriers. The estimated rates for non-carriers, BRCA1 carriers, and BRCA2 carriers were 1.04 × 10−6, 2.10 × 10−5, and 1.60 × 10−5, respectively. The corresponding relative risks for BRCA1 and BRCA2 carriers are 20.26 and 15.44. The percentages listed in Table 2 for the three tumour-specific risk factors (ER, Ki-67, and HER-2) indicate their distribution among breast cancers in the PCDP (pre-symptomatic cases). The relative risk for progression from the PCDP to the CP (symptomatic cases), together with the percentage among pre-symptomatic cases, was used to simulate the corresponding percentage among interval cancers (symptomatic cases). Note that the percentages of BMI and AP applied to pre-symptomatic (screen-detected) cases were simulated following the distribution among normal (disease-free) women in Table 1, as these two attributes are carried by women in the general population throughout their lifetime.
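The use of the law of total probability for the non-carrier baseline can be written out as follows. The notation is ours and is an assumption about the calculation: p_1 and p_2 denote the carrier prevalences of BRCA1 and BRCA2, p_0 = 1 − p_1 − p_2, and the lambdas denote the corresponding transition rates into the PCDP.

```latex
% Population rate as a prevalence-weighted mixture of subgroup rates
\lambda_{\mathrm{pop}} = p_0\,\lambda_0 + p_1\,\lambda_1 + p_2\,\lambda_2,
\qquad p_0 = 1 - p_1 - p_2

% Solving for the non-carrier (baseline) rate
\lambda_0 = \frac{\lambda_{\mathrm{pop}} - p_1\,\lambda_1 - p_2\,\lambda_2}{p_0}

% Relative risks of carriers versus non-carriers
\mathrm{RR}_1 = \frac{\lambda_1}{\lambda_0}, \qquad
\mathrm{RR}_2 = \frac{\lambda_2}{\lambda_0}

% Check with the rounded rates reported in the text
\frac{2.10 \times 10^{-5}}{1.04 \times 10^{-6}} \approx 20.2, \qquad
\frac{1.60 \times 10^{-5}}{1.04 \times 10^{-6}} \approx 15.4
```

The ratios of the rounded rates agree with the reported relative risks of 20.26 and 15.44 to within rounding of the published rates.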

The estimated clinical weights of the two-step progression to breast cancer were expressed by the two risk scores

and

Information on risk score 1 provided the basis for risk stratification of the underlying general population. The 10-year risk of having breast cancer in each category is diagrammed in Figure 3. The 90th percentile of risk score of 10-year follow-up was 2.3%. The corresponding figure of lifetime risk was 10.2%. The corresponding relative risk compared with the median value (50th percentile) was 2.24.

Figure 3

Ten-year risk and lifetime risk of developing breast cancer by risk score percentile.

Model validation

Table 3 shows the comparisons between the observed and the predicted numbers of breast cancers by detection mode for cross-validation and external validation. The proposed regression model appears adequate for risk score 1 and risk score 2, as there was no statistically significant difference between observed and predicted numbers in cross-validation (χ²=8.08, P=0.33) or in external validation (χ²=5.30, P=0.26).

Individually tailored screening policies

Table 4 shows the recommended policies, in terms of age at the start of screening and inter-screening interval, by decile category. The recommended age for the start of screening in each percentile group was determined as the age at which the 10-year risk equals that of the 50th percentile of the population at age 50, which was estimated to be approximately 1%.
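The rule of matching 10-year risks can be sketched as a simple search over ages. Everything numerical here is hypothetical: the age-specific rates are a made-up linear schedule chosen so that the median 10-year risk at age 50 is about 1%, and the relative risk of 2 is illustrative rather than taken from Table 4.

```python
import math

def ten_year_risk(annual_rates, start_age, relative_risk=1.0):
    """10-year risk from start_age, assuming piecewise-constant annual hazards."""
    cumulative = sum(annual_rates[a] * relative_risk
                     for a in range(start_age, start_age + 10))
    return 1.0 - math.exp(-cumulative)

def recommended_start_age(annual_rates, relative_risk, threshold, ages):
    """Youngest age at which the 10-year risk reaches the threshold, else None."""
    for age in ages:
        if ten_year_risk(annual_rates, age, relative_risk) >= threshold:
            return age
    return None

# Hypothetical age-specific annual rates (NOT registry data).
rates = {age: 0.00003 * (age - 20) for age in range(20, 80)}

# Reference: 10-year risk of a median-risk woman at age 50 (about 1% here).
threshold = ten_year_risk(rates, 50)
age_for_rr2 = recommended_start_age(rates, 2.0, threshold, range(20, 70))
```

Under these hypothetical rates, a woman whose risk is twice the median reaches the reference 10-year risk at a considerably younger age than 50, which is the logic behind recommending an earlier start for higher percentiles.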

Table 4 The recommended age to start screening and inter-screening interval for two types of screening at different percentiles of risk score 1

Table 4 also shows that the higher the risk of entering the PCDP and of further progressing to the CP, the shorter the inter-screening interval required. In addition to shortening the inter-screening interval to reduce the incidence rate of interval cancer, a tool with higher sensitivity could also be considered. On the basis of Berg et al (2008), when triennial mammography plus ultrasound was applied, the incidence rate of interval cancer among women at the 85th risk percentile was reduced to the same rate as that of median-risk women under triennial mammography screening alone. The thresholds for using ultrasound and MRI based on Kuhl et al (2010) (the sensitivity improved 14.8% when mammography was combined with ultrasound or 66.7% when it was combined with MRI) were at the 76th and 94th percentiles, respectively.

Following the spirit of translational research, we present two cases to demonstrate how to implement individually tailored screening as suggested in Table 4. A woman with average risk would be classified into the 40th∼50th percentile by calculating her risk score 1. According to Table 4, the recommended age to start screening is around 57 years and the recommended inter-screening interval is 4 years. For a woman in the 90th∼100th percentile, the age at the start of screening should not be older than 30 years, the recommended inter-screening interval should not exceed 6 months, and an alternative image technique may be considered in addition to mammography.

The number of screenings, incidence of screen-detected cases, interval cancer, the proportion of interval cancer among breast cancer cases, and the I/E ratio over 12 years were simulated and are listed in Table 5, which compares the risk-score-based policies with traditional universal screening policies. In terms of the I/E ratio, the risk-score-based approach (with the optimal inter-screening interval determined from Table 4) was equivalent to a universal biennial screen. The risk-score-based approach significantly reduced the interval cancer rate as a percentage of the expected rate in the absence of screening by 30% compared with triennial screening, and it also reduced the false positive cases by 8.2%.

Table 5 A comparison of different screening policies for 1 million women during 12 years

Discussion

An individually tailored screening strategy can be developed by applying empirical evidence on genomic discovery, tumour phenotype, clinical correlates, and epidemiological profiles to a periodical population-based screening for breast cancer with a multi-variable and three-state Markov regression model. Developing this model is potentially helpful to ascertain the roles of different risk factors responsible for different phases in breast cancer progression, and also for targeting individually based preventive or surveillance interventions.

Individual susceptibility to breast cancer is determined by identifying initiators. Because high-risk women tend to develop breast cancer at an earlier age, the age at the start of screening should be lower in the higher-risk group to detect early-onset cases, especially in Asian countries, where younger breast cancer cases are predominant (Anderson et al, 2006). Our study demonstrated that determining the inter-screening interval using the risk-score-based approach was more effective for reducing interval cancer and false positive cases than the universal triennial screening programme. A risk-stratification approach reflects the priority for targeting individuals invited to be screened. However, because mammography has low sensitivity in the denser breast tissue of younger women, rapidly growing malignancies may be missed. Although more intensive screening can reduce interval cancer, an alternative image technique may be a better solution for some high-risk women. Our risk assessment model can also be applied to identify women whose risk is higher than that of 75% of the underlying population, who may require mammography in combination with ultrasound, and those whose risk is higher than that of 90%, who may need mammography combined with MRI; these thresholds are consistent with previous findings (Kriege et al, 2004; Boyd et al, 2007).

In terms of translational research, our study is the first to demonstrate how to translate genetic markers, tumour phenotypes, clinical attributes, and conventional risk factors to facilitate individually tailored screening for breast cancer. Our approach can help women who have BRCA1 or BRCA2 mutations in conjunction with other risk factors to reduce their breast cancer risk through intensive surveillance, chemoprevention (tamoxifen), oral contraceptives, prophylactic mastectomy, and prophylactic oophorectomy (Anderson et al, 2006). In addition, less-intensive screening and less frequent use of alternative image techniques in lower-risk women can also reduce false positive cases, as demonstrated in our study. A cost-effectiveness study on testing breast cancer susceptibility genes showed that test and no-test strategies are very similar in cost and effectiveness, even with a small mutation probability (Holland et al, 2009).

In our current model, there are five promoters responsible for the progression from the PCDP (pre-symptomatic disease) to the CP (symptomatic disease): BMI, age at first full-term pregnancy (AP), ER, Ki-67, and HER-2. Although BMI and AP have been reported to act as promoters (Hsieh et al, 2002), the promoter role of the other three tumour-specific risk factors (ER, Ki-67, and HER-2) has not yet been validated. It is therefore of interest to assess whether and how these three tumour-specific factors contribute to the progression from the PCDP to the CP. By using the likelihood ratio test, we compared the full model with the reduced model (excluding the three biomarkers) and found that the contribution of the three tumour-specific factors was statistically significant (χ²=137.06, P<0.0001). This supports the view that these three biomarkers have an important role as promoters.
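The likelihood ratio comparison can be sketched as follows. The degrees of freedom are not stated in the text; the sketch assumes 3 (one parameter per biomarker), which is our assumption, and uses the closed-form upper tail of the chi-square distribution for that case. The log-likelihood values are hypothetical and chosen only to reproduce the reported statistic.

```python
import math

def likelihood_ratio_statistic(loglik_full, loglik_reduced):
    """LR statistic: twice the difference in maximised log-likelihoods."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Hypothetical log-likelihoods giving a statistic of 137.06.
lr = likelihood_ratio_statistic(-1000.0, -1068.53)
p_value = chi2_sf_df3(lr)  # assumes 3 degrees of freedom
```

With a statistic of this size the P-value is far below 0.0001 for any plausible small number of degrees of freedom, so the conclusion does not hinge on the assumed value of 3.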

Promoters identified from screen-detected cases and interval cancer can be used as prognostic or predictive factors for personalised early treatment or prophylactic intervention. Other than the interval cancer resulting from false negatives, rapid tumour progression also accounts for the failure of detection by screening. The identification of promoters affecting the transition rate between the PCDP and the CP leading to interval cancer with rapid progression has an important role in personalised treatment and therapy (Dowsett and Dunbier, 2008). Perou et al (2000) proposed four main breast cancer molecular classes categorised by histological grade and gene-expression profile, including ER, PR, and HER-2/neu statuses, all of which may be defined as promoters and provide a better prediction of prognosis for optimal treatment (Dowsett and Dunbier, 2008). For example, trastuzumab is a first-line treatment for HER-2/neu-positive metastatic breast cancer (Viani et al, 2007). The other new high-throughput genomic technologies, such as Oncotype DX and MammaPrint, are already commercialised as prognostic tests (Dowsett and Dunbier, 2008). By identifying different risk groups at different stages, our risk assessment model provides guidance for implementing all these new therapies to achieve efficacy and efficiency.

The concept of individually tailored breast cancer screening developed in the current study may also help a woman to prevent breast cancer by using knowledge of her genes and of her lifestyle, through the application of our method after gene testing and health consultation. Our proposed method and application can also provide a platform for the personal genetic-testing industry, which is currently much debated but seems promising for direct-to-consumer genetics firms working on cancer prevention. All these efforts promote the adoption of new cancer markers by clinical practice and health-policy makers, making translational research feasible and creating medical enterprise as proposed in a previous study (Dalton and Friend, 2006).

Despite the strength of our proposed approach in linking state-of-the-art research findings to clinical practice, several concerns and limitations should be clarified. Although the methodology developed in the current study is based on a multi-variable three-state Markov regression model, the definition of the three-state outcome accords with population-based screening data and the clinical weight (relative risk) of each factor is given in the light of current evidence from the literature. We therefore think that the simulated results, given such evidence-based information, do not rely on mathematical model assumptions. Notably, factors were classified as initiators or promoters by comparing their distributions between screen-detected cases and interval cancer without making any further assumption. Nonetheless, the application of molecular findings to clinical practice through our novel quantitative approach may still be at a preliminary stage, as initiators and promoters are ethnic-specific and may change over time. The current results may not be directly applicable to other populations. Hence, the proposed model should be updated by adjusting the clinical weights as new empirical evidence emerges. In addition, the age at the start of screening, the inter-screening interval, and the threshold for using an alternative image technique in high-risk groups may vary between populations. The direct application of our results in Table 5, based on our multi-variable Markov regression model, may therefore be limited and the results should be modified accordingly. Moreover, to be comprehensive in allowing for all factors affecting the two transitions of the three-state Markov regression model, we included the factors proposed in the literature as either initiators or promoters. A concern arises as to whether the reported associations between certain factors, in particular genetic variants, and the onset or subsequent progression of breast cancer may still be too preliminary or too costly to be applied to population-based screening. It is therefore necessary to corroborate the validity of each factor with more empirical evidence, and to extend the application of our risk-score-based algorithm to population-based screening by reducing, through economies of scale, the costs of determining these genetic variants.

In conclusion, this is the first study to demonstrate how to translate state-of-the-art biological information on genetic, tumour, and clinical attributes, as well as conventional risk factors, into an individually tailored screening strategy. The proposed risk-score-based approach can be readily applied to screening programmes in other countries and areas by tuning their own genetic susceptibility factors, tumour phenotypes, clinical attributes, and risk factors.