Benefits and harms of annual, biennial, or triennial breast cancer mammography screening for women at average risk of breast cancer: a systematic review for the European Commission Initiative on Breast Cancer (ECIBC)

Background Although mammography screening is recommended in most European countries, the balance between the benefits and harms of different screening intervals is still a matter of debate. This review informed the European Commission Initiative on Breast Cancer (BC) recommendations. Methods We searched PubMed, EMBASE, and the Cochrane Library to identify RCTs, observational or modelling studies, comparing desirable (BC deaths averted, QALYs, BC stage, interval cancer) and undesirable (overdiagnosis, false positive related, radiation related) effects from annual, biennial, or triennial mammography screening in women of average risk for BC. We assessed the certainty of the evidence using the GRADE approach. Results We included one RCT, 13 observational, and 11 modelling studies. In women 50–69, annual compared to biennial screening may have small additional benefits but an important increase in false positive results; triennial compared to biennial screening may have smaller benefits while avoiding some harms. In younger women (aged 45–49), annual compared to biennial screening had a smaller gain in benefits and larger harms, showing a less favourable balance in this age group than in women 50–69. In women 70–74, there were fewer additional harms and similar benefits with shorter screening intervals. The overall certainty of the evidence for each of these comparisons was very low. Conclusions In women of average BC risk, screening intervals have different trade-offs for each age group. The balance probably favours biennial screening in women 50–69. In younger women, annual screening may have a less favourable balance, while in women aged 70–74 years longer screening intervals may be more favourable.


INTRODUCTION
Breast cancer (BC) is the second most prevalent cancer in the world and the most frequent among women [1]. In the European Union, 404,920 women were diagnosed with BC and 98,755 women died during 2018 [2]. Despite these high rates, the mortality risk of BC has decreased over the last decades due to improvements in treatment, services quality, and to early diagnosis linked to the implementation of population-based screening programmes [3]. However, there is still ongoing research and debate on how to best implement BC screening programmes, including which is the optimal mammography screening interval.
Published recommendations on mammography screening frequencies vary among organisations. The National Health Service Breast Screening Program (NHSBSP) of the United Kingdom recommends screening every 3 years for women aged 50-70 (in England) [4]. The United States Preventive Services Task Force (USPSTF) recommends biennial mammography for women aged 50-74 and a case-by-case decision for women in their 40s [5]. The American Cancer Society recommends annual screening between the ages of 45 and 54 (with the option of starting annual screening between 40 and 44), and screening every two years from age 55, or continuing annually, if the woman is in good health and expected to live ten more years [6].
Previous studies have suggested that the balance between benefits and harms for different screening intervals might vary depending on the age subgroup. A modelling study found that for every 1000 women aged 50-74, biennial screening avoided seven BC deaths, while annual screening had similar benefits but caused more harms [7]. Observational data from the US Breast Cancer Surveillance Consortium (BCSC) registries showed that premenopausal women undergoing biennial screening had more BC lesions with less favourable prognostic characteristics compared to those having annual screening [8].
In 2015, the European Commission Initiative on Breast Cancer (ECIBC) was launched to develop the European Guidelines on Breast Cancer Screening and Diagnosis. This article describes the systematic review that informed the recommendations about mammography screening intervals for women of average breast cancer risk in three separate age subgroups [9,10]. During the guideline development process, the Guidelines Development Group (GDG) made detailed considerations about the balance between desirable and undesirable effects, values and preferences, equity, acceptability and feasibility; these considerations are described in the published methodology and summary of recommendations [9,10] (https://healthcare-quality.jrc.ec.europa.eu/european-breast-cancer-guidelines/screening-ages-and-frequencies).

Data sources and searches
We initially searched MEDLINE (via PubMed, October 2016), EMBASE (via Ovid, October 2016) and CENTRAL (via The Cochrane Library, October 2016) databases using pre-defined algorithms for individual studies. We updated our initial search in MEDLINE (via PubMed) and EMBASE (via Ovid) in April 2020 (Supplementary

Study selection
We included studies published in English of the following designs: (i) randomised clinical trials (RCTs), (ii) observational studies such as cohorts, time trend (before-after), or analysis of population surveillance registries, and (iii) decision analytic models (hereafter referred to as modelling studies) (Supplementary Tables S3a and S3b). All studies included at least two screening intervals in one of the age groups of interest; screening intervals from observational studies should have been defined based on at least two examinations prior to diagnosis; modelling studies should have assumed 100% adherence to the screening programmes and applied no discounting to the effects. Due to sparse empirical evidence in the 45-49 age subgroup, we included RCTs and observational studies that recruited women from 40 to 49. We excluded studies of women at high risk for breast cancer, i.e. having known susceptibility gene mutations (BRCA1/BRCA2), a history of previous breast cancer or lobular neoplasia, exposure to chest irradiation (other than diagnostic imaging over that anatomical area) or having a direct family member with breast cancer.
Pairs of reviewers (CCA, MP), after calibration, assessed eligibility and reviewed the full text of the selected references. Discrepancies were resolved either by consensus or with the help of a third reviewer.

Data extraction and risk of bias assessment
CCA and MP independently extracted details of the study design, patient population, setting, screening method, follow-up, mammography intervals and results. If needed, we requested additional data from the authors. We assessed the risk of bias (or credibility for modelling studies) with the following tools: (i) for RCTs, the Cochrane Risk of Bias Assessment tool [12]; (ii) for observational studies, the Risk of Bias in Non-randomised Studies of Intervention (ROBINS-I) tool [13]; and (iii) for modelling studies, the Questionnaire to Assess Relevance and Credibility of Modelling Studies (the ISPOR-AMCP-NPC Good Practice Task Force) [14].

Data analysis
We prioritised observational studies reporting the longest observation time when different studies used the same surveillance or clinical registries from an identical population covering overlapping time periods. We prioritised the more direct evidence for a European population of average risk when data was stratified by women's characteristics (e.g. white women instead of other ethnic groups).
Modelling studies reported the incremental number of events for each screening interval compared to a non-screening scenario. For some studies, we calculated the number of events by subtracting overlapping age groups (i.e. to obtain events in annual screening in women 45-49 years old, we subtracted the estimates in women 50-69 from the larger group of 45-69). We used the estimates for women with scattered fibroglandular breast density when they were only reported by breast density categories. Across the different studies, we presented the range of the absolute difference of events per each pairwise screening interval comparison.
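The two calculations described above (subtracting overlapping age groups, and summarising pairwise differences as a range across studies) can be sketched in Python. This is only an illustration of the arithmetic, not the review's own code: the function names and every numeric input below are hypothetical placeholders, not data from the included studies.

```python
def events_in_subgroup(events_wide, events_narrow):
    """Events attributable to the ages not shared by two overlapping groups,
    e.g. events at 45-49 = events at 45-69 minus events at 50-69."""
    return events_wide - events_narrow


def absolute_difference_range(estimates_a, estimates_b):
    """Range (min, max) of the absolute differences between two screening
    intervals, taken study by study (or model by model)."""
    diffs = [a - b for a, b in zip(estimates_a, estimates_b)]
    return min(diffs), max(diffs)


# Hypothetical inputs, per 100,000 women screened (placeholders only).
annual_45_69 = 900
annual_50_69 = 740
print(events_in_subgroup(annual_45_69, annual_50_69))  # 160

# Hypothetical estimates from three models for one outcome.
annual = [160, 120, 200]
biennial = [130, 100, 150]
print(absolute_difference_range(annual, biennial))  # (20, 50)
```

The subtraction treats events as additive across adjacent age bands, which is an assumption of this sketch.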
We did not attempt to conduct a meta-analysis of relative risks (RR) or odds ratios (OR) from empirical studies because there were not enough studies across age groups to be meaningful or because several publications reported the same population data at overlapping time periods.

Certainty of the evidence
We rated the certainty of the evidence as high, moderate, low or very low for each outcome based on the standard GRADE approach for RCTs and observational studies [15,16]. To apply the GRADE approach to modelling studies, we considered that the certainty would start from the lowest certainty of the bodies of evidence that informed the main inputs in the model. We used the credibility and relevance items from the ISPOR-AMCP-NPC tool to inform the judgments for the risk of bias and indirectness domains.
As is customary in systematic reviews, we adopted a partially contextualised approach to rate the certainty of evidence. This means that for a point (or range) estimate of a single outcome we assessed our certainty that the true effects lie within the boundaries of what we consider a trivial, small, medium or large effect, without considering the evidence from other outcomes [17]. During the development of recommendations, guideline panel members might consider our results using a contextualised approach, which means considering the evidence from other critical outcomes (i.e. whether the benefits are consistent across outcomes) when rating the certainty for a single outcome [17].

Studies' characteristics
We provide here a summary of the study design, and the main results for only the three age groups of interest. When there are empirical data (from observational studies or RCTs) we rely primarily on those estimates instead of the simulated number of events from modelling studies. To interpret the modelling estimates, we must consider that they represent the estimated events for a cohort of individuals from the time of screening until death or during the individual's lifetime (or other given time point). The 10-year probability of a false positive result or false positive biopsy recommendation in the observational studies was estimated using a previously described statistical model [43]. A detailed reporting of the results from studies covering larger age groups (i.e. 66-74 years) can be found in Supplementary  [19]. Of the women originally invited to either of the screening arms, 38,492 (77%) attended triennial screening and 37,530 (76%) attended annual screening. The primary end point was predicted mortality based on two validated risk models. However, as the UKCCR published observed data for survival up to the end of 2006, we reported these estimates in our assessment [18].
Nine studies performed analysis from surveillance systems data of the United States which differed in the time periods covered and the age group of the women included. Eight studies used national Breast Cancer Surveillance Consortium (BCSC) mammography registries which were linked to the Surveillance, Epidemiology, and End Results (SEER) pathology registries [8,30,32,34,37,38,40]. One study used the Vermont Breast Cancer Surveillance System (VBCSS) from the state of Vermont [33]. The studies included two types of analysis: first a case series of invasive BC that were used to evaluate the association between screening intervals and adverse tumour characteristics, and secondly, they estimated the 10-year cumulative probabilities of false positive results and false positive biopsy recommendations (Table 1) [43].
A quasi-experimental study included women aged 40-49 who were invited to attend a screening programme in Finland. Those women born in an even calendar year were invited for mammography screening every year, while those born in an odd calendar year were invited to screening every 3 years [39]. One study conducted a comparative analysis of two time periods in British Columbia, Canada, before and after 1997, the year when the Screening Mammography Program of British Columbia (SMPBC) changed its policy from annual to biennial mammography for women aged 50-79 [31].
Two studies included women from screening programmes at medical centers from the US. The first performed a retrospective analysis of data from women who chose to attend either annual or biennial mammography examinations in a screening programme of the University of California San Francisco Medical Center [35]. The second study was a retrospective cohort of women without previous diagnosis of BC who attended a routine screening examination at Columbia University Medical Center in New York; the screening interval was defined using the time elapsed since their previous exam according to their electronic clinical records [42] (Table 1).
Six studies used microsimulation models developed within the Cancer Intervention and Surveillance Modelling Network (CISNET) collaboration: Model D (Dana-Farber) [44], Model E (Erasmus) [45], Model GE (Georgetown-Einstein) [46], Model M (MD Anderson) [47], Model S (Stanford) [48], and Model W (Wisconsin-Harvard) [49]. Each of these models has its own characteristics, which are described elsewhere [50]; they vary in their structures and assumptions, such as factors conditioning screen detection, individual risk factors or allowing spontaneous regression of ductal carcinoma in-situ (DCIS) lesions [51]. Four studies assessed mammography screening intervals for the U.S. population reporting the median estimates from two to six models [7,21,22,24]. Two studies simulated screening for a Canadian population based on an adaptation of Model W [26,28]. One microsimulation study projected adverse events related to radiation exposure from mammography exams in women 50-74 years of age (Table 2) [21]. One additional study adapted a microsimulation Markov model to the German context to assess annual, biennial, and triennial routine screening in women aged 50-69 [29].
The remaining four modelling studies implemented non-individual models. One transition model evaluated annual versus biennial screening intervals in Japan [23]. One Markov model assessed breast cancer deaths averted and overdiagnosis due to screening for women in the United Kingdom [20], and another study applied the model developed by Preston to estimate radiation related events [25]. We obtained non-publicly available data of a transition modelling study for a Spanish cohort described elsewhere (Table 2) [27,52].

Benefits and harms in women aged 45-49 (Tables 3/4)
Observational studies. A Finnish study suggested an increase in the risk of BC mortality in annual versus triennial screening (incidence RR 1.14; 95%CI 0.59-2.19) although the estimate was very uncertain [39]. The odds of advanced breast cancer stage (IIB-IV) may be higher in women with a history of biennial screening compared to annual screening (OR 1.17; 95%CI 0.93-1.46) among incident breast cancers from US registries [37].
Moreover, indirect evidence from the wider age group of women  suggested that the incidence of interval cancers may be lower among annually screened (0.07%) compared to biennially screened (0.15%) women, but it was very uncertain given the small number of events [35].
Modelling studies. One study estimated, across six microsimulation models, a median of 30 more deaths averted per 100,000 women undergoing annual screening compared to biennial screening in the US population [7], while the median number of additional QALYs gained with annual screening was 480 more compared to biennial screening [7]. In the same modelling study, the overdiagnosis estimation was higher with annual screening compared to biennial screening [7]. One modelling study assessed the risk of radiation induced adverse events in this age group and found that annual screening yielded 14 more induced BC and 2 more deaths per 100,000 screened women compared to biennial screening [21].

Benefits and harms in women aged 50-69 (Tables 3/4)
Randomised clinical trials. Duffy et al. reported in the UKCCR study, over a median of 162 months of follow-up, that annual screening may decrease the risk of BC mortality compared to triennial screening among attenders to the prevalent screening (RR = 0.89, 95% CI 0.73−1.07) [18]. Moreover, there was a small difference in the size of the tumour at diagnosis, with a larger proportion of tumours being 10 mm or smaller in the annual screening group than in the triennial group (25% vs. 19%) [18,19].
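The hedged wording used for these ratio estimates ("may", "very uncertain") reflects confidence intervals that span the null value of 1. A minimal Python sketch of that check, using only the estimates quoted in this section, is given below. The helper name `ci_includes_null` is hypothetical, and this simple test is an illustration only; it is not the GRADE imprecision judgement applied in the review.

```python
def ci_includes_null(lower, upper, null_value=1.0):
    """True when the confidence interval of a ratio measure contains the null value."""
    return lower <= null_value <= upper


# (label, point estimate, lower bound, upper bound), as quoted in the text.
estimates = [
    ("BC mortality, annual vs triennial, Finland [39]", 1.14, 0.59, 2.19),
    ("Advanced stage, biennial vs annual, US registries [37]", 1.17, 0.93, 1.46),
    ("BC mortality, annual vs triennial, UKCCR [18]", 0.89, 0.73, 1.07),
]

for label, point, lower, upper in estimates:
    verdict = "includes 1" if ci_includes_null(lower, upper) else "excludes 1"
    print(f"{label}: {point} ({lower}-{upper}) {verdict}")
```

All three intervals include 1, so none of these comparisons excludes either no difference or an effect in the opposite direction.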
Observational studies. One study in a province of Canada comparing the period before and after mammography screening changed from annual to biennial found there may be little to no difference in mortality (MR 1.06; 95%CI 0.76-1.46) or interval cancer (RR 0.98; 95%CI 0.90-1.06) between the two time periods [31]. Miglioretti
Table 2 (continued): Trentham-Dietz, 2016 [22], United States; Van Ravestein, 2012 [24], United States.
**A previous study by the same authors and using the same model and population was excluded (Mittmann 2015) as the updated study provided a more detailed description of the outcomes.

To review the reference for each study and the reasons for downgrading the certainty of the evidence see Supplementary file Table S4-S12.
*Only the study with the longest time of observation was included when there were several publications with overlapping time periods. When studies provided results stratified by women's characteristics, we extracted data from subgroups more similar to the European context (e.g. white women instead of other ethnic groups).
**We calculated the confidence interval from the raw data reported in the publication as the original interval was not consistent with the main effect and lower interval bound.
***Randomized clinical trial study.
[38]. From a US study using mammography and tumour registries, the 10-year probability of a false positive result was 55.2% (95%CI 54.8-55.7%) with annual screening, 35.4% (95%CI 35.0-35.7%) with biennial screening, and 24.8% (95%CI 24.5-25.2%) with triennial screening [38]. The cumulative 10-year probability of having a false positive biopsy recommendation was 9.7% (95%CI 9.3-10.1%) with annual screening, 5.4% (95%CI 5.2-5.6%) with biennial screening, and 3.7% (95%CI 3.6-3.9%) with triennial screening [38]. These findings were consistent with the risk of false positive results observed in a retrospective cohort of a screening programme of New York [42].
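To make the size of these differences concrete, the cumulative probabilities quoted above can be turned into additional women affected per 1,000 screened over 10 years. The short Python sketch below does only that subtraction. The function name `extra_per_1000` is hypothetical and the calculation uses point estimates only, ignoring the confidence intervals.

```python
# 10-year cumulative probabilities (%) quoted above from the US registry study [38].
cumulative_pct = {
    "false positive result": {"annual": 55.2, "biennial": 35.4, "triennial": 24.8},
    "false positive biopsy recommendation": {"annual": 9.7, "biennial": 5.4, "triennial": 3.7},
}


def extra_per_1000(pct_shorter, pct_longer):
    """Additional women affected per 1,000 screened with the shorter interval."""
    return round((pct_shorter - pct_longer) * 10)


for outcome, pct in cumulative_pct.items():
    annual_vs_biennial = extra_per_1000(pct["annual"], pct["biennial"])
    biennial_vs_triennial = extra_per_1000(pct["biennial"], pct["triennial"])
    print(f"{outcome}: annual vs biennial +{annual_vs_biennial}, "
          f"biennial vs triennial +{biennial_vs_triennial} per 1,000")
```

On these figures, annual screening adds about 198 false positive results and 43 false positive biopsy recommendations per 1,000 women over 10 years compared with biennial screening.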
Modelling studies. In a Canadian modelling study, the number of BC deaths averted per 100,000 women aged 50-69 screened annually, biennially or triennially compared to no screening was 740, 520 and 400, respectively [26]. In another study, including three models tailored to the US population, the number of BC deaths averted per 100,000 screened women aged 50-74, with scattered fibroglandular breast density, was 690, 520 and 400 for annual, biennial and triennial screening [22] and the number of QALYs gained was 6000, 4700 and 3600, respectively [22]. A microsimulation model for the German population found a median of 4400, 3900 and 3330 additional QALYs with annual, biennial and triennial screening [29].
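Because the modelling studies report each interval against no screening, the comparison between two intervals is the difference of those figures. The Python sketch below illustrates this using only the numbers quoted above. The dictionary layout and the function name `incremental` are hypothetical, and the two studies cover different age ranges (50-69 and 50-74), so their rows should not be pooled.

```python
# Events per 100,000 women versus no screening, as quoted in the text.
deaths_averted = {
    "Canada, 50-69 [26]": {"annual": 740, "biennial": 520, "triennial": 400},
    "US, 50-74 [22]": {"annual": 690, "biennial": 520, "triennial": 400},
}
qalys_gained_us = {"annual": 6000, "biennial": 4700, "triennial": 3600}  # [22]


def incremental(estimates, shorter, longer):
    """Additional events with the shorter interval relative to the longer one."""
    return estimates[shorter] - estimates[longer]


for label, est in deaths_averted.items():
    print(label,
          "annual vs biennial:", incremental(est, "annual", "biennial"),
          "biennial vs triennial:", incremental(est, "biennial", "triennial"))

print("QALYs, annual vs biennial:", incremental(qalys_gained_us, "annual", "biennial"))
```

For the Canadian model this gives 220 more deaths averted per 100,000 with annual than with biennial screening, and 120 more with biennial than with triennial screening.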
The estimated overdiagnosis was greater with more frequent screening intervals. In women with scattered fibroglandular density aged 50-74, a microsimulation model study estimated 2900, 2000 and 1600 overdiagnosed cases per 100,000 women for annual, biennial and triennial screening, respectively, compared to no screening [22]. A similar trend was reported in a study using non-individual models for a Spanish cohort of women aged 50-69 [27].
A microsimulation model estimated the risk of radiation induced adverse events in 100,000 women aged 50-74 to be 27 induced BC cases with biennial screening and 49 with annual screening [21]. The simulated number of radiation related deaths was 4 with biennial screening and 7 with annual screening for the same age group [21]. A similar difference between biennial and annual screening intervals was observed from an excess absolute risk model of radiation induced BC [25].
To review the reference for each study and the reasons the certainty of the evidence was downgraded see: Supplementary file Table S4 to S12. When more than one study informed an outcome, the number represents the range of point estimates reported across studies.
*Number of events was not directly reported for this age group. We made an ad-hoc calculation subtracting the events from overlapping age groups (e.g. number of QALYs in women 45 to 69 years minus the estimates from 50 to 69 years).
**The certainty of evidence departed from low as the input parameters that inform the modelling studies were of low to very low certainty.
# Only one study providing unpublished data informed this comparison. The result was in a different direction than the other bodies of evidence and thus cautious interpretation is recommended.
## Unpublished data from one study (Vilaprinyo 2014) reported 19 fewer BC deaths averted with annual compared to biennial screening. This result was inconsistent with the other studies and, therefore, is not included in the table.

Modelling studies. The estimated difference for BC deaths between the different intervals might be small. A microsimulation model estimated the number of BC deaths averted for annual, biennial and triennial screening to be 100, 90 and 80, respectively, compared to no screening per 100,000 screened Canadian women [26]. This result was consistent with the one reported in a non-individual model for a Spanish cohort which showed almost similar benefits for the three screening intervals (unpublished data) [27], and a small number of QALYs gained since life expectancy is lower in this age group.
Only one non-individual based model estimated overdiagnosis for this age group and it showed a small increasing trend with shorter screening intervals from 193 for triennial screening to 269 for annual screening [52].

Risk of bias and certainty of the evidence
Overall, the certainty of the evidence was very low, and therefore the differences observed between the possible combinations of screening intervals and age groups are uncertain. The exception was the evidence from the only RCT included in this systematic review, which was downgraded to moderate certainty due to imprecision [19].
The evidence from observational studies was limited, among other factors, by indirectness: for the age group of 45-49 we only identified studies including a broader age range from 40 to 49 years of age at the time of invitation to screening, and from some studies we had to extract results from specific subgroups of women (e.g. normal weight or white women). All secondary analyses from surveillance registries were also subject to misclassification bias of the interventions, as the periodicity of screening was assigned based on different time ranges that elapsed between the two latest mammographies prior to diagnosis. Additionally, US studies used opportunistic screening, thus women might have anticipated or delayed the mammography due to preferences or indications given by radiologists.
We decided that for modelling studies, our GRADE assessment departed from low certainty after considering methodological limitations of key input evidence (i.e. mammography sensitivity estimated from BCSC registries including women from wider age groups than our clinical question and with a clinical follow-up restricted to only one year [53], or no formal assessment of risk of bias in the individual-patient-data meta-analysis used to inform treatment effectiveness [54]) and that credibility assessment of model development was limited due to suboptimal reporting. There was also limited reporting of formal sensitivity analysis to assess the impact of input data assumptions on the simulated events [21,24,25]. We had concerns about indirectness given that most models used observational data from the US to inform their input parameters (i.e. radiation induced BC), and because in one modelling study data was only available by different levels of breast density (i.e. scattered fibroglandular density) [22]. Finally, one study providing unpublished data (Vilaprinyo 2014) [27] reported fewer BC deaths averted with annual compared to biennial or triennial screening in the age group of 45-49 years. This result was not internally consistent (i.e. annual screening had the largest number of BC deaths averted from 45 to 69) and differed from other studies or bodies of evidence; thus we included this result cautiously only if other studies were not available (Table 4).
The detailed risk of bias assessment per study is available on request. The evidence profiles for all age groups and interval comparisons describing the reasons for downgrading the certainty of evidence are available from Supplementary Tables S5-S13. In the evidence profiles we prioritised the reporting of evidence from observational/randomised studies over modelling studies (i.e. false positive results).

DISCUSSION
Main findings
Our systematic review shows that in women of average breast cancer risk, screening intervals may have different trade-offs between benefits and harms for each age group. However, the available evidence was mostly of very low certainty and precludes us from reaching firm conclusions. In women 50-69 years old, annual compared to biennial screening may have small additional benefits but an important increase in false positive results. The comparison of triennial with biennial screening suggests the latter provides more benefits but also some additional harms. In younger women (45-49), more frequent screening (going from biennial to annual screening) provides smaller incremental benefits (i.e. number of BC deaths averted), nearly similar incremental estimates of overdiagnosis and slightly more incremental harms (i.e. false positive results and false positive biopsy recommendations from observational studies) than in women 50-69 years of age. Thus the overall balance between benefits and harms is more favourable in the latter age group. Finally, among women aged 70-74, the smaller incremental harms and similar benefits with shorter screening intervals suggest that longer intervals probably have a more favourable overall balance, but the difference may be small.
We observed sparse data, especially in older women and for critical outcomes, such as BC mortality or disease stage at diagnosis. The only included RCT showed that annual screening, compared to triennial screening, probably reduces BC mortality in women 50-62 years of age. Observational evidence consisted of population registries from different time periods with high uncertainty. We considered modelling evidence when empirical evidence was not available. However, its certainty was very low due to indirectness, since data for input parameters mostly come from opportunistic screening settings. Modelling studies suggested that in women aged 50-69 the benefits with annual screening may be a bit larger but may also be associated with relevant harms, including the possibility of a small increase of new BC lesions induced by radiation exposure; thus, biennial screening may provide a more favourable balance, while in other age groups the potential benefit gains with more frequent screening intervals may be smaller.

Our results in the context of previous research
Our results are broadly consistent with, but more comprehensive than, previous reviews. The USPSTF based their assessment on one modelling study (included in our review), concluding that when moving from biennial to annual mammography, regardless of the starting age, there is a small increase in averted deaths but with a large increase of harms [7]. A systematic review conducted by the American Cancer Society included an indirect comparison between RCTs and a modelling study from the CISNET collaboration, and concluded that beginning screening with more frequent intervals likely results in a greater mortality reduction but the magnitude is uncertain [55].
The modelling estimates of harms due to overdiagnosis remain a matter of debate as there is no consensus on the methods to quantify this outcome [56], and many assumptions are made, including the clinical impact of DCIS and the probability that some cancers spontaneously regress [50]. It is worth noting that there is also considerable uncertainty in the evidence coming from RCTs. For example, a review including only studies that did not invite women of the control group to screening at the end of the trial period reported a relevant proportion of overdiagnosis [57]. However, the UK age trial showed that the cumulative incidence of invasive cancers was similar, if not higher, in women who underwent only one mammogram after the age of 50 compared to women who underwent annual mammography from 40 to 49, and then entered a triennial screening programme [58].
The cost-effectiveness of implementing different screening intervals has been studied in few microsimulation models. One study assessed the impact of extending the Dutch screening programme in women under 50, showing that biennial strategies were cost-effective while other alternatives, such as annual screening starting at 45, resulted in less favourable incremental cost-effectiveness ratios (ICERs) [59]. However, the study used an 80% adherence to screening [59], which might have influenced the relative trade-offs between different screening intervals, as previously described [22]. In women from the US between 50 and 74 years of age, with different breast densities and individual risk level of developing BC, triennial strategies were considered cost-effective (at a threshold of $100 000 per QALY) for subgroups with average risk and low breast density, while biennial strategies were cost-effective for other breast density subgroups at an average or intermediate risk [22].

Limitations and strengths
Although we included only English language articles, the risk of selection bias is probably small as we also screened previous systematic reviews and consulted the GDG experts, not identifying additional studies. Some results are not directly transferable to the European context; for example the cumulative 10-year false positive rates from US studies are higher than those reported in organised European screening programmes. However, we assumed that the difference between intervals would be more comparable across different settings. The scarce available empirical evidence to evaluate the trade-offs between benefits and harms limited our conclusions. We therefore included modelling evidence to complement the gaps in the evidence, an approach that is recommended for interventions such as population screening [60].

Implications for practice and research
Our findings may have different implications for practice depending on the age group, the balance between benefits and harms, available resources for public health services, and how women value the different outcomes. In the case of women invited to an opportunistic screening programme (or considering screening) a shared decision-making process to carefully explain the pros and cons of each decision is warranted. Similarly, given the low certainty of evidence and the variability and uncertainty of how women value outcomes at stake, guideline panellists are likely to formulate conditional recommendations, as opposed to strong ones. The scope of this review is determined by the European Breast Guidelines screening recommendations [10]; thus, policy makers should note that we did not include modelling estimates for women between the ages of 40 and 44 as screening is not suggested in this age group [10]. Also, readers should be careful when interpreting the effects of screening intervals across the different age groups, as comparisons are limited by the small number of screening rounds in the 45 to 49 and 70 to 74 age groups, compared to the 50-69 age group.
Recommendations about mammography screening intervals will also depend on the magnitude and relative importance of potential harms. Narayan et al. assessed to what extent harms should decrease in order to make a screening interval with an unfavourable balance of benefits and harms acceptable [61]. They found that for annual screening a reduction of 31% in false positive results would be required to support a recommendation in favour of starting at 50, although this was in the context of false positive rates prevailing in the US [61]. Policy makers should probably consider implementing interventions to improve mammography performance, mitigating concerns about potential harms. For example, previous studies suggest that comparing mammograms with prior exams can significantly reduce the recall rate while maintaining the same detection rate [62,63].
Several research priorities were identified during this review, with feedback from the GDG experts, such as the need for: (i) empirical research on the effectiveness of the different screening intervals due to the current very low certainty of evidence; (ii) cost-effectiveness studies using unitary costs from different settings, and in particular for women aged 45 to 49; (iii) assessment of alternative imaging modalities; and (iv) tailored screening according to risk vs population screening. For example, previous research has highlighted that breast density influences both mammography accuracy and risk of developing breast cancer [64,65]. For further information on the complete recommendations formulated in the European Guidelines on Breast Cancer Screening and Diagnosis, please visit the ECIBC website (https://healthcare-quality.jrc.ec.europa.eu/european-breast-cancer-guidelines/screening-ages-and-frequencies).

DISCLAIMER
All views expressed in this article are strictly those of the authors.

DATA AVAILABILITY
All data sources used during this study are described in this published article and its additional information files. The datasets analysed are available from the corresponding author on reasonable request.

discussions generated by this systematic review which led to the different recommendations they developed in the European Guidelines on Breast Cancer Screening and Diagnosis (https://healthcare-quality.jrc.ec.europa.eu/european-breast-cancer-guidelines).

AUTHOR CONTRIBUTIONS
Carlos Canelo-Aybar, Margarita Posso, Nadia Montero, Ivan Sol and Pablo Alonso-Coello were responsible for conducting the systematic review. Carlos Canelo-Aybar and Margarita Posso conducted the search, data extraction, and analysis. Zuleika Saz-Parkinson, Paolo Giorgi Rossi, Stephen W. Duffy, Markus Follmann and Axel Gräwingholt contributed to the definition of the research protocol, and provided comments on the preliminary results of the systematic review. Carlos Canelo-Aybar drafted the first version of the article. All authors contributed to the interpretation and reporting of the results and provided comments on subsequent versions of the article. All authors read and approved the final manuscript prior to submission.

FUNDING INFORMATION
The systematic review was carried out by Iberoamerican Cochrane Collaboration under the Framework contract 443094 for procurement of services between the European Commission's Joint Research Centre and Asociación Colaboración Cochrane Iberoamericana.

ETHICS APPROVAL AND CONSENT TO PARTICIPATE
Not applicable.

CONSENT TO PUBLISH
Not applicable.