Introduction

Tamoxifen, a selective-estrogen receptor modulator, is highly effective for breast cancer prevention among high-risk women1,2,3 and significantly reduces the risk of recurrence, second breast cancer, and mortality in the adjuvant setting among women diagnosed with estrogen receptor positive breast cancer4,5,6. Despite the therapeutic benefit of tamoxifen, adherence is low in the clinical setting with approximately 30–60% of women discontinuing tamoxifen before 5 years7,8,9,10,11. Prior studies have consistently found that treatment discontinuation is related, in part, to the development or worsening of vasomotor symptoms (e.g., hot flashes) and joint pain11,12,13,14,15. Therefore, identifying patients with treatment-related symptoms who are most likely to benefit from tamoxifen may focus attention on encouraging specific patients to persist with therapy and enable development of targeted studies that aim to ameliorate side effects.

Mammographic density decline after tamoxifen therapy is a proposed marker of treatment response16,17 as it has been associated with a lower risk of developing breast cancer in the chemopreventive setting3 and reduced risk of recurrence or death from breast cancer in the adjuvant setting16,18,19,20. Limited data further suggests that treatment-associated endocrine symptoms after tamoxifen initiation, including both vasomotor and joint symptoms related to modulation of the estrogen receptor, may be a potential indicator of more efficient metabolism of tamoxifen and therefore a predictor of favorable response21,22,23,24. However, whether treatment-associated endocrine symptoms are associated with reductions in breast density remains unknown.

For this study, we evaluated whether treatment-associated endocrine symptoms were associated with reductions in breast density among women undergoing tamoxifen therapy for clinical indication in the Ultrasound Study of Tamoxifen. We used a novel ultrasound tomography (UST) scanner to assess whole breast sound speed, a highly accurate and reliable estimate of volumetric breast density, repeatedly over time while avoiding ionizing radiation25,26,27. UST has received FDA approval for clinical use and was recently approved for breast cancer screening to enhance mammography among women with dense breasts17,28. We have previously shown rapid declines in density after tamoxifen initiation using UST scans and therefore UST may have utility to capture early responses to tamoxifen efficacy17. We further incorporated measures of circulating tamoxifen metabolites and sequenced CYP2D6 to characterize tamoxifen metabolizer status. This 12-month longitudinal study provides a unique opportunity to examine relationships between serial measures of UST whole breast sound speed and repeated assessments of endocrine symptoms within the first year of tamoxifen initiation.

Results

Baseline characteristics

Participant characteristics are described in Table 1. At baseline, the mean age of participants was 51.4 years (SD = 9.0) and 66% of the women were premenopausal. Women who reported endocrine symptoms at baseline were slightly older, more likely to be Black women, and had higher body mass index (BMI) than those who did not report endocrine symptoms. Among those where CYP2D6 metabolizer status was determined, 60% (n = 36) were efficient, 30% (n = 18) were intermediate, and 7% (n = 4) were poor metabolizers.

Table 1 Characteristics of women treated with tamoxifen in the Ultrasound Study of Tamoxifen.

Trajectories of breast sound speed change by treatment-associated symptoms

Overall, women with treatment-associated endocrine symptoms had patterns of greater decline in sound speed compared to those who did not report treatment-associated symptoms after tamoxifen initiation (Fig. 1). Trajectories were similar when extreme values of sound speed change were excluded (see Supplementary Figure 1) and did not appear to largely differ for women with treatment-associated symptoms who reported high symptom frequency or low/moderate symptom frequency (see Supplementary Figure 2).

Fig. 1: Trajectories of change in breast sound speed (m/s) by treatment-associated symptoms.
figure 1

a No treatment-associated symptoms. b Treatment-associated symptoms. Treatment-associated endocrine symptoms were defined as either emergent or worsening vasomotor symptoms and/or joint pain after tamoxifen initiation. Loess curve was fit between change in breast sound speed and time from tamoxifen initiation. Extreme values for change in breast sound speed (m/s) are indicated with symbols. Each symbol represents an observation from the same women. The caret represents a woman with an extreme value of breast sound speed increase and who reported weight loss from T0 to T2 of approximately 20 lbs. and T0 to T3 of 25 lbs. The plus, asterisk, and angle bracket symbols represent three separate women with an extreme value of breast sound speed decline and cessation of menstrual periods by T3.

Change in breast sound speed by treatment-associated endocrine symptoms

In adjusted models, change in sound speed over time significantly differed by treatment-associated endocrine symptoms (p-interaction = 0.02) (Fig. 2a; see Supplementary Table 1). Statistically significant increasing declines in breast sound speed were observed among women with treatment-associated endocrine symptoms (mean change [95% CI] at T1: −0.26 m/s [−2.17, 1.65]; T2: −2.12 m/s [−4.02, −0.22]; T3: −3.73 m/s [−5.82, −1.63]; p-trend = 0.004), but not among women without symptoms (p-trend = 0.18). At the last UST scan, women who had experienced treatment-associated endocrine symptoms had a 5-fold higher odds of a decline of ≥2 m/s in sound speed compared to those without symptoms (OR = 5.23, 95% CI = 1.03–26.51) (see Supplementary Table 2).

Fig. 2: Multivariable-adjusted mean change in breast sound speed (m/s) by treatment-associated endocrine symptoms and endocrine symptom frequency.
figure 2

a By treatment-associated endocrine symptoms. Treatment-associated endocrine symptoms were defined as either emergent or worsening vasomotor symptoms and/or joint pain after tamoxifen initiation. P-trend for those with symptoms = 0.004, P-trend for those without symptoms = 0.18; P-interaction between endocrine symptom group and time = 0.02. b By endocrine symptom frequency. Endocrine symptom frequency was assessed with a Likert scale for symptom frequency per week (vasomotor and joint pain) and per day (vasomotor only). An endocrine symptom severity score was created by summing the values for each item and categorized into tertiles defined as no symptoms (score = 0), low/moderate (score = 1–5), and high (score=6+). P-trend for those with no symptoms = 0.91, low/moderate symptoms = 0.03, high symptoms = 0.004. P-interaction between endocrine symptom frequency and time = 0.61. All models were adjusted for age, race, menopause status, body mass index, and baseline sound speed. P-trends were calculated by modeling the time interval from tamoxifen initiation (T1, T2, T3) as a continuous variable using the Wald test. P-interactions were evaluated with multiplicative interaction terms using the Wald test. Error bars represent 95% confidence intervals.

Change in breast sound speed by endocrine symptom frequency

Statistically significant declines in sound speed over time were greater in women with high symptom frequency (mean change [95% CI] at T1: −1.21 m/s [−3.14, 0.71]; T2: −2.77 m/s [−4.93, −0.62]; T3: −4.37 m/s [−6.87, −1.87]; p-trend = 0.004) than low/moderate symptom frequency (mean change [95% CI] at T1: −0.54 m/s [−2.72, 1.64]; T2: −1.78 m/s [−4.14, 0.58]; T3: −2.46 m/s [−5.06, 0.15]; p-trend = 0.03) (Fig. 2b, see Supplementary Table 3). No decline in sound speed was observed among women with no endocrine symptoms (p-trend = 0.91), but the test for interaction was not statistically significant (p-interaction = 0.61).

Sensitivity and exploratory analyses

Results remained similar in analyses 1) restricted to premenopausal women (n = 49), 2) excluding women who discontinued tamoxifen or had undetectable levels of tamoxifen metabolites (n = 65; 9 women excluded), 3) adjusted for antidepressant use, or 4) adjusted for an indication for tamoxifen therapy (see Supplementary Tables 45). Among women with available CYP2D6 status (n = 60), we found similar associations for change in breast sound speed by treatment-associated endocrine symptoms both before and after excluding the small number of women with poor CYP2D6 metabolizer status (see Supplementary Table 4-5). As anticipated, poor or intermediate CYP2D6 metabolizers had lower levels of endoxifen (median (range) for poor = 3.02 (2.29-4.16) ng/ml; intermediate = 6.03 (0.62-24.60) ng/ml) compared with efficient CYP2D6 metabolizers (9.59 (0.25–32.4) ng/ml) (see Supplementary Table 6).

Discussion

In this longitudinal study of patients prescribed tamoxifen for clinical indications, we found that women with treatment-associated endocrine symptoms after tamoxifen initiation experienced statistically significant declines in breast density within the first year of therapy, and this decline was stronger in women with more symptoms. Statistically significant declines in breast density were not observed among women without symptoms despite similar tamoxifen adherence. Associations remained among premenopausal women, a group in which tamoxifen is generally clinically indicated as first-line adjuvant endocrine therapy. These findings suggest that emergent/worsening symptoms after tamoxifen initiation may be a potential marker of treatment response.

To our knowledge, no prior study has examined treatment-associated endocrine symptoms and change in breast sound speed among women undergoing tamoxifen therapy. Previous studies have examined endocrine symptoms and breast cancer outcomes in women using tamoxifen therapy with positive results21,22. The Arimidex, Tamoxifen, Alone or in Combination (ATAC) trial found that postmenopausal breast cancer patients with endocrine symptoms at 3-months after tamoxifen initiation, including both vasomotor or joint symptoms, had a lower risk of breast cancer recurrence compared to those without symptoms21. Results were similar among women treated with aromatase inhibitors. Data from the Women’s Healthy Eating and Living (WHEL) randomized trial also indicated a lower risk of breast cancer recurrence among women treated with tamoxifen therapy who reported hot flashes than those who did not report hot flashes22. Additional studies examining both tamoxifen and aromatase inhibitors have also found that either vasomotor, musculoskeletal, or joint symptoms may be associated with improved breast cancer outcomes23,24 but results have been mixed29,30. Notably, prior studies on endocrine symptoms and improved breast cancer outcomes have been primarily based on clinical trial data and among postmenopausal women in the adjuvant setting. Together with the results from these trials, our findings suggest that treatment-associated endocrine symptoms may be a marker of treatment response. However, whether these symptoms and declines in breast density are associated with a lower risk of breast cancer in the prevention setting or recurrence, second breast cancer, and mortality in the adjuvant or neo-adjuvant setting, remains to be examined. An association between endocrine symptoms plus breast density, if confirmed, may have important clinical implications given that estrogen receptor-positive breast cancer incidence is rising31, particularly among younger women who are generally not eligible to take aromatase inhibitors unless they are at high-risk of recurrence, and in which situation, ovarian suppression or oophorectomy might be clinically indicated32,33.

The potential biological mechanisms underlying the association between endocrine symptoms and sound speed change are unclear. However, estrogen suppression plays an important pathophysiological role in the development of side effects, such as vasomotor and joint symptoms, and is related to declines in breast density. It is possible that women who do not efficiently metabolize tamoxifen have lower circulating levels of active tamoxifen metabolites34, are subsequently less likely to have endocrine symptoms35,36,37, and may have less decline in breast density38. In exploratory analyses, we found that women with poor or intermediate CYP2D6 metabolizer status had lower levels of endoxifen compared with those with efficient CYP2D6 status. Importantly, CYP2D6 is responsible for the hydroxylation of N-desmethyltamoxifen to endoxifen, an active metabolite via which tamoxifen exerts the majority of its effects39,40. Although the distribution of CYP2D6 metabolizer status was as expected in our study, limited sample size precluded a comprehensive evaluation of potential biological mechanisms. In addition, although density decline with tamoxifen therapy likely reflects decreases in stroma, it may also be influenced by other factors including BMI, weight change, and the menopausal transition41,42,43, which we accounted for in our analyses. Additional work is needed to understand the underlying mechanisms between tamoxifen metabolism, endocrine symptoms, and potential variation in breast density decline. Whether density decline is an intermediate endpoint and causally related to tamoxifen effectiveness or rather a marker of tamoxifen effectiveness still needs to be determined16.

Strengths of this study include longitudinal data on both self-reported endocrine symptoms and quantitative measures of whole breast UST sound speed within the first year of tamoxifen treatment. The use of nonionizing 3-D imaging with whole breast UST scans also provided the opportunity to assess the sound speed at short intervals during the first year of treatment in both high-risk women and breast cancer patients27. Our study further included information on both circulating tamoxifen metabolites, to reduce potential misclassification of tamoxifen therapy and as a proxy for tamoxifen adherence, and CYP2D6 metabolizer status, to examine potential differences in tamoxifen metabolism. In addition, we were able to account for potential confounding by age, race, menopausal status, and BMI. Limitations of this study included the modest sample size, resulting in limited statistical power, particularly for stratified analyses. Further larger studies are needed to disaggregate results by race and ethnicity. In addition, repeated measures were only collected within the first year of tamoxifen treatment. However, prior studies have shown that the largest declines in breast density occur within the first 12–18 months of tamoxifen therapy16,44.

In conclusion, our study suggests that treatment-associated endocrine symptoms may be associated with declines in breast density and may be a marker of treatment response. Further studies are needed to confirm whether treatment-associated endocrine symptoms and breast density declines are associated with improved breast cancer outcomes. If treatment-associated symptoms are shown to be a surrogate for treatment response, this may be useful for patients and providers to improve and maintain tamoxifen adherence.

Methods

Study population

The Ultrasound Study of Tamoxifen enrolled 82 women prescribed tamoxifen for clinical indications and a comparison group of 165 women with screen-negative mammograms and no history of breast cancer at the Karmanos Cancer Institute (KCI) and Henry Ford Health Systems (HFHS) in Detroit, Michigan, USA between 2011–201425,27,45. Women included in the study were aged 30–70 years, weighed ≤350 pounds, had breast size <22 cm in diameter, had no breast implants, reduction mammoplasty, or active skin infections/open chest wounds, and were not currently pregnant, breastfeeding, or taking endogenous hormones at study enrollment (i.e., oral contraceptives and menopausal hormone therapy)17. For the present analysis, we restricted our analytic population to the 82 women prescribed tamoxifen for clinical indications, which included women diagnosed with breast cancer after a routine screening (n = 31 in situ; n = 34 invasive) and high-risk patients referred for chemoprevention (based on personal breast cancer risk) (n = 17). Among these 82 women, we further excluded those who self-reported never starting tamoxifen (n = 6), those without a baseline UST scan (n = 1) and those without at least one follow-up UST scan (n = 1). Thus, 74 women were included in our final analytic population. All participants provided written informed consent and the study protocol was approved by the Institutional Review Boards at KCI, HFHS, and National Cancer Institute (NCI).

Assessment of breast sound speed

Volume averaged sound speed (m/s), a surrogate measure of volumetric breast density45,46, was ascertained with whole breast UST scans conducted at baseline prior to tamoxifen initiation (T0) and approximately 1–3 months (T1), 4–6 months (T2), and 12-months (T3) after tamoxifen initiation (Fig. 3). Women with breast cancer had UST scans conducted on the contralateral unaffected breast. Methods to measure sound speed with UST scans have been previously described17,25,27. Briefly, all UST scans were conducted with an earlier prototype of the SoftVue system (manufactured by Delphinus Medical Technologies, Inc., Novi, MI, USA), which has been cleared by the FDA for clinical use and recently approved for breast cancer screening as an adjunct to mammography in women with dense breasts25,27,28,45. Volume average sound speed was estimated by averaging the sound speed of the breast across the tomographic DICOM image stacks (approximately 40–100 coronal image slices beginning from the chest wall to the nipple per UST scan). To assess change in sound speed between UST scans, we restricted image files to a common volume contained within all scans25. Sound speed was calculated from tomograms included in the common volume using automated scripts and then averaged together as a proxy for breast density17. UST sound speed has been shown to be a highly accurate and reliable estimate of volumetric breast density (intraclass correlation coefficient, ICC = 93.4%) and change in density (ICC = 70.4%)25, and may be more strongly associated with breast cancer risk than mammographic percent density27.

Fig. 3: Study design for ultrasound tomography scans and endocrine symptom assessment in the Ultrasound Study of Tamoxifen.
figure 3

Ultrasound tomography scans and endocrine symptoms were ascertained prior to tamoxifen initiation (T0) and post-tamoxifen initiation (T1–T3).

Ascertainment of endocrine symptoms

Endocrine symptoms were self-reported at each UST scan prior to tamoxifen initiation (T0) and at approximately 1–3 months (T1), 4–6 months (T2), and 12 months (T3) post-tamoxifen initiation (Fig. 3). Symptoms were ascertained with a questionnaire administered by a research nurse (see Supplementary Methods for questionnaire details on endocrine symptoms). Endocrine symptoms included vasomotor and joint symptoms as these both relate to effects on endogenous estrogen levels21. Vasomotor symptoms were defined as hot flashes or flushes, and joint symptoms included stiffness or soreness in bone joints. Symptom frequency was defined as symptom occurrence in days per week and times per day (vasomotor symptoms only).

We examined endocrine symptoms with two approaches. Our primary approach was to examine treatment-associated endocrine symptoms which included emergent or worsening vasomotor and/or joint symptoms (yes, no). Endocrine symptoms were treated as time-dependent and updated at each UST scan. A woman was classified as having 1) emergent symptoms if she did not report any endocrine symptoms prior to that time or 2) increasing symptoms if she had reported them previously but symptoms had increased in frequency. Once a woman was classified as having either emergent or increasing symptoms, this classification was carried forward for the remainder of the analysis.

We additionally examined severity of endocrine symptoms based on how many times women experienced symptoms per day or week at each UST scan. We created a symptom frequency score by assigning a Likert scale with questions that assessed 1) how many days a woman experienced vasomotor and/or joint symptoms and 2) how many times per day a woman usually experienced hot flashes or flushes within the past two weeks. These responses were combined into the following scores for symptom occurrence: 0 = no symptoms; 1 = 1–5 days; 2 = 6–8 days; 3 = 9–13 days; 4 = every day, and summed together with the number of times women experienced hot flashes or flushes per day. The distribution of the resultant endocrine symptom frequency score was right-skewed, therefore the variable was categorized into tertiles of no symptoms (0), low-moderate symptoms (1–5), and high symptom frequency (6+).

Genotyping and CYP2D6 phenotype assignment

DNA extraction

Saliva samples were collected from all participants at baseline using Oragene DISCOVER collection vials (DNAGenotek, Ontario, Canada) and received at ambient temperature. Source material aliquots of 850 uL were transferred into a new tube and aliquots were incubated at 50 °C for 60 min in a water bath. DNA was extracted using QIAsymphony DSP Virus/Pathogen Midi Kit on a QIAsymphony automated extraction instrument (Qiagen) following a customized version of the Complex800 CR2386 ID519 V2 protocol. Extracted DNA was quantified utilizing QuantiFluor dsDNA System (Promega Corporation, Madison, WI). Among our analytic population with saliva samples (n = 74), 60 participants had sufficient DNA that generated data for repeat amplification and sequencing.

CYP2D6 sequencing

CYP2D6 was sequenced since tamoxifen is metabolized by the CYP2D6 enzyme to 4-hydroxy-tamoxifen and endoxifen, and women with low enzyme activity may have less benefit from tamoxifen and be less likely to experience endocrine symptoms22,37,47. Sequencing was conducted at the NCI Cancer Genomics Research Laboratory, using a targeted long amplicon approach on the PacBio single molecule real-time (SMRT) sequencing platform. This approach facilitated accurate variant calling and direct haplotyping of the entire gene-locus, and produced reliable CYP2D6 diplotypes. Methods were performed as described in the protocol from the manufacturer with some optimization to the PCR reactions and cycling parameters48.

A two-step PCR approach was used to generate amplicons with expected size of 6591 bp that spanned the CYP2D6 gene. Amplicon coordinates in GRCh38 were chr22:42126037-42132625. During the first PCR, template-specific primers with universal priming site tags were used, where the template-specific primers span the CYP2D6 gene. Primer sequences for the first PCR were /5AmMC6/GCAGTCGAACATGTAGCTGACTCAGGTCAC | ATGGCAGCTGCCATACAATCCACCTG (forward) and /5AmMC6/TGGATCACTTGTGCAAGCATCACATCGTAG | CGACTGAGCCCTGGGAGGTAGGTAG (reverse). During the second PCR, the first portion of each primer was used as a universal priming site to build in barcode sequences to allow for pooling after amplification. The template-specific portion of each primer was obtained from the research article by Buermans and colleagues49. Primers were ordered from Integrated DNA Technologies (IDT) and high-performance liquid chromatography (HPLC) purified.

The first PCR reaction included the following: 2ul(5 ng/ul) genomic DNA, 15 ul water, 2.5ul LA PCR Buffer II, 4ul 2.5 mM dNTPs, 0.5 ul each of 100uM primer(F/R), and 0.5ul Takara LA Taq DNA Polymerase HS. The cycling conditions were 1 cycle of 94 °C for 1 min, followed by 30 cycles of 98 °C for 10 seconds and 68 °C for 6 min, followed by a final extension at 72 °C for 10 min, and a hold at 4 °C. The second PCR utilized Barcoded Universal Primers (purchased from PacBio), which primed the universal priming site incorporated during the first PCR for amplification, and incorporated a unique barcode sequence for each sample to allow for pooling after amplification. The second PCR reaction included the following: 2ul first round PCR product (normalized to 1 ng/ul), 14 ul water, 2.5ul LA Taq PCR Buffer II, 4ul 2.5 mM dNTPs, 2ul each of 2uM Barcoded Universal Primers, and 0.5 ul Takara LA Taq DNA Polymerase HS. The cycling conditions were identical to the first round of PCR with the exception that only 10 cycles of annealing and extension were performed, instead of 30. All samples were pooled for SMRTbell library prep and sequencing on a single SMRT Cell 1 M on the PacBio Sequel instrument, according to the PacBio Protocol “Preparing SMRTbell Libraries Using Barcoded Universal Primers for Multiplexing Amplicons”, version 248.

Circular Consensus Sequence (CCS) reads were generated using Circular Consensus Sequencing, demultiplexed using Lima, and mapped to GRCh38 using pbmm2. These steps were performed in SMRT Link 8.0. Mapped bams were inputted to Google Deep Variant and GATK Haplotype Caller, and variants were called. Variants that were called by both callers, and fell within the targeted amplicon region, were inputted to WhatsHap to phase variants by allele. An internal tool was developed to compare the variant call format (VCF) generated by WhatsHap to the PharmGKB CYP2D6 Allele Definition Table34,50 and to assign a diplotype. Additional manual review of the WhatsHap VCF identified some instances of copy number changes based on read depth ratios of heterozygous variants. Diplotypes that could not be called based on the results of the assay included copy number changes with only one type of allele (no heterozygous variants), as well as some types of hybrids, since our primers were anchored in CYP2D6. CYP2D6/CYP2D7 hybrids, which did not contain both of our primer sequences were not amplified and observed in the assay.

CYP2D6 phenotype

A CYP2D6 activity score was assigned to each allele and summed for an overall activity score34,51. Corresponding phenotypes for the diplotypes were ascertained from the PharmGKB CYP2D6 Diplotype-Phenotype Table34,50. CYP2D6 phenotypes were classified as poor (activity score: 0), intermediate (activity score: >0–1.0), efficient (activity score: >1.0–2.0), or ultra-metabolizers (activity score: >2.0)34,50.

Tamoxifen metabolites

Whole blood samples were collected 12 months after tamoxifen initiation. Tamoxifen metabolites were quantified in serum samples using liquid chromatography-tandem mass spectrometry (LC-MS/MS) at the Illinois Institute of Technology Research Institute52,53. Metabolites included: (Z)-tamoxifen, (Z)-n-desmethyl-tamoxifen, (Z)−4-OH-tamoxifen, and (Z)-endoxifen. Coefficients of variation (CVs) were <5% for all metabolites measured except for (Z)−4-OH-tamoxifen which was below the lower limit of detection in 7 participants and had a CV of 21%.

Covariates

Information on demographics (e.g., age, race/ethnicity, education level), reproductive factors (e.g., menopausal status, type of menopause, date of menopause, menopausal hormone therapy) and regular medication use (e.g., antidepressants) were collected at each UST scan with questionnaires administered by the research nurse. Race (white, Black, other) and ethnicity (Hispanic/Latina, Non-Hispanic/Latina) were self-reported by the participants. Information on medication use included type, dose, frequency, duration, and indication. Regular medication use was defined as taken regularly at least 2 times per week in the last month. Height was measured at baseline and weight was ascertained at each scan to calculate BMI in kg/m2.

Statistical analysis

Baseline characteristics were described overall and compared by baseline endocrine symptom status using t-tests or Wilcoxon rank-sum tests for continuous variables and Fisher exact tests for categorical variables. Absolute mean change in sound speed was calculated at each UST scan as the difference from baseline (T0). The distribution of mean change in sound speed was approximately normally distributed. Trajectories of sound speed change were examined by treatment-associated endocrine symptoms. We used linear regression to examine the mean change in sound speed over time by endocrine symptoms. Variances were estimated using generalized estimating equations to account for within-subject correlations over time54. Multivariable models were adjusted for a priori specified covariates including baseline age (continuous), race (white, Black/other), baseline menopausal status (premenopausal, postmenopausal), BMI (continuous), and sound speed prior to tamoxifen initiation (tertiles). BMI was treated as a time-varying variable in all models since it was a strong confounder. As change in menopausal status may have also been a strong confounder, in sensitivity analyses we excluded 4 women who changed from pre- to postmenopausal during follow-up (surgical or natural defined as ≥12 months since last menstrual period); however, this did not alter our results and therefore these women were retained in our analyses. We tested for linear trend by modeling the time interval from tamoxifen initiation (T1, T2, T3) as a continuous variable using the Wald test. To determine if sound speed change over time differed by endocrine symptoms, multiplicative interaction terms were evaluated using a Wald test. In a post-hoc analysis, we used logistic regression to estimate odds ratios (ORs) for the association between treatment-associated endocrine symptoms at each UST scan and a predetermined decline of ≥2 m/s in sound speed at the last UST scan (T3 minus T0). A decline of ≥2 m/s in sound speed was selected as a potentially clinically meaningful cut point based on results from our primary analysis and prior study17.

We conducted additional sensitivity analyses including 1) restricting analyses to premenopausal women (n = 49), 2) excluding women who discontinued tamoxifen or had undetectable levels of tamoxifen metabolites (as a proxy for tamoxifen adherence) (n = 9), 3) adjusting for antidepressant use since it may alter tamoxifen metabolism55, and 4) adjusting for tamoxifen indication (high-risk, in situ breast cancer, invasive breast cancer with/without chemotherapy).

In the subgroup with CYP2D6 status (n = 60), we further examined change in sound speed over time by endocrine symptoms excluding women with poor CYP2D6 metabolizer status (n = 4). In exploratory analyses, we also examined the distribution of tamoxifen metabolites by CYP2D6 status, with particular emphasis on Z-endoxifen, the most potent metabolite synthesized by CYP2D6.

Analyses were conducted using SAS (version 9.4). Figures were made in R (version 4.2.0, Boston, MA) with the ggplot package and loess smoother, and Stata (version 16.0, College Station, TX). Two-sided p-values <0.05 were considered statistically significant.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.