Abstract
Single-nucleus analysis allows robust cell-type classification and helps to establish relationships between chromatin accessibility and cell-type-specific gene expression. Here, using samples from 92 women of several genetic ancestries, we developed a comprehensive chromatin accessibility and gene expression atlas of breast tissue. Integrated analysis revealed ten distinct cell types, including three major epithelial subtypes (luminal hormone sensing, luminal adaptive secretory precursor (LASP) and basal-myoepithelial), two endothelial subtypes, two adipocyte subtypes, fibroblasts, T cells and macrophages. In addition to the known cell identity genes FOXA1 (luminal hormone sensing), EHF and ELF5 (LASP), TP63 and KRT14 (basal-myoepithelial), epithelial subtypes displayed several uncharacterized markers and inferred gene regulatory networks. By integrating breast epithelial cell gene expression signatures with spatial transcriptomics, we identified gene expression and signaling differences between lobular and ductal epithelial cells and age-associated changes in signaling networks. LASP cells and fibroblasts showed genetic ancestry-dependent variability. An estrogen receptor-positive subpopulation of LASP cells with an alveolar progenitor cell state was enriched in women of Indigenous American ancestry. Fibroblasts from breast tissues of women of African and European ancestry clustered differently, with accompanying gene expression differences. Collectively, these data provide a vital resource for further exploring genetic ancestry-dependent variability in healthy breast biology.
Data availability
Most of the data are included in the paper. High-throughput data are available through the NCBI database with SuperSeries accession no. GSE244594. In addition, these data are publicly available through the CellXGene database of the Chan Zuckerberg Initiative. Source data are provided with this paper.
Code availability
No unique code was used in the study.
Acknowledgements
We thank the countless women who donated normal breast tissues for research. We also thank the volunteers who facilitated this tissue collection. We offer special thanks to members of the KTB, including J. Henry, E. Nelson, M. Huynh, V. Rodriguez, A. Hughes, P. Rockey and J. Rose von Arx, as well as the Indiana University Simon Comprehensive Cancer Center (IUSCCC) tissue procurement facility for providing tissues and related data. We thank the flow cytometry core of IUSCCC for timely sorting of nuclei. We also thank D. Scoville of NanoString Technologies for processing the GeoMx data. H.N. acknowledges support for the research from the funders, the Catherine Peachey Fund and the Chan Zuckerberg Initiative Human Atlas Project. A.M.S. acknowledges funding from the Susan G. Komen Foundation to support the Susan G. Komen Tissue Bank at IUSCCC. The breast cancer research infrastructure at Indiana University School of Medicine is supported by the Vera Bradley Foundation for Breast Cancer Research.
Author information
Contributions
H.N. conceived and designed the study. P.B.-N., D.C., A.S.K., A.K.A., G.J., P.C.M., H.G., C.E., R.G., F.N., Y.L. and H.N. developed the methodology. P.B.-N., F.N., A.K.A., H.G., L.E. and G.S. acquired the data. P.B.-N., A.S.K., A.K.A., C.E., G.J., F.N., H.G., Y.L., G.S., A.M.S. and H.N. analyzed and interpreted the data. P.B.-N., D.C., R.G., H.G., F.N., A.K.A., A.M.S. and H.N. wrote, reviewed or revised the paper. A.M.S., Y.L. and H.N. provided administrative, technical or material support. H.N. and Y.L. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks Andrey Krokhotin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Sonia Muliyil, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Experimental workflow of single nucleus atlas generation.
The twelve major steps used to create the single-nucleus atlas of breast tissues are shown.
Extended Data Fig. 2 Expression patterns of epithelial subtype identity genes.
Expression patterns of luminal hormone sensing (LHS), LASP and basal-myoepithelial (BM) cell identity genes are shown. These genes have not been previously reported to be expressed in specific subtypes of breast epithelial cells.
Extended Data Fig. 3 DNA binding motif analyses using Signac.
a) DNA binding motifs differentially active in every cell type of the breast are shown. b) Expression patterns of select transcription factors whose DNA binding motifs are enriched in epithelial subtypes. c) DNA binding motifs differentially active in epithelial cell types. d) Footprinting analyses show a lack of Tn5 integration in regions that carry epithelial cell-specific motifs. e) Representative immunohistochemistry images of breast tissues stained with antibodies against ERα (n = 17), FOXA1 (n = 18) and GATA3 (n = 20). The nuclei analyzed in ducts and lobules are marked.
Extended Data Fig. 4 Spatial transcriptomics to determine differences in gene expression between ductal and lobular breast epithelial cells.
a) UMAP showing differences in gene expression patterns between timepoint 1 and timepoint 2. b) Age and BMI of donors at the two timepoints at which tissues were collected for spatial transcriptomics. c) Staining pattern of breast tissues with antibodies against pan-keratin, FABP4 and smooth muscle actin. N = 10. d) Representative regions of interest related to ducts, lobules and adipocytes selected for RNA extraction and sequencing. N = 10. e) Deconvolution of spatial transcriptomics data shows elevated Adi-2, macrophages and Endo-2 at timepoint 2 compared with timepoint 1 in most samples.
Extended Data Fig. 5 Gene expression and signaling differences between epithelial cells of ducts and lobules.
a) Expression patterns of 10 genes differentially expressed between ductal and lobular epithelial cells, assessed using multiome data. b) Differences in signaling pathways between ductal and lobular epithelial cells. Data from all samples were used to generate these networks. c) PTBP1, whose expression in normal breast epithelial cells was reduced at timepoint 2 compared with timepoint 1, is overexpressed in all breast cancer subtypes compared with normal breast. Statistical significance was derived using an unpaired t-test. Samples are biologically independent. (Normal: N = 114, low 55.146, first quartile (Q1) 87.064, median 109.154, third quartile (Q3) 123.208, high 163.066; Luminal: N = 566, low 85.382, Q1 138.168, median 159.404, Q3 180.133, high 242.444; HER2 positive: N = 37, low 105.775, Q1 122.596, median 132.549, Q3 148.043, high 188.8; TNBC basal-like 1: N = 13, low 152.31, Q1 166.83, median 182.37, Q3 206.14, high 220.45; TNBC basal-like 2: N = 11, low 119.54, Q1 161.645, median 179.97, Q3 210.075, high 217.12; TNBC immunomodulatory: N = 20, low 123.85, Q1 139.06, median 155.46, Q3 179.92, high 242.18; TNBC luminal androgen receptor: N = 8, low 123.99, Q1 129.368, median 136.68, Q3 142.92, high 153.48; TNBC mesenchymal stem-like: N = 8, low 96.39, Q1 129.857, median 154.925, Q3 177.99, high 203.06; TNBC mesenchymal: N = 29, low 75.03, Q1 140.838, median 165.18, Q3 200.85, high 260.73; TNBC unspecified: N = 27, low 100.17, Q1 144.165, median 167.22, Q3 193.375, high 264.97).
Extended Data Fig. 6 Age-dependent signaling pathway alterations in ductal and lobular epithelial cells of the breast.
Genes differentially expressed in ductal and lobular epithelial cells at timepoint 2 compared to timepoint 1 from sample #3 were subjected to Ingenuity Pathway Analysis. a) EIF2 signaling pathway enrichment with age. b) Oxidative phosphorylation pathway enrichment with age.
Extended Data Fig. 7 Chromatin accessibility and expression patterns of BM cell-enriched markers.
a) Expression and chromatin accessibility patterns of KRT14 and TP63 across genetic ancestry groups and in BRCA1/2 mutation carriers. b) Signaling pathways uniquely active in alveolar progenitor cells enriched in Indigenous Americans. The legend within the figure details the relationships between molecules of the signaling network.
Extended Data Fig. 8 Genetic ancestry-dependent variability in expression of fibroblast-enriched genes.
a) Differences in expression of fibroblast-enriched genes in breast tissue fibroblasts from women of African ancestry compared with those from women of European ancestry. Fourteen clusters (0–13) are shown in Fig. 5g of the main text. b) Expression levels of genes that classify fibroblasts into four subtypes are also shown.
Extended Data Fig. 9 Relationship between breast epithelial gene signatures derived from this study and gene signatures derived from single-cell analysis of breast tumors.
a) The gene signature of LHS cells overlaps with gene expression modules of LumA, LumB and HER2+ breast cancers, whereas the gene signatures of LASP and BM cells overlap with the cancer cycling and cancer basal gene expression modules, respectively. b) Expression patterns of genes that identify myCAFs, iCAFs, dPVLs and iPVLs among fibroblast subclusters.
Supplementary information
Supplementary Tables 1–9
Supplementary files containing Excel spreadsheets with data on genes differentially expressed in several cell types of breast lobular and ductal epithelial cells and details of breast tissue donors.
Source data
Source Data Fig. 1
Flow cytometry gating strategy for the isolation of nuclei from the breast tissues of different genetic ancestry groups.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bhat-Nakshatri, P., Gao, H., Khatpe, A.S. et al. Single-nucleus chromatin accessibility and transcriptomic map of breast tissues of women of diverse genetic ancestry. Nat Med (2024). https://doi.org/10.1038/s41591-024-03011-9
DOI: https://doi.org/10.1038/s41591-024-03011-9