Mapping proteomic composition at distinct genomic loci in living cells has been a long-standing challenge. Here we report that dCas9–APEX2 biotinylation at genomic elements by restricted spatial tagging (C-BERST) allows the rapid, unbiased mapping of proteomes near defined genomic loci, as demonstrated for telomeres and centromeres. C-BERST enables the high-throughput identification of proteins associated with specific sequences, thereby facilitating annotation of these factors and their roles.
Chromosome organization is being defined at ever-increasing resolution with chromosome-conformation-capture-based methods1. Genome organization can also be analyzed in live cells via imaging, especially through the use of fluorescent protein fusions to nuclease-dead Streptococcus pyogenes Cas9 (dSpyCas9), which can be directed to nearly any genomic region by single guide RNAs (sgRNAs)2. It has proven more difficult to map subnuclear proteomes onto 3D genome landscapes in a comprehensive manner that does not require demanding fractionation protocols, specific DNA-associated protein fusions (as used in proximity-dependent biotin identification (BioID3)), or validated antibodies. dSpyCas9 has been combined with biotin ligases in approaches such as CasID4 and CAPTURE5 to allow the isolation of proteins associated with specific genomic regions in living cells. However, the efficiencies of these approaches usually necessitate long labeling times (hours), which limits the time resolution of dynamic processes.
Engineered ascorbate peroxidase (APEX2) has been used in an alternative live-cell biotinylation strategy called spatially restricted enzymatic tagging6, 7. In this approach, APEX2 is fused to a localized protein of interest, and cells are then treated with biotin–phenol (BP) and H2O2, which generates a localized (~20-nm radius) burst of diffusible but rapidly quenched biotin–phenoxyl radicals. These products react with electron-rich amino acid side chains (e.g., Tyr), leading to covalent biotinylation of nearby proteins and allowing for identification by streptavidin selection and liquid chromatography–tandem mass spectrometry (LC-MS/MS). Notably, this method is extremely efficient (1 min of H2O2 treatment). In part on the basis of the successful use of dSpyCas9–fluorescent protein fusions in imaging, we reasoned that a dSpyCas9 derivative that emits radicals rather than photons could be used for rapid (1 min) subnuclear proteomics (Fig. 1a).
To develop and validate this method, we used telomeres for benchmarking and proof-of-concept experiments, because they are associated with a well-defined suite of proteins and can be targeted by a well-established sgRNA (sgTelo)8, 9 in human U2OS cells. U2OS cells rely on ‘alternative lengthening of telomeres’ (ALT) pathways to maintain telomere length without telomerase activation10. We transduced U2OS cells with a lentiviral vector expressing dSpyCas9 under the control of a Tet-On cytomegalovirus promoter and fused to nuclear localization signals, a ligand-tunable degradation domain11, mCherry, and APEX2 (Supplementary Fig. 1a). This combination allows one to control dSpyCas9–mCherry–APEX2 protein levels in a sample (Supplementary Fig. 1b,c). We then transduced mCherry-positive cells with a lentiviral vector that encodes an sgRNA, as well as a blue fluorescent protein (BFP) construct that also expresses the TetR repressor (Supplementary Fig. 1a). One sgRNA construct encoded sgTelo, and the other encoded a nonspecific sgRNA (sgNS) whose target sequence is absent from the human genome12. We confirmed that sorted cells (Supplementary Fig. 1d and Supplementary Note 1) showed telomeric mCherry foci (Supplementary Fig. 2), as well as telomeric biotinylation activity (Supplementary Fig. 3a and Supplementary Note 1), in the presence of sgTelo. Analysis by chromatin immunoprecipitation followed by sequencing (ChIP-seq) of dCas9–mCherry–APEX2 confirmed sgTelo-directed localization to (TTAGGG)n repeats (Supplementary Fig. 3b and Supplementary Note 1). These observations indicate that sgTelo-guided dCas9–mCherry–APEX2 targets telomeres and enables restricted biotinylation of endogenous proteins.
For proteomic analysis, we used BP and H2O2 to induce APEX2 biotinylation in cells carrying either sgTelo or sgNS, and we also included an sgTelo control in which we omitted H2O2 treatment. Biotinylation of nucleoplasmic proteins in the sgNS sample served as a reference, permitting assessment of the telomere specificity of labeling with sgTelo. After nuclei isolation and streptavidin affinity purification, we carried out western blotting and silver staining analyses of proteins to confirm dCas9–mCherry–APEX2 expression, biotinylation, and enrichment of biotinylated proteins (Supplementary Fig. 4a–c). We analyzed streptavidin-selected proteins by LC-MS/MS. Using label-free intensity-based absolute quantification values to measure enrichment in the sgTelo sample relative to protein amounts in the sgNS sample (Benjamini–Hochberg-adjusted P < 0.05 and log2 fold change (FC) ≥ 2.0), we found 30 out of 143 proteins that have been reported to be telomere-associated or otherwise linked to telomere function (Supplementary Fig. 4d, Supplementary Tables 1 and 2, and Supplementary Note 2). These results indicate that validated telomeric proteins can be identified rapidly and efficiently by C-BERST.
To further improve our assessment of differential C-BERST biotinylation, we used a more quantitative approach enabled by stable isotope labeling with amino acids in cell culture (SILAC). We cultured telomere-targeted cells in heavy-isotope medium, sgNS cells in medium-isotope medium, and untransduced U2OS cells in light-isotope medium (Supplementary Fig. 5a,b). We carried out biotinylation and purification as described above, except that we mixed equal amounts of protein lysates from heavy, medium, and light samples before streptavidin purification for three-state SILAC6. We identified 913 proteins in both the heavy and the medium samples, 885 of which were also detectable in the light sample. Using significance (Benjamini–Hochberg-adjusted P < 0.01) and enrichment (log2 FC ≥ 2.5) cutoffs that were even more stringent than those used for label-free quantification, we identified 55 proteins that were strongly enriched in the sgTelo sample relative to their amounts in the sgNS sample (heavy/medium (H/M)) (Fig. 1b and Supplementary Table 3). Among these 55 hits, 34 are known telomere-associated factors, including all 6 shelterin components, as well as subunits from 5 other complexes that contribute to ALT-associated pathways or processes (Supplementary Fig. 6a). Of the 55 H/M-enriched proteins, 54 were also strongly enriched (log2 FC ≥ 1) in heavy/light (H/L) ratio, which indicates that background detection in the absence of dCas9–mCherry–APEX2 biotinylation was minimal. Gene Ontology (GO) analysis of the 55 H/M-enriched C-BERST hits showed strong functional associations with terms such as “telomere maintenance,” “homologous recombination,” and “DNA repair,” all of which are important for ALT pathways10 (Supplementary Fig. 6b).
Telomere-associated proteomes from ALT+ cell lines have been defined previously by TERF1-BirA* BioID13 and proteomics of isolated chromatin segments (PICh)14. Over 50% of proteins identified by C-BERST (Supplementary Fig. 7a) were also detected by one or both of the other methods. The remaining 23 proteins that were uniquely detected by C-BERST included 7 known telomeric/ALT factors. Of the 18 proteins detected by all three approaches, 17 are known telomere-related factors. The remaining consensus hit (SLX4IP) was not previously validated as telomeric, but its identification by all three proteomic approaches strongly suggests that it has an unrecognized role in telomere function or maintenance (Supplementary Fig. 7b). We used independent methods to confirm the telomere colocalization of SLX4IP, as well as a factor (RPA3) that was detected by C-BERST but missed by BioID and PICh (Fig. 1c, Supplementary Fig. 8, and Supplementary Note 3).
We extended C-BERST to centromeric α-satellite arrays in U2OS cells (Supplementary Fig. 9a), using a similar pipeline. The human α-satellite proteome from K562 cells was analyzed previously by the PICh-related protocol HyCCAPP (hybridization capture of chromatin-associated proteins for proteomics)15, which allowed for comparison. We first confirmed dCas9–mCherry–APEX2 inducible expression, specific centromere targeting16, and biotinylation (Supplementary Fig. 9b–d), and then used SILAC to identify 1,268 proteins (Supplementary Table 4) from each of two biological replicates. Among these 1,268 proteins, 460 were enriched to a statistically significant extent (log2 FC ≥ 2.5 and P < 0.01) in sgAlpha samples compared with amounts in sgNS samples (H/M) (Fig. 2a). We identified subunits of the CENP-A nucleosome-associated complex17, CENP-A distal complex17, CENP-A loading factors18, chromosome passenger complex19, and other known centromere-associated proteins. We found that 31 enriched proteins overlapped between C-BERST in U2OS cells and HyCCAPP in K562 cells (Supplementary Fig. 10a). C-BERST uniquely captured several centromeric factors, including CENP-F and ATR, which were recently reported to engage RPA-coated centromeric R loops20. GO analysis of the 460 C-BERST centromeric hits revealed strong functional associations with terms related to centromere maintenance or function10 (Supplementary Fig. 10b).
Our generation of both telomeric and centromeric C-BERST datasets afforded us the opportunity to compare protein enrichment at these two landmarks. We identified 36 proteins in both datasets (Fig. 2b and Supplementary Note 4). Significant GO terms for these 36 overlapping proteins included “DNA replication” and others that would be expected for both chromosomal elements. All CENP factors were found among the 424 nonoverlapping proteins from the sgAlpha dataset. Conversely, the 19 telomere-specific hits included five of the six shelterin subunits. These results provide strong evidence that C-BERST successfully profiles subnuclear proteomes enriched at distinct chromosomal elements.
Combining the flexibility of dSpyCas9 with the efficiency and rapid kinetics of APEX2 biotinylation, C-BERST promises to extend the unbiased definition of subnuclear proteomes to many other genomic elements, as well as to a range of dynamic processes (for example, cellular differentiation, responses to extracellular stimuli, and cell-cycle progression) that occur too rapidly to be analyzed via longer labeling procedures. C-BERST and BirA*-based methods favor biotinylation of distinct sets of proteins because of their different labeling specificities; using these approaches in tandem would probably diminish the number of false negatives resulting from inefficient labeling due to differences in surface-accessible amino acid distribution or the suitability of certain peptides for MS analysis. Importantly, C-BERST promises to augment and extend Hi-C and related methods by linking conformationally important cis-elements with their associated factors. Guide multiplexing should enable the extension of C-BERST subnuclear proteomics to single-copy, nonrepetitive loci. In the meantime, many types of repetitive elements within the genome have critically important roles in chromosome maintenance and function in ways that depend upon their associated proteins; C-BERST allows unbiased definition of subnuclear, locus-specific proteomes at such elements.
Construction of C-BERST plasmids
We made the Shield1- and doxycycline-inducible dSpyCas9–mCherry–APEX2 construct by subcloning Flag–APEX2 from Flag–APEX2–NES (Addgene 49386) into DD–dSpyCas9–mCherry21 using the pHAGE backbone. Two additional nuclear localization signals (SV40 and nucleoplasmin) were inserted at each terminus to improve nuclear localization. We created the sgTelo-encoding construct by replacing the C3–guide RNA sequence (pCMV_C3-sgRNA_2XBroccoli/pPGK_TetR_P2A_BFP) with sgTelo sequences (using a plasmid provided by Hanhui Ma and Thoru Pederson). sgNS12 and sgAlpha were constructed similarly. The sequences of the final constructs are provided in Supplementary Notes 5 and 6. The SLX4IP–turboGFP plasmid was obtained from OriGene (cat. no. RG220896). We made RPA3–turboGFP by replacing the SLX4IP coding sequence with the human RPA3 coding sequence.
Cell culture and cell line construction
Human U2OS cells obtained from Thoru Pederson’s lab (originally obtained from ATCC) were cultured in Dulbecco’s modified Eagle’s minimum essential medium (DMEM; Life Technologies) supplemented with 10% (vol/vol) FBS (Sigma). Lentiviral transduction was as described21. Titers of sgRNA-encoding lentiviruses used for transduction were sixfold higher relative to titers of dSpyCas9–APEX2 lentivirus. Stably transduced cells were grown under the same conditions as the parental U2OS cells.
One day before flow cytometry, doxycycline (Sigma; 2 µg/ml) and Shield1 (Clontech; 250 nM) were added to the media. Approximately 2 × 10^6 cells expressing dSpyCas9–mCherry–APEX2 and BFP sgRNA were selected with a FACSAria cell sorter or analyzed with a MacsQuant VYB. Both instruments are equipped with 405- and 561-nm excitation lasers, and the emission signals were detected via the use of filters at 450/50 nm (wavelength/bandwidth) for BFP and 610/20 nm (FACSAria) or 615/20 nm (MacsQuant) for mCherry. Bulk populations and single cells (Supplementary Fig. 1b) were sorted into plates containing 1% GlutaMAX, 20% FBS, and 1% penicillin–streptomycin in DMEM.
U2OS cells expressing sgRNA were seeded onto 170-μm, 35 × 10 mm glass-bottom dishes (Eppendorf) supplemented with doxycycline and Shield1 21 h before imaging. Live cells were imaged with a Leica DMi8 microscope equipped with a Hamamatsu camera (C11440-22CU), a 63× oil-immersion objective lens, and Microsystems software (LASX). Further image processing was done with MetaMorph (Molecular Devices) or ImageJ (version 2.0.0-rc-49/1.51d). Image contrast was adjusted to ease the visualization of cells, foci, and nucleoplasmic background.
Cells for immunofluorescence microscopy were grown on glass coverslips. Transfected cells and normal control cells were fixed for 15 min in 2% paraformaldehyde in PHEM (0.05 M PIPES/0.05 M HEPES, pH 7.4, 0.01 M EGTA, 0.01 M MgCl2) and then subjected to a 2-min extraction with 0.1% Triton X-100 in PHEM. After being washed in PBS, the cells were blocked with 1% BSA/1× TBST at 4 °C overnight. Cells were incubated with primary antibodies for 2 h at room temperature and then washed three times with blocking solution (10 min per wash). Cells were then incubated with secondary antibodies for 1 h at room temperature, and subjected to another three washes in blocking solution and two washes in PBS22. Cells were mounted with ProLong antifade and visualized by fluorescence microscopy as described above. The experiment involving Neutravidin conjugated to OG488 was described previously6. Image processing was done as described above.
Six 15-cm plates of U2OS cells (~6 × 10^7) expressing specific (sgTelo) or nonspecific (sgNS) sgRNAs were used in this assay. Doxycycline (2 µg/ml) and Shield1 (250 nM) were added 21 h before biotinylation. Cells were then incubated with 500 µM BP (Adipogen) for 30 min at 37 °C. 1 mM H2O2 was then added to initiate biotinylation for 1 min on a horizontal shaker at room temperature. As a negative control, six 15-cm plates of sgTelo-expressing cells were treated in parallel, but without the addition of H2O2. Quencher solution (5 mM trolox, 10 mM sodium ascorbate, and 10 mM sodium azide) was added to stop the reaction, and cells were washed five times (three quencher washes and two DPBS washes) to continue the quench and to remove excess BP.
Enrichment of biotinylated proteins
Cells were scraped off the plates and used for the preparation of isolated nuclei23. Nuclei were washed with DPBS before lysis. RIPA lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.125% SDS, 0.125% sodium deoxycholate, and 1% Triton X-100 in Millipore water) with 1× freshly supplemented Halt protease inhibitor was used to lyse the cells for 10 min on ice. Cell lysates in 1.5-ml Eppendorf tubes were sonicated for 15 min with a Diagenode Bioruptor with 30-s on/off cycles at high intensity. Cell lysates were clarified by centrifugation at 13,000 r.p.m. for 10 min. Clarified protein samples (~3.5 mg) were subjected to affinity purification with 400 µl of Dynabeads MyOne streptavidin T1 overnight at 4 °C. Each bead sample was washed with a series of buffers to remove nonspecifically bound proteins: twice with RIPA lysis buffer, once with 1 M KCl, once with 0.1 M Na2CO3, once with 2 M urea in 10 mM Tris-HCl, pH 8.0, and twice with RIPA lysis buffer. Proteins were eluted in 70 µl of 3× protein loading buffer supplemented with 2 mM biotin and 20 mM DTT with heating for 10 min at 95 °C6. 50 µl of eluents were loaded on a 4–12% SDS–PAGE gel (Bio-Rad) and run approximately 1 cm off the loading well for in-gel digestion and LC-MS/MS analysis. All samples, including negative controls, contained ~75-kDa endogenously biotinylated proteins that are routinely detected in samples labeled by spatially restricted enzymatic tagging6, 7. The gel-fractionated sample used for LC-MS/MS (see below) corresponded to proteins from ~4 × 10^7 cells.
Protein concentrations of cell lysates were determined by BCA assay (Thermo). 50 µg of each sample was mixed with protein loading buffer, boiled, and separated on SDS–PAGE gels. Proteins were transferred to PVDF membrane (Millipore) and blotted with streptavidin–HRP (Thermo), or with anti-mCherry (Abcam) or anti-HDAC1 (Bethyl). Additional details of the anti-SLX4IP and anti-RPA3 western blotting analyses are described in the relevant supplementary figure legends.
mCherry affinity purification of dSpyCas9–mCherry–APEX2 captured DNA and sequencing
1 × 10^7 U2OS cells stably expressing dCas9–mCherry–APEX2 transduced with sequence-targeting or nonspecific sgRNAs were washed with PBS, fixed with 1% formaldehyde for 10 min, and quenched with 0.125 M glycine for 5 min. Cells were harvested with a plate scraper and lysed in RIPA cell lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.125% SDS, 0.125% sodium deoxycholate, and 1% Triton X-100 in Millipore water) with 1× freshly supplemented Halt protease inhibitor for 10 min on ice. Cell lysates were centrifuged at 2,300g for 5 min at 4 °C to isolate nuclei. Nuclei were suspended in 500 μl of RIPA nuclear lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.5% SDS, 0.125% sodium deoxycholate, and 1% Triton X-100 in Millipore water) with 1× freshly supplemented Halt protease inhibitor and subjected to sonication to shear chromatin fragments to an average size of 200–500 bp on a Diagenode Bioruptor with 30-s on/off cycles at high intensity for 15 min. Fragmented chromatin was centrifuged at 16,100g for 10 min at 4 °C. 450 μl of supernatant was transferred to a new microcentrifuge tube. 4 μg of anti-mCherry (Thermo; PA5-34974) was added to each sample and incubated at 4 °C for 3 h. 50 μl of blocked Protein G Dynabeads (Thermo; 10003D) was added to each sample and rotated at 4 °C overnight. After overnight incubation, Dynabeads were washed seven times as described above for selection of biotinylated proteins. Chromatin was eluted from Dynabeads in 200 μl of elution buffer (50 mM Tris-HCl, pH 8.0, 10 mM EDTA, 1% SDS) and transferred to a new microcentrifuge tube. Eluted chromatin was treated with 1 μl of RNase A and incubated overnight at 65 °C to reverse cross-links. 7.5 μl of 20 mg/ml proteinase K was added to each sample, and samples were incubated for 2 h at 50 °C. ChIP DNA was then incubated with 1 ml of buffer PB (Qiagen) and 10 μl of 3 M sodium acetate, pH 5.2, at 37 °C for 30 min. DNA was purified on Qiagen Quickspin columns.
15 ng of ChIP DNA was processed for library preparation with the NEBNext ChIP-seq library prep kit (New England Biolabs) according to the manufacturer’s protocol.
15 ng of ChIP DNA was end-repaired with the NEBNext end-repair module (NEB; E6050) and purified with 1.8× AMPure XP beads (Beckman-Coulter; A63880). End-repaired DNA was processed in a dA-tailing reaction with the NEBNext dA-tailing module (NEB; E6053) and purified with 1.8× AMPure XP beads. Adaptor oligos 1 (5′-pGATCGGAAGAGCACACGTCT-3′) and 2 (5′-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′) used in the Y-shaped adaptor mix were ligated to dA-tailed DNA according to ref. 24 and purified with 1.5× AMPure XP beads. Ligated DNA was incubated in a thermal cycler (98 °C for 40 s, 65 °C for 30 s, and 72 °C for 30 s) with one of the Illumina barcode primers and NEB Q5 polymerase master mix. Primer 1 (5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′) was added to the mix for ten cycles (98 °C for 10 s, 65 °C for 30 s, 72 °C for 30 s), and the reaction was then incubated at 72 °C for 3 min. PCR-enriched DNA was purified with 1× AMPure XP beads.
Raw Illumina sequencing reads 150 nt in length were processed as FASTQ files in R. Reads were trimmed with the Bioconductor ShortRead R package at positions where 2 nt within a 5-nt bin had a Phred quality score below 20. Reads containing at least one (TTAGGG)4 or (CCCTAA)4 segment were scored as ‘hits’ and counted with the Bioconductor Biostrings R package. To assess the specificity of dCas9–mCherry–APEX2 in each sample, we divided the number of hits by the total number of trimmed reads.
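The read-level scoring described above can be illustrated with a short sketch. The analysis itself was carried out in R with ShortRead and Biostrings; the Python below is not that pipeline. The function names, the left-anchored window used for trimming, and the example reads are all assumptions made for illustration, and the trimming rule is only one plausible reading of the 2-in-5 criterion.

```python
# Illustrative only: a pure-Python sketch of the trimming and hit-counting logic.
# All names here (trim_read, is_telomeric_hit, hit_fraction) are hypothetical.

TELOMERE_MOTIFS = ("TTAGGG" * 4, "CCCTAA" * 4)  # (TTAGGG)4 and (CCCTAA)4


def trim_read(seq, quals, min_q=20, window=5, max_low=2):
    """Truncate a read at the start of the first 5-nt window that
    contains at least 2 bases with Phred quality below 20.
    Assumption: a left-anchored window; ShortRead may centre it differently."""
    low = [q < min_q for q in quals]
    for start in range(0, len(seq) - window + 1):
        if sum(low[start:start + window]) >= max_low:
            return seq[:start]
    return seq


def is_telomeric_hit(read):
    """A read is a hit if it contains at least one motif on either strand."""
    seq = read.upper()
    return any(motif in seq for motif in TELOMERE_MOTIFS)


def hit_fraction(reads):
    """Number of hits divided by the total number of (trimmed) reads."""
    total = 0
    hits = 0
    for read in reads:
        total += 1
        if is_telomeric_hit(read):
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Made-up reads, not data from this study.
    example = ["ACGT" + "TTAGGG" * 4, "CCCTAA" * 4, "TTAGGG" * 3 + "ACGT", "ACGTACGT"]
    print(hit_fraction(example))
```

Comparing this fraction between sgTelo and sgNS samples gives the per-sample specificity measure used in the text.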
On day 0, early-passage, sorted, stably transduced sgTelo and sgAlpha U2OS cells were grown in heavy SILAC media, which contained L-arginine-13C6,15N4 (Arg10) and L-lysine-13C6,15N2 (Lys8) (Sigma). Stable sgNS cells were grown in medium SILAC media, which contained L-arginine-13C6 (Arg6) and L-lysine-4,4,5,5-d4 (Lys4) (Sigma). Untransduced U2OS cells were grown in light SILAC media, which contained L-arginine (Arg0) and L-lysine (Lys0) (Sigma). Cells were grown for more than 10 d (>5 passages) to allow for sufficient incorporation of the isotopes. On day 11, doxycycline and Shield1 were added to each isotope culture (four 15-cm plates for each cell line) 21 h before treatment with BP and H2O2. Biotinylation, nuclei isolation, and cell lysis followed the procedures described above. Before streptavidin affinity purification, equal amounts of proteins measured by a Pierce BCA protein assay kit (~1 mg from each isotope sample) were mixed in a 1:1:1 ratio (H:M:L). Streptavidin affinity purification and sample washes were as described above. Proteins were eluted in 50 µl of 3× protein loading buffer supplemented with 2 mM biotin and 20 mM DTT with heating for 10 min at 65 °C. 50 µl of eluents were loaded and run approximately to the center of the lane on a 4–12% SDS–PAGE gel (Bio-Rad). The Coomassie-stained protein bands were excised and cut into five slices for in-gel digestion and LC-MS/MS analysis.
LC-MS/MS and proteomic analyses for label-free quantification
Unresolved protein bands from SDS–PAGE were cut into 1 × 1 mm pieces and placed in 1.5-ml Eppendorf tubes with 1 ml of water. After 30 min, the water was removed and replaced with 70 µl of 250 mM ammonium bicarbonate. Proteins were then reduced by the addition of 20 µl of 45 mM DTT, incubated at 50 °C for 30 min, cooled to room temperature, alkylated with 20 µl of 100 mM iodoacetamide for 30 min, and washed twice with 1 ml of water. The water was removed and replaced with 1 ml of 50 mM ammonium bicarbonate:acetonitrile (1:1), and the sample was incubated at room temperature for 1 h. The solvent was then replaced with 200 µl of acetonitrile. Next, the acetonitrile was removed, and the pieces were dried in a SpeedVac (Savant Instruments, Inc.). Gel pieces were then rehydrated in 75 µl of 4 ng/µl sequencing-grade trypsin (Promega) in 0.01% ProteaseMAX surfactant (Promega) in 50 mM ammonium bicarbonate and incubated at 37 °C for 21 h. The supernatant was then moved to a 1.5-ml Eppendorf tube, the gel pieces were further dehydrated with 100 µl of acetonitrile:1% (v/v) formic acid (4:1), and the combined supernatants were dried on a SpeedVac. Peptides were then reconstituted in 25 µl of 5% acetonitrile containing 0.1% (v/v) trifluoroacetic acid for LC-MS/MS.
Samples were analyzed on a NanoAcquity UPLC (Waters Corporation) coupled to a Q Exactive (Thermo Fisher Scientific) hybrid mass spectrometer. In brief, 1.0-µl aliquots were loaded at 4 µl/min onto a custom-packed fused silica precolumn (100-µm inner diameter) with Kasil frit containing 2 cm of Magic C18AQ (5 µm, 100 Å) particles (Bruker Corporation). Peptides were then separated on a 75-µm-inner-diameter fused silica analytical column containing 25 cm of Magic C18AQ (3 µm, 100 Å) particles (Bruker) packed in-house into a gravity-pulled tip. Peptides were eluted at 300 nl/min with a linear gradient from 95% solvent A (0.1% (v/v) formic acid in water) to 35% solvent B (0.1% (v/v) formic acid in acetonitrile) over 60 min. Data were acquired by data-dependent acquisition according to a published method25. Briefly, MS scans were acquired for m/z 300–1,750 at a resolution of 70,000 (m/z 200), and this was followed by ten tandem MS scans using higher-energy collisional dissociation (HCD) fragmentation with an isolation width of 1.6 Da, a collision energy of 27%, and a resolution of 17,500 (m/z 200). Raw data files were processed with Proteome Discoverer (Thermo; version 18.104.22.168) and searched with Mascot (Matrix Science; version 2.6) against the Swiss-Prot Homo sapiens database. Search parameters used tryptic specificity considering up to two missed cleavages, a parent mass tolerance of 10 p.p.m., and a fragment mass tolerance of 0.05 Da. Fixed modification of carbamidomethyl cysteine was considered, as were variable modifications of N-terminal acetylation, N-terminal conversion of Gln to pyroGlu, oxidation of methionine, and BP conjugation of tyrosine. Results were loaded into Scaffold (Proteome Software Inc.; version 4.8.4) for peptide and protein validation and quantitation using the Peptide Prophet and Protein Prophet algorithms26, 27. The threshold was set to 80% for peptides (1.1% FDR) and 90% for proteins (3-peptide minimum). 
Contaminants such as human keratin were included in all statistical analyses but are not included in the figures.
LC-MS/MS and proteomic analyses for SILAC
A fully resolved SDS–PAGE gel was cut into five fractions, and each fraction was processed separately as described above. Gel bands were cut into 1 × 1 mm pieces and placed in 1.5-mL Eppendorf tubes with 1 mL of water for 30 min. The water was removed and 200 µl of 250 mM ammonium bicarbonate was added. For reduction, 20 µl of a 45 mM solution of DTT was added and the samples were incubated at 50 °C for 30 min. The samples were cooled to room temperature and then, for alkylation, 20 µl of a 100 mM iodoacetamide solution was added and allowed to react for 30 min. The gel slices were washed twice with 1 mL of water. The water was removed and 1 mL of 50:50 50 mM ammonium bicarbonate:acetonitrile was placed in each tube, and samples were incubated at room temperature for 1 h. The solution was then removed and 200 µl of acetonitrile was added to each tube, at which point the gel slices turned opaque white. The acetonitrile was removed and gel slices were further dried in a SpeedVac. Gel slices were rehydrated in 100 µl of 4 ng/µl sequencing-grade trypsin in 0.01% ProteaseMAX surfactant:50 mM ammonium bicarbonate. Additional bicarbonate buffer was added to ensure complete submersion of the gel slices. Samples were incubated at 37 °C for 18 h. The supernatant of each sample was then removed and placed in a separate 1.5-mL Eppendorf tube. Gel slices were further extracted with 200 µl of 80:20 acetonitrile:1% formic acid. The extracts were combined with the supernatants of each sample. The samples were then completely dried down in a SpeedVac.
Tryptic peptide digests were reconstituted in 25 µL of 5% acetonitrile containing 0.1% (v/v) trifluoroacetic acid and separated on a NanoAcquity UPLC. In brief, a 3.0-µL injection was loaded in 5% acetonitrile containing 0.1% formic acid at 4.0 µL/min for 4.0 min onto a 100-µm-inner-diameter fused-silica pre-column packed with 2 cm of 5-µm (200-Å) Magic C18AQ particles and eluted using a gradient at 300 nL/min onto a 75-µm-inner-diameter analytical column packed with 25 cm of 3-µm (100-Å) Magic C18AQ particles to a gravity-pulled tip. Solvent A was water, 0.1% formic acid, and solvent B was acetonitrile, 0.1% formic acid. A linear gradient was developed from 5% solvent B to 35% solvent B in 60 min. Ions were introduced by positive electrospray ionization via liquid junction into a Q Exactive hybrid mass spectrometer (Thermo). Mass spectra were acquired over m/z 300–1,750 at 70,000 resolution (m/z 200), and data-dependent acquisition selected the top ten most abundant precursor ions for tandem mass spectrometry by HCD fragmentation using an isolation width of 1.6 Da, collision energy of 27, and resolution of 17,500.
Raw data files were peak-processed with Mascot Distiller before database searching with Mascot Server against the UniProt human database. Search parameters included trypsin specificity with two missed cleavages. The variable modifications of oxidized methionine, pyroglutamic acid for N-terminal glutamine, N-terminal acetylation of the protein, BP on tyrosine, and a fixed modification for carbamidomethyl cysteine were considered. For SILAC labels, the medium samples were labeled with Lys4 and Arg6, and the heavy samples were labeled with Lys8 and Arg10. The mass tolerances were 10 p.p.m. for the precursor and 0.05 Da for the fragments. SILAC ratio quantitation was accomplished with Mascot Distiller, and the results from Mascot Distiller were loaded into the Scaffold Viewer for peptide/protein validation and SILAC label quantitation. For SILAC experiments, protein identification was subject to a two-peptide cutoff. For proteins detectable in the H sample but lacking an empirical H/L ratio value (owing to low background detection in the L sample), peak areas of all the identified peptides in the Distiller file were used to calculate H/L ratios.
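As a worked illustration of the three-state SILAC labels, the sketch below computes the expected precursor m/z offsets between light, medium (Lys4/Arg6), and heavy (Lys8/Arg10) forms of a tryptic peptide. It is not part of the Mascot Distiller workflow used here. The isotope mass differences are standard values rounded to seven decimals, and the function name and example peptides are hypothetical.

```python
# Illustrative only: expected SILAC mass shifts per labeled residue.
# Monoisotopic mass differences (Da) between heavy and light isotopes.
DELTA_13C = 1.0033548  # 13C minus 12C
DELTA_15N = 0.9970349  # 15N minus 14N
DELTA_2H = 1.0062767   # 2H minus 1H

LABEL_SHIFT = {
    "light": {"K": 0.0, "R": 0.0},                      # Lys0 / Arg0
    "medium": {"K": 4 * DELTA_2H, "R": 6 * DELTA_13C},  # Lys4 / Arg6
    "heavy": {
        "K": 6 * DELTA_13C + 2 * DELTA_15N,             # Lys8
        "R": 6 * DELTA_13C + 4 * DELTA_15N,             # Arg10
    },
}


def mz_shift(peptide, state, charge):
    """m/z offset of a labeled peptide relative to its light form.
    Sums the shift of every K and R in the sequence, then divides by charge."""
    shifts = LABEL_SHIFT[state]
    mass_shift = sum(shifts.get(residue, 0.0) for residue in peptide.upper())
    return mass_shift / charge


if __name__ == "__main__":
    # Hypothetical peptide with one C-terminal lysine, doubly charged.
    for state in ("light", "medium", "heavy"):
        print(state, round(mz_shift("ELVISK", state, charge=2), 4))
```

Because trypsin cleaves after lysine and arginine, most fully cleaved peptides carry a single labeled residue, so the heavy, medium, and light forms appear as a triplet separated by these offsets.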
Data were first filtered to exclude proteins detected in only one of the dCas9–mCherry–APEX2/sgTelo (+BP, +H2O2) (‘S1’) replicates, and then subjected to log2 transformation. Prior to the log2 transformation, intensity-based absolute quantification (iBAQ) values of 0 were replaced with the smallest iBAQ value from the corresponding sample in dCas9–mCherry–APEX2/sgTelo (+BP, –H2O2) (‘S2’) or dCas9–mCherry–APEX2/sgNS (+BP, +H2O2) (‘S3’) to avoid the generation of infinite ratios. Moderated t-test with a paired design was used to compare the log2-transformed iBAQ values between S1 and S3, S1 and S2, and S2 and S3 using the limma package28. To adjust for multiple comparisons, P values were adjusted using the Benjamini–Hochberg (BH) method29. Proteins were selected for subsequent analysis if they were (i) significantly enriched in both S1 versus S3 and S1 versus S2 and (ii) not enriched in S2 versus S3, and (iii) if S1/S3 and S1/S2 ratios were greater than 2.
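Two steps of this procedure, zero replacement before log2 transformation and Benjamini–Hochberg adjustment, can be sketched in a few lines. The moderated t-test was run with the limma package in R and is not reproduced here. The function names are hypothetical, and the sketch assumes that "smallest iBAQ value" means the smallest non-zero value in the same sample.

```python
# Illustrative only: zero replacement, log2 transform, and BH adjustment.
import math


def replace_zeros(values):
    """Replace iBAQ values of 0 with the smallest non-zero value in the
    same sample (assumption), so that no ratio becomes infinite."""
    floor = min(v for v in values if v > 0)
    return [v if v > 0 else floor for v in values]


def log2_transform(values):
    """log2-transform a list of strictly positive intensities."""
    return [math.log2(v) for v in values]


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up intensities and P values, not data from this study.
    sample = replace_zeros([0.0, 4.0, 1.0, 0.0])
    print(log2_transform(sample))
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

The adjusted P values and the differences between log2-transformed samples are the quantities to which the significance and ratio cutoffs are then applied.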
Similarly, SILAC datasets were filtered to exclude proteins with H/M ratios detected in only one of the biological replicates. Detection in a biological replicate required identification in at least two of the three technical replicates that were done for each biological replicate; median values from the technical replicates were used for subsequent analyses. Proteins with BH-adjusted P values less than 0.05 (moderated t-test as described above) were considered statistically significant. Proteins with BH-adjusted P values < 0.01 and log2 fold change ≥ 2.5 were selected for subsequent GO (DAVID Bioinformatics) and overlap analysis. To determine whether the proteins identified in this experiment overlapped significantly with three published datasets, we used a hypergeometric test. The hypergeometric test was also used to test for overlapping proteins between C-BERST telomere IDs and centromere IDs.
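The replicate rule and the overlap test can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in this study: the function names are hypothetical, missing technical replicates are represented as None, and the size of the protein universe used for the hypergeometric test is not given in the text, so the value in the example is a placeholder.

```python
# Illustrative only: technical-replicate filtering and a one-sided
# hypergeometric overlap test, using the Python standard library.
from math import comb
from statistics import median


def replicate_value(tech_values):
    """Median of the technical replicates for one biological replicate.
    Returns None unless the protein was detected in at least two replicates."""
    detected = [v for v in tech_values if v is not None]
    return median(detected) if len(detected) >= 2 else None


def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_b proteins are drawn without replacement
    from a universe that contains size_a proteins of interest."""
    denominator = comb(universe, size_b)
    upper = min(size_a, size_b)
    numerator = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    print(replicate_value([2.0, None, 4.0]))
    # 55 telomeric hits, 460 centromeric hits, 36 shared (from the text);
    # the universe of 20,000 proteins is an assumed placeholder.
    print(hypergeom_overlap_pvalue(20000, 55, 460, 36))
```

The resulting P value depends strongly on the chosen universe, so the background set should be fixed before the test is interpreted.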
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.
Mass spectrometry data that support the findings of this study have been deposited in the EMBO PRIDE archive with the dataset identifier PXD009216. Source data for all the graphical representations reported in the paper are provided in Supplementary Table 6. All other data that support the findings of this study are available from the corresponding authors upon reasonable request.
We are grateful to all members of the Sontheimer, Wolfe, and Dekker labs for advice and discussions; T. Fazzio, S. Bhaduri, and M. Green for helpful feedback; H. Ma, T. Wu, D. Grünwald, and T. Pederson (University of Massachusetts Medical School, Worcester, MA, USA) for reagents; the Flow Cytometry Core Facility at UMass Medical School for cell sorting; and L. Zhu for assistance with figure preparation. This work was supported by the US National Institutes of Health (4D Nucleome grant U54 DK107980 to J.D., S.A.W., and E.J.S.).