Single-stranded DNA binding proteins influence APOBEC3A substrate preference

The cytidine deaminase, APOBEC3A (A3A), is a prominent source of mutations in multiple cancer types. These APOBEC-signature mutations are non-uniformly distributed across cancer genomes, associating with single-stranded (ss) DNA formed during DNA replication and hairpin-forming sequences. The biochemical and cellular factors that influence these specificities are unclear. We measured A3A’s cytidine deaminase activity in vitro on substrates that model potential sources of ssDNA in the cell and found that A3A is more active on hairpins containing 4 nt ssDNA loops compared to hairpins with larger loops, bubble structures, replication fork mimics, ssDNA gaps, or linear DNA. Despite pre-bent ssDNAs being expected to fit better in the A3A active site, we determined A3A favors a 4 nt hairpin substrate only 2- to fivefold over linear ssDNA substrates. Addition of whole cell lysates or purified RPA to cytidine deaminase assays more severely reduced A3A activity on linear ssDNA (45 nt) compared to hairpin substrates. These results indicate that the large enrichment of A3A-driven mutations in hairpin-forming sequences in tumor genomes is likely driven in part by other proteins that preferentially bind longer ssDNA regions, which limit A3A’s access. Furthermore, A3A activity is reduced at ssDNA associated with a stalled T7 RNA polymerase, suggesting that potential protein occlusion by RNA polymerase also limits A3A activity. These results help explain the small transcriptional strand bias for APOBEC mutation signatures in cancer genomes and the general targeting of hairpin-forming sequences in the lagging strand template during DNA replication.


Results
To investigate the enzymatic requirements of A3A's substrate preferences, we measured the activity of purified A3A (Fig. 1A) on multiple ssDNA-containing substrates by in vitro deaminase activity assays. Cytidine deaminase activity at the TTCA sequence in each substrate creates a deoxyuridine base, which is converted to a heat-labile abasic site by the activity of uracil-DNA glycosylase (UDG), resulting in shorter ssDNA fragments (Fig. 1B) that are resolved from uncleaved substrate on a denaturing polyacrylamide gel. We initially compared A3A activity between linear ssDNA substrate and a hairpin-forming oligonucleotide. The sequence of our hairpin substrate consists of a hotspot for APOBEC mutagenesis in breast cancers 25 likely due to A3A's affinity for U-shaped ssDNA present in hairpin structures 30,49,50 . We first incubated A3A at a range of concentrations with a constant concentration of substrate to see how deaminase activity would titrate on these substrates (Fig. 1C). We found that activity on the linear substrate decreases faster than it does for hairpins. At most, we see ~ fivefold preference for hairpin over linear substrate with 4 nM A3A.
Previous comparative measurements of A3A activity on hairpin and linear ssDNA substrates have reported strong A3A preferences for hairpin structures 13 , suggesting that differences in the sequence composition of the hairpin may be important effectors of A3A activity. To test this, we incubated our hairpin substrate alongside a previously evaluated mutation hotspot hairpin sequence (NUP93 substrate) and a substrate with the 3′ arm of the stem loop replaced with poly-A to prevent hairpin formation (NUP93-noHP) 13 (Fig. S1). A3A displayed a similar ~ 2-5-fold preference for each hairpin-forming sequence compared to their corresponding linear ssDNA substrate ( Fig. 1D and Fig. S1). This indicates that a broader sequence content of the hairpin is unlikely to be a significant factor in determining A3A activity, but instead the general conformation of the stem-loop structure of the hairpin is the major activity determinant.

APOBEC3A prefers small hairpin loops with target C in central positions.
To further characterize the effect hairpin structure has on A3A activity, we conducted deaminase activity assays with hairpin substrates with varying loop sizes. The hairpin substrates have the target TTCA sequence at the same 3′ position in the loop, with the terminal A being the first base within duplex DNA. The loop size in each substrate was increased by the addition of poly-adenine, for which A3A has low binding affinity 30 , upstream of the target site towards the 5' end of the loop. When incubated with 20 nM A3A, hairpin substrate deamination was decreased fourfold, sixfold, and fivefold on the 8 nt loop, 11 nt loop, and 14 nt loop, respectively ( Fig. 2A), indicating that the larger loops either reduce A3A's binding affinity or catalytic efficiency.
To determine if the position of the target C within hairpin loops affects A3A activity, we conducted deaminase activity assays with 2 and 0.5 nM A3A, and 8 nt loops with the target C shifted to the 3′-most end, the middle, and the 5′-most end of the loop. Again, shifting of the target C was accomplished by insertion of A nucleotides around the target TTCA (Fig. 2B). We observed that moving the target C to the middle of the 8 nt loop increased A3A activity, while moving the target C to the 5′-most position reduced A3A cytidine deamination. However, for the 5′-most target C, the TT in TTCA could potentially pair with AA on the opposite side of the loop, making it double stranded and reducing the loop size from 8 to 4 nt. Therefore, we tested a variant of this substrate where the AA dinucleotides at the 3′-most position of the loop were replaced with GG to prevent unwanted annealing

ssDNA bubbles are disfavored APOBEC3A substrates.
In addition to hairpin-forming DNA sequences, several transient DNA structures represent abundant sources of genomic ssDNA that could be targeted by A3A. We therefore tested A3A's ability to deaminate substrates that mimic a replication fork with lagging strand ssDNA, a ssDNA bubble that mimics a transcription bubble, and a small ssDNA gap substrate (Fig. 3A). Each substrate was 5′ Cy5 tagged and contained a single TTCA sequence within a centrally located 4-nucleotide ssDNA region. The replication fork, ssDNA bubble, and ssDNA gap substrates were annealed, gel purified, and their respective structures confirmed via restriction digest prior to use (Fig. S2). The substrates (20 nM) were incubated with 100 nM or 20 nM purified A3A for 30 min at 37 °C and the deamination product was resolved from the uncleaved substrate via denaturing gel electrophoresis. A3A deaminated the hairpin substrate 1.5-, 3-, and 6-fold more efficiently than the ssDNA gap, replication fork, and bubble substrates, respectively (Fig. 3B,C).
Previous reports have indicated that A3A activity can be limited on ssDNAs smaller than 5 nt 29,30,59 . Therefore, the 4 nt ssDNA present in our replication fork, ssDNA gap, and bubble substrates may be size exclusionary. We therefore recreated each substrate with an additional 6 nt of ssDNA 5′ of the target TTCA sequences and repeated the deaminase assays (Fig. 3D,E). We found that A3A activity did not significantly change on the larger ssDNA gaps or replication fork substrates, indicating that additional ssDNA 5′ of the target C does not influence A3A activity. This is consistent with crystallographic data showing limited contact between the  49,50 . Interestingly, increasing the amount of ssDNA 5′ of the deamination site increased A3A activity on bubble substrates. This result is consistent with previous reports 29,59 , which interpreted the increase in activity as A3A requiring longer ssDNA regions. However, the higher level of deamination on the 4 nt ssDNA gap substrate compared to the 4nt bubble instead suggests that the presence of an intact non-deaminated strand in the substrate imposes structural constraints that lessen the ability of A3A to catalyze the deamination reaction. This structural constraint appears to be weaker in larger bubbles, which behave more similarly to longer ssDNA gaps. We quantified the differences in A3A activities on these substrates by calculating specific activities (Table 1).
APOBEC3A activity is reduced on linear substrates when incubated with cell lysate. A3A has been shown previously to greatly prefer hairpin substrates over linear ssDNA substrates 13,17,30,51 , which results in recurrent mutation of hairpin-forming sequences in human tumors 13,25 . However, the fold-preference for hairpin substrates has varied substantially under different experimental conditions, with assays conducted utilizing cell lysates showing the greatest hairpin preference 13 . We have found that purified A3A prefers hairpins over linear substrates, but that the preference is only 2- to 5-fold depending on A3A concentration (Fig. 1D). We hypothesized that the presence of proteins in cell lysate may enhance A3A preference for hairpin over linear substrate; these additional proteins may compete with A3A for binding of linear ssDNA, but not hairpin DNA.
To test this, we performed a deaminase assay with or without the addition of 40 µg whole cell lysate from the breast cancer cell line SKBR3. We pre-incubated the hairpin, linear, or ssDNA gap substrates with the cell lysate for 30 min at 37 °C to allow any ssDNA-binding proteins to bind the substrates prior to the addition of A3A (20 nM final concentration). Following denaturing gel electrophoresis, we found that deamination of the hairpin substrate was unaffected upon the addition of whole cell lysate, while A3A activities on the linear and the ssDNA gap substrates were reduced 3- and 2.5-fold, respectively (Fig. 4A). Thus, other proteins present in the whole cell lysate appear to compete with A3A for binding of linear and ssDNA gap substrates. However, the bent conformation of ssDNA within the hairpin substrate is resistant to this inhibitory effect. The identity of ssDNA-binding proteins that preferentially inhibit A3A on linear ssDNA substrates is unknown. Replication Protein A (RPA) is a likely candidate, as it is known to inhibit A3A activity on linear ssDNA substrates 45,60 and requires approximately 30 nt of ssDNA for high-affinity binding 61,62 . We therefore purified human RPA (Fig. 4B), a heterotrimer composed of RPA1 (70 kDa), RPA2 (32 kDa), and RPA3 (14 kDa), and tested its ability to inhibit A3A cytidine deaminase activity on both hairpin and linear ssDNA. We found that RPA nearly saturated the linear ssDNA substrate, but not the hairpin substrate, when in approximately a twofold excess over substrate (Fig. 4C), indicating that RPA binds hairpin substrates less efficiently. The stronger RPA binding of linear ssDNA translated into RPA dramatically decreasing A3A activity 6.3-fold on linear ssDNA. RPA also reduced A3A activity on hairpin substrates, but only by 3.5-fold (Fig. 4D), indicating that RPA binds less strongly to small ssDNA loops in hairpin structures than to linear ssDNA, which allows A3A to maintain activity at sites of DNA secondary structures.
A3A activity is low on transcription bubbles. Due to the potential for protein obstruction to hinder A3A's activity on ssDNA, we evaluated whether other proteins may also help to restrict A3A activity to the lagging strand template during replication. Transcription, like replication, produces ssDNA on the non-transcribed strand and is required for genomic cytidine deamination by AID 37,58 . However, little evidence for transcriptional asymmetry of APOBEC-induced mutations in cancer genomes exists 23,24,26 , indicating that ssDNA within transcription bubbles may be protected from APOBEC cytidine deaminase activity. Our data above indicate that the small ssDNA bubbles themselves reduce A3A activity. Moreover, transcription bubbles are frequently 14-22 nt in length 63 , indicating that most ssDNA present will be near the synthesizing RNA polymerase, which may further reduce A3A activity. To better understand whether RNA polymerase itself blocks A3A activity, we generated an in vitro transcription system similar to previously described systems utilizing T7 RNA polymerase 29,31,35,37,38,45 , a 90 bp dsDNA substrate, and our purified A3A. To enhance the stability of transcription-associated ssDNA, we designed the substrate to only contain T bases in the transcribed DNA strand downstream of the TTCA deamination target on the non-transcribed strand and excluded rATP from the transcription reaction. The lack of rATP stalls T7 RNA polymerase across from the TTCA deamination target sequence, causing it to remain single stranded as long as T7 RNA polymerase is bound. Purified A3A (0, 20, 50, or 100 nM) was incubated with 20 nM dsDNA substrate and 50 units of T7 RNA polymerase at 37 °C for 24 h. A3A was incapable of deaminating the dsDNA alone (Fig. 5, upper panel), or in the presence of stalled T7 RNA polymerase transcription, which was confirmed by the presence of a 21 nt RNA product in a GelRed scan for total nucleotide content (Fig. 5, lower panel). A low-abundance ssDNA oligonucleotide (indicated with * in Fig. 5, upper panel) was observed as a contaminant in our dsDNA T7 RNA polymerase substrate. A3A deaminated this ssDNA contaminant, but not the T7 transcription substrate, indicating that the presence of the T7 RNA polymerase efficiently blocks A3A from directly deaminating ssDNA within a transcription bubble.
Previous findings indicate that A3A can deaminate cytidines during in vitro transcription 29,31,45 . Our in vitro transcription system stalls the T7 RNA polymerase at the first cytidine:guanosine base pair in the substrate such that the A3A target is only ssDNA when it is associated with the T7 RNA polymerase. In contrast, in all previous in vitro systems used to investigate A3A-mediated deamination during transcription, T7 RNA polymerase can transcribe past potential A3A-deamination sites. Establishment of extended R-loops during this process would allow cytidines within the non-transcribed DNA strand to remain single stranded in the absence of the T7 polymerase. Thus, our data indicate that A3A is unlikely to actively deaminate cytidines within RNA polymerase-associated transcription bubbles, but instead likely requires the formation of extended R-loops behind the extending polymerase to enable A3A to preferentially deaminate the non-transcribed DNA strand.

Discussion
Despite having lesser enrichment than other types of cancer mutations in some chromosomal features like late replicating regions and heterochromatin 24,64 , APOBEC-induced mutations are still non-randomly distributed with respect to the lagging strand template 23,24,26,53,65 , tRNA and rDNA genes 27,54,55 , and hairpin-forming sequences 13,25 . However, the underlying mechanisms that contribute to the distribution of APOBEC-induced mutations in many cases are unknown. A3A has been implicated as a major source for APOBEC-signature mutations in human cancers, in part due to its strong preference for deaminating cytidines in hairpin structures, which are also frequently mutated at APOBEC target sequences in tumors. Consistent with previous results, we find that A3A prefers ssDNA within hairpin secondary structures for deamination. Hairpins containing a 4 nt loop were favored over those with larger ssDNA loops, as was a centrally positioned target cytidine within 8 nt loops.

Figure 5. APOBEC3A activity is low on transcription bubbles. Deaminase assay of 0, 20, 50, or 100 nM A3A during in vitro T7 RNA polymerase transcription, stalled at T in AAGT within the transcribed strand (i.e. across from a TTCA on the non-transcribed strand). Reactions were carried out with 50 units of T7 RNA polymerase or an equal volume of 50% glycerol, and GTP, CTP, and UTP but no ATP, to cause the stalling. Reactions were incubated for 24 h at 37 °C in the presence of Uracil DNA Glycosylase, stopped by the addition of Proteinase K and SDS buffer, then heated for 10 min at 95 °C in formamide buffer before separating product from substrate via denaturing polyacrylamide gels. Gels were imaged for Cy5 fluorescent tags (upper panel) and after staining with GelRed to observe total nucleic acids produced in the reaction (lower panel). S denotes the substrate band; P denotes the product band. Full-length gel images are presented in Fig. S8.
Among different substrates mimicking potentially physiological sources of ssDNA, A3A prefers substrates with small ssDNA regions in the order of: hairpin > ssDNA gap > replication fork > bubble. Within a cell, ssDNA regions, especially those associated with lagging strand synthesis, are likely to be significantly longer. We expect that A3A will exhibit similar activity towards these regions as towards the fully linear ssDNA substrate, unless these regions contain sequences that allow for local secondary structures to form. One potential caveat of these data is the presence of a C-terminal Strep-tag on A3A, which could influence A3A activity; however, based on the small size of the tag (8 amino acids), we believe this impact would be small. Supporting this assertion, un-tagged A3A and the C-terminal Strep-tagged A3A display similar levels of cytidine deaminase activity on a hairpin substrate with a 4 nt loop (Fig. S9). Surprisingly, A3A only preferred hairpin ssDNA 2- to 5-fold over fully linear ssDNA, a less drastic fold difference than has been previously reported 13 . Favoring of the hairpin structure becomes more pronounced at lower A3A concentrations. Still, the relatively small preference for hairpin substrates over linear ssDNA substrates suggests that factors beyond A3A's innate structural preference for hairpin DNA binding likely influence A3A specificity in cells and give rise to the over 200-fold preference for APOBEC-induced mutagenesis in hairpin-forming sequences compared to linear DNA. In particular, hairpins may be less likely to be occluded by ssDNA-binding proteins within a cell, which would enhance A3A specificity for hairpin-forming sites. We found that A3A activity towards cytidines within linear ssDNA was reduced when whole cell extracts from the SKBR3 breast cancer cell line were included in deamination reactions, while activity on the hairpin substrate was unaffected.
We initially hypothesized that the increased length of ssDNA in the linear substrates (i.e. 45 nt long) would allow binding of ssDNA-binding proteins whose footprint would be unable to bind to the shorter 4 nt ssDNA contained within the hairpin substrate. However, A3A activity towards a 4 nt ssDNA gap substrate was also reduced by SKBR3 whole cell extracts, indicating that the bent conformation of the 4 nt loop of the hairpin may be the primary barrier to the binding of putative competitor proteins.
The identity of specific protein factors that influence the distribution of A3A-induced mutations is unknown. RPA is one likely candidate as it inhibits the activity of multiple APOBECs, including A3A, on linear ssDNA 45,53,60,66,67 , and binds to a relatively large stretch of ssDNA (~ 30 nt) 61,62 , suggesting that it may be unable to inhibit A3A activity on smaller, structured ssDNA regions. Supporting this, RPA efficiently inhibits A3A activity at near equimolar ratios of RPA to a 40 nt linear ssDNA substrate, while impacting A3A activity on a hairpin substrate with a 4 nt loop to a lesser extent. Thus, the binding of RPA to longer ssDNAs, formed during replication or in extended DNA R-loops, may provide protection from mutation, enhancing the preferential mutagenesis of hairpin-forming sequences. RPA typically binds ssDNA and prevents the formation of DNA secondary structures, protecting the ssDNA from DNA damaging agents like A3A. In cancers, replication stress-induced ssDNA formation can exhaust RPA levels 68 , leaving sections of ssDNA in replication forks unprotected. These long regions of ssDNA can then form hairpins that would otherwise be repressed by RPA binding, making them targetable by A3A. As a result, A3A-induced mutations are enriched at hairpin-forming regions of cancer genomes and show strand bias for replication intermediates.
It is also possible that larger ssDNA binding proteins, like RNA Polymerase occlude A3A binding. In our in vitro transcription assay, a 99 kDa T7 RNA Polymerase is stalled across from the TTCA target site and its proximity likely inhibits the smaller 23 kDa A3A by blocking access to the target site. Transcription of protein coding genes in human cells is carried out by the RNA Polymerase II complex, which is comprised of 10 subunits that together constitute a mass of ~ 500 kDa 69 . The large size of this complex likely occludes A3A and prevents A3A from deaminating and eventually mutating transcription intermediates. Hypothetically, the non-transcribed strand would be single-stranded for a period during transcription and could be deaminated by APOBEC3s. Indeed, there is experimental evidence for A3A deaminating transcription intermediates in vitro 29,31,45 . However, only a small enrichment has been found for APOBEC-signature mutations on the non-transcribed strand in human cancers 23,24,26 . Consistent with this result, we saw nearly undetectable A3A activity on the non-transcribed strand during stalled T7 RNA polymerase transcription. Additionally, A3A activity was also reduced towards a ssDNA bubble substrate that mimicked the static structure of a transcription intermediate. These data suggest that A3A is unlikely to be active at transcription bubbles, especially those directly blocked by the presence of the elongating RNA polymerase. Instead, transcription would likely require the creation of an elongated ssDNA bubble by the formation of an R-loop, allowing A3A sufficient access to the ssDNA for deamination. Consequently, transcriptional strand asymmetry of A3A-induced mutations in human cancers may only be present in DNA regions prone to R-loop formation.

Methods
Purification of APOBEC3A. A3A was purified as in 17 . Briefly, HEK293T cells were transduced with lentiviral vectors encoding the APOBEC3A gene under control of the Tet repressor. Cell populations were expanded to approximately sixty 10 cm dishes, after which A3A protein expression was induced with doxycycline for approximately 84 h. A3A-expressing cells were then harvested via trypsinization and centrifugation. Lysates were prepared in buffer containing protease inhibitors and DTT, sonicated, and treated with Benzonase. Insoluble materials were removed via centrifugation and filtering. The Strep-tagged A3A was batch bound to Streptactin resin and put over a glass econo-column. The column was washed to remove any non-specific binding proteins and then the Strep-tagged A3A was eluted using buffer containing 20 nM d-desthiobiotin. Eluted A3A was further purified through a 1 mL Enrich-Q (BioRad) column. A3A-containing fractions were pooled, con-
RPA purification. Three-subunit human RPA was expressed from plasmid p11d-tRPA(123) (Addgene; 102613) in T7 express E. coli (NEB; C2566I). Human RPA was purified following protocols previously described in 70 , using sequential chromatography on 5 mL HiTrap Blue HP (Cytiva lifesciences), 2 mL hydroxyapatite CHT-type1 (BioRad), and 1 mL Enrich-Q (BioRad) columns.
Characterization of whole cell extracts from HEK293T expressing A3A. Creation of HEK293T-TETR cells and pTM-664, a lentiviral vector for inducible expression of Strep-tagged A3A, were described previously 17 . The coding sequence of A3A with its natural stop codon (no tag) and A3A intron 3 was cloned into the HincII and EcoRV sites of pENTR1A no ccDB (Addgene, #17398). An LR-Clonase reaction between the resulting plasmid and pTM-637 17 was used to create pTM-925, a lentiviral vector for inducible expression of untagged A3A. HEK293T-TETR cells were transfected with plasmids psPAX2 (Addgene, #12260), pMD2.G (Addgene, #12259), and either pTM-664 or pTM-925 to produce lentiviral particles, which were subsequently used to transduce HEK293T-TETR cells using LentiBlast (OZ Biosciences
Assembling substrate structures in vitro. The oligonucleotides used were obtained from IDT technologies and sequences are listed in Table S1. To make the 4 nt set of substrates, the following oligos were incubated: Replication fork (ALH256, ALH232, ALH234, ALH235), ssDNA gap (ALH256, ALH234, ALH240), and Bubble (ALH256, ALH241). To make the 10 nt set of substrates, the following oligos were incubated: Replication fork (ALH257, ALH232, ALH234, ALH235), ssDNA gap (ALH257, ALH234, ALH240), and Bubble (ALH257, ALH244). Substrates with complementary oligonucleotides were annealed by mixing the corresponding oligos in buffer containing 50 mM Tris, pH 7.5 and 100 mM NaCl before heating at 95 °C for 5 min and then slowly cooling (1 °C per minute) in a BioRad Thermocycler. Annealing reactions were then concentrated by EtOH precipitation and run on a large native 5% polyacrylamide gel (1 × TBE); bands were imaged on the BioRad ChemiDoc for Cy5 and excised. DNA was isolated from excised bands using the freeze/squeeze method. Briefly, bands were minced, resuspended in 200 µL of 10 mM Tris, 0.1 mM EDTA buffer, and put through 3 cycles of vortexing (20 s), flash-freezing in liquid nitrogen, and thawing (65 °C), before incubating at 37 °C with rotation overnight. Gel debris was separated from DNA-containing supernatant using a 1 mL filter column and spinning at 13 k rpm for 10 min at 4 °C, then DNA was EtOH precipitated from the supernatant. The band with the correct annealing product was determined by restriction digest, as double-stranded sections would have a combination of PacI-HF, KpnI or DraI sites, which would only be cut if annealed properly (Fig. S2).
Measurement of deaminase activity. All deaminase activity reactions in this manuscript were conducted with a single preparation of purified A3A. Deaminase activity assays with A3A (with concentrations as indicated), 5 units of Uracil DNA glycosylase (NEB), 20 nM oligonucleotide substrate, 20 mM Tris HCl pH 7.5, 1 mM DTT, and 1 mM EDTA, in a 20 μL volume were incubated for 30 min (unless otherwise indicated) at 37 °C and terminated by the addition of 7 µL stop buffer (10 mM Tris, 1 mM EDTA, 0.5% SDS). Proteinase K (0.05 mg/mL final) was then added and incubated at 37 °C for another 30 min to limit protein binding to substrates during electrophoresis. Alternatively, cytidine deaminase reactions to test the impact of the Strep-tag on A3A activity contained 20 µg cell extract from un-transduced HEK293T cells or HEK293T cells expressing un-tagged A3A or C-terminally Strep-tagged A3A and 1 µM oTM-814 and were incubated for 5 min at 37 °C prior to termination as conducted for assays described above with purified A3A. We then added 23.2 µL formamide buffer (39.4% formamide, 7.5 mM EDTA, and 0.010% SDS), and 4.8 µL of 1 M NaOH (0.1 M final) and incubated at 95 °C for 10 min to break abasic sites present in the substrates. Reactions used to evaluate the impact of RPA on A3A activity also contained 10 nM of unlabeled ssDNA that lacks a cytidine. Deaminase activity products were separated from undigested substrate on pre-warmed 15% polyacrylamide gels with 7.9 M urea (7 cm) in 1 × TBE buffer at 15 W for 15 min. Gels were imaged on a BioRad Chemidoc using the Cy5 setting. Percent substrate cleaved was determined by quantification of the intensity of the substrate band and cleavage product bands using the Volume tool on BioRad's Image Lab software version 5.2, and by using the following equation: percent substrate cleaved = (product intensity) / (substrate intensity + product intensity) × 100.
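The quantification equation above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis pipeline: the function names and the band intensities are hypothetical, and the fold-difference helper simply assumes that fold preference is the ratio of two percent-cleaved values measured under matched conditions.

```python
def percent_cleaved(substrate_intensity, product_intensity):
    """Percent substrate cleaved = product / (substrate + product) x 100."""
    total = substrate_intensity + product_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return product_intensity / total * 100.0


def fold_difference(percent_a, percent_b):
    """Ratio of two percent-cleaved values (a over b); an assumed definition."""
    if percent_b <= 0:
        raise ValueError("reference activity must be positive")
    return percent_a / percent_b


# Hypothetical band intensities (arbitrary units), for illustration only.
hairpin = percent_cleaved(substrate_intensity=2000, product_intensity=6000)
linear = percent_cleaved(substrate_intensity=8500, product_intensity=1500)

print(f"hairpin: {hairpin:.1f}% cleaved")
print(f"linear: {linear:.1f}% cleaved")
print(f"fold difference: {fold_difference(hairpin, linear):.1f}")
```

Because the ratio of two percentages becomes unstable as the denominator approaches zero, a comparison of this kind is only meaningful when both substrates show measurable cleavage.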
For deaminase assays with the addition of whole cell lysate, 40 µg of SKBR3 whole cell lysate (generated as in 17 ) or buffer was pre-incubated at 37 °C for 30 min with the 20 nM substrate in 20 mM Tris HCl pH 7.5, 1 mM DTT, and 1 mM EDTA with 5 units of UDG, in a 19 µL volume prior to the addition of A3A. After preincubation, 1 µL of A3A was added to 20 nM final concentration, followed by incubation for an additional 30 min at 37 °C, and reactions were then processed like the other deaminase assays. Similarly, RPA-containing deaminase assays involved an initial 30 min pre-incubation of the deamination substrate with a twofold excess of RPA relative to substrate at 37 °C, prior to the addition of A3A.
Scientific Reports | (2021) 11:21008 | https://doi.org/10.1038/s41598-021-00435-y www.nature.com/scientificreports/
RPA EMSA with linear or hairpin substrate. RPA was omitted from, or added at 25, 50, or 100 nM to, binding reactions containing 50 nM 5′-Cy5-labeled oligonucleotide substrate (hairpin-forming: oTM-814 or linear ssDNA: oTM-910), 20 mM Tris-HCl pH 7.5, 1 mM DTT, and 1 mM EDTA, with 5 units of UDG. Binding reactions were incubated at room temperature for 15 min, before adding glycerol to 30%, and electrophoresing through a 6% native polyacrylamide gel on ice at 80 V for 1 h. Gels were imaged on a BioRad Chemidoc using the Cy5 setting. Percent substrate bound was determined by quantification of the intensity of the size-shifted (bound) band relative to unbound substrate, using the "Volume" tool on BioRad's Image Lab software version 6.0.1.
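As a rough sketch of the EMSA arithmetic, the snippet below computes the RPA:substrate molar ratios implied by the stated concentrations and a percent-bound value. The text does not give an explicit formula for percent bound, so the definition used here, bound / (bound + unbound) × 100, is an assumption, as are the function names and the example intensities.

```python
RPA_NM = [0, 25, 50, 100]   # RPA concentrations stated in the text
SUBSTRATE_NM = 50           # Cy5-labeled substrate concentration stated in the text


def molar_ratio(protein_nm, substrate_nm):
    """Protein:substrate molar ratio for one binding reaction."""
    if substrate_nm <= 0:
        raise ValueError("substrate concentration must be positive")
    return protein_nm / substrate_nm


def percent_bound(bound_intensity, unbound_intensity):
    """Assumed definition: bound / (bound + unbound) x 100."""
    total = bound_intensity + unbound_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return bound_intensity / total * 100.0


ratios = [molar_ratio(rpa, SUBSTRATE_NM) for rpa in RPA_NM]
print("RPA:substrate ratios:", ratios)

# Hypothetical band intensities (arbitrary units), for illustration only.
print(f"percent bound: {percent_bound(9000, 1000):.1f}")
```

Under these concentrations the titration spans sub-stoichiometric to twofold excess RPA, which matches the twofold excess used in the RPA-containing deaminase assays.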
In vitro transcription. The dsDNA substrate was prepared and purified as above using the oligos ALH287 and ALH288 (see Table S1). The gel band with the correct annealing product was determined by restriction digest, as double-stranded sections contained AseI and PsiI sites that would only be cut if annealed properly (Fig. S3). Deaminase activity assays with purified APOBEC3A (with concentrations as indicated), 50 units of T7 RNA Polymerase (NEB), 5 units of Uracil DNA glycosylase (NEB), 20 nM oligonucleotide substrate, 1 mM each of GTP, CTP, and UTP, 40 mM Tris HCl pH 8, 6 mM MgCl2, 1 mM DTT, and 1 mM EDTA, in a 20 μL volume were incubated for 24 h at 37 °C. Control reactions contained no T7 Polymerase or no A3A enzyme to ensure that deamination was transcription-dependent. In samples without enzyme, an equal volume of 50% glycerol was used instead. Reactions were terminated, products separated, and abundances quantified as described above for the measurement of deaminase activity. All full-length gel images of cytidine deaminase assays are presented in the supplementary information (Figs. S4-S8).
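The stalling logic of this assay (transcription proceeds until the polymerase reaches the first template position that requires the omitted nucleotide) can be illustrated with a short sketch. The template sequence below is hypothetical and is not taken from Table S1; it only reproduces the AAGT motif on the transcribed strand described in the Fig. 5 legend. The function name and the simplified model (no initiation or promoter effects) are likewise assumptions.

```python
# DNA template base (transcribed strand) -> RNA nucleotide incorporated.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def stalled_transcript(template_3to5, available_ntps):
    """Return the RNA (5'->3') made before the first unavailable nucleotide.

    template_3to5: transcribed strand, written 3'->5' (the order it is read).
    available_ntps: set of RNA nucleotides supplied, e.g. {"G", "C", "U"}.
    """
    rna = []
    for base in template_3to5.upper():
        nucleotide = TEMPLATE_TO_RNA[base]
        if nucleotide not in available_ntps:
            break  # polymerase stalls: the required NTP is missing
        rna.append(nucleotide)
    return "".join(rna)


# Hypothetical transcribed strand (3'->5') whose first T sits in an AAGT motif.
template = "CCCGAGAGGAAGTCACC"

no_atp = stalled_transcript(template, {"G", "C", "U"})
all_ntps = stalled_transcript(template, {"A", "G", "C", "U"})

print("without ATP:", no_atp, f"({len(no_atp)} nt)")
print("with all NTPs:", all_ntps, f"({len(all_ntps)} nt)")
```

In this toy model the transcript made without ATP ends in UUC, directly before the A that would pair with the template T, mirroring the design in which the polymerase stalls across from the TTCA target on the non-transcribed strand.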