Development and characterization of a DNA aptamer for MLL-AF9 expressing acute myeloid leukemia cells using whole cell-SELEX

Current classes of cancer therapeutics have negative side effects stemming from off-target cytotoxicity. One way to avoid this would be to use a drug delivery system decorated with targeting moieties, such as an aptamer, if a targeted aptamer is available. In this study, aptamers were selected against acute myeloid leukemia (AML) cells expressing the MLL-AF9 oncogene through systematic evolution of ligands by exponential enrichment (SELEX). Twelve rounds of SELEX, including two counter selections against fibroblast cells, were completed. Aptamer pools were sequenced, and three candidate sequences were identified. These sequences consisted of two 23-base primer regions flanking a 30-base central domain. Binding studies were performed using flow cytometry, and the lead sequence had a binding constant of 37.5 ± 2.5 nM to AML cells, while displaying no binding to fibroblast or umbilical cord blood cells at 200 nM. A truncation study of the lead sequence was done using nine shortened sequences, and showed the 5′ primer was not important for binding. The lead sequence was tested against seven AML patient cultures, and five cultures showed binding at 200 nM. In summary, a DNA aptamer specific to AML cells was developed and characterized for future drug-aptamer conjugates.

www.nature.com/scientificreports/ A summary schematic of the SELEX workflow is shown in Figure S1. Briefly, a single-stranded DNA library, built on a template with a 30-base randomized region flanked by known 5′- and 3′-primer regions, was subjected to consecutive rounds of binding and elution to enrich DNA sequences specific to the target cell. In each selection round, the DNA pool was heated to 95 °C. Once it had cooled to room temperature, one million cells were suspended in the DNA pool-buffer solution and incubated. After incubation, cells were pelleted, and the supernatant was removed and discarded to partition non-binding sequences. Cells were resuspended in sterile water and bound sequences were recovered with heat. Samples were then centrifuged and the supernatant was used in the next round as the enriched DNA pool. Counter selection was done by incubating an enriched pool with fibroblast cells, pelleting by centrifugation to partition bound sequences, and directly incubating the supernatant (containing nonbinding sequences) with MA9Ras cells. Two counter selection rounds, following rounds 5 and 11, were done to eliminate off-target, non-specific binding. DNA pools collected after each round of selection were amplified and monitored by SYBR green qPCR. The cycle at which fluorescence can be detected is termed the quantitation cycle (Cq) and is the result generated by the thermocycler software following a qPCR experiment (representative traces are shown in Figure S2). A lower Cq value reflects a higher initial copy number of the target, which indicates that more binding sequences were recovered from the cell sample. The Cq values for each designated selection round are reported in Fig. 2. Increased library binding was indicated by a decrease in Cq: the first round of selection yielded a Cq of 18.7, whereas by the final round of selection the Cq had decreased to 13.8.
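As a rough illustration of how a Cq shift relates to the amount of recovered template, the short Python sketch below converts the reported Cq values (18.7 in round 1, 13.8 in the final round) into an approximate fold change. It assumes ideal amplification efficiency (one doubling per cycle) and a common fluorescence threshold, neither of which is stated in the text; the function name and the `efficiency` parameter are illustrative only.

```python
def fold_change_from_cq(cq_early: float, cq_late: float, efficiency: float = 1.0) -> float:
    """Approximate fold increase in starting template implied by a Cq decrease.

    efficiency = 1.0 means perfect doubling in every cycle. This is an
    assumption made for the sketch, not a value reported in the study.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** (cq_early - cq_late)


# Cq values reported for the first and final selection rounds
delta_cq = 18.7 - 13.8
fold = fold_change_from_cq(18.7, 13.8)
print(f"Delta Cq = {delta_cq:.1f} cycles, about {fold:.0f}-fold more template")
```

Under these assumptions, the 4.9-cycle decrease corresponds to roughly a 30-fold increase in recovered template; a lower real-world efficiency would give a smaller estimate.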
Other studies using qPCR have shown that a reduction in Cq values between selection rounds and an increase in the fluorescence signal indicate pool enrichment 61,62. The consistency of the Cq value, between thirteen and fourteen in the last three selections, was encouraging evidence that the pool had reached maximum enrichment, and so the libraries were evaluated using high throughput sequencing following the 12th round. Sequencing data were analyzed using AptaSUITE 63, which offers several algorithms for aptamer sequencing analyses. The software generates several summary plots and values to gauge the success of the selection, including the total processed reads, the distribution of these reads between each pool analyzed, the % base distribution of each pool analyzed, and the unique fraction of each library. These data are summarized in Figure S3. Top sequence candidates were identified by sorting the sequencing data by highest count per million (Figure S4). Three sequences of interest were identified based on their high relative frequency compared to all other sequences, and their enrichment trends (Figure S4). Generally, if the enriched libraries are sufficiently homogeneous, clusters of aptamer families can be identified using the software. Though clustering failed for this selection due to the high fraction of the pool that was unique, a seven-base consensus motif and its derivatives were identified in the top sequence candidates (Figure S5). Specifically, the count generated by AptaSUITE is the number of occurrences of a sequence within a sequenced round, normalized per million reads, whereas the enrichment is the ratio of the count from one sequenced round to the next. A high enrichment means that a given sequence appeared more often in the later round. Comparing the enrichment of a sequence, and not just its copy number, from earlier to later selection rounds allows for the choice of sequences with improved aptamer-target binding 64.
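The count and enrichment definitions above can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic only, not AptaSUITE's own implementation; the function names and the read counts are hypothetical.

```python
def counts_per_million(raw_count: int, total_reads: int) -> float:
    """Normalize a sequence's raw read count to counts per million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return raw_count / total_reads * 1_000_000


def enrichment(cpm_earlier: float, cpm_later: float) -> float:
    """Ratio of the normalized count in a later round to that in an earlier round."""
    if cpm_earlier <= 0:
        raise ValueError("cpm_earlier must be positive")
    return cpm_later / cpm_earlier


# Hypothetical (raw count, total reads) for one sequence in two sequenced rounds
rounds = {"round_11": (120, 400_000), "round_12": (900, 500_000)}
cpm = {name: counts_per_million(count, total) for name, (count, total) in rounds.items()}
ratio = enrichment(cpm["round_11"], cpm["round_12"])

for name, value in cpm.items():
    print(f"{name}: {value:.0f} counts per million")
print(f"enrichment: {ratio:.1f}")
```

Normalizing before taking the ratio matters because sequenced pools rarely contain the same number of reads; comparing raw counts would confound enrichment with sequencing depth.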
Full sequences of the three chosen aptamers can be found in Table 1 along with their measured apparent dissociation constants. The secondary structures and ΔG values of all candidate aptamers were predicted using RNAstructure software. All three candidates have relatively simple structures, with KGE01 and KGE02 having similar structures (Fig. 3B,D,F); however, the structures may be more complex when the aptamers interact with their target. ΔG values for KGE01, KGE02, and KGE03 were −12.8 kJ, −6.8 kJ, and −5.7 kJ, respectively. Therefore, it was necessary to determine the binding constants alongside the predicted structures.
Screening for the binding affinity of selected aptamers and KD determination. To evaluate the binding affinity, the three selected aptamers were modified with a 5′ fluorescein tag and were subjected to binding studies via flow cytometry analysis against the target MA9Ras cell line to determine their apparent dissociation constant (KD) values. All KD values obtained were in the nanomolar range, which is typical for aptamers developed against cancer cells using cell-SELEX 65. Two of the selected aptamers, KGE03 and KGE02, showed KD values in the low nanomolar range (12.4 ± 2.2 nM and 37.5 ± 2.5 nM, respectively) while the third, KGE01, showed a higher value (153.9 ± 98.2 nM) (Fig. 3). KGE02 was chosen as the lead aptamer because of three qualities: the high enrichment between rounds, the nanomolar binding constant to the target, and the double hairpin secondary structure with the lower ΔG, which allows for possible truncation. KGE02 showed the highest mean fluorescence intensity of the three aptamers and had a Hill coefficient of 2, as compared to 1 for the other aptamers (fit equation data not shown). A Hill coefficient greater than one indicates positive cooperativity, in which the binding of one aptamer could facilitate the binding of subsequent aptamers. However, before further testing, it was necessary to first check the specificity of this particular sequence.
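To make the role of the Hill coefficient concrete, the following Python sketch evaluates a standard Hill binding model using the reported KGE02 value (KD of 37.5 nM, Hill coefficient of 2). Because the fit equation is not shown in the text, the specific form used here, with KD taken as the half-saturation concentration, is an assumption, and the function name is illustrative.

```python
def hill_fraction_bound(conc_nm: float, kd_nm: float, hill_n: float = 1.0) -> float:
    """Fraction of maximal binding under a Hill model.

    Assumed form: theta = L^n / (KD^n + L^n), where KD is the ligand
    concentration giving half-maximal binding.
    """
    if conc_nm < 0 or kd_nm <= 0 or hill_n <= 0:
        raise ValueError("conc_nm must be >= 0; kd_nm and hill_n must be > 0")
    scaled = conc_nm ** hill_n
    return scaled / (kd_nm ** hill_n + scaled)


# Compare a cooperative (n = 2) and a non-cooperative (n = 1) curve at KD = 37.5 nM
for conc in (10.0, 37.5, 200.0):
    cooperative = hill_fraction_bound(conc, 37.5, hill_n=2.0)
    simple = hill_fraction_bound(conc, 37.5, hill_n=1.0)
    print(f"{conc:6.1f} nM  n=2: {cooperative:.2f}  n=1: {simple:.2f}")
```

In this model both curves pass through half-maximal binding at the KD, but the curve with a Hill coefficient of 2 is steeper: it lies below the non-cooperative curve at concentrations under the KD and above it at higher concentrations.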
Binding specificity of aptamers to cancer and normal cell cultures. The binding specificity of KGE02 was further tested against normal and cancer cell lines via flow cytometry because of its predicted three-dimensional structure and low KD. The predicted structure had two central short hairpin loops flanked by 9 or more bases on either end, which would allow for potential truncation. As expected, KGE02 bound with high affinity, shown by the shift in fluorescence, to the target MA9Ras cells (red line), but showed no affinity for WI-38 normal human fetal lung fibroblast cells (blue line) (Fig. 4A). To further test its specificity, KGE02 was incubated with two different lines of CD34+ human umbilical cord blood cell cultures. Results showed it had no affinity to either cell line (Fig. 4B). It is worth noting that the MA9Ras cell line was created by expressing the MLL/AF9 and NRAS(G12D) oncogenes in CD34+ UCB cells 58.
We next used fluorescence and confocal microscopy to further confirm binding and gain some understanding of the localization of the cognate biomarker for KGE02. Fluorescein-tagged KGE02 was incubated with the cells, and fluorescence images before and after a DNA nuclease treatment can be seen in Fig. 5. KGE02 shows distinct localization in the membrane of the MA9Ras cells (Fig. 5A). A scrambled version of the KGE02 sequence (Table 1) shows minimal binding at the same concentration (Fig. 5C), confirming that the interaction noted is more than non-specific oligonucleotide binding. After a DNA nuclease treatment of the KGE02-labelled cells, the majority of the fluorescence signal is lost (Fig. 5B), suggesting that the associated target for the aptamer is likely membrane-bound. We confirmed these observations by confocal microscopy. Once again, fluorescence from tagged KGE02 can be observed with a punctate localization surrounding the cells (Fig. 5D), which is mostly, but not completely, lost after DNA nuclease treatment (Fig. 5E). This fluorescence was notably absent when the cells were treated with the tagged scrambled sequence (Fig. 5F and Figure S6). Having determined its selectivity to the target and gathered some information about the potential location of the cognate target, the next step was to truncate the aptamer to a shorter sequence to lower the cost of synthesis and perhaps to increase cell uptake in future applications.
Truncation of KGE02 and assessment of binding ability. Once KGE02 was shown to be specific, the next step was to truncate the 76-base sequence to create a smaller binding sequence. Truncated sequences were designed based on the predicted three-dimensional structure (Fig. 3B), which shows two hairpin loops formed between bases 20-70. Truncation from the 5′-end produced three sequences, in which either a portion (10 or 20 bases) or the entire 5′ primer domain (23 bases) was removed. Truncation from the 3′-end produced two sequences, in which either 20 bases (interrupting the second hairpin loop) or 42 bases (interrupting the first hairpin loop) were removed. Then, to determine if one or both hairpin loops were important to binding, three more sequences were produced. These sequences included both loops, just the first loop, and just the second loop. The final truncated sequence was the central region (30 bases) lacking both primers. Sequence lengths are listed in Fig. 6 (left side).
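The end truncations described above can be sketched as simple string slicing. The Python example below uses the primer sequences given in Materials and methods, but because the KGE02 central domain (Table 1) is not reproduced here, the 30-base central region is represented by a placeholder of `N` characters. The helper name `truncate` and the dictionary labels are illustrative, and the internal loop-only constructs are omitted because their exact boundaries are not stated in this passage.

```python
# Primer regions from the library template; the central 30 bases are a placeholder
FIVE_PRIME_PRIMER = "TAGGGAAGAGAAGGACATATGAT"
THREE_PRIME_PRIMER = "TTGACTAGTACATGACCACTTGA"
CENTRAL_PLACEHOLDER = "N" * 30  # stands in for the KGE02 central domain (assumption)

FULL_LENGTH = FIVE_PRIME_PRIMER + CENTRAL_PLACEHOLDER + THREE_PRIME_PRIMER


def truncate(seq: str, from_5: int = 0, from_3: int = 0) -> str:
    """Remove from_5 bases from the 5' end and from_3 bases from the 3' end."""
    if from_5 < 0 or from_3 < 0 or from_5 + from_3 >= len(seq):
        raise ValueError("truncation must leave at least one base")
    return seq[from_5:len(seq) - from_3]


variants = {
    "5'-10": truncate(FULL_LENGTH, from_5=10),
    "5'-20": truncate(FULL_LENGTH, from_5=20),
    "5'-23": truncate(FULL_LENGTH, from_5=23),   # entire 5' primer removed
    "3'-20": truncate(FULL_LENGTH, from_3=20),
    "3'-42": truncate(FULL_LENGTH, from_3=42),
    "central": truncate(FULL_LENGTH, from_5=23, from_3=23),  # both primers removed
}

for name, seq in variants.items():
    print(f"{name:8s} {len(seq):3d} bases")
```

Removing the full 23-base 5′ primer from the 76-base template leaves 53 bases, and removing both primers leaves the 30-base central region.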
To determine binding ability, a blocking assay was performed. The premise of the assay was that if truncated sequences bound better than the 5′-fluorescein full length KGE02, the fluorescence signal would be decreased, based on competitive displacement. The larger the decrease in fluorescence, the greater the binding ability of the truncated sequence. MA9Ras cells were first incubated with 1 µM of the truncated sequence (non-fluorescent) for 60 min. Then, 200 nM of 5′-fluorescein tagged KGE02 (Fl-Apt) was directly added to the reaction and incubated for 60 min. Reactions were then pelleted, supernatant removed, resuspended in fresh buffer, and the total fluorescence of the solution was measured. As a maximum fluorescence control, MA9Ras cells were incubated with just 200 nM of 5′-fluorescein tagged KGE02; this gave the maximum fluorescence. As a minimum fluorescence control, MA9Ras cells were incubated first with 1 µM of full length non-fluorescent KGE02, followed by 200 nM of Fl-Apt; this gave the minimum fluorescence (maximum blocked). Results are quantified in Fig. 6.
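One way to express the outcome of such a blocking assay is to scale each reading between the two controls. The Python sketch below does this under the assumption of a linear normalization between the maximum-fluorescence control (Fl-Apt alone) and the minimum-fluorescence control (blocked with full-length KGE02); the text does not state that the authors normalized their data this way, and the fluorescence readings are hypothetical.

```python
def percent_blocking(f_sample: float, f_max: float, f_min: float) -> float:
    """Blocking relative to the controls: 0% = Fl-Apt alone, 100% = full-length block.

    Linear scaling between the controls is an assumption made for this sketch.
    """
    span = f_max - f_min
    if span <= 0:
        raise ValueError("f_max must exceed f_min")
    return (f_max - f_sample) / span * 100.0


# Hypothetical plate-reader values in arbitrary fluorescence units
F_MAX = 1000.0  # cells + 200 nM Fl-Apt only
F_MIN = 200.0   # cells + 1 uM unlabeled full-length KGE02, then 200 nM Fl-Apt

readings = {"strong blocker": 300.0, "weak blocker": 950.0}
for name, value in readings.items():
    print(f"{name}: {percent_blocking(value, F_MAX, F_MIN):.1f}% blocked")
```

Scaling to both controls makes results from different plates or days easier to compare, since the raw fluorescence depends on cell number and instrument settings.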
Based on truncation results, it can be assumed that both hairpin loops are important in aptamer binding. Sequences that interrupted either loop (20-46, 42-70, 3′-20, and 3′-42) all showed little to no decrease in fluorescence. While the 30-base central domain lacking both primers showed a decrease in fluorescence, it was not statistically significant (p > 0.05) compared to the maximum fluorescence control of Fl-Apt alone. Sequences that were truncated from the 5′ end showed the greatest blocking ability. Sequences 5′-10, 5′-20, and 5′-23 all showed a significant decrease in fluorescence relative to Fl-Apt alone and approached the binding ability of the minimum fluorescence control. It can be concluded that both hairpin loops are necessary for aptamer binding and that the 5′ primer is not important to the binding ability of the aptamer. To check that the removal of the 5′ primer did not change the predicted secondary structure, the structure of the shortened 53-base sequence was predicted using RNAstructure software. The double hairpin structure in the full sequence remains uninterrupted in the shortened sequence, which can be seen in Figure S7.
Binding of KGE02 to primary AML patient cells. 5′ […] There is a mix of karyotypes among the samples that show binding, and the two samples that did not show binding are not similar in karyotype. Samples 35 and 97 could be considered similar subtypes, both showing the RAM immunophenotype, which was recently identified as a very aggressive type of AML 66,67; the aptamer bound one RAM sample and not the other. This suggests that aptamer binding does not correlate with karyotype or AML subtype. However, it does indicate that the aptamer binds a biomarker target present on many AML subtypes, whereas this target does not appear on healthy umbilical cord blood cells, which showed no binding. Future work will involve efforts to characterize the molecular target of the aptamer 68,69.

Discussion
Whole-cell SELEX has been used in the literature to produce aptamers against a variety of targets. While there are some AML aptamers designed by whole-cell SELEX 70, there is still a lack of published aptamers that bind multiple types of acute myeloid leukemia cells without using specific markers. Our technique and approach allowed for the development of a DNA library pool that binds MLL-AF9 RAS cells, demonstrating that the whole-cell SELEX process using a cell line expressing a common oncogene was successful.
Characterization of cell-based aptamers relies strongly on flow cytometry, where one end of the aptamer is labeled with a fluorophore. High throughput sequencing of multiple selection rounds makes it possible to compare enrichment and count between sequences. This allowed for the selection of three aptamers with different and unique sequences. The binding affinity of the three chosen aptamers was measured via flow cytometry using a fluorescein tag, which is a common, commercially available fluorophore. All three aptamers had binding constants in the nanomolar range (Table 1), so the secondary structures had to be considered. While KGE03 had the lowest binding constant, its secondary structure was complex, with multiple loops and junctures. KGE01 had a simpler structure but had the weakest binding. KGE02 had the desired low nanomolar binding constant (37.5 ± 2.5 nM), and the secondary structure was not only simple (containing two hairpin loops that formed almost a key-like structure) but also had room on both the 5′ and 3′ ends for truncation (Fig. 3). Prior to truncation, KGE02 was subjected to binding specificity studies. Fibroblast cells were used for counter selection in the SELEX process; therefore, it was necessary to test whether the aptamer showed any affinity for them. Fibroblast cells were used because of the desire to directly eliminate those sequences that disperse to organs or tissues not of interest in future clinical applications. While there was a slight shift in fluorescence, equating to some binding to fibroblast cells, the shift was not as evident as the one seen when the aptamer was incubated with the target MA9Ras cells. Even though the target cells were made by expressing leukemic oncogenes in healthy CD34+ human umbilical cord blood cell cultures, KGE02 showed no apparent binding shift with these cultures, which suggests that the biomarker responsible for binding is expressed only on AML cells (Figs. 4, 5).
Because the full-length aptamer was 76 bases long, a truncation study was done to shorten the sequence, making it more cost-effective and easier to manufacture. The secondary structure showed two hairpin loops, so the truncated sequences were designed to cut into the primers and the hairpin loops. Truncation of the 3′ end reduced the aptamer's ability to bind, while loss of the 5′ primer still allowed the aptamer to bind (Fig. 6).
Because the goal is to be able to use the aptamer clinically, the affinity of KGE02 towards other subtypes of AML was tested. Five of the seven relapsed or refractory AML samples showed some binding affinity, and there was no clear correlation between binding and cell karyotype (Fig. 7). This indicates that KGE02 may effectively target a wide array of pediatric AML types and shows promise for future treatment with clinical drug conjugates.

Materials and methods
SELEX library and primers. The ssDNA library contained a 30-base randomized region flanked by 23-base PCR primer sequences (5′-TAG GGA AGA GAA GGA CAT ATGAT-N30-TTG ACT AGT ACA TGA CCA CTTGA-3′). A fluorescein-labeled 5′ primer and a poly-A 3′ primer were used during PCR. The ssDNA library was purchased from TriLink Biotechnologies (United States), and the primers were purchased from Eurofins Genomics (United States). The synthesized ssDNA was purified using 8% polyacrylamide gel electrophoresis.
Cell culture. Umbilical cord blood (UCB) CD34+ cells were isolated with the EasySep CD34+ Selection Kit (StemCell Technologies). The MA9.3Ras cell line has been described previously 68. Pre-existing and de-identified patient derived xenograft (PDX) cell lines were obtained from the Pediatric Avatar Program at Cincinnati Children's Hospital. MA9.3Ras AML cells were cultured in IMDM with 20% bovine calf serum. Human umbilical cord blood (UCB) cultures and primary patient cell lines were cultured in IMDM with 20% bovine calf serum and 1X Pen-Strep antibiotics, supplemented with 10 ng/mL of SCF, IL-3, IL-6, Flt-3L and TPO. Fibroblast cells were cultured in DMEM with 5% Fetal Bovine Serum, 1X Antibiotic/Antimycotic Solution, 1 mM Sodium Pyruvate, 2 mM Glutamax, 10 ng/mL Epidermal Growth Factor, 5 µg/mL Insulin, and 0.5 µg/mL Hydrocortisone.
[…] Children's Hospital after donating mothers provided informed consent (IRB #02-3-4x). De-identified PDX-derived leukemia cell lines were previously generated from diagnostic patient material originally obtained and utilized according to IRB protocols #2008-0021 and #2010-0658. Informed written consent of parents/guardians and assent of patients over 11 years old were obtained. All experiments were performed in accordance with relevant guidelines and regulations.

Systematic evolution of ligands by exponential enrichment (SELEX) procedure (in vitro). Positive selection: The DNA pool (20-30 µL) was heated at 95 °C for 5 min. The pool was removed from the heat, Binding Buffer (10 mM Tris-HCl pH 7.5, 2 mM MgCl2, 140 mM NaCl) was added to a total volume of 500 µL, and the solution was allowed to cool to room temperature. One million target cells were counted, pelleted, and washed twice with Binding Buffer. The cells were resuspended in the 500 µL solution containing the DNA pool. The sample was incubated at 37 °C for 30 min with shaking. The cells were pelleted and washed twice with Binding Buffer, then resuspended in 500 µL of sterile water. Cells were heated at 95 °C for 15 min to lyse. The supernatant was removed and used as the DNA pool for the next round.
Negative selection: The DNA pool (20-30 µL) was heated at 95 °C for 5 min. The pool was removed from the heat, Binding Buffer was added to a total volume of 500 µL, and the solution was allowed to cool to room temperature. One million counter cells were counted, pelleted, and washed twice with Binding Buffer. The cells were resuspended in the 500 µL solution containing the DNA pool, then incubated at 37 °C for 30 min with shaking. The cells were pelleted, and the supernatant was removed. One million target cells were counted, pelleted, and washed twice with Binding Buffer. The supernatant was added to the target cells and the positive selection protocol was completed.
Sequencing and structure prediction. High throughput sequencing (HTS) was performed using Illumina sequencing. The enriched ssDNA pools from selection rounds 0, 6, 8, 11, 12+, and 12-SELEX were amplified via PCR using Illumina special adapters (TruSeq primers) to a minimum number of cycles (20 cycles). PCR products were purified using 8% PAGE and the DNA was quantified using a Nanodrop spectrophotometer (Thermo Fisher, Canada). All amplified pools were combined to provide a total of 75 ng of DNA in each pool. Amplified pools were then sequenced at Carleton University using the Illumina MiSeq sequencing platform. AptaSUITE 63 (https://drivenbyentropy.github.io/) was used to analyze the sequencing data and RNAstructure software (https://rna.urmc.rochester.edu/RNAstructureWeb/) was used to predict the secondary structure of the candidate sequences.
Binding screen and KD determination using flow cytometry. Binding […]
Fluorescence and confocal microscopy. To prepare slides for microscopy, coverslips were coated with poly-L-lysine (P4832, Sigma-Aldrich) for 4 h at room temperature. Cells were grown on coverslips and treated with 200 nM of either 6-FAM-tagged KGE02 or 6-FAM-tagged scrambled aptamer (5′-6-FAM-ACA ACG GTA TGT TAT GTA TAT CAA TAC TAA CAG GTA CAG TCG GAT GAT TCG AAC ATA TCG GCG GTA GGA ACA CAG A-3′). After treatment with aptamer, cells were fixed with 4% paraformaldehyde for 10 min at room temperature, then washed with binding buffer. To remove the aptamer from the cell surface, the aptamer-coated cells were incubated in binding buffer containing 2.5 U/µL DNase I, amplification grade (AMPD1, Sigma-Aldrich), at 37 °C for 10 min. In all treatments, the cells were washed with binding buffer and imaged by fluorescence microscopy using an EVOS FL fluorescence microscope with a 40× objective. Cells were imaged with both brightfield and GFP filters, and the images were merged.
For the confocal experiments, cells were also stained for 15 min at room temperature with PureBlu DAPI Nuclear Staining Dye (Bio-Rad, Hercules, California), after which the coverslips were washed with 1× PBS. We then applied the appropriate volume of mounting medium, sealed the coverslips with nail polish, and observed them under the confocal microscope. Samples were run in triplicate. High magnification confocal photomicrographs were acquired using a Nikon C2 confocal system (Nikon Instruments Inc., Mississauga, Canada) on an Eclipse Ti2 inverted microscope (Nikon) with a Plan Apochromat 40× objective lens (0.95 numerical aperture). Cells were excited using lasers with excitation wavelengths of 405 nm and 510 nm to visualize 4′,6-diamidino-2-phenylindole (DAPI) and fluorescein-labeled aptamers, respectively. All images were acquired with identical laser power settings. Images were processed with NIS-Elements software (Nikon Corporation, Konan, Japan) so that DAPI and fluorescein labeling were pseudocolored blue and green, respectively. Image brightness and contrast were adjusted for each channel via the range of "lookup table" (LUT) values. The LUT range established for DNase-treated cells was applied to images from all treatment conditions to enable an accurate comparison of localization and fluorescence intensity between cell treatment conditions.
Truncation study. Truncation studies were done as a fluorescence blocking assay. Reactions contained 500,000 cells (MA9Ras) in 250 µL of Binding Buffer (2 mM MgCl2-HBSS). All aptamer sequences were heated at 95 °C in Binding Buffer for 5 min and cooled to room temperature before addition. The first incubation was done with 1 µM of non-fluorescent blocking sequence or no aptamer at room temperature for 60 min with shaking. After 60 min, 200 nM 5′-fluorescein tagged aptamer was added directly to all samples for the second incubation at room temperature for 60 min with shaking. Cells were pelleted and washed in 2 mM MgCl2-100 µg/mL BSA-HBSS. Samples were resuspended in 250 µL Binding Buffer and analyzed using a fluorimeter with a plate reader. Samples were run in triplicate. P values were calculated using Kaleidagraph.