Modeling clonal hematopoiesis in umbilical cord blood cells by CRISPR/Cas9

To investigate clonal hematopoiesis-associated gene mutations in vitro and to unravel their direct impact on the human hematopoietic stem and progenitor cell (HSPC) compartment, we targeted healthy, young hematopoietic progenitor cells, derived from umbilical cord blood samples, with CRISPR/Cas9 technology. Site-specific mutations were introduced in defined regions of DNMT3A, TET2, and ASXL1 in CD34+ progenitor cells, which were subsequently analyzed in short-term as well as long-term in vitro culture assays to assess self-renewal and differentiation capacities. Colony-forming unit (CFU) assays revealed enhanced self-renewal of TET2-mutated (TET2mut) cells, whereas ASXL1mut and DNMT3Amut cells did not show significant changes in short-term culture. Strikingly, enhanced colony formation was detected in long-term culture experiments in all mutants, indicating increased self-renewal capacities. While we could also demonstrate preferential clonal expansion of distinct cell clones for all mutants, the clonal composition after long-term culture revealed a mutation-specific impact on HSPCs. Thus, by using primary umbilical cord blood cells, we were able to investigate epigenetic driver mutations without confounding factors like age or a complex mutational landscape, and our findings provide evidence for a direct impact of clonal hematopoiesis-associated mutations on self-renewal and clonal composition of human stem and progenitor cells.


Sample Acquisition and Primary Cell Culture
The freshly collected blood was filtered through a 40-µm strainer and diluted with one volume of PBS/1 mM EDTA. Mononuclear cells were isolated via density centrifugation. After washing the cell pellet, remaining erythrocytes were eliminated by incubating the cell pellet for 30 min at 4 °C in red cell lysis buffer (155 mM NH4Cl, 10 mM KHCO3, 0.1 mM EDTA, pH 7.4). Viable cells were quantified via trypan blue exclusion in a Neubauer chamber. Stem and progenitor cells were enriched by magnetic separation using the MACS CD34 MicroBead Kit UltraPure (Miltenyi Biotec, Bergisch Gladbach, Germany) according to the manufacturer's instructions. Purified cells were cultured in StemSpan SFEM II (STEMCELL™ Technologies, Vancouver, Canada) expansion medium supplemented with SCF, FLT3-L, and TPO [1] (100 ng/ml each, all from PeproTech, Inc., Rocky Hill, New Jersey, United States). For differentiation analysis, the primary cord blood (CB)-derived cells were cultured in StemSpan SFEM II supplemented with SCF, FLT3-L (50 ng/ml each), TPO, IL-6 (20 ng/ml each), and IL-3 (10 ng/ml) for up to 3 weeks (short-term culture medium). The different medium compositions are shown in Figure S1. Unmodified cells were cultured in different cytokine combinations to determine the best conditions for expansion and short-term culture. For CD34+ expansion after transfection, we used medium B. For short-term culture experiments, the cells were cultured in medium D. The purpose of our short-term liquid culture system was to facilitate unbiased differentiation, because we wanted to analyze the direct consequences of DTA (DNMT3A, TET2, ASXL1) mutations in an unconfounded environment.

SDS-PAGE and Western Blotting
Cells were lysed in protein lysis buffer (50 mM Tris, 150 mM NaCl, 1 mM EDTA, 1% NP-40) using a Bioruptor sonicator (15 cycles, 15 s on/15 s off). The protein lysates were mixed with 4x Laemmli buffer (Bio-Rad Laboratories, Hercules, California, United States) containing 10% β-mercaptoethanol, and proteins were separated by electrophoresis using 8-12% SDS-PAGE gels in a Mini-PROTEAN® Tetra Cell system (Bio-Rad Laboratories). Subsequently, proteins were transferred onto Amersham™ Protran® Premium Western blotting nitrocellulose membrane (pore size 0.45 µm, Cytiva, Marlborough, Massachusetts, United States). Immunodetection was carried out using the antibodies listed in Table S2 according to the manufacturer's instructions. Protein bands were visualized using ECL™ Prime Western Blot reagent (Cytiva) and the ImageQuant LAS 4000 system (GE Healthcare, Chicago, Illinois, United States). Band intensities were quantified using ImageJ 1.48v. Intensities were normalized to β-Actin and calculated relative to the control sample.
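The two-step normalization described above (target band divided by the β-Actin band of the same lane, then expressed relative to the control sample) can be sketched as follows. This is a minimal illustration only: the function name, sample names, and densitometry values are hypothetical and are not taken from the study.

```python
from typing import Dict


def relative_band_intensity(
    target: Dict[str, float],
    loading: Dict[str, float],
    control: str,
) -> Dict[str, float]:
    """Normalize each target band to its loading control, then scale
    all samples so that the control sample equals 1.0."""
    # Step 1: target intensity divided by the loading-control intensity per lane.
    normalized = {sample: target[sample] / loading[sample] for sample in target}
    # Step 2: express every sample relative to the control sample.
    reference = normalized[control]
    return {sample: value / reference for sample, value in normalized.items()}


# Hypothetical densitometry readings (arbitrary units), for illustration only.
target_bands = {"control": 1200.0, "TET2mut": 300.0}
actin_bands = {"control": 1000.0, "TET2mut": 1000.0}

relative = relative_band_intensity(target_bands, actin_bands, control="control")
```

With these made-up numbers, the mutant sample ends up at roughly a quarter of the control signal, and the control is 1.0 by construction.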

Flow Cytometry Analysis of Cell Surface Markers
Cells were harvested and washed twice in PBS/0.5% BSA. The cell pellets were resuspended in antibody mix in PBS/BSA (Table S3). Cells were analyzed on a BD FACS™ Canto II flow cytometer (BD: Becton, Dickinson and Company, Franklin Lakes, New Jersey, United States). Analysis and gating were performed in FlowJo v.10 (BD).

Dot Blot Analysis of 5-Methylcytosine and 5-Hydroxymethylcytosine Levels
DNA was extracted by isopropanol precipitation and the concentration of all samples was adjusted. DNA was denatured for 5 min at 99 °C and spotted on a positively charged nylon membrane (Merck Group) in 2-fold serial dilutions. The membrane was air-dried for 30 min, UV cross-linked for 3 min on a transilluminator (350 nm), and blocked in 5% milk powder in TBS-T at 4 °C overnight. After washing the membrane three times, primary antibodies against 5-hydroxymethylcytosine (Active Motif, Inc., Carlsbad, California, United States) and 5-methylcytosine (Cell Signaling Technology, Danvers, Massachusetts, United States) were added in 5% BSA in TBS-T overnight at 4 °C (Table S2). The membranes were again washed three times in TBS-T and incubated with secondary antibodies for 90 min. Chemiluminescence was detected using the ImageQuant LAS 4000 system. As a loading control, total DNA was stained with 0.2% methylene blue in 0.3% sodium acetate for 15 min. Signal intensities were determined using ImageJ 1.48v. All dot blot signals were normalized to the methylene blue loading controls.

Deep Sequencing and Indel Analysis via CRISPRseq
Deep sequencing was performed for advanced indel detection. The following primers were used: ASXL1 5'-GGACCCTCGCAGACATTAAA-3' and 5'-CTCACCACCATCACCACTG-3'; DNMT3A 5'-CTTCAGCGGAGCGAAGAG-3' and 5'-GGTCCTGCTGTGTGGTTAG-3'; TET2 5'-CTGTGAGGCTGCAGTGATT-3' and 5'-CAACCAAAGATTGGGCTTTCC-3'. The resulting amplicons were pooled without overlap, and indexing for Illumina sequencing was performed using the NEBNext Ultra DNA Library Prep Kit and NEBNext Multiplex Oligos for Illumina (New England Biolabs®) according to the manufacturer's instructions. The libraries were single-end sequenced on a MiSeq sequencer using a MiSeq Reagent Kit v2 (300 cycles, Illumina, San Diego, California, United States). Sequencing reads were aligned to hg19 [3] using BWA-MEM [4]. The "Unknown indel analysis" pipeline of CRISPRseq was used as described by Tothova et al. [5]. Aligned reads were filtered for those mapping to the target amplicons of ASXL1, DNMT3A, and TET2. Single nucleotide variants were called with DeepVariant (version 0.9.0) [6] in WGS mode limited to the gene of interest. Mutated reads were uniquely labelled as follows: D for deletion or I for insertion, followed by the size of the indel and its start position (e.g., I:1 (25457243)). Only insertions and deletions found at least 10 times per sample were used for further analysis.
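The labelling scheme and the 10-read threshold can be illustrated with a short sketch that derives labels from the CIGAR strings of aligned reads. This is not the CRISPRseq pipeline itself. The function names are hypothetical, the reads are made up, and the position convention (first deleted reference base for deletions, the reference base following the inserted sequence for insertions) is an assumption that may differ from the convention used in the study.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_labels(ref_start: int, cigar: str) -> List[str]:
    """Return labels such as 'D:3 (110)' for every indel in one aligned read.

    ref_start is the 1-based leftmost reference position of the alignment.
    """
    labels = []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in ("I", "D"):
            labels.append(f"{op}:{n} ({pos})")
        # Only these operations consume reference bases.
        if op in "MDN=X":
            pos += n
    return labels


def frequent_indels(
    reads: Iterable[Tuple[int, str]], min_count: int = 10
) -> Dict[str, int]:
    """Count indel labels over all reads and keep those seen >= min_count times."""
    counts = Counter(
        label for start, cigar in reads for label in indel_labels(start, cigar)
    )
    return {label: n for label, n in counts.items() if n >= min_count}


# Made-up reads as (alignment start, CIGAR) pairs, for illustration only.
example_reads = (
    [(100, "10M1I5M")] * 12   # 1-bp insertion, seen 12 times
    + [(100, "10M3D5M")] * 3  # 3-bp deletion, seen 3 times (below threshold)
    + [(100, "16M")] * 20     # unedited reads
)
kept = frequent_indels(example_reads)
```

In this toy input only the insertion passes the filter; the deletion is dropped because it is supported by fewer than 10 reads.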

RNA Sequencing
Cells were harvested at different time points and snap-frozen until use. For mRNA-Seq, double-indexed libraries were prepared with the TruSeq™ Stranded mRNA Library Prep Kit (Illumina, #20020595). Library size distribution was assessed using an Agilent TapeStation. Libraries were sequenced on an Illumina NovaSeq 6000 SP flow cell in paired-end 2x100 mode. Demultiplexing was performed using bcl2fastq v2.20.0. Reads were aligned to GRCh38 and processed using the rna-seq-star-deseq2 snakemake-workflow version 1.1.2 (https://github.com/snakemake-workflows/rna-seq-star-deseq2; DOI: https://doi.org/10.5281/zenodo.4737358 [8]) with default parameters. Differentially expressed genes were visualized using Prism 9.1.0. Gene set enrichment analysis (GSEA) was performed using GSEA 4.1.0 with a list preranked by log2 fold change. The MSigDB curated C2 gene sets collection v7.4 (including BIOCARTA, KEGG, PID, REACTOME and WIKIPATHWAYS) was used to identify enrichments.

The target is indicated above the plot and the time point is shown on the x-axis. B) Indel frequencies compared between culture conditions for each target. No significant differences were determined by Wilcoxon rank-sum test; p-values were corrected for multiple testing.
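For the GSEA step of the RNA sequencing analysis, a list preranked by log2 fold change can be built from a differential expression table roughly as follows. This is a sketch under assumptions: the `log2FoldChange` column name follows the usual DESeq2 results layout, while the `gene` column name, the function names, the placeholder gene identifiers, and the choice to drop genes without an estimate are illustrative and not confirmed by the source.

```python
from typing import Iterable, List, Mapping, Tuple


def build_rank_list(rows: Iterable[Mapping[str, str]]) -> List[Tuple[str, float]]:
    """Return (gene, log2 fold change) pairs sorted from most up- to most
    down-regulated, skipping genes without a fold-change estimate."""
    ranked = []
    for row in rows:
        value = row.get("log2FoldChange")
        if value in (None, "", "NA"):
            continue  # assumption: genes without an estimate are left out
        ranked.append((row["gene"], float(value)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def to_rnk(ranked: Iterable[Tuple[str, float]]) -> str:
    """Format the ranking as tab-separated 'gene<TAB>score' lines (.rnk style)."""
    return "".join(f"{gene}\t{score}\n" for gene, score in ranked)


# Placeholder rows, for illustration only (not results from the study).
rows = [
    {"gene": "GENE_A", "log2FoldChange": "1.5"},
    {"gene": "GENE_B", "log2FoldChange": "NA"},
    {"gene": "GENE_C", "log2FoldChange": "-2.0"},
    {"gene": "GENE_D", "log2FoldChange": "0.3"},
]
ranked = build_rank_list(rows)
rnk_text = to_rnk(ranked)
```

Ranking by fold change alone ignores the uncertainty of each estimate; it is shown here only because it matches the ranking metric named in the text.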

Supplementary tables
Figure S10: Effect of non-randomly induced indels on clonal selection. Rational indel meta-analysis (RIMA) was performed to quantify the mean variant allele frequency (VAF) of c-MMEJ-associated variants 7 days after genome editing and after long-term culture in n=4 biological replicates. A) Mean (SD) VAF of c-MMEJ-derived indels before and after long-term culture, for each target gene and overall, showed a slightly, but not significantly, higher increase in VAF for c-MMEJ-derived versus non-c-MMEJ-derived variants. B) Relative changes in mean (SD) VAF after long-term culture demonstrated that non-randomly induced indels had no higher selective advantage than other variants (one-way ANOVA corrected for multiple testing).
Figure S11: Mutational composition of ASXL1mut samples. A) Short-term culture (7 days) and B) long-term culture. The biological replicate is indicated above the plots. The annotation above the bars shows the overall indel frequency for the short-term culture samples. The annotation on the bars indicates the protein consequence of the respective mutation. The 30 most frequent mutations are shown and the annotated variants are highlighted in the legend.
Figure S12: Mutational composition of DNMT3Amut samples. A) Short-term culture (7 days) and B) long-term culture. The biological replicate is indicated above the plots. The annotation above the bars shows the overall indel frequency as well as the frequency of the point mutation (SNV) for the short-term culture samples. The annotation on the bars indicates the protein consequence of the respective mutation. The 30 most frequent mutations are shown and the annotated variants are highlighted in the legend.
Figure S13: Mutational composition of TET2mut samples. A) Short-term culture (7 days) and B) long-term culture. The biological replicate is indicated above the plots. The annotation above the bars shows the overall indel frequency for the short-term culture samples. The annotation on the bars indicates the protein consequence of the respective mutation. The 30 most frequent mutations are shown and the annotated variants are highlighted in the legend.