The H2BG53D oncohistone directly upregulates ANXA3 transcription and enhances cell migration in pancreatic ductal adenocarcinoma

Supplementary data figures and tables

Yi Ching Esther Wan, Jiaxian Liu, Lina Zhu, Tze Zhen Evangeline Kang, Xiaoxuan Zhu, John Lis, Toyotaka Ishibashi, Charles G. Danko, Xin Wang, Kui Ming Chan

Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute of City University of Hong Kong, Shenzhen, China
Department of Molecular Biology and Genetics, Cornell University, NY, USA
Division of Life Science, Hong Kong University of Science and Technology, Hong Kong, China
James A. Baker Institute for Animal Health, Cornell University, NY, USA


Supplementary Fig. 3 | H2BG53D alters transcription of ANXA3 in vivo. (a) Genome viewer showing the normalized RNA-seq reads (upper) and PRO-seq reads (lower) of ANXA3 in two wildtype and two H2BG53D cell lines. Validation of the elevated expression of ANXA3 by (b) RT-qPCR (*p < 0.05; one-way ANOVA with LSD post hoc test) and (c) Western blotting. (d) Levels of transcription of the indicated genes were detected by RT-qPCR using primers at the indicated intron-exon boundaries (*p < 0.05). Schematic diagram showing exon (Ex)-intron (In) junctions along the gene bodies of ANXA3 and SNAP47. Cells were first incubated with 300 μM DRB for 3.5 hours, then washed with PBS and further incubated in fresh medium for the indicated times. Levels of pre-mRNA at the indicated regions were measured by RT-qPCR. Pre-mRNA values are normalized to those of the DMSO-treated control, which was set to 1. Results are shown as means ± standard deviation (SD) from three independent experiments (*p < 0.05 vs. WT, unpaired t-test).

GCGTTCGATTGGATGGCTAT ACCTGAGCAGTTTCTACCCC

…; 1% Sarkosyl (Sigma-Aldrich, # L5125)) and incubated at 37°C for 5 min. The run-on reaction was terminated by adding Trizol LS (Invitrogen, 10296010), and the RNA was pelleted by ethanol precipitation. RNA pellets were re-dissolved in nuclease-free water and briefly denatured at 65°C, followed by base hydrolysis with NaOH to produce 100-150 nt fragments. The biotinylated nascent transcripts were purified three times using Dynabeads™ M-280 Streptavidin (Invitrogen, 11206D), each round followed by Trizol (Invitrogen, 15596026) extraction and ethanol precipitation. The 5' caps of the transcripts were removed with RNA 5' Pyrophosphohydrolase (NEB, M0356S), and the 5' hydroxyl groups were repaired with T4 polynucleotide kinase (NEB, M0201). The libraries were then generated using TruSeq small RNA adapters and size-selected to a range of 140-350 bp using Solid Phase Reversible Immobilisation beads (Beckman Coulter AMPure XP, A63881) before being sequenced on an Illumina NextSeq 500 with 75 bp paired-end reads.

ATAC-seq
ATAC-seq libraries were generated as described in 3. 50,000 cells were harvested, washed once with 50 μl of cold 1X PBS and then resuspended in 50 μl of lysis buffer (10 mM Tris-HCl, pH 7.5; 10 mM NaCl; 3 mM MgCl2; 0.1% NP-40). Cells were then centrifuged for 10 min at 500 g. The supernatant, which contains the cytoplasmic components, was discarded and the pellet was collected. Transposition was initiated by adding 2X TD buffer with 2.5 μl of Tn5 transposase (Illumina, FC121-1030) in a total volume of 50 μl. Transposition was allowed to proceed for 30 min at 37°C in a thermomixer shaking at 500 rpm. Transposition reactions were cleaned up with the Qiagen MinElute Kit. Libraries were generated using the custom Nextera PCR primers 3 and were amplified for 10-12 cycles. Libraries were purified with AMPure beads to remove primer dimers and DNA fragments > 1,000 bp. Library quality was assessed using the Agilent Bioanalyzer High-Sensitivity DNA kit, and libraries were quantified using the NEBNext Library Quant Kit. Libraries were sequenced on an Illumina NextSeq 500 with 50 bp paired-end reads.

ChIP-qPCR
Cells were cross-linked with 1% PFA at room temperature for 5 min, and the formaldehyde was then quenched with 125 mM glycine at room temperature for 5 min. Cells were washed twice with 1X TBS, harvested by scraping in 1 ml extraction buffer (10 mM Tris-HCl, pH 7.5; 10 mM NaCl; 0.5% NP-40; proteinase inhibitor cocktail) and incubated on ice for 30 min. Nuclei were washed once with MNase digestion buffer (20 mM Tris-HCl, pH 7.5; 15 mM NaCl; 60 mM KCl; 2 mM CaCl2). Digestion was started by adding 5 μl MNase (NEB, M0247S; diluted 1:10) to the nuclei suspension, and the reaction was incubated at 37°C with shaking at 500 rpm for 5 min. Digestion was quenched by adding 2X STOP buffer (100 mM Tris-HCl, pH 8.0; 20 mM EDTA; 200 mM NaCl; 2% Triton X-100; 0.2% sodium deoxycholate). Soluble chromatin was collected after two sequential high-speed centrifugations of the sonicated lysate (10,000 g for 5 min and 15 min at 4°C). 5% of the lysate was taken as input, and the remaining lysate was incubated with specific antibodies at 4°C overnight. 30 μl of pre-washed Protein G Sepharose (GE Healthcare, 17061802) was added to each sample and incubated at 4°C for 1-2 hours. The beads were washed once with ChIP lysis buffer, once with lysis buffer containing 0.5 M NaCl, once with Tris/LiCl buffer (10 mM Tris, pH 8.0; 0.25 M LiCl; 0.5% NP-40; 0.5% Na-deoxycholate; 1 mM EDTA) and twice with Tris/EDTA buffer (50 mM Tris, pH 8.0; 10 mM EDTA). After washing, 100 μl of 10% Chelex (Bio-Rad, cat. no. 142-1253) was added to the washed Protein G beads and boiled at 95°C for 10 min; 5 μl of 20 mg/ml Proteinase K (NEB, P8107S) was then added and incubated at 37°C for 30 min. Samples were boiled again for 10 min to inactivate Proteinase K and centrifuged to collect the supernatant. 100 μl of 20 mM Tris, pH 8.0, was added to the pellet, which was centrifuged again to collect the supernatant. The supernatants were combined and used as the template for qPCR.
qPCR was performed using an Applied Biosystems QuantStudio 3 Real-Time PCR System.
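
The protocol does not state how the ChIP-qPCR signal was quantified. Because 5% of the lysate was reserved as input, one common choice is the percent-of-input calculation. The Python sketch below assumes that method and roughly 100% primer efficiency (signal doubling each cycle); the function name and arguments are illustrative and not taken from the study.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.05) -> float:
    """Express an IP signal as a percentage of the total input chromatin.

    Assumption: amplification efficiency is ~100%, so one Ct unit
    corresponds to a two-fold difference in template amount.
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to the value expected if 100% of the lysate had been used.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)
```

With a 5% input, an IP sample that reaches threshold at the same cycle as the input corresponds to 5% of input.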

CUT&RUN sequencing data analysis
Reads were aligned to the human reference genome hg19 and the yeast reference genome sacCer3 by Bowtie 2 separately 4. Human reads were normalized by spike-in yeast reads using deepTools 5. MACS2 6 was used to call peaks in paired-end mode at p < 0.001, using the parental sample as control. Mutant-enriched peaks were identified by 'DESeq2' 7, taking yeast spike-in reads as the scale factor, after counting reads within peaks using BEDTools (p < 0.05 and log2 fold change > 0.5) 8. Peak annotation to different genomic regions, including gene body, promoter (transcription start site (TSS) ±1 kb), downstream (3 kb downstream of the transcription end site (TES)) and distal intergenic regions, was performed with the R package 'ChIPseeker' 9. As a random control, the same number of 'shuffled peaks' with the same lengths as the mutant-enriched peaks were randomly generated for each chromosome 1,000 times using BEDTools 8. A Chi-square test was then performed to assess the statistical significance of the difference in genomic distribution between mutant-enriched peaks and shuffled peaks. For each gene, the occupancy of FLAG was quantified as the number of reads located in the gene body and 1 kb upstream of the TSS, counted by BEDTools 8 and subsequently normalized by the yeast spike-in factor. To identify genes showing differential occupancy of FLAG between mutants and WT, differential occupancy analysis was further performed with 'DESeq2' (p < 0.01 and log2 fold change > 0.5) 7.
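
The exact formula for the yeast spike-in factor is not given. One widely used convention scales each sample by the inverse of its spike-in read count, relative to the sample with the fewest spike-in reads. The Python sketch below assumes that convention; the function names and sample labels are hypothetical.

```python
from typing import Dict, List


def spike_in_scale_factors(yeast_read_counts: Dict[str, int]) -> Dict[str, float]:
    """Return per-sample scale factors from yeast spike-in read counts.

    Assumption: the sample with the fewest spike-in reads is the reference
    (factor 1.0); samples with more spike-in reads are scaled down.
    """
    if not yeast_read_counts:
        raise ValueError("no samples given")
    if any(n <= 0 for n in yeast_read_counts.values()):
        raise ValueError("every sample needs a positive spike-in read count")
    reference = min(yeast_read_counts.values())
    return {sample: reference / n for sample, n in yeast_read_counts.items()}


def normalize_counts(raw_counts: List[float], scale_factor: float) -> List[float]:
    """Apply one sample's scale factor to its human read counts."""
    return [count * scale_factor for count in raw_counts]
```

A factor computed this way could be supplied to the `--scaleFactor` option of deepTools `bamCoverage` when generating coverage tracks.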

RNA-seq data analysis
Reads were mapped to the human reference genome hg19 and counted by STAR 2.6.1a 10 with default parameters. The R package 'DESeq2' 7 was used to perform differential expression analysis (p < 0.05 and |log2 fold change| > 0.25). 'bamCoverage' in deepTools 5 was used to generate bigwig files for IGV visualization, normalized as Reads Per Kilobase per Million mapped reads (RPKM). Gene set overrepresentation analysis based on differentially expressed genes was performed with the R package 'HTSanalyzeR2' 11 using hypergeometric tests, and significant gene sets were defined by Benjamini-Hochberg adjusted p < 0.05.
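
To illustrate the statistics behind the overrepresentation analysis, the Python sketch below computes a one-sided hypergeometric p-value for a single gene set and applies the Benjamini-Hochberg adjustment to a list of p-values. It is a standalone illustration under the stated definitions, not the 'HTSanalyzeR2' implementation; the function names are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_pvalue(universe: int, set_size: int, n_de: int, overlap: int) -> float:
    """P(X >= overlap) when n_de genes are drawn without replacement
    from a universe that contains set_size gene-set members."""
    upper = min(set_size, n_de)
    total = comb(universe, n_de)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, n_de - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene set would then be called significant when its adjusted p-value is below 0.05, matching the threshold stated above.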

PRO-seq data analysis
Adapter trimming, read alignment and coverage file generation followed the pipeline described by Dig et al. 12, using human genome hg19 as the reference genome. Count data were obtained from the bigwig files using the R package 'bigWig' 13. The R package 'DESeq2' 7 was used to perform differential expression analysis, and significantly differentially expressed genes were defined by p < 0.05 and |log2 fold change| > 0.25.
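
The thresholds above can be applied to a table of per-gene results as in the following Python sketch. The record type, field names and function name are hypothetical; in practice the values would come from the 'DESeq2' results table.

```python
from typing import Dict, Iterable, List, NamedTuple


class DEResult(NamedTuple):
    gene: str
    log2_fold_change: float
    pvalue: float


def classify_genes(
    results: Iterable[DEResult],
    p_cutoff: float = 0.05,
    lfc_cutoff: float = 0.25,
) -> Dict[str, List[str]]:
    """Split genes into up- and down-regulated sets using
    p < p_cutoff and |log2 fold change| > lfc_cutoff."""
    calls: Dict[str, List[str]] = {"up": [], "down": []}
    for r in results:
        if r.pvalue < p_cutoff and abs(r.log2_fold_change) > lfc_cutoff:
            direction = "up" if r.log2_fold_change > 0 else "down"
            calls[direction].append(r.gene)
    return calls
```

Both comparisons are strict, so a gene with p exactly 0.05 or |log2 fold change| exactly 0.25 is not called.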

ATAC-seq data analysis
Reads were aligned to the human reference genome hg19 by BWA 14 with default parameters. Reads mapping to the mitochondrial genome were removed, and duplicate reads were discarded. Only paired reads were used for further analysis. MACS2 was used to call peaks in paired-end mode (q < 0.05) 6. Coverage files for IGV visualization were generated from bam files using deepTools 5.

DRB treatment
5,6-Dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) (Sigma, D1916) was dissolved in DMSO as a 75 mM solution and stored at -20°C. S2VP10 wild-type and G53D mutant cells were grown overnight on 35 mm plates to 60%-70% confluency and then treated with 300 μM DRB for 3.5 hours. Cells were washed with PBS to remove the DRB and then incubated in fresh medium for various time periods. Following the incubation period, cells were washed with PBS and subjected to total RNA isolation using a universal RNA extraction kit (Takara, 9767). 500 ng of total RNA was used for reverse transcription with PrimeScript RT Master Mix (Takara, RR036A). The levels of pre-mRNA at various positions of the ANXA3 gene were determined by real-time PCR. Values obtained were normalized to the average level of 5S and GAPDH. Results were expressed relative to the pre-mRNA value of cells treated with DMSO.
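
The two-step normalization described above can be sketched in Python as follows. The sketch assumes Ct-based quantification with roughly 100% primer efficiency and takes the arithmetic mean of the 5S and GAPDH Ct values as the reference; the text does not specify these details, and the function names are hypothetical.

```python
from typing import Sequence


def reference_normalized_level(ct_target: float, ct_references: Sequence[float]) -> float:
    """Target abundance relative to the mean of the reference genes.

    Assumption: 2^-dCt quantification, averaging reference Ct values.
    """
    if not ct_references:
        raise ValueError("at least one reference Ct is required")
    mean_reference_ct = sum(ct_references) / len(ct_references)
    return 2.0 ** (mean_reference_ct - ct_target)


def pre_mrna_relative_to_dmso(
    ct_target_sample: float,
    ct_references_sample: Sequence[float],
    ct_target_dmso: float,
    ct_references_dmso: Sequence[float],
) -> float:
    """Pre-mRNA level of a DRB-released sample, with the DMSO control set to 1."""
    sample = reference_normalized_level(ct_target_sample, ct_references_sample)
    control = reference_normalized_level(ct_target_dmso, ct_references_dmso)
    return sample / control
```

Under these assumptions, a sample whose target Ct is one cycle later than the DMSO control, with unchanged reference Ct values, has a relative pre-mRNA level of 0.5.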