Atlas of quantitative single-base-resolution N6-methyl-adenine methylomes

Various methyltransferases and demethylases catalyse methylation and demethylation of N6-methyladenosine (m6A) and N6,2′-O-dimethyladenosine (m6Am) but precise methylomes uniquely mediated by each methyltransferase/demethylase are still lacking. Here, we develop m6A-Crosslinking-Exonuclease-sequencing (m6ACE-seq) to map transcriptome-wide m6A and m6Am at quantitative single-base-resolution. This allows for the generation of a comprehensive atlas of distinct methylomes uniquely mediated by every individual known methyltransferase or demethylase. Our atlas reveals METTL16 to indirectly impact manifold methylation targets beyond its consensus target motif and highlights the importance of precision in mapping PCIF1-dependent m6Am. Rather than reverse RNA methylation, we find that both ALKBH5 and FTO instead maintain their regulated sites in an unmethylated steady-state. In FTO’s absence, anomalous m6Am disrupts snRNA interaction with nuclear export machinery, potentially causing aberrant pre-mRNA splicing events.

FASTQ sequences were first trimmed of 5' and 3' adapter sequences and poly(A) tails using Cutadapt. The 8-mer 'N7B' (N=A/C/G/T, B=C/G/T) UMI located at the first 8 nucleotides of read 1 was registered and trimmed. Any complementary UMI sequence in read 2 was also trimmed. Reads were mapped to the methylated spike-in (Supplementary Data 1) using Bowtie2, or to the hg38 assembly transcriptome (Gencode v28 comprehensive gene annotations) using STAR. Aligned pairs that had the same mapping coordinates and UMIs were filtered out as PCR duplicates.

Read-start coordinates in hg38-mapped reads that began with an adenosine nucleotide and had a minimum mean read count of 1 across the triplicate samples were collated. m6A or m6Am sites were identified as read starts that were at least 2-fold enriched in m6ACE libraries relative to the corresponding input libraries. This enrichment was calculated using DESeq2 performed on A-only sites across triplicate pairs of m6ACE and corresponding input libraries (FDR<0.1, padj<0.05). Based on read-start patterns observed from m6ACE-seq of methylated spike-ins, we considered identified sites that were 1-4 nucleotides upstream of another identified significant Rm6AC site, or sites found within clustered read-starts, to be m6ACE-seq false-positives and filtered them out.
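The UMI handling and PCR-duplicate filtering described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that 'N7B' means seven N bases followed by one B base, that one representative of each duplicate group is retained, and that an aligned pair is summarised by the hypothetical `AlignedPair` record defined here. The function names are likewise illustrative.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple

# Assumed reading of 'N7B': seven N (A/C/G/T) followed by one B (C/G/T).
UMI_PATTERN = re.compile(r"^[ACGT]{7}[CGT]$")


def split_umi(read1_seq: str, umi_len: int = 8) -> Tuple[str, str]:
    """Split a read-1 sequence into (UMI, remaining sequence)."""
    return read1_seq[:umi_len], read1_seq[umi_len:]


def is_valid_umi(umi: str) -> bool:
    """Check that a UMI matches the assumed N7B design."""
    return bool(UMI_PATTERN.match(umi))


class AlignedPair(NamedTuple):
    """Hypothetical summary of one aligned read pair."""
    reference: str  # transcript or spike-in name
    start: int      # leftmost mapped coordinate of the pair
    end: int        # rightmost mapped coordinate of the pair
    umi: str        # UMI registered from read 1


def drop_pcr_duplicates(pairs: Iterable[AlignedPair]) -> List[AlignedPair]:
    """Keep the first pair seen for each (coordinates, UMI) combination.

    Assumption: pairs sharing mapping coordinates and UMI are PCR copies
    of one molecule, so a single representative is retained.
    """
    seen = set()
    unique = []
    for pair in pairs:
        key = (pair.reference, pair.start, pair.end, pair.umi)
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique
```

Keying on coordinates together with the UMI means that two molecules mapping to the same position survive as separate observations whenever their UMIs differ, which is what keeps read-start counts quantitative.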
To identify m6A or m6Am sites that were differentially methylated between sample conditions, we calculated the RML of each site in each sample. The read-start counts at positions -4 to 0 of each site in the m6ACE library were summed and divided by the read-start counts at positions -51 to 0 of the same site in the input library to give 'X'. Similarly, the read-start counts at positions -4 to 0 of the spike-in m6A site in the m6ACE library were summed and divided by the read-start counts at positions -21 to 0 of the same spike-in m6A site in the input library to give 'Y'. X was normalized to Y to give RML. RML values of each site were averaged across triplicates for each sample condition. A site was denoted as differentially methylated if the average RML differed between sample conditions with a log2 fold-change (LFC) cutoff of 2.0 (for methylase-KO or demethylase-OE induced RML reduction) or 1.0 (for demethylase-KO induced RML accumulation), as well as a one-tailed t-test p-value cutoff of <0.05.

Consensus motif analysis was performed using MEME-ChIP. Metagene analysis was performed using MetaPlotR. Gene ontology analysis was performed using the PANTHER classification system. The probability of overlap between lists of m6A/m6Am sites was calculated using a hypergeometric distribution. ROC AUC analysis was performed as previously described with the following changes: the collection of all m6A and m6Am sites present in WT cells or exhibiting RML accumulation in demethylase-KO cells was ranked with the most insignificant site first, based on the WT padj-value as calculated by DESeq2. An ROC curve was plotted based on the ability of a demethylase-regulated site (at LFC = 0.0, 0.5, 1.0, 1.5; t-test p<0.05) to predict insignificant m6A/m6Am sites in WT cells, and the area under the curve was calculated.
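The RML calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code. It assumes that read-start counts are held in a dictionary keyed by transcript coordinate, that negative positions lie at lower coordinates than the site, that the input-library counts in each window are summed before division, and that "normalized to Y" means X divided by Y. All function names are hypothetical, and the one-tailed t-test is omitted.

```python
import math
from statistics import mean
from typing import Mapping, Sequence


def window_sum(read_starts: Mapping[int, int], site: int, upstream: int) -> int:
    """Sum read-start counts from position -upstream to 0 relative to site."""
    return sum(read_starts.get(pos, 0) for pos in range(site - upstream, site + 1))


def enrichment(m6ace: Mapping[int, int], input_lib: Mapping[int, int],
               site: int, input_upstream: int) -> float:
    """m6ACE read starts at -4..0 divided by input read starts at -input_upstream..0."""
    # Raises ZeroDivisionError if the input window has no read starts.
    return window_sum(m6ace, site, 4) / window_sum(input_lib, site, input_upstream)


def rml(site_m6ace: Mapping[int, int], site_input: Mapping[int, int], site: int,
        spike_m6ace: Mapping[int, int], spike_input: Mapping[int, int],
        spike_site: int) -> float:
    """RML = X / Y (assumed meaning of 'X was normalized to Y')."""
    x = enrichment(site_m6ace, site_input, site, 51)          # endogenous site
    y = enrichment(spike_m6ace, spike_input, spike_site, 21)  # spike-in m6A site
    return x / y


def log2_fold_change(rml_reference: Sequence[float], rml_test: Sequence[float]) -> float:
    """log2 of mean test RML over mean reference RML across replicates."""
    return math.log2(mean(rml_test) / mean(rml_reference))
```

With this sign convention, an RML reduction in a methylase-KO sample would appear as an LFC of -2.0 or lower relative to WT, and an accumulation in a demethylase-KO sample as an LFC of 1.0 or higher. The direction of the comparison is an assumption of this sketch.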
Data were deposited in NCBI's Gene Expression Omnibus (GEO) under accession number GSE119094.
We used a minimum of biological triplicates.
No data were excluded.
Experiments were repeated to verify reproducibility. Randomization was not relevant as we worked with cell lines.
Blinding was not relevant in our studies.
Gene-knockout and knockdown cells were generated using CRISPR-Cas9 procedures or siRNA knockdown, respectively.