DOT1L-mediated murine neuronal differentiation associates with H3K79me2 accumulation and preserves SOX2-enhancer accessibility

During neuronal differentiation, the transcriptional profile and the epigenetic context of neural-committed cells are subject to significant rearrangements, but a systematic quantification of global histone modification changes is still missing. Here, we show that H3K79me2 increases and H3K27ac decreases globally during in vitro neuronal differentiation of murine embryonic stem cells. DOT1L mediates all three degrees of methylation of H3K79, and its enzymatic activity is critical for modulating cellular differentiation and reprogramming. In this context, we find that inhibition of DOT1L in neural progenitor cells biases the transcriptional state towards neuronal differentiation, resulting in transcriptional upregulation of genes marked with H3K27me3 at the promoter region. We further show that DOT1L inhibition affects the accessibility of SOX2-bound enhancers and impairs SOX2 binding in neural progenitors. Our work provides evidence that DOT1L activity gates differentiation of progenitors by allowing SOX2-dependent transcription of stemness programs.

Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection

Data analysis
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

Sample sizes were determined based on the recommendations for high-throughput sequencing experiments (a minimum of 2-3 replicates per condition). Given the high number of epigenetic marks included in this study, we generated 2 replicates per cell type.
No data were excluded.
All sequencing experiments presented in this paper were conducted once, with the appropriate biological replicates included in the experimental run. Immunoblotting was replicated at least twice, with each replicate including all biological replicates shown in the study. SOX2 ChIP followed by qPCR was performed once.
Randomization was not relevant to the present study because we worked with cultured cells grown in a highly controlled and homogeneous environment.
Authors contributing to the experimental validation involving SOX2 ChIP followed by qPCR were blinded with respect to the NGS predictions. For example, the researcher conducting the experimental validation was unaware of which loci were negative controls and which were differentially accessible.
Confirm that both raw and final processed data have been deposited in a public database such as GEO.
Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks.

Data access links
May remain private before publication.
Dot1l-HA-FLAG C57BL/6 mouse embryonic stem cells and in vitro derived neural progenitor cells (according to Bibel et al. 2008). Dot1l-HA-FLAG cells were generated by, and purchased from, inGenious Targeting Laboratory.
Cells were authenticated via immunostainings, RNA-seq and ChIP-seq.
The cells were not tested for mycoplasma contamination.
No commonly misidentified cell lines were used in the study.