The tumor suppressor microRNA let-7 inhibits human LINE-1 retrotransposition

Nearly half of the human genome is made of transposable elements (TEs) whose activity continues to impact its structure and function. Among them, Long INterspersed Element class 1 (LINE-1 or L1) elements are the only autonomously active TEs in humans. L1s are expressed and mobilized in different cancers, generating mutagenic insertions that could affect tumor malignancy. Tumor suppressor microRNAs are ∼22-nt RNAs that post-transcriptionally regulate oncogene expression and are frequently downregulated in cancer. Here we explore whether they also influence L1 mobilization. We show that downregulation of let-7 correlates with accumulation of L1 insertions in human lung cancer. Furthermore, we demonstrate that let-7 binds to the L1 mRNA and impairs the translation of the second L1-encoded protein, ORF2p, thereby reducing L1 mobilization. Overall, our data reveal that let-7, one of the most relevant microRNAs, maintains somatic genome integrity by restricting L1 retrotransposition.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
We used the Genomic Data Commons (GDC) Data Transfer Tool Client to download data from the TCGA repository. It is a standard client-based mechanism in support of high-performance data downloads.

Data analysis
The analysis of non-reference Mobile Element Insertions (MEIs) in whole genome sequencing (WGS) data was performed using the software package MELT (The Mobile Element Locator Tool), version 2.1.5. GraphPad Prism 6 was used for statistical analysis.

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Sara R. Heras, Sep 16, 2020

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

Data sets used in Figure 1 and Supplementary Figure 1 are detailed in Supplementary Table I. All data are available from the GDC legacy archive (https://portal.gdc.cancer.gov/legacy-archive). Although most data files can be accessed without access approval, WGS files require a special request because they contain potentially identifying information. Researchers interested in accessing restricted data can obtain authorization by following the instructions at https://gdc.cancer.gov/access-data/obtaining-access-controlled-data. The raw data underlying

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Table S2 and Supplemental Methods) and excluded insertion calls with fewer than three split reads. Samples in which the number of common reference polymorphic calls was abruptly reduced to under 10% after filtering were excluded. The rationale is that common reference polymorphic calls are real insertions found in both tumor and normal tissue, so they can be used as a measure of how the filter parameters affect putative de novo insertions. This 10% criterion was pre-established to exclude low-quality samples.
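
The two exclusion rules described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use MELT's output format: the `InsertionCall` record, the function names and the constants are hypothetical. It models only the split-read rule, not the other filter parameters referenced in Table S2 and the Supplemental Methods. It also assumes that the 10% criterion means "fewer than 10% of a sample's common reference polymorphic calls survive filtering", which is one possible reading of the text.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds taken from the text; the exact meaning of the
# 10% criterion is an assumption (fraction of common calls retained).
MIN_SPLIT_READS = 3
MIN_RETAINED_FRACTION = 0.10


@dataclass
class InsertionCall:
    """Hypothetical record for one insertion call in one sample."""
    sample_id: str
    split_reads: int
    is_common_polymorphism: bool  # common reference polymorphic call


def filter_calls(calls: List[InsertionCall]) -> List[InsertionCall]:
    """Drop calls with fewer than MIN_SPLIT_READS split reads."""
    return [c for c in calls if c.split_reads >= MIN_SPLIT_READS]


def retained_fraction(before: List[InsertionCall],
                      after: List[InsertionCall]) -> float:
    """Fraction of common polymorphic calls that survive filtering."""
    n_before = sum(c.is_common_polymorphism for c in before)
    n_after = sum(c.is_common_polymorphism for c in after)
    if n_before == 0:
        # No common calls to measure against: treated as 0.0 here,
        # so such a sample fails the quality check below.
        return 0.0
    return n_after / n_before


def sample_passes(calls: List[InsertionCall]) -> bool:
    """Keep a sample only if enough common calls survive filtering."""
    kept = filter_calls(calls)
    return retained_fraction(calls, kept) >= MIN_RETAINED_FRACTION
```

In this sketch the common polymorphic calls act as an internal control: because they are expected to be real insertions, a sample that loses nearly all of them after filtering is treated as low quality and excluded as a whole.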
Experiments are reliably reproduced. Generally, experiments were performed at least three times, unless otherwise noted in figure legends. The correlation between microRNA expression and the accumulation of tumor-specific L1 insertions identified by MELT was replicated using the number of tumor-specific L1 insertions obtained by Helman et al. in a different group of samples and using a different tool (Transpo-seq).

Randomization is not relevant to our study because the study does not involve the allocation of samples into experimental groups.

No blinding was performed in this study because group allocation was not involved in our study. The researchers were not blinded during data collection because most of the measurements were performed using instruments, or were quantitative in nature (RT-qPCR, blots or numbers of colonies on a plate).