Genome-wide analysis of the human p53 transcriptional network unveils a lncRNA tumour suppressor signature

Despite the inarguable relevance of p53 in cancer, genome-wide studies relating endogenous p53 activity to the expression of lncRNAs in human cells are still missing. Here, by integrating RNA-seq with p53 ChIP-seq analyses of a human cancer cell line under DNA damage, we define a high-confidence set of 18 lncRNAs that are p53 transcriptional targets. We demonstrate that two of the p53-regulated lncRNAs are required for the efficient binding of p53 to some of its target genes, modulating the p53 transcriptional network and contributing to apoptosis induction by DNA damage. We also show that the expression of p53-lncRNAs is lowered in colorectal cancer samples, constituting a tumour suppressor signature with high diagnostic power. Thus, p53-regulated lncRNAs establish a positive regulatory feedback loop that enhances p53 tumour suppressor activity. Furthermore, the signature defined by p53-regulated lncRNAs supports their potential use in the clinic as biomarkers and therapeutic targets.


RNA-seq data analysis
Raw sequencing data generated on the Illumina Genome Analyzer II were processed using the following workflow: (1) the quality of the samples was verified using FastQC software; (2) reads were preprocessed by removing contaminant adapter substrings with Scythe and by quality-based trimming with Sickle; (3) reads were aligned to the human genome (hg19) using the TopHat2 mapper 1 ; (4) transcript assembly and quantification of genes and transcripts as FPKM were carried out with Cufflinks2 2 ; (5) the gene loci obtained were annotated using Cuffmerge with Gencode v19 as reference; (6) differential expression analysis was performed using Cuffdiff2 2 , selecting transcripts with p<0.01 between untreated p53 HCT116 cells and cells treated for 12 h. Further analysis and graphical representations were performed using R/Bioconductor 3 .
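As an illustration of the selection in step (6), the following minimal Python sketch filters a Cuffdiff-style table for transcripts with p<0.01. It is not the code used in the study. The column names `status` and `p_value` are assumed to follow the Cuffdiff `.diff` output layout, the additional requirement that the test status be `OK` is an assumption of this sketch, and the table rows are invented for demonstration.

```python
import csv
import io

# Hypothetical excerpt in the layout of a Cuffdiff .diff file (tab-separated).
# Real files contain more columns; IDs and values here are invented.
EXAMPLE_DIFF = "\n".join([
    "\t".join(["test_id", "sample_1", "sample_2", "status", "p_value"]),
    "\t".join(["TCONS_1", "untreated", "treated_12h", "OK", "0.0005"]),
    "\t".join(["TCONS_2", "untreated", "treated_12h", "OK", "0.2"]),
    "\t".join(["TCONS_3", "untreated", "treated_12h", "NOTEST", "0.001"]),
    "\t".join(["TCONS_4", "untreated", "treated_12h", "OK", "0.009"]),
])


def select_significant(handle, p_cutoff=0.01):
    """Return rows that were successfully tested and have p_value < p_cutoff."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row for row in reader
        # Assumption: only tests with status OK are considered reliable.
        if row["status"] == "OK" and float(row["p_value"]) < p_cutoff
    ]


hits = select_significant(io.StringIO(EXAMPLE_DIFF))
selected_ids = [row["test_id"] for row in hits]
print(selected_ids)
```

With a real Cuffdiff run, the same function could be applied to an opened `.diff` file handle instead of the in-memory example.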

ChIP-seq data analysis
Reads were aligned using Bowtie2 4 to the reference genome (hg19 for human samples).
Peak detection for p53 ChIP-Seq samples was performed using MACS 1.4.2 5 with default parameters but without input; enrichment regions were identified for the WT0h and WT12h conditions using the background of each experiment as reference in the Poisson model. The obtained peaks were annotated using the Bioconductor package ChIPpeakAnno 6 , with the combination of Gencode v19 and Cufflinks genes as the reference annotation. Overlapping genes were discarded, and only regions up to 10 kb from a TSS were considered for further analyses.
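The 10 kb distance criterion can be sketched in plain Python as follows. This is only an illustration of the filtering idea and not the ChIPpeakAnno procedure used in the study. It assumes that each peak is represented by a single summit coordinate, ignores strand, and does not implement the removal of overlapping genes. The function names and the toy coordinates are hypothetical.

```python
from bisect import bisect_left

MAX_DISTANCE = 10_000  # 10 kb around the TSS, as stated in the text


def nearest_tss(summit, tss_positions):
    """Return (distance, tss) for the TSS closest to a peak summit.

    tss_positions must be a non-empty sorted list for one chromosome.
    """
    i = bisect_left(tss_positions, summit)
    # Only the TSSs immediately left and right of the summit can be nearest.
    candidates = tss_positions[max(i - 1, 0): i + 1]
    return min((abs(summit - tss), tss) for tss in candidates)


def peaks_near_tss(peaks, tss_by_chrom, max_distance=MAX_DISTANCE):
    """Keep peaks whose nearest TSS lies within max_distance base pairs."""
    kept = []
    for chrom, summit in peaks:
        tss_positions = tss_by_chrom.get(chrom)
        if not tss_positions:
            continue  # no annotated TSS on this chromosome
        distance, tss = nearest_tss(summit, tss_positions)
        if distance <= max_distance:
            kept.append((chrom, summit, tss, distance))
    return kept


# Toy data (invented coordinates).
tss_by_chrom = {"chr1": [50_000, 200_000], "chr2": [10_000]}
peaks = [("chr1", 52_500), ("chr1", 120_000), ("chr2", 25_000), ("chr3", 1_000)]
kept_peaks = peaks_near_tss(peaks, tss_by_chrom)
print(kept_peaks)
```

The binary search keeps the lookup efficient for large annotation sets, since each chromosome's TSS list only needs to be sorted once.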
Public data from mouse p53 ChIP-Seq experiments were downloaded from the Gene Expression Omnibus (GEO) database (accession code GSE46240) and analyzed using the pipeline described above. To compare human and mouse results, peak coordinates were converted from mm10 to hg19 using the liftOver tool from UCSC, and the converted regions were annotated using ChIPpeakAnno.
Histone modification ChIP-Seq data for H3K27ac, H3K4me1 and H3K4me3 in the HCT-116 cell line were downloaded from the ENCODE project 7 , with GEO accession codes GSE31755 and GSE35583.
Coverage signals used to represent heatmap density maps and centered peak regions were generated using seqMINER 8 and visualized with Genesis 9 and ggplot2 package from Bioconductor.

Microarray hybridization and data analysis
Cells were harvested with TRIzol Reagent (Invitrogen) and RNA was extracted according to the manufacturer's instructions. As the last step of the extraction procedure, the RNA was purified with the RNeasy Mini Kit (Qiagen, Hilden, Germany). Before cDNA synthesis, the RNA integrity of each sample was confirmed on Agilent RNA Nano LabChips (Agilent Technologies). Sense cDNA was prepared from 300 ng of total RNA using the Ambion® WT Expression Kit. The sense strand cDNA was then [...] Both background correction and normalization were done with the RMA (Robust Multichip Average) algorithm 10 using Affymetrix Power Tools (APT). After quality assessment, a filtering process was performed to eliminate low-expression probe sets.
Applying the criterion of an expression value greater than 16 in 2 samples for each experimental condition, 41697 probe sets were selected for statistical analysis. R and Bioconductor were used for preprocessing and statistical analysis. LIMMA (Linear Models for Microarray Data) was used to identify probe sets showing significant differential expression between experimental conditions. Genes were selected as significant using a B statistic cut-off of B>1. Functional enrichment analysis of Gene Ontology (GO) categories was carried out using the standard hypergeometric test 12 . Biological knowledge extraction was complemented with Ingenuity Pathway Analysis (Ingenuity Systems, www.ingenuity.com), whose database includes manually curated and fully traceable data derived from literature sources.

Measurement of mRNA stability
mRNA stability was measured by actinomycin D chase experiments. Actinomycin D (5 µg/ml) was added to cells for the indicated times, and RNA was extracted and analyzed by qRT-PCR.
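For the GO enrichment step of the microarray analysis, the hypergeometric test can be illustrated with a short Python sketch using only the standard library. It shows the usual one-sided formulation (the probability of observing at least k annotated genes among the selected genes), which is assumed here because the exact implementation used in the study is not specified. The function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes annotated to the GO category
    selected:   number of differentially expressed genes
    overlap:    selected genes annotated to the GO category
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 in the category, 3 selected, 2 overlapping.
p_value = hypergeom_enrichment_p(population=10, annotated=4, selected=3, overlap=2)
print(round(p_value, 4))
```

Exact integer arithmetic with `math.comb` avoids rounding problems in the tail sum. For genome-scale gene sets, a statistics library would typically be preferred for speed.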

Oligonucleotides
Sequences of antisense oligos (ASOs) and PCR primers are shown in Supplementary Table 9.