Long non-coding RNA expression profiling in the NCI60 cancer cell line panel using high-throughput RT-qPCR

Mestdagh, Pieter; Lefever, Steve; Volders, Pieter-Jan; Derveaux, Stefaan; Hellemans, Jan; Vandesompele, Jo

doi:10.1038/sdata.2016.52

Data Descriptor
Open access
Published: 05 July 2016

Pieter Mestdagh^1,2, Steve Lefever^1,2, Pieter-Jan Volders^1,2, Stefaan Derveaux^1, Jan Hellemans^3 & Jo Vandesompele^1,2

Scientific Data volume 3, Article number: 160052 (2016)

Abstract

Long non-coding RNAs (lncRNAs) form a new class of RNA molecules implicated in various aspects of protein coding gene expression regulation. To study lncRNAs in cancer, we generated expression profiles for 1707 human lncRNAs in the NCI60 cancer cell line panel using a high-throughput nanowell RT-qPCR platform. We describe how qPCR assays were designed and validated and provide processed and normalized expression data for further analysis. Data quality is demonstrated by matching the lncRNA expression profiles with phenotypic and genomic characteristics of the cancer cell lines. This data set can be integrated with publicly available omics and pharmacological data sets to uncover novel associations between lncRNA expression and mRNA expression, miRNA expression, DNA copy number, protein coding gene mutation status or drug response.

Design Type(s)	disease state design • cell type comparison design • transcription profiling by RT-PCR design
Measurement Type(s)	long non-coding RNA expression
Technology Type(s)	transcription profiling by RT-PCR assay
Factor Type(s)	cancer cell line
Sample Characteristic(s)	Homo Sapiens • RCC 786-O cell • A-498 cell • A-549 cell • Colo-205 cell • DU-145 cell • HOP-62 cell • K-562 cell • MCF-7 cell • MDA-MB-231 cell • MOLT-4 cell • OVCAR-3 cell • OVCA-4 cell • PC-3 cell • SF-268 cell • SF-295 cell • SK-MEL-2 cell • SK-MEL-28 cell • SW-620 cell • CCRF-CEM cell • HCC-2998 cell • HCT-116 cell • Hs-578T cell • HOP-92 cell • HT-29 cell • KM-12 cell • LOXIMVI cell • ACHN cell • BT-549 cell • Caki-1 cell • EKVX cell • HCT-15 cell • HL-60 cell • IGROV-1 cell • M14 melanoma cell • MALME-3M cell • MDA-MB-435 cell • MDA-MB-468 cell • NCI/ADR-RES cell • NCI-H226 cell • NCI-H23 cell • NCI-H322M cell • NCI-H460 cell • NCI-H522 cell • OVCA-5 cell • OVCA-8 cell • RPMI-8226 cell • RXF-393 cell • SF-539 cell • SK-MEL-5 cell • SK-OV-3 cell • SN-12C cell • SNB-19 cell • SNB-75 cell • SR cell • T-47D cell • TK-10 cell • U-251 MG cell • UACC-257 cell • UACC-62 cell • UO-31 cell

Machine-accessible metadata file describing the reported data (ISA-Tab format)

Background & Summary

Genome-wide studies have shown that the human genome is pervasively transcribed, resulting in the identification of tens of thousands of long non-coding RNA (lncRNA) genes. Many lncRNAs are associated with disease-linked SNPs or show pronounced tissue-specific expression profiles, hinting at a possible role in human disease and development. The potential importance of lncRNAs (and ncRNAs as a whole) in development is further supported by the intriguing observation that organismal complexity (using the number of distinct cell types as a proxy) is strongly correlated with the proportion of the genome that is non-coding¹. During the last few years, a growing number of studies have provided evidence that lncRNA expression is deregulated in cancer and contributes to many of the established cancer hallmarks (e.g. sustained proliferation, evading apoptosis, invasion and metastasis, angiogenesis, genome instability, …)². Notable examples are HOTAIR and MALAT1, which promote metastasis in breast and lung cancer^3,4, PCAT-1, which drives proliferation in prostate cancer⁵, SAMMSON, which sustains energy metabolism in melanoma⁶, and aHIF, which regulates angiogenesis in various cancer types⁷.

To further study the role of lncRNAs in cancer, we measured the expression of 1707 human lncRNAs in the NCI60 cancer cell line panel using a high-throughput nanowell RT-qPCR platform (SmartChip, Wafergen) with custom qPCR assays (Fig. 1). These assays were designed using an in-house primer design pipeline (primerXL), taking into account a series of in silico quality control steps to ensure optimal assay performance in terms of specificity and efficiency. Upon empirical validation of assay performance, lncRNA expression was quantified in all 60 cancer cell lines followed by data processing and normalization.

The NCI60 cancer cell line panel is one of the best characterized cell line panels for which various omics data sets (including mRNA and miRNA expression, DNA copy number and cancer gene mutation) and drug response data are publicly available. Researchers can reuse the lncRNA expression data described here to study associations with any of the above-described omics data and drug response data and generate hypotheses on lncRNA functionality in cancer.

Methods

qPCR assay design and validation

Target lncRNA sequences were obtained from Ensembl (v61 and v63) and lncRNAdb (v1) (www.lncRNAdb.org). Whereas lncRNAdb contains only lncRNAs with validated functions, Ensembl is not enriched for functional lncRNAs. LncRNAs were selected from Ensembl at random. For lncRNAs overlapping (sense or antisense) with protein coding genes, only the non-overlapping lncRNA sequence was considered for assay design. Detailed annotation of selected lncRNA sequences is available in Supplementary Table 1.

Assays were designed using the primerXL pipeline (www.primerxl.org), applying stringent design criteria to ensure optimal assay performance. Primer annealing sites were evaluated for the presence of SNPs and secondary structures (UNAFold⁸). Assay specificity was evaluated using BiSearch⁹. Assays devoid of SNPs and secondary structures in primer annealing sites with no predicted amplification of non-target sequences were selected for validation.

All assays were validated by means of a titration-response experiment using the MAQC samples as reported previously¹⁰. In brief, total human reference RNA (MAQC A, Agilent Technologies) and human brain RNA (MAQC B, Ambion) were combined in a 3:1 or 1:3 ratio to create MAQC C and MAQC D, respectively. Titration response was calculated for each assay as a function of the measured expression difference between MAQC A and MAQC B. If expression in MAQC A>MAQC B, an assay is considered ‘titrating’ if MAQC A>MAQC C>MAQC D>MAQC B. If expression in MAQC B>MAQC A, an assay is considered ‘titrating’ if MAQC B>MAQC D>MAQC C>MAQC A.
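
The titration rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `is_titrating` and the input layout (a mapping from the sample labels "A" to "D" to one expression value each) are assumptions, and the values are assumed to be on a scale where a larger number means higher expression (for example, negated Cq-values).

```python
from typing import Mapping


def is_titrating(expr: Mapping[str, float]) -> bool:
    """Check whether one assay follows the expected MAQC ordering.

    `expr` maps the sample labels "A", "B", "C" and "D" to expression
    values where larger means more highly expressed (an assumption; raw
    Cq-values would have to be negated first).
    """
    a, b, c, d = expr["A"], expr["B"], expr["C"], expr["D"]
    if a > b:
        # C is 3:1 A:B and D is 1:3 A:B, so C must lie above D.
        return a > c > d > b
    if b > a:
        return b > d > c > a
    # No difference between A and B: no ordering can be tested.
    return False
```

Treating assays with identical expression in MAQC A and MAQC B as non-titrating is a choice made here for simplicity; the text does not say how such cases were handled.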

For a subset of 92 (single exonic) assays we determined amplification efficiency by means of an amplicon dilution series. Amplicons were generated by performing a qPCR reaction with each of the 92 assays on human genomic DNA (Roche). Amplicons were pooled and a 10-fold dilution series ranging from 2 × 10⁷ to 2 × 10¹ molecules was generated using 5 ng l⁻¹ carrier RNA. Efficiencies were calculated from the amplicon concentrations and the corresponding assay Cq-values using the formula $E = (10^{-1/s} - 1) \times 100$, where $s$ is the slope of the linear regression of Cq-values on log10(concentration).
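
As a worked illustration of the efficiency formula, the sketch below fits an ordinary least-squares line of Cq against log10 of the input copy number and converts the slope to a percentage. The function name `amplification_efficiency` and its arguments are hypothetical, and the example Cq-values are synthetic, chosen so that each 10-fold dilution adds log2(10) cycles (perfect doubling).

```python
import math


def amplification_efficiency(copies, cq):
    """Percent PCR efficiency from a dilution series.

    copies: input molecule counts per dilution point
    cq:     matching Cq-values
    Uses E = (10^(-1/s) - 1) * 100, with s the slope of Cq versus
    log10(copies) from an ordinary least-squares fit.
    """
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    return (10 ** (-1 / slope) - 1) * 100


# Synthetic 10-fold series from 2e7 down to 2e1 molecules.
copies = [2 * 10 ** k for k in range(7, 0, -1)]
cq = [10 + math.log2(10) * i for i in range(len(copies))]
efficiency = amplification_efficiency(copies, cq)
```

With perfect doubling the slope is about -3.32 and the efficiency comes out at approximately 100%; slopes that are shallower than this give values above 100% and steeper slopes give values below it.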

RNA isolation and reverse transcription

Cell pellets for each of the NCI60 cancer cell lines were obtained from the National Cancer Institute (Developmental Therapeutics Program). RNA was isolated using the miRNeasy mini kit (Qiagen) according to the manufacturer’s instructions. RNA concentration was measured using the Nanodrop spectrophotometer. RNA from the NCI60 cancer cell line panel and MAQC samples was reverse transcribed using the iScript SsoAdvanced kit (Bio-Rad) with 4 μg of input RNA according to the manufacturer’s instructions.

qPCR

A total of 1707 lncRNA assays were spotted in triplicate on a SmartChip (Wafergen) to obtain a final concentration of 250 mM primer in a 100 nl reaction volume (Wafergen SmartChip Human lncRNA-1 Panel V1.0, cat n° 430-000103). Per sample, one SmartChip was prepared. A reaction mix containing 4 μg of cDNA (20 μl), 340 μl of 2 X SsoAdvanced Universal SYBR Green Supermix (Bio-Rad), 6.8 μl of BSA (Wafergen), 1.36 μl of yeast cocktail (Wafergen) and 311.8 μl H₂O was dispensed in each well of the SmartChip to obtain a volume of 100 nl per well. SmartChips were analyzed on a SmartChip Real-Time PCR System using the SmartChip qPCR software (version 2.5.3.69).
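
The volumes listed above can be checked with some quick arithmetic. The snippet below is only a consistency check on the numbers given in the text; the variable names are illustrative, and it assumes every one of the 1707 × 3 assay wells receives exactly 100 nl.

```python
# Component volumes of the reaction mix in microlitres, as listed above.
mix_ul = {
    "cDNA": 20.0,
    "2x SYBR Green supermix": 340.0,
    "BSA": 6.8,
    "yeast cocktail": 1.36,
    "water": 311.8,
}

total_ul = sum(mix_ul.values())

# The supermix is supplied at 2x; its final strength in the mix.
supermix_final_x = 2 * mix_ul["2x SYBR Green supermix"] / total_ul

# Wells occupied by assays (triplicates) and the volume they consume.
assay_wells = 1707 * 3
dispensed_ul = assay_wells * 100 / 1000  # 100 nl per well

# Mix left over for dead volume and any non-assay wells.
spare_ul = total_ul - dispensed_ul
```

The mix totals just under 680 μl, which brings the 2× supermix to roughly 1×, and the assay wells consume about 512 μl of it.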

Data processing and normalization

Raw Cq-values (Data Citation 1) were filtered based on a detection threshold of 28 PCR cycles, excluding measurements with Cq>28. This cutoff was derived from the miRNA Quality Control study and represents the Cq-value above which reproducible detection is no longer possible¹⁰. The median Cq-value of triplicate measurements per gene was subsequently used to calculate a sample-specific normalization factor, applying the global mean normalization strategy¹¹. Briefly, each Cq-value was normalized by subtracting the mean Cq-value per sample. Normalized Cq-values were multiplied by -1 such that higher values represent higher expression levels (Data Citation 1 and Supplementary Table 2).
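
The processing steps described above (detection cutoff, median of triplicates, global mean normalization, sign flip) might be implemented along the following lines. This is a sketch under stated assumptions rather than the authors' pipeline: the function name `normalize_sample`, the dictionary layout, and the use of `None` for wells without amplification are all illustrative, and the cutoff is applied to individual replicates before taking the median.

```python
from statistics import mean, median

CQ_CUTOFF = 28.0  # detection threshold stated in the text


def normalize_sample(raw_cq):
    """Normalize one sample (one SmartChip).

    raw_cq: dict mapping assay name -> list of replicate Cq-values,
            with None for wells that gave no signal (assumed layout).
    Returns a dict of assay name -> normalized expression, where higher
    values mean higher expression. Undetected assays are omitted.
    """
    per_assay = {}
    for assay, replicates in raw_cq.items():
        # Drop missing wells and measurements above the cutoff.
        kept = [cq for cq in replicates if cq is not None and cq <= CQ_CUTOFF]
        if kept:
            per_assay[assay] = median(kept)
    if not per_assay:
        return {}
    # Global mean: average Cq over all detected assays in this sample.
    global_mean = mean(list(per_assay.values()))
    # Subtract the sample mean, then multiply by -1.
    return {assay: -(cq - global_mean) for assay, cq in per_assay.items()}


example = normalize_sample({
    "lnc_a": [20.0, 21.0, 22.0],
    "lnc_b": [24.0, 24.0, 25.0],
    "lnc_c": [29.0, 30.0, None],  # all replicates above the cutoff
})
```

In this toy sample the two detected assays have medians of 21 and 24, so the sample mean is 22.5 and the normalized values are 1.5 and -1.5; the third assay is absent from the result.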

Data Records

Raw and normalized Cq-values resulting from the RT-qPCR measurements of 1707 lncRNAs in the NCI60 cancer cell lines can be found in the Gene Expression Omnibus (Data Citation 1).

Technical Validation

Experimental assay validation

Assay efficiency was determined for a random selection of 92 lncRNA assays using a standard dilution series. Optimal efficiency, generally considered between 90% and 110%, was observed for all tested assays (Fig. 2a,b).

**Figure 2: Technical validation of the lncRNA profiling platform.**

Assay reproducibility and titration response

Platform reproducibility was validated by replicate measurements of MAQC A RNA (r=0.976, Fig. 2c). Assay titration response was evaluated by profiling all four MAQC samples. The percentage of titrating assays was calculated as a function of the expression difference between MAQC A and MAQC B and, as expected, increased with an increasing difference between MAQC A and MAQC B (Fig. 2d). The observed titration response was in line with previously reported titration response results for high-throughput RT-qPCR platforms¹⁰.

NCI60 lncRNA expression data

The NCI60 cancer cell line panel consists of 60 cell lines representing 9 different cancer types. As lncRNAs are known to be differentially expressed between tissue and cancer types^12,13, lncRNA expression profiles should be able to classify the cell lines in the NCI60 panel. The lncRNA expression profiles in this dataset were filtered to retain only those lncRNAs expressed in at least 80% of the samples per cancer type. Hierarchical clustering of the NCI60 cell lines based on the expression of 1109 lncRNAs showed classification accuracy similar to that of clustering based on expression data for 19,185 mRNAs, obtained from publicly available mRNA expression data (http://discover.nci.nih.gov/cellminer/) (Fig. 3a,b). To further support the technical quality of the NCI60 lncRNA expression dataset, we evaluated the expression of two well-established dosage-sensitive lncRNAs, ANRIL (copy number deletion¹⁴) and PVT1 (copy number amplification¹⁵), in relation to their matching copy number status in the NCI60 cell lines. As expected, ANRIL expression was significantly downregulated in cell lines with ANRIL copy number loss, whereas PVT1 was significantly upregulated in cell lines with PVT1 copy number amplification (Mann-Whitney P<0.05, Fig. 3c). These results demonstrate that the generated lncRNA expression profiles match the phenotypic and genomic characteristics of the cancer cell lines, underscoring the technical quality of the NCI60 RT-qPCR lncRNA expression dataset. Technical quality was further validated by direct comparison of the RT-qPCR data with matching microarray data obtained from the Cancer Cell Line Encyclopedia. When comparing expression values for three well-studied lncRNAs (MALAT1, NEAT1 and TUG1), a significant positive correlation between both datasets was observed (Spearman Rank, P<0.01).
The quality and applicability of this dataset is further exemplified by a recent study in which the expression of the melanoma-specific lncRNA SAMMSON was validated using the NCI60 lncRNA expression profiles presented here⁶.
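
The detection filter applied before clustering can be illustrated as follows. The wording "at least 80% of the samples per cancer type" leaves open whether the threshold must hold in every cancer type or in at least one; the sketch below ASSUMES every cancer type, and exposes that choice as a parameter. The function name `filter_by_detection` and the nested-dictionary layout are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def filter_by_detection(expr, cancer_type, min_fraction=0.8, require=all):
    """Return the lncRNAs that pass the per-cancer-type detection filter.

    expr:         dict lncRNA -> dict cell line -> value, with None (or a
                  missing key) meaning "not detected" (assumed layout)
    cancer_type:  dict cell line -> cancer type label
    min_fraction: minimum detected fraction within a cancer type
    require:      `all` (threshold in every type, the assumption made
                  here) or `any` (threshold in at least one type)
    """
    lines_by_type = defaultdict(list)
    for line, ctype in cancer_type.items():
        lines_by_type[ctype].append(line)

    kept = []
    for lncrna, values in expr.items():
        passes = require(
            sum(values.get(line) is not None for line in lines) / len(lines)
            >= min_fraction
            for lines in lines_by_type.values()
        )
        if passes:
            kept.append(lncrna)
    return kept
```

Passing `require=any` switches to the more permissive reading, which retains lncRNAs that are well detected in a single cancer type and would suit tissue-restricted transcripts.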

Usage Notes

Researchers interested in integrating the lncRNA expression data with matching omics datasets (copy number, mRNA expression, miRNA expression, protein expression, mutation, methylation), drug activity scores or cell line metadata can use the CellMiner webtool (http://discover.nci.nih.gov/cellminer/) to download the respective datasets. Associations between lncRNAs and the above-mentioned data layers can be studied through correlation analysis or alternative methods across the 60 cell lines. Examples include correlation analysis between lncRNA and mRNA expression data across cell lines, or between lncRNA expression and IC50 values for various compounds across cell lines. Note that lncRNAs are expressed in a more tissue-restricted manner than protein-coding genes, which explains why some lncRNAs have expression values in only a subset of the samples.

Additional Information

How to cite this article: Mestdagh, P. et al. Long non-coding RNA expression profiling in the NCI60 cancer cell line panel using high-throughput RT-qPCR. Sci. Data 3:160052 doi: 10.1038/sdata.2016.52 (2016).

References

1. Liu, G., Mattick, J. S. & Taft, R. J. A meta-analysis of the genomic and transcriptomic composition of complex life. Cell Cycle 12, 2061–2072 (2013).
2. Gutschner, T. & Diederichs, S. The hallmarks of cancer: a long non-coding RNA point of view. RNA Biol. 9, 703–719 (2012).
3. Gupta, R. A. et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature 464, 1071–1076 (2010).
4. Gutschner, T. et al. The non-coding RNA MALAT1 is a critical regulator of the metastasis phenotype of lung cancer cells. Cancer Res. 73, 1180–1189 (2012).
5. Prensner, J. R. et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat. Biotechnol. 29, 742–749 (2011).
6. Leucci, E. et al. Melanoma addiction to the long non-coding RNA SAMMSON. Nature 531, 518–522 (2016).
7. Rossignol, F., Vaché, C. & Clottes, E. Natural antisense transcripts of hypoxia-inducible factor 1alpha are detected in different normal and tumour human tissues. Gene 299, 135–140 (2002).
8. Markham, N. R. & Zuker, M. DINAMelt web server for nucleic acid melting prediction. Nucleic Acids Res. 33, W577–W581 (2005).
9. Tusnády, G. E., Simon, I., Váradi, A. & Arányi, T. BiSearch: primer-design and search tool for PCR on bisulfite-treated genomes. Nucleic Acids Res. 33, e9 (2005).
10. Mestdagh, P. et al. Evaluation of quantitative miRNA expression platforms in the microRNA quality control (miRQC) study. Nat. Methods 11, 809–815 (2014).
11. Mestdagh, P. et al. A novel and universal method for microRNA RT-qPCR data normalization. Genome Biol. 10, R64 (2009).
12. Cabili, M. N. et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011).
13. Iyer, M. K. et al. The landscape of long noncoding RNAs in the human transcriptome. Nat. Genet. 47, 199–208 (2015).
14. Pasmant, E. et al. Characterization of a germ-line deletion, including the entire INK4/ARF locus, in a melanoma-neural system tumor family: identification of ANRIL, an antisense noncoding RNA whose expression coclusters with ARF. Cancer Res. 67, 3963–3969 (2007).
15. Guan, Y. et al. Amplification of PVT1 contributes to the pathophysiology of ovarian and breast cancer. Clin. Cancer Res. 13, 5745–5755 (2007).

Data Citations

1. Mestdagh, P. & Vandesompele, J. Gene Expression Omnibus GSE80332 (2016).

Acknowledgements

The authors would like to acknowledge Anthony Van Driessche for technical assistance. P.M. and S.L. are funded by the Fund For Scientific Research Flanders (FWO).

Author information

Authors and Affiliations

Center for Medical Genetics, Ghent University, Ghent, 9000, Belgium
Pieter Mestdagh, Steve Lefever, Pieter-Jan Volders, Stefaan Derveaux & Jo Vandesompele
Cancer Research Institute Ghent (CRIG), Ghent, 9000, Belgium
Pieter Mestdagh, Steve Lefever, Pieter-Jan Volders & Jo Vandesompele
Biogazelle, Zwijnaarde, 9052, Belgium
Jan Hellemans

Contributions

P.M. generated and analysed the data and wrote the manuscript, S.L. designed the assays, J.H. and S.D. assisted in data generation, P.V. retrieved the lncRNA annotations, J.V. and P.M. co-supervised the study. All authors approved the manuscript.

Corresponding author

Correspondence to Pieter Mestdagh.

Ethics declarations

Competing interests

Primer sequences are property of Biogazelle and non-exclusively licensed to WaferGen Biosystems.

Additional information

Supplementary information accompanies this paper at (http://www.nature.com/scidata)

ISA-Tab metadata

Supplementary information

Supplementary Table 1 (XLS 579 kb)

Supplementary Table 2 (XLS 1750 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0. Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse.

About this article

Cite this article

Mestdagh, P., Lefever, S., Volders, PJ. et al. Long non-coding RNA expression profiling in the NCI60 cancer cell line panel using high-throughput RT-qPCR. Sci Data 3, 160052 (2016). https://doi.org/10.1038/sdata.2016.52

Received: 11 May 2016
Accepted: 08 June 2016
Published: 05 July 2016
DOI: https://doi.org/10.1038/sdata.2016.52
