Single cell analysis of cancer cells using an improved RT-MLPA method has potential for cancer diagnosis and monitoring

Single cell analysis techniques have great potential in the cancer genomics field. The detection and characterization of circulating tumour cells are important for identifying metastatic disease at an early stage and for monitoring it. This protocol is based on transcript profiling using Reverse Transcriptase Multiplex Ligation-dependent Probe Amplification (RT-MLPA), which is a specific method for simultaneous detection of multiple mRNA transcripts. Because of the small number of (circulating) tumour cells, a pre-amplification reaction is performed after reverse transcription to generate a sufficient number of target molecules for the MLPA reaction. We designed a highly sensitive method for detecting and quantifying a panel of seven genes whose expression patterns are associated with breast cancer, and optimized the method for single cell analysis. For detection we used a fluorescence-dependent semi-quantitative method involving hybridization of unique barcodes to an array. We evaluated the method using three human breast cancer cell lines and identified specific gene expression profiles for each line. Furthermore, we applied the method to single cells and confirmed the heterogeneity of a cell population. Successful gene detection from cancer cells in human blood from metastatic breast cancer patients supports the use of RT-MLPA as a diagnostic tool for cancer genomics.


Supplementary Figures
Supplementary Figure. Outline of RT-MLPA protocol. a) Reverse transcription and pre-amplification. b) Denaturation and hybridization of specific MLPA probes. c) Ligation and multiplex amplification.
Supplementary Table S1. Summary of MLPA probe, primer and barcode sequences for the genes in the panel.

Detection of gene specific amplicons based on length and barcodes
The array was printed using a Nanoplotter NP-2 (GeSIM, Grosserkmannsdorf, Germany) on activated CodeLink slides (SurModics, Eden Prairie, MN, USA). RT-MLPA products were labelled using a primer coupled to the Cy3 fluorophore during the final amplification. Single-stranded DNA was generated from the double-stranded MLPA products using [...] CD44, HUWE1, CDH1). The output signal was normalized against the barcode yielding the strongest signal in that printed array batch. The individual MLPA probe efficiencies in the multiplex reaction were estimated by performing hybridization, ligation and amplification using seven synthetic DNA templates (10 nM) complementary to the hybridizing regions of the MLPA probes. A correction factor was calculated as follows: the mean FU was taken for each gene (Y1 to Y7), in four replicates for combined and ten replicates for separate ligation and multiplex amplification; the grand mean (X) was calculated from all the gene-specific means (Yi); and the individual MLPA probe efficiencies Zi were obtained by normalizing each gene-specific mean against X (Zi = X / Yi). The raw FU data were multiplied by the final correction factor 2*Zi for each gene (2 being a scaling factor) before statistical analysis; see Supplementary Table 3 for the list of MLPA correction factors. For patient samples the noise was determined to be the median plus three times the standard deviation of the negative controls. The noise was subtracted from the patient samples and negative values were set to zero in the histograms.
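The correction-factor and noise calculations described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the example gene labels are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and the order in which correction and noise removal are combined is not specified here, so the two steps are kept separate.

```python
from statistics import mean, median, stdev


def correction_factors(replicate_fu, scale=2.0):
    """Return scale * Z_i per gene, where Z_i = X / Y_i.

    replicate_fu maps each gene to its FU replicates measured on the
    synthetic templates. Y_i is the per-gene mean, X the grand mean of
    all Y_i, and scale is the scaling factor (2 in the text).
    """
    gene_means = {gene: mean(values) for gene, values in replicate_fu.items()}
    grand_mean = mean(list(gene_means.values()))
    return {gene: scale * grand_mean / y for gene, y in gene_means.items()}


def noise_threshold(negative_controls):
    """Median plus three standard deviations of the negative controls.

    Assumption: the sample standard deviation (n - 1 denominator).
    """
    return median(negative_controls) + 3 * stdev(negative_controls)


def remove_noise(sample_fu, threshold):
    """Subtract the noise threshold and set negative values to zero."""
    return {gene: max(fu - threshold, 0.0) for gene, fu in sample_fu.items()}


if __name__ == "__main__":
    # Made-up FU values for two hypothetical genes, for illustration only.
    factors = correction_factors({"geneA": [100, 100], "geneB": [300, 300]})
    print(factors)
    threshold = noise_threshold([1, 2, 3])
    print(remove_noise({"geneA": 10, "geneB": 3}, threshold))
```

A gene whose probe amplifies less efficiently than average has a mean below X, so it receives a factor above the scaling value, and a more efficient probe receives a smaller factor.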

RNA library construction, sequencing and data analysis
High-quality RNA (RNA integrity number >9) from the three cell lines was used for sequencing library preparations according to the protocol of the manufacturer (Illumina, San Diego, CA, USA). The libraries were clustered on a cBot cluster-generation system and sequenced as paired-end, 2x100 bp sequences on an Illumina HiSeq. The sequencing run was performed according to the manufacturer's instructions. The reads were aligned to the human reference genome (hg19) with TopHat 1, the aligned reads were assembled into transcripts using Cufflinks 2, and Fragments Per Kilobase of exon per Million fragments mapped (FPKM) values were calculated. Reads corresponding to the gene CD24 were misaligned to CD24P4 due to gene losses from genome assemblies. Therefore, the expression value for CD24P4 was used when studying CD24 expression (NCBI Gene ID 10013394180113). FPKM values for the genes in the gene panel are listed in Supplementary Table 4.
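For orientation, the basic definition behind an FPKM value, and the use of the CD24P4 value in place of CD24, can be sketched in Python. This is a simplified illustration, not the Cufflinks estimate, which is derived from a statistical model of fragment assignment. The function names and all numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    Basic definition only: fragment count divided by the exon length in
    kilobases and by the total mapped fragments in millions.
    """
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Substitution described in the text: the CD24P4 value stands in for CD24.
PROXY_GENES = {"CD24": "CD24P4"}


def panel_expression(gene, fpkm_table):
    """Look up a gene's FPKM, using its proxy gene when one is defined."""
    return fpkm_table[PROXY_GENES.get(gene, gene)]


if __name__ == "__main__":
    # Made-up counts: 500 fragments on 2,000 bp of exon, 10 million mapped.
    print(fpkm(500, 2000, 10_000_000))
    # Made-up table entry, for illustration only.
    print(panel_expression("CD24", {"CD24P4": 12.5}))
```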

Quantitative PCR
The qPCR was performed using iQ SYBR Green Supermix (Bio-Rad, Hercules, CA, USA) for the pre-amplification step, using 100 pg of total RNA from MCF-7 cells, and for the final amplification step, using single MCF-7 cells picked by LCM. For the latter, ligation and amplification were performed separately, with 0.25 µM each of the Y and X primers added to the iQ SYBR Green Supermix.

Formalin fixation
An in-house formalin fixation method was applied to a few samples. An MNC fraction without MCF-7 cells and an MNC fraction with approximately 1 million MCF-7 cells were spun down and excess liquid was removed. The cell pellets were fixed in 50 µl freshly prepared 2% formaldehyde in PBS (pH 7.4) for 10 min at room temperature. The samples were spun down, the supernatant was removed, and the cells were washed with 100 µl PBS. This was repeated twice before proceeding to IMS. Furthermore, two blood samples (without MCF-7 cells and with approximately 300 000 MCF-7 cells) diluted in PBS were fixed by adding 1 ml of 2% formaldehyde and incubating at room temperature for 10 min before proceeding to IMS.