CircRNA-protein complexes: IMP3 protein component defines subfamily of circRNPs

Circular RNAs (circRNAs) constitute a new class of noncoding RNAs in higher eukaryotes generated from pre-mRNAs by alternative splicing. Here we investigated in mammalian cells the association of circRNAs with proteins. Using glycerol gradient centrifugation, we characterized in cell lysates circRNA-protein complexes (circRNPs) of distinct sizes. By polysome-gradient fractionation we found no evidence for efficient translation of a set of abundant circRNAs in HeLa cells. To identify circRNPs with a specific protein component, we focused on IMP3 (IGF2BP3, insulin-like growth factor 2 binding protein 3), a known tumor marker and RNA-binding protein. Combining RNA-seq analysis of IMP3-co-immunoprecipitated RNA and filtering for circular-junction reads identified a set of IMP3-associated circRNAs, which were validated and characterized. In sum, our data suggest that specific circRNP families exist defined by a common protein component. In addition, this provides a general approach to identify circRNPs with a given protein component.

Here we describe the initial characterization of circRNA-protein complexes (circRNPs). For a set of relatively abundant circRNAs we demonstrate that they exist in the form of discrete RNPs, stable in sedimentation analysis through glycerol gradient centrifugation. We then focus on IMP3 (= IGF2BP3, insulin-like growth factor 2 binding protein 3), a known oncofetal and tumor marker RNA-binding protein with multiple post-transcriptional roles [18][19][20] . In particular, there is evidence for IMP3 playing a role in pancreas development, with IMP3 being overexpressed in pancreatic ductal adenocarcinomas 18,21 . Here we combined immunoprecipitation, RNA-seq, and bioinformatic circular-junction analysis, to identify a subfamily of circRNAs stably associated with IMP3, followed by validation and further characterization of several examples of IMP3-containing circRNPs. This provides a general approach to identify circRNPs carrying a specific protein component. Our data indicate that specific proteins define subclasses of circRNPs possibly linked by a common function or biogenesis pathway.

Results and Discussion
Evidence for distinct cytoplasmic circRNPs in mammalian cells. To identify circRNA-protein complexes, we subjected cytoplasmic extract (S100) from HeLa cells to glycerol gradient fractionation, whereby RNAs and RNPs sediment according to their molecular size and shape. We focussed the analysis on S100 extract, since circRNAs predominantly localize in the cytoplasm. 22 fractions were collected, followed by RNA preparation from every second fraction and RT-PCR assays for 12 relatively abundant circRNAs, based on circular-junction-specific primers. In parallel, and for direct comparison, total RNA prepared from S100 extract was fractionated through glycerol gradients under the same conditions, followed by the RT-PCR analysis of the circRNA distribution across the gradient ( Fig. 1A; for quantitation of these results, see Fig. 1B).
The free circRNAs distribute in the glycerol gradient between fractions #7 and #11, depending on the size of the respective circRNA (219 nts for GSE1 to 1,099 nts for HIPK3). In contrast, in the S100 extract the circRNAs sediment between fractions #9 and #14, in most cases with a clear peak, indicating that each circRNA exists in the form of a major distinct large complex in the 10-15 S region. To obtain direct evidence that this is due to protein components, we also analyzed the gradient distribution of circRNAs after proteinase K treatment: Clearly, the circRNAs shifted back to the position of free RNA (Fig. 1B and Supplementary Fig. S1), demonstrating that the large complexes we detect in S100 extract represent circRNPs. The difference between the positions of free circRNAs and the respective circRNPs ranges from two fractions (for the small circRNAs, such as GSE1 or LPAR1) to four fractions (for the larger circRNAs, such as GLIS3 or CDYL2); this shift corresponds to a molecular mass difference of approximately 50 to 110 kDa.
We have also used the same approach to identify circRNP complexes in a cytoplasmic extract prepared under less stringent conditions than S100 (without the centrifugation step at 100,000 × g), as well as in nuclear extract (Supplementary Fig. S1): CircRNPs detected in cytoplasmic extract were generally larger than the corresponding complexes from S100 extract (differing by two to four fractions), and they were more heterogeneous in gradient sedimentation, but also protein-dependent; circRNPs in nuclear extract, as far as they were detectable (see CAMSAP1, GLIS3, and HIPK3), behaved similarly to those in cytoplasmic extract.
In sum, our analysis provided biochemical evidence that each of the circRNAs tested exists in cells as circRNPs of distinct sizes.
Selected abundant circRNAs are not associated with polysomes. To analyze whether endogenous circRNAs are associated with polysomes and may be translated in HeLa cells, we selected 10 out of the 12 abundant circRNAs described above. Cells were first treated with cycloheximide to stabilize the RNA-ribosome interaction, or, as an important control, with puromycin, which releases ribosomes from mRNA being translated (Fig. 2A). Cytoplasmic extracts were prepared, loaded onto 10-50% sucrose gradients, and subjected to ultracentrifugation. Following fractionation, the distribution of linear HIPK3 mRNA (as a positive control) and of 10 abundant circRNAs across the gradient was determined by RT-PCR (Fig. 2B). As expected, we detected for the linear HIPK3 mRNA a characteristic shift upon puromycin treatment, indicative of translational activity (Fig. 2B, HIPK3 linear mRNA, from fractions #10-11 to #6-7). In contrast, the 10 circular RNAs analyzed showed a similar sedimentation across the gradient, with peaks in fractions #1-3 (that is, up to the 40S region), and with only very minor quantities (if detectable at all) in the monosome-to-polysome fractions (#4-11). Importantly, we did not observe any significant shift of these circRNAs upon puromycin treatment (compare gradient distributions CHX versus Puro). Only for HIPK3 did we find considerable quantities of circRNPs in the polysome region, yet without any major change upon puromycin treatment. Note that for CAMSAP1 circRNA we observed two RT-PCR products of very similar sizes (see fractions #6-11): The bottom band represents the circular RNA; the top band (marked with an asterisk) is due to mispriming at the linear CAMSAP1 mRNA and shifts upon puromycin treatment, as expected from translationally active linear mRNA.
We conclude that our polysome gradient analysis did not yield any evidence for circRNAs to be efficiently translated, as shown here for 10 abundant circRNAs in HeLa cells. This is consistent with three other recent reports, based on ribosome-footprinting and polysome-gradient data from mammalian cell lines 5,22 and mouse brain 23 . Although our carefully controlled initial analysis argues against a widespread translational potential of circRNAs, obviously this does not exclude that there may be natural cases of circRNA translation, for example restricted to a small subset of specialized circRNAs, or to certain cell types, tissues, developmental stages, or growth conditions.
Scientific RepoRts | 6:31313 | DOI: 10.1038/srep31313
Identification of IMP3-associated circRNAs. To characterize circRNPs further, we initially tried to use the iCLIP approach (individual-nucleotide crosslinking-immunoprecipitation), which allows mapping of RNA-protein contacts at single-nucleotide resolution 24 . Because of the initial immunoprecipitation (IP) step, this approach had to focus on a specific protein to be characterized as a potential circRNP component. We chose IMP3 (also called IGF2BP3, insulin-like growth-factor 2 binding protein 3), a known multifunctional RNA-binding protein implicated in posttranscriptional gene regulation and an established tumor marker protein (see Introduction). We mapped transcriptome-wide binding sites for IMP3 protein by iCLIP in HepG2 cells (human liver carcinoma cell line), followed by searching for iCLIP tags spanning the characteristic circular junctions (see Supplementary Fig. S2 for the initial IMP3 iCLIP analysis and specific examples of IMP3 targets). However, it turned out that due to short sequencing reads and limited sequencing depth, our analysis of these IMP3-iCLIP data was not suitable for circRNA identification.
Therefore we developed an alternative approach, combining IP, RNA-seq, and bioinformatic filtering for circRNA junction sequences (Fig. 3A).
First, cell lysates were prepared from cultured HepG2 cells, in parallel also from PANC1 and PATU cells, two human pancreas tumor-derived cell lines. All of these three cell lines express IMP3 protein (data not shown).
As evidence for the circularity of these putative IMP3 target circRNAs, we assayed their resistance towards RNase R, an exo-ribonuclease, using RT-PCR specific for the circular versus a downstream linear splice junction: All circRNA candidates selected showed high RNase R resistance, relative to the corresponding linear isoform from the same gene (Fig. 3D).
We performed anti-IMP3 immunoprecipitations from lysates of HepG2, PANC1, and PATU cells, followed by RNA purification and semiquantitative RT-PCR assays of circular and linear isoforms for three IMP3 circRNP candidates (CDYL, NFATC3, and ANKRD17), comparing in each case input (5%) and immunoprecipitate (90%); as a negative IP control, anti-FLAG IPs were analyzed in parallel (Fig. 4A). As another negative control, both linear and circular variants of CAMSAP1 were analyzed; the linear FTL mRNA, which we knew from our iCLIP study to be IMP3-associated (see above and Supplementary Fig. S2B), served as a positive control.
As a result, we were able to confirm all three IMP3 circRNP candidates as IMP3 targets (CDYL, NFATC3, and ANKRD17); both linear and circular isoforms of CAMSAP1, in contrast, exhibited only background or undetectable levels after IMP3 immunoprecipitation. These semiquantitative results were further confirmed by real-time PCR assays (Fig. 4B): The anti-IMP3 IP efficiencies were 4-5% for CDYL, 11-18% for NFATC3, and 16-24% for ANKRD17 circRNA, with only minor differences between the three cell lines.
As the next step, we fractionated HeLa cytoplasmic extract (S100) by glycerol gradient centrifugation, in parallel with total RNA prepared from the same extract, and proteinase K-treated S100 extract (Fig. 4C). The two major IMP3-associated circRNAs, NFATC3 and ANKRD17, were detected in the gradient fractions by RT-PCR, using circular-junction-specific primers, analogously to the characterization of abundant circRNAs (see above and Fig. 1). Both IMP3-associated circRNAs showed a distinct peak, suggesting that a predominant complex exists, which concentrated in fractions #15 (NFATC3) and #17 (ANKRD17). Corresponding free circRNAs sedimented approximately four fractions more towards the top, and very similarly to deproteinized circRNPs, confirming that the circRNA complexes contain protein components. We noted that the free circRNAs for NFATC3 (1298 nt) and ANKRD17 (1832 nt) show a broader distribution across the gradient (Fig. 4C), similarly to the free circRNAs of CDYL2 (592 nt) and HIPK3 (1099 nt; Fig. 1), compared with smaller circRNAs. This may reflect the higher potential of large circRNAs to form alternative structures (or aggregates by RNA-RNA interactions) that would affect their sedimentation behavior. The distribution of IMP3 protein was detected by Western blotting in the same gradient fractions: A large portion of IMP3 protein peaked in fraction #5, most likely representing free protein, but the rest of it distributed across fractions #7-19, which includes the region where the IMP3 circRNPs fractionated.

Figure 4. [...] Co-precipitated RNA was purified and assayed by RT-PCR for FTL mRNA (positive control), CAMSAP1 (negative control circRNA), and for the putative IMP3-associated circRNAs CDYL, NFATC3, and ANKRD17. For each of the circRNAs, the linear isoform of the respective gene was tested in addition (circ/lin; 90% of the mock- and IMP3-immunoprecipitates were used in RT-PCR). In addition, 5% of the input material was assayed (I). M, markers (in bp). (B) Quantitative immunoprecipitation analysis of IMP3-circRNA association in three cell lines (HepG2, PANC1, and PATU). For the same set of IMP3 circRNA targets and controls as shown in panel A, the immunoprecipitation efficiencies were determined by RT-qPCR assays (% of input; statistical deviations based on biological duplicates). (C) Sedimentation profiles of IMP3-containing circRNPs. Cytoplasmic S100 extract from HeLa cells (extract), corresponding free RNA (RNA), and proteinase K-treated extract (extract + PK) were fractionated by glycerol gradient centrifugation (#1-22; the last fraction contains the resuspended pellet), followed by RT-PCR analysis of two IMP3-containing circRNAs (NFATC3 and ANKRD17). The relatively low recovery of circRNAs NFATC3 (1298 nt) and especially ANKRD17 (1832 nt), the largest circRNAs analyzed in S100 extract, may be caused by the higher tendency of such large circRNPs to aggregate and form precipitates, which were lost in the pellet fraction. The positions of ribosomal RNA size markers are indicated (5S, 18S, and 28S), as well as the shift of the circRNA vs. circRNP peak fractions (brackets). For comparison, the distribution of total IMP3 protein across the gradient was visualized by Western blotting in extract and, as a control, in proteinase K-treated extract. (D) IMP3 immunoprecipitation efficiencies of gradient-purified NFATC3 and ANKRD17 circRNPs. HeLa cytoplasmic S100 extract was gradient-fractionated, and NFATC3 and ANKRD17 circRNPs were IMP3-immunoprecipitated from the respective peak fractions (NFATC3, #15; ANKRD17, #17), using CAMSAP1 circRNA (peak fraction #11) and anti-eIF4E immunoprecipitation as negative controls. Immunoprecipitation efficiencies (% of input; statistical deviations based on technical triplicates) were determined by RT-qPCR.
Finally, we measured IMP3 association for two of these gradient-purified circRNPs, NFATC3 and ANKRD17, using anti-IMP3 immunoprecipitation from the respective peak fractions (NFATC3, #15; ANKRD17, #17), followed by real-time RT-PCR for the circular splice junctions; as negative controls we used IP against eIF4E, an abundant cytoplasmic RNA-binding protein, which binds to the cap structure of linear mRNAs, and CAMSAP1 circRNA, which was not among the IMP3 targets identified (Fig. 4D). The IP efficiencies of the NFATC3 and ANKRD17 IMP3 circRNPs with IMP3 antibodies were 6.4 and 21.7%, respectively, with background binding below 1.3% (anti-eIF4E IP) or undetectable (for CAMSAP1 circRNP). In sum, we thereby confirmed that IMP3-containing circRNPs of the NFATC3 and ANKRD17 circRNAs exist and are stable through glycerol gradient centrifugation.

Specificity of IMP3-circRNA binding based on a SELEX-derived C/A-rich motif.
To explain the specific and stable IMP3 association with a subset of circRNAs, we further investigated the intrinsic RNA-binding specificity of the IMP3 protein. We therefore applied an in vitro SELEX procedure, using four rounds of selection with an N20 RNA pool and recombinant GST-IMP3, which contains the full-length IMP3 protein (Fig. 5A). GST protein served as a control in parallel selection and enrichment rounds. After each round, aliquots of the RT-PCR amplified RNA pools were analyzed by Solexa sequencing, resulting in between 0.56 and 0.70 million sequence tags per cycle for GST-IMP3 (for the GST control: 1.68 million tags after the fourth cycle).
The representation of each of the 256 possible tetramer motifs was determined after each SELEX cycle and the enrichment of each tetramer evaluated by z-score values. Figure 5B shows as a heatmap only the top 20 tetramers (ordered according to their z-score sum of the four SELEX rounds with GST-IMP3 protein) and the bottom 10 tetramers, in particular how these motifs changed over the four cycles (R1 to R4). Clearly, the top 10 tetramers (highlighted in yellow) are highly C/A-rich, with CACA, ACAC, and AACA becoming enriched most strongly, consistent with the "compendium motif" of Ray et al. 26 . At the other extreme, the bottom 10 motifs are all G/C- or G/T-rich. A corresponding analysis of hexamer motif enrichment confirmed these results (Fig. 5B).
To analyze whether these SELEX-based IMP3 RNA-binding motifs are enriched in the IMP3-associated circRNAs, we calculated the sum of the top 10 tetramer motif counts in 34 IMP3-bound circRNAs, normalized to 100 nt of sequence length (Fig. 5C, left part). The motif counts in this IMP3 target circRNA group were compared with those in a non-target group (circRNAs well-expressed in HepG2 cells, n = 117; for details, see Methods). Based on kernel density estimation and a p-value of 1.077e-07 (Welch two sample t-test), we conclude that C/A-rich motifs are significantly enriched in IMP3-associated circRNAs (Fig. 5C, right part).
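The motif comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the motif list is a placeholder containing only the three tetramers named in the text (the other seven of the top 10 are not given here), overlapping matches are counted (an assumption), and the function names are hypothetical. Only the Welch t statistic and its degrees of freedom are computed; a p-value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Placeholder motif set: only CACA, ACAC and AACA are named in the text;
# the remaining top-10 tetramers are not listed and are NOT guessed here.
TOP_TETRAMERS = ["CACA", "ACAC", "AACA"]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping matches."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def motif_density(seq, motifs=TOP_TETRAMERS):
    """Summed motif counts, normalized to 100 nt of sequence length."""
    seq = seq.upper().replace("U", "T")
    total = sum(count_overlapping(seq, m) for m in motifs)
    return 100.0 * total / len(seq)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

In use, `motif_density` would be applied to every circRNA sequence in the target and non-target groups, and the two resulting lists passed to `welch_t`.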
In conclusion, the IMP3 protein by itself can recognize certain C/A-rich motifs. The specificity of IMP3 RNA recognition is most likely complex, due to its domain structure with two RNA-recognition motifs (RRMs) and four KH domains. This may contribute to why the motif from a PAR-CLIP-based in vivo study in HEK293 cells (CAUU) 25 differed from an in vitro analysis (CA-rich motif) 26 . There may also be additional factors, for example other associated RNA-binding proteins, that modulate the RNA-binding preference of IMP3 in vivo or act in a combinatorial manner. Therefore we have to consider that IMP3 may bind in 3′-UTR versus coding-exon regions in different ways and with different cofactors, reflected in different enriched motifs.

Methods
Glycerol gradient sedimentation analysis. Cytoplasmic S100 fraction (S100), cytoplasmic extract (CE; prepared without the centrifugation step at 100,000 × g), and nuclear extract (NE) from HeLa cells (IpraCell), as well as RNA isolated from S100 extract (RNA), proteinase K-treated S100 (S100 + PK), and proteinase K-treated cytoplasmic extract (CE + PK) were analyzed by glycerol gradient centrifugation. For the gradient with free RNA, RNA from 500 μl S100 was isolated by TRIzol (Ambion). Proteinase K (PK) treatment was done by incubating 500 μl extract (S100/CE) in 1× PK buffer with 100 μg/ml PK (Roth) and 0.5% SDS for one hour at 37 °C. Samples of 500 μl each were loaded onto 10-30% (v/v) glycerol gradients (10 ml) and subjected to ultracentrifugation for 16 hours at 4 °C (32,000 rpm; SW-40). After centrifugation, the gradient was fractionated manually into 21 fractions of 500 μl each, whereas the resuspended pellet was labeled as fraction 22. RNA was isolated from 200 μl of every second fraction by TRIzol (Ambion), followed by RT-PCR (described below). PCR products were analyzed on 2% agarose gels and quantified (GeneTools software; Syngene). As size markers, the sedimentation of ribosomal RNAs was analyzed (Agilent 2100 Bioanalyzer; RNA 6000 Nano Kit, Agilent).
IMP3 iCLIP. The iCLIP experiment was performed with HepG2 cells. For immunoprecipitation, IMP3-specific polyclonal antibodies (Millipore) were used, for the negative control the antibody was omitted. For validation of immunoprecipitation by Western blotting, an IMP3-specific monoclonal antibody (E2, Santa Cruz) was used. Sequencing was performed on an Illumina MiSeq instrument (75 bp single-end reads). For details on iCLIP experimental procedures and data analysis, see Rossbach et al. 27 and references therein.

Figure 5. [...] Motifs are ordered according to their cumulative (R1-R4) z-score. The top 10 tetramer motifs (highlighted in yellow) were used for motif enrichment analysis of specific circRNAs (see panel C). (C) IMP3-binding motif enrichment in IMP3-associated circRNAs. The sum of the top ten tetramer motif counts in each of the IMP3 target circRNAs was determined (left part) and compared with a non-target group of circRNAs (n = 117). This is represented by a kernel density estimation plot with group median values shown by vertical lines (p-value 1.077e-07; Welch two sample t-test; right part).
Identification of IMP3-associated circRNPs. RNA co-immunoprecipitation. Cell lysates were prepared in RIPA buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 5 mM EDTA, 1% NP-40, 0.1% SDS) and subsequently cleared by centrifugation as well as by pre-incubation with protein-G Dynabeads (Life Technologies) without antibodies. Antibody binding was performed overnight at 4 °C, using a polyclonal IMP3 antibody (Millipore) and, as mock control, FLAG antibody (Sigma-Aldrich). Bead capturing was carried out for one hour at room temperature with protein-G Dynabeads, and protein-RNA complexes were washed five times with 500 μl washing buffer (50 mM Tris-HCl pH 7.4, 150/300/600 mM NaCl, 0.05% Tween-20), increasing the stringency during the washing steps. RNA from the input (5%) and from the immunoprecipitated (IP) fractions was extracted by TRIzol (Ambion), followed by RQ1 DNase (Promega) digestion and ethanol precipitation. cDNA synthesis and RT-PCR assays are described below. RNA from co-IP experiments (mock-IP and IMP3-IP) was used for cDNA library preparation, using the TruSeq stranded total RNA sample preparation kit (Illumina) according to the manufacturer's protocol starting with RNA fragmentation. Libraries were sequenced on a HiSeq instrument (paired-end, 2 × 100 bp, Illumina). Immunoprecipitation from gradient fractions was done as described above, starting with 100 μl of glycerol gradient fractions. The IP was carried out with a monoclonal IMP3 antibody and a monoclonal eIF4E antibody as mock control (Santa Cruz Biotechnology).
RNA-seq data analysis. Sequence reads were aligned to the human genome sequence (hg19) using STAR, an ultrafast universal RNA-seq aligner with chimeric alignment options 28 . Chimeric mapped reads were selected as circRNA-specific junction reads by applying four additional criteria: (1) Sequence reads map to the same chromosome and the same strand, with the two sequence segments mapping to the genomic region in reverse order. (2) The overhang spanning the "back-spliced" junction is ≥ 12 nts. (3) The alignment score of the chimeric mapped reads (using column 14:aS of Standard SAM attributes) must be greater (> 2) than that of the linear alignment with genomic sequences and annotated transcripts (Comprehensive Gene Annotation Set from GENCODE Version 19). (4) Both 5′ and 3′ splice sites are either annotated or conform to canonical splice sites. The circRNA abundance was predicted on the basis of circRNA-specific junction read counts.
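The four junction-read criteria can be expressed as a simple filter. The sketch below is illustrative only: the `ChimericRead` record and its field names are hypothetical and do not correspond to STAR's output columns, the strand-dependent test for "reverse order" is one plausible reading of criterion (1), the score criterion is read as a difference greater than 2, and only GT-AG is treated as canonical.

```python
from dataclasses import dataclass

# Assumption: only the major GT-AG dinucleotide pair counts as canonical.
CANONICAL_SITES = {("GT", "AG")}


@dataclass
class ChimericRead:
    """Hypothetical record for one chimeric alignment (not a STAR format)."""
    chrom_a: str          # chromosome of the first read segment
    chrom_b: str          # chromosome of the second read segment
    strand_a: str
    strand_b: str
    start_a: int          # genomic start of the first read segment
    start_b: int          # genomic start of the second read segment
    overhang: int         # shorter segment length across the junction (nt)
    chimeric_score: int   # alignment score of the chimeric alignment
    linear_score: int     # best score of a linear alignment
    donor_dinuc: str      # intronic dinucleotide at the 5' splice site
    acceptor_dinuc: str   # intronic dinucleotide at the 3' splice site
    annotated: bool       # both splice sites present in the annotation


def is_backsplice_read(r, min_overhang=12, min_score_gain=2):
    """Apply criteria (1)-(4) to one chimeric read."""
    # (1) same chromosome and strand, segments in reverse genomic order
    if r.chrom_a != r.chrom_b or r.strand_a != r.strand_b:
        return False
    if r.strand_a == "+":
        reverse_order = r.start_a > r.start_b
    else:
        reverse_order = r.start_a < r.start_b
    # (4) splice sites annotated or canonical
    sites_ok = r.annotated or (r.donor_dinuc, r.acceptor_dinuc) in CANONICAL_SITES
    return (
        reverse_order
        and r.overhang >= min_overhang                               # (2)
        and r.chimeric_score - r.linear_score > min_score_gain       # (3)
        and sites_ok
    )
```

Counting the reads that pass this filter per junction would then give the junction read counts used as the abundance estimate.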
RT-PCR. RNA from gradient fractions or immunoprecipitated RNA was extracted by TRIzol (Ambion). cDNA was prepared by reverse transcription of 1 μg total RNA (RNase R assay), 10% of RNA from gradient fractions, or 10% of the coimmunoprecipitated RNAs (input/IP), using the qScript flex cDNA synthesis kit (Quanta) and random hexamer primers. Circular isoforms were PCR-amplified with divergent primers detecting the circRNA junction and linear isoforms with primers detecting the canonical splice junction downstream of the circRNA-producing exons (for primer design, see ref. 29). PCR products were analyzed on 2% agarose gels (for primer sequences, see Supplementary Table S2). Real-time PCR was carried out using PerfeCTa SYBR Green Fast-Mix (Quanta) and an Eppendorf realplex 2 thermocycler. Primer efficiencies were determined by four serial dilutions of cDNA derived from total RNA (R² = 0.99, slope = −3.46 to −4.15). The fraction of bound target RNAs in co-IP assays was calculated from technical triplicates by the ΔΔCt method, with each target normalized to the corresponding input fraction (results represented as percent of the input). Biological replicates were used to calculate standard deviations.
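As a worked illustration of the two qPCR quantities mentioned above, the following sketch converts a standard-curve slope into an amplification efficiency and computes a percent-of-input value from Ct values. It uses the common "percent input" formula rather than reproducing the authors' exact ΔΔCt procedure; the function names, the default input fraction of 5%, and the assumption that the whole immunoprecipitate is measured are illustrative choices.

```python
def primer_efficiency(slope):
    """Amplification efficiency from a standard-curve slope.

    The slope is that of Ct plotted against log10(template amount);
    a slope of about -3.32 corresponds to perfect doubling (efficiency 1.0).
    """
    return 10 ** (-1.0 / slope) - 1.0


def percent_of_input(ct_input, ct_ip, input_fraction=0.05,
                     ip_fraction=1.0, efficiency=1.0):
    """Recovered RNA in the IP as a percentage of the total input.

    input_fraction: share of the lysate measured as input (assumed 5%).
    ip_fraction:    share of the immunoprecipitate measured (assumed 100%).
    efficiency:     per-cycle amplification efficiency (1.0 = doubling).
    """
    fold = (1.0 + efficiency) ** (ct_input - ct_ip)
    return 100.0 * fold * input_fraction / ip_fraction
```

For example, with the defaults an IP sample that reaches threshold one cycle earlier than the 5% input sample corresponds to 10% of input, and the reported slope of −3.46 corresponds to an efficiency of roughly 0.95.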
Western Blotting. Western blotting was performed with 1% of glycerol gradient fractions and a polyclonal IMP3 antibody (Millipore), which was secondarily detected with an anti-rabbit antibody.

SELEX-seq and motif analysis of IMP3-RNA binding. Protein expression and purification. The IMP3 (IGF2BP3) open reading frame was PCR-amplified with IMP3_fwd/IMP3_TEV_His_rev primers, including His-tag and TEV-cleavage site, and cloned (EcoRI, XhoI) into the pGEX-6P2 expression vector (GE Healthcare). The expression of the GST-IMP3-TEV-His fusion protein was induced by IPTG (1 mM) in E. coli BL21, followed by a two-step purification. Cells were lysed in His lysis and washing buffer (50 mM NaH2PO4 pH 8.0, 2 M NaCl, 50 mM imidazole, 10 mM 2-mercaptoethanol, 10% glycerol, 2% Triton X-100) by sonication (three times 20 sec). The fusion protein was purified from cell lysate by incubation with Ni-NTA agarose (Qiagen) and subsequent elution (50 mM NaH2PO4 pH 8.0, 300 mM NaCl, 250 mM imidazole). The His tag was cleaved off (AcTEV protease, 4 °C overnight, Life Technologies), and the remaining protein was purified in a second step via the GST tag (glutathione-Sepharose beads, GE Healthcare). SELEX selections were carried out with the fusion protein bound to glutathione-Sepharose.
Selected RNAs were reverse-transcribed (qScript Flex cDNA Synthesis Kit, Quanta), using the SLX_RT reverse primer, followed by PCR amplification with SLX_RT and SLX_T7-fw primers (16 cycles). Transcripts for the next round of selection were produced by in vitro transcription. After four rounds of selection, RNA aliquots from each round and from the fourth round of GST selection were used for barcoding by reverse transcription with the SLX_R13-16 (GST-IMP3) and SLX_R18 (GST) reverse primers. cDNA libraries were amplified by PCR (17 cycles; SLX_Sol-5xN_fwd and SLX_Sol_rev). All libraries were pooled in equal amounts and purified by Caliper (XT DNA 750 assay kit, Perkin-Elmer). The final library pool was subjected to high-throughput sequencing on a MiSeq instrument (single-read 100 bp, Illumina). For primer sequences, see Supplementary Table S2.
SELEX-seq data analysis of RNA binding of IMP3 protein. Sequence reads were first sample-barcode sorted, trimmed by PCR primer sequences on both ends, and further random-barcode filtered to obtain 18- to 20-nt sequence tags of the enriched RNA pools (numbers of filtered sequence tags given in Fig. 5A). The numbers of filtered sequence tags (from each SELEX round) containing each of the 256 possible tetramer or 4096 possible hexamer motifs were summarized, and the z-score values were calculated for enrichment of each motif.
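The motif tally and z-score step can be sketched as follows. This is a minimal illustration under stated assumptions: the z-score is taken across all k-mers within one SELEX round, using the population standard deviation, which is one plausible reading of the description; the function names are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev


def all_kmers(k):
    """All 4**k DNA k-mers (256 tetramers, 4096 hexamers)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def tags_containing(tags, k=4):
    """For each k-mer, count the tags that contain it at least once."""
    counts = {m: 0 for m in all_kmers(k)}
    for tag in tags:
        tag = tag.upper().replace("U", "T")
        present = {tag[i:i + k] for i in range(len(tag) - k + 1)}
        for m in present:
            if m in counts:  # skips k-mers with ambiguous bases such as N
                counts[m] += 1
    return counts


def z_scores(counts):
    """Z-score of each motif count relative to all motifs in the same round."""
    values = list(counts.values())
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return {m: 0.0 for m in counts}
    return {m: (c - mu) / sd for m, c in counts.items()}
```

Summing the per-round z-scores of each motif over R1 to R4 would give the cumulative score by which the motifs are ordered in the heatmap.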
IMP3 non-target circRNA group. To derive a group of circRNAs well-expressed in HepG2 cells, RNA-seq data from HepG2 whole cells [poly(A)minus selection; generated by ENCODE Consortium Long RNA-seq] were analyzed. The expression of circRNAs was determined by circRNA-specific junction counts. Based on three criteria (circular-junction counts ≥ 10; derived from exons of protein-coding genes; not detected in our IMP3 IP experiment), 117 circRNAs were selected as "IMP3 non-target group".
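The three selection criteria for the non-target group amount to a simple filter, sketched below. The record layout (dictionaries with `id`, `junction_reads`, and `gene_type` keys) and the function name are hypothetical and chosen only for illustration.

```python
def select_non_targets(circrnas, ip_detected_ids, min_junction_reads=10):
    """Return circRNAs meeting all three non-target criteria.

    circrnas:        iterable of dicts with 'id', 'junction_reads', 'gene_type'
    ip_detected_ids: set of circRNA ids detected in the IMP3 IP experiment
    """
    return [
        c for c in circrnas
        if c["junction_reads"] >= min_junction_reads   # well expressed
        and c["gene_type"] == "protein_coding"         # exons of coding genes
        and c["id"] not in ip_detected_ids             # absent from the IMP3 IP
    ]
```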