SpliceDetector: a software for detection of alternative splicing events in human and model organisms directly from transcript IDs

Baharlou Houreh, Mandana; Ghorbani Kalkhajeh, Payam; Niazi, Ali; Ebrahimi, Faezeh; Ebrahimie, Esmaeil

doi:10.1038/s41598-018-23245-1


Article
Open access
Published: 22 March 2018


Mandana Baharlou Houreh¹,
Payam Ghorbani Kalkhajeh²,
Ali Niazi¹,
Faezeh Ebrahimi³ &
…
Esmaeil Ebrahimie ORCID: orcid.org/0000-0003-1699-3476^1,4,5,6

Scientific Reports volume 8, Article number: 5063 (2018)



Abstract

In eukaryotes, different combinations of exons give rise to multiple transcripts with various functions at the protein level, in a process called alternative splicing (AS). Unfolding the complexity of functional genomics through genome-wide profiling of AS and determining the altered final products provides new insights into many biological processes, disease progression and drug development programs that target harmful splicing variants. Currently available alternative splicing tools work with raw data and involve heavy computation. In particular, there is a shortage of tools that discover AS events directly from transcripts. Here, we developed a user-friendly Windows-based tool for identifying AS events from transcripts without the need for advanced computer skills or database downloads. Because the application works online, it uses up-to-date SpliceGraphs without requiring any resource updating. First, a SpliceGraph is built based on the frequency of active splice sites in pre-mRNA. Then, the presented approach compares the query transcript exons to the SpliceGraph exons. The tool supports statistical analysis of AS events as well as visualization of AS relative to the SpliceGraph. The developed application works for transcript sets in human and model organisms.


Introduction

Transcripts are products of pre-mRNA splicing processes. Novel transcripts are discovered each day^1,2 and added to public databases. The development of high-throughput transcriptome sequencing (RNA-seq) has provided a new opportunity to thoroughly investigate expression differences between genes as well as among the transcripts of a gene³. Compared to microarrays, RNA-seq technology allows higher accuracy in the discovery of splice junctions and sequences⁴. AS events and their types are important in the composition of protein domains, drug design and drug resistance^5,6.

In the splicing process, introns are removed from pre-mRNA and exons are joined in various arrangements. Consequently, a single gene can give rise to distinct transcripts that produce distinct proteins. Depending on the AS pattern, cellular structure, function or localization may be affected. Many diseases have been shown to be associated with changes in a particular AS pattern in transcripts^5,7,8, such as spinal muscular atrophy (SMA)⁹ and Hutchinson-Gilford progeria syndrome (HGPS)¹⁰.

Various types of AS events are known and are divided into 5 groups (Fig. 1). The first is exon skipping (ES), where an exon is removed from the transcript together with its flanking introns. The second and third types involve the 5′ and 3′ ends of exons (A5′SS and A3′SS). These events occur when there is more than one splice site at one end of an exon. If an exon has alternative splice sites at both ends, both alternative 5′ and 3′ splice sites are formed (A5′ & 3′SS). The fourth type is intron retention (RI), where an intron remains in the transcript. This is the rarest known type in both vertebrates and invertebrates (less than 5 percent of AS events). A related type involves partial retention of an intron; we call it the sub_RI type. The last group of AS events takes place when the first exon, the last exon, or both are alternates of the putative first or last exons, producing the alternative promoter and alternative terminator splicing types^11,12.

Transcripts are an important output of many high-throughput transcriptome analysis tools that are widely used in RNA-seq data analysis¹³. However, many AS detection tools cannot identify AS events directly from specified transcripts.

There is growing interest in developing tools to extract and analyze AS events. The majority of these tools perform AS analysis through transcript reconstruction. In some tools, alignment to a reference genome of a model organism is the basis of the analysis. For instance, SpliceSeq works from known splice junctions and detects AS events using the inclusion of exons and splice junctions in transcripts¹⁴. Another tool, Cufflinks/Cuffdiff, takes prerequisite data in GTF format as the reference for comparison^15,16 and is based on an alignment approach. The second category of AS discovery tools reconstructs transcripts without any reference. The Trinity methodology for de novo full-length transcriptome reconstruction¹⁷ and ASGS, which identifies alternative splicing junctions by forming a SpliceGraph¹⁸, fall within this category. A more complete list is provided in the supplementary materials (Supplemental files, S1). In addition, most of these applications and web tools require a high level of computer skill as well as prerequisite data for their processing tasks^19,20,21,22,23,24,25,26,27,28. To work with current tools, researchers must be familiar with data formats and software environments.

There is a need for new tools capable of analyzing AS occurrence directly in a set of transcripts. To fill this gap, our application was designed to discover AS events from known transcripts at high speed and in a simple, user-friendly environment. The application addresses the limitations described above. It does not require advanced computer skills. Furthermore, the need for data updating is eliminated by using the current information in the Ensembl database to form SpliceGraphs. The basic workflow of the application consists of taking transcript IDs as input, building a SpliceGraph from all exon coordinates of the related gene, and producing AS events as output.

Methods

Application Architecture and Data Acquisition

The application was coded in C# .NET using Microsoft Visual Studio and comprises two main parts: the SpliceGraph builder and the AS event finder. Because Ensembl is open source and provides a relational database system^29,30, we used the Ensembl database to obtain the data required for building SpliceGraphs and extracting AS events of known transcripts. Protein-coding transcripts were used as the resource transcripts for the software. These basic processes are depicted in Fig. 2.

SpliceGraph building

Building SpliceGraphs is the basic part of many splicing detection tools^14,18,31,32. To avoid the difficulty of analyzing each splice variant case by case, and to investigate the relationships between different transcripts, we employed a graph representation of splicing variants³³. The graphs contain putative exons that are used for comparison and for extracting AS events. Different tools use different methods to build SpliceGraphs. SpliceGrapher³², a Python-based scripting tool, constructs the SpliceGraph by summarizing short reads aligned to a reference genome. SplAdder³⁴ integrates annotation information and RNA-Seq data to generate an augmented splicing graph, and SpliceSeq¹⁴ summarizes known transcript variations and knowledge about gene structure into a directed acyclic graph. A notable feature of all these tools is that they require prerequisite reference data. In contrast, our approach is based on the frequency of active splice sites in pre-mRNA, which the SpliceDetector application obtains directly from the Ensembl database thanks to its online mode. In the first step, exons whose splice sites had the highest frequency were selected as putative exons. In the second step, exon length was used as the selection factor: when splice site frequencies were equal, longer exons were selected as putative exons. In the third step, when an exon was equivalent to several smaller exons, the multiple smaller exons were selected as putative exons. Figure 3 shows the rules applied in this project for building SpliceGraphs.

Rules applied in SpliceGraph building

In the first phase, putative exons were selected based on the highest frequency of their splice sites, that is, the start and end points of the exons.
In the second phase, exon length was considered: when splice sites had equal frequency, putative exons were selected according to their number of nucleotides. In other words, the minimum start point was selected for repeated end points and the maximum end point for repeated start points.
In the third phase, multiple exons were selected as putative exons if a single exon was equivalent to several smaller exons. In other words, when an exon in one transcript spans several shorter exons in another transcript, the multiple shorter exons were classified as putative exons.

The gene in the example has 4 transcripts and Fig. 3 shows how these rules of classification have been applied.

Steps to form SpliceGraph

1. Using BioMart of the Ensembl database and the XML web service format, all known exons of protein-coding transcripts of the related gene were downloaded. The obtained exon set might contain duplicated exons.
2. Reverse-strand transcripts, presented in the minus strand direction in the downloaded data, were turned over using their genomic positions so that they could be treated as forward strands.
3. All start and end points of all exons were collected in a pool, regardless of exon repetition, transcript length and transcript support level (http://www.ensembl.org/Help/Glossary).
4. The collected start and end points of these exons were sorted and their frequencies were measured.
5. Putative exons were selected using the previously mentioned rules regarding their start or end properties, and then the SpliceGraph was formed.
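
The pooling, frequency counting and selection steps above can be sketched in a few lines of Python. This is only one simplified reading of the three rules, not the authors' C# implementation: the function names, the clustering of overlapping exons, and the scoring by summed start and end frequencies are all assumptions made for illustration.

```python
from collections import Counter

def spans_several(exon, pool):
    """Rule 3 (assumed reading): True if `exon` contains two or more
    non-overlapping smaller exons from the pool."""
    inside = sorted(o for o in pool
                    if o != exon and exon[0] <= o[0] and o[1] <= exon[1])
    return any(a[1] < b[0] for i, a in enumerate(inside) for b in inside[i + 1:])

def build_splicegraph(exons):
    """exons: (start, end) pairs pooled from all protein-coding transcripts
    of one gene, in forward-strand coordinates, duplicates included."""
    exons = list(exons)
    # Steps 3-4: pool all start and end points and count their frequencies.
    start_freq = Counter(s for s, _ in exons)
    end_freq = Counter(e for _, e in exons)

    # Rule 3: prefer the several smaller exons over one exon spanning them.
    pool = set(exons)
    candidates = sorted(x for x in pool if not spans_several(x, pool))

    # Group overlapping candidates; each group yields one putative exon.
    clusters = []
    for exon in candidates:
        if clusters and exon[0] <= clusters[-1]["end"]:
            clusters[-1]["members"].append(exon)
            clusters[-1]["end"] = max(clusters[-1]["end"], exon[1])
        else:
            clusters.append({"end": exon[1], "members": [exon]})

    # Rule 1: highest splice-site frequency; rule 2: longer exon on ties.
    def score(x):
        return (start_freq[x[0]] + end_freq[x[1]], x[1] - x[0])

    return [max(c["members"], key=score) for c in clusters]

# Hypothetical gene with five transcripts (coordinates are made up).
pooled = [
    (100, 200), (300, 400), (500, 600),
    (100, 200), (300, 420), (500, 600),
    (100, 200), (300, 400), (500, 600),
    (90, 200), (500, 600),
    (100, 400), (500, 600),
]
print(build_splicegraph(pooled))
```

In this toy example the exon (100, 400) is discarded because it spans two smaller exons, and the remaining overlapping candidates are resolved by splice-site frequency.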

As an example, we present the steps of SpliceGraph building for an Ensembl transcript ID of OSGIN1 gene.

Example query transcript ID: ENST00000565123.

Retrieving required data:

In the first step, the genomic coordinates of the exons of the query transcript were downloaded using an XML file (Supplemental files, S3). To apply a uniform approach to all transcripts, the downloaded coordinates of reverse-strand transcripts were turned over to form forward-strand coordinates. Then, all exon coordinates of the gene of interest were downloaded using the retrieved gene ID.
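
As an illustration of what such an XML request might look like, the sketch below assembles a BioMart query for the exon coordinates of one gene. It is not the query file from the supplemental material (S3): the function name is hypothetical, and the dataset, filter and attribute names are the ones commonly used with the Ensembl gene mart, so they should be checked against the current BioMart configuration before use. The gene ID is that of APOA2, which is used later in this article.

```python
import urllib.parse
import xml.etree.ElementTree as ET

MARTSERVICE = "https://www.ensembl.org/biomart/martservice"

def biomart_exon_query(gene_id, dataset="hsapiens_gene_ensembl"):
    """Return a BioMart XML query (as text) asking for the exon coordinates
    of all protein-coding transcripts of `gene_id`."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default", "formatter": "TSV", "header": "0",
        "uniqueRows": "0", "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filter names below are assumptions based on the public Ensembl mart.
    ET.SubElement(ds, "Filter", {"name": "ensembl_gene_id", "value": gene_id})
    ET.SubElement(ds, "Filter", {"name": "transcript_biotype", "value": "protein_coding"})
    for name in ("ensembl_transcript_id", "exon_chrom_start", "exon_chrom_end", "strand"):
        ET.SubElement(ds, "Attribute", {"name": name})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")

q = biomart_exon_query("ENSG00000158874")
# The query text is sent as the `query` parameter; no request is made here.
url = MARTSERVICE + "?" + urllib.parse.urlencode({"query": q})
print(url[:80])
```

The response, if the names are valid, would be tab-separated rows of transcript ID, exon start, exon end and strand, which is the input assumed by the SpliceGraph building steps.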

Algorithm implementation:

Our algorithm used a GROUP BY clause to measure the frequency of all retrieved exon start and end points, which were collected in a pool of coordinates.

SpliceGraph formation:

For SpliceGraph building, putative exons were selected using the designed rules regarding their start and end properties. After eliminating the exons that do not follow the SpliceGraph construction rules, we obtain a SpliceGraph of 8 putative exons (Supplemental Tables, ST1–5).

Method of comparison

We designed an integrated algorithm to compare the query transcript exons with the SpliceGraph exons. The algorithm takes the start and end coordinates of the query transcript and the relevant arranged SpliceGraph coordinates as input and gives splice types as output. SpliceDetector source code is available in the supplemental files (S2).
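
To make the idea of the comparison concrete, here is a minimal Python sketch of how query exons could be compared with SpliceGraph exons. It is not the published algorithm (the source code in S2 is authoritative): the function names, the event labels, and the decision rules for ES, A3SS, A5SS and RI are assumptions, and alternative promoters, alternative terminators and sub_RI are deliberately left out.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]

def detect_events(transcript, splicegraph):
    """Both arguments are sorted lists of (start, end) exon coordinates
    on the forward strand. Returns a list of event labels."""
    events = []
    last = len(transcript) - 1
    for t_idx, t in enumerate(transcript):
        hits = [i for i, g in enumerate(splicegraph, start=1) if overlaps(t, g)]
        if len(hits) > 1:
            # One query exon covers several putative exons: intron retained.
            events.append(f"RI(Exon{hits[0]}-Exon{hits[-1]})")
        elif len(hits) == 1:
            g = splicegraph[hits[0] - 1]
            # Exon start is the 3' splice site of the upstream intron.
            if t[0] != g[0] and t_idx > 0:
                events.append(f"A3SS(Exon{hits[0]})")
            # Exon end is the 5' splice site of the downstream intron.
            if t[1] != g[1] and t_idx < last:
                events.append(f"A5SS(Exon{hits[0]})")
    # A putative exon inside the transcript span with no counterpart is skipped.
    span_start, span_end = transcript[0][0], transcript[-1][1]
    for i, g in enumerate(splicegraph, start=1):
        inside = span_start < g[0] and g[1] < span_end
        if inside and not any(overlaps(t, g) for t in transcript):
            events.append(f"ES(Exon{i})")
    return events

# Made-up coordinates for illustration.
graph = [(100, 200), (300, 400), (500, 600), (700, 800)]
query = [(100, 200), (500, 620), (700, 800)]
print(detect_events(query, graph))
```

For the made-up query above, the second putative exon has no counterpart in the transcript and the third has a shifted end, so the sketch reports one exon skipping event and one alternative 5′ splice site.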

The algorithm of data processing

If E₁T is the first exon of the query transcript and E₁G is the first exon of the SpliceGraph that has been built using the query transcript, we have:

Differential splicing analysis

In addition to expanding proteome diversity, alternative splicing may produce splice forms that are not translated into proteins but play major roles in the regulation of gene expression³⁵. To study how a treatment alters AS events, we added a statistical analysis of the AS events of transcripts to our software. We used uniquely mapped transcript reads as the effective read count for AS events to avoid read mapping errors and prevent false positive outcomes³⁶. In a comparison between an experimental group and a control group, the number of AS events of each transcript before treatment is calculated as the AS event number of that transcript multiplied by its unique read count in the control sample (before treatment):

Total ES events count for each transcript before the treatment = Unique transcript mapped reads count before treatment * ES event number of the transcript in control sample.

Similarly, the number of AS events of each transcript after treatment is calculated as the AS event number of that transcript multiplied by its unique mapped read count in the treated sample (after treatment).

Total ES events for each transcript after the treatment = Unique transcript mapped reads count after treatment * ES event number of the transcript in treated sample.
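
The two formulas above amount to a weighted sum over transcripts. The following Python sketch shows the arithmetic on invented numbers; the record layout and the function name are hypothetical and do not reflect the input format of the application.

```python
def total_events(transcripts, condition):
    """Sum over transcripts of (event number) x (unique mapped read count)
    for the given condition ("control" or "treated")."""
    return sum(t["es_events"] * t["reads"][condition] for t in transcripts)

# Invented example: two transcripts with their ES event numbers and
# unique mapped read counts before (control) and after (treated) treatment.
transcripts = [
    {"id": "T1", "es_events": 2, "reads": {"control": 120, "treated": 80}},
    {"id": "T2", "es_events": 1, "reads": {"control": 40, "treated": 150}},
]

before = total_events(transcripts, "control")   # 2*120 + 1*40
after = total_events(transcripts, "treated")    # 2*80 + 1*150
print(before, after)
```

The same calculation would be repeated for each event type (RI, A5′SS and so on) to obtain one pair of totals per type.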

Given that transcripts without differential expression have the same AS events and expression rates before and after the treatment, we can estimate changes in AS events using the AS events of differentially expressed (DE) transcripts under the treatment. The Chi-square goodness-of-fit test is used for nominal variables and calculates the probability of obtaining a result like the observed data under the null hypothesis³⁷. Therefore, we applied the Chi-square goodness-of-fit test to compare AS event abundance before and after the treatment. Treatments may alter the number of AS events in each transcript, and differentially expressed transcripts usually exhibit a significant alteration in the number of AS events before and after the treatment because of their significantly different expression. The presented comparison approach examines the overall changes in the number of AS events. Table 1 shows a simplified example for the ES (exon skipping) event.
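
A goodness-of-fit comparison of two totals can be sketched as follows. The text does not spell out the expected frequencies used by the application, so this example assumes the null hypothesis that the events are split equally between the two conditions (one degree of freedom); the function name and the numbers are illustrative only.

```python
import math

def chi_square_two_totals(before, after):
    """Chi-square goodness-of-fit test for two observed totals against
    the assumed null hypothesis of an equal split (1 degree of freedom)."""
    expected = (before + after) / 2
    stat = ((before - expected) ** 2 + (after - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Invented totals of ES events before and after treatment.
stat, p = chi_square_two_totals(280, 310)
print(round(stat, 3), round(p, 3))
```

With these invented totals the difference is small relative to the counts, so the test would not reject the null hypothesis at the usual 0.05 level; a much larger shift, such as 280 against 500, would.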

Table 1 An example of comparing alternative splicing event abundance before and after treatment. The total number of exon skipping events for each transcript before the treatment equals the unique read count of the transcript before treatment multiplied by the ES event number of that transcript in the control sample; similarly, the total number of ES events for each transcript after the treatment equals the unique read count of the transcript after treatment multiplied by the ES event number of that transcript in the treated sample. A Chi-square goodness-of-fit test then evaluates the significance of the difference in the total number of ES events across the whole experiment before and after the treatment. The final event counts may balance out across the whole experiment, giving a non-significant p-value (part a), or show a significant total alteration of AS events (part b).


As an example, we performed a statistical analysis of AS events in a set of DE transcripts upon treatment with Genistein (the soy isoflavone metabolite). This list of DE transcripts was generated from RNA-Seq data (FASTQ files) of the MCF-7 breast cancer cell line downloaded from the GEO database under accession number GSE56066³⁸. Figure 4 shows the outcome of the statistical test applied to the transcripts associated with the ‘transcription’ gene ontology. According to the results of the test (Table 2), the RI (Retained Intron), sub_RI (sub-Retained Intron), AP (Alternative Promoter) and AT (Alternative Terminator) event types exhibited significant differences in occurrence between control and treated samples. In contrast, the ES (Exon Skipping), A3′SS (Alternative 3′ splice site) and A5′SS (Alternative 5′ splice site) event types did not show a significant difference. The data related to DE transcript identification and gene ontology analysis can be viewed in the supplemental files (S4).

Table 2 Results of the statistical analysis of AS events in the MCF-7 breast cancer cell line after treatment with Genistein. The input data were the differentially expressed transcripts, associated with the ‘transcription’ gene ontology, of the MCF-7 breast cancer cell line under GEO accession number GSE56066³⁸. According to the results of the test, the RI (Retained Intron), sub_RI (sub-Retained Intron), AP (Alternative Promoter) and AT (Alternative Terminator) event types exhibited significant differences between control and treated samples, whereas the ES (Exon Skipping), A3′SS (Alternative 3′ splice site) and A5′SS (Alternative 5′ splice site) event types did not show a significant difference.


Results

Unlike other AS detection tools, our application detects AS event types directly from transcripts without requiring advanced computer skills, prerequisite application installation, or data downloading by users. The application works in two modes, single and multiple (Fig. 5), and accepts the query transcript IDs in Microsoft Office Excel, GTF, and GFF3 formats. A graph that represents the query transcript exons together with the constructed SpliceGraph illustrates the alternative splicing regions and offers an understanding of splice sites and alternative splicing events. The online working mode keeps the application size small. Furthermore, because references are downloaded from the Ensembl site, SpliceGraphs are updated at each use. This also eliminates the need for application updates and for any local repository, database or reference data.

Data Storage, Visualization and Updating

The present application does not require any local (repository or database) data. The only requirement for installing the application on a personal computer is .NET Framework 4.5 (or higher), and the only input data is the transcript IDs of interest. In addition, the tool works online (connected to the Internet), so the SpliceGraph building process relies on up-to-date Ensembl data and requires no involvement from the user. The application is not specific to a particular organism and works with all model organisms in the Ensembl database.

To examine the results of the SpliceDetector application, we downloaded data from an experiment performed by the Obstetrics and Gynecology department of the University of Alabama at Birmingham in 2014, in which ovarian cancer tissue was treated with the herbal drug paclitaxel (PTX), derived from the plant Taxus brevifolia (Pacific yew)³⁹. We performed RNA-seq analysis on the downloaded short reads to obtain their known transcripts based on the Ensembl database using CLC Genomics Workbench 9.0.0 (https://www.qiagenbioinformatics.com). To obtain differential expression details, the proportions-based (Baggerly’s) test was applied to the results. We filtered the results for a p-value less than 0.01 and a fold change greater than 2.5 in treated samples against controls (Supplemental files, S5). Two of the differentially expressed genes we found were TMEM123 (Transmembrane Protein 123) and DHRS4L2 (Dehydrogenase/Reductase 4 Like 2). The ENST00000361236 transcript of the TMEM123 gene was downregulated and the ENST00000335125 transcript of the DHRS4L2 gene was upregulated by the treatment. These alterations originate from changes in the AS event patterns that occur during transcript formation. Therefore, we can extract the splicing types of each transcript and compare them. Below are the results of the SpliceDetector analysis.

ENST00000361236: AT,SE(Exon5),SE(Exon4)

ENST00000335125: AP,RI(Exon9),RI(Exon7)
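
Result lines of this shape are easy to turn into structured records for downstream comparison. The parser below is a small sketch that assumes the exported text follows exactly the pattern of the two lines above (transcript ID, a colon, then comma-separated event codes with an optional exon in parentheses); the function name is hypothetical.

```python
import re

# An event code, optionally followed by a target in parentheses.
TOKEN = re.compile(r"([^(),\s]+)(?:\(([^()]*)\))?")

def parse_result(line):
    """Split 'ID: EV,EV(ExonN),...' into (id, [(event, exon_or_None), ...])."""
    transcript_id, _, rest = line.partition(":")
    events = []
    for token in rest.split(","):
        token = token.strip()
        if not token:
            continue
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized event token: {token!r}")
        events.append((match.group(1), match.group(2)))
    return transcript_id.strip(), events

print(parse_result("ENST00000361236: AT,SE(Exon5),SE(Exon4)"))
print(parse_result("ENST00000335125: AP,RI(Exon9),RI(Exon7)"))
```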

These results show an alteration in the skipping of exons 4 and 5 in TMEM123 under paclitaxel. In contrast, the treatment increases the retention of introns 7 and 9 in DHRS4L2. Gene ontology analysis of the TMEM123 gene through the Ensembl gene ontology annotation pointed to necrotic cell death, while DHRS4L2 is involved in the oxidation-reduction process, which results in the removal or addition of one or more electrons to or from a substance.

Verifying the results of the application

Given the lack of an application or web tool that operates similarly to SpliceDetector, we verified the output of our software against Ensembl splice variants through manual checking. We selected the APOA2 gene (Ensembl gene ID ENSG00000158874). According to the GeneCards database (http://www.genecards.org)⁴⁰, the APOA2 gene encodes apolipoprotein (apo-) A-II, the second most abundant protein of high density lipoprotein particles. APOA2 is associated with Hypercholesterolemia, Familial and Aapoaii Amyloidosis.

This gene has 10 known transcripts, 8 of which are classified as protein-coding biotypes. We examined the AS types occurring in the protein-coding transcripts to evaluate the performance of our application. The last graph (Fig. 6, SpliceGraph) was formed using our basic rules of SpliceGraph construction. The AS types occurring in the transcripts, extracted from the arrangement and positioning of exons, show the accuracy of the results of our splicing tool (Table 3). The SpliceGraph includes 5 putative exons. AS types are presented together with the relevant alternate exons according to the formulas applied in the AS detection algorithm of our application.

Table 3 Alternative splicing events extracted from transcripts of the APOA2 gene. Eight of the 10 transcripts of the APOA2 gene are of the protein-coding biotype. The first column shows the Ensembl transcript IDs of the APOA2 transcripts and the second column shows the AS events that occurred in each transcript.


Discussion

Alternative splicing of pre-mRNA, the main cause of functional diversity in proteins, can also lead to some genetic diseases. Furthermore, alterations of AS patterns have been detected in samples under treatment. For instance, exon skipping events are observed throughout the dystrophin transcript after 6TG (6-thioguanine) treatment⁴¹. In particular, investigating these alterations in differentially expressed genes, which usually appear as transcript alternation, can help to determine the effect of treatments on the activity of cells. One example of this influence is Sudemycin E, which causes a rapid alteration in AS events and consequently changes overall gene expression and arrests the cell cycle in the G2 phase⁴². Given the impact of AS events on disease occurrence, efforts to clarify the consequences of AS events for cellular activity are helpful.

Because no other tools accept transcript IDs as input for SpliceGraph building, we compared the criteria for SpliceGraph formation in several tools regardless of the type and format of their input data. Most alternative splicing visualization tools first perform alignment with the reference genome and then determine the putative exons based on criteria such as exon expression level, splice junction support and genomic coordinate similarity. With these points in mind, we selected the following tools, which are structurally comparable to our application. SpliceGrapher constructs the SpliceGraph relying on existing gene model annotations. It takes RNA-Seq data as input and visualizes SpliceGraphs, splice junctions and read depth. It identifies splice junction sequence features by spliced-alignment filtering. Vials⁴³ is a useful tool that enables researchers to identify the abundance of reads associated with exons, recognize splice junctions, and predict isoform frequency patterns in experimental groups. It illustrates transcript splicing with weighted, directed, acyclic graphs modeled using the genomic coordinates of exons and the support (weights) of splice junctions. The third tool selected for comparison is SpliceSeq, whose SpliceGraph design method is the most similar to the one used in SpliceDetector. SpliceSeq constructs SpliceGraphs by summarizing known transcripts in the Ensembl database and then stores them. In the next step, the RNA-seq sample reads are aligned with the pre-deposited reference genome, and the splice events of genes are extracted using the constructed SpliceGraphs. Our software uses transcript overlapping, a method similar to that of SpliceSeq, and calculates the frequency of splice junctions. Like the three tools mentioned, our application builds SpliceGraphs. In addition to splice site support, we used features such as exon length, and we prioritized multiple exons over a continuous exon (spanning all of those exons) with identical start and end coordinates, in order to improve the SpliceGraph structure, better define the differences between transcript variants, and recognize all possible exons. The software is as simple to use as the Vials tool, which works with gene names, but we also allow a set of transcripts to be entered in a single run, which we consider an advantage of our software. In addition, our tool presents a clear view of the alternative splicing events of the query transcript relative to the SpliceGraph and determines the exonic and genomic regions of the events.

We provided the ability to investigate AS patterns in both single and multiple modes: single mode for investigating a specific transcript and multiple mode for a set of transcripts. In addition, an image that shows the query transcript together with the SpliceGraph constructed from the known transcripts of the corresponding gene gives a clear view of the alternative splicing region and illustrates how the AS events occurred. Furthermore, when unique read counts of transcripts are supplied along with the transcript IDs, the application can perform a Chi-square goodness-of-fit test to determine the significance of alteration rates between the experimental group and the control group. Results can be exported in text and Microsoft Excel formats. Usage of the application is described in the practical guide. Test data are supplied in the supplemental files (S6–9).

Conclusion

We developed a practical SpliceGraph-based application for detecting alternative splicing events from transcripts in all model organisms. We eliminated the complicated steps of downloading reference data and using strict command-line arguments, making it easier to extract AS events directly from transcripts rather than from RNA-seq data. Using this software, researchers can investigate AS events, a significant factor in the alteration of protein function, through a SpliceGraph that is updated at each use. The SpliceDetector software is compatible with Windows and needs .NET Framework 4.5. SpliceDetector can be downloaded from https://drive.google.com/open?id=1dlXKzbvxOH3A85_DVR__V2eI5s16-llv or https://www.dropbox.com/s/j5o0og159ig6tej/SpliceDetector%20Executable%20File.rar?dl=0.

References

Barbosa-Morais, N. L. et al. The evolutionary landscape of alternative splicing in vertebrate species. Science 338, 1587–1593, https://doi.org/10.1126/science.1230612 (2012).
Chen, F. C., Chen, C. J., Ho, J. Y. & Chuang, T. J. Identification and evolutionary analysis of novel exons and alternative splicing events using cross-species EST-to-genome comparisons in human, mouse and rat. BMC bioinformatics 7, 136, https://doi.org/10.1186/1471-2105-7-136 (2006).
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature methods 5, 621–628, https://doi.org/10.1038/nmeth.1226 (2008).
Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111, https://doi.org/10.1093/bioinformatics/btp120 (2009).
Douglas, A. G. & Wood, M. J. RNA splicing: disease and therapy. Briefings in functional genomics 10, 151–164, https://doi.org/10.1093/bfgp/elr020 (2011).
Tazi, J., Bakkour, N. & Stamm, S. Alternative splicing and disease. Biochimica et biophysica acta 1792, 14–26, https://doi.org/10.1016/j.bbadis.2008.09.017 (2009).
Garcia-Blanco, M. A. Alternative splicing: therapeutic target and tool. Progress in molecular and subcellular biology 44, 47–64 (2006).
Havens, M. A., Duelli, D. M. & Hastings, M. L. Targeting RNA splicing for disease therapy. Wiley interdisciplinary reviews. RNA 4, 247–266, https://doi.org/10.1002/wrna.1158 (2013).
Zhang, M. L., Lorson, C. L., Androphy, E. J. & Zhou, J. An in vivo reporter system for measuring increased inclusion of exon 7 in SMN2 mRNA: potential therapy of SMA. Gene therapy 8, 1532–1538, https://doi.org/10.1038/sj.gt.3301550 (2001).
McClintock, D. et al. The mutant form of lamin A that causes Hutchinson-Gilford progeria is a biomarker of cellular aging in human skin. PloS one 2, e1269, https://doi.org/10.1371/journal.pone.0001269 (2007).
Keren, H., Lev-Maor, G. & Ast, G. Alternative splicing and evolution: diversification, exon definition and function. Nature reviews. Genetics 11, 345–355, https://doi.org/10.1038/nrg2776 (2010).
Panahi, B., Mohammadi, S. A., Ebrahimi Khaksefidi, R., Fallah Mehrabadi, J. & Ebrahimie, E. Genome-wide analysis of alternative splicing events in Hordeum vulgare: Highlighting retention of intron-based splicing and its possible function through network analysis. FEBS letters 589, 3564–3575, https://doi.org/10.1016/j.febslet.2015.09.023 (2015).
Conesa, A. et al. A survey of best practices for RNA-seq data analysis. Genome Biol 17, 13, https://doi.org/10.1186/s13059-016-0881-8 (2016).
Ryan, M. C., Cleland, J., Kim, R., Wong, W. C. & Weinstein, J. N. SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts. Bioinformatics 28, 2385–2387, https://doi.org/10.1093/bioinformatics/bts452 (2012).
Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology 31, 46–53, https://doi.org/10.1038/nbt.2450 (2013).
Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature biotechnology 28, 511–515, https://doi.org/10.1038/nbt.1621 (2010).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature biotechnology 29, 644–652, https://doi.org/10.1038/nbt.1883 (2011).
Bollina, D., Lee, B. T., Tan, T. W. & Ranganathan, S. ASGS: an alternative splicing graph web service. Nucleic acids research 34, W444–447, https://doi.org/10.1093/nar/gkl268 (2006).
Article CAS PubMed PubMed Central Google Scholar
Anders, S., Reyes, A. & Huber, W. Detecting differential usage of exons from RNA-seq data. Genome Res 22, 2008–2017, https://doi.org/10.1101/gr.133744.111 (2012).
Article CAS PubMed PubMed Central Google Scholar
Conesa, A. et al. Erratum to: A survey of best practices for RNA-seq data analysis. Genome Biol 17, 181, https://doi.org/10.1186/s13059-016-1047-4 (2016).
Article PubMed PubMed Central Google Scholar
Florea, L., Song, L. & Salzberg, S. L. Thousands of exon skipping events differentiate among splicing patterns in sixteen human tissues. F1000Research 2, 188, https://doi.org/10.12688/f1000research.2-188.v2 (2013).
PubMed PubMed Central Google Scholar
Hu, Y. et al. DiffSplice: the genome-wide detection of differential splicing events with RNA-seq. Nucleic acids research 41, e39, https://doi.org/10.1093/nar/gks1026 (2013).
Article CAS PubMed Google Scholar
Kato, T. et al. Multi-stage optical FDM of 12-channel 10-Gb/s data with 20-GHz exact channel spacing using fiber cross-phase modulation with optical subcarrier signals. Optics express 19, B295–300, https://doi.org/10.1364/OE.19.00B295 (2011).
Article PubMed ADS Google Scholar
Katz, Y., Wang, E. T., Airoldi, E. M. & Burge, C. B. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nature methods 7, 1009–1015, https://doi.org/10.1038/nmeth.1528 (2010).
Article CAS PubMed PubMed Central Google Scholar
Singh, D. et al. FDM: a graph-based statistical method to detect differential transcription using RNA-seq data. Bioinformatics 27, 2633–2640, https://doi.org/10.1093/bioinformatics/btr458 (2011).
Article CAS PubMed PubMed Central Google Scholar
Stephan-Otto Attolini, C., Pena, V. & Rossell, D. Designing alternative splicing RNA-seq studies. Beyond generic guidelines. Bioinformatics 31, 3631–3637, https://doi.org/10.1093/bioinformatics/btv436 (2015).
Article PubMed PubMed Central Google Scholar
Wang, W., Qin, Z., Feng, Z., Wang, X. & Zhang, X. Identifying differentially spliced genes from two groups of RNA-seq samples. Gene 518, 164–170, https://doi.org/10.1016/j.gene.2012.11.045 (2013).
Article CAS PubMed Google Scholar
Wu, J. et al. SpliceTrap: a method to quantify alternative splicing under single cellular conditions. Bioinformatics 27, 3010–3016, https://doi.org/10.1093/bioinformatics/btr508 (2011).
Article CAS PubMed PubMed Central Google Scholar
Hubbard, T. et al. The Ensembl genome database project. Nucleic acids research 30, 38–41 (2002).
Article CAS PubMed PubMed Central Google Scholar
Yates, A. et al. Ensembl 2016. Nucleic acids research 44, D710–716, https://doi.org/10.1093/nar/gkv1157 (2016).
Article CAS PubMed Google Scholar
Harrington, E. D. & Bork, P. Sircah: a tool for the detection and visualization of alternative transcripts. Bioinformatics 24, 1959–1960, https://doi.org/10.1093/bioinformatics/btn361 (2008).
Article CAS PubMed Google Scholar
Rogers, M. F., Thomas, J., Reddy, A. S. & Ben-Hur, A. SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data. Genome Biol 13, R4, https://doi.org/10.1186/gb-2012-13-1-r4 (2012).
Article CAS PubMed PubMed Central Google Scholar
Heber, S., Alekseyev, M., Sze, S. H., Tang, H. & Pevzner, P. A. Splicing graphs and EST assembly problem. Bioinformatics 18(Suppl 1), S181–188 (2002).
Article PubMed Google Scholar
Kahles, A., Ong, C. S., Zhong, Y. & Ratsch, G. SplAdder: identification, quantification and testing of alternative splicing events from RNA-Seq data. Bioinformatics 32, 1840–1847, https://doi.org/10.1093/bioinformatics/btw076 (2016).
Article CAS PubMed PubMed Central Google Scholar
Magen, A. & Ast, G. The importance of being divisible by three in alternative splicing. Nucleic acids research 33, 5574–5582, https://doi.org/10.1093/nar/gki858 (2005).
Article CAS PubMed PubMed Central Google Scholar
Pyrkosz, A. B., Cheng, H. & Brown, C. T. RNA-seq mapping errors when using incomplete reference transcriptomes of vertebrates. arXiv preprint arXiv : 1303.2411 (2013).
McDonald, J. H. Handbook of Biological Statistics. (Sparky House Publishing, 2014).
Gong, P. et al. Transcriptomic analysis identifies gene networks regulated by estrogen receptor alpha (ERalpha) and ERbeta that control distinct effects of different botanical estrogens. Nuclear receptor signaling 12, e001, https://doi.org/10.1621/nrs.12001 (2014).
PubMed PubMed Central Google Scholar
Dobbin, Z. C. et al. Using heterogeneity of the patient-derived xenograft model to identify the chemoresistant population in ovarian cancer. Oncotarget 5, 8750–8764, https://doi.org/10.18632/oncotarget.2373 (2014).
Article PubMed PubMed Central Google Scholar
Stelzer, G. et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Current protocols in bioinformatics 54, 1 30 31–31 30 33, https://doi.org/10.1002/cpbi.5 (2016).
Verhaart, I. E. & Aartsma-Rus, A. The effect of 6-thioguanine on alternative splicing and antisense-mediated exon skipping treatment for duchenne muscular dystrophy. PLoS currents 4, https://doi.org/10.1371/currents.md.597d700f92eaa70de261ea0d91821377 (2012).
Convertini, P. et al. Sudemycin E influences alternative splicing and changes chromatin modifications. Nucleic acids research 42, 4947–4961, https://doi.org/10.1093/nar/gku151 (2014).
Article CAS PubMed PubMed Central Google Scholar
Strobelt, H. et al. Vials: Visualizing Alternative Splicing of Genes. IEEE transactions on visualization and computer graphics 22, 399–408, https://doi.org/10.1109/TVCG.2015.2467911 (2016).
Article PubMed PubMed Central Google Scholar

Download references

Author information

Authors and Affiliations

Institute of Biotechnology, Shiraz University, Shiraz, Iran
Mandana Baharlou Houreh, Ali Niazi & Esmaeil Ebrahimie
Science and Research Branch, Islamic Azad University, Hamedan, Iran
Payam Ghorbani Kalkhajeh
Department of Biology, University of Qom, Qom, Iran
Faezeh Ebrahimi
Adelaide Medical School, The University of Adelaide, Adelaide, Australia
Esmaeil Ebrahimie
School of Information Technology and Mathematical Sciences, Division of Information Technology, Engineering and the Environment, The University of South Australia, Adelaide, SA, Australia
Esmaeil Ebrahimie
School of Biological Sciences, Faculty of Science and Engineering, Flinders University, Adelaide, SA, Australia
Esmaeil Ebrahimie

Contributions

M.B., P.G., A.N. and E.E. conceived the idea and performed the data collection and analysis. F.E. contributed to manuscript preparation and data interpretation.

Corresponding author

Correspondence to Esmaeil Ebrahimie.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary information

Supplementary Dataset S4

Supplementary Dataset S5

Supplementary Dataset S6

S7 (41598_2018_23245_MOESM5_ESM.zip)

S8 (41598_2018_23245_MOESM6_ESM.zip)

Supplementary Dataset S9

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Baharlou Houreh, M., Ghorbani Kalkhajeh, P., Niazi, A. et al. SpliceDetector: a software for detection of alternative splicing events in human and model organisms directly from transcript IDs. Sci Rep 8, 5063 (2018). https://doi.org/10.1038/s41598-018-23245-1

Received: 21 June 2017
Accepted: 02 March 2018
Published: 22 March 2018
DOI: https://doi.org/10.1038/s41598-018-23245-1

