Visualization is an integral aspect of genomics data analysis. Algorithmic-statistical analysis and interactive visualization are most effective when used iteratively. Epiviz (http://epiviz.cbcb.umd.edu/), a web-based genome browser, and the Epivizr Bioconductor package allow interactive, extensible and reproducible visualization within a state-of-the-art data-analysis platform.
- Supplementary Figure 1: The Epiviz architecture. (82 KB)
Presentation, visualizations and data representations are kept distinct. This allows Epiviz to reuse visualizations regardless of data source (an Epiviz server, or a WebSocket connection through Epivizr). Data providers and visualizations can be plugged in on the fly using the Epiviz plugin API.
- Supplementary Figure 2: Chart load times with and without cache. (190 KB)
Comparison of the average time taken by ‘add chart’ and ‘navigate’ operations per 1,000 data objects, with and without the predictive cache in the Epiviz data management tier.
- Supplementary Figure 3: Chart draw times for different parameter values. (302 KB)
A comparison of draw times when varying specific chart parameters for Scatter Plot and Blocks Track. The parameter for the scatter plot is “circle ratio”, which splits the chart into a grid of squares of width equal to this parameter and draws at most one circle in each cell of the grid. All data objects that overlap this circle are mapped to the single circle displayed. The parameter for the blocks track is the minimum distance in screen pixels between two blocks before they are merged into one display object. Again, all merged data objects are mapped to the single display object. The data-to-visual-object mapping is used for brushing, tooltips and other interactive actions.
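The two reductions described above can be sketched as follows. This is a minimal illustration, not the Epiviz implementation: it assumes points and blocks are already in screen-pixel coordinates, that the grid cell width is given in pixels, and the function names `bin_points` and `merge_blocks` are hypothetical.

```python
from collections import defaultdict


def bin_points(points, cell_size):
    """Draw at most one circle per grid cell (assumed pixel coordinates).

    points: list of (x, y) tuples. Returns one dict per occupied cell,
    keeping the indices of the data objects mapped to that circle.
    """
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(idx)
    circles = []
    for (cx, cy), members in sorted(cells.items()):
        circles.append({
            "x": (cx + 0.5) * cell_size,  # circle drawn at the cell centre
            "y": (cy + 0.5) * cell_size,
            "members": members,           # kept for brushing and tooltips
        })
    return circles


def merge_blocks(blocks, min_gap):
    """Merge neighbouring blocks whose gap is smaller than min_gap pixels.

    blocks: list of (start, end) tuples. Returns display objects that
    remember which original blocks they represent.
    """
    merged = []
    ordered = sorted(enumerate(blocks), key=lambda item: item[1][0])
    for idx, (start, end) in ordered:
        if merged and start - merged[-1]["end"] < min_gap:
            merged[-1]["end"] = max(merged[-1]["end"], end)
            merged[-1]["members"].append(idx)
        else:
            merged.append({"start": start, "end": end, "members": [idx]})
    return merged
```

In both cases the `members` list plays the role of the data-to-visual-object mapping: a hover or brush event on one display object can be propagated to every data object it stands for.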
- Supplementary Figure 4: A comparison of draw times when varying specific chart parameters for Heatmap Plot and Lines Track. (285 KB)
The parameter for the heatmap is the maximum number of columns drawn before multiple columns are averaged into one. All data objects that are merged are mapped to the single column displayed. The data-to-visual-object mapping is used for brushing, tooltips and other interactive actions. The parameter for the lines track is the maximum number of points drawn. If the number of data points is greater than this parameter, the required number of points is sampled uniformly.
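A minimal sketch of these two reductions is given below. It is an illustration under stated assumptions, not the Epiviz code: adjacent columns are grouped into contiguous runs, "sampled uniformly" is read as evenly spaced indices rather than random sampling, and the names `average_columns` and `sample_uniform` are hypothetical.

```python
def average_columns(matrix, max_columns):
    """Average adjacent heatmap columns so at most max_columns are drawn.

    matrix: list of rows, each a list of numbers of equal length.
    Returns (reduced_matrix, groups), where groups[k] lists the original
    column indices merged into displayed column k.
    """
    n_cols = len(matrix[0]) if matrix else 0
    if n_cols <= max_columns:
        return [row[:] for row in matrix], [[j] for j in range(n_cols)]
    groups = []
    for g in range(max_columns):
        lo = g * n_cols // max_columns
        hi = (g + 1) * n_cols // max_columns
        groups.append(list(range(lo, hi)))
    reduced = [
        [sum(row[j] for j in grp) / len(grp) for grp in groups]
        for row in matrix
    ]
    return reduced, groups


def sample_uniform(points, max_points):
    """Keep at most max_points points at evenly spaced indices."""
    n = len(points)
    if n <= max_points:
        return list(points)
    if max_points == 1:
        return [points[0]]
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]
```

For example, `sample_uniform(list(range(10)), 4)` keeps the first and last points plus two evenly spaced interior points, so the overall shape of the line is preserved while the draw cost is bounded.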
- Supplementary Figure 5: Gene expression analysis of colon cancer methylation loss regions with Epiviz. (57 KB)
A) We used the Epiviz computed columns feature to define an MA plot of colon cancer expression in the MMP gene family region (Figure 1). B) Gene expression barcode data for the same region show similar expression patterns across multiple cancer types. Both of these plots were saved as PDFs directly from Epiviz.
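For readers unfamiliar with MA plots, the two computed columns can be sketched as below. This assumes the measurements are already on a log2 scale and uses the standard definition (M is the difference, A is the average); the function name `ma_columns` and the argument names are hypothetical and do not reflect the Epiviz computed-columns syntax.

```python
def ma_columns(normal, tumor):
    """Compute MA-plot columns from paired log2 expression values.

    M = tumor - normal       (log ratio, plotted on the y axis)
    A = (tumor + normal) / 2 (mean log expression, plotted on the x axis)
    """
    if len(normal) != len(tumor):
        raise ValueError("normal and tumor must have the same length")
    m = [t - n for n, t in zip(normal, tumor)]
    a = [(t + n) / 2 for n, t in zip(normal, tumor)]
    return m, a
```

Points with M far from zero correspond to features whose expression differs between the two conditions, and A indicates how strongly expressed those features are overall.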
- Supplementary Figure 6: Comparison of hypomethylation block finding methods. (179 KB)
We compare hypomethylation blocks inferred using BSmooth on whole-genome bisulfite sequencing with blocks inferred with minfi on Illumina HumanMethylation450k bead array data. In this plot we show the regions found along with smoothed bp-level mean methylation (for BSmooth) and probe-level mean methylation (aggregated over CpG clusters, for minfi) data. The block-finding method used in minfi ignores methylation measurements in CpG islands by design, so that long blocks of methylation change span across CpG islands. BSmooth does not use this design, so its blocks are frequently punctuated by CpG islands. We see this effect in this specific integrative visualization using Epivizr, where the only difference between the hypomethylation blocks is the punctuation at the CpG island for the BSmooth block.
- Supplementary Figure 7: The spatial distribution of genes in correlation with hypomethylated blocks. (252 KB)
Visualizing genes and corresponding exons side by side with methylation levels in normal and cancer tissues using Epiviz confirms that hypomethylated blocks are gene-poor.
- Supplementary Figure 8: Exon-level expression in differentially methylated regions. (286 KB)
The track-based visualization of exon-level expression data, side by side with a view of DNA methylation and one of differentially methylated blocks, reveals that, at low resolution, exons tend to be silenced within blocks and highly expressed outside them.
- Supplementary Text and Figures (2,187 KB)
Supplementary Figures 1–8 and Supplementary Note