The genomics world has no shortage of visualization tools. But as new methods and data types emerge, existing techniques can struggle to cope. Now, a tool known as Gosling allows bioinformaticians to build apps that can display genomic information with the same level of flexibility that developers have come to expect from other graphics programming tools.
First released in 2020 by bioinformatician Nils Gehlenborg and his team at Harvard Medical School in Boston, Massachusetts, Gosling stands for ‘grammar of scalable linked interactive nucleotide graphics’1. But the name is also a nod to structural biologist Raymond Gosling, who, with Rosalind Franklin, captured the famous ‘Photograph 51’, which revealed the structure of DNA.
Gosling is what is known as a grammar. It is implemented in programming libraries that provide a flexible syntax for describing genomic regions and interactions and how they should be laid out on a web page. Researchers and bioinformaticians can use these libraries to create interactive, scalable visualizations that they can share with their colleagues, and to build bespoke genetic-analysis tools.
Bridging the gap
“Gosling really bridges that gap, making it way easier to make new tools that have visualization components,” says Maria Nattestad, a software engineer at Google in Mountain View, California. As part of her PhD research in 2015, Nattestad developed a tool called SplitThreader, which presents the genome in a circular layout known as a Circos plot, with sequencing reads as arcs to highlight structural variations. With no other options, she drew those elements from scratch, using D3.js to specify the placement and dimensions of each line, rectangle and circle. “It was such a learning curve,” she says. “It took me a long time to build SplitThreader.” She adds that it could probably have been built a lot faster with Gosling.
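To give a sense of the bookkeeping involved in drawing a circular layout by hand, here is a minimal sketch in Python. It is not SplitThreader's code and does not use D3.js; the function names and the convention of starting at the top of the circle and moving clockwise are assumptions made for illustration.

```python
import math

def genome_to_angle(position, genome_length):
    """Map a base-pair position to an angle in radians.

    Assumed convention: 0 is at 12 o'clock and angles increase clockwise.
    """
    return 2 * math.pi * position / genome_length

def angle_to_xy(angle, radius, cx=0.0, cy=0.0):
    """Convert a clockwise-from-top angle to screen coordinates.

    Screen coordinates follow the SVG convention: y grows downward.
    """
    return (cx + radius * math.sin(angle), cy - radius * math.cos(angle))

def arc_endpoints(start, end, genome_length, radius):
    """Return the two (x, y) points where an arc for start..end meets the circle."""
    a0 = genome_to_angle(start, genome_length)
    a1 = genome_to_angle(end, genome_length)
    return angle_to_xy(a0, radius), angle_to_xy(a1, radius)

if __name__ == "__main__":
    # Hypothetical numbers: a 1,000,000-bp genome drawn on a circle of radius 100.
    p0, p1 = arc_endpoints(0, 250_000, 1_000_000, 100)
    print(p0, p1)
```

Every line, rectangle and arc in a hand-built plot needs this kind of coordinate conversion, which is the work a grammar such as Gosling is meant to take over.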
Gehlenborg says that Gosling arose out of a 2019 literature review2, during which his team surveyed the genome visualization landscape and built up a taxonomy for the tools and their capabilities. From there, the researchers developed a syntax to systematically describe the visualizations those tools could make. Gosling, Gehlenborg explains, “is a fundamental approach to assemble genomic visualizations using that same taxonomy”.
Postdoc Sehi L’Yi, who led Gosling’s development, says that what differentiates Gosling from other visualization tools is its expressiveness. With most tools, he says, the graphics that can be made and what they will look like are predefined. “It is really not easy to customize visualizations as a user.” But with Gosling, users can, for instance, specify the colour, dimensions and placement of the symbol used to represent a centromere or genomic interval, then overlay that on an ideogram of a chromosome to highlight a region of interest.
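As a rough illustration of what such a specification might look like, the sketch below builds a Gosling-style description in Python and prints it as JSON. It overlays a highlighted centromere layer on a plain ideogram layer. The property names (`alignment`, `tracks`, `mark`, `x`, `xe`, `color`, `dataTransform`) follow one reading of Gosling's JSON grammar and should be checked against the current documentation; the file `cytobands.csv`, its column names and the helper `band_layer` are hypothetical.

```python
import json

# Hypothetical input: a CSV of chromosome bands with columns
# chrom, start, end and stain. The path is a placeholder, not a real dataset.
CYTOBANDS = {
    "url": "cytobands.csv",
    "type": "csv",
    "chromosomeField": "chrom",
    "genomicFields": ["start", "end"],
}

def band_layer(color, stains=None):
    """Describe one layer of rectangles spanning start..end on the genome.

    If `stains` is given, only bands whose stain is in that list are kept.
    """
    layer = {
        "mark": "rect",
        "x": {"field": "start", "type": "genomic"},
        "xe": {"field": "end", "type": "genomic"},
        "color": {"value": color},
    }
    if stains is not None:
        layer["dataTransform"] = [
            {"type": "filter", "field": "stain", "oneOf": stains}
        ]
    return layer

# Two layers share the same data and axes: every band in grey, then the
# centromere bands (assumed here to carry the stain value "acen") on top.
spec = {
    "alignment": "overlay",
    "data": CYTOBANDS,
    "width": 700,
    "height": 30,
    "tracks": [
        band_layer("lightgray"),
        band_layer("crimson", stains=["acen"]),
    ],
}

if __name__ == "__main__":
    print(json.dumps(spec, indent=2))
```

The point of the sketch is the declarative style: the colour, size and position of each symbol are stated as data, and changing the highlight means editing a value rather than rewriting drawing code.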
An interesting space
A team of master’s students at the University of British Columbia used Gosling for its final project in a data-visualization class. “One of my teammates had heard about it at a conference last year,” says team member Armita Safa. “Even for someone who doesn’t have a coding background, it is relatively easier to work with Gosling than most other things that are used for visualization,” she says. That said, she notes that the team initially struggled to extract the data it needed to allow users to click on regions and create new visualizations.
Dominic Girardi, chief product officer at the data visualization company Datavisyn in Linz, Austria, has also experimented with Gosling to create an interactive playground that allows users to filter a table of genes by genomic region. The firm — which Gehlenborg co-founded — is now using Gosling to generate visualization tools for its corporate clients, although it has not yet completed one, Girardi says.
Gosling isn’t the only visualization library for genomic data; other examples include ggbio, gggenomes and gggenes, all of which are extensions of the ggplot2 graphing library. But most of these tools create static images, pictures rather than interactive visualizations, Gehlenborg says. Future plans for Gosling include giving it a graphical interface, he adds, so that researchers can create visualizations by dragging and dropping widgets onto a virtual canvas rather than having to program them.
Robert Buels, who is leading development of a genome browser at the University of California, Berkeley, says that Gosling “occupies a really interesting space” in the genomics visualization toolbox. “You can get a lot more customizability with Gosling,” he says. But users don’t have to write nearly as much code as they do for tools such as D3.js.
“It’s a really interesting niche in between the two things,” he says, “that I think is a really great addition to the field.”