The UCSC SARS-CoV-2 Genome Browser

Fernandes, Jason D.; Hinrichs, Angie S.; Clawson, Hiram; Gonzalez, Jairo Navarro; Lee, Brian T.; Nassar, Luis R.; Raney, Brian J.; Rosenbloom, Kate R.; Nerli, Santrupti; Rao, Arjun A.; Schmelter, Daniel; Fyfe, Alastair; Maulding, Nathan; Zweig, Ann S.; Lowe, Todd M.; Ares, Manuel; Corbet-Detig, Russ; Kent, W. James; Haussler, David; Haeussler, Maximilian

doi:10.1038/s41588-020-0700-8

Comment
Published: 09 September 2020

Jason D. Fernandes^1,2,
Angie S. Hinrichs¹,
Hiram Clawson¹,
Jairo Navarro Gonzalez¹,
Brian T. Lee¹,
Luis R. Nassar¹,
Brian J. Raney¹,
Kate R. Rosenbloom ORCID: orcid.org/0000-0001-8799-4826¹,
Santrupti Nerli¹,
Arjun A. Rao ORCID: orcid.org/0000-0003-4480-3190³,
Daniel Schmelter¹,
Alastair Fyfe¹,
Nathan Maulding¹,
Ann S. Zweig¹,
Todd M. Lowe^1,5,
Manuel Ares Jr ORCID: orcid.org/0000-0002-2552-9168^4,5,
Russ Corbet-Detig¹,
W. James Kent¹,
David Haussler ORCID: orcid.org/0000-0003-1533-4575^1,2,5 &
Maximilian Haeussler ORCID: orcid.org/0000-0001-8721-8253¹

Nature Genetics volume 52, pages 991–998 (2020)

The UCSC SARS-CoV-2 Genome Browser (https://genome.ucsc.edu/covid19.html) is an adaptation of our popular genome-browser visualization tool for this virus, containing many annotation tracks and new features, including conservation with similar viruses, immune epitopes, RT–PCR and sequencing primers and CRISPR guides. We invite all investigators to contribute to this resource to accelerate research and development activities globally.

The University of California, Santa Cruz (UCSC) Human Genome Browser¹, a web-based, interactive viewer for human and other vertebrate genome sequences featuring research data, clinical molecular data, annotations and sequence alignments, has been used for almost 20 years by hundreds of thousands of biomedical researchers and cited in more than 37,500 scientific articles. To address the current COVID-19 epidemic, we have built a similar browser for the SARS-CoV-2 reference genome (NC_045512v2, wuhCor1). Here, we provide an overview of this tool for the international community racing to understand the details of the virus, its evolution, its mechanisms of action in human cells, and its immunological and molecular vulnerabilities.

A brief introduction to the Genome Browser

The UCSC SARS-CoV-2 browser, like alternative genome browsers^2,3,4, displays the reference nucleotide sequence of the viral genome and provides an intuitive way to visualize annotations or data on specific parts of the genome (Fig. 1). The genome sequence is shown from left to right, 5′ to 3′, as an image with the label NC_045512v2, which is the National Center for Biotechnology Information (NCBI)/International Nucleotide Sequence Database Collaboration (INSDC) accession ID for the reference sequence. This reference sequence is the RNA genome isolated from one of the first cases in Wuhan, China, and is known as ‘Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome’⁵. This sequence is widely used as a standard reference and, because of its early identification, is used as the root genome in phylogenetic trees produced by Nextstrain⁶, COVID-19 Genomics UK (COG-UK)⁷ and the China National Center for Bioinformation project⁸. Above this image are the navigation controls. Buttons in the navigation controls allow users to move left and right along the genomic sequence and zoom in and out by various factors. Clicking the zoom factor ‘base’ zooms in to show the nucleotide bases in the selected region. The position bar shows the current region of the genome being viewed via a red box, and the search box allows users to manually enter a specific position to view. Coordinates in the position bar should be entered with the prefix NC_045512v2 (the NCBI RefSeq accession of this genome) followed by the start and stop nucleotide numbers (for example, NC_045512v2:25,341-25,401). Users can also directly enter a nucleotide sequence or the name of a gene (corresponding to the NCBI/INSDC annotation) to go directly to that region of the genome.
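To make the position syntax concrete, the following minimal Python sketch splits a position string of the form shown above into its parts. The function name `parse_position` and the regular expression are illustrative assumptions and are not part of the browser software; the sketch assumes that coordinates are 1-based and inclusive and that commas are optional thousands separators.

```python
import re

# Matches strings such as "NC_045512v2:25,341-25,401" (commas are optional).
POSITION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_position(text):
    """Split a position string into (sequence name, start, end).

    Hypothetical helper: coordinates are treated as 1-based and inclusive.
    """
    match = POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a position string: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return match["chrom"], start, end


chrom, start, end = parse_position("NC_045512v2:25,341-25,401")
# Number of nucleotides in the requested window (both ends are included).
print(chrom, start, end, end - start + 1)
```

Because both ends are inclusive, the window length is `end - start + 1` rather than `end - start`.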

**Fig. 1: An overview of the UCSC Genome Browser user-interface structure.**

Genome annotations are shown underneath the position bar. At the top of the annotations, the genome sequence is shown (when zoomed in) as an RNA sequence. At the time of writing, there are 50 tracks with different types of molecular information, 11 of which are shown by default. The small gray buttons on the left of the image are the track-configuration buttons, which allow users to configure how track information is displayed. Alternatively, users can right-click to configure the track.

Under the genome sequence in the image, gene and protein annotations are shown by default as blue rectangles, with the direction of transcription indicated by small arrows. If activated, additional tracks displaying alignments and variants are shown underneath.

Under the annotations display, various buttons allow users to reset the currently shown tracks back to the defaults. Two buttons, ‘add custom tracks’ and ‘track hubs’⁹, allow experienced users to add their own data. The ‘configure’ button allows users to make viewing adjustments (for example, increase the font size). The ‘reverse’ button reverse-complements the display, to show the antisense sequence from 3′ to 5′, and the ‘resize’ button fits the image to the current screen size.
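The effect of the ‘reverse’ button on the displayed sequence can be illustrated with a short Python sketch of reverse-complementing an RNA string. This is only a conceptual illustration, assuming an A/C/G/U alphabet with N for unknown bases; the helper name `reverse_complement` is hypothetical and does not reflect how the browser implements the feature.

```python
# Watson-Crick complements for an RNA alphabet (N stands for an unknown base).
COMPLEMENT = str.maketrans("ACGUNacgun", "UGCANugcan")


def reverse_complement(rna):
    """Return the antisense strand of an RNA sequence, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


print(reverse_complement("ACGAAC"))
```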

Below the buttons, the track list shows all available tracks (the first 12 are shown in Fig. 1). To provide a compact display, most tracks are hidden by default. For example, in Fig. 1, only 4 of the more than 50 currently available tracks are visible. Hovering the mouse over the title of a track for several seconds reveals a longer description of the track data. Clicking the title of the track shows a full description and configuration page for the track data. On this page, or in the track list, tracks can be set to one of the four visibility modes: ‘dense’, ‘squish’, ‘pack’ and ‘full’ (Supplementary Fig. 1). Depending on the track, different viewing modes may emphasize features of the data not immediately apparent in other modes (for example, in Supplementary Fig. 1, individual open reading frames (ORFs) are hidden in dense mode, and names are hidden in squish mode). In general, setting a track to pack mode provides a good starting point to explore the data. At the top of the page is the menu bar, which contains several tools helpful for working with the genome sequence (Table 1).

Table 1 Important tools of the UCSC Genome Browser, accessible through the menu bar

Genomic organization of SARS-CoV-2

The reference SARS-CoV-2 genome (NCBI RefSeq NC_045512.2) is a single strand of 29,903 RNA nucleotides, yet, like many viral genomes, it encodes substantial molecular complexity, generating ~10 canonical RNA transcripts, ~14 ORFs and ~29 proteins. This complex organization has several features that are atypical of standard genomic analyses.

The SARS-CoV-2 transcriptome

Like other coronaviruses, SARS-CoV-2 is a single-stranded positive-sense RNA, which shares many features with the messenger RNAs of most human genes, including 5′ and 3′ untranslated regions as well as a 3′-poly(A) tail¹⁰. In infected cells, the first gene (orf1a/orf1ab) is directly translated and cleaved into proteins that form the replication–transcription complexes (RTCs)¹¹. These RTCs use the same positive-strand RNA genome as a template to generate negative-strand RNAs. During negative-strand transcription, RTCs occasionally encounter a body transcription-regulatory sequence (TRS-B). TRS-B sequences work in combination with a single leader transcription-regulatory sequence (TRS-L) at the 5′ end of the genome. When RTCs encounter a TRS-B sequence, they can ‘jump’ to the TRS-L sequence via long range RNA–RNA interactions, thereby generating a negative-strand RNA with a large portion of the genome omitted¹² (Fig. 2a). These negative-strand RNAs then serve as templates for the transcription of positive-strand subgenomic mRNAs. Positive-strand subgenomic RNAs created through this mechanism have downstream AUGs in an optimal start-codon context, thus allowing ribosomes to initiate translation at locations normally not available in the full-length genomic RNA. These subgenomic RNAs serve as mRNAs for viral proteins encoded downstream in the genome.
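The outcome of the TRS-mediated ‘jump’ can be sketched in a few lines of Python. This is a deliberately simplified toy model, not a description of the biochemistry: it assumes that the first occurrence of the motif in the sequence acts as TRS-L, that every later occurrence is a candidate TRS-B, and that the resulting subgenomic mRNA consists of the 5′ leader (through TRS-L) joined directly to the sequence downstream of the chosen TRS-B. The toy sequence and the function name `subgenomic_mrna` are invented for illustration.

```python
def subgenomic_mrna(genome, trs, body_index):
    """Join the 5' leader to the region downstream of a chosen body TRS.

    Toy model: the first motif occurrence is treated as TRS-L and all
    later occurrences as TRS-B sites (numbered from 0, 5' to 3').
    """
    leader_site = genome.find(trs)
    if leader_site == -1:
        raise ValueError("no TRS motif found")
    body_sites = []
    pos = genome.find(trs, leader_site + 1)
    while pos != -1:
        body_sites.append(pos)
        pos = genome.find(trs, pos + 1)
    body_site = body_sites[body_index]
    leader = genome[: leader_site + len(trs)]
    body = genome[body_site + len(trs):]
    return leader + body


# Invented toy sequence: leader, then two "genes", each preceded by a body TRS.
toy = "GG" + "ACGAAC" + "UUUU" + "ACGAAC" + "AUGCCC" + "ACGAAC" + "AUGGGG"
print(subgenomic_mrna(toy, "ACGAAC", 0))
print(subgenomic_mrna(toy, "ACGAAC", 1))
```

In this model, choosing a TRS-B further downstream omits a larger portion of the genome and places a different AUG immediately after the leader.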

**Fig. 2: Molecular and genomic visualizations of SARS-CoV-2 transcription and protein cleavage in the browser.**

In the SARS-CoV-2 browser, we have included a transcriptome track, which annotates each predicted TRS site in the reference genome according to the presence of the motif ACGAAC, the reported core TRS for SARS-CoV¹³. The track also includes the canonical mRNAs produced from the full-length positive-strand RNA (Fig. 2b). Another track, Kim transcripts, shows subgenomic mRNAs that have been experimentally validated by transcriptomic sequencing¹⁴. In addition, the Kim-transcripts track includes annotations of several recently reported experimentally observed subgenomic RNAs, some of which have non-canonical junctions¹⁴. As the SARS-CoV-2 transcriptome continues to be elucidated, we will update and add appropriate tracks, including recent reports of variant ORFs¹⁵.
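A motif scan of the kind used to predict TRS sites can be approximated as follows. This is a minimal sketch under stated assumptions, not the code used to build the track: it reports every exact occurrence of ACGAAC on the given strand, returns 1-based start positions, and is shown on an invented toy sequence rather than the reference genome.

```python
import re


def find_motif(sequence, motif="ACGAAC"):
    """Return the 1-based start position of every exact motif occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


print(find_motif("GGACGAACUUUUACGAAC"))
```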

The SARS-CoV-2 proteome

The SARS-CoV-2 proteome consists of two polyproteins, four structural proteins and possibly nine accessory proteins^5,16,17. The two polyproteins are processed into 16 non-structural proteins, thus requiring consideration of as many as 29 proteins in analyses (Fig. 2c).

The polyproteins pp1a and pp1ab are products of orf1a and orf1ab, respectively, both of which are produced by translation of the full-length genomic RNA. To generate pp1a, the ribosome initiates translation at the AUG codon at nucleotides 266–268. Translation continues in the canonical manner until a UAA stop codon is encountered at nucleotides 13481–13483, thus producing the 4,405–amino acid pp1a polyprotein. This polyprotein contains two viral proteases (nsp3/PL-PRO and nsp5/3CL-PRO), which cleave the pp1a polyprotein into 11 mature non-structural proteins (nsp1–11)^18,19, which are shown in the UniProt/mature proteins and NCBI Proteins tracks (Fig. 2d).
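The length of pp1a follows directly from the coordinates given above, as the following Python sketch shows. The helper `cds_length_aa` is a hypothetical name; it assumes 1-based inclusive coordinates and that the stated end coordinate includes the stop codon.

```python
def cds_length_aa(start, end, includes_stop=True):
    """Number of amino acids encoded between two 1-based inclusive coordinates."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    codons = span // 3
    # The stop codon occupies one codon but does not encode an amino acid.
    return codons - 1 if includes_stop else codons


# AUG at 266-268 through the UAA stop codon at 13481-13483.
print(cds_length_aa(266, 13483))
```

The span of 13,218 nucleotides corresponds to 4,406 codons, of which the last is the stop codon, giving 4,405 amino acids.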

The pp1ab polyprotein is generated from translation initiation at the same start codon as pp1a; however, the viral RNA contains a ‘slippery’ sequence and structured RNA element that occasionally cause the translation machinery to slip near nucleotide C13468 and therefore read this nucleotide twice: once as the final nucleotide in the AAC codon for amino acid p.Asn4401 and once again as the first nucleotide in the CGG codon for amino acid p.Arg4402 (refs. ^20,21 and Fig. 3a). The result is a ‘programmed’ –1 frameshift, with the pp1a ‘stop’ codon no longer in frame, thus lengthening pp1ab by 2,695 entirely different amino acids. These additional amino acids encode essential non-structural proteins (Figs. 2b and 3); thus, pp1ab encodes nsp1–10 as well as nsp12–16. To properly display this frameshift in the genome browser, we use specialized colored codons that lead to faithful translation of each ORF (Fig. 3a) in the track NCBI Genes.
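The coordinate arithmetic behind the frameshift can be checked with a short Python sketch. The start codon position, the slip site and the residue numbers are taken from the text; the orf1ab end coordinate (21555, stop codon included) is an assumption taken from the RefSeq annotation and is not stated in this article, and the helper names are hypothetical.

```python
PP1A_START = 266   # first nucleotide of the shared AUG start codon
SLIP_SITE = 13468  # nucleotide read twice during the -1 frameshift
PP1AB_END = 21555  # assumption: last nucleotide of the orf1ab stop codon (RefSeq)


def codon_span(cds_start, aa_index):
    """1-based inclusive genome coordinates of the codon for residue aa_index."""
    first = cds_start + 3 * (aa_index - 1)
    return first, first + 2


def joined_length(segments):
    """Total nucleotides in a list of 1-based inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


# The codon for p.Asn4401 ends on the slip site ...
print(codon_span(PP1A_START, 4401))
# ... and after the slip the next codon (p.Arg4402) starts on the same nucleotide.
segments = [(PP1A_START, SLIP_SITE), (SLIP_SITE, PP1AB_END)]
total_codons = joined_length(segments) // 3
pp1ab_aa = total_codons - 1  # the final codon is the stop codon
print(pp1ab_aa, pp1ab_aa - 4401)
```

Listing nucleotide 13468 in both segments is what encodes the –1 shift: the joined sequence stays a multiple of three only because that position is counted twice.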

**Fig. 3: *orf1a/orf1ab* ribosomal frameshifting.**

The remaining proteins are produced through traditional ribosomal recognition of AUG sequences in subgenomic RNAs (Fig. 2). Of particular interest is the spike protein (S), which is the target of most immunology-based therapies. The S protein governs the entry of the virus into the cell and is cleaved by host proteases. Other therapeutic targets include the RNA-dependent RNA polymerase (RdRp, also known as Pol/nsp12), which makes copies of the viral RNA genome; the virally encoded proteases 3CL-PRO and PL-PRO; the viral envelope protein E; the membrane protein M; and the nucleocapsid protein N, which organizes the RNA genome of the virus in viral particles.

SARS-CoV-2 has at least five accessory proteins encoded by orf3a, orf6, orf7a, orf7b and orf8. Additional accessory ORFs not in the NCBI and UniProt SARS-CoV-2 annotations have been observed in other coronaviruses (orf3b, orf9b and orf9c), although whether these ORFs produce functional proteins in SARS-CoV-2 is unclear^22,23. Additionally, orf10, present in most annotation sets, has little experimental support as a protein-coding gene^14,24. A variety of variant ORFs have also been recently reported for non-canonical subgenomic mRNAs¹⁵.

An overview of SARS-CoV-2 genome annotation tracks

We provide several standard annotation tracks based on molecular data generated by experimental and computational analyses of the SARS-CoV-2 genome. These annotations are sorted into groups as described below.

Mapping and sequencing

Tracks in this group are all based on short segments of local nucleotide composition. We have also added tracks specific to this viral genome: COVID-19 RT–PCR primers (https://sites.google.com/view/opencovid19), nanopore sequencing primers from the ARTIC Network (https://artic.network) and high-scoring and validated CRISPR–Cas13 guides^25,26. The crowd-sourced annotations track (described below) contains CRISPR guides used in SARS-CoV-2 detection via Cas12 (ref. ²⁷) and Cas13 (refs. ^26,28), as well as loop-mediated isothermal amplification (LAMP) primers²⁹. These tracks can be used in combination with the variants tracks to determine whether specific primers or detection methods might be less effective in detecting certain viral clades (Supplementary Fig. 2).
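Comparing primer tracks with variant tracks amounts to an interval-overlap test, which can be sketched in Python as follows. The primer names, coordinates and variant positions below are invented placeholders, not real primer or variant data, and the function name is hypothetical; coordinates are assumed to be 1-based and inclusive.

```python
def primers_hit_by_variants(primers, variant_positions):
    """Map each primer name to the variant positions that fall inside it.

    primers: iterable of (name, start, end) with 1-based inclusive coordinates.
    variant_positions: iterable of 1-based genome positions.
    """
    hits = {}
    for name, start, end in primers:
        overlapping = sorted(p for p in variant_positions if start <= p <= end)
        if overlapping:
            hits[name] = overlapping
    return hits


# Placeholder data for illustration only.
example_primers = [("primer_A", 100, 120), ("primer_B", 500, 522), ("primer_C", 900, 925)]
example_variants = {110, 300, 522}
print(primers_hit_by_variants(example_primers, example_variants))
```

A primer that appears in the result may still work, but it is a candidate for closer inspection in clades carrying the overlapping variant.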

Genes and gene predictions

These tracks contain information centered around the genes in the viral genome, as illustrated in Fig. 1. For instance, the NCBI Genes track contains annotations of viral gene models from NCBI. Because individual viral genes often have many names (for example, nsp12, RdRp or Pol), many of these tracks list synonyms or notes in additional fields (viewable by clicking an annotation) so that researchers can compare the annotation nomenclature. Additional tracks contain information such as interactions between viral proteins and human proteins from affinity-purification and mass-spectrometry experiments²² (protein interact), PDB structures and Rfam and other predicted RNA structure annotations³⁰ (Rangan RNA).

UniProt protein annotations

Protein annotations from SwissProt/UniProt³¹ are an essential complement to the NCBI RefSeq gene annotations³². These tracks display a variety of protein annotation data including highlights, special regions highlighted by SwissProt curators (for example, the region of the S protein that binds the human receptor protein ACE2); mature products, the mature proteins that result from polypeptide cleavage (Fig. 2c); protein domains; and glycosyl/phosphoryl sites of post-translational modification.

Immunology

These tracks contain SARS-CoV-2 protein epitopes reported in the literature. Included are epitopes that have been predicted and/or validated to be immunogenic. These data can be overlaid with structural and variation information to track mutations that overlap with potential therapeutic targets (Fig. 4). The data feature both linear epitopes recognized by B-cell receptors or antibodies as well as information on conformational epitopes^33,34 (recorded in the crowd-sourced-annotations track; data not shown). These tracks (IEDB predictions, Poran HLA I and Poran HLA II) also display epitopes recognized by CD8⁺ or CD4⁺ T cells when presented by human leukocyte antigen (HLA) molecules on host cells^35,36. When possible, the latter are organized according to the HLA allele of the host used in their presentation. For the track CD8 RosettaMHC, interactive 3D models from Rosetta are available through clicking on the annotated epitope³⁷. We will update this track group as validation and identification of epitopes continues.
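Epitopes are usually reported in protein coordinates, so overlaying them on genome tracks requires converting residue numbers to nucleotide positions. The Python sketch below shows this conversion for an unspliced coding sequence without frameshifts; the S coding-sequence start (21563) is an assumption taken from the RefSeq annotation rather than from this article, the example epitope is invented, and the function name is hypothetical.

```python
S_CDS_START = 21563  # assumption: first nucleotide of the S start codon (RefSeq)


def peptide_to_genome(cds_start, aa_start, aa_end):
    """Convert a 1-based residue range to 1-based inclusive genome coordinates.

    Valid only for a contiguous coding sequence with no frameshift
    between cds_start and the peptide.
    """
    start = cds_start + 3 * (aa_start - 1)
    end = cds_start + 3 * aa_end - 1
    return start, end


# Invented nine-residue epitope covering residues 10-18 of the S protein.
print(peptide_to_genome(S_CDS_START, 10, 18))
```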

**Fig. 4: Combining data tracks to generate hypotheses.**

Comparative genomics

This group contains three tracks that show multiple alignments built from sequences provided by NCBI/INSDC: (1) 7 human CoVs, an alignment of the seven coronavirus sequences that infect humans, (2) 119 vertebrate CoVs, an alignment of 119 sequences, most of which are human coronaviruses (though not necessarily SARS-CoV-2), with various animal viruses and (3) 44 Bat CoVs, an alignment of 44 various bat coronaviruses most closely related to human SARS CoV-2. The mutations in the alignments are colored according to their effects on the protein (white, no difference; red, nonsynonymous; green, synonymous; blue, noncoding; yellow, missing data due to unalignable, absent or unknown sequences). Analysis of evolutionary rates derived from comparative genomics, including insertions and deletions, can pinpoint functionally interesting sections of the viral genome. For instance, comparison of coronaviruses across species and across host genomes can clearly illustrate evolutionary features such as accelerated evolution at receptor spike-binding regions^38,39 (Supplementary Fig. 3).
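The coloring scheme can be expressed as a small decision procedure. The Python sketch below is an illustration of the logic described above, not the browser's implementation: it compares one reference codon with the aligned codon, uses the standard genetic code, and treats any character other than A, C, G, T or U in the aligned codon as missing data. The function name and the order of the checks are assumptions.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def mutation_color(ref_codon, alt_codon, coding=True):
    """Return the display color for an aligned codon relative to the reference."""
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    if any(base not in BASES for base in alt):
        return "yellow"  # missing, unalignable or unknown sequence
    if ref == alt:
        return "white"   # no difference
    if not coding:
        return "blue"    # difference outside a coding region
    # Same amino acid: synonymous (green); otherwise nonsynonymous (red).
    return "green" if CODON_TABLE[ref] == CODON_TABLE[alt] else "red"


print(mutation_color("AAC", "AAU"), mutation_color("AAC", "AAA"))
```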

Variation and repeats

These tracks contain information on the variation and evolutionary patterns observed in SARS-CoV-2 sequences in samples from around the world. NextstrainVars (Supplementary Fig. 4) contains the time-stamped molecular phylogenetic tree produced by the Nextstrain team⁶, on the basis of complete and quality-controlled viral genomes from Global Initiative on Sharing All Influenza Data (GISAID)⁴⁰. The tree is shown in pack and squish views. Dense and full modes display the frequencies at which these variants are found. In the tree, samples are sorted according to their order in the JSON file produced by Nextstrain describing the phylogenetic tree, and they appear in different colors according to clades identified by Nextstrain. Tools are provided to filter these data to show only well-supported mutation calls, set thresholds for minor-allele frequency and display data for specific clades.
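Filtering by minor-allele frequency can be illustrated with a brief Python sketch. The definition used here (the count of the second most common allele divided by the total count at that position) is an assumption for illustration, as are the function names and the invented allele counts; the browser's own filter may be defined differently.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele at one position."""
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total


def filter_by_maf(variants, threshold):
    """Keep positions whose minor-allele frequency is at least the threshold."""
    return [
        position
        for position, counts in variants.items()
        if minor_allele_frequency(counts) >= threshold
    ]


# Invented allele counts keyed by genome position.
example_counts = {100: {"C": 98, "T": 2}, 200: {"A": 60, "G": 40}}
print(filter_by_maf(example_counts, 0.05))
```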

Crowd-sourced data

To encourage shared community annotation of the viral genome without requiring submitters to have detailed knowledge of genomic data type formats, we have created a crowd-sourced-annotations track. Its documentation includes a link (http://bit.ly/cov2annots) to a spreadsheet in which anyone can enter the start and end positions of some annotations, descriptive text, and links to websites or articles containing the source information. Every night, all new user annotations from the previous day undergo a brief manual check and are added to the public crowd-sourced-annotations track.

Custom tracks

Users can add annotations for their own use or to share with other groups by clicking the ‘custom tracks’ button. Annotation data can be pasted directly into a text box or uploaded in a standard genomic file format. To share a custom track, users can create a stable link via the menu My Data → My Sessions. Session links can be shared multiple times and will always load the data exactly as originally saved.
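As an example of text that can be pasted into the custom-track box, the following Python sketch writes a small track in BED format. It assumes the usual BED conventions (a `track` header line, then one feature per line with a 0-based start and an exclusive end) and the sequence name NC_045512v2; the function name, track name and feature are placeholders.

```python
def make_custom_track(name, description, features, chrom="NC_045512v2"):
    """Build BED-format custom-track text from 1-based inclusive features.

    features: iterable of (start, end, label) tuples.
    """
    lines = [f'track name="{name}" description="{description}"']
    for start, end, label in features:
        # BED uses a 0-based start and an exclusive end, so only the start shifts.
        safe_label = label.replace(" ", "_")
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{safe_label}")
    return "\n".join(lines) + "\n"


print(make_custom_track("demo", "Demo annotations", [(266, 268, "start codon")]))
```

The shift from 1-based inclusive to BED coordinates is a common source of off-by-one errors, so it is worth checking one feature in the browser after loading.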

Discussion

The rapid pace of SARS-CoV-2 research is generating a wealth of molecular and genomic data across a variety of databases. The UCSC Genome Browser is an established and highly accessed web-based viewer and standardized repository of genomic data with extensive functionality and a 20-year track record of serving the scientific community.

The browser, together with its underlying data organization, is a familiar environment for hundreds of thousands of biomedical researchers who study the human genome, using it to explore and download standardized data formats from a single source for custom analyses. Here, we hope to introduce these tools to virologists, epidemiologists, vaccinologists, antiviral-therapy developers and those seeking to repurpose existing biomedical resources and therapies to combat the virus and its pathological effects. To leverage the expertise of scientists who lack familiarity with genomic analyses but possess expertise in other areas, we have developed a crowd-sourced-data track in which users can simply enter the coordinate and name of a feature. This annotation is then displayed and shared with others on the browser, where it will be viewed and compared with the many existing data tracks. Together, these features make the SARS-CoV-2 browser a simple but powerful tool for researchers to track developments in SARS-CoV-2 science, to detail the changes in the viral genome over the course of the pandemic, and to develop testable hypotheses and novel strategies to combat it.

All SARS-CoV-2 data shown at https://genome.ucsc.edu can be downloaded as customizable spreadsheet tables via our tool UCSC Table Browser at https://genome.ucsc.edu/cgi-bin/hgTables or can be downloaded as raw data from our data downloads server, https://hgdownload.soe.ucsc.edu. In accordance with the GISAID license, we cannot allow downloading of any mutation data derived from these databases, but the raw sequences can be downloaded from https://gisaid.org after registration. Researchers who face problems downloading data from our website are invited to contact the help desk at genome@soe.ucsc.edu.

As scientists continue to generate SARS-CoV-2 data, we will continue to rapidly process, display and share these data in the SARS-CoV-2 browser. We urge authors to contact us at genome-www@soe.ucsc.edu for help in properly citing, annotating and displaying their data in a clear, accurate and intuitive manner in the browser so that it can reach the widest possible audience of researchers. Through this type of open collaboration, we believe the SARS-CoV-2 browser will facilitate the analysis and display of the collective molecular information needed to defeat the virus.

References

1. Kent, W. J. et al. Genome Res. 12, 996–1006 (2002).
2. Flynn, J. A. et al. Preprint at bioRxiv https://doi.org/10.1101/2020.02.07.939124 (2020).
3. Buels, R. et al. Genome Biol. 17, 66 (2016).
4. Stalker, J. et al. Genome Res. 14, 951–955 (2004).
5. Wu, F. et al. Nature 579, 265–269 (2020).
6. Hadfield, J. et al. Bioinformatics 34, 4121–4123 (2018).
7. Rambaut, A. et al. Nat. Microbiol. https://doi.org/10.1038/s41564-020-0770-5 (2020).
8. Zhao, W.-M. et al. Yi Chuan 42, 212–221 (2020).
9. Raney, B. J. et al. Bioinformatics 30, 1003–1005 (2014).
10. Chen, Y., Liu, Q. & Guo, D. J. Med. Virol. 92, 418–423 (2020).
11. Nakagawa, K., Lokugamage, K. G. & Makino, S. Adv. Virus Res. 96, 165–192 (2016).
12. Sola, I., Almazán, F., Zúñiga, S. & Enjuanes, L. Annu. Rev. Virol. 2, 265–288 (2015).
13. Yount, B., Roberts, R. S., Lindesmith, L. & Baric, R. S. Proc. Natl Acad. Sci. USA 103, 12546–12551 (2006).
14. Kim, D. et al. Cell 181, 914–921.e10 (2020).
15. Nomburg, J., Meyerson, M. & DeCaprio, J. A. Preprint at bioRxiv https://doi.org/10.1101/2020.04.28.066951 (2020).
16. Brian, D. A. & Baric, R. S. Curr. Top. Microbiol. Immunol. 287, 1–30 (2005).
17. Fehr, A. R. & Perlman, S. Methods Mol. Biol. 1282, 1–23 (2015).
18. Barretto, N. et al. J. Virol. 79, 15189–15198 (2005).
19. Zhang, L. et al. Science 368, 409–412 (2020).
20. Bekaert, M. & Rousset, J.-P. Mol. Cell 17, 61–68 (2005).
21. Plant, E. P. & Dinman, J. D. RNA 12, 666–673 (2006).
22. Gordon, D. E. et al. Nature 583, 459–468 (2020).
23. Schaecher, S. R. & Pekosz, A. in Molecular Biology of the SARS-Coronavirus (ed. Lal, S. K.) 153–166 (Springer, 2010).
24. Davidson, A. D. et al. Genome Med. 12, 68 (2020).
25. Abbott, T. R. et al. Cell 181, 865–876.e12 (2020).
26. Wessels, H.-H. et al. Nat. Biotechnol. 38, 722–727 (2020).
27. Broughton, J. P. et al. Nat. Biotechnol. 38, 870–874 (2020).
28. Metsky, H. C., Freije, C. A., Kosoko-Thoroddsen, T.-S. F., Sabeti, P. C. & Myhrvold, C. Preprint at bioRxiv https://doi.org/10.1101/2020.02.26.967026 (2020).
29. Park, G.-S. et al. J. Mol. Diagn. 22, 729–735 (2020).
30. Rangan, R., Watkins, A. M., Kladwang, W. & Das, R. Preprint at bioRxiv https://doi.org/10.1101/2020.04.14.041962 (2020).
31. UniProt Consortium. Nucleic Acids Res. 47, D506–D515 (2019).
32. Pruitt, K. D., Tatusova, T. & Maglott, D. R. Nucleic Acids Res. 35, D61–D65 (2007).
33. Pinto, D. et al. Preprint at bioRxiv https://doi.org/10.1101/2020.04.07.023903 (2020).
34. Yuan, M. et al. Science 368, 630–633 (2020).
35. Grifoni, A. et al. Cell Host Microbe 27, 671–680.e2 (2020).
36. Poran, A. et al. Genome Med. 12, 70 (2020).
37. Nerli, S. & Sgourakis, N. G. Preprint at bioRxiv https://doi.org/10.1101/2020.03.23.004176 (2020).
38. Demogines, A., Farzan, M. & Sawyer, S. L. J. Virol. 86, 6350–6353 (2012).
39. Damas, J. et al. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.2010146117 (2020).
40. Shu, Y. & McCauley, J. Euro Surveill. 22, 30494 (2017).
41. Andersen, K. G., Rambaut, A., Lipkin, W. I., Holmes, E. C. & Garry, R. F. Nat. Med. 26, 450–452 (2020).
42. Korber, B. et al. Preprint at bioRxiv https://doi.org/10.1101/2020.04.29.069054 (2020).
43. Pettersen, E. F. et al. J. Comput. Chem. 25, 1605–1612 (2004).
44. Grubaugh, N. D., Hanage, W. P. & Rasmussen, A. L. Cell 182, 794–795 (2020).
45. Kent, W. J. Genome Res. 12, 656–664 (2002).

Acknowledgements

We thank K. Kober and K. Charbonneau from the Kober laboratory at UCSF for their contributions to the crowd-sourced data track, as well as H. Beale, J. Sim, A. Gonzalez Armenta, P. Berman, N. Sgourakis, E. Jung, P. Angulo and the rest of the scientists at UCSC and around the world for making these tracks of molecular information possible.

The UCSC Human Genome Browser software, quality control and training are funded by the NIH National Human Genome Research Institute, currently with grant 5U41HG002371. The SARS-CoV-2 genome browser and data annotation tracks are funded by generous individual donors including Pat & Rowland Rebele, Eric and Wendy Schmidt by recommendation of the Schmidt Futures program, the Center for Information Technology Research in the Interest of Society (CITRIS, 2020-0000000020) and University of California Office of the President (UCOP) Emergency COVID-19 Research Seed Funding Grant R00RG2456. Funding for open-access charges was provided by the National Human Genome Research Institute (5U41HG002371). R.B.C. was supported by NIGMS/R35GM128932.

Author information

Authors and Affiliations

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Jason D. Fernandes, Angie S. Hinrichs, Hiram Clawson, Jairo Navarro Gonzalez, Brian T. Lee, Luis R. Nassar, Brian J. Raney, Kate R. Rosenbloom, Santrupti Nerli, Daniel Schmelter, Alastair Fyfe, Nathan Maulding, Ann S. Zweig, Todd M. Lowe, Russ Corbet-Detig, W. James Kent, David Haussler & Maximilian Haeussler
Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Jason D. Fernandes & David Haussler
ImmunoX Initiative, University of California San Francisco, San Francisco, CA, USA
Arjun A. Rao
Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
Manuel Ares Jr
Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, CA, USA
Todd M. Lowe, Manuel Ares Jr & David Haussler

Contributions

A.S.H., H.C., J.N.G., B.T.L., L.R.N., B.J.R., K.R.R., D.S., A.S.Z., W.J.K. and M.H. built the SARS-CoV-2 Browser. J.D.F., A.S.H., S.N., A.A.R., B.J.R., M.H., T.M.L., N.M., A.F. and M.A. developed tracks for the browser. J.D.F., T.M.L., M.A., R.C.-D., W.J.K., D.H. and M.H. provided general guidance on aspects of virology, RNA biology and genomics. J.D.F., D.H. and M.H. wrote the manuscript.

Corresponding authors

Correspondence to David Haussler or Maximilian Haeussler.

Ethics declarations

Competing interests

A.S.H., H.C., J.N.G., B.T.L., L.R.N., B.J.R., K.R.R., D.S., A.S.Z., W.J.K., D.H. and M.H. receive royalties from the sale of UCSC Genome Browser source code, LiftOver, GBiB and GBiC licenses to commercial entities. W.J.K. owns Kent Informatics.

Supplementary information

Supplementary Information

Supplementary Note and Figs. 1–4

About this article

Cite this article

Fernandes, J.D., Hinrichs, A.S., Clawson, H. et al. The UCSC SARS-CoV-2 Genome Browser. Nat Genet 52, 991–998 (2020). https://doi.org/10.1038/s41588-020-0700-8

Published: 09 September 2020
Issue Date: October 2020
DOI: https://doi.org/10.1038/s41588-020-0700-8
