Series 01 January 2018

Computational Tools

The scale and complexity of genetic and genomic data are ever-expanding, requiring biologists to apply increasingly sophisticated computational tools in the analysis, interpretation and storage of these data. This series contains articles that focus on the application of these software tools in genetics and genomics.

Content

Computational analysis of cancer genome sequencing data

In this Review the authors provide an overview of key algorithmic developments, popular tools and emerging technologies used in the bioinformatic analysis of genomes. They also describe how such analysis can identify point mutations, copy number alterations, structural variations and mutational signatures in cancer genomes.
- Isidro Cortés-Ciriano
- Doga C. Gulhan
- Peter J. Park
Review Article, 8 Dec 2021, Nature Reviews Genetics

Navigating the pitfalls of applying machine learning in genomics

Machine learning is widely applied in various fields of genomics and systems biology. In this Review, the authors describe how responsible application of machine learning requires an understanding of several common pitfalls that users should be aware of (and mitigate) to avoid unreliable results.
- Sean Whalen
- Jacob Schreiber
- Katherine S. Pollard
Review Article, 26 Nov 2021, Nature Reviews Genetics

Decoding disease: from genomes to networks to phenotypes

In this Review, the authors discuss computational methods for interpreting the molecular and clinical effects of genetic variants. They focus on methods leveraging machine learning, including those that characterize the effects on wider molecular networks.
- Aaron K. Wong
- Rachel S. G. Sealfon
- Olga G. Troyanskaya
Review Article, 2 Aug 2021, Nature Reviews Genetics

Next-generation computational tools for interrogating cancer immunity

The interactions between tumours and the immune system are highly complex. This article discusses methods — primarily computational tools — for characterizing diverse aspects of cancer–immune cell interactions, including antigen presentation, T cell repertoires and heterogeneity in cell types and cell states. The Review particularly highlights the insights from single-cell data from both sequencing technologies and in situ imaging of tissues.
- Francesca Finotello
- Dietmar Rieder
- Zlatko Trajanoski
Review Article, 12 Sep 2019, Nature Reviews Genetics

Computational tools to unmask transposable elements

The repetitive nature of transposable elements (TEs) creates bioinformatic challenges that frequently result in them being disregarded (‘masked’) in analyses. As physiological and pathological roles for TEs become increasingly appreciated, this Review discusses bioinformatics tools dedicated to TE analysis, including for genomic annotation, TE classification, identifying polymorphisms and assessing likely functional impacts.
- Patricia Goerner-Potvin
- Guillaume Bourque
Review Article, 19 Sep 2018, Nature Reviews Genetics

From genome-wide associations to candidate causal variants by statistical fine-mapping

Fine-mapping is the process by which a trait-associated region from a genome-wide association study (GWAS) is analysed to identify the particular genetic variants that are likely to causally influence the examined trait. This Review discusses the diverse statistical approaches to fine-mapping and their foundations, strengths and limitations, including integration of trans-ethnic human population data and functional annotations.
- Daniel J. Schaid
- Wenan Chen
- Nicholas B. Larson
Review Article, 29 May 2018, Nature Reviews Genetics

Piercing the dark matter: bioinformatics of long-range sequencing and mapping

Various genomics-related fields are increasingly taking advantage of long-read sequencing and long-range mapping technologies, but making sense of the data requires new analysis strategies. This Review discusses bioinformatics tools that have been devised to handle the numerous characteristic features of these long-range data types, with applications in genome assembly, genetic variant detection, haplotype phasing, transcriptomics and epigenomics.
- Fritz J. Sedlazeck
- Hayan Lee
- Michael C. Schatz
Review Article, 29 Mar 2018, Nature Reviews Genetics

Cloud computing for genomic data analysis and collaboration

Next-generation sequencing technologies have fuelled a rapid rise in data, which require vast computational resources to store and analyse. This Review discusses the role of cloud computing in genomics research to facilitate data sharing and new analyses of archived sequencing data, as well as large-scale international collaborations.
- Ben Langmead
- Abhinav Nellore
Review Article, 30 Jan 2018, Nature Reviews Genetics

Computational genomics tools for dissecting tumour–immune cell interactions

Cancer immunotherapies are promising strategies for cancer treatment. However, their optimized use will require a comprehensive understanding of the diverse cell types, antigens and genetic variants (both germline and somatic) that comprise the tumour–immune system interface. This Review discusses various bioinformatics tools that process multi-level omics data for insights into tumour–immune cell interactions.
- Hubert Hackl
- Pornpimol Charoentong
- Zlatko Trajanoski
Review Article, 4 Jul 2016, Nature Reviews Genetics

A comparison of tools for the simulation of genomic next-generation sequencing data

Computer simulation of next-generation sequencing data can be extremely useful for assessing and validating biological models, benchmarking sequence analysis tools or gaining an understanding of specific data sets. Here, the authors review the functionality, requirements and applications of 23 currently available simulation tools and provide a guide for the selection of the most appropriate one.
- Merly Escalona
- Sara Rocha
- David Posada
Review Article, 20 Jun 2016, Nature Reviews Genetics

Human genotype–phenotype databases: aims, challenges and opportunities

With biomedical datasets growing exponentially in size and number, efforts to increase their utility and availability are essential, but much work remains to maximize exploitability. This Review summarizes trends, developments and future perspectives in the rapidly advancing field of human genotype–phenotype databases.
- Anthony J. Brookes
- Peter N. Robinson
Review Article, 10 Nov 2015, Nature Reviews Genetics

Methods and models for unravelling human evolutionary history

The rapid accumulation and increasing quality of human DNA sequence-variation data brought about by advances in genome-scale sequencing present opportunities to investigate human evolution. The authors discuss the statistical methods and models that can be used to gain insight into the evolution of human populations from analyses of large-scale genomic data sets, as well as the challenges associated with these approaches.
- Joshua G. Schraiber
- Joshua M. Akey
Review Article, 10 Nov 2015, Nature Reviews Genetics

Expanding the computational toolbox for mining cancer genomes

The field of cancer genomics has been transformed by recent advances in sequencing and the development of new computational methods. This Review outlines the available cancer genomics software and describes recent insights gained from the application of these tools.
- Li Ding
- Michael C. Wendl
- Benjamin J. Raphael
Review Article, 8 Jul 2014, Nature Reviews Genetics

Emerging methods in protein co-evolution

Functional interactions between proteins and within proteins result in co-evolutionary signatures in amino acid sequences that serve as clues to various forms of interdependence. This Review discusses the principles and distinctions of the large range of computational tools to analyse protein co-evolution and the biological insight that they are providing.
- David de Juan
- Florencio Pazos
- Alfonso Valencia
Review Article, 5 Mar 2013, Nature Reviews Genetics

Sequence assembly demystified

As the use of next-generation sequencing has proliferated, so has the range of sequencing applications and software tools that are available for assembling sequences. To help readers to make informed choices about assembly techniques, this Review discusses the available options and practical trade-offs.
- Niranjan Nagarajan
- Mihai Pop
Review Article, 29 Jan 2013, Nature Reviews Genetics

Text-mining solutions for biomedical research: enabling integrative biology

Text mining — retrieving information from papers and databases — is increasingly used in data-rich fields such as genomics, systems biology and biomedical research. This Review discusses recent tools that can aid researchers and sets out the potential of enhancing integrative research using text mining.
- Dietrich Rebholz-Schuhmann
- Anika Oellrich
- Robert Hoehndorf
Review Article, 14 Nov 2012, Nature Reviews Genetics

Analysing and interpreting DNA methylation data

The analysis and interpretation of genome-wide DNA methylation data pose unique bioinformatics challenges. In this article, the tools that are available for processing, visualizing and interpreting these epigenetic data sets are discussed, and the relative advantages of various methods are considered.
- Christoph Bock
Review Article, 18 Sep 2012, Nature Reviews Genetics

Molecular phylogenetics: principles and practice

Phylogenetic analysis is pervading every field of biological study. The authors review and assess the main methods of phylogenetic analysis — including parsimony, distance, likelihood and Bayesian methods — and provide guidance for selecting the most appropriate approach and software package.
- Ziheng Yang
- Bruce Rannala
Review Article, 28 Mar 2012, Nature Reviews Genetics

A beginner's guide to eukaryotic genome annotation

Although genome sequencing is becoming routine, genome annotation is becoming increasingly challenging. The authors provide an overview of the steps and software tools that are available for annotating eukaryotic genomes, and describe the best practices for sharing, quality checking and updating the annotation.
- Mark Yandell
- Daniel Ence
Review Article, 18 Apr 2012, Nature Reviews Genetics

Computer simulations: tools for population and evolutionary genetics

Computer simulations can be valuable components of studies in many fields, including population genetics, evolutionary biology, genetic epidemiology and ecology. The recent increase in the available range of software packages is now making simulation an accessible option for researchers with limited bioinformatics experience.
- Sean Hoban
- Giorgio Bertorelle
- Oscar E. Gaggiotti
Review Article, 10 Jan 2012, Nature Reviews Genetics

Repetitive DNA and next-generation sequencing: computational challenges and solutions

Repeat sequences in DNA remain one of the most challenging aspects of next-generation sequencing data analysis and interpretation. This Review explains the problems and current strategies for handling repeats; ignoring repeats risks missing important biological information.
- Todd J. Treangen
- Steven L. Salzberg
Review Article, 29 Nov 2011, Nature Reviews Genetics

Experimental and analytical tools for studying the human microbiome

Studies of the composition, dynamics and function of the human microbiome have taken off in the past two years thanks to the development of new sequencing technologies and advanced algorithms. This article provides a guide to the experimental and analytical best practices in this flourishing field.
- Justin Kuczynski
- Christian L. Lauber
- Rob Knight
Review Article, 16 Dec 2011, Nature Reviews Genetics

Software for systems biology: from tools to integrated platforms

Systems biology is intrinsically reliant on software tools and data resources. Through looking at each stage in a systems biology workflow, this Review presents the available options and key challenges, and sets out the concept of an integrated software platform.
- Samik Ghosh
- Yukiko Matsuoka
- Hiroaki Kitano
Review Article, 3 Nov 2011, Nature Reviews Genetics

Next-generation transcriptome assembly

Advances in sequencing technologies, assembly algorithms and computing power are making it feasible to assemble the entire transcriptome from short RNA reads. The article reviews the transcriptome assembly strategies, their advantages and limitations and how to apply them effectively.
- Jeffrey A. Martin
- Zhong Wang
Review Article, 7 Sep 2011, Nature Reviews Genetics

Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data

The recent surge in sequencing output has uncovered a wealth of genetic variation, but interpretation of these data remains a challenge. This Review discusses computational and experimental methods for estimating the deleteriousness and functional significance of genetic variants to better identify those that are potentially causal for disease.
- Gregory M. Cooper
- Jay Shendure
Review Article, 18 Aug 2011, Nature Reviews Genetics

Genotype and SNP calling from next-generation sequencing data

This Review gives an overview of the steps required to convert next-generation sequencing (NGS) data into accurately called SNPs and genotypes, a process that is crucial for the many downstream analyses of NGS data.
- Rasmus Nielsen
- Joshua S. Paul
- Yun S. Song
Review Article, 18 May 2011, Nature Reviews Genetics

Genome structural variation discovery and genotyping

Structural variation in the genome can influence disease, complex traits and evolution, but comprehensive characterization of variants is challenging. This Review compares current methods — particularly microarray platforms and sequencing-based computational analysis — and considers future research strategies.
- Can Alkan
- Bradley P. Coe
- Evan E. Eichler
Review Article, 1 Mar 2011, Nature Reviews Genetics

Computational solutions to large-scale data management and analysis

This Review describes the different types of computational environments — such as cloud and heterogeneous computing — that are increasingly being used by life scientists to manage and analyse large multidimensional data sets.
- Eric E. Schadt
- Michael D. Linderman
- Garry P. Nolan
Review Article, 1 Sep 2010, Nature Reviews Genetics

Next-generation genomics: an integrative approach

A huge range of genome-scale data sets — including genomic, epigenomic and transcriptomic information — are now available, and it is widely acknowledged that combining several data sets can provide important biological insights. However, there are practical, conceptual and computational challenges to data integration.
- R. David Hawkins
- Gary C. Hon
- Bing Ren
Review Article, 8 Jun 2010, Nature Reviews Genetics
