Primers

  • Primer |

    A mathematical concept known as a de Bruijn graph turns the formidable challenge of assembling a contiguous genome from billions of short sequencing reads into a tractable computational problem.

    • Phillip E C Compeau, Pavel A Pevzner & Glenn Tesler
  • Primer |

    Hierarchical models provide reliable statistical estimates for data sets from high-throughput experiments where measurements vastly outnumber experimental samples.

    • Hongkai Ji & X Shirley Liu
  • Primer |

    Flux balance analysis is a mathematical approach for analyzing the flow of metabolites through a metabolic network. This primer covers the theoretical basis of the approach, several practical examples and a software toolbox for performing the calculations.

    • Jeffrey D Orth, Ines Thiele & Bernhard Ø Palsson
  • Primer |

    When prioritizing hits from a high-throughput experiment, it is important to correct for random events that falsely appear significant. How is this done and what methods should be used?

    • William S Noble
  • Primer |

    Networks in biology can appear complex and difficult to decipher. Merico et al. illustrate how to interpret biological networks with the help of frequently used visualization and analysis patterns.

    • Daniele Merico, David Gfeller & Gary D Bader
  • Primer |

    Mapping the vast quantities of short sequence fragments produced by next-generation sequencing platforms is a challenge. What programs are available and how do they work?

    • Cole Trapnell & Steven L Salzberg
  • Primer |

    Only a subset of single-nucleotide polymorphisms (SNPs) can be genotyped in genome-wide association studies. Imputation methods can infer the alleles of 'hidden' variants and use those inferences to test the hidden variants for association.

    • Eran Halperin & Dietrich A Stephan
  • Primer |

    Only a subset of genetic variants can be examined in genome-wide surveys for genetic risk factors. How can a fixed set of markers account for the entire genome by acting as proxies for neighboring associations?

    • Eran Halperin & Dietrich A Stephan
  • Primer |

    How can genome browsers help researchers to infer biological knowledge from data that might be misleading?

    • Melissa S Cline & W James Kent
  • Primer |

    Decision trees have been applied to problems such as assigning protein function and predicting splice sites. How do these classifiers work, what types of problems can they solve and what are their advantages over alternatives?

    • Carl Kingsford & Steven L Salzberg
  • Primer |

    The expectation maximization algorithm arises in many computational biology applications that involve probabilistic models. What is it good for, and how does it work?

    • Chuong B Do & Serafim Batzoglou
  • Primer |

    Principal component analysis is often incorporated into genome-wide expression studies, but what is it and how can it be used to explore high-dimensional data?

    • Markus Ringnér
  • Primer |

    Artificial neural networks have been applied to problems ranging from speech recognition to prediction of protein secondary structure, classification of cancers and gene prediction. How do they work and what might they be good for?

    • Anders Krogh
  • Primer |

    Computational prediction of gene structure is crucial for interpreting genomic sequences. But how do the algorithms involved work and how accurate are they?

    • Michael R Brent
  • Primer |

    Instrumentation aside, algorithms for matching mass spectra to proteins are at the heart of shotgun proteomics. How do these algorithms work, what can we expect of them and why is it so difficult to find protein modifications?

    • Edward M Marcotte
  • Primer |

    Support vector machines (SVMs) are becoming popular in a wide variety of biological applications. But what exactly are SVMs and how do they work? And what are their most promising applications in the life sciences?

    • William S Noble
  • Primer |

    How can we computationally extract an unknown motif from a set of target sequences? What are the principles behind the major motif discovery algorithms? Which of these should we use, and how do we know we've found a 'real' motif?

    • Patrik D'haeseleer
  • Primer |

    Sequence motifs are becoming increasingly important in the analysis of gene regulation. How do we define sequence motifs, and why should we use sequence logos instead of consensus sequences to represent them? How do they relate to binding affinity? How do we search for new instances of a motif in this sea of DNA?

    • Patrik D'haeseleer
  • Primer |

    Bayesian networks are increasingly important for integrating biological data and for inferring cellular networks and pathways. What are Bayesian networks and how are they used for inference?

    • Chris J Needham, James R Bradford, Andrew J Bulpitt & David R Westhead
  • Primer |

    Clustering is often one of the first steps in gene expression analysis. How do clustering algorithms work, which ones should we use and what can we expect from them?

    • Patrik D'haeseleer
  • Primer |

    Programs such as MFOLD and ViennaRNA are widely used to predict RNA secondary structures. How do these algorithms work? Why can't they predict RNA pseudoknots? How accurate are they, and will they get better?

    • Sean R Eddy
  • Primer |

    Statistical models called hidden Markov models are a recurring theme in computational biology. What are hidden Markov models, and why are they so useful for so many different problems?

    • Sean R Eddy
  • Primer |

    There seem to be a lot of computational biology papers with 'Bayesian' in their titles these days. What's distinctive about 'Bayesian' methods?

    • Sean R Eddy
  • Primer |

    Sequence alignment methods often use something called a 'dynamic programming' algorithm. What is dynamic programming and how does it work?

    • Sean R Eddy