Article

European Journal of Human Genetics (2007) 15, 694–702. doi:10.1038/sj.ejhg.5201815; published online 21 March 2007

The use of grid computing to drive data-intensive genetic research

Jorge Andrade1, Malin Andersen1,2, Anna Sillén3, Caroline Graff3 and Jacob Odeberg1,2,4

  1. Department of Biotechnology, AlbaNova University Center, Royal Institute of Technology (KTH), Stockholm, Sweden
  2. Department of Medicine, Atherosclerosis Research Unit, Gustaf V Research Institute, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
  3. Karolinska Institutet Dainippon Sumitomo Pharma Alzheimer Center (KASPAC), Department of Neurobiology, Care Science and Society, Karolinska Institutet, Novum, Huddinge, Sweden
  4. Karolinska Biomics Centre, Karolinska University Hospital, Stockholm, Sweden

Correspondence: Dr J Odeberg, Department of Biotechnology, Royal Institute of Technology, AlbaNova University Center, SE-10691 Stockholm, Sweden. Tel: +46 85 537 8332; Fax: +46 85 537 8481; E-mail: Jacob.odeberg@biotech.kth.se

Received 4 October 2006; Revised 10 January 2007; Accepted 14 February 2007; Published online 21 March 2007.

Abstract

In genetics, as data sets grow and the algorithms for mining complex data become more advanced, a point is reached where increased computational capacity or alternative solutions become unavoidable. Most contemporary methods for linkage analysis are based on the Lander-Green hidden Markov model (HMM), whose computational cost scales exponentially with the number of pedigree members. In whole-genome linkage analysis, genotype simulations become prohibitively time-consuming to perform on single computers. We have developed 'Grid-Allegro', a Grid-aware implementation of the Allegro software, with which several thousand genotype simulations can be performed in parallel in a short time. Because the Allegro executable and datasets are installed temporarily on remote nodes at submission, the need for predefined Grid run-time environments is circumvented. We evaluated the performance, efficiency and scalability of this implementation in a genome scan of Swedish multiplex Alzheimer's disease families. We demonstrate that 'Grid-Allegro' allows full exploitation of the features available in Allegro for genome-wide linkage analysis. Implementing existing bioinformatics applications on Grids (distributed computing) represents a cost-effective alternative for addressing highly resource-demanding and data-intensive bioinformatics tasks, compared with acquiring and setting up clusters of computational hardware in house (parallel computing), a resource not available to most geneticists today.
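The idea of running many genotype simulations in parallel can be illustrated with a minimal sketch. This is not the Grid-Allegro implementation: the function names (`split_replicates`, `run_job`, `run_all`), the job dictionary layout and the seeding scheme are hypothetical, a local thread pool stands in for Grid nodes, and `run_job` returns placeholder random numbers where a real job would stage the Allegro executable and datasets on a remote node and run the simulations there.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def split_replicates(total, n_jobs, base_seed=0):
    """Divide `total` simulation replicates into `n_jobs` independent job specs.

    Each job gets its own seed so that replicates do not repeat across jobs.
    """
    base, extra = divmod(total, n_jobs)
    jobs = []
    for job_id in range(n_jobs):
        # The first `extra` jobs take one additional replicate each.
        count = base + (1 if job_id < extra else 0)
        jobs.append({"job_id": job_id, "replicates": count,
                     "seed": base_seed + job_id})
    return jobs


def run_job(job):
    """Placeholder for one Grid job (assumption: real work would call Allegro).

    Returns one pseudo-random number per replicate as a stand-in result.
    """
    rng = random.Random(job["seed"])
    return [rng.random() for _ in range(job["replicates"])]


def run_all(jobs, workers=4):
    """Run all jobs concurrently and merge their results in job order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(run_job, jobs))
    return [value for chunk in chunks for value in chunk]


if __name__ == "__main__":
    jobs = split_replicates(total=1000, n_jobs=7)
    results = run_all(jobs)
    print(len(jobs), "jobs,", len(results), "replicates")
```

Because the replicates are independent of one another, the work divides cleanly: each job needs only its own inputs and seed, and the results can be merged in any order once all jobs have finished.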

Keywords: grid, bioinformatics, genome-wide, linkage analysis, genotype simulation
