To the Editor:

Genome-wide association studies (GWAS) of complex human traits have become an important approach in human genetics. Taken together, GWAS are arguably the largest biological investigations of humans ever conducted. The total number of people genotyped to date with a GWAS array is difficult to know but probably exceeds 1,000,000. Major findings from these studies are that many common diseases have a polygenic architecture, that the effect sizes of common SNP variants are small, that genes and biological processes not previously suspected are involved, and that some loci are associated with different diseases1,2. Critically, the sample sizes necessary to identify robust and replicable findings are beyond those achievable by single groups, and collaborations have rapidly evolved to augment statistical power.

We sought to describe the collaborative networks that emerged as part of GWAS. We used the National Human Genome Research Institute (NHGRI) GWAS catalog1 and PubMed to identify the authors of 604 GWAS published from the first report in 2005 up to the last complete year, 2010 (Supplementary Methods). These 604 GWAS papers had a total of 21,007 authorships (8,718 individuals; Supplementary Fig. 1).

We constructed network diagrams in the form of graphs, where nodes are authors and edges connect coauthors on a GWAS paper (Supplementary Figs. 2 and 3). Overall, there was a more than tenfold increase in the number of coauthorships from 2005 to 2010 (Supplementary Table 1).
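
The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the `papers` mapping, the author names and the function name `build_coauthor_graph` are hypothetical, and weighting each edge by the number of shared papers is an assumption.

```python
from collections import Counter
from itertools import combinations


def build_coauthor_graph(papers):
    """Return (authorships, individuals, edges) for a dict of paper -> author list.

    edges maps an unordered author pair (stored as a sorted tuple) to the
    number of papers the two authors share.
    """
    authorships = 0
    individuals = set()
    edges = Counter()
    for authors in papers.values():
        unique = sorted(set(authors))  # guard against a name listed twice on one paper
        authorships += len(unique)
        individuals.update(unique)
        # every pair of coauthors on the same paper is connected by an edge
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return authorships, individuals, edges


# Hypothetical toy input: three papers, four authors.
papers = {
    "paper1": ["Author A", "Author B", "Author C"],
    "paper2": ["Author A", "Author B"],
    "paper3": ["Author C", "Author D"],
}
authorships, individuals, edges = build_coauthor_graph(papers)
```

In this sketch an authorship is one author on one paper, so a person who appears on several papers contributes several authorships but only one individual.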

We created a network graph using the open-source network visualization platform Gephi (Fig. 1). The graph shows modularity at several levels: there are 14 empirical coauthorship modules (groups of nodes of the same color), several modules show substructure (clusters within a module), and there are often abundant connections between modules. This graph is annotated more fully in Supplementary Figure 4 and Supplementary Table 2 and is coherent in the identification of individuals, laboratories and phenotypes studied.
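
One common way to quantify how well a division into modules fits a network is the Newman modularity score Q, which compares the fraction of edges inside modules with the fraction expected if edges were placed at random given node degrees. The sketch below only scores a given partition of an unweighted graph; it does not find modules, and the text does not state which algorithm defined the 14 modules. The function name `modularity` and the toy data are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph without self-loops.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a module label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both endpoints in the module
    degree_sum = defaultdict(int)  # summed degree of the module's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over modules of (internal edge fraction - expected fraction)
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
toy_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(toy_edges, toy_modules)
```

Placing every node in a single module gives Q = 0, so positive values indicate more within-module edges than expected by chance.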

Figure 1: Coauthorship network graph for GWAS published 2005–2010.

Authors are nodes and coauthorships on papers are edges. Node size is proportional to the number of edges, and the distance between nodes decreases with the number of collaborations. Different community modules are colored and annotated in Supplementary Figure 4 and Supplementary Table 2.

Small-scale collaborations between laboratories are common in human genetics. There are also multiple examples of large-scale, 'big science' collaborations (for example, the Human Genome Project, HapMap, the 1000 Genomes Project and ENCODE; Table 1). Most large collaborations developed in a federated manner with a formal, top-down structure. GWAS collaborations are unusual in the history of biomedicine, in that large and extensive collaborations self-organized and emerged rapidly from grassroots origins.

Table 1 Resources cited

Author contributions

B.K.B.-S. provided technical knowledge of graph theory and wrote the paper. P.F.S. designed the project, conducted the analyses, and wrote the paper.