Volume 455 Issue 7209, 4 September 2008

In Nature this week, features and opinion pieces on one of the most daunting challenges facing modern science: how to cope with the flood of data now being generated. [Editorial/Introduction p. 1] A petabyte is a lot of memory, however you say it - a quadrillion, 10^15, or tens of thousands of trillions of bytes. But that is the currency of 'big data'. We visited the Sanger Institute's supercomputing centre, and its petabyte of capacity. [News Feature p. 16] Wikipedia's success shows how well the 'wiki' concept of open-access editing can work. It could work too as a way of coping with the data flows of modern biology. [News Feature p. 22] The world’s leading search engine is ten this month. Eleven years ago few would have predicted Google’s domination: undaunted we ask scientists and business people to try to predict the next big thing, a Google for the petabyte era. [Special Report p. 8] Digital data are easily shared, and just as easily wiped or lost. The problem of keeping on-line data accessible is especially difficult for the smaller lab. [Commentary p. 28] In Books & Arts, Felice Frankel and Rosalind Reid champion the cause of data visualization as a way of finding meaning in an otherwise daunting data stream. [Books & Arts p. 30] From the 1700s to the mid 1950s, most 'computers' were human. Best known were the 'Harvard computers', a group of women working from the 1880s until the 1940s, at the Harvard College Observatory. Employed to classify stars captured on millions of photographic plates, some of the ‘computers’ made significant contributions to science. [Essay p. 36] Online databases are a vital outlet for publishing the data being produced by biological research. But the data need to be properly organized. This is the role of the biocurator, but as a team of authors from 15 of the world’s major online research resources explains, biocuration is now sadly neglected. [Feature p. 47] An aspect of the data boom with a political dimension is the environment: how much data to collect, how much money to spend. [Party of One p. 15] For 'Big data' online, go to www.nature.com/news/specials/bigdata/ and to www.nature.com/podcast. [Cover art by Danny Allison]


    Researchers need to adapt their institutions and practices in response to torrents of new data — and need to complement smart science with smart searching.

    High-energy physicists should not gloss over fundamental conundrums.

    Scientific collaboration between East and West must survive the crisis in Georgia.

    Collecting and releasing environmental data have stirred up controversy in Washington, says David Goldston, and will continue to do so.

    • David Goldston

    What does it take to store bytes by the tens of thousands of trillions? Cory Doctorow meets the people and machines for which it's all in a day's work.

    • Cory Doctorow
    Pioneering biologists are trying to use wiki-type web pages to manage and interpret data, reports Mitch Waldrop. But will the wider research community go along with the experiment?

    • Mitch Waldrop



    Scientists need to ensure that their results will be managed for the long haul. Maintaining data takes big organization, says Clifford Lynch.

    • Clifford Lynch

    Buried in vast streams of data are clues to new science. But we may need to craft new lenses to see them, explain Felice Frankel and Rosalind Reid.

    • Felice Frankel & Rosalind Reid
    Will the possibilities for mass creativity on the Internet be realized or squandered, asks Tony Hey.

    • Tony Hey
    On the unveiling of the second phase of the Darwin Centre at London's Natural History Museum, Anna Maria Indrio, partner at the Scandinavian architectural firm C. F. Møller, explains how the new £78 million (US$145 million) wing will reveal 20 million of the museum's insect and plant specimens to the public when it opens in September 2009.

    • Joanne Baker
    The first English translation of Gottfried Leibniz's earth science treatise records the difficulties of understanding our planet before geologists appreciated deep time, Richard Fortey discovers.

    • Richard Fortey


    The first mass data crunchers were people, not machines. Sue Nelson looks at the discoveries and legacy of the remarkable women of Harvard's Observatory.

    • Sue Nelson

    Do black holes exist? Observations at the finest resolution so far indicate that only gross deviations in the behaviour of gravity from that predicted by general relativity can invalidate the case that they do.

    • Christopher S. Reynolds
    The oxysterol-dependent gene transcription factor LXRβ restricts premature expansion of T cells by limiting cellular cholesterol levels. This pathway might be a pharmacological target for regulating immune responses.

    • Christopher K. Glass & Kaoru Saijo
    Spectroscopic measurement of the energy absorbed or emitted by an object is an invaluable experimental technique. An innovative approach opens the door to the acquisition of previously inaccessible data.

    • Frank K. Wilhelm
    The magnetic resonance imagers used in medicine fill rooms with their large-field magnets. But developments in ultra-low-field devices may give the doctor of tomorrow a more portable version.

    • Klaas P. Pruessmann
    Individual microRNA sequences can suppress the production of hundreds of proteins. Reduction of protein levels in this way is often modest, however, and many such RNAs probably collectively fine-tune gene expression.

    • Zissimos Mourelatos


    To thrive, the field that links biologists and their data urgently needs structure, recognition and support.

    • Doug Howe, Maria Costanzo, Petra Fey, Takashi Gojobori, Linda Hannick, Winston Hide, David P. Hill, Renate Kania, Mary Schaeffer, Susan St Pierre, Simon Twigger, Owen White & Seung Yon Rhee


    Artificial atoms, quantum systems with an atom-like energy structure, have been studied with frequency spectroscopic techniques. However, much information about the energy level spectrum has remained hidden, as the technique is impractical at high frequencies. A complementary technique has been developed in which the energy level of an artificial atom is scanned by tuning not the frequency but the amplitude of the radiation, while the frequency is tuned to a specific feature in the spectrum.

    • David M. Berns, Mark S. Rudner, Sergio O. Valenzuela, Karl K. Berggren, William D. Oliver, Leonid S. Levitov & Terry P. Orlando
    In one of two studies, a technique known as SILAC is used to measure, on a large scale, changes in protein level as a function of expression of endogenous and exogenous miRNAs. It is found that although miRNAs directly repress the translation of hundreds of genes, additional indirect effects result in changes in expression of thousands of genes.

    • Matthias Selbach, Björn Schwanhäusser, Nadine Thierfelder, Zhuo Fang, Raya Khanin & Nikolaus Rajewsky
    In one of two studies, a technique known as SILAC is used to measure, on a large scale, changes in protein level as a function of expression of endogenous and exogenous miRNAs. It is found that although miRNAs directly repress the translation of hundreds of genes, additional indirect effects result in changes in expression of thousands of genes.

    • Daehyun Baek, Judit Villén, Chanseok Shin, Fernando D. Camargo, Steven P. Gygi & David P. Bartel
    This paper establishes a role for the extracellular matrix in regulating the BMP morphogen gradient responsible for dorsal–ventral patterning of vertebrate and invertebrate embryos. Type IV collagen binds to the Dpp ligand (the Drosophila form of BMP) and regulates its signalling in the Drosophila embryo and ovary by sequestering Dpp. Human type IV collagen binds the analogous protein in humans.

    • Xiaomeng Wang, Robin E. Harris, Laura J. Bayston & Hilary L. Ashe


    The cores of most large galaxies are thought to harbour supermassive black holes. Sagittarius A*, the compact source of radio, infrared and X-ray emission at the centre of the Milky Way, is the closest example of this phenomenon. This paper reports observations that set a limit on its size that is less than the expected apparent size of the event horizon of the presumed black hole, suggesting that the bulk of Sgr A* emission may not be centred on the black hole, but arises in the surrounding accretion flow.

    • Sheperd S. Doeleman, Jonathan Weintroub, Alan E. E. Rogers, Richard Plambeck, Robert Freund, Remo P. J. Tilanus, Per Friberg, Lucy M. Ziurys, James M. Moran, Brian Corey, Ken H. Young, Daniel L. Smythe, Michael Titus, Daniel P. Marrone, Roger J. Cappallo, Douglas C.-J. Bock, Geoffrey C. Bower, Richard Chamberlin, Gary R. Davis, Thomas P. Krichbaum, James Lamb, Holly Maness, Arthur E. Niell, Alan Roy, Peter Strittmatter, Daniel Werthimer, Alan R. Whitney & David Woody
    Angle-resolved photoemission spectroscopy (ARPES) of LaOFeP (Tc = 5.9 K) is reported. These results favour the itinerant ground state, albeit with band renormalization. In addition, the data reveal important differences between these and copper-based superconductors.

    • D. H. Lu, M. Yi, S.-K. Mo, A. S. Erickson, J. Analytis, J.-H. Chu, D. J. Singh, Z. Hussain, T. H. Geballe, I. R. Fisher & Z.-X. Shen
    Small water droplets within larger oil droplets, which are themselves distributed in an aqueous phase, are a type of double emulsion. This paper shows that double emulsions can be prepared and stabilized over several months using amphiphilic diblock copolypeptides, and that the same approach can even generate robust double nanoemulsions.

    • Jarrod A. Hanson, Connie B. Chang, Sara M. Graves, Zhibo Li, Thomas G. Mason & Timothy J. Deming
    Core-level photoelectron emission and intermolecular Coulombic decay are measured for an aqueous hydroxide solution. The results show that, in contrast to hydrated protons, hydrated hydroxide ions can transiently donate a hydrogen bond to surrounding water molecules. This capability can explain the unusual and fast transport of hydroxide ions in water.

    • Emad F. Aziz, Niklas Ottosson, Manfred Faubel, Ingolf V. Hertel & Bernd Winter
    Although cyclones in the tropical Atlantic seem to be getting stronger in response to increasing ocean temperatures, no clear trends of this sort have been discerned in other tropical regions. A new analysis of cyclone intensity using satellite data suggests that there is a global trend, but that it is quite subtle. The main changes appear not in an upward trend of average cyclone intensity, but rather in the maximum speeds attained by cyclones during their lifetimes: the stronger the cyclone, the greater the change.

    • James B. Elsner, James P. Kossin & Thomas H. Jagger
    Tiger moth multimodal warning signals vary according to the activity patterns of predators with divergent sensory capacities. It is suggested that selective pressures from multiple predator classes play distinct roles in the evolution of multimodal warning displays.

    • John M. Ratcliffe & Marie L. Nydam
    The satellite virus Sputnik is a parasite that infects the giant Mamavirus and replicates in the virus factory built by Mamavirus, interfering with Mamavirus reproduction.

    • Bernard La Scola, Christelle Desnues, Isabelle Pagnier, Catherine Robert, Lina Barrassi, Ghislain Fournous, Michèle Merchat, Marie Suzan-Monti, Patrick Forterre, Eugene Koonin & Didier Raoult
    Recent genomic efforts have demonstrated that large chunks of DNA differ between individuals in many species. Insertions or deletions of larger blocks cause single-nucleotide changes in their immediate vicinity, and population genetic models should take into account the 'mutator' effect of these insertions or deletions.

    • Dacheng Tian, Qiang Wang, Pengfei Zhang, Hitoshi Araki, Sihai Yang, Martin Kreitman, Thomas Nagylaki, Richard Hudson, Joy Bergelson & Jian-Qun Chen
    This paper investigates the structure of the HIV glycoprotein gp120 by cryo-electron tomography and molecular modelling. gp120 is analysed in an unliganded state, complexed with a neutralizing antibody and in a CD4 liganded state. The analysis provides insight into the conformational changes that occur with ligand binding.

    • Jun Liu, Alberto Bartesaghi, Mario J. Borgnia, Guillermo Sapiro & Sriram Subramaniam
    A study reveals that overexpression of a single target of neurogenin 2, Rnd2, can restore the neuronal migration defects of neurogenin 2-depleted neurons. Rnd2 is thus an atypical member of the Rho family of small GTPases, which regulate actin cytoskeleton dynamics, with its activity regulated at the gene transcription level, rather than by the usual post-translational GTP/GDP cycle.

    • Julian Ik-Tsen Heng, Laurent Nguyen, Diogo S. Castro, Céline Zimmer, Hendrik Wildner, Olivier Armant, Dorota Skowronska-Krawczyk, Francesco Bedogni, Jean-Marc Matter, Robert Hevner & François Guillemot
    Polo-like kinase-1 (PLK1) is an essential mitotic kinase regulating multiple aspects of the cell division process. Activation of PLK1 is shown to occur before mitosis and to depend on phosphorylation by aurora-A kinase, facilitated by a cofactor Bora. The initial activation of PLK1 seems to be a primary function of aurora-A.

    • Libor Macůrek, Arne Lindqvist, Dan Lim, Michael A. Lampson, Rob Klompmaker, Raimundo Freire, Christophe Clouin, Stephen S. Taylor, Michael B. Yaffe & René H. Medema
    Epac proteins are activated by binding of cyclic AMP (cAMP) and act as guanine nucleotide exchange factors for Rap GTPases. The structure of Epac2 in complex with cAMP and Rap1B is determined and comparison of this activated state of the complex with the inactive one reveals the conformational changes in Epac2 induced by cAMP binding.

    • Holger Rehmann, Ernesto Arias-Palomo, Michael A. Hadders, Frank Schwede, Oscar Llorca & Johannes L. Bos
    Myosin Va is a two-headed molecular motor that transports cargo inside cells by moving along actin filaments. The trailing head detaches and swings 72 nm forward to bind to a new leading position. During the processive movement at least one of the heads remains bound to actin. This report visualizes the movement of a fluorescently-labelled myosin Va molecule while simultaneously observing the binding and dissociation of a fluorescent ATP analogue. This is the first direct demonstration of nucleotide binding to and movement of myosin V motors during stepping.

    • Takeshi Sakamoto, Martin R. Webb, Eva Forgacs, Howard D. White & James R. Sellers



    Brazil strives to remain among the world's biofuels leaders.

    • Virginia Gewin

    A trip to the hospital made me forget my research - and then realize its limitations.

    • Zachary Lippman


    Testing, testing ...

    • David Langford


