Nature 453, 698 (5 June 2008) | doi:10.1038/453698a; Published online 4 June 2008

A flood of hard data


Social scientists have a new handle on group behaviour — but its causes remain a challenge.

Every human being makes choices, takes action and is affected by the environment in a way that seems utterly idiosyncratic. Yet in the aggregate, as the British philosopher John Stuart Mill put it a century and a half ago, human events "most capricious and uncertain" can take on "a degree of regularity approaching to mathematical". A case in point is the analysis of mobile-phone data discussed by González et al. on page 779 of this issue. It reveals just such a mathematical regularity in the seemingly unpredictable way people move around during their daily lives (see also page 714).

As remarkable as this result is — and it is still not completely understood — the research is just as notable for its methodology. Social scientists have long struggled with a paucity of hard data about human activities; people's self-reporting about their social interactions, say, or their movement patterns is labour-intensive to collect and notoriously unreliable. In this case, the researchers obtained objective data on individuals' movements from mobile-phone networks (albeit without access to any individual's identity, for privacy reasons). This gave them a data set of proportions almost unheard of for such a complex aspect of behaviour: more than 16 million 'hops' for 100,000 people. The resulting statistics show a strikingly small scatter, giving grounds for confidence in the mathematical laws they disclose.

The mobile-phone technique is simply the latest example of how modern information technologies are giving social scientists the power to make measurements that are often as precise as those in the 'hard' sciences. By analysing e-mail transmissions, for example, or doing automated searches of publication databases, social scientists can collect detailed information on the network structure of scientific collaborations and other social interactions. And by allowing their subjects to interact online, researchers can do large-scale studies of, for example, the role of social interactions in opinion formation, complete with control groups and tuneable parameters.

It's not an overstatement to say that these tools are fostering a whole new type of social science — with applications that go well beyond the conventional boundaries of the field. There is sure to be commercial interest in the detailed patterns of usage for portable electronics, for example, and the nature of mass human movement could inform urban planning and the development of transportation networks. Epidemiologists, meanwhile, will no longer be forced to work with highly oversimplified models of infection rates and disease spread: recent work has clarified how the transmission of disease depends on the precise structural details of the network of person-to-person contacts.

For all the promise of these new data sets, making sense of them requires a rather different set of statistical skills from those needed in conventional social science, which may be one reason why studies such as that by González et al. are so often conducted by researchers trained in the physical sciences. To some extent this 'physicalization' of the social sciences is healthy for the field; it has already brought in many new ideas and perspectives. But it also needs to be regarded with some caution.

As many social scientists have pointed out, the goal of their discipline is not simply to understand how people behave in large groups, but to understand what motivates individuals to behave the way they do. The field cannot lose focus on that — even as it moves to exploit the power of these new technological tools, and the mathematical regularities they reveal. Comprehending capricious and uncertain human events at every level remains one of the most challenging questions in science.