A fundamental theme across the study of complex systems1,2,3 (from ecosystems4,5 to human behavior1,7 and socioeconomic organization3,8,9) deals with the mechanisms by which diversity arises and is sustained.

Diversity generally means that the component agents of a large system are heterogeneous, so that they exhibit different traits and perform different functions. For all its ubiquity and importance, the functional role of heterogeneity in complex systems remains poorly understood. General lines of argument for explaining diversity can be roughly divided into two classes: those that emphasize its protective role in uncertain environments and those that invoke direct gains in productivity as a result of functional heterogeneity5,6.

On the one hand, heterogeneity may endow collectives with resilience to selective shocks through a hedging effect that allows some components of the system to survive in times of crisis10. For example, cities that have remained very specialized (in manufacturing) have historically fared poorly over the long run. This effect already speaks to the importance of heterogeneity in enabling differential selection and therefore speeding up “evolution”11,12, whether in ecosystems or in markets. Selection, in this picture, remains a function of the external environment and confers no immediate incentive on particular individuals to differentiate: heterogeneity develops because its effects, in retrospect, are necessary for the group's survival.

On the other hand, a more direct and potentially more fundamental link between diversity, productivity and scale may exist. This connection has often been invoked, both in ecosystems and in economics, to explain why larger systems tend to be at once more diverse and more productive3,4,5,8,9,10,13. Nevertheless, the formalization of these ideas remains incomplete and does not yet explain quantitative patterns in observed levels of diversity3,14. The key concept is that diversity is both a consequence and an enabler of scale-dependent system organization and that it is directly associated with higher individual productivity. As a consequence, higher heterogeneity can be positively selected endogenously by agents and organizations that benefit directly from such strategies. Here, we show how diversity, productivity and social network structure can be combined at the microscopic level to generate an open-ended dynamics of growth, differentiation and complexification in cities.

In contemporary human societies socioeconomic diversity is associated primarily with large cities7,15. This reflects the primary role of urban environments in promoting the production of new ideas and stimulating socioeconomic development3,15,16. However, counter-arguments have also emerged noting that specialized cities are, at specific times, more productive16,17,18,19. Familiar examples are contemporary Silicon Valley or manufacturing cities in their heyday such as Manchester or Detroit. These questions remain far from settled, in part because of the difficulties inherent to measuring diversity in any complex system2,20.

Empirically, diversity has remained difficult to characterize unambiguously. Most measures account for the presence, and sometimes the relative proportion21,22,23, of different functional types: for example, different professions or business types in cities or nations, or different species in an ecosystem24. Such measures of diversity are inevitably linked to particular classification schemes or taxonomies. To appreciate this point, consider the question: how many different professions are there in a large city like New York? In general, there is no objective answer, as it depends on how finely one differentiates similar functions. However, under general conditions, a limit of infinite taxonomic resolution can be inferred through an analysis analogous to that of statistical physical quantities near phase transitions (ref. 25): if such a limit exists, answers independent of the taxonomic scheme emerge.

Here we show how addressing these two issues at once (measuring functional diversity and linking it to productivity by grounding it in scale-dependent urban social structure) can help shed light on the dynamics and structure of cities. As a result we are able to develop a predictive theoretical framework for the emergence of functional heterogeneity as a consequence of a process of division and coordination of functions in social networks, stimulated by productivity gains and made possible by increases in population, that is, in system size.


We start by showing that the simplest measure of functional diversity in cities, the count of the number of distinct professions, is classification scheme dependent. We then show how a scheme-invariant measure can be obtained in the limit of infinite resolution via well-known methods of finite size scaling in statistical physics. This allows us to derive the general form of the distribution of professions in US urban areas, which we show applies to cities of all sizes and over time. We then develop a theoretical framework that derives the observed patterns of professional diversity from the maximization of economic productivity under social network constraints reflecting coordination costs at the metropolitan level.

Classification resolution and the number of distinct professions

The simplest measure of diversity, D(N), counts the number of distinct professions in a city. Fig. 1A shows D(N) for US metropolitan statistical areas (MSAs) as a function of their total employment, Ne. Because Ne is, on average, proportional to population, N (ref. 26), we use the two measures of scale interchangeably. We observe that D increases with Ne initially, then saturates for large cities, and is well fit by

$$D(N_e) = d_0\,\frac{(N_e/N_0)^{\gamma}}{1 + (N_e/N_0)^{\gamma}}. \qquad (1)$$

Eq. 1 holds over time (Fig. S1) and for different levels of resolution, r, in the hierarchical classification scheme for occupations (see Supplementary Information). The parameters in Eq. 1 are, in general, functions of r. The scale d0(r) is the effective size of the classification scheme at resolution r, and N0(r) is the characteristic size of the city at which saturation starts. The exponent (elasticity), γ, is empirically independent of r: it expresses the proportionality between the population growth rate and the growth rate of new occupations in the city, in the absence of saturation (see Supplementary Information).
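
The saturating behaviour described for Eq. 1 can be illustrated numerically. The sketch below assumes the functional form D = d0 x / (1 + x) with x = (Ne/N0)^γ, chosen because it reproduces the limits stated in the text (power-law growth with exponent γ for Ne ≪ N0 and saturation at d0 for Ne ≫ N0). The default parameter values are the r = 6 estimates quoted in the caption of Fig. 1; the function name is hypothetical.

```python
def distinct_occupations(n_e, d0=686.0, gamma=0.84, n0=1.48e5):
    """Expected number of distinct occupations in a city with employment n_e.

    Assumed saturating form: D = d0 * x / (1 + x), with x = (n_e / n0) ** gamma.
    """
    x = (n_e / n0) ** gamma
    return d0 * x / (1.0 + x)


if __name__ == "__main__":
    # Small, characteristic-size and very large cities.
    for n_e in (1e4, 1.48e5, 1e6, 1e7):
        print(f"Ne = {n_e:>12,.0f}  D = {distinct_occupations(n_e):6.1f}")
```

Under this form a city of size N0 sits exactly halfway to saturation (D = d0/2), which is one way to read N0 as the size at which saturation starts.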

Figure 1

The number of distinct occupations in US Metropolitan Statistical Areas vs. total employment.

(A) The relationship between the number of professions present in each city (orange dots) and city size is well described by Eq. 1, with d0 = 686, γ = 0.84, $N_0 = 1.48 \times 10^5$ (blue line). (B) D(Ne) at different levels of resolution of the occupational classification scheme, ri, with i = 6 the finest and i = 3 the coarsest. (C) d0 is proportional to $N_0^{\gamma}$ across levels of classification scheme resolution and time, suggesting that there is an r-independent limit to the form of the occupational diversity of cities and that D is open-ended. In this limit, $D(N) = D_0 N^{\gamma}$ and larger cities are always more diverse as a whole, but more specialized per capita.

Coarsening the hierarchical classification leads to similar saturation at each taxonomy's size, d0(r6) > d0(r5) > d0(r4), etc. (Fig. 1B). This behavior is the hallmark of a finite-resolution artifact, a phenomenon well understood through finite-size scaling analysis in statistical physics (ref. 25). The explicit dependence of d0 on r suggests that existing classification schemes are too coarse to capture the true professional diversity of large US cities27, beyond an employment size of order N0(r). Nevertheless, we can use the variation of the statistics of occupations with r to derive results that are independent of the classification scheme. We reconcile all curves for D(N) at different r and extract their limit as r → ∞. We define a dimensionless function h such that

$$D(N, r) = D_0\,N^{\gamma}\,h\!\left(\frac{N}{N_0(r)}\right),$$

where D0 is a constant. Comparison with Eq. 1 tells us that in the limit $N/N_0 \to 0$, h → 1, and in the limit $N/N_0 \to \infty$, $h \to (N/N_0)^{-\gamma}$. A universal scaling regime exists if and only if the quantity $d_0(r)/N_0(r)^{\gamma}$ becomes a constant, independent of r, as r → ∞ (Fig. 1B). Fig. 1C shows d0 vs. $N_0^{\gamma}$ across r and over time. The relationship is well described by a straight line with slope D0 = 0.05 across all years.

These results suggest the existence of a resolution-independent, scale-invariant limit for D(N) and show that the occupational diversity of cities is in fact open-ended: the number of distinct occupations in US cities grows with elasticity γ ≈ 0.84 relative to the labor force (a factor of $2^{\gamma} \approx 1.79$ for each doubling), meaning that larger cities are at once more diverse in absolute terms and more specialized per capita. These insights can be proven as a simple theorem; see SI Text.
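
As a quick check of what an elasticity of γ = 0.84 implies, the sketch below evaluates the scale-free limit D = D0 N^γ for a change in city size. It assumes only that power law; the function names are hypothetical.

```python
GAMMA = 0.84  # elasticity quoted in the text


def diversity_ratio(size_ratio, gamma=GAMMA):
    """Factor by which D = D0 * N**gamma changes when N is multiplied by size_ratio."""
    return size_ratio ** gamma


def per_capita_diversity_ratio(size_ratio, gamma=GAMMA):
    """Factor by which d = D / N = D0 * N**(gamma - 1) changes."""
    return size_ratio ** (gamma - 1.0)


if __name__ == "__main__":
    print(f"doubling N multiplies D by {diversity_ratio(2.0):.3f}")
    print(f"doubling N multiplies D/N by {per_capita_diversity_ratio(2.0):.3f}")
```

Doubling the labor force therefore raises the number of occupations by somewhat less than a factor of two, while the number of occupations per worker falls.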

Universal Distribution of Urban Professions

Beyond considering the presence or absence of professions, which gives only a crude measure of urban diversity, we can characterize their probability distribution. To derive this distribution we identify D(N) in Eq. 1 with the maximum rank at each N, which has probability $1/N_e$ (a single worker). Inverting this relation and generalizing it to all ranks, i, leads to the occupational frequency, f(i):

$$f(i) = \frac{N_e}{N_0}\left(\frac{d_0 - i}{i}\right)^{1/\gamma}. \qquad (2)$$

This is also scheme independent in the large resolution limit, where $f(i) \to N_e\,(D_0/i)^{1/\gamma}$, and can be used to derive the probability density, p(i), normalized over ranks 1 ≤ i ≤ D(N), as

$$p(i) = \frac{1-\gamma}{\gamma}\,\frac{i^{-1/\gamma}}{1 - D(N)^{-(1-\gamma)/\gamma}}, \qquad (3)$$

which is also independent of r. The occupational probability has a residual dependence on N through D(N) because the rarest professions in each city cannot have fewer than one person. Figs. 2A, B show that the distribution of occupations for different cities is universal: when adjusted for scale, Ne, the frequency curves for every city collapse onto a single line (Fig. 2B). This shows that there is a nested sequence of occupations predicted by city size, as expected from the hierarchy principle of central place theory28 and its extensions in regional economics23,29. This sequence of professions arising with growing city size is also analogous to the hierarchy of products vs. level of economic development observed at the national level8,9 and shows how large cities play a fundamental role in embodying the economic diversity of nations.
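
The inversion argument behind the rank-frequency curve can be checked numerically. The sketch below assumes the saturating form D = d0 x / (1 + x), x = (Ne/N0)^γ, for Eq. 1 and the frequency f(i) = (Ne/N0) ((d0 − i)/i)^(1/γ) that follows from inverting it; both forms and all function names are assumptions made for illustration, with parameters taken from the caption of Fig. 1.

```python
def max_rank(n_e, d0=686.0, gamma=0.84, n0=1.48e5):
    """Number of distinct occupations D(n_e), i.e. the largest occupied rank."""
    x = (n_e / n0) ** gamma
    return d0 * x / (1.0 + x)


def occupation_frequency(rank, n_e, d0=686.0, gamma=0.84, n0=1.48e5):
    """Expected head-count of the occupation at a given rank (0 < rank < d0)."""
    return (n_e / n0) * ((d0 - rank) / rank) ** (1.0 / gamma)


def collapsed_frequency(rank, n_e, **params):
    """Frequency divided by city size; under the assumed form it no longer depends on n_e."""
    return occupation_frequency(rank, n_e, **params) / n_e


if __name__ == "__main__":
    for n_e in (1e5, 1e6, 1e7):
        last = max_rank(n_e)
        print(n_e, last, occupation_frequency(last, n_e))
```

By construction the occupation at rank D(Ne) has an expected head-count of one worker, and dividing by Ne removes the city-size dependence, which is the collapse shown in Fig. 2B.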

Figure 2

The distribution of occupations in US metropolitan areas is universal.

(A) Frequency distributions for several cities with different population sizes differ only in their amplitude, which is set by city size and the extent to which they probe rare occupations. The horizontal grey line shows the minimum number of professions (thirty) reported. (B) The rank-probability distributions for different cities collapse onto each other when adjusted for city size (total employment). The yellow line shows the fit of the universal form, Eq. 2, modified by a scale i0 at small ranks (see Fit methodology). The black line is the form of f(i)/Ne in the absence of saturation.

Indices of diversity

The diversity associated with the distribution of types is commonly measured by functionals of the probability distribution, p(i). Examples are the Herfindahl–Hirschman Index (HH) and the Shannon entropy (S)21,22,24. Here we show that the residual dependence of p(i) on N, through D(N) in Eq. 3, is the only source of city-size dependence of these measures of occupational diversity.

The Herfindahl–Hirschman index measures how concentrated a distribution is. For this reason it is often applied to economic sectors to measure their concentration in terms of firms' market shares and to assess the existence of monopolies. Given the asymptotic form of the distribution (Eq. 3), HH(N) can be calculated analytically as

$$HH(N) = \frac{\delta^{2}}{1-\delta^{2}}\;\frac{1 - D(N)^{-(1+\delta)/(1-\delta)}}{\left[1 - D(N)^{-\delta/(1-\delta)}\right]^{2}},$$

where the exponent δ = 1 − γ. Consequently, the HH index decreases with N towards a small constant, set by the exponent δ, as cities grow. This expresses an increase in diversity with city size. Note that the asymptotic value for N → ∞, with δ ≃ 1/6 (ref. 30), is HH → 0.028, which is typical of highly diverse (and competitive) markets.
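
The quoted asymptotic value can be reproduced with a short calculation. The sketch below assumes a continuum approximation in which p(i) ∝ i^(−1/γ) on ranks 1 ≤ i ≤ D, so that the sum of p(i)² becomes an integral with a closed form; this closed form is a reconstruction under that assumption, and the function names are hypothetical.

```python
def herfindahl(d_max, delta=1.0 / 6.0):
    """Continuum HH index for p(i) ~ i**(-1/gamma) on 1 <= i <= d_max, gamma = 1 - delta."""
    a = (1.0 + delta) / (1.0 - delta)  # exponent from integrating p(i)**2
    b = delta / (1.0 - delta)          # exponent from the normalization of p(i)
    prefactor = delta ** 2 / (1.0 - delta ** 2)
    return prefactor * (1.0 - d_max ** (-a)) / (1.0 - d_max ** (-b)) ** 2


def herfindahl_limit(delta=1.0 / 6.0):
    """Value approached as the number of occupations grows without bound."""
    return delta ** 2 / (1.0 - delta ** 2)


if __name__ == "__main__":
    for d_max in (1e2, 1e3, 1e6):
        print(f"D = {d_max:.0e}  HH = {herfindahl(d_max):.4f}")
    print(f"limit: {herfindahl_limit():.4f}")
```

With δ = 1/6 the limit is 1/35, consistent with the value HH → 0.028 quoted above, and the index falls towards it as D grows.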

Similarly, the Shannon entropy, S, measures the diversity of the occupational distribution as

$$S(N) = -\sum_{i=1}^{D(N)} p(i)\,\ln p(i),$$

which increases with N towards the Pareto distribution limit at infinite N. Thus, the increase in entropy signals the increase in diversity of the occupational distribution in larger cities. Note that in both cases the increases in diversity are driven, at leading order, by a term of order $N^{-\delta}$ (Fig. S3). As we shall see below, this term measures professional diversity per capita and is also implicated in the growth of labor productivity and social connectivity with city size.
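
The growth of entropy with the number of occupied ranks can be verified directly from the definition. The sketch below assumes a discrete power law p(i) ∝ i^(−1/γ) truncated at rank D, normalized by direct summation rather than by the continuum approximation; the function name is hypothetical.

```python
import math


def shannon_entropy(d_max, gamma=0.84):
    """Shannon entropy (in nats) of p(i) ~ i**(-1/gamma) on ranks 1..d_max."""
    weights = [rank ** (-1.0 / gamma) for rank in range(1, d_max + 1)]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights)


if __name__ == "__main__":
    for d_max in (10, 100, 686):
        print(f"D = {d_max:4d}  S = {shannon_entropy(d_max):.3f}")
```

The entropy rises as more ranks become occupied but stays below ln D, the value for a uniform distribution over the same ranks.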

Diversity, Productivity and Urban Social Networks

We now address the problem of how professional diversity, economic productivity and urban social networks are interconnected.

Given the empirical results so far, we should expect economic productivity per capita to be inversely proportional to professional diversity per capita. To see this more explicitly, consider that urban indicators of economic productivity (wages, GDP, etc.) scale, on average, superlinearly with city size N, $W(N, t) = W_0(t)\,N(t)^{\beta}$, with W0(t) and β independent of N (ref. 26); see Fig. 3A. The average wage per capita is then $w(N) = W_0 N^{\delta}$, where δ = β − 1 ≃ 1/6 (ref. 30). This result has been derived from a general theoretical framework that defines cities as co-located social networks, subject to infrastructural efficiency constraints30, with w = G k(N), where G is a constant in N, involving a balance between mobility costs and infrastructural properties, and $k(N) = k_0 N^{\delta}$ is the average social connectivity (network degree) per person. (This increase in social connectivity with city size has been observed directly in urban telecommunication networks31.) Similarly, diversity per capita is $d(N) = D(N)/N = D_0 N^{\gamma-1} = D_0 N^{-\delta}$. Hence, we conclude that w ∝ 1/d. This relationship is an expression of the abundant evidence in economics for knowledge specialization (a decrease in d(N)) as the source of increases in (labor) productivity19,32,33,34. However, this view of productivity cannot, per se, be the target of optimization: no city has become rich by reducing its occupational diversity to a single activity! What then is the optimal level of diversity and productivity attainable at the city level?
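
The step from the two scaling relations to the inverse proportionality between w and d can be written out explicitly. The lines below are a sketch that uses only the power laws quoted in this paragraph; the identification W0 = G k0 is an assumption that follows from combining w = G k(N) with the power law for k(N).

```latex
\begin{align}
w(N)\,d(N) &= \left(W_0 N^{\delta}\right)\left(D_0 N^{-\delta}\right) = W_0 D_0, \\
k(N)\,d(N) &= \left(k_0 N^{\delta}\right)\left(D_0 N^{-\delta}\right) = k_0 D_0, \\
\Rightarrow\quad w(N) &= \frac{W_0 D_0}{d(N)} = \frac{G\,k_0 D_0}{d(N)} .
\end{align}
```

Both products are independent of N, so per capita wages rise exactly as fast as per capita diversity falls.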

Figure 3

The scaling of economic productivity with city size and the generation of professional diversity.

(A) Superlinear scaling of total wages with population in US metropolitan areas ($W(N) = W_0 N^{1+\delta}$, with δ = 0.18 ± 0.03, in agreement with the theoretical expectation of δ = 1/6, ref. 30). Inset shows the product A = wd versus city size, demonstrating that A is on average independent of city size ($R^2 = 0.91$). (B) The process of generation of new occupations as a function of city size: as the self-similar frequency distribution of occupations is pushed up by city size, latent occupations cross the lowest probability threshold and appear explicitly in D. The schematic shows how such a new explicit function (red circle) allows other occupations (yellow and green circles) to specialize further (acquire brighter colors), while remaining available to each other through network ties.

Before we answer this question we note that the connection between the microscopic features of socioeconomic urban organization and the aggregate productivity of cities has remained a fundamental but difficult question in urban economics. A number of detailed socioeconomic mechanisms have been proposed to explain increases in productivity with city size16, all of which depend on the social coordination opportunities that larger cities enable36. A different, if not orthogonal, set of arguments seeks to explain the greater productivity of larger cities from the empirical observation of a disproportionate presence of economic sectors with greater value-added or of greater human capital35. Despite some important theoretical attempts33,34, what is still missing from these two lines of argument is an empirically based predictive formalization of urban diversity that ties together learning, via the specialization of knowledge (accumulation of human capital), to the coordination costs inherent to the division of labor in cities.

We now propose such a formalization by showing how knowledge specialization is set by the extent and costs of human social networks in cities. We implement these ideas in terms of the structural change of urban social networks with city size by assuming a process whereby individuals start as generalists in small social environments and are able to specialize in larger cities by externalizing some of their functions to others, while maintaining access to them through direct social links. Thus, we assume that only close coordination (at least in the initial stages of this transfer of functions) can keep these specialized functions suitably integrated and, as such, that the number of functions within the immediate network contacts (first neighbors) of each individual must be conserved (Fig. 3B). This means that as an individual specializes by performing fewer functions, she maintains access to such functions via socioeconomic contacts, so that the rate of individual specialization with city size must equal in magnitude the rate of diversification of functions across one's social ties; see below. Note that this process provides a (reversible) mechanism for the creation of human capital, through the gradual division of labor in larger social networks, and is fundamentally different from arguments of comparative advantage, where differential endowments of distinct individuals or organizations are taken as given.

We formalize these ideas in terms of the condition that, on average, k(N) d(N) = A, where A is a constant in N that may vary over time, e.g. due to changes in communication and transportation technologies. This condition is equivalent to the conservation of the total rate of change of k plus d with N: Δln k + Δln d = 0, where Δ denotes a population size variation. This conservation also enables each individual's capacity to innovate through processes of functional recombination resulting from re-wirings of ties in social networks. The main difference between this approach and that of Ref. 30 is that we start from the more fundamental assumption that economic productivity is proportional to (labor) specialization, that is w ~ 1/d, and not just to social connectivity, k, and that it is subject to network constraints. To preserve physical dimensions we write w = g(kd)/d, where g is a function that transforms the index of specialization 1/d into units of economic productivity (money/person/unit time). Its dependence on kd is necessary for consistency with observations.

With these definitions, we can formulate the problem of determining the optimal professional diversity of a city of size N in terms of the d that maximizes (labor) economic productivity, w, subject to the constraint that activities lost to an individual remain available through her neighbors in their social network. Note that because d = A/k, by assumption, this is equivalent to finding the value of A and the function g(A) that maximize productivity. This can be written in terms of standard constrained optimization of a target function, $\mathcal{L}$, as

$$\mathcal{L}(d, \lambda) = \frac{g(kd)}{d} - \lambda\,(kd - A),$$

where λ is a Lagrange multiplier. To proceed, we take k(N) as given and show how social connectivity and professional diversity must be related in order to maximize labor productivity. The solution follows by setting the variations of $\mathcal{L}$ relative to d and λ to zero:

$$kd = A, \qquad g(A) = A\left[C + \int^{A} \lambda_1(A')\,dA'\right],$$

where C is a constant of integration; λ1 ≡ λ/k and is a function of A. Both C and λ1 are to be set by boundary conditions on g(A), the baseline for labor productivity across cities at a given time. From measurement of g(A), we then obtain A and the relationship between productivity, diversity and connectivity, since A = D0k0.
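
The intermediate steps of this variation can be sketched as follows. The form of the target function and the sign convention for the multiplier are assumptions (a standard Lagrangian for maximizing w = g(kd)/d subject to kd = A); they are chosen because they reproduce the combination λ1 ≡ λ/k named in the text.

```latex
\begin{align}
\mathcal{L}(d,\lambda) &= \frac{g(kd)}{d} - \lambda\,(kd - A), \\
\frac{\partial \mathcal{L}}{\partial \lambda} = 0 &\;\Rightarrow\; kd = A, \\
\frac{\partial \mathcal{L}}{\partial d} = 0 &\;\Rightarrow\;
  \frac{k\,g'(kd)}{d} - \frac{g(kd)}{d^{2}} - \lambda k = 0 .
\end{align}
% Substituting d = A/k and dividing by k^2:
\begin{align}
\frac{g'(A)}{A} - \frac{g(A)}{A^{2}} = \frac{\lambda}{k} \equiv \lambda_1
\;\Rightarrow\;
\frac{d}{dA}\!\left[\frac{g(A)}{A}\right] = \lambda_1(A)
\;\Rightarrow\;
g(A) = A\left[C + \int^{A}\lambda_1(A')\,dA'\right].
\end{align}
```

With the opposite sign convention for λ, the integral term changes sign and the argument is otherwise unchanged.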

These results derive the aggregate patterns of urban occupational diversity for all city sizes from microscopic considerations of the division and coordination of labor in cities and show the sense in which the patterns of diversity observed empirically above are optimal through their relation to individual productivity under network constraints.


We showed that US cities express a general statistical pattern of professions, where greater individual specialization coexists with increases in overall diversity as cities become larger. We demonstrated that these patterns of diversity are optimal, in the sense of providing the basis for open-ended increases in economic productivity per capita with city size while preserving the overall functionality available to each individual. This dynamics of differentiation and coordination at the individual level promotes increases in the quality of these functions available in larger cities and thus in urban systems by encouraging learning and the accumulation of human capital, enabled by the scaling properties of urban social networks. It also provides a general microscopic theory, based on fundamental principles of economic organization32,33, that derives observed increasing returns to scale in economic productivity at the metropolitan level consistent with data for thousands of cities worldwide30.

These ideas for the origin of functional diversity unify important concepts about the role of cities as generators of heterogeneity and innovation. The view of cities as “social reactors” from complex systems theory30 provides the underpinnings for this dynamics by deriving the social and infrastructural conditions (coordination costs) under which the process of specialization can happen and remain open-ended. They also provide a context for ideas of innovation as a recombinatory process37, a framework often invoked to explain technological development: the growth in social connectivity with city size, and the possibilities for the division of labor it affords, promote the appearance of explicit new functions, such as new professions. These functions can then be recombined in organizations and products, which in turn may stimulate the development of new functions, and so on. Thus, the information about more elaborate products and services becomes increasingly encoded in social networks, rather than being contained in the knowledge of single individuals. Given inevitable individual cognitive constraints38, this seems to be a necessary condition for socioeconomic complexity to develop as an open-ended process.

The framework developed here also unifies two distinct points of view that are often thought to be at odds: Growing diversity, in this picture, is the emergence of greater specialization. This demonstrates how diversity and specialization can be two facets of the same process for realizing increasing productivity gains through knowledge specialization in social organizations. In this picture, it is the transfer of already existing incipient functions from generalist individuals to growing social networks of interdependent specialists, see Fig. 3B, that supports their close integration and coordination. It also shows how system size (‘the extent of the market’32 as the metropolitan area) is key in enabling these dynamics as it determines, indirectly30, both productivity gains and coordination costs33.

Where these two points of view (diversity and specialization) differ is in the unit of analysis at which they take place. Here, we showed that patterns of differentiation are predominantly local in social networks and can be observed at the individual level. (The original division of labor in human societies, by sex and by age, takes place in closely-knit cohabiting groups39.) However, this does not exclude that processes of specialization and interdependence may also take place at higher levels of organization, such as firms, cities or even nations, whenever coordination costs are sufficiently low. In fact, a similar analysis at the establishment level (places of employment) reveals that there is less specialization in larger cities (a smaller increase in diversity) in terms of business types relative to professions40. This result implies that businesses classified as the same type in cities of different sizes contain, on average, a more diverse set of occupations and realize greater productivity (e.g. in terms of value-added41) in larger cities. At the national level, scale (population) plays less of a role in determining economic diversity or productivity8 (cf. China or India vs. the Netherlands or Singapore), though these two characteristics remain firmly correlated and intimately tied to levels of urbanization. In fact, urban hierarchy theories, from central place theory28 to more recent economic geography29, assume the existence of a general hierarchy of ‘services’ common to all cities that grows deeper with city size. In this picture, as we have demonstrated quantitatively here, the economic diversity, D, of a national economy is set by that of its largest city.

This evidence suggests that the development of fundamental theory linking diversity, productivity and scale in complex systems must be guided by the integrated analysis of microscopic productivity gains obtained via specialization and learning with associated coordination costs mediated by (social) networks of interdependence33. It also shows how resilience deriving from greater diversity may be an unintended consequence of a general process of knowledge specialization in complex networks. Similar analysis based on different socioeconomic units (such as firms, cities or nations) may shed light on the necessary conditions for these entities to realize analogous patterns of growth and development and help establish a more general quantitative framework for this sort of analysis in other complex systems. In this respect, it is likely an important clue to theory that similar quantitative patterns of functional diversity characterize the technological complexity of at least some simpler (pre-urban) human societies42 and may be a property of other networked systems that can experience increases in their productivity with scale. The reversibility of these processes, e.g. the absence of hysteresis in the externalization and reabsorption of functions by individuals interconnected by networks, may also underlie the resilience of many complex systems5,20,43 under functional change or population loss.


Urban Units of Analysis

We adopt the standard definition of functional cities as metropolitan statistical areas (MSAs). MSAs are collections of political units (counties) aggregated by the US Census Bureau based on a set of criteria that includes population size, density and commuting flows. There were 403 MSAs in 2010, providing an ample basis for studying occupational diversity across city size (from 50K to 20M inhabitants) as well as other urban characteristics. MSAs are integrated labor markets and the best current official definition of functional cities in terms of a mixing population30.

Data Sources

The data on professional occupations in US MSAs are the Occupational Employment Statistics (OES) obtained from the Bureau of Labor Statistics (BLS)44, which are freely available online. Because the OES survey is a semi-annual mail survey of non-farm establishments, it does not include self-employment. All occupations reported by employers in the US economy are hierarchically classified on the basis of their similarities at different levels of aggregation based on the BLS's Standard Occupational Classification Scheme, which contains a total of 840 distinct professions at its finest (r = 6-digit) level of resolution for 2010. It is important to note that while this classification scheme is the finest occupational classification available at the MSA level, each entry does not actually represent a job that someone does, but instead a group of job titles comprising, on average, 7 specific jobs. This even finer classification is only available at the national level. However, the inclusion of this definition does not alter the conclusions reached here, as it only multiplies the diversity of occupations by a factor of 7 on average (see Supplementary Fig. S.4 and Table S.3).

Fit methodology

For each year, we estimated the parameters in Eq. 1 for occupations at the r = 6, 5, 4 and 3-digit levels (Fig. 1A–B) through nonlinear least squares, using the Gauss-Newton method, which relies on linear approximations to the nonlinear mean function45. We then used the γ, d0 and N0 estimated for 2010 to fit Eq. 2. For visual purposes, a constant i0 = 3 is introduced into Eq. 2 to account for the initial curvature observed for the most common occupations (Fig. 2B). Hence the fit in Fig. 2B corresponds to Eq. 2 with this offset applied at small ranks. Note that i0 is not determined by the process of analytic continuation, valid only at high ranks, used to obtain the form of the frequency distribution; from that point of view it constitutes a functional freedom that can be set by the boundary condition of observing f(i; N) for sufficiently small cities, where D(N) is of order i0.
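
A Gauss-Newton fit of this kind can be sketched in a few dozen lines. The example below is an illustration only, not the code used for the analysis: it assumes the saturating form D = d0 x / (1 + x), x = (N/N0)^γ, for Eq. 1, fits (ln d0, γ, ln N0) rather than the raw parameters to keep the normal equations well scaled, adds a step-halving safeguard that is not mentioned in the text, and runs on synthetic noise-free data generated from the Fig. 1 parameter values. All function names are hypothetical.

```python
import math


def model(n, log_d0, gamma, log_n0):
    """Assumed saturating form D = d0 * x / (1 + x), with x = (n / n0) ** gamma."""
    x = math.exp(gamma * (math.log(n) - log_n0))
    return math.exp(log_d0) * x / (1.0 + x)


def jacobian_row(n, log_d0, gamma, log_n0):
    """Partial derivatives of the model with respect to (log d0, gamma, log n0)."""
    u = math.log(n) - log_n0
    x = math.exp(gamma * u)
    d0 = math.exp(log_d0)
    slope = d0 * x / (1.0 + x) ** 2
    return [d0 * x / (1.0 + x), slope * u, -slope * gamma]


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * 3
    for r in (2, 1, 0):
        z[r] = (m[r][3] - sum(m[r][c] * z[c] for c in range(r + 1, 3))) / m[r][r]
    return z


def sse(theta, sizes, counts):
    """Sum of squared residuals for parameters theta."""
    return sum((y - model(n, *theta)) ** 2 for n, y in zip(sizes, counts))


def gauss_newton(sizes, counts, theta, iterations=50):
    """Gauss-Newton iterations on the normal equations, with step halving."""
    theta = list(theta)
    for _ in range(iterations):
        rows = [jacobian_row(n, *theta) for n in sizes]
        resid = [y - model(n, *theta) for n, y in zip(sizes, counts)]
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        jtr = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(3)]
        step = solve3(jtj, jtr)
        current, scale = sse(theta, sizes, counts), 1.0
        while scale > 1e-8:
            trial = [t + scale * s for t, s in zip(theta, step)]
            if sse(trial, sizes, counts) < current:
                theta = trial
                break
            scale *= 0.5
        else:
            break  # no improving step found: treat as converged
    return theta


def demo():
    """Recover known parameters from synthetic, noise-free data."""
    truth = (math.log(686.0), 0.84, math.log(1.48e5))
    sizes = [5e4 * 400.0 ** (k / 29.0) for k in range(30)]  # 50K to 20M
    counts = [model(n, *truth) for n in sizes]
    start = (math.log(600.0), 0.80, math.log(1.0e5))
    return gauss_newton(sizes, counts, start)


if __name__ == "__main__":
    log_d0, gamma, log_n0 = demo()
    print(math.exp(log_d0), gamma, math.exp(log_n0))
```

With real data the residuals do not vanish, so the quality of the starting values and the stopping rule matter more than in this noise-free demonstration.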