Urban economic fitness and complexity from patent data

Over the years, the growing availability of extensive datasets about registered patents allowed researchers to get a deeper insight into the drivers of technological innovation. In this work, we investigate how patents’ technological contents characterise metropolitan areas’ development and how innovation is related to GDP per capita. Exploiting worldwide data from 1980 to 2014, and through network-based techniques that only use information about patents, we identify coherent distinguished groups of metropolitan areas, either clustered in the same geographical area or similar in terms of their economic features. Moreover, we extend the notion of coherent diversification to patent production and show how it is linked to the economic growth of metropolitan areas. Our findings draw a picture in which technological innovation can play a key role in the economic development of urban areas. We contend that the tools introduced in this paper can be used to further explore the interplay between urban growth and technological innovation.


Introduction
Modern cities are at the centre of a passionate debate about their future. With over 55% of the global population now living in urban areas, cities represent the core of the modern world. They are key for the production and diffusion of innovation 1,2 in many different sectors ranging from economy 3 to science 4 and culture 5. The ongoing pandemic has been imposing the hardest possible stress test on urban infrastructures and poses a real challenge in rethinking the role of cities, urban planning and policy decisions. While urbanisation keeps thriving 6, the challenge of understanding the development of cities to make them more sustainable and resilient becomes more and more crucial 7,8. Therefore, it is of paramount importance to tackle urban areas' challenges by going beyond pure optimisation schemes and keeping a dynamic perspective. New tools are thus needed to understand and map the present and forecast how a change in the current conditions will affect and modify future scenarios.
Despite belonging to different geographical areas and socio-economic contexts, cities share general features in their economic development and urbanisation rates. For example, the authors of 9 show that many urban socio-economic indicators have a power-law correlation with the population size. In 10, the authors observe how individual cities recapitulate a common pathway where a transition to innovative economies takes place with a population of around 1.2 million. However, cities are ever-evolving systems where several changes and different growth paths are possible 11. Technological innovation has been highlighted as the main driver for evolution and change in cities, and it has been shown that complex economic activities flourish in large urban areas 12. In parallel, many studies recently focused on how innovation proceeds [13][14][15]. In this paper, we focus on technological innovation, and we investigate how the technological DNA of cities can affect their development and potential.
The adoption of patent data to monitor technological innovation is well established [16][17][18]. Over the past few decades, patent data have become a workhorse for the literature on technical change, mainly due to the growing availability of data about patent documents 19. This ever-increasing data availability (e.g., PATSTAT, REGPAT and Google Patents 20) has prompted researchers worldwide to investigate various questions regarding patenting activity, for example, the nature of inventions, their network structure and their role in explaining technological change 19,21,22.
One of the characteristics of patent documents is the presence of codes associated with the claims contained in the patent applications. These codes mark the boundaries of the commercial exclusion rights demanded by inventors. Claims are classified based on the technological areas they impact according to existing classifications (e.g., the IPC classification 23) to allow the evaluation by patent offices. Mapping claims to classification codes allows localising patents and patent applications within the technology space. Many studies recently relied on network-based techniques to unfold the complex interplay among patents, technological codes and geographical reference areas. Network science techniques have made it possible to analyse the economic activities of countries 24, regions [25][26][27][28][29], cities 2,[30][31][32] or firms 33,34.
In the present work, we focus on cities to quantify the complexity of their technologies, correlating it with socio-economic indicators such as the GDP per capita. More precisely, we summarise our research questions as follows.
Which cities have the most advanced technological production? We use the framework of Fitness and Complexity (FC) 35 to quantify the complexity of metropolitan areas and their technological endowment. Introduced initially and extensively adopted for countries' production/exports 35,36, the approach can easily be extended to any object pair, in this case, urban areas and technological codes.
Are cities able to diversify their production of patents, or do they tend to specialise in particular sectors? In economics, FC has also been applied to sub-national scales, such as regions 37,38 and firms, at either a country 39 or global 40 level. The study of bipartite economic systems at different scales revealed that to apply the FC framework, the economic agents need to have the capability to diversify to create global competition in the system. Otherwise, they will try to specialise and create a nested subsystem of entities specialising in the same products. In such a case, the analysis has to be restricted to subsystems for the FC method to capture the interplay among the economic agents. In this sense, the scale of the system is fundamental and regulates the interplay between competition and specialisation. We aim to understand whether metropolitan areas can compete globally or if they tend to specialise.
Are there clusters of cities with similar technological baskets? Starting from a bipartite system of metropolitan areas and technology codes, we investigate the relations and similarities among metropolitan areas and uncover meaningful patterns in the evolution of their technological production. In bipartite systems, it is often important to understand the similarities between pairs of nodes of the same layer, to obtain a validated projection on a single layer 41. We adopt this procedure to understand which metropolitan areas are more similar in the type of patents they produce and which patents are more likely to be produced together.
The paper is organised as follows: in Section 2, we describe the data used in this work and we go through our data cleaning procedure. In Section 3, we introduce the methodologies used in our work, describing the details of the networks and measures we employed. In Section 4, we discuss the results showing how the network techniques can highlight non-trivial clusters of technologies and metropolitan areas, and how both the Fitness and the coherent diversification can drive a higher increase in the GDP per capita (GDPpc) of metropolitan areas. Finally, Section 5 sums up our contributions and hints at future work needed to address questions arising from this study.

Technology Codes
Here, we adopt the PATSTAT database (www.epo.org/searching-for-patents/business/patstat), which provides information about patents and technology codes. The database contains approximately 100 million patents registered in about 100 Patent Offices. Each patent is associated with a code that uniquely identifies the patent and a certain number of associated technology codes. The WIPO (World Intellectual Property Organization) uses the IPC (International Patent Classification) standard 23 to assign technology codes to each patent. IPC codes form a hierarchical classification based on six levels called digits, used to go into more and more detail about the technology used. The first digit represents the macro category: for example, the code Cxxxxx corresponds to the macro category "Chemistry; Metallurgy" and Hxxxxx to the macro category "Electricity"; considering the subsequent digits, we have, for instance, with C01xxx, the class "Inorganic Chemistry" and with C07xxx the class "Organic Chemistry". After assigning technology codes to each patent, we use a database about cities (see next section) to match the unique patent identifier and its technology codes to the corresponding city. To geolocalise the patents, we adopt the De Rassenfosse et al. database 42, which contains entries on 18 million patents from 1980 to 2014. Conveniently, in this database, the geographical information of patents is given as precise geographical coordinates. Thus, each patent has a unique identifier, a series of technology codes, and geographical coordinates identifying the corresponding city.

GDP of cities
To obtain information on the GDP of cities and their evolution, we used the work of Kummu et al. 43. The authors constructed a worldwide GDP grid with a resolution of about five arc minutes for the 25 years 1990-2015. To compute the GDP per capita of each city or metropolitan area (MA) for each year in the data, we first download the boundaries from the Global Human Settlement Layer 44. Considering the GDP grid in one year, we compute the GDP per capita of a MA as the average of all the grid points within its boundaries. In Fig. 3 in the Supplementary Information, we show the example of the grid of the Rome metropolitan area.

Data Cleaning Procedure
To clean the data, the first step is to associate the technology codes of a patent with a specific city. Once this preliminary operation is completed, it is possible to build the bipartite networks that link cities to technology codes. We represent the bipartite networks through rectangular bi-adjacency matrices $V^y$ whose elements $V^y_{c,t}$ are integers indicating how many times a technology code $t$ appeared in different patents in a given city $c$ in year $y$. In total, our network features 42912 cities connected to 650 technology codes (4-digit). To reduce the difference in size between the two layers of the networks and to reduce the noise in the system, which is often due to the presence of very small cities, we aggregate the cities into their respective metropolitan areas (MAs). We select all cities within a metropolitan area, and the technology codes associated with the metropolitan area will be the union of all the technology codes of the cities within it. The MAs present in the Global Human Settlement Layer 44 are 8641 and cover the entire world. However, most of these do not contain cities that have patents. The metropolitan areas producing patents are 2169 and are distributed as shown in Figures 1 and 2 in the Supplementary Information.
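The aggregation of cities into MAs can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes that taking the "union" of technology codes amounts to summing the count rows of the cities that fall inside the same MA, and the function name `aggregate_to_mas` and the mapping `city_to_ma` are hypothetical.

```python
import numpy as np

def aggregate_to_mas(V_city, city_to_ma, n_mas):
    """Sum the rows of a city-by-technology count matrix over each MA.

    V_city     : (n_cities, n_tech) integer counts V_{c,t}
    city_to_ma : (n_cities,) index of the MA each city belongs to (assumed given)
    n_mas      : number of metropolitan areas
    """
    V_ma = np.zeros((n_mas, V_city.shape[1]), dtype=V_city.dtype)
    # Unbuffered add: rows of cities sharing the same MA index accumulate.
    np.add.at(V_ma, city_to_ma, V_city)
    return V_ma

# Toy example: cities 0 and 1 belong to MA 0, city 2 to MA 1.
V_city = np.array([[2, 0, 1],
                   [0, 3, 0],
                   [1, 1, 0]])
city_to_ma = np.array([0, 0, 1])
V_ma = aggregate_to_mas(V_city, city_to_ma, n_mas=2)
```

Summing counts keeps the information on how often a code appears; a code belongs to the MA's basket whenever its summed count is positive.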
We obtain a matrix $V^y$ for each year $y$ from 1980 to 2014, connecting 2169 metropolitan areas $a$ and 650 technology codes $t$. To avoid the fluctuations due to using only one year at a time, we consider a window of 5 years each time, summing the matrices in one window. In this paper, therefore, the matrix $V^y$ will refer to the time window from $y$ to $y + 4$. The final database consists of 30 5-year window matrices $V^y$ ranging from window 1980-1984 to 2010-2014. Finally, we binarise the matrices $V^y$ applying a standard procedure in economic complexity to determine relevant producers/exporters of products (see Section 3).
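As an illustration of the windowing step, the sketch below sums yearly matrices over sliding windows of five consecutive years. The function name `window_sums` and the dictionary layout are assumptions made for the example.

```python
import numpy as np

def window_sums(yearly, width=5):
    """Sum yearly matrices over sliding windows of `width` consecutive years.

    yearly : dict mapping year -> count matrix (all with the same shape)
    Returns a dict mapping the first year of each complete window to the
    sum of the matrices in that window.
    """
    out = {}
    for y in sorted(yearly):
        span = range(y, y + width)
        if all(s in yearly for s in span):  # keep complete windows only
            out[y] = sum(yearly[s] for s in span)
    return out

# Toy data: seven years of 2x2 matrices whose entries equal the year offset.
yearly = {1980 + i: np.full((2, 2), i) for i in range(7)}
windows = window_sums(yearly)
```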

Revealed Comparative Advantage
To understand which metropolitan areas are relevant innovators of a specific technological sector, we apply the revealed comparative advantage (RCA) 45 binarisation strategy. RCA is a frequently used tool in the economic complexity literature 24,36,46. Considering a bipartite network of countries and products, RCA allows us to determine how competitive a country is in exporting a given product while also considering how many countries export that product. In our case, RCA reveals when the share of patents of some technology, $t$, introduced by a certain MA, $a$, is higher than the average share of the rest of the market, meaning that the metropolitan area focuses on the technology $t$ more than the number of technologies produced would suggest.
Considering the matrix $V^y$ for the year $y$, we define the RCA for the MA $a$ and the technology $t$ as:

$$\mathrm{RCA}^y_{a,t} = \frac{V^y_{a,t} \big/ \sum_{t'} V^y_{a,t'}}{\sum_{a'} V^y_{a',t} \big/ \sum_{a',t'} V^y_{a',t'}},$$

where the sums run over all the technologies $t'$ and all the MAs $a'$. A value $\mathrm{RCA}_{a,t} \geq 1$ means that MA $a$ is significantly competitive in the technology field $t$. We use this threshold on the RCA values to obtain 30 matrices $M^y$, one for each 5-year window:

$$M^y_{a,t} = \begin{cases} 1 & \text{if } \mathrm{RCA}^y_{a,t} \geq 1, \\ 0 & \text{otherwise.} \end{cases}$$

Notice that, in the following, we consider only the MAs having an average of at least one entry above the RCA threshold per year, reducing their number to 1211. These $M^y$ matrices represent our final temporal bipartite network that links 1211 MAs to 650 technology codes.
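The RCA binarisation can be written in a few lines of NumPy. The sketch below is illustrative: the function names are hypothetical, and entries whose row or column sum is zero are set to zero by assumption, a case the text does not discuss.

```python
import numpy as np

def rca(V):
    """Revealed comparative advantage of each (MA, technology) pair."""
    V = np.asarray(V, dtype=float)
    row = V.sum(axis=1, keepdims=True)   # patents of each MA
    col = V.sum(axis=0, keepdims=True)   # patents of each technology
    tot = V.sum()                        # all patents
    with np.errstate(divide="ignore", invalid="ignore"):
        R = (V / row) / (col / tot)
    # Assumption: undefined ratios (empty rows or columns) count as 0.
    return np.nan_to_num(R, nan=0.0, posinf=0.0)

def binarise(V, threshold=1.0):
    """M[a, t] = 1 where RCA >= threshold, else 0."""
    return (rca(V) >= threshold).astype(int)

V = np.array([[8, 2],
              [2, 8]])
M = binarise(V)
```

In the toy matrix each MA holds 80% of its patents in one technology that accounts for 50% of all patents, so the RCA of that pair is 0.8 / 0.5 = 1.6 and the other pair gets 0.4.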

Bipartite Networks
A bipartite network is a network whose nodes represent two different kinds of entities, and only connections between nodes of different kinds are allowed. Many systems in ecological and socio-economic environments, such as those studied in the present work, are easily described as bipartite since they involve interactions between two kinds of entities 39,47. For instance, the Internet can be modelled as a users-websites bipartite network, whose analysis can reveal sets and ranks of pages which will be more likely to be of interest for the user 48. We can use the $M^y$ matrices as biadjacency matrices of MA-technology bipartite networks, connecting each MA with the technologies in which it is competitive. In Fig. 1 we show a pictorial representation of this bipartite network and its biadjacency matrix $M^y$ for the year $y = 2000$. Projecting the bipartite network on one of its layers, we can find non-trivial similarity patterns between MAs or technologies. However, the problem of finding the proper projection of a bipartite network into a monopartite one representing the similarities of nodes on one of its layers is well-known in the literature 41,[48][49][50]. In general, the goal is to find the monopartite network that best represents the bipartite one without taking too much information away from the latter. We decided to use the Bipartite Configuration Model (BiCM) 51,52 to select the most significant nodes and links.

Bipartite Configuration Model (BiCM)
One of the simplest ways to obtain a monopartite (one-mode) projection from bipartite data is to count the number of links in common between two different entities belonging to the same layer. For example, using $M$ as the biadjacency matrix of a bipartite network between metropolitan areas $a$ and technologies $t$, counting the number of links in common between two different entities belonging to the same layer means computing:

$$A_{a_i a_j} = \sum_t M_{a_i t} M_{a_j t},$$

where $A_{a_i a_j}$ is the element of the adjacency matrix $A$ of the monopartite projection between elements $a_i$ and $a_j$. However, we note that a projection made in this way leads to a densely connected structure with a trivial topology.
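For a binary biadjacency matrix, this naive co-occurrence projection is a single matrix product. A minimal sketch on a toy matrix (the values are illustrative only):

```python
import numpy as np

# Toy biadjacency matrix: 3 MAs (rows) x 3 technologies (columns).
M = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 1]])

# A[i, j] = number of technologies shared by MAs i and j.
A = M @ M.T
np.fill_diagonal(A, 0)  # drop self-loops (the diagonal holds the degrees)
```

Projecting onto the technology layer instead would use `M.T @ M`.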
To select the relevant nodes and links in our projected networks and avoid obtaining too dense a projection, we use as a null model the Bipartite Configuration Model (BiCM) 49,51,52, which we compute by using the NEMtropy Python package. The BiCM belongs to the family of the Exponential Random Graphs, adapted to the case of bipartite networks. These models arise from the maximisation of the Shannon entropy of an ensemble $\Omega$ of networks, in our case undirected binary bipartite networks $M$:

$$S = -\sum_{M \in \Omega} P(M) \ln P(M),$$

considering a set of constraints $C(M)$. Here $P(M)$ is the probability of a specific bipartite network $M$.
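The probability distribution maximising the entropy is the exponential distribution:

$$P(M \mid \vec{\theta}) = \frac{e^{-H(\vec{\theta}, C(M))}}{Z(\vec{\theta})}, \qquad Z(\vec{\theta}) = \sum_{M \in \Omega} e^{-H(\vec{\theta}, C(M))}, \qquad (1)$$

where $H(\vec{\theta}, C(M)) = \vec{\theta} \cdot C(M)$ is the Hamiltonian and $\vec{\theta}$ are the Lagrange multipliers imposing the constraints. Two sets of constraints are imposed in the BiCM, one for each layer. Specifically, the node degrees are fixed, namely the ubiquity $u(M)$ for each technology code and the diversification $d(M)$ for MAs, in our case. The mean values of the node degrees must be tuned to match these quantities. Then we obtain the Hamiltonian $H$:

$$H(\vec{\alpha}, \vec{\beta}, M) = \sum_a \alpha_a d_a(M) + \sum_t \beta_t u_t(M).$$

Imposing the previous constraints together with the normalisation condition $\sum_{M \in \Omega} P(M) = 1$, we can write Eq. 1 as:

$$P(M \mid \vec{\alpha}, \vec{\beta}) = \frac{e^{-\sum_a \alpha_a d_a(M) - \sum_t \beta_t u_t(M)}}{Z(\vec{\alpha}, \vec{\beta})}.$$

Since constraints have been imposed on the mean values of the node degrees, the previous equation can be decomposed into the product of the probability distributions of the single links:

$$P(M \mid \vec{x}, \vec{y}) = \prod_{a,t} p_{at}^{m_{at}} (1 - p_{at})^{1 - m_{at}},$$

where $p_{at} = \frac{x_a y_t}{1 + x_a y_t}$ is the probability of the link between the MA $a$ and the technological code $t$, $x_a = e^{-\alpha_a}$ and $y_t = e^{-\beta_t}$. To estimate the unknown parameters we have to maximise the log-likelihood $\mathcal{L}(\vec{x}, \vec{y}) = \ln P(M \mid \vec{x}, \vec{y})$, i.e. solve the system:

$$\langle d_a \rangle = \sum_t p_{at} = d_a^*, \qquad \langle u_t \rangle = \sum_a p_{at} = u_t^*,$$

with $d_a^* = d_a(M)$ and $u_t^* = u_t(M)$ representing the observed quantities.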
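To make the likelihood equations concrete, the sketch below solves them with a plain fixed-point iteration in NumPy. It does not reproduce the NEMtropy interface used in this work; the function name, starting point and stopping rule are assumptions, and the iteration is only expected to behave well when no node is connected to all or none of the nodes of the opposite layer.

```python
import numpy as np

def solve_bicm(M, tol=1e-10, max_iter=10_000):
    """Return the BiCM link probabilities p_at = x_a y_t / (1 + x_a y_t)."""
    M = np.asarray(M, dtype=float)
    d = M.sum(axis=1)                    # observed diversifications d_a*
    u = M.sum(axis=0)                    # observed ubiquities u_t*
    x = d / np.sqrt(M.sum())             # heuristic starting point (assumption)
    y = u / np.sqrt(M.sum())
    for _ in range(max_iter):
        # Rearranged constraints: x_a = d_a / sum_t y_t / (1 + x_a y_t), etc.
        x_new = d / (y / (1.0 + np.outer(x, y))).sum(axis=1)
        y_new = u / (x_new[:, None] / (1.0 + np.outer(x_new, y))).sum(axis=0)
        delta = max(np.abs(x_new - x).max(), np.abs(y_new - y).max())
        x, y = x_new, y_new
        if delta < tol:
            break
    xy = np.outer(x, y)
    return xy / (1.0 + xy)

M = np.array([[1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]])
P = solve_bicm(M)
```

At convergence, the row and column sums of `P` reproduce the observed degrees, which is exactly the system of equations above.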
After we obtain the link probabilities of the model, we use them to compute how unexpected the number of common neighbours of two nodes of the same layer is. Given that, by construction, the links of the model are independent random variables, the probability that two MAs $a$ and $a'$ share a technology $t$ is $P(V^t_{aa'} = 1) = p_{at}\, p_{a't}$, and the total number of technologies they share is $V_{aa'} = \sum_t m_{at} m_{a't}$. Thus, we can compute a p-value for the number of common neighbours observed for two nodes of the same layer, which reads:

$$\text{p-value}(V^*_{aa'}) = \sum_{V_{aa'} \geq V^*_{aa'}} f_{\mathrm{PB}}(V_{aa'}),$$

where $f_{\mathrm{PB}}$ is the distribution of $V_{aa'}$ under the null model and $V^*_{aa'}$ is the number of common neighbours between nodes $a$ and $a'$ in the observed network. Note that the random variable $V_{aa'}$ is a Poisson-Binomial, i.e. a sum of independent Bernoulli random variables with different parameters, which is hard to evaluate when the number of Bernoulli variables is large. We therefore approximate it with a Poisson variable with the same mean, as has been done in previous works.
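Under the Poisson approximation, the p-value of a pair only requires the mean $\mu = \sum_t p_{at} p_{a't}$. A minimal sketch using only the standard library follows; the function names are hypothetical.

```python
import math

def poisson_upper_tail(k_obs, mu):
    """P(X >= k_obs) for X ~ Poisson(mu), by summing the lower tail."""
    if k_obs <= 0:
        return 1.0
    term = math.exp(-mu)          # P(X = 0)
    cdf = 0.0
    for k in range(k_obs):        # accumulates P(X = 0) ... P(X = k_obs - 1)
        cdf += term
        term *= mu / (k + 1)
    return max(0.0, 1.0 - cdf)

def pair_pvalue(p_a, p_b, v_obs):
    """p-value of observing at least v_obs common neighbours for nodes a and b.

    p_a, p_b : link probabilities of the two nodes towards the opposite layer
    """
    mu = sum(pa * pb for pa, pb in zip(p_a, p_b))
    return poisson_upper_tail(v_obs, mu)

# Two MAs, two technologies, all link probabilities equal to 0.5.
pval = pair_pvalue([0.5, 0.5], [0.5, 0.5], v_obs=2)
```

For many terms or very small p-values, a survival function from a numerical library would be more accurate than subtracting the lower tail from one.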
After applying this procedure to each pair of nodes, we obtain as output a matrix of p-values with one entry per pair of nodes of the same layer, i.e. of the same size as the adjacency matrix of the monopartite projection. As a final step, we have to decide which of these p-values are significant and which are not. To assess the link significance, we use the False Discovery Rate (FDR) test 53: let us assume that we have $N$ hypotheses, each characterised by its p-value. The FDR first sorts these $N$ p-values in increasing order as $\text{p-value}_1 \leq \dots \leq \text{p-value}_N$, and then identifies the largest integer $I$ such that:

$$\text{p-value}_I \leq \frac{I \alpha}{N}, \qquad (3)$$

where $\alpha$ is the arbitrarily defined single-test significance level. We use $\alpha = 0.01$ for the projection onto the technology layer, and $\alpha = 0.1$ for the MA one. Note that in this case, $\alpha$ will be the statistical significance of the whole validated network, while the p-value threshold for the single links will be much lower. Finally, all hypotheses with a p-value lower than or equal to $\text{p-value}_I$ will be rejected, i.e. the link will be validated in the projected network. In our case, for instance in the case of the projection on the technologies' layer, the number of hypotheses is the number of possible links in the projection, $\binom{N_t}{2}$, and Eq. 3 becomes:

$$\text{p-value}_I \leq \frac{I \alpha}{\binom{N_t}{2}}.$$

Ordering the $\binom{N_t}{2}$ coefficients $\text{p-value}(V^*_{tt'})$ and retaining only the links between pairs of nodes $t, t'$ such that $\text{p-value}(V^*_{tt'}) \leq \text{p-value}_I$ yields our projection.
Let us remark that the projection obtained via the procedure just described only keeps links that are highly significant with respect to the degree of the nodes, unveiling hidden strong similarities.
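The FDR validation step can be sketched as follows. The function name `fdr_validate` is hypothetical, and the input is assumed to be the flat list of p-values of all candidate links.

```python
import numpy as np

def fdr_validate(pvalues, alpha):
    """Boolean mask of the hypotheses rejected by the FDR procedure.

    Sort the N p-values, find the largest rank I with
    p-value_I <= I * alpha / N, and reject every hypothesis whose
    p-value is at most p-value_I (i.e. validate the corresponding links).
    """
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, n + 1) / n
    passed = np.nonzero(p[order] <= thresholds)[0]
    keep = np.zeros(n, dtype=bool)
    if passed.size:
        keep[order[: passed[-1] + 1]] = True
    return keep

# Four candidate links, alpha = 0.05: thresholds are 0.0125, 0.025, 0.0375, 0.05.
validated = fdr_validate([0.001, 0.2, 0.012, 0.9], alpha=0.05)
```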

Modularity and Community detection
We are interested in finding relevant communities of MAs or technologies to better visualise which nodes in the two layers are highly interconnected. To this end, we adopted the Louvain method introduced by Blondel et al. 54, which relies on finding a partition that maximises the modularity. We also vary the Resolution 55 to find communities at different scales.
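For reference, the quantity that the Louvain method tries to maximise can be computed directly for any given partition. The sketch below evaluates the modularity with a resolution factor multiplying the null-model term; this is one common convention and is an assumption here, since the resolution of reference 55 may be parametrised differently (in some implementations a larger resolution yields fewer, larger communities).

```python
import numpy as np

def modularity(A, labels, resolution=1.0):
    """Modularity of a partition of an undirected network.

    Q = (1 / 2m) * sum_ij (A_ij - resolution * k_i k_j / 2m) * delta(c_i, c_j)
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                        # node degrees (or strengths)
    two_m = k.sum()                          # twice the total edge weight
    same = labels[:, None] == labels[None, :]
    return ((A - resolution * np.outer(k, k) / two_m) * same).sum() / two_m

# Two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
Q_two = modularity(A, [0, 0, 0, 1, 1, 1])
Q_one = modularity(A, [0, 0, 0, 0, 0, 0])
```

Splitting the toy network into its two triangles gives Q = 5/14, while putting all nodes in one community gives Q = 0.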

Fitness and Complexity algorithms
The Fitness and Complexity (FC) framework 35, introduced in 2012, provides a way to quantify the competitiveness (Fitness) of the economy of a country. Here, we adopt it to quantify the Fitness of metropolitan areas considering only patent data. The idea is to define an iterative process linking and combining the Fitness of a MA, $F_a$, with the Complexity of a specific technology, $Q_t$. The iterations to find these quantities are defined as:

$$\tilde{F}_a^{(n)} = \sum_t M_{at}\, Q_t^{(n-1)}, \qquad \tilde{Q}_t^{(n)} = \frac{1}{\sum_a M_{at} \dfrac{1}{F_a^{(n-1)}}}, \qquad (4)$$

where for each step $n$ the quantities are normalised as:

$$F_a^{(n)} = \frac{\tilde{F}_a^{(n)}}{\langle \tilde{F}_a^{(n)} \rangle_a}, \qquad Q_t^{(n)} = \frac{\tilde{Q}_t^{(n)}}{\langle \tilde{Q}_t^{(n)} \rangle_t},$$

with initial conditions $F_a^{(0)} = 1\ \forall a$ and $Q_t^{(0)} = 1\ \forall t$. In 56 the convergence of the algorithm is studied in detail. In our case, we compute $F^y_a$ and $Q^y_t$ for each 5-year window $y$ starting from the biadjacency matrices $M^y_{at}$. We stop the iteration when the Fitness ranking of MAs does not change anymore.
The rationale behind the whole process is as follows. A technology made in an already developed MA carries little information about the complexity of the technology itself because developed metropolitan areas produce a large part of the technologies. In contrast, a technology produced by an underdeveloped MA must require a low level of sophistication. Thus, it is possible to measure a MA's technological competitiveness given its technologies' complexity, whereas a different, non-linear approach must be taken to assess the complexity of a technology. Fitness $F_a$ is proportional to the sum of the technologies of $a$ weighted by their complexity $Q_t$. Intuitively, the complexity of a technology is inversely proportional to the number of MAs that have implemented it. If a MA has high Fitness, its weight in bounding the complexity of a technology should be small, while MAs with low Fitness should contribute strongly to $Q_t$.
In recent studies, the authors of 37,57 have shown that it is helpful to calculate the Fitness of sub-national actors using the complexity that comes from the national systems. This measure is called exogenous Fitness and overcomes the issue of the limited capabilities of sub-national entities, such as cities/MAs in our case. Thus, in the Fitness calculation, we use the complexity obtained by considering global international patent data instead of calculating the complexity of a technology only on the MA subsample. We proceed in the same way by aggregating all the MAs of a country, i.e., summing all the rows of the MAs and running the FC algorithm. In other words, we compute $F_c$ and $Q^C_t$ relative to each country $c$ and technology $t$ through the formulas 4, and then calculate the Fitness of the MAs through:

$$F_a = \sum_t M_{at}\, Q^C_t.$$

For each time window, we calculate the exogenous Fitness of all metropolitan areas and the complexity of each technology.
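The iteration and the exogenous Fitness can be sketched in NumPy as below. This is an illustration under stated assumptions: a fixed number of iterations replaces the ranking-based stopping rule, every row and column of the matrix is assumed to be non-empty, and the function names are hypothetical.

```python
import numpy as np

def fitness_complexity(M, n_iter=50):
    """Iterate the Fitness-Complexity map on a binary matrix M (rows: actors)."""
    M = np.asarray(M, dtype=float)
    F = np.ones(M.shape[0])              # initial Fitness of each actor
    Q = np.ones(M.shape[1])              # initial Complexity of each technology
    for _ in range(n_iter):
        F_new = M @ Q                    # sum of complexities in the basket
        Q_new = 1.0 / (M.T @ (1.0 / F))  # bounded by the least fit producers
        F = F_new / F_new.mean()         # normalise to unit mean at each step
        Q = Q_new / Q_new.mean()
    return F, Q

def exogenous_fitness(M_ma, Q_country):
    """Fitness of MAs computed with complexities obtained at country level."""
    return np.asarray(M_ma, dtype=float) @ np.asarray(Q_country, dtype=float)

# Perfectly nested toy matrix: actor 0 makes everything, actor 2 one technology.
M = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
F, Q = fitness_complexity(M)
F_exo = exogenous_fitness([[1, 0, 1], [0, 1, 0]], [0.5, 1.0, 2.0])
```

In the nested example the most diversified actor gets the highest Fitness, and the technology made only by that actor gets the highest Complexity.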

Coherent diversification
The coherence of production and innovation diversification has been shown to be a significant driver of productivity 58,59. Thus, to better understand the nature of MAs' performance from their technology portfolio, we analyse their coherent diversification 34. The underlying question is whether the accumulation of knowledge and capabilities associated with a coherent set of technologies leads MAs to experience more significant benefits in terms of GDPpc. We first define the coherence of the technology field $t$ with respect to the technology basket of the MA $a$:

$$\gamma_{at} = \sum_{t' \neq t} B_{tt'} M_{at'},$$

where $B$ can be any matrix quantifying the similarities between pairs of technologies and $M$ is the usual biadjacency matrix of the bipartite network between the layers of MAs and technologies. For each technological field, $t$, and each MA, $a$, one counts how many technologies $t'$ adopted by $a$ are connected with $t$, using $B_{tt'}$ as a weight. If the technological portfolio of $a$ is such that $t$ is surrounded by a large number of strongly connected technologies owned by $a$, then $t$ will be very coherent to $a$, and $\gamma_{at}$ will be high. On the contrary, if $t$ belongs to a portion of the network of technologies far from the patenting activity of $a$, $\gamma_{at}$ will be low. In our case, we use as $B$ matrix the projection represented in Fig. 2. Notice that $\gamma$ has the same dimensions as $M$, and its elements quantify how coherent a technology $t$ is to the technology basket of MA $a$.
Finally, we can calculate the coherent technological diversification 34 of MA $a$ as:

$$\Gamma_a = \frac{\sum_t M_{at}\, \gamma_{at}}{d_a},$$

where $d_a = \sum_t M_{at}$ is the diversification of MA $a$. The Coherence of technological diversification, $\Gamma_a$, of MA $a$ is the average coherence $\gamma$ of the technologies in which $a$ is patenting.
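Both quantities reduce to matrix products. A minimal sketch, assuming a symmetric similarity matrix `B` whose diagonal is ignored and MAs with at least one technology (the function name is hypothetical):

```python
import numpy as np

def coherence(M, B):
    """Return (gamma, Gamma) for a biadjacency matrix M and similarity matrix B.

    gamma[a, t] = sum over t' != t of B[t, t'] * M[a, t']
    Gamma[a]    = average of gamma[a, t] over the technologies owned by a
    """
    M = np.asarray(M, dtype=float)
    B = np.asarray(B, dtype=float).copy()
    np.fill_diagonal(B, 0.0)             # exclude the t' = t term
    gamma = M @ B.T
    d = M.sum(axis=1)                    # diversification d_a (assumed > 0)
    Gamma = (M * gamma).sum(axis=1) / d
    return gamma, Gamma

# Technologies 0 and 1 are similar; technology 2 is isolated.
B = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
M = np.array([[1, 1, 0],    # MA 0 owns the two related technologies
              [1, 0, 1]])   # MA 1 owns two unrelated technologies
gamma, Gamma = coherence(M, B)
```

MA 0 holds a coherent basket and obtains a coherence of 1, while MA 1 holds unrelated technologies and obtains 0.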

Networks projection
To find a general network representation of our data for each year, we apply the Bipartite Configuration Model (BiCM) projection method (discussed in detail in the Methods section) to each $M^y$ matrix, one for each 5-year window (for both layers of technology codes and MAs), using the following steps:
• For each 5-year sliding window, we calculate the BiCM projection (with the same parameters every year), which gives us the most relevant nodes and links;
• We merge all the projections for every year as follows. For instance, suppose code A00A is connected with A00B in the projection relative to 1980, but code A00C does not appear in this network; suppose also that in the network of 1981, A00A is connected with A00C, but A00B does not appear. In the merged network, we will have both a link between A00A and A00B and a link between A00A and A00C;
• We use weights: e.g., if A00A is connected with A00B in 1980, 1981 and 1982, the relative link weight will be 3. We decided to do this to emphasise a relevant link between relevant nodes that lasts over time.
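The merging steps above can be sketched with a counter over undirected edges. The function name and the edge-list representation are assumptions made for the example, which reuses the placeholder codes of the text.

```python
from collections import Counter

def merge_projections(projections):
    """Merge per-window edge lists into one weighted undirected network.

    projections : iterable of edge lists, one per window, each edge a pair
                  of node labels. The weight of a link is the number of
                  windows in which it was validated.
    """
    weights = Counter()
    for edges in projections:
        # Sort the endpoints so (u, v) and (v, u) are the same link, and
        # use a set so a link counts at most once per window.
        for edge in {tuple(sorted(e)) for e in edges}:
            weights[edge] += 1
    return weights

projections = [
    [("A00A", "A00B")],                      # first window
    [("A00A", "A00C"), ("A00B", "A00A")],    # second window
]
merged = merge_projections(projections)
```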
The resulting technology network was obtained by setting the BiCM parameter for the statistical validation of the projected networks to α = 0.01 for every year. In contrast, the projected networks of MAs were obtained by setting the threshold α = 0.1, as explained in the Methods section. The two networks have a density of 0.032 and 0.012 for technologies and MAs, respectively. The density of the starting bipartite networks, averaged over the years, is 0.124. After these steps, we use the Louvain algorithm to identify communities, as discussed in the Methods section. The resulting networks are shown in Fig. 2 and Fig. 3. The technology network of Fig. 2 does not show a strong modular structure due to the ability of MAs to produce patents in different areas. Instead, we find different specific communities, with contiguous clusters containing products of similar macro-type.
For instance, we can find the technology communities of (clockwise, starting from the left/light green) communication & information, weapons, printers, domestic technologies, cars, bicycles, buildings, textile, plastic, metallurgy, agri-food & mining, fuel, organic chemistry, train, nuclear energy and clock.Node sizes are proportional to their complexity.The cluster with the highest number of complex nodes is the communication & information one, pointing out that not all MAs have the necessary capabilities to patent in this area.
Increasing the resolution parameter in the modularity optimisation 55, we can identify three technological macro areas. These three areas correspond respectively to the light blue, pink and olive green nodes in Fig. 2 (b). The three regions contain different kinds of technologies:
• Car technologies: this region, coloured in olive green, contains technologies closely related to cars;
• Highly sophisticated technologies: this macro area, depicted in light blue, contains clusters such as electricity and communications, nuclear, and household items, all technology sectors that we can classify as highly sophisticated technology sectors;
• Manufacturing technologies: in this area, represented in pink, we can find clusters related to the textile, agri-food, plastic and paper industries, thus containing manufacturing technology sectors.
Finally, in Fig. 2, we colour the technologies to show the average RCA values of New York (c) and Shanghai (d) in the database years.Red implies a higher RCA value, and we can note how Shanghai has focused more on manufacturing technologies while New York is strong in electricity and communications technologies.
As for the projected network of MAs (Fig. 3), α = 0.1 was used as the significance threshold parameter for the projection of each window; the depicted communities were found via the Louvain algorithm, and the partition features a modularity of 0.68. This statistically validated projection shows how the MAs can be grouped according to specific criteria. We find well-defined communities of Chinese, emerging, Euro + USA, and Japanese + Korean MAs. We also found a high-fitness cluster containing MAs such as London, New York or San Jose, and a cluster of Western car-manufacturing MAs including Turin, Detroit and Stuttgart, among others. We can see how the Japanese & Korean cluster shares connections with MAs from western countries as well as with those from China and emerging countries. We present a complete table of MAs with their class in the Supplementary Information.

The Fitness-GDP relation in metropolitan areas
In Fig. 4, we report the results regarding the relationship between the technology basket of MAs and their GDPpc. We apply the Fitness and Complexity algorithm described in the Methods section: we first calculate the complexity of technologies at the country level and then compute the exogenous Fitness of the MAs. In Fig. 4 we report three different representations of the GDPpc-Fitness plane. In the first (a), we trace the trajectories of some MAs from 1990 to 2010. MAs with high fitness are generally more likely to have a more significant increase in GDPpc. For Shanghai, for instance, the trajectory is nearly vertical, arguably thanks to the high starting fitness, ending at a similar value of GDPpc as Santiago. Santiago is also an interesting case, as its trajectory moves in an almost horizontal line, increasing the fitness without improving the GDPpc quite as much as Shanghai, arguably due to the low initial fitness. Other MAs, such as the Indian New Delhi and Kolkata, also tend to grow consistently in fitness and GDPpc. The same phenomenology is mirrored in Fig. 4 (b), where we show the average vector field of the trajectories from 1995 to 2005. Here, we also observe that the higher the fitness, the higher the increase in GDPpc. Finally, in Fig. 4 (c) we show the overall trend of all MAs, whose trajectories are coloured according to the community they belong to. For each community, we highlight the average trajectories. The three communities of Chinese MAs are particularly interesting since they show similar trends of fitness increase. The other clusters show different trends that can be easily interpreted in terms of GDPpc and Fitness. The High-Tech cluster has the highest average GDPpc, while the Western and Western cars ones have the same average GDPpc with the difference that the latter has higher fitness. The cluster Korea & Japan has a low average GDPpc compared to the previous three clusters, though with a comparable fitness. The cluster labelled as Emerging is slowly increasing in terms of average GDPpc and fitness. Finally, we note that the fitness trends of all clusters are decreasing, except for the Emerging and Chinese clusters. This behaviour is justified by considering that the fitness is a globally computed quantity, using data about all MAs. For this reason, the fitness cannot increase for all MAs simultaneously, and if it increases for some MAs, it must automatically decrease for others.
[Figure 3 caption, partial: the values of modularity and density are 0.68 and 0.012, respectively. Node size is proportional to the fitness of the MA. We notice that the high-tech cluster is the one containing the MAs with the highest fitnesses.]

The innovation fitness rankings of metropolitan areas
The metropolitan areas with the highest fitness per year are presented in Fig. 5. The rise of Chinese MAs from 1990 to 2010 is remarkable in this case too: at first, only the biggest areas such as Beijing and Shanghai enter the top 30 of the fitness rankings. Nagoya (Japan) sits atop the rankings from 1990 to 2001, then it is overtaken by the wave of Chinese cities, which start to monopolise the top 30 shortly after 2000; in 2000 the rankings are still mixed, including many Chinese metropolitan areas but also still many from the US and Japan. Ten years later, there are only seven metropolitan areas in the top 30 which are not Chinese: six of these are Korean and only one is European, Frankfurt. In 2010 Suzhou tops the rankings, followed by other Chinese metropolises such as Nantong, and the first non-Chinese MAs are the Korean Daegu and Busan, which were also at the top in the 2000 rankings.

Coherent technology production
In Fig. 6 we show the results of the coherent diversification in technological production. From Fig. 6(a), displaying the Coherence-Fitness plane, we observe that coherence can capture the signal of significant positive change in the GDPpc of MAs. Fig. 6(b) confirms this picture: while the change in GDPpc is not sensitive to fitness changes, a growing trend of coherence is accompanied by a parallel growth in the GDPpc's change. This result is appealing, especially if we consider that, in the ranking of Γ, 79 MAs, out of the top 100, are Chinese. To ensure that our result is not simply due to the relatively high number of Chinese MAs in our dataset, we performed a robustness test described in more detail in the Supplementary Information. In this test, we rebuild the technology network as explained in the "Networks projection" Section without the Chinese MAs, to then compute the Coherence using all MAs.
In the Supplementary Information, we also ran a simple check to show that the high Coherence is not related to low diversification. The coherent diversification strategy of China was already highlighted in a previous work by Gao et al. 60 , who noticed similar coherent patterns for the expansion of the production in Chinese regions. To highlight that coherence can better discriminate changes in GDPpc, we divide fitness and coherence into ten bins and calculate the mean fractional GDPpc variation of all the points in each of the ten bins. Fitness and coherence display different behaviours. In particular, the fitness curve is roughly constant, highlighting that fitness cannot discriminate different fractional changes of the GDPpc. Coherence, instead, displays a growing trend with the fractional change of GDPpc, i.e. the higher the coherence, the higher % ∆ GDPpc.
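The binning procedure can be sketched as follows. This is a minimal illustration that assumes equal-width bins (the text does not say whether equal-width or quantile bins were used); the helper `binned_mean` and the synthetic arrays are hypothetical and do not reproduce the paper's data.

```python
import numpy as np

def binned_mean(x, y, n_bins=10):
    """Mean of y inside n_bins equal-width bins of x (NaN for empty bins)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # digitize returns 1..n_bins for interior points; clip puts x.max() in the last bin
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    means = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            means[b] = y[mask].mean()
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, means

# Synthetic example: GDPpc change grows with coherence and is unrelated to fitness.
coherence = np.linspace(0.0, 1.0, 100)
gdp_change = 0.5 * coherence
fitness = (np.arange(100) * 37 % 100) / 99.0  # a fixed shuffle of the same grid

_, by_coherence = binned_mean(coherence, gdp_change)
_, by_fitness = binned_mean(fitness, gdp_change)
print(by_coherence)  # rises from bin to bin
print(by_fitness)    # fluctuates around the overall mean
```

A rising binned curve, as for the synthetic coherence here, is the signature that a variable discriminates between different fractional changes of GDPpc.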

Discussion
In this work, we studied technological innovation in metropolitan areas by analysing data on the production of patents. In particular, we focused on the signals of specialisation and diversification by applying the Fitness and Complexity framework and novel methods for bipartite networks to the technological production of metropolitan areas. The application of the Fitness and Complexity algorithm to MAs is particularly interesting since the interplay between specialisation and diversification can change at different scales 40 . We found that MAs tend to specialise in technology sectors, particularly for some technological categories, such as cars or electronics. Moreover, we observed similarities among metropolitan areas within a country or across similar countries. Chinese MAs give the best example of similar MAs in a single country. They are organised in three coherent clusters specialised in similar technological baskets. One of the clusters specialises in the technology sectors of textile industries, another specialises in agri-food, and the third is devoted to highly sophisticated technology sectors. We observe a similar behaviour of relatedness, though at a smaller scale, in Japanese and South Korean MAs. We also observe similarities among emerging MAs and among highly technological metropolitan areas. Interestingly, the network of similarities among MAs shows a clear geographical boundary between highly developed Asian and Western (European/American) MAs.
We used the Fitness and Complexity framework to understand the economic evolution of MAs and their clusters. In line with previous results, we have shown that Fitness can drive an increase in GDP per capita: MAs with a complex technological basket tend to have higher GDPpc in the following years than MAs developing more basic technologies. Korea and Japan followed this path, especially in the past. In recent years the standout case is China: the complexity of innovation in Chinese MAs is very high, and their GDPpc displays rapid growth. We found that Chinese metropolitan areas are not only able to diversify their innovation patterns by aiming for a more complex technological basket but also do this in a coherent and coordinated way. Indeed, measuring the coherence of the innovation baskets of MAs, we show that a vast majority of MAs with the highest coherence values are Chinese, and we report that this outcome is not due to a restriction to a specific set of technologies. On the contrary, Chinese MAs diversify consistently and coherently. Moreover, a coordinated effort is also evident, with Chinese MAs sharing common sets of technologies. We found that coherent diversification is necessary and arguably as important as fitness to increase the wealth of a metropolitan area, as the highest increase in GDPpc is found in metropolitan areas with both high fitness and coherence. We found that from 1990 to 2010, the top 30 of the patent production Fitness rankings changed drastically. In 1990, many MAs from many rich countries were sitting at the top of the table, with Japan and the US heavily represented, Nagoya and Los Angeles in the top two positions, and only Beijing and Shanghai as Chinese metropolitan areas. By contrast, in 2010, only seven metropolitan areas in the fitness top 30 were not Chinese: six Korean ones and a European one, Frankfurt, with the Chinese Suzhou topping the table.
The theoretical framework presented here can be applied in several scenarios for investigating questions arising from our analysis:
Optimal diversification strategies and technology forecasting for MAs at different scales and capabilities. Our theoretical framework can be applied to study the best diversification strategy for MAs, assessing the best technologies to develop in a city, as in 33,61 . However, some metropolitan areas can diversify as much as they want because their size and capabilities are close to those of a whole country; others cannot diversify their technological products as much because they do not have the resources to do so. Specialisation and diversification are both feasible ways for MAs to compete depending on their resources, acting more as a large firm 40 or as a whole country 35 .
The strategy of Chinese MAs. We found that the Chinese MAs have the most coherent technology diversification and specialisation strategies. These results align with other work such as 62 , but the cause of the observed structured diversification remains unknown: is this behaviour coordinated nationally? A more detailed analysis of the Chinese case could highlight whether China is implementing a long-term, all-purpose strategy for developing technologies, defining a priori the production basket of single MAs. If this is true, can the strategy be copied by other countries, and under which conditions? For instance, some emerging MAs, such as Indian ones, were on a trajectory similar to Chinese MAs, as shown in Fig. 4(a) for New Delhi or Kolkata.
The restricted business of car technologies. The strong signal from MAs dedicated to producing cars is unique and suggests that these metropolitan areas could have trouble diversifying their production. It is not clear yet whether this is a signal of high competitiveness of these kinds of technologies, in which case MAs should specialise to better profit from this production, or whether it is hard for car-focused MAs to implement other technologies. However, with the advent of electric cars and considering the significant technological changes about to occur in the forthcoming years (see, for instance, the ban on fossil-fuel car production by the EU), the future economy of MAs currently producing cars will have to be reshaped. Future studies focusing on optimal diversification strategies and forecasting future technology production could be used to shape technology paths that can help these MAs adapt to such changes.
MA data cleaning and statistical data visualization
In the main text we described how to obtain the final matrices composed of 2169 MAs and 650 technology codes. However, to make the data cleaner, we decided to cut out from the analysis those MAs that are not very active in technology production. We quantify their activity by setting a threshold for MAs as follows. We denote by $\mathrm{RCA}^{y}$ the RCA matrix obtained by calculating the RCA values from the $V^{y}$ matrix defined in the main text. The matrix element $\mathrm{RCA}^{y}_{at}$ denotes the value of RCA related to the technology $t$ of MA $a$ developed in year $y$. By $Y$ we denote the set of years in the database. We then denote by $\overline{\mathrm{RCA}}_{a}$ the average over the years of the overall RCA of MA $a$, i.e., the average over the years of the RCA obtained by summing the RCA values of the entire technology portfolio of $a$: $\overline{\mathrm{RCA}}_{a} = \frac{1}{|Y|} \sum_{y \in Y} \sum_{t} \mathrm{RCA}^{y}_{at}$. Put simply, we filter out MAs that on average are relevant innovators in less than one technology per year.
The remaining metropolitan areas are 1211 and are distributed as shown in Figs. 1 and 2. In these figures we show the distribution of the number of MAs for each country in the database (Fig. 1) and the geographical distribution of MAs available to us around the world (Fig. 2). In both figures, red indicates the MAs in the database that are cut off (i.e. all $a$ whose average total RCA satisfies $\overline{\mathrm{RCA}}_{a} < 1$) and blue those that remain (i.e. all $a$ with $\overline{\mathrm{RCA}}_{a} \geq 1$). We are interested in capturing relatedness among technologies, and to do this we remove from the analysis the MAs that on average make very few technologies. Most of these are from emerging countries, and it will be interesting to study in future work how we can characterise their technological growth. However, in the present work these MAs are not able to help us capture relatedness between technology codes.
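The filter described in this section can be sketched as follows. This is a minimal illustration that assumes the standard Balassa definition of RCA and takes the text literally by summing raw RCA values over technologies; a variant counting only technologies with RCA of at least 1 would be a small change. The names `rca` and `keep_mask` and the toy counts are hypothetical.

```python
import numpy as np

def rca(V):
    """Balassa RCA for one year's MA x technology patent-count matrix V."""
    V = np.asarray(V, dtype=float)
    row = V.sum(axis=1, keepdims=True)  # patents per MA
    col = V.sum(axis=0, keepdims=True)  # patents per technology
    expected = row * col / V.sum()      # counts if every MA mirrored the global mix
    # MAs or technologies with no patents get RCA = 0 instead of a division by zero
    return np.divide(V, expected, out=np.zeros_like(V), where=expected > 0)

def keep_mask(V_by_year, threshold=1.0):
    """True for MAs whose total RCA, averaged over the years, is >= threshold."""
    totals = [rca(V).sum(axis=1) for V in V_by_year]  # total RCA per MA, per year
    return np.mean(totals, axis=0) >= threshold

# Toy example (hypothetical counts): 3 MAs x 2 technologies over 2 years.
V_by_year = [
    np.array([[4, 0], [1, 3], [1, 0]]),
    np.array([[2, 2], [0, 4], [0, 0]]),  # the third MA files no patents this year
]
keep = keep_mask(V_by_year)
print(keep)
```

In this toy case the third MA is active in only one of the two years, so its average total RCA falls below the threshold and it is removed.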

GDP per capita grid example
In this section we show an example of our GDPpc calculation using the grid constructed in the work of Kummu et al. [1]. In Fig. 3 we show the intersection of the grid with the MA of Rome as an example. To calculate GDPpc we average all the grid points within the MA (the red dots in the figure) and, because we are working in 5-year windows, we also average over time in each window.
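The averaging step can be sketched as below. This is a minimal illustration that assumes planar coordinates, an unweighted mean over grid points, and a simple ray-casting test for membership in the MA boundary; the function names, the square "MA" and the GDPpc values are hypothetical and are not taken from the gridded dataset.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that cross the horizontal ray going right from (x, y)
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def ma_gdppc(grid_points, gdppc_by_year, polygon, years):
    """Average GDPpc over the grid points inside the MA and over the years of a window."""
    inside = [i for i, (x, y) in enumerate(grid_points)
              if point_in_polygon(x, y, polygon)]
    values = [gdppc_by_year[year][i] for year in years for i in inside]
    return sum(values) / len(values)

# Toy example (hypothetical values): a square MA, three grid points, a two-year window.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
points = [(0.5, 0.5), (1.5, 1.5), (3.0, 3.0)]  # the last point lies outside the MA
gdppc = {2000: [10, 20, 99], 2001: [12, 22, 99]}
print(ma_gdppc(points, gdppc, square, [2000, 2001]))
```

Only the two points inside the square contribute, so the value outside the boundary has no effect on the result.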

Robustness Coherence test
In this section we present the coherence robustness test mentioned in the main text. This consists of showing the diversification distributions of MAs and calculating the technology network without Chinese MAs. [...] technologies which are closely related; on the other hand, from the trend (green distribution in Figure 4) we see that the 79 Chinese MAs that show a high coherence value are distributed across the diversification spectrum.
In the "technology network without Chinese MAs" test, we recalculate the network of technologies in the same way as in the main text, except that we remove all Chinese MAs. In Figure 5 we show a representation of the network of technologies obtained without Chinese MAs. In Fig. 6a we show the Coherence-Fitness plane, where the coherence coefficients (i.e. product similarities) are computed after removing Chinese MAs from the system; values are averaged over the decade 1995-2005 and the color scale is the percentage change of GDPpc over these years. We represent Chinese MAs with star markers. Considering the ranking of Γ for MAs, 24 of the top 100 are Chinese. To show that Coherence is better able to discriminate changes in GDPpc, we divide both Fitness and Coherence into ten bins and calculate the mean GDPpc variation of all the points in each of the ten bins in Fig. 6b. The Fitness curve is roughly constant, while the Coherence one has an increasing trend. This means that, on average, we find approximately the same value of % ∆ GDPpc at any Fitness, whereas the higher the Coherence, the higher the variation of GDPpc. We also show the lines of best fit of both curves to highlight the difference in the two trends.

Coherence of clusters
In Figure 7, we show the average coherence for each of the clusters we found, obtained by averaging all coherence values of the MAs in the respective clusters. Confirming our finding, the Chinese clusters still show the highest levels of coherence. High-fitness clusters seem to have higher values of coherence: clusters such as the High-Tech or the Japanese/Korean one show on average a higher coherence than the other non-Chinese clusters.

Figure 1. Bipartite metropolitan areas-technology codes network. (a): Pictorial representation of the bipartite metropolitan areas-technology codes network. Each MA is connected to one or more technology sectors. (b): Bipartite network adjacency matrix for the year 2000. A dark dot means that a given technology code is present in a patent made by a given MA.

Figure 2. Projected network of technologies. Each node in these figures is a technology code. The size of the nodes is proportional to the complexity of the technology. (a): Clusters found by the community detection algorithm with a modularity value of 0.56 and a density of 0.032. We could identify the specific significant technologies for most clusters and represent them with corresponding icons in coloured circles. The Electricity and Information cluster (light green on the left) contains the most complex technologies. (b): Clusters found by the community detection algorithm in the technology network with higher resolution, i.e., considering larger communities. In this case, we could identify three regions. Olive green: this region contains technologies closely related to cars; light blue: this macro area contains clusters of technology sectors that we can classify as highly sophisticated technology sectors; pink: clusters related to manufacturing technology sectors. (c) and (d): Projection on the technology network of the RCA values, averaged over the years in the database, of New York (c) and Shanghai (d). The colour scale shows the RCA value of the metropolitan area in a particular technology: redder nodes (technologies) indicate a higher RCA value of the MA for those specific technologies. We note how Shanghai has focused more on manufacturing technologies (pink region of (b)), while New York is strong in electricity and communications technologies (light blue region of (b)).

Figure 3. Projected network of metropolitan areas. Clusters of MAs obtained through the community detection algorithm: the values of modularity and density are 0.68 and 0.012, respectively. Node size is proportional to the fitness of the MA. We notice that the high-tech cluster is the one containing the MAs with the highest fitness.

Figure 4. The Fitness-GDPpc plane in the case of metropolitan areas and their technological production. (a) We trace the trajectory of some MAs from 1990 to 2010 in the Fitness-GDPpc plane. MAs with high fitness show a more significant increase in GDPpc. (b) We show an average vector field of the trajectories from 1995 to 2005. In this plot, we can better visualise how high Fitness leads to increases in GDPpc (most evident in the lower right). In contrast, MAs with low Fitness tend to increase their Fitness first. (c) Trends of all MAs, with trajectories coloured according to the community they belong to. For each community, we highlight the average trajectory.

Figure 5. The Fitness rankings of metropolitan areas. The 30 MAs with the highest fitness are shown, along with the evolution from 1990 to 2000 (a) and from 2000 to 2010 (b). In 1990 many of the metropolitan areas in the top 30 of the fitness rankings were from the US, Europe, Canada and Japan, with only Shanghai and Beijing from China. In 2000, Chinese and Korean MAs appear in the top 30, and in 2010 they dominate the top of the fitness rankings, with Frankfurt as the only European MA, 6 Korean MAs and all others being Chinese.

Figure 6. Fitness vs Coherence to evaluate GDPpc growth. (a) Fitness-Coherence plane. We represent the averages of the measures over the decade 1995-2005, and the colour scale is the fractional change of GDPpc over the years. We observe how the coherence allows discriminating MAs with a more significant positive change in GDPpc. Stars indicate the Chinese MAs. (b) Average fractional change of the GDPpc versus Fitness and Coherence. To highlight that coherence can better discriminate changes in GDPpc, we divide fitness and coherence into ten bins and calculate the mean fractional GDPpc variation of all the points in each of the ten bins. Fitness and coherence display different behaviours. In particular, the fitness curve is roughly constant, highlighting that fitness cannot discriminate different fractional changes of the GDPpc. Coherence, instead, displays a growing trend with the fractional change of GDPpc, i.e. the higher the coherence, the higher % ∆ GDPpc.

Figure 1: Distribution of the number of MAs for each country in the database. Red indicates the MAs in the database that are cut off, blue those that remain. The cut-off criterion is that an MA remains if its average total RCA across all years is at least 1. We want to emphasise that we are interested in capturing relatedness among technologies, and to do this we remove from the analysis the MAs that on average make few technologies.

Figure 2: Geographical distribution of MAs available to us around the world. Red indicates the MAs in the database that are cut off, blue those that remain. The cut-off criterion is that an MA remains if its average total RCA across all years is at least 1.

Figure 3: Example of GDPpc grid in the MA of Rome. The light blue area is the MA of Rome and the red dots are the part of the grid that intersects the MA.

Figure 4: Diversification distribution. We plot the diversification distributions of all the MAs, of the Chinese MAs and of the 79 Chinese MAs with high Coherence, to show that the high Coherence is not caused by low diversification.

Figure 5: Technology code network obtained without Chinese MAs. The size of the nodes is proportional to the complexity of the technology.

Figure 6: Fitness vs Coherence to evaluate GDPpc growth. (a) Coherence Γ vs Fitness F plane. Coherence coefficients (i.e. product similarities) are computed after removing Chinese MAs from the system. Values are the respective averages over the decade 1995-2005 and the color scale is the percentage change of GDPpc over these years. Star markers indicate the Chinese MAs. Considering the ranking of Γ for MAs, 24 of the top 100 are Chinese. (b) Average GDPpc variation versus Fitness and Coherence. Again, coherence coefficients are computed after removing Chinese MAs from the system. To highlight that Coherence is better able to discriminate changes in GDPpc, we divide both Fitness and Coherence into ten bins and calculate the mean GDPpc variation of all the points in each of the ten bins. The Fitness curve is roughly constant, while the Coherence one has an increasing trend. This means that, on average, we find approximately the same value of % ∆ GDPpc at any Fitness, whereas the higher the Coherence, the higher the variation of GDPpc. We also show the lines of best fit of both curves to highlight the difference in the two trends.

Figure 7: The mean coherence of each cluster of metropolitan areas. The clusters containing Chinese cities have the highest average coherence, and high fitness clusters tend to have higher values of coherence as well.