Predicting success in the worldwide start-up network

By drawing on large-scale online data, we construct and analyze the time-varying worldwide network of professional relationships among start-ups. The nodes of this network represent companies, while the links model the flow of employees and the associated transfer of know-how across companies. We use network centrality measures to assess, at an early stage, the likelihood of the long-term positive economic performance of a start-up. We find that the start-up network has predictive power and that by using network centrality we can provide valuable recommendations, sometimes doubling the current state-of-the-art performance of venture capital funds. Our network-based approach supports the theory that the position of a start-up within its ecosystem is relevant for its future success, while at the same time offering an effective complement to the labour-intensive screening processes of venture capital firms. Our results can also enable policy-makers and entrepreneurs to conduct a more objective assessment of the long-term potential of innovation ecosystems, and to target their interventions accordingly.

co-location of firms and their proximity to universities 12. Other studies have analyzed social networks (e.g., inventor collaboration networks, interlocking directorates) to unveil the microscopic level of interactions among individuals; yet their scope has been limited mostly to specific industries or small geographic areas, and to a fairly small observation period 11,13,14. Owing to lack of data, what still remains to be studied is the global network underpinning knowledge exchange in the worldwide innovation ecosystem. Equally, the competitive advantage of differential information-rich network positions and their role in opening up, expediting, or obstructing pathways to firms' long-term success have been left largely unexplored.

The World-Wide Network of Start-Ups
Here we study the complex time-varying network 15,16 of interactions among all start-ups in the worldwide innovation ecosystem over a period of 26 years (1990-2015). To this end, we collected all data on firms and people (i.e., founders, employees, advisors, investors, and board members) available from the www.crunchbase.com website. Drawing on the data, we first constructed a bipartite graph in which people are connected to start-ups according to their professional role. We then obtained the projected one-mode time-varying graph in which start-ups are the nodes and two companies are connected when they share at least one individual who plays or has played a professional role in both companies (see Supplementary Information (SI) for details). At the micro scale, employees working in a company can perceive the intrinsic value of new appealing opportunities and switch companies accordingly. This mobility creates a flow of information and intelligence between companies, such that those receiving employees increase their fitness by capitalizing on the know-how the employees bring with them. These microscopic dynamics are thus captured and modelled here by the creation of new edges at the level of the network of start-ups. As a consequence, companies that mobile employees perceive at the micro scale as appealing opportunities will likely boost their connectivity and therefore acquire a more central position in the overall time-varying network. Note that ideas revolving around the hypothesis that the position of a start-up within its ecosystem is relevant for its future success have been previously discussed by some authors 17,18 and more recently formalised by Sorkin 19.
For simplicity, we assume that edges between companies are undirected, which reflects knowledge sharing more than knowledge transfer: while the movement of an employee from company A to company B certainly boosts the know-how of B, and under our approach should thus increase its centrality, it does not necessarily decrease the know-how and centrality of A. Similarly, we assume that memory is present and thus keep all edges in the network, i.e., edges are not deleted over time, as know-how is not necessarily destroyed (see SI Section 5.4 for details).
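The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for the study: the `(person, company, start_month)` record layout, the function name `project_to_company_graph`, and the toy records are all hypothetical. A role is assumed to count from its start month onwards and never to expire, which mirrors the choice of undirected edges that are not deleted over time.

```python
from collections import defaultdict
from itertools import combinations


def project_to_company_graph(roles, month):
    """Return the undirected company-company edges present at `month`.

    `roles` is an iterable of (person, company, start_month) tuples
    (a hypothetical schema). Two companies are linked once a person
    has started a role in both; links are never removed afterwards.
    """
    # Bipartite step: collect the companies each person has joined so far.
    companies_by_person = defaultdict(set)
    for person, company, start_month in roles:
        if start_month <= month:
            companies_by_person[person].add(company)

    # One-mode projection: link every pair of companies sharing a person.
    edges = set()
    for companies in companies_by_person.values():
        for a, b in combinations(sorted(companies), 2):
            edges.add((a, b))  # sorted pair, so (a, b) stands for an undirected edge
    return edges


# Toy records (illustrative only): months are integer indices.
roles = [
    ("p1", "A", 5),
    ("p1", "B", 10),
    ("p1", "C", 20),
    ("p2", "B", 12),
    ("p2", "D", 18),
]

edges_m10 = project_to_company_graph(roles, 10)
edges_m20 = project_to_company_graph(roles, 20)
print(sorted(edges_m10))
print(sorted(edges_m20))
```

Because roles are only ever added, the edge set at an earlier month is always a subset of the edge set at a later month, which is the "memory" assumption in code form.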
The resulting time-varying World Wide Start-up (WWS) network comprises 41,830 companies distributed across 117 countries around the globe, and 135,099 links among them (see SI Figs. S1 and S2). Figure 1A highlights the countries in which start-ups have joined, over time, the largest connected component of the network 15,16. Figure 1B indicates that the number of nodes and links in the WWS network has grown exponentially over the last 26 years. In the same period, various communities of start-ups around the globe have joined together to form the largest connected component, which includes about 80% of the nodes of the network (Fig. 1C). Currently, the WWS network is characterized by an average of 4.74 "degrees of separation" between any two companies.
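The two summary quantities mentioned here, the share of nodes in the largest connected component and the average number of "degrees of separation", can be computed with breadth-first search. The sketch below is illustrative only: the helper names and the toy edge list are assumptions, the average is taken over pairs inside the largest component, and companies without any link are not represented in the adjacency map.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbours map."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def bfs_distances(adjacency, source):
    """Shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def largest_component(adjacency):
    """Return the node set of the largest connected component."""
    seen, best = set(), set()
    for node in adjacency:
        if node not in seen:
            component = set(bfs_distances(adjacency, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def average_separation(adjacency, component):
    """Mean shortest-path length over all pairs within `component`."""
    total, pairs = 0, 0
    for node in component:
        dist = bfs_distances(adjacency, node)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the node itself
    return total / pairs if pairs else 0.0


# Toy network (illustrative only): a path A-B-C-D plus a separate pair E-F.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
adjacency = build_adjacency(toy_edges)
giant = largest_component(adjacency)
giant_fraction = len(giant) / len(adjacency)
separation = average_separation(adjacency, giant)
print(sorted(giant), round(giant_fraction, 3), round(separation, 3))
```

Running one search per node is adequate for a network of tens of thousands of nodes; much larger graphs would typically call for sampling source nodes.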
At the micro scale, Fig. 1E shows a snapshot of the network of interactions between Airbnb and other companies based on shared individuals. As an illustration, in 2013 Airbnb hired Mr Thomas Arend (highlighted in the red square), who had previously acted as a senior product manager at Google, as an international product leader at Twitter, and as a product manager at Mozilla. As previously pointed out, the professional network thus reveals the potential flow of knowledge between Airbnb and the three other companies in which Mr Arend had played a role. Moreover, as new links were forged over time, the topological distances from Airbnb to all other firms in the WWS network were reduced, which in turn enabled Airbnb to gain new knowledge and tap business opportunities beyond its immediate local neighborhood.
The mechanistic interpretation of employees' mobility inducing link creation, discussed above and illustrated in Fig. 1E, suggests that the potential exposure to knowledge of a start-up in the WWS network, and its subsequent likelihood to excel in the future, should be well captured by its network centrality over time. To test this hypothesis we have considered different measures of node centrality 20. For parsimony, here we focus on the results obtained using closeness centrality, as it assesses the centrality of a node in the network from its average distance from all the other nodes, although similar results have also been found using some other centrality measures, such as betweenness or degree (see SI). In each month of the observation period, we ranked companies according to their values of closeness centrality (i.e., top nodes are firms with the highest closeness). Figure 1D illustrates the large variety of observed trajectories as companies moved towards higher or lower ranks, i.e., as they obtained a larger or smaller proximity to all other companies in the network. Notice that Apple has always been in the Top 10 firms over the entire period, while Microsoft exhibited an initial decline followed by a constant rise towards the central region of the network. The trajectories of younger start-ups, such as Facebook, Airbnb, and Uber, are instead characterized by an abrupt and swift move to the highest positions of the ranking soon after their foundation, possibly as a result of the boost in activity that has characterized the venture capital industry in recent years.
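The monthly ranking step can be sketched as follows. The main text does not specify how nodes outside the largest connected component are handled, so this example assumes the Wasserman-Faust scaling, in which the inverse average distance to the reachable nodes is multiplied by the fraction of the network that is reachable; that choice, the function names, and the toy network are assumptions made for illustration.

```python
from collections import deque


def closeness_scores(adjacency):
    """Closeness centrality of every node in an undirected graph.

    Assumed variant (Wasserman-Faust scaling): with r reachable nodes
    at total distance s in a graph of n nodes, the score is
    (r / (n - 1)) * (r / s).
    """
    n = len(adjacency)
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path lengths from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        reachable = len(dist) - 1
        total = sum(dist.values())
        if reachable == 0:
            scores[source] = 0.0  # isolated node
        else:
            scores[source] = (reachable / (n - 1)) * (reachable / total)
    return scores


def rank_by_closeness(adjacency):
    """Companies sorted from most to least central (ties broken by name)."""
    scores = closeness_scores(adjacency)
    return sorted(scores, key=lambda company: (-scores[company], company))


# Toy snapshot for one month (illustrative only).
adjacency = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": {"F"},
    "F": {"E"},
}
scores = closeness_scores(adjacency)
ranking = rank_by_closeness(adjacency)
print(ranking)
```

Applying `rank_by_closeness` to each monthly snapshot of the network yields one ranking per month, from which rank trajectories of individual companies can be traced.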

Early-Stage Prediction of High Performance
To investigate the interplay between the position of a given firm in the WWS network and its long-term economic performance, we collected additional data from www.crunchbase.com on funding rounds, acquisitions, and initial public offerings (IPOs). For each month t, we obtained the list of
Our recommendation method is based on the hypothesis that start-ups with higher values of closeness centrality at an early stage are more likely to show signs of positive long-term economic performance. Accordingly, we counted the total number m(t) of firms inside the open-deal list that, within a time window Δt = 7 years starting at month t, succeeded in securing at least one of the following positive outcomes: (i) they took over one or more firms; (ii) they were acquired by one or more firms; or (iii) they underwent an IPO. To assess the accuracy of our recommendation method in identifying successful companies early, we checked how many of the Top
In Fig. 2B, we characterize the overall performance of the recommendation method over the entire period of observation. Results indicate that about 30% of the firms appearing in the Top 20 in any month from 2000 to 2009 have indeed achieved a positive economic outcome within 7 years of our recommendation. The black error bars indicate the expected success rates and standard deviations in the case of random ordering of companies (p-values in this case are all below 10⁻⁵). Interestingly, the random null model provides an expected success rate that is comparable to the actual performance that private investors focusing on early-stage start-ups such as those considered in our prediction (e.g., accelerators and incubators such as 500 Startups, Y Combinator, Techstars and Wayra, whose target companies comply with our definition of open-deal list) reach through costly and labour-intensive screening processes (see SI Section 4.2 for details), while the performance of our recommendation method is considerably superior.
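As a rough illustration of how a Top-N success rate can be compared with random ordering, the sketch below treats the random null model as drawing N companies uniformly without replacement from the open-deal list, so that the number of successes follows a hypergeometric distribution. Whether the reported error bars and p-values were obtained analytically in this way or by simulation is not stated in the main text, so this is an assumption; the function names and the toy data are hypothetical.

```python
from math import comb, sqrt


def top_n_success_rate(ranked, successful, n):
    """Fraction of the n highest-ranked companies that later succeeded."""
    top = ranked[:n]
    return sum(1 for company in top if company in successful) / len(top)


def random_baseline(list_size, m, n):
    """Mean and standard deviation of the success rate under random ordering.

    Assumes n companies drawn without replacement from a list of
    `list_size` companies, of which m are successful (hypergeometric).
    """
    p = m / list_size
    variance = p * (1 - p) * (list_size - n) / (list_size - 1) / n
    return p, sqrt(variance)


def hypergeom_p_value(list_size, m, n, k):
    """Probability of at least k successes among n random draws."""
    tail = sum(
        comb(m, i) * comb(list_size - m, n - i)
        for i in range(k, min(m, n) + 1)
    )
    return tail / comb(list_size, n)


# Toy open-deal list for one month (illustrative only), ordered by rank.
ranked = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9"]
successful = {"c0", "c1", "c5"}  # companies with a positive outcome

rate = top_n_success_rate(ranked, successful, 2)
mean, std = random_baseline(len(ranked), len(successful), 2)
p_value = hypergeom_p_value(len(ranked), len(successful), 2, 2)
print(rate, mean, round(std, 3), round(p_value, 4))
```

In this toy case both top-ranked companies succeed, while a random pick of two would be expected to contain successes at the base rate of the list (3 out of 10).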
We further checked the robustness of our methodology by replicating the analysis based on the Top 50 and Top 100 (reported in Fig. 2B), for two additional time windows Δt = 6 and Δt = 8 years (see SI Fig. S6) and an alternative method of aggregation of the success rate across the entire observation period (see SI Fig. S7). We also controlled for different confounding factors such as start-up size, geographical location or structural role of venture capital funds in the start-up network, finding that our conclusions hold (see SI Section 5).
Finally, notice that the method presented here only provides a simple heuristic recommendation, i.e., it does not quantify the probability that each start-up in the open-deal list will show economic success in the future. In SI Section 6 we further studied this possibility by using a suite of logistic regression methods to predict the success of each and every start-up in the open-deal list. We indeed found that a snapshot of the closeness centrality ranking of a given start-up could predict its future economic outcome (F1 score = 0.6), in qualitative agreement with the findings in Fig. 2.

Implications
As lack of data and subjective biases inevitably impede a proper and rigorous evaluation of risky and newly established innovative activities, our study has indicated that the network of professional relationships among start-ups can unlock the long-term potential of risky ventures whose economic net present value would otherwise be difficult to measure. Our recommendation method can help stakeholders devise and fine-tune a number of effective strategies, simply based on the underlying network. Employees, business consultants, board members, bankers and lenders can identify the opportunities with the highest long-term economic potential. Individual and institutional investors can discern financial deals and build appropriate portfolios that most suit their investment preferences. Entrepreneurs can hone their networking prowess and strategies for sustaining professional inter-firm partnering and securing a winning streak over the long run. Finally, governmental bodies and policy-makers can