
# Mapping road network communities for guiding disease surveillance and control strategies

## Abstract

Human mobility is increasing in its volume, speed and reach, leading to the movement and introduction of pathogens through infected travelers. An understanding of how areas are connected, the strength of these connections and how this translates into disease spread is valuable for planning surveillance and designing control and elimination strategies. While analyses have been undertaken to identify and map connectivity in global air, shipping and migration networks, such analyses have yet to be undertaken on the road networks that carry the vast majority of travelers in low- and middle-income settings. Here we present methods for identifying road connectivity communities, as well as mapping bridge areas between communities and key linkage routes. We apply these to Africa and show that many highly connected communities straddle national borders. When malaria prevalence and population data are integrated as an example, the communities change, highlighting the regions most strongly connected to areas of high burden. The approaches and results presented provide a flexible tool for supporting the design of disease surveillance and control strategies through mapping areas of high connectivity that form coherent units of intervention and key link routes between communities for targeting surveillance.

## Introduction

The world is continuing to become more connected. The speed and reach of global transport infrastructure are increasing, as are the numbers of travelers using it. An unintended consequence of this is the increasing transport of pathogens, with more outbreaks becoming global pandemics than ever before1. Recent pandemics, such as SARS2, H5N13 and H1N14 spread rapidly between and within countries through the movement of infected travelers by air, land and sea. Moreover, rising global connectivity is facilitating the increasingly rapid spread of drug resistance5,6,7. The growth of transport networks is a key factor in driving the speed and extent of disease spread. While air and shipping networks provide long distance connections, enabling rapid pathogen, host and vector movements8,9,10, the vast majority of shorter distance movements take place over land. This is particularly true in low income settings, where poorer populations are disproportionately affected by infectious diseases and air travel often remains the preserve of wealthier people. Evidence of the importance of regional connectivity through road networks on infectious disease spread is growing11,12,13,14. The 2015 West Africa Ebola outbreak illustrated how the emergence of a disease in a highly inter-connected region with poor surveillance capacity facilitated rapid spread, compared to previous outbreaks in poorly connected areas15,16. The density, pattern and amount of road in a region are indicative of how populations are physically connected. Dense road networks develop to serve the needs of highly populated regions covering many settlements, facilitating and promoting extensive and regular travel between them17,18. In contrast, areas with relatively fewer roads are indicative of lower population densities, lower rates of travel and poorer connectivity. Being able to quantify and map these differing types of regions provides potentially valuable information for tackling infectious diseases.

The study of road network structure has a long tradition. The general approach, which matured as an integration of transportation geography with graph theory19,20, is to model a road system as a network (or graph) and then to study its topological structure. More recently, network science, coupled with the availability of large transport datasets, has boosted the study of road networks21. There is now a considerable body of knowledge on road network structure22,23,24, evolution, and environmental and societal impact25. However, in contrast to other societal networks, the regular and planar nature of road networks precludes the formation of clear communities, i.e. roads that cluster together, shaping areas that are more connected within their boundaries than with external roads.

Highly connected regional communities can promote rapid disease spread within them, but can be afforded protection from recolonization by surrounding regions of reduced connectivity, making them potentially useful intervention or surveillance units6,26,27. For isolated areas, a focused control or elimination program is likely to stand a better chance of success than for areas highly connected to high-transmission or outbreak regions. For example, reaching a required childhood vaccination coverage target in one district is substantially more likely to result in disease control and elimination success if that district is not strongly connected to neighbouring districts where the target has not been met. The identification of ‘bridge’ routes between highly connected regions could also be of value in targeting limited resources for surveillance28. Moreover, progressive elimination of malaria from a region needs to ensure that parasites are not reintroduced into areas that have been successfully cleared, necessitating a planned strategy for phasing that should be informed by connectivity and mobility patterns26. Here we develop methods for identifying and mapping road connectivity communities in a flexible, hierarchical way. We also map ‘bridge’ areas of low connectivity between communities and apply these new methods to the African continent. Finally, we show how these can be weighted by data on disease prevalence to better understand pathogen connectivity, using P. falciparum malaria as an example.

## Data

Data on the African road network (ARN) were obtained from GPS navigation and cartography as described in a previous study24. The dataset maps primary and secondary roads across the continent, and while it does have commercial restrictions, it is a more complete and consistent dataset than alternative open road datasets (e.g. OpenStreetMap29, gROADS30). Visual inspection and comparison between the ARN and other spatial road inventories validated the improved accuracy and consistency of ARN; however, a quantitative validation analysis was not possible due to the lack of consistent ground-truth data at continental scales. Figure 1a shows the African road network data used in this analysis. The road network dataset is a commercially restricted product, and requests for it can be addressed directly to GARMIN31.

### Plasmodium falciparum malaria prevalence and population maps

To demonstrate how geographically referenced data on disease occurrence or prevalence can be integrated into the approaches outlined, gridded data on Plasmodium falciparum malaria prevalence were obtained from the Malaria Atlas Project (http://www.map.ox.ac.uk/). These represent modelled estimates of the prevalence of P. falciparum parasites in 2015 per 5 × 5 km grid square across Africa32. Additionally, gridded data on estimated population totals per 1 × 1 km grid square across Africa in 2015 were obtained from the WorldPop program (http://www.worldpop.org/). The population data were aggregated to the same 5 × 5 km grid as the malaria data, and the two datasets were then multiplied together to obtain estimates of the total number of P. falciparum infections per 5 × 5 km grid square.
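The aggregation and multiplication step can be illustrated with a minimal Python sketch. The function names (`aggregate_blocks`, `infections_per_cell`), the list-of-lists grid layout and the toy values are illustrative assumptions rather than the processing chain actually used; real rasters would also need to be spatially aligned and to handle missing values.

```python
def aggregate_blocks(grid, factor=5):
    """Sum a 2-D grid (list of rows) over non-overlapping factor x factor blocks."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            # Total population of the 1 km cells falling inside one 5 km cell.
            row.append(sum(grid[rr][cc]
                           for rr in range(r, r + factor)
                           for cc in range(c, c + factor)))
        coarse.append(row)
    return coarse


def infections_per_cell(prevalence, population):
    """Cell-wise product of prevalence (a proportion) and population count."""
    return [[m * p for m, p in zip(m_row, p_row)]
            for m_row, p_row in zip(prevalence, population)]


# Toy data (assumed): a 10 x 10 km area with 2 people per 1 km cell.
population_1km = [[2] * 10 for _ in range(10)]
prevalence_5km = [[0.1, 0.2],
                  [0.0, 0.5]]

population_5km = aggregate_blocks(population_1km, factor=5)
infections_5km = infections_per_cell(prevalence_5km, population_5km)
```

Summing (rather than averaging) is the appropriate aggregation here because the population layer holds counts, while prevalence is a proportion that scales those counts.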

## Results

### Detecting communities in the African road network

Nodes in the dual network represent lines in the primal network. The conversion from primal to dual is done by using a modified version of the algorithm known as continuity negotiation37. In brief, we assume that a pair of adjacent edges belongs to the same street if the angle θ between these edges is smaller than θc = 30°. We also assume that the angle between two adjacent edges (i, j) and (j, p) is given by the dot product $$\cos \theta =\frac{{{\bf{r}}}_{i,j}\cdot {{\bf{r}}}_{j,p}}{|{{\bf{r}}}_{i,j}|\,|{{\bf{r}}}_{j,p}|}$$, where $${{\bf{r}}}_{i,j}={{\bf{r}}}_{j}-{{\bf{r}}}_{i}$$. Under these assumptions, the angle between two edges belonging to a perfect straight line is zero, while it assumes a value of 90° for perpendicular edges.
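The angle test can be written down directly from the dot-product definition. The sketch below is only an illustration: it assumes planar (projected) coordinates given as `(x, y)` tuples, and the names `deflection_angle_deg` and `same_street` are hypothetical.

```python
import math

THETA_C_DEG = 30.0  # threshold angle quoted in the text


def deflection_angle_deg(r_i, r_j, r_p):
    """Angle in degrees between edge (i, j) and edge (j, p).

    Returns 0 for a perfectly straight continuation and 90 for a right-angle turn.
    """
    ax, ay = r_j[0] - r_i[0], r_j[1] - r_i[1]   # r_ij = r_j - r_i
    bx, by = r_p[0] - r_j[0], r_p[1] - r_j[1]   # r_jp = r_p - r_j
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        raise ValueError("zero-length edge")
    cos_theta = (ax * bx + ay * by) / norm
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def same_street(r_i, r_j, r_p, theta_c=THETA_C_DEG):
    """True if the two adjacent edges are treated as part of the same street."""
    return deflection_angle_deg(r_i, r_j, r_p) < theta_c
```

For example, `same_street((0, 0), (1, 0), (2, 0.2))` is true because the bend is about 11°, whereas a right-angle turn such as `(0, 0), (1, 0), (1, 1)` is not.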

Our algorithm starts by searching for the edge that generates the longest road in the primal space, as can be seen in Fig. 2a. Then, a node is created in the dual space and assigned to this road. Next, we search for the edge that generates the second longest road, and a new node is created in the dual space and assigned to this road. If there is at least one intersection between the new road and the previous one, we connect the respective nodes in the dual space. The algorithm continues until all the edges in the primal space are assigned to a node in the dual space, as shown in Fig. 2b. Note that the conversion from the primal to the dual road network has been used extensively to estimate human perception and movement along road networks (Space syntax, see36), which also supports our use of road geometry to detect communities.
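A simplified version of this conversion can be sketched in Python. It is not the modified continuity negotiation algorithm itself: as a stated simplification, seeds are taken in order of decreasing edge length (rather than by the length of the road they would generate), and each street is extended greedily through the unassigned edge with the smallest deflection below θc. The function name `primal_to_dual`, the data layout and the toy network are assumptions made for illustration.

```python
import math
from collections import defaultdict


def _deflection(coords, a, b, c):
    """Deflection angle in degrees when travelling a -> b -> c."""
    ax, ay = coords[b][0] - coords[a][0], coords[b][1] - coords[a][1]
    bx, by = coords[c][0] - coords[b][0], coords[c][1] - coords[b][1]
    cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def primal_to_dual(coords, edges, theta_c=30.0):
    """Group primal edges into streets (dual nodes) and link streets that intersect."""
    incident = defaultdict(list)
    for u, v in edges:
        incident[u].append((u, v))
        incident[v].append((u, v))

    def length(e):
        return math.dist(coords[e[0]], coords[e[1]])

    street_of, streets = {}, []
    for seed in sorted(edges, key=length, reverse=True):
        if seed in street_of:
            continue
        sid = len(streets)
        street_of[seed] = sid
        members = [seed]
        # Grow the street outwards from both ends of the seed edge.
        for prev, end in ((seed[0], seed[1]), (seed[1], seed[0])):
            while True:
                best = None
                for e in incident[end]:
                    if e in street_of:
                        continue
                    nxt = e[1] if e[0] == end else e[0]
                    angle = _deflection(coords, prev, end, nxt)
                    if angle < theta_c and (best is None or angle < best[0]):
                        best = (angle, e, nxt)
                if best is None:
                    break
                _, e, nxt = best
                street_of[e] = sid
                members.append(e)
                prev, end = end, nxt
        streets.append(members)

    # Two dual nodes are connected if their streets share a primal node.
    dual_edges = set()
    for inc in incident.values():
        ids = sorted({street_of[e] for e in inc})
        dual_edges.update((a, b) for i, a in enumerate(ids) for b in ids[i + 1:])
    return streets, dual_edges


# Toy crossroads (assumed data): a horizontal and a vertical street meeting at node 1.
coords = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (1, 1), 4: (1, -1)}
edges = [(0, 1), (1, 2), (3, 1), (1, 4)]
streets, dual_edges = primal_to_dual(coords, edges)
```

On this toy crossroads the four primal edges collapse into two dual nodes joined by a single dual edge, which is the behaviour illustrated in Fig. 2a,b.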

Despite the regular structure of the network in the primal space, the topology of these networks in the dual space is very rich. For instance, the degree distribution in dual space follows the power law $$P(k)\sim {k}^{-\gamma }$$. This property has been previously identified in urban networks33 and it is strongly related to the long-tailed distribution of road lengths in these networks (see Fig. 1c). Since most of the roads are short, most of the nodes in dual space will have a small number of connections. On the other hand, there are a few long roads (Fig. 2a) that become hubs in the dual space (Fig. 2b). Our approach for detecting communities in road networks therefore consists of performing classical community detection in the dual representation (Fig. 2c) and then bringing the result back to the primal representation, as shown in Fig. 2d. The algorithm used to detect the communities is the modularity-based algorithm by Clauset, Newman and Moore35.
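To make the modularity-based step concrete, the sketch below runs a naive greedy agglomeration on a toy dual network. It optimises the same objective (modularity Q) as the Clauset, Newman and Moore algorithm but without its efficient data structures, so it is only suitable for tiny graphs; for real networks a library implementation such as NetworkX's `greedy_modularity_communities` would be the more practical choice. The function names and the toy edge list are assumptions.

```python
from itertools import combinations


def modularity(edges, parts):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {n: c for c, nodes in enumerate(parts) for n in nodes}
    inside = [0] * len(parts)   # edges with both ends in community c
    degree = [0] * len(parts)   # summed node degree of community c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(parts)))


def greedy_hierarchy(edges):
    """Return {number of communities: (Q, partition)} for each level of merging."""
    nodes = sorted({n for e in edges for n in e})
    parts = [frozenset([n]) for n in nodes]
    levels = {len(parts): (modularity(edges, parts), parts)}
    while len(parts) > 1:
        candidates = []
        for a, b in combinations(parts, 2):
            # Try merging communities a and b, keeping all others unchanged.
            trial = [p for p in parts if p != a and p != b] + [a | b]
            candidates.append((modularity(edges, trial), trial))
        q, parts = max(candidates, key=lambda t: t[0])
        levels[len(parts)] = (q, parts)
    return levels


# Toy dual network (assumed): two triangles joined by one bridging edge.
dual_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
levels = greedy_hierarchy(dual_edges)
best_k = max(levels, key=lambda k: levels[k][0])
best_q, best_parts = levels[best_k]
```

Because every merge is recorded, `levels[k]` gives a nested partition for any requested number of communities k, while `best_k` is the level at which modularity peaks (here the two triangles).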

The hierarchical mapping of communities on the African road network, with outputs for sets of 10, 20, 30 and 40 communities, is shown in Fig. 3. The maps highlight how connectivity rarely aligns with national borders, with the areas most strongly connected through dense road networks typically straddling two or more countries. The hierarchical nature of the approach is illustrated through the breakdown of the 10 large regions in Fig. 3a into further sub-regions in Fig. 3b, c and d, emphasizing the main structural divides within each region mapped in Fig. 3a. Some large regions appear consistently in each map. For example, a single community spans the entire north African coast, extending south into the Sahara. South Africa appears as wholly contained within a single community, while the Horn of Africa, containing Somalia and much of Ethiopia and Kenya, is consistently mapped as one community. The four maps shown are example outputs, but any number of communities can be identified. The clustering that maximises modularity produces 104 communities, and these are mapped in Fig. 4.

Even with division into 104 communities, the north Africa region remains as a single community, strongly separated from sub-Saharan Africa by large bridge regions. South Africa also remains almost wholly within its own community, with Somalia and Namibia showing similar patterns. The countries with the largest numbers of communities tend to be those with the least dense infrastructure, equating to poor connectivity, such as DRC and Angola, though West Africa also shows many distinct clusters, especially within Nigeria. Apart from the Sahara, the largest bridge regions of poor connectivity are located across the central belt of sub-Saharan Africa, where population densities are low and transport infrastructure is both sparse and often poor. The communities mapped in Figs 3 and 4 align in many cases with recorded population and pathogen movements. For example, the broad southern and eastern community divides match well those seen in HIV-1 subtype analyses12 and community detection analyses based on migration data27. At more regional scales, there also exist similarities with prior analyses based on human and pathogen movement patterns. For example, the western, coastal and northern communities within Kenya in Fig. 4b were identified previously through mobile phone and census derived movement data39,40. Further, Guinea, Liberia and Sierra Leone typically remain mostly within a single community in Fig. 3, with some divides evident in Fig. 4c. This shows some strong similarities with the spread of Ebola virus inferred through genome analysis15, particularly the multiple links between rural Guinea and Sierra Leone, though Fig. 4c highlights a divide between the regions containing Conakry and Freetown when Africa is broken into the 104 communities. Figure 3 highlights the connections between Kinshasa in western DRC and Angola, with the recent yellow fever outbreak spreading within the communities mapped. Figure 4d shows the ‘best’ communities map for an area of southern Africa, and the strong cross-border links between Swaziland, southern Mozambique and western South Africa are mapped within a single community, as well as wider links highlighted in Fig. 3, matching the travel patterns found from Swaziland malaria surveillance data41.

### Integrating P. falciparum malaria prevalence and population data with road networks for weighted community detection

The previous section outlined methods for community detection on unweighted road networks. To integrate disease occurrence, prevalence or incidence data for the identification of areas of likely elevated movement of infections or for guiding the identification of operational control units, an adaptation to weighted networks is required. We demonstrate this through the integration of the data on estimated numbers of P. falciparum infections per 5 × 5 km grid square into the community detection pipeline. The final pipeline for community detection calculated a trade-off between form and function of roads in order to obtain a network partition.

The form is related to the topology of the road network and is taken into account during the primal-dual conversion. The topological component guarantees that only neighbouring and well-connected locations can belong to the same community. The functional part, on the other hand, is calculated by multiplying estimated P. falciparum malaria prevalence by population to obtain estimated numbers of infections, as outlined above.

The two factors were combined to assign a weight to each edge of our primal network. The weight $${w}_{i,j}$$ of edge (i, j) is defined as

$${w}_{i,j}=\frac{1}{|{{\bf{r}}}_{i,j}|}{\int }_{{{\bf{r}}}_{i}}^{{{\bf{r}}}_{j}}m({\bf{r}})\,p({\bf{r}})\,d{\bf{r}}$$
(1)

where m(r) is the P. falciparum malaria prevalence and p(r) is the population count, both at coordinate r. These values are obtained directly from the data. When the primal representation is converted into its dual version, the weights of primal edges, given by Eq. 1, are converted into weights of dual nodes, which are defined as

$${\lambda }_{\bar{i}}=\,{\rm{\max }}({w}_{i,j}),\quad (i,j)\in {{\rm{\Omega }}}_{\bar{i}},$$
(2)

where $$\bar{i}$$ represents the i th dual node and $${{\rm{\Omega }}}_{\bar{i}}$$ represents the set of all the primal edges that were combined together to form the dual node $$\bar{i}$$ (see Fig. 2a,b). Finally, weights for the dual edges are created from the weights of dual nodes, by simply assuming

$${\lambda }_{\bar{i},\bar{j}}=\,{\rm{\max }}({\lambda }_{\bar{i}},{\lambda }_{\bar{j}}).$$
(3)
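Equations 1 to 3 can be illustrated with a short Python sketch. Several choices here are assumptions rather than details given in the text: the line integral in Eq. 1 is approximated by midpoint sampling along a straight segment, the grids are lists of rows indexed as `[row][col]` with square cells of side `cell_size` in planar coordinates, and the function names, the `agg` parameter and the toy values are hypothetical.

```python
def edge_weight(r_i, r_j, prevalence, population, cell_size=5.0, n_samples=100):
    """Approximate Eq. 1: the mean of m(r) * p(r) along the segment r_i -> r_j."""
    total = 0.0
    for k in range(n_samples):
        t = (k + 0.5) / n_samples                   # midpoint rule on [0, 1]
        x = r_i[0] + t * (r_j[0] - r_i[0])
        y = r_i[1] + t * (r_j[1] - r_i[1])
        row, col = int(y // cell_size), int(x // cell_size)
        total += prevalence[row][col] * population[row][col]
    # Dividing the line integral by |r_ij| leaves the average over the samples.
    return total / n_samples


def dual_node_weights(streets, primal_weight, agg=max):
    """Eq. 2: weight of each dual node from the primal edges it groups (max by default)."""
    return {sid: agg(primal_weight[e] for e in members)
            for sid, members in enumerate(streets)}


def dual_edge_weights(dual_edges, node_weight):
    """Eq. 3: weight of a dual edge as the larger of its two dual node weights."""
    return {(a, b): max(node_weight[a], node_weight[b]) for a, b in dual_edges}


# Toy data (assumed): two 5 km cells side by side and a 10 km road crossing both.
prevalence = [[0.1, 0.3]]
population = [[100, 200]]
w_01 = edge_weight((0.0, 2.5), (10.0, 2.5), prevalence, population)

# Two streets meeting at a junction; the remaining primal weights are made up.
streets = [[(0, 1), (1, 2)], [(3, 1), (1, 4)]]
primal_w = {(0, 1): w_01, (1, 2): 12.0, (3, 1): 4.0, (1, 4): 7.5}
node_w = dual_node_weights(streets, primal_w)
edge_w = dual_edge_weights({(0, 1)}, node_w)
```

Passing a different aggregator through `agg` (for example `sum`) is one simple way to experiment with metrics other than the maximum.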

The dual network weighted by values of $${\lambda }_{\bar{i},\bar{j}}$$ was used as input for a weighted community detection algorithm. Ultimately, when the communities detected in the dual space are translated back to primal space, neighbouring locations with similar values of estimated P. falciparum infections belong to the same communities. For the example of P. falciparum malaria used here, the max function was used, representing maximum numbers of infections on each road segment in 2015. This was chosen to identify connectivity to the highest burden areas. Areas with large numbers of infections are often ‘sources’, with infected populations moving back and forward from them, spreading parasites elsewhere6,42. Therefore, mapping which regions are most strongly connected to them is of value. Alternative metrics can be used, however, depending on the aims of the analyses.

The integration of P. falciparum malaria prevalence and population (Fig. 5a) through weighting road links by the maximum values across them produces a different pattern of communities (Fig. 5b) to those based solely on network structure (Fig. 3). The mapping of 20 communities is shown here, as it identifies key regions of known malaria connectivity, as outlined below. The mapping shows areas of key interest in malaria elimination efforts connected across national borders, such as much of Namibia linked to southern Angola43, but the Zambezi region of Namibia more strongly linked to the community encompassing neighbouring Zambia, Zimbabwe and Botswana44. In Namibia, malaria movement communities identified through the integration of mobile phone-based movement data and case-based risk mapping26 show correspondence in mapping a northeast community. Moreover, Swaziland is shown as being central to a community covering southern Mozambique and the malaria endemic regions of South Africa, matching closely the origin locations of the majority of internationally imported cases to Swaziland and South Africa41,45,46. The movements of people and malaria between the highlands and southern and western regions of Uganda, and into Rwanda47, also align with the community patterns shown in Fig. 5b. Finally, though quantifying different factors, the analyses show a similar east-west split to that found in analyses of malaria drug resistance mutations6,48 and malaria movement community mapping27.

## Discussion

The emergence of new disease epidemics is becoming a regular occurrence, and drug and insecticide resistance are continuing to spread around the world. As global, regional and local efforts to eliminate a range of infectious diseases continue and new ones are initiated, an improved understanding of how regions are connected through human transport can therefore be valuable. Previous studies have shown how clusters of connectivity exist within the global air transport network49,50 and shipping traffic network50, but these represent primarily the sources of occasional long-distance disease or vector introductions1,8, rather than the mode of transport that the majority of the population uses regularly. The approaches presented here, focused on road networks, provide a tool for supporting the design of disease and resistance surveillance and control strategies through mapping (i) areas of high connectivity where pathogen circulation is likely to be high, forming coherent units of intervention; (ii) areas of low connectivity between communities that form likely natural borders of lower pathogen exchange; and (iii) key link routes between communities for targeting surveillance efforts.

The outputs of the analyses presented here highlight how highly connected areas consistently span national borders. With infectious disease control, surveillance, funding and strategies principally implemented country by country, this emphasises a mismatch in scales and the need for cross-border collaboration. Such collaborations are being increasingly seen, for example with countries focused on malaria elimination (e.g.51,52), but the outputs here show that the most efficient disease elimination strategies may need to reconsider units of intervention, moving beyond being constrained by national borders. Results from the analysis of pathogen movements elsewhere confirm these international connections (e.g.6,12,41,48), building up additional evidence on how pathogen circulation can be substantially more prevalent in some regions than others.

The approaches developed here provide a complement to other approaches for defining and mapping regional disease connectivity and mobility9. Previously, census-based migration data have been used to map blocks of countries of high and low connectivity27, but these analyses are restricted to national scales and cover only longer-term human mobility. Efforts are being made to extend these to subnational scales53,54, but they remain limited to large administrative unit scales and the same long timescales. Mobile phone call detail records (CDRs) have also been used to estimate and map pathogen connectivity26,40, but the nature of the data means that they do not include cross-border movements, so remain limited to national-level studies. An increasing number of studies are uncovering patterns in human and pathogen movements and connectivity through travel history questionnaires (e.g.41,47,55,56), resulting in valuable information, but typically limited to small areas and short time periods.

There exist a number of limitations to the methods and outputs presented here that future work will aim to address. Firstly, the hierarchies of road types are not currently taken into account in the network analyses, meaning that a major highway and small local roads contribute equally to community detection and epidemic spreading. However, the lack of reliable data on road typologies, and inconsistencies in classifications between countries, make this challenging to incorporate. Moreover, the relative importance of a major road versus secondary and tertiary roads and tracks is exceptionally difficult to quantify within a country, let alone between countries and across Africa. Finally, data on seasonal variations in road access do not exist consistently across the continent. Our focus has therefore been on connectivity, in terms of how well regions are connected based on existing road networks, irrespective of the ease of travel. A broader point that deserves future research is that while intuition suggests a correspondence in most places, connectivity may not always translate into human or pathogen movement.

Future directions for the work presented here include quantitative comparison and integration with other connectivity data, the integration of different pathogen weightings, and the extension to other regions of the world. Qualitative comparisons outlined above show some good correspondence with analyses of alternative sources of connectivity and disease data. A future step will be to compare these different connections and communities quantitatively to examine the weight of evidence for delineating areas of strong and weak connectivity. This could potentially follow similar studies looking at community structure on weighted networks, such as in the US based on commuting data57, or the UK and Belgium from mobile network data58,59. Here, P. falciparum malaria was used to provide an example of the potential for weighting analyses by pathogen occurrence, prevalence, incidence or transmission suitability. Moreover, future work will examine the integration of alternative pathogen weightings. The maximum difference method was used here to pick out regions well connected to areas of high P. falciparum burden, but the potential exists to use different weighting methods depending on requirements, strategic needs, and the nature of the pathogen being studied.

Despite the rapid growth of air travel, shipping and rail in many parts of the world, roads continue to be the dominant route on which humans move on sub-national, national and regional scales. They form a powerful force in shaping the development of areas, facilitating trade and economic growth, but also bringing with them the exchange of pathogens. Results here show, however, that their connectivity is not equal, with strong clusters of high connectivity separated by bridge regions of low network density. These structures can have a significant impact on how pathogens spread, and by mapping them, a valuable evidence base to guide disease surveillance as well as control and elimination planning can be built.

## Methods


## Change history

### 25 July 2018

A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has not been fixed in the paper.

## References

1. Tatem, A. J., Rogers, D. J. & Hay, S. Global transport networks and infectious disease spread. Adv. parasitology 62, 293–343 (2006).
2. Peiris, J., Guan, Y. & Yuen, K. Severe acute respiratory syndrome. Nat. medicine 10, S88 (2004).
3. Webster, R. G. & Govorkova, E. A. H5n1 influenza—continuing evolution and spread. New Engl. J. Medicine 355, 2174–2177 (2006).
4. Trifonov, V., Khiabanian, H. & Rabadan, R. Geographic dependence, surveillance, and origins of the 2009 influenza a (h1n1) virus. New Engl. journal medicine 361, 115–119 (2009).
5. Marais, B. J. The global tuberculosis situation and the inexorable rise of drug-resistant disease. Adv. Drug Deliv. Rev. 102, 3–9 (2016).
6. Lynch, C. & Roper, C. The transit phase of migration: circulation of malaria and its multidrug-resistant forms in africa. PLoS medicine 8, e1001040 (2011).
7. Hupalo, D. N. et al. Population genomics studies identify signatures of global dispersal and drug resistance in plasmodium vivax. Nat. genetics 48, 953 (2016).
8. Tatem, A. et al. Air travel and vector-borne disease movement. Parasitol. 139, 1816–1830 (2012).
9. Tatem, A. J. Mapping population and pathogen movements. Int. health 6, 5–11 (2014).
10. Lemey, P. et al. Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza h3n2. PLoS pathogens 10, e1003932 (2014).
11. Moustafa, A. et al. The blood dna virome in 8,000 humans. PLOS Pathog. 13, 1–20 (2017).
12. Tatem, A. J., Hemelaar, J., Gray, R. R. & Salemi, M. Spatial accessibility and the spread of hiv-1 subtypes and recombinants. Aids 26, 2351–2360 (2012).
13. Faria, N. R. et al. The early spread and epidemic ignition of hiv-1 in human populations. Sci. 346, 56–61 (2014).
14. Kraemer, M. U. et al. Spread of yellow fever virus outbreak in angola and the democratic republic of the congo 2015–16: a modelling study. The Lancet Infect. Dis. 17, 330–338 (2017).
15. Dudas, G. et al. Virus genomes reveal factors that spread and sustained the ebola epidemic. Nat. 544, 309–315 (2017).
16. Wesolowski, A. et al. Commentary: containing the ebola outbreak-the potential and challenge of mobile network data. PLoS currents 6 (2014).
17. Scott, A. J. World development report 2009: reshaping economic geography (2009).
18. Linard, C., Gilbert, M., Snow, R. W., Noor, A. M. & Tatem, A. J. Population distribution, settlement patterns and accessibility across africa in 2010. PLOS ONE 7, 1–8 (2012).
19. Garrison, W. L. & Marble, D. The structure of transportation networks. Technical report (1962).
20. Haggett, P. & Chorley, R. J. Network analysis in geography, vol. 67 (Edward Arnold London, 1969).
21. Barthélemy, M. Spatial networks. Phys. Reports 499, 1–101 (2011).
22. Strano, E., Nicosia, V., Latora, V., Porta, S. & Barthélemy, M. Elementary processes governing the evolution of road networks. Sci. Rep. 2 (2012).
23. Strano, E. et al. Urban street networks, a comparative analysis of ten european cities. Environ. Plan. B: Plan. Des. 40, 1071–1086 (2013).
24. Strano, E. et al. The scaling structure of the global road network. Royal Soc. Open Sci. 4 (2017).
25. Porta, S. et al. Street centrality and densities of retail and services in Bologna, Italy. Environ. Plann. B 36, 450–465 (2009).
26. Tatem, A. J. et al. Integrating rapid risk mapping and mobile phone call record data for strategic malaria elimination planning. Malar. journal 13, 52 (2014).
27. Tatem, A. J. & Smith, D. L. International population movements and regional plasmodium falciparum malaria elimination strategies. Proc. Natl. Acad. Sci. 107, 12222–12227 (2010).
28. Wangdi, K., Gatton, M. L., Kelly, G. C. & Clements, A. C. Cross-border malaria: A major obstacle for malaria elimination. Adv. parasitology 89, 79–107 (2015).
29. Open Street Map. URL https://www.openstreetmap.org.
30. Center for International Earth Science Information Network -CIESIN- Columbia University, Information Technology Outreach Services -ITOS- University of Georgia. Global Roads Open Access Data Set, Version 1 (gROADSv1). Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC). URL https://doi.org/10.7927/H4VD6WCT (2013).
31.
32. Bhatt, S. et al. The effect of malaria control on plasmodium falciparum in africa between 2000 and 2015. Nat. 526, 207–211 (2015).
33. Porta, S., Crucitti, P. & Latora, V. The network analysis of urban streets: a primal approach. Environ. Plan. B: Plan. Des. 33, 705 (2006).
34. Masucci, A. P., Smith, D., Crooks, A. & Batty, M. Random planar graphs and the london street network. The Eur. Phys. J. B-Condensed Matter Complex Syst. 71, 259–271 (2009).
35. Clauset, A., Newman, M. E. & Moore, C. Finding community structure in very large networks. Phys. review E 70, 066111 (2004).
36. Rosvall, M., Trusina, A., Minnhagen, P. & Sneppen, K. Networks and cities: An information perspective. Phys. Rev. Lett. 94, 028701 (2005).
37. Porta, S., Crucitti, P. & Latora, V. The network analysis of urban streets: A dual approach. Phys. A Stat. Mech. its Appl. 369, 853–866 (2006).
38. Newman, M. E. J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006).
39. Wesolowski, A. et al. The use of census migration data to approximate human movement patterns across temporal scales. PloS one 8, e52971 (2013).
40. Wesolowski, A. et al. Quantifying the impact of human mobility on malaria. Sci. 338, 267–270 (2012).
41. Tejedor-Garavito, N. et al. Travel patterns and demographic characteristics of malaria cases in swaziland, 2010–2014. Malar. J. 16, 359 (2017).
42. Pindolia, D. K. et al. Human movement data for malaria control and elimination strategic planning. Malar. journal 11, 205 (2012).
43. Smith, J. L. et al. Malaria risk in young male travellers but local transmission persists: a case–control study in low transmission namibia. Malar. journal 16, 70 (2017).
44. Simon, C. et al. Malaria control in botswana, 2008–2012: the path towards elimination. Malar. journal 12, 458 (2013).
45. Raman, J. et al. Reviewing south africa’s malaria elimination strategy (2012–2018): progress, challenges and priorities. Malar. journal 15, 438 (2016).
46. Koita, K. et al. Targeting imported malaria through social networks: a potential strategy for malaria elimination in swaziland. Malar. journal 12, 219 (2013).
47. Lynch, C. A. et al. Association between recent internal travel and malaria in ugandan highland and highland fringe areas. Trop. medicine & international health 20, 773–780 (2015).
48. Pearce, R. J. et al. Multiple origins and regional dispersal of resistant dhps in african plasmodium falciparum malaria. PLoS medicine 6, e1000055 (2009).
49. Guimera, R., Mossa, S., Turtschi, A. & Amaral, L. N. The worldwide air transportation network: Anomalous centrality, community structure, and cities’ global roles. Proc. Natl. Acad. Sci. 102, 7794–7799 (2005).
50. Kaluza, P., Kölzsch, A., Gastner, M. T. & Blasius, B. The complex network of global cargo ship movements. J. Royal Soc. Interface 7, 1093–1103 (2010).
51. APMEN, Asian Pacific Malaria Elimination Network. http://apmen.org.
52.
53. Sorichetta, A. et al. Mapping internal connectivity through human migration in malaria endemic countries. Sci. data 3, 160066 (2016).
54. Ruktanonchai, N. W. et al. Census-derived migration data as a tool for informing malaria elimination policy. Malar. journal 15, 273 (2016).
55. Marshall, J. M. et al. Key traveller groups of relevance to spatial malaria transmission: a survey of movement patterns in four sub-saharan african countries. Malar. journal 15, 200 (2016).
56. Bradley, J. et al. Infection importation: a key challenge to malaria elimination on bioko island, equatorial guinea. Malar. journal 14, 46 (2015).
57. Nelson, G. D. & Rae, A. An economic geography of the united states: from commutes to megaregions. PloS one 11, e0166083 (2016).
58. Ratti, C. et al. Redrawing the map of great britain from a network of human interactions. PloS one 5, e14248 (2010).
59. Expert, P., Evans, T. S., Blondel, V. D. & Lambiotte, R. Uncovering space-independent communities in spatial networks. Proc. Natl. Acad. Sci. 108, 7663–7668 (2011).

## Acknowledgements

E.S. has been supported by funding from the Swiss National Science Foundation. AJT is supported by funding from the Bill & Melinda Gates Foundation (OPP1106427, 1032350, OPP1134076, OPP1094793), the Clinton Health Access Initiative, the UK Department for International Development (DFID) and the Wellcome Trust (106866/Z/15/Z, 204613/Z/16/Z).

## Author information

### Affiliations

1. Department of Civil and Environmental Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA: Emanuele Strano
2. German Aerospace Center (DLR), German Remote Sensing Data Center (DFD), Oberpfaffenhofen, D-82234, Wessling, Germany: Emanuele Strano
3. IBM Research Brazil, Sao Paulo, Brazil: Matheus P. Viana
4. WorldPop, Department of Geography and Environment, University of Southampton, Highfield, Southampton, UK: Alessandro Sorichetta & Andrew J. Tatem
5. Flowminder Foundation, Stockholm, Sweden: Alessandro Sorichetta & Andrew J. Tatem

### Contributions

E.S., M.P.V. and A.J.T. conceived and designed the analyses. E.S. and M.P.V. designed the road network community mapping methods and undertook the analyses. All authors contributed to writing and reviewing the manuscript.

### Competing Interests

The authors declare no competing interests.

### Corresponding authors

Correspondence to Emanuele Strano or Andrew J. Tatem.