Main

Antimicrobials are used in agriculture to treat disease, prophylactically to prevent infections in healthy animals, and to increase productivity1. However, the routine use of antimicrobials as surrogates for good hygiene practices on farms2,3 is driving a rise in antimicrobial resistance (AMR), with increasingly serious consequences for animal health1,4 and potentially for human health5,6.

Globally, 73% of antimicrobials are used in animals7, with China being the largest consumer of antimicrobials in absolute terms (41,967 t in 2017) and the second largest consumer in relative terms with 200 mg kg⁻¹ (ref. 8) (Supplementary Fig. 1a,b). In comparison, Denmark and the Netherlands use 39 and 56 mg kg⁻¹, respectively (ref. 9), while maintaining a productive livestock sector. Multiple factors may contribute to antimicrobial overuse in China. Meat production has grown by 560% since 1979 (FAOSTAT, http://www.fao.org/faostat/en/#data/QL), which could have made farmers reliant on antimicrobials to prevent infections. Veterinary antimicrobials are reportedly accessible without prescriptions10 and are sold at low prices in comparison to other countries11. As in many other low- and middle-income countries (LMICs)12,13, farmers predominantly obtain antimicrobials from local drug stores where vendors also provide medical advice without veterinary training10,14. Additionally, enforcing the existing regulations10 on the compounds authorized in animals, or the recently announced ban on growth promoters15, remains a formidable challenge in a country where 360 million people are active in agriculture (World Bank, https://data.worldbank.org/indicator/SL.AGR.EMPL.ZS). In the last 5 years, China has reported multiple first emergences of resistance genes to last-resort antimicrobials such as colistin and tigecycline16,17, and a recent global analysis suggested that China may have become one of the largest hotspots of resistance among LMICs4, ranking eighth in relative terms, and first in absolute terms, for the animal-associated burden of AMR amongst LMICs (Supplementary Fig. 1d,e).

In high-income countries, epidemiological evidence collected by surveillance systems guides AMR responses and provides a baseline for evaluating policy targets. The US Food and Drug Administration collects meat samples from retail premises and slaughterhouses to monitor AMR levels (https://www.fda.gov/animal-veterinary/antimicrobial-resistance/national-antimicrobial-resistance-monitoring-system); the European Food Safety Authority (https://www.efsa.europa.eu/en/topics/topic/antimicrobial-resistance) serves a comparable role by amalgamating the surveillance efforts of its member states. The majority of LMICs—including China—either lack systematic surveillance systems or do not publicly report data from animal AMR surveillance4. Despite these challenges, China could act as a leader for guiding the international response to AMR—because its domestic policies may have far-reaching benefits for neighbouring countries and for its numerous trading partners18.

Point-prevalence surveys (PPSs) published independently by veterinarians constitute an alternative source for documenting AMR trends (N.G. Criscuolo et al., submitted), and inferences can be made to map AMR using a large collection of PPSs4. However, adapting this approach to the Chinese context requires building a critical mass of PPSs, including surveys in Chinese to train geospatial models. Accurate maps of disease prevalence have been generated19,20,21,22, but few used the associated uncertainty maps to inform field sampling campaigns23,24. In particular, as prediction uncertainty grows with distance from existing surveys, an uncertainty map can help identify the location where conducting new surveys could be most valuable to improve the confidence level of a prevalence map. Repeating this process iteratively can guide long-term surveillance efforts.

In this article we use event-based surveillance data to map trends in AMR in animals and associated uncertainty levels. We identify regions where future surveillance efforts could be intensified to reduce uncertainty on the geographic distribution of AMR in China. In a context of competing disease control priorities, our approach helps optimally target the limited resources dedicated to event-based surveillance of AMR.

Results

Data

We identified 446 PPSs reporting AMR in food animals in China between 2000 and 2019 (Supplementary Text S1). This corresponds to one survey per 470,177 t of food animals annually (28th rank among LMICs; Supplementary Fig. 1c). We collected data on four common indicator bacteria: Escherichia coli (184 PPSs), non-typhoidal Salmonella spp. (131 PPSs), Staphylococcus aureus (131 PPSs) and Campylobacter spp. (33 PPSs). The 446 PPSs included 6,295 resistance rates. We defined a composite metric of AMR to summarize trends in resistance across multiple drugs and bacteria. For each survey, we calculated the proportion of antimicrobial compounds with resistance higher than 50% (P50) (Supplementary Fig. 2).

Temporal trends

In pigs, between 2000 and 2019, P50 increased significantly in E. coli (+59%), Salmonella (+148%) and S. aureus (+85%) (Fig. 1a–c). In contrast, in chicken, P50 was stable in E. coli, Salmonella and S. aureus, with mean P50s of 60%, 42% and 37%, respectively (Fig. 1d–f). In cattle, P50 increased significantly in E. coli (+167%; Fig. 1g), and was stable in Salmonella and S. aureus, with mean P50s of 23% and 31%, respectively (Fig. 1h,i).

Fig. 1: Antimicrobial resistance between 2000 and 2019.
a–i, P50 values for pigs (a–c), chicken (d–f) and cattle (g–i), showing resistance in E. coli (a,d,g), Salmonella (b,e,h) and S. aureus (c,f,i). Mean refers to the mean P50 value of all surveys. C1 is the coefficient associated with the temporal trend in a logistic regression model weighted by log10-transformed sample size in each survey. Shaded areas indicate 95% confidence intervals. ***P < 0.001; **P < 0.01; *P < 0.05. Nsurveys is the number of surveys and Nrates is the total number of resistance rates reported in the surveys. Surveys conducted at multiple locations in the same publication are considered multiple surveys.

Prevalence of resistance across antimicrobial classes

For each drug–bacteria–animal combination, we estimated the prevalence of resistance (R%), and calculated the centre of mass of the probability density distribution of the prevalence of resistance across PPSs (Methods; Fig. 2). The prevalence of resistance of tetracyclines, sulfonamides and penicillins was high across all tested bacterial species between 2010 and 2019 (R% > 25%). In comparison, the prevalence of resistance has remained at low levels in polymyxins and cephalosporins (R% < 10% for at least one bacterial species tested in one animal species). For all antimicrobial classes, the prevalence of resistance in E. coli in chicken and pigs increased after 2010, except tetracyclines which already had a high prevalence of resistance (R% > 90%) in pigs before 2010. In Salmonella, an increase in the prevalence of resistance after 2010 was observed in penicillins in chicken, as well as in sulfonamides, penicillins and tetracyclines in pigs.

Fig. 2: Prevalence of resistance per antimicrobial class.
a–c, Prevalence of resistance in E. coli from chicken, pigs and cattle (a), in Salmonella from chicken and pigs (b) and in S. aureus from cattle (c). In each panel, the x axis represents resistance rates and the y axis represents the probability density. The area under the curve between two resistance rates represents the probability that resistance rates fall within the interval. N, number of resistance rates used to calculate the density distribution. Dashed lines represent the centre of mass of each distribution.

The prevalence of resistance in E. coli was higher than in Salmonella for all antimicrobial classes (Fig. 2a,b). Across drug classes, the prevalence of resistance in E. coli was 18% higher than the prevalence of resistance in Salmonella in chicken, and 16% higher than the prevalence of resistance in Salmonella in pigs. The prevalence of resistance for individual antimicrobial classes differed between chicken and pigs. For E. coli, cephalosporins and quinolones had respectively 20% and 27% higher prevalence of resistance in chicken compared with pigs, whereas the prevalence of resistance in other antimicrobial classes differed by <6% between chicken and pigs (Fig. 2a). For Salmonella, quinolones had a 25% higher prevalence of resistance in chicken compared with pigs, while for other antimicrobial classes, the difference in the prevalence of resistance between chicken and pigs was <12% (Fig. 2b). This comparison was largely influenced by the relative abundance of serotypes of Salmonella in different animal hosts (Supplementary Fig. 3). However, an in-depth investigation of this influence on resistance trends was hindered by the fact that 70% of the surveys on Salmonella (93 out of 131 surveys) did not report the prevalence of resistance broken down by serotype.

Geographic distribution of resistance

We used a geospatial model (Supplementary Text S2) to map P50 at 10 km resolution, and combined information from PPSs with environmental and anthropogenic covariates (Supplementary Table 1). Hotspots of AMR (regions with P50 > 40%) were found in: (1) eastern China in the areas of Heilongjiang, western Jilin, western Liaoning, southern Hebei, Shandong, eastern Jiangsu, southern Anhui, Fujian and Taiwan; (2) central China in the areas of northern Shaanxi, central Hunan and southeastern Sichuan; and (3) the northwestern Xinjiang Uyghur Autonomous Region (Fig. 3). Low levels of AMR (P50 < 30%) were found in Tibet Autonomous Region, northwestern Sichuan and southern Guangxi (Fig. 3). We measured the association between P50 and covariates, using the decrease in area under the receiver operating characteristic curve (AUC) by sequential permutation of each covariate (Supplementary Text S2). The most important covariates associated with P50 values were the travel times to cities25 (−16% AUC), the minimum monthly temperature26 (−15% AUC) and cattle population density27 (−13% AUC; Supplementary Fig. 4).

Fig. 3: Geographic distribution of antimicrobial resistance.
Colour shading represents the P50 level.

Optimal location for future event-based surveillance efforts

We identified the locations of 50 hypothetical surveys to be conducted in China such that these would minimize uncertainty on the current map of AMR. The uncertainty was quantified using a map of ‘necessity for additional surveillance’ (NS), defined as the product of the kriging variance (a metric of interpolation uncertainty) and the population density (Methods). The 50 locations for the hypothetical surveys were identified so as to minimize the mean of \(\mathrm{NS}_i\) across all pixels i in the NS map.

We compared four approaches to distribute hypothetical future surveys (Methods). The first was a ‘greedy’ approach that tested all possible locations for new surveys but was associated with high computational cost. The second was an ‘overlap approach’ based on mutual zones of exclusion for consecutive surveys to be conducted; this approach was a computational approximation to the greedy approach. The third was an ‘administrative’ approach where surveys were distributed equally across administrative divisions. These three approaches were compared with a ‘null model’ consisting of randomly distributing 50 surveys across China. The greedy ‘optimal’ approach achieved the greatest reduction in NS (Fig. 4b, dark red): it reduced NS by 56% more than the null model (Fig. 4b, blue). However, the greedy approach was associated with a considerable computational burden (Fig. 4c, 4.5 × 10⁵ central processing unit minutes). The overlap approach reduced the mean NS by 44% more than the null model (Fig. 4b, blue), thus achieving near-optimal reduction of NS, but with a considerably lower computational burden than the greedy approach. The overlap approach also outcompeted the administrative approach (Fig. 4b, green): it reduced the mean NS by 104% more than if surveys had been distributed equally between administrative divisions.

Fig. 4: Predicted locations for future surveys.
a, Predicted optimal locations for future surveys using the ‘overlap approach’. The background colour represents the ‘necessity for additional surveillance’ (NS): the product of the kriging variance and animal population density (standardized from 0 to 1). b, Reduction in the mean NS with 50 hypothetical additional surveys. The 50 additional survey locations were identified using the greedy approach (dark red), the overlap approach (red), the administrative approach (green) and the random approach (blue). c, Total central processing unit time for computing the four approaches (log10 scaled).

The overlap approach predicted locations for a large number of new surveys in the southwest (21/50 surveys) and northeast (11/50 surveys) of China. The surveys were predominantly distributed in Yunnan Province (ten surveys), Tibet Autonomous Region (nine surveys), Xinjiang Uyghur Autonomous Region (seven surveys) and Heilongjiang Province (five surveys) (Fig. 4a). These locations were determined using animal population densities as the metric of exposure (Methods). Additionally, we calculated the locations by province for individual animal species (Supplementary Fig. 5). The locations were mainly distributed in Yunnan and Heilongjiang (exposure by chicken or pigs) and in Tibet Autonomous Region (exposure by cattle). If human population was considered to determine exposure (Supplementary Fig. 6), then the locations predicted by the overlap approach to conduct new surveys were mainly distributed in Heilongjiang Province (eight surveys), Xinjiang Uyghur Autonomous Region (eight surveys), Yunnan province (seven surveys) and Inner Mongolia Autonomous Region (five surveys).

Discussion

We identified geographical gaps in event-based surveillance of food animal AMR in China using a map of AMR derived from 446 PPSs and an associated map of uncertainty. Together, these maps show where scaling up surveillance would be most valuable for reducing uncertainty in the current trends of AMR.

Trends of AMR across animals and bacteria

Between 2000 and 2019, in pigs, P50 increased significantly in E. coli, Salmonella and S. aureus (Fig. 1a–c). This increase in AMR occurred in a period of considerable intensification of pig production in China, during which the number of pigs slaughtered increased by 45% (ref. 28). Traditional backyard systems were gradually replaced by large-scale intensive farms to support the growing domestic demand for pork29. However, as in other countries currently transitioning from extensive to intensive farming, improvements in biosecurity may have lagged behind improvements in productivity30. Future improvements in biosecurity may reduce farmers’ dependency on antimicrobials for disease prevention, and potentially have indirect benefits for managing AMR in the long term. Such improvements can reduce the risk of disease introduction through strict hygiene requirements for personnel who enter the farms and appropriate carcass management, and can reduce the spread of diseases inside the premises through establishing pig compartments and regular cleaning and disinfection31.

Between 2000 and 2019, in chicken, P50 remained stable in E. coli, Salmonella and S. aureus, albeit at high levels. In 2000, P50 in E. coli, Salmonella and S. aureus in chicken were already at 58%, 48% and 56%, respectively, double the levels of resistance in pigs (35%, 15% and 30%). This suggests that the intensification process (and the routine use of antimicrobials for production) occurred earlier and faster in the poultry sector than for pigs32. Excessive use of quinolones (for example, norfloxacin and ofloxacin) and cephalosporins (for example, ceftriaxone) in chicken10 may have caused much higher resistance rates of these two antimicrobial classes in chicken compared with pigs (Fig. 2). Our analysis suggests that the antimicrobials that maintained low prevalence of resistance in chicken are expensive and are seldom available on the Chinese market (Supplementary Text S3), impeding overuse and also preventing further AMR increases.

The prevalence of resistance for E. coli was higher than for Salmonella in pigs and chicken (Fig. 2), possibly influenced by commensal E. coli being associated with lower resistance levels than pathogenic E. coli33. However, due to the non-systematic nature of the PPS sampling schemes (event-based surveillance), disentangling how resistance rates differ between bacteria exhibiting commensal or pathogenic behaviour remains challenging. We attempted to mitigate this potential bias by focusing our analysis exclusively on bacteria isolated from healthy animals.

Resistance levels (P50) in cattle were lower than in chicken and pigs (Fig. 1). However, P50 in E. coli grew by 167% between 2000 and 2019 (Fig. 1g), while globally the P50 in cattle was stable over the same period4. This may be associated with the increasing demand for cattle products in China: cow milk production increased by 261% from 2000 to 2019 (ref. 28). Despite this rapid expansion, the current per capita consumption of dairy products in China is still only one-fifth of the dairy consumption in the United States and the European Union28, leaving room for further expansion. Thus, a window of opportunity may exist at the current stage to slow the rise of AMR in cattle while resistance rates are still low (22% in E. coli), and immediate action could help secure a sustainable dairy intensification.

Improved maps of AMR in China

Currently, AMR levels in animals are the highest in the east (43%), moderately high in the northwest (40%) and lowest in the southwest (34%; Fig. 3). These geographical trends are in agreement (Pearson correlation coefficient, 0.48) with previous attempts to map AMR in China4. However, the present map is considerably more robust because it is exclusively based on surveys conducted in China (446 surveys, including 318 publications in Chinese). In comparison, previous maps were produced with just 101 surveys from China supplemented by surveys from other LMICs4. The revised maps of AMR help identify hotspots of AMR (Fig. 3) where intervention could be targeted immediately as part of domestic policies34. Travel time to cities was the factor with the highest influence on resistance levels25. The clustering of intensive farms in major consumption centres during industrialization35 and the ease of access to drug stores in peri-urban areas10,36 may drive AMR levels upwards37. High AMR levels were also associated with high minimum monthly temperature26: high temperatures may lead to increased stress and conflicts among animals, with risk of animal injuries requiring antimicrobial treatment38.

Key locations for conducting event-based surveillance

Amongst LMICs, China ranks 28th for the number of surveys in event-based surveillance per kilogram of food animals (population corrected units of food animals; PCU), and 36th for the number of surveys per PCU relative to average resistance level (P50) per country (Supplementary Fig. 1). We identified locations where additional surveys on AMR in animals could be conducted in the future to minimize uncertainty associated with the geographical trends in AMR—representing a gain in information given the resources spent on event-based surveillance. Current survey patterns resulting from event-based surveillance are clustered around veterinary institutes, mainly in the east (Supplementary Fig. 7), where sampling to investigate AMR in their vicinity is easier (Supplementary Text S4)—and may have contributed to geographical information gaps on AMR trends in the southwest and northeast. Cross-provincial efforts between institutes are needed to coordinate future event-based surveillance efforts into these regions, which may be far from existing institutes, but where the gain in information by additional surveys would be the highest.

Our approach for assigning future surveys works by minimizing NS based on a map of AMR (here, for China). However, exhaustively testing all possible locations for future surveys (‘greedy approach’) incurs considerable computational cost. We developed an ‘overlap approach’, which is a rapidly implementable approximation of the greedy approach. The overlap approach achieved 93% of the reduction of the uncertainty in AMR trends achieved by the greedy approach, while using just 15% of the total computation time required by the greedy approach (Methods). This makes the approach not only faster but also applicable with limited computational resources. The approach was developed in the context of event-based surveillance, for which data were abundant in China (446 PPSs) and served as a proof of concept. However, the approach could also be used with systematic surveillance data or in other countries with event-based surveillance. In addition, the overlap approach is flexible with respect to exposure. In this analysis we used animal densities as a metric of exposure, but this variable could easily be substituted by other criteria that are relevant for epidemiological or environmental assessments.

Although steps were taken (Supplementary Text S1) to ensure comparability between surveys, variations in the accuracy of susceptibility testing remain a potential source of bias. These include potential differences in laboratory equipment, and in compliance with analysis protocols across regions in China. The World Health Organization assesses the quality of antimicrobial susceptibility testing across countries39, but to the best of our knowledge, no such within-country assessment is currently available to account for laboratory practices that could lead to variations in the accuracy of susceptibility testing. These ‘hidden’ variations between surveys may influence the accuracy of the spatial distribution of P50. Inherent to event-based surveillance, a subjective summary metric, P50, was used in the absence of publicly available systematic surveillance data. P50 could be affected by the different antimicrobials subject to susceptibility testing in each survey. The potential bias was reduced by using the drug–bacteria combinations recommended by the World Health Organization Advisory Group on Integrated Surveillance of Antimicrobial Resistance40 to calculate P50. Insufficient and irregular geographic coverage of data points may affect the accuracy of the estimations of model parameters. The risk of local overfitting was attenuated by using spatial cross-validation in the models. Finally, future mapping efforts could integrate surveys on AMR in aquaculture, because aquatic animals are important food animals in China, with at least 20 antimicrobials involved in their production41. Complementary to phenotypic resistance, AMR surveillance could be expanded to include genomics data, through metagenomic analysis of wastewater42 from farms, although issues about harmonization remain an active field of analysis (J. Pires et al., submitted).

The health challenges that China currently faces are multifaceted and burdensome, both in humans (for example, COVID-1943) and in food animals (for example, African swine fever44). With limited resources to allocate between competing priorities for disease surveillance, our approach identifies locations where conducting new surveys of AMR in animals could have the highest benefits, particularly in southwestern and northeastern China. Timely policy intervention could curb AMR in China, as illustrated by the significant reduction in colistin resistance after the colistin withdrawal policy45. Our analysis helps to optimally deploy the limited resources dedicated to event-based surveillance of AMR, thereby improving chances for successful intervention for curbing AMR development and providing data to inform policy.

Methods

Data

We reviewed PPSs reporting rates of AMR in healthy animals and animal food products in China between 2000 and 2019 (Supplementary Text S1). We focused on three common food animal species: chicken, pigs and cattle. Dairy cattle and meat cattle were pooled in this study, consistent with the categorization adopted in the maps of livestock created by the Food and Agriculture Organization27. The review focused on four common foodborne bacteria: E. coli, non-typhoidal Salmonella, S. aureus and Campylobacter. We recorded resistance rates reported in PPSs, defined as the percentage of isolates tested resistant to an antimicrobial compound. In addition, we extracted the anatomical therapeutic chemical classification codes of the drugs tested, the year of publication, the guidelines used for susceptibility testing, the latitude and longitude of sampling sites, the number of samples collected and the host animals. We recorded sample types for each survey, including live animals, slaughtered animals, animal products and faecal samples. Each sample was taken from one animal or animal product. These sample types were pooled in the current analysis. In total, 10,747 rates of AMR were extracted from 446 surveys (Supplementary Fig. 8), including 318 surveys from China’s National Knowledge Infrastructure (CNKI), the leading Chinese-language academic search engine. All data extracted in the review are available at https://resistancebank.org.

Two steps were taken to ensure comparability of the resistance rates extracted from the surveys. First, the panel of drug–bacteria combinations extracted from each survey was that recommended for susceptibility testing by the World Health Organization Advisory Group on Integrated Surveillance of Antimicrobial Resistance40. This resulted in the extraction of 6,295 resistance rates for 76 drug–bacteria combinations. Second, resistance rates were harmonized using a methodology4 accounting for potential variations in the clinical breakpoints used for antimicrobial susceptibility testing (Supplementary Text S1). There are two major families of methods used for susceptibility testing in this dataset: diffusion methods (for example, disc diffusion) and dilution methods (for example, broth dilution). Previous works have shown good agreement between the two approaches in measuring resistance in foodborne bacteria4,46. For each family of methods, variations of breakpoints may result from differences between laboratory guidelines systems (European Committee on Antimicrobial Susceptibility Testing vs Clinical and Laboratory Standards Institute), or from variations over time of clinical breakpoints within a laboratory guidelines system (Clinical and Laboratory Standards Institute or European Committee on Antimicrobial Susceptibility Testing). Here we accounted for both situations using distributions of minimum inhibitory concentrations and inhibition zones obtained from eucast.org (Supplementary Text S1).

Trends in AMR

We defined a composite metric of AMR to summarize trends in resistance across multiple drugs and bacterial species. For each survey, we calculated the proportion of antimicrobial compounds with resistance higher than 50% (P50). For each animal–bacteria combination, we assessed the significance of the temporal trends of P50 between 2000 and 2019 using a logistic regression model, weighted by the log10-transformed number of samples in each survey.
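As an illustration of the P50 definition above, the following minimal Python sketch computes the metric for a single survey. The function name `p50`, the dictionary layout and the example rates are hypothetical and are not taken from the study's code or data; the strict inequality reflects the wording "higher than 50%".

```python
def p50(resistance_rates, threshold=50.0):
    """Proportion of tested compounds whose resistance rate exceeds the threshold.

    resistance_rates: mapping of compound name -> resistance rate in percent.
    """
    if not resistance_rates:
        raise ValueError("a survey needs at least one resistance rate")
    # Strict inequality: a rate of exactly 50% does not count as "higher than 50%".
    above = sum(1 for rate in resistance_rates.values() if rate > threshold)
    return above / len(resistance_rates)


# Hypothetical survey with resistance rates (%) for four compounds.
survey = {"tetracycline": 82.0, "ampicillin": 64.5, "colistin": 3.0, "ceftriaxone": 50.0}
print(p50(survey))  # two of four compounds exceed 50%, so this prints 0.5
```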

For each bacteria–drug (antimicrobial class) combination, we estimated prevalence of resistance by calculating a curve of the distribution of resistance rates across all surveys (Fig. 2). The analysis was conducted for surveys published between 2000 and 2009, and between 2010 and 2019, respectively. The distribution was estimated at 100 equally spaced intervals from resistance rates of 0% to 100%, using kernel density estimation. We used the centre of mass of the density distribution to estimate prevalence of resistance. The calculation was conducted for six animal–bacteria combinations. This included E. coli in chicken, pigs and cattle, Salmonella in chicken and pigs, and S. aureus in cattle. The remaining animal–bacteria combinations were excluded due to limited sample size, only represented in 32 out of 446 PPSs. The analysis was restricted to antimicrobial classes represented by at least 10 resistance rates. In addition, we estimated the association between resistance rates and the ease of obtaining antimicrobials from the market, using data from online stores (Supplementary Text S3).
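The density-based estimate described above can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian kernel, a fixed bandwidth of 10 percentage points and a grid of 101 points (100 equal intervals) are choices made here, since the text does not specify the kernel or bandwidth, and the example rates are invented.

```python
import math


def gaussian_kde_on_grid(rates, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `rates` at each grid point."""
    norm = 1.0 / (len(rates) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - r) / bandwidth) ** 2) for r in rates)
        for x in grid
    ]


def centre_of_mass(grid, density):
    """Density-weighted mean of the grid positions."""
    total = sum(density)
    return sum(x * d for x, d in zip(grid, density)) / total


# 101 grid points from 0% to 100%, i.e. 100 equally spaced intervals.
grid = [float(i) for i in range(101)]
# Hypothetical resistance rates (%) for one drug class across surveys.
rates = [12.0, 35.0, 40.0, 58.0, 90.0]
density = gaussian_kde_on_grid(rates, grid, bandwidth=10.0)
prevalence = centre_of_mass(grid, density)
print(round(prevalence, 1))
```

Because the density is only evaluated between 0% and 100%, kernel mass that falls outside this range is ignored, which pulls the centre of mass slightly towards the middle for distributions concentrated near either boundary.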

Geospatial modelling

We interpolated P50 values from the survey locations to create a map of P50 at a resolution of 10 × 10 km across China. The approach followed a two-step procedure47. In step 1, three ‘child models’ were trained using four-fold spatial cross-validation to quantify the relation between P50 and environmental and anthropogenic covariates (Supplementary Text S2 and Supplementary Table 1). In step 2, the predictions of the child models were stacked using universal kriging (Supplementary Text S2). This approach combined the ability of the child models to capture interactions and non-linear relationships between P50 and environmental and anthropogenic covariates, as well as the ability to account for spatial autocorrelation in the distribution of P50.

The outputs of the two-step procedure were a map of P50 (Fig. 3) and a map of uncertainty on the P50 predictions (Supplementary Fig. 9 and Supplementary Text S2). The overall accuracy of the geospatial model was evaluated using the area under the receiver operating characteristic curve (AUC). The contribution of each covariate was evaluated by permuting sequentially all covariates, and calculating the reduction in AUC compared with a full model including all covariates (Supplementary Fig. 4). The administrative boundaries used in all maps were obtained from the Global Administrative Areas database (http://www.gadm.org).
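The covariate-permutation step can be illustrated with a small, self-contained sketch. It assumes binary labels and an arbitrary scoring function `predict`; both are hypothetical stand-ins, because the actual models and the way P50 was scored against AUC are described only in Supplementary Text S2.

```python
import random


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, labels, n_covariates, seed=0):
    """Drop in AUC when each covariate column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    drops = []
    for j in range(n_covariates):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between covariate j and the labels
        permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(baseline - auc(labels, [predict(r) for r in permuted]))
    return baseline, drops


# Hypothetical data: the toy model uses only the first covariate.
rows = [(0.1, 5.0), (0.4, 1.0), (0.35, 3.0), (0.8, 2.0), (0.7, 4.0), (0.9, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
baseline, drops = permutation_importance(lambda r: r[0], rows, labels, n_covariates=2)
print(baseline, drops)
```

In this toy case the second covariate is ignored by the model, so shuffling it leaves the AUC unchanged, whereas shuffling the first covariate can only lower it.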

Identifying (optimal) locations for future surveys on AMR

We identified the locations of 50 hypothetical new surveys—the rounded average number of surveys conducted per year (54 surveys per year) between 2014 and 2019 in China. The location of each new survey was determined recursively such that it minimized the overall uncertainty levels on the geographical trends in AMR across the country. This process took into account the locations of existing surveys and the location of each additional hypothetical survey. The objective of this approach was to maximize gain in information about AMR given the resource invested in conducting surveys.

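The map of uncertainty consisted of the variance in the child model predictions \(\mathrm{Var}(P_{\mathrm{BRT}}, P_{\mathrm{LASSO\text{-}GLM}}, P_{\mathrm{FFNN}})\) (step 1) across 10 Monte Carlo simulations, where \(P_{\mathrm{BRT}}\), \(P_{\mathrm{LASSO\text{-}GLM}}\) and \(P_{\mathrm{FFNN}}\) were the predictions of P50 using boosted regression trees, logistic regression with LASSO regularization, and a feed-forward neural network, and the kriging variance \(\mathrm{Var}_K\) (step 2):

\[\mathrm{Var}_{\mathrm{total}} = \mathrm{Var}(P_{\mathrm{BRT}}, P_{\mathrm{LASSO\text{-}GLM}}, P_{\mathrm{FFNN}}) + \mathrm{Var}_K\]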

In this study, the location of hypothetical surveys was solely based on \(\mathrm{Var}_K\), instead of the sum of both terms. This approach was preferred because including both terms would have required hypothesizing P50 values for surveys to be conducted in the future, adding a source of uncertainty that cannot be quantified. In any case, the uncertainty attributable to \(\mathrm{Var}_K\) was 4.1 times that attributable to \(\mathrm{Var}(P_{\mathrm{BRT}}, P_{\mathrm{LASSO\text{-}GLM}}, P_{\mathrm{FFNN}})\) (Supplementary Text S2).

The allocation of new surveys was based on a map of ‘necessity for additional surveillance’ (NS), defined as:

\[\mathrm{NS} = \mathrm{Var}_K \times W\]

where \(\mathrm{Var}_K\) reflects the uncertainty of the spatial interpolation, and W is the log10-transformed population density, which reflects exposure (Supplementary Fig. 10). W was calculated for humans48, for all animals27 in total, and for chicken, pigs and cattle separately. Animal population density was calculated here as the sum of population-corrected units of pigs, chicken and cattle, using methods described by Van Boeckel et al.7. We adjusted the values of W such that its density distribution equals that of \(\mathrm{Var}_K\). Concretely, for each pixel i, we calculated the quantile of \(W_i\) on the map of W, and replaced the value by the corresponding value of \(\mathrm{Var}_K\) at the same quantile. \(\mathrm{Var}_K\) and W were both standardized to range [0,1], thus giving each term equal weight in the need for surveillance.
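The construction of the NS map can be sketched in a few lines of Python. This is a simplified illustration and not the study's implementation: pixels are held in flat lists, the quantile adjustment is approximated by rank matching, and the four pixel values are invented.

```python
import math


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quantile_match(source, target):
    """Replace each source value by the target value holding the same rank."""
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    matched = [0.0] * len(source)
    for rank, i in enumerate(order):
        matched[i] = sorted_target[rank]
    return matched


def necessity_for_surveillance(var_k, density):
    """NS = standardized kriging variance x standardized, quantile-matched exposure."""
    w = [math.log10(d) for d in density]  # assumes strictly positive densities
    w = quantile_match(w, var_k)          # give W the same distribution as Var_K
    return [v * x for v, x in zip(min_max(var_k), min_max(w))]


# Hypothetical values for four pixels.
var_k = [0.02, 0.10, 0.06, 0.08]
density = [1000.0, 10.0, 100.0, 10000.0]
ns = necessity_for_surveillance(var_k, density)
print(ns)
```

In this toy example the pixel with the highest kriging variance has the lowest population density, so its NS is zero: high uncertainty alone does not imply a high need for surveillance when exposure is low.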

Four approaches were used to distribute 50 surveys across China based on the map of NS. The reduction in uncertainty on AMR level associated with each of the four spatial configurations of the hypothetical surveys was evaluated by calculating the reduction in the mean values of NS across 7,857 possible pixels on the map of China.

First, we used a ‘greedy’ approach where all possible locations for additional surveys were tested. Concretely, the first hypothetical survey was placed at each of the 7,857 possible pixel locations, and a revised map of NS(+1 survey) was calculated for each of the placements. The survey was eventually placed in the pixel that led to the largest reduction in NS(+1 survey). The map of NS was then revised to account for the reduction in uncertainty in the neighbourhood of the new survey. The process was repeated recursively for the next hypothetical surveys (2nd–50th). This approach, by definition, yields the optimal set of locations to reduce uncertainty, but it also bears a considerable computational burden, because every possible location is tested (Npixels = 7,857) by the geospatial model for each hypothetical survey.
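
The structure of the greedy search can be sketched as follows. This is an illustration under strong assumptions and not the geospatial model used in the study: the kriging variance is replaced by a hypothetical stand-in, `uncertainty`, that grows with the distance to the nearest survey, and the `range_` parameter and all function names are invented for the example.

```python
import math


def uncertainty(pixel, surveys, range_=2.0):
    """Stand-in for the kriging variance (assumption, not the study model):
    0 at a survey location, approaching 1 far from every survey."""
    if not surveys:
        return 1.0
    d = min(math.dist(pixel, s) for s in surveys)
    return 1.0 - math.exp(-d / range_)


def mean_ns(pixels, weights, surveys):
    """Mean of NS = uncertainty x W over all pixels."""
    total = sum(uncertainty(p, surveys) * w for p, w in zip(pixels, weights))
    return total / len(pixels)


def greedy_placement(pixels, weights, existing, n_new):
    """Place n_new surveys one at a time, testing every pixel at each step."""
    surveys = list(existing)
    chosen = []
    for _ in range(n_new):
        # Re-evaluate the whole map for each candidate location and keep
        # the candidate that leaves the lowest mean NS.
        best = min(pixels, key=lambda c: mean_ns(pixels, weights, surveys + [c]))
        surveys.append(best)
        chosen.append(best)
    return chosen
```

Each new survey requires one full map evaluation per candidate pixel, which is the source of the computational burden noted above.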

The second approach developed was a computational approximation to the greedy approach, hereafter referred to as the ‘overlap approach’. This approach exploits a key feature of the kriging procedure: the decrease of the kriging variance (VarK) with increasing proximity to existing survey locations. Each additional survey reduces the variance of the geospatial model at its own location, but also in its surrounding area (Supplementary Fig. 11). The ‘overlap approach’ selects an optimal set of locations that reflect a compromise between high local NS and distance to other surveys. It iteratively selects new locations based on the highest local NS penalized by the degree of overlap between the hypothetical new surveys and existing surveys (Supplementary Fig. 12). The first survey was placed at the location Xp,Yp with the highest local NS (Supplementary Fig. 12, part 1). Then the value of NS at each pixel location Xi,Yi was recalculated (Supplementary Fig. 12, part 2) as:

\({\mathrm{NS}}_{{(+1\,{\mathrm{survey}})}X_i,\,Y_i}={\mathrm{NS}}_{X_i,\,Y_i}\times(1-{\mathrm{overlap}}\,{\mathrm{area}}/{\mathrm{neighborhood}}\,{\mathrm{area}})\)

where the neighbourhood area was the circular area of decreased kriging variance around a new survey, and its radius was the distance until which NS decreased due to this new survey; the ‘overlap area’ was the shared area of the neighbourhoods of location Xp,Yp and of location Xi,Yi. The radius of the neighbourhood was determined using a sensitivity analysis, optimized by approximate Bayesian computation (sequential Monte Carlo)49 (Supplementary Text S5). The optimal neighbourhood radius was chosen such that it minimizes reduction in NS across all pixels. The procedure (Supplementary Fig. 12, parts 1 and 2) was repeated recursively for the hypothetical surveys (2nd–50th).
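
The update rule of the overlap approach can be sketched as below. The sketch assumes that every neighbourhood is a circle of the same radius, so that the overlap area is the standard intersection area of two equal circles, and that existing surveys are already reflected in the initial NS map through VarK. The function names and the clamp against rounding error are assumptions made for the illustration.

```python
import math


def overlap_area(d, r):
    """Intersection area of two circles of radius r whose centres are d apart."""
    if d >= 2 * r:
        return 0.0
    return 2 * r * r * math.acos(d / (2 * r)) - 0.5 * d * math.sqrt(4 * r * r - d * d)


def overlap_placement(pixels, ns, n_new, radius):
    """Iteratively pick the pixel with the highest NS, then penalize its neighbours.

    pixels: list of (x, y) coordinates; ns: initial NS value for each pixel.
    Returns the chosen locations and the revised NS values.
    """
    ns = list(ns)
    neighbourhood_area = math.pi * radius ** 2
    chosen = []
    for _ in range(n_new):
        p = max(range(len(pixels)), key=lambda i: ns[i])  # highest local NS
        chosen.append(pixels[p])
        for i, xy in enumerate(pixels):
            d = math.dist(xy, pixels[p])
            factor = 1.0 - overlap_area(d, radius) / neighbourhood_area
            ns[i] *= max(0.0, factor)  # clamp guards against rounding error
    return chosen, ns
```

Only one pass over the map is needed for each new survey, in contrast to one pass per candidate pixel in the greedy approach.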

The third approach tested consisted of distributing surveys equally between provinces to reflect a common approach to disease surveillance based on equal allocation of resources between administrative entities. The twenty-two provinces with the highest human population were assigned two surveys each, and the remaining six provinces were assigned one survey each. The exact location of each survey was randomly selected inside a province. Finally, all approaches were compared with the fourth approach (the random approach) as a ‘null model’, in which the 50 hypothetical surveys were located randomly across the country without any geographic weighting criteria. For the third and fourth approaches, the reduction in NS that was compared with the greedy and overlap approaches was the average over 50 simulations.
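
The two baseline allocations can be sketched as follows. The data layout (a dictionary mapping each province name to its population and its list of pixels) and the function names are hypothetical, and the sketch assumes that locations are drawn with replacement, which the text above does not specify.

```python
import random


def allocate_by_province(provinces, n_double=22, rng=None):
    """Equal allocation between provinces.

    provinces: dict mapping name -> (population, list of pixel coordinates).
    The n_double most populous provinces receive two surveys, the others one.
    """
    rng = rng or random.Random()
    ranked = sorted(provinces, key=lambda name: provinces[name][0], reverse=True)
    surveys = []
    for rank, name in enumerate(ranked):
        n_surveys = 2 if rank < n_double else 1
        pixels = provinces[name][1]
        # Assumption: each location is drawn independently, with replacement.
        surveys.extend((name, rng.choice(pixels)) for _ in range(n_surveys))
    return surveys


def allocate_randomly(pixels, n_surveys=50, rng=None):
    """Null model: surveys located at random, with no geographic weighting."""
    rng = rng or random.Random()
    return [rng.choice(pixels) for _ in range(n_surveys)]
```

With 28 provinces, the first function returns 22 × 2 + 6 × 1 = 50 surveys. Repeating either allocation with different random seeds and averaging the resulting reduction in NS mirrors the 50 simulations described above.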

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.