Main

Water has been central to the development of China’s ancient civilization and to its contemporary development and industrialization, and will shape its path to an ecological civilization—an aspirational, more sustainable mode of human interaction with the environment1. Chinese understanding of water and its management stems from the Neolithic period, supporting agriculture, irrigation and intensive cultivation. This has also involved extensive engineering and hydrology, as well as drought and flood management (for example, Yu the Great had a legendary ability to control floods around 4,100 yr bp) and defined Chinese culture (the Liangzhu culture developed in the context of extreme hydrological dynamics in the Yangtze delta around 5,100 yr bp; ref. 2). Today, and in the years to come, China’s burgeoning economy, cities and population desperately need access to scarce freshwater resources to support industry and ensure food and energy security3,4,5, to manage the impacts of increasingly frequent and severe droughts and floods, and to improve the quality of water resources and water-dependent ecosystems given the impacts of industrialization and urbanization. Ready access to comprehensive, high-quality data on multiple aspects of surface, ground and coastal water resources (including quantity, quality, ecology, demand and infrastructure6,7; Table 1) is fundamental to understanding, managing and monitoring China’s water resources and meeting the nation’s commitment to the UN Sustainable Development Goals (SDGs), particularly SDG 6 (clean water and sanitation; refs. 8,9,10). However, although water data are extensively collected in China, governance and sharing lag well behind the international community.

Table 1 Summary and descriptions of water data categories under the SDGs and the Chinese government agencies responsible

In this Perspective we use the term ‘data governance’ to refer to the principles, practices, roles, standards and metrics necessary to ensure the security, availability, usability and integrity of data. By adopting the terminology of Cantor et al.6, we use the term water data to describe a broad range of data and information that support decision-making and research on water-related topics, including data to characterize and monitor water-dependent ecosystems. Given that water data are produced by many different entities, including government agencies, academia, industry and non-governmental organizations, water data can be grouped into authoritative (that is, certified and provided by an authoritative source such as government) and non-authoritative data7. We focus here on government provision of authoritative data from the first five water categories in Table 1 (that is, W1–W5), which are directly related to water resources.

Here we assess the state of China’s water data, discuss the barriers to water data sharing and provide a practical framework for modernizing its water data infrastructure. The assessment is based on our collective experience and a review of the literature and available data sources. In addition, to support our assessment, we conducted an anonymous online survey published on the Tencent Questionnaire Platform and promoted via WeChat from 20 February 2022 to 6 March 2022 (see Word File 1 at https://doi.org/10.6084/m9.figshare.21532908). The questionnaire was posted on the Hydro90 WeChat channel to target the community of water scientists actively using water data in analyses and modelling for water resource management in China. The survey was also promoted via our personal WeChat connections. The survey was read by 845 people, and completed by 305 people (36%). We asked participants about the types of water data they lacked, where their data were sourced, difficulties in sourcing data, their willingness to share water data (used in published papers), reasons for the lack of data availability and recommendations for improving data access and sharing. Most respondents (88%) were early-career researchers (for example, research students, young academics and young professionals; Fig. 1a).

Fig. 1: China’s water data survey results.

a, Respondent profile. b, Most common sources of data. c, Top four water data aspects lacking. d, Top four difficulties encountered when sourcing data. e, Top four reasons for data being unavailable. f, Willingness to share data from respondents’ published papers. g, Opinions on the call for better water data sharing. h, Top four ways to effectively boost data sharing. Data are available in Data File 3 at https://doi.org/10.6084/m9.figshare.21532908.

State of water data collection, governance and sharing

Water data are extensively collected in China under the responsibilities of multiple government agencies (Table 1). Most hydrometric stations for collecting water resource data are managed by the Ministry of Water Resources (MWR), and these have increased in number from 353 in the 1950s to ~120,000 (~80,000 of which are automated) in 2020 (ref. 11). The MWR also supervises nearly 100,000 dams and reservoirs12, following growing investment in water conservancy projects from $0.01 billion (in 2021 US$) in 1950 to a peak of ~$127 billion in 2020 (see Data File 1 at https://doi.org/10.6084/m9.figshare.21532908). Beyond that, many stations have been established and are overseen by other agencies. For example, the Ministry of Ecology and Environment (MEE) supervises 3,641 surface water quality monitoring stations (2,024 automated)13 and maintains approximately 10,826 municipal wastewater treatment plants (capacity of over 500 m³ of sewage discharge per day). The Ministry of Natural Resources also built and manages 10,171 groundwater monitoring stations (10,168 automated), which generate over 90 million data points per year14,15, and manages 1,503 stations for collecting coastal water quality data16 (http://ep.nmemc.org.cn:8888/Water/).

However, these data are not widely shared and obtaining water data is extremely difficult (Fig. 1d). For example, most time series of streamflow data have been collected by local Hydrology Bureau branches (managed by the MWR) and published in hard-copy annual yearbooks, with no digital versions available online17. Scientists who want to work on China’s water resource issues must first source the relevant hard-copy yearbook from a local library, university library or bureau (Fig. 1b) and then digitize the streamflow data themselves. Surface water quality data collected by the MWR are not published even in hard copy and hence are difficult to find and verify. Most water demand and use data are published in the Water Resources Bulletin, formatted as a separate report each year for each city, and are therefore also time-consuming to source, digitize, curate and use. While researchers often share digitized water data among their networks, these actions lack coordination and result in unnecessary duplication of effort.

Some water data are made available digitally via national data platforms (for example, see the list of streamflow data in Data File 2 at https://doi.org/10.6084/m9.figshare.21532908), but the quality is often poor with temporal discontinuities and limited sites. Data can also be secondary, descriptive, summarized (for example, see water quality reports in Data File 2 at https://doi.org/10.6084/m9.figshare.21532908), modelled (for example, estimated and reanalysis data products; see streamflow data in Data File 2 at https://doi.org/10.6084/m9.figshare.21532908) or presented in a user-unfriendly way (for example, data in the format of Microsoft Word files, see weekly water quality data in Data File 2 at https://doi.org/10.6084/m9.figshare.21532908). Essential metadata information (for example, locations or attribute descriptions) describing hydrometric sites available on common data platforms is frequently missing or unclear (see Data File 2 at https://doi.org/10.6084/m9.figshare.21532908 and Fig. 1c) and there are no publicly accessible standards for this information. Most water datasets on these platforms are only available on an ad hoc basis and access to each item is subject to approval.

Water data for China available from online international repositories, particularly streamflow and water quality data, are few and outdated, and China lags most other nations of comparable land area in making these data available (Table 2). For example, the Global Runoff Data Centre (GRDC), which supports large-scale hydrological studies, provides hydrological data for 10,702 stations (as of 29 April 2022) worldwide. However, only 39 of these stations are in China and these records have not been updated since 2004. Similarly, the Global Streamflow Indices and Metadata (GSIM) project was initiated to collate publicly available data and promote the widespread use of streamflow data, with 30,959 stations in total (ref. 18). However, Chinese streamflow data are only available for rivers in Yunnan Province and the Tibet Autonomous Region maintained by the China Hydrology Data Project19. This includes just 163 stations with records spanning 1947 to 1987 that have not been updated since the establishment of GSIM18. Aggregating five large water-quality datasets, the Global River Water Quality Archive (GRQA) comprehensively reports 42 indicators for 93,057 sites with records from 1898 to 2020. Only 244 of these sites are located in China, with just 3,595 observations in total, and daily records are only available from 1980 to 2009 (ref. 20). By contrast, water infrastructure and utilities data (for example, wastewater treatment plants, dams and reservoirs) for China are much better represented in global databases (Table 2).
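To put these counts in proportion, the share of each global dataset's stations that lie in China can be computed directly from the figures quoted above. The following is a minimal sketch in Python; the dictionary layout and the function name are illustrative assumptions, and only the station counts come from the text.

```python
# Station counts quoted in the text: (stations in China, stations worldwide).
# The structure and names below are illustrative assumptions.
STATION_COUNTS = {
    "GRDC": (39, 10_702),
    "GSIM": (163, 30_959),
    "GRQA": (244, 93_057),
}


def china_share_percent(in_china: int, total: int) -> float:
    """Return the percentage of a dataset's stations located in China."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * in_china / total


if __name__ == "__main__":
    for name, (in_china, total) in STATION_COUNTS.items():
        share = china_share_percent(in_china, total)
        print(f"{name}: {in_china} of {total} stations ({share:.2f}%)")
```

On these figures, China accounts for well under 1% of the stations in each of the three datasets.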

Table 2 Data available in global water datasets from the seven largest nations by area

Some live, direct, real-time water quality, streamflow and water level data for China’s surface water resources are temporarily viewable online at the website of the China National Environmental Monitoring Centre, which provides data for 2,024 automated surface water quality stations, and the website of National Hydraulics and Hydroinformatics, which provides data for 1,145 hydrological stations. This has led some researchers to resort to web scraping and other automated means of assembling and cleaning real-time water data. However, historical data are not accessible to the public21.

Barriers and impacts

There have been many barriers to sharing China’s water data. Of our survey respondents, 77% believed that the greatest barrier to the accessibility of water data is the widespread practice of onselling hydrological data and the high costs involved for users. For example, purchasing a time series of daily streamflow data for a single year at just one station costs around US$1,023. Although a cost-recovery model may have been reasonable in the early stages of data collection due to the labour-intensive nature of manual water resource monitoring, ongoing high data costs are difficult to justify with the widespread adoption of automated monitoring equipment and hydroinformatic techniques. Concerns around national security and hydro-geopolitical relations as a barrier to water data sharing were expressed by 69% of respondents. Water plays an important role in national security and regional geopolitics for China, which shares 42 major transboundary rivers with neighbouring countries22. For example, countries surrounding the Tibetan Plateau, which share multiple major transboundary rivers including the Mekong, Brahmaputra and Indus, have already experienced conflicts around water resource management, the impacts of global warming, atmospheric circulation changes and increased water demand23. With water a key resource and foundation of national security and development, water data in China have historically been considered to be highly sensitive. Sixty-five percent of respondents also identified the lack of explicit policies as a barrier for promoting water data sharing (Fig. 1e) including data privacy laws, security standards and data governance protocols for data providers; restricted data use policies and data publishing strategies for end users; standards for metadata and data interoperability; policies for promoting supporting data infrastructure and technology; and policies for connecting data providers and users (for example, incentive policies, engagement, user networks)6,24. Although China has existing legal structures administering scientific data25, stronger policies (acts, laws or executive orders) that are targeted towards water data and specify clear roles for different agencies are required.

Inadequate data sharing and overlapping responsibilities have resulted in competition between multiple agencies and stakeholders from local to national levels in establishing, managing and monitoring gauging stations. For instance, the MWR established the Zhimenda station at the outlet of the source of the Yangtze River (33.013° N, 97.238° E), which has collected water data (for example, streamflow, water quality and sediment) since 1956. This station is very close to another station also called Zhimenda (33.022° N, 97.248° E), which was established by the MEE and has collected water quality data since 1999 (ref. 26). Water quality sampling programmes have also been implemented for hundreds of lakes and rivers of high concern by different stakeholders. For example, multiple water sampling programmes for Lake Taihu are conducted by water management agencies, institutes and universities, leading to a considerable waste of resources and time, inconsistent data quality, varied results of hydrological analyses and divergent policy recommendations27.
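How close the two Zhimenda stations are can be estimated from the coordinates quoted above. The sketch below uses the standard haversine great-circle formula; the function name and the mean Earth radius of 6,371 km are assumptions made for illustration, and because the coordinates are given to only three decimal places the result is approximate.

```python
import math

# Mean Earth radius in km; an assumed constant, not taken from the text.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Coordinates as quoted in the text for the MWR and MEE stations.
    mwr_station = (33.013, 97.238)
    mee_station = (33.022, 97.248)
    distance = haversine_km(*mwr_station, *mee_station)
    print(f"Separation between the two stations: {distance:.2f} km")
```

Under these assumptions the two stations are separated by roughly 1.4 km.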

The lack of an openly available dense network of digital, high-quality, up-to-date, continuous, standardized, long-term daily or hourly data capturing key water indicators has limited the credibility of China’s water information (Fig. 1e). More importantly, it has hampered the ability of Chinese and international scientists to contribute to the sustainable management of water resources in China and globally, impeded the development of China’s next generation of water scientists and stymied the growth of water resource science, management and policy28. Although some studies have used original national-scale streamflow data in China (see Data File 1 at https://doi.org/10.6084/m9.figshare.21532908), the datasets underlying these studies are not publicly available due to licensing restrictions, which limits the reproducibility of the work.

An emerging willingness to act but more is needed

Efforts have been made towards improving the quality and availability of water data in China. More than 20 years ago, the MWR launched a notice on the disclosure of public-welfare water data. In 2007, the Deputy Minister of the MWR announced that China would strive to set up a national water database to boost data sharing29. Just 2 years ago, the MWR announced measures for assembling and managing water resource information30. China’s government also recently pledged to share year-round Mekong River data with downstream Asian countries for better monitoring and forecasting of floods and droughts through the Lancang-Mekong Cooperation Mechanism31, and in early 2022 published a detailed report proposing the construction of digital twin basins with big water data to enhance smart water conservancy32. Despite these initiatives, the systematic collection, governance and sharing of high-quality water data in China has not gained momentum.

As the next generation of water scientists working to understand and manage China’s water resources, we watch with envy the ever-expanding open global provision of data for supporting water resource monitoring, modelling and prediction. For example, real-time streamflow records for approximately 8,500 gauging stations and water quality for about 2.7 million sites (covering both surface and groundwater resources in inland and coastal areas) in the United States are available on the National Water Information System and Water Quality Portal33 (https://waterdata.usgs.gov/nwis). This data resource is supporting the US National Water Model for real-time flood forecasting and will support the Surface Water and Ocean Topography mission, which will further enhance estimates of Earth’s surface water from 2022 (ref. 34). Based on the GRDC database, the European Union has built continental-scale hydrological models to assess the impacts of climate change on water scarcity35. This information has enabled France to develop the Observatoire National des Étiages network for in-depth modelling of headwater hydrological processes. Although some other countries do not have a network of hydrologic stations as dense as China’s, they have pushed forward with the publication of non-sensitive data online36,37. These efforts are pertinent examples of the important role of national water agencies in generating major new advances in water data as a catalyst for promoting evidence-based water resource management policy.

Given the need to accelerate progress towards SDG 6 and to manage the looming impacts of climate change, the availability of water data is becoming even more important and urgent to underpin hydrological and interdisciplinary water research. Water data are a crucial asset in developing early warning systems for flooding; providing information to support decision-making in drought planning and management, agricultural and irrigation management, and integrated river and basin management; and developing a modelling and strategic foresighting capacity for managing water resources under future uncertainty10. These toolsets are urgently needed as more and more research points to the accelerating impacts of compounding extremes of weather and natural hazards38, as well as surging demand for clean water—especially during crises such as the COVID-19 pandemic39. With the prevalence of artificial intelligence, data-driven modelling is increasingly complementing process-based modelling, making it a fast-moving field connecting water scientists from all countries in countering the adverse impacts of climate change. These issues are likely to be exacerbated over the next few decades due to intensified international conflicts over water.

A path forward

We call upon the Chinese government to reform and provide open access to its water data, and to coordinate its water-related administrative sectors and transform water data governance and sharing via integrating the now fragmented components into a connected, specialized, national water data infrastructure to advance water resource management (Fig. 1g,h). Below we propose the priority, e-technology, archetypes, cooperation and engagement (PEACE) framework to illustrate the major components of such a modern data infrastructure (Fig. 2).

Fig. 2: Anticipating water data sharing challenges with PEACE.

The PEACE framework offers a connected, specialized infrastructure for advancing water resources management. MHURD, Ministry of Housing and Urban-Rural Development; MNR, Ministry of Natural Resources; NBS, National Bureau of Statistics. The river (lake) chief system is a policy to address water governance by assigning each part of a river or lake to a specific official and linking the status of that river or lake (for example, its water quality) to the official’s overall assessment.

Priority

The digitization, governance, internationalization and open provision of the full historical record of non-sensitive gauging station water data within China need to be prioritized according to national rules and scientific data strategies25,40. As recommended by the Jiangsu Committee of the Chinese People’s Political Consultative Conference41, stronger policies and laws are also urgently required to prevent onselling of basic and non-sensitive water data.

E-technology

Next-generation automatic water resource monitoring, modelling, projection, standards, services and wireless sensing are required, with a focus on data transparency and openness. Digital technology will play a key role in the usability of water data via integrating multi-source and heterogeneous data into interoperable data. For example, the great success of the Google Earth Engine platform has transformed big data into information that meets diverse user needs. Developed by a group at Stanford University, the CEDAR Workbench illustrates a potential solution for creating national standardized metadata42.
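As one illustration of what standardized metadata could make possible, the sketch below checks a station record for the kinds of essential fields (for example, locations and attribute descriptions) that this Perspective notes are frequently missing. It is a minimal example under stated assumptions: the `StationMetadata` record, its field names and the `missing_fields` helper are all hypothetical and do not represent the CEDAR Workbench, any agency schema or any published standard.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StationMetadata:
    """Hypothetical minimal metadata record for one hydrometric station."""
    station_id: Optional[str] = None
    name: Optional[str] = None
    latitude_deg: Optional[float] = None
    longitude_deg: Optional[float] = None
    variable: Optional[str] = None  # for example, "streamflow"
    unit: Optional[str] = None      # for example, "m3/s"
    agency: Optional[str] = None    # for example, "MWR"


def missing_fields(record: StationMetadata) -> List[str]:
    """Return the names of fields that are absent or blank, in declaration order."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


if __name__ == "__main__":
    # A made-up record with no location or agency information.
    record = StationMetadata(station_id="demo-001", name="Example station",
                             variable="streamflow", unit="m3/s")
    print("Missing metadata fields:", missing_fields(record))
```

A check of this kind could be run when datasets are submitted to a shared platform, so that records lacking location or attribute descriptions are flagged before publication; which fields are mandatory would need to be set by the relevant standard.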

Archetypes

We urge the establishment of archetypes for specific standards, rules and regulations to identify water information (for example, sensitive data with regard to national security) for restricted data sharing following the Water Law of the People’s Republic of China 2017 Edition (ref. 40) and the Measures for the Administration of Scientific Data of 2018 (ref. 25). Water data are inherently sensitive, especially in transboundary river systems and areas subject to geopolitical and resource conflicts. It is therefore important for the related agencies to classify the regions and associated data clearly to avoid misuse24 and allay national security concerns.

Cooperation

We appeal for cross-departmental cooperation to reduce duplication of effort and expand transboundary cooperation to seek basin-level integrated water resource management. The exchange of water information between different departments can improve the value discovery and credibility of water data across interdisciplinary producers7. Hydro-diplomacy is also essential to bring different stakeholders and countries in these conflict-prone river basins together to mediate hydro-geopolitical conflict via dialogue and cooperation43.

Engagement

We encourage research institutes, scientists, industry, local communities and non-governmental organizations to engage, and governments (for example, different administrative divisions from regional to local level) to improve corporate data governance following the principles of transparency, responsibility, user focus, sustainability and technology (TRUST)44 and findable, accessible, interoperable and reusable data (FAIR)45. In particular, empowering the participation of women would promote gender equality and diversity in water resource management46. Ongoing stakeholder engagement helps to ensure that investment will ultimately be worthwhile for different data users. Lessons from California’s efforts to promote a data-driven water platform indicated that stakeholder engagement helped to expand data system usability and awareness, even though it requires time, resources, and commitment6.

The above transformation will set China on a path to more effective and efficient water data governance characterized by increased transparency and accountability, and promote trust and credibility in institutions and government. Efficiencies gained via reduced duplication of effort will save money and resources. Sharing high-quality data will not only increase the value of data to stakeholders, but also create a return on investment of 10–20 times for data users, as well as a 20–50 times greater spill-over value for the wider economy47. One US dollar invested in data services alone has been found to provide returns of up to US$8.30 (ref. 48). In addition, the community can connect and engage in citizen science to improve awareness of water-related issues such as scarcity and pollution49. Open science and data sharing will promote collaboration between Chinese and international scientists, and improve both the understanding and the management of water resources in China and globally. It will also enable China to make progress in water data governance and analysis, and help its young scientists to develop the skills required to manage China’s water resources sustainably28 on the path towards an ecological civilization and the SDGs. An optimized water data infrastructure is essential to underpin China’s contribution to international scientific programmes and to building a regional and global community with a shared aspiration to enhance our ability to manage rapidly changing water resources.