From data to decisions: understanding information flows within regulatory water quality monitoring programs

Most countries maintain regulatory requirements for testing of drinking water supplies to guide treatment procedures and ensure safe water delivery to consumers. It is unclear, however, if water quality data are always used effectively, particularly in low-resource settings. Efforts to improve the use of water quality data will benefit from a comprehensive understanding of existing systems for managing and sharing information. This study evaluates the methods used to organize, analyze, and transmit drinking water quality data among 26 water supplier or surveillance institutions and two regulatory agencies in six countries of sub-Saharan Africa. Following extensive qualitative and quantitative data collection, we developed data flow diagrams to map formal and informal water quality networks. We found a high degree of similarity between the information systems established by different institutions operating under different regulatory structures. We determined that the key barriers to information flows were the limited aggregation and analysis of data and the poor enforcement of data sharing requirements. Our results suggest that broad reforms are necessary to improve the use of these water quality data to manage water safety. These measures could include strengthening enforcement of testing and reporting, building staff capacity for managing and using data, and integrating collection of water quality data with other information systems.


INTRODUCTION
Diarrheal diseases resulting from unsafe drinking water are responsible for an estimated 230,000 deaths every year in sub-Saharan Africa 1. At least 1.8 billion people are estimated to consume water from supplies that likely contain microbiological contamination [2][3][4]. To mitigate waterborne health risks, most countries maintain regulatory requirements for monitoring drinking water quality. These regulations generally specify operational monitoring by water suppliers to ensure the safety of their systems and surveillance monitoring, typically by public health agencies, of regulated and unregulated water supplies that fall within their jurisdictions 5. A recent survey of 72 water suppliers and public health agencies in 10 sub-Saharan African countries found that 85% had performed some level of microbiological water testing within the last year 6.
It is unclear, however, if the water quality data generated by these regulatory monitoring programs are always used effectively. Previous research on drinking water quality monitoring in low-resource settings has mostly focused on the structural elements of administering and operating testing programs, including: (i) evaluating the extent of testing practices 6,7, (ii) developing, evaluating, or inventorying tools available for low-cost fecal indicator organism testing [8][9][10][11][12], (iii) guidance on sampling frequencies, locations, or logistics [13][14][15][16], and (iv) mobile phone-based collection of data [17][18][19][20]. In addition, a systems-level analysis of the factors leading to success and failure of monitoring programs showed that institutional commitment (including motivation and leadership, knowledge, and staff retention) underlies high-performing monitoring programs 21. As highlighted in Peletz et al. 21, support to monitoring programs is often supply-focused (i.e., providing consumables for tests or constructing laboratories), rather than demand-driven (i.e., matching program designs to the information needed to protect public health). As a result, the promotion of water monitoring programs that are "data-rich but information-poor" has been a recognized problem for decades in the water resources sector, despite a need for information to make better water management decisions. Multiple studies have evaluated monitoring programs that failed to link the data collected back to the original monitoring program objectives of managing and improving water quality [22][23][24][25][26]. Notably, however, prior work analyzing this issue has focused on monitoring environmental waters (i.e., watersheds), and not on drinking water or built infrastructure systems.
Researchers in the water resources monitoring sector have also identified a frequent lack of alignment between the goals and activities of information producers and users 23,27 , in which those who would use information to make water management decisions are often not involved in the design and evaluation of monitoring networks. This can result in dissatisfaction with the monitoring programs and non-use of data by those who need to make informed decisions. Our previous research found that many institutions in sub-Saharan Africa do respond to test results that indicate contamination, undertaking remedial or preventative actions to mitigate water quality risks 7 . In most of these cases, however, the information producers and users were the same entity: for example, a community health worker at a local health office would test a drinking water sample, and, subsequently, communicate the results and recommend improvements to water source owners 20 . However, there was little information available on whether the water quality test results reliably reached other information users, including senior institutional managers, the regulators that required such testing, or other stakeholders who could act to improve water systems. Because testing water quality is expensive and time-consuming 16,28,29 , it is important to maximize the cost-effectiveness of testing programs. Data should be collected and transferred in a timely and useful format to those responsible for managing water safety, allocating resources, and enforcing water quality standards [5][6][7] .
This study describes and assesses the formal and informal systems used by institutions with regulatory requirements for testing drinking water quality in sub-Saharan Africa to organize, analyze, and transmit information. We conducted our research with 26 institutions from six countries: Ethiopia, Guinea, Kenya, Senegal, Uganda, and Zambia. Eleven of these institutions were piped water suppliers and 15 were surveillance agencies, and all were participants in The Aquaya Institute's Monitoring for Safe Water (MfSW) research program 30 . From 2012-2016, we collected qualitative and quantitative data from these 26 institutions and provided the following support: (1) initial "start-up funds" to cover equipment and training for improved microbial water quality testing; (2) monthly payments for completed tests; and (3) bonus payments to institutions that met agreed-upon testing targets. Each collaborating institution was responsible for compiling and submitting data in a digital format of their choice to receive monthly and bonus payments. Due to this incentive-based research design, MfSW likely influenced information flows; as a result, what we present here is a "best case scenario" of water quality data sharing. In 2019, we revisited the Kenyan institutions to further examine successes and barriers to information flows and data use in the absence of incentives; therefore, our findings include specific case study examples from Kenya.
To analyze the transmission of water quality data, we first defined a data sharing framework and then systematically mapped information flows within the 26 institutions. Subsequently, we used these maps to evaluate trends, connections, and barriers to the flow of information within and between institutions. In Kenya, we also compared current data sharing practices with government policies on water quality monitoring to screen for potential deficiencies in both institutional practices and in reporting requirements. Finally, we developed recommendations to improve flows of water quality information and better support water safety management at institutional, local government, and national levels.

Mapping information flows
We developed data flow diagrams (DFDs) to illustrate the transmission of water quality data within institutions and to external stakeholders. We specified four elements in the DFDs 31 : (1) external entities (institutions or departments outside of the system boundaries); (2) processes (transformations of, or changes to, data); (3) data stores (physical storage of data); and (4) data flows (movement of data) (Fig. 1).
Our DFDs for the 26 institutions showed that suppliers and surveillance agencies used similar structures for collecting and sharing water quality information (as generalized in Fig. 1). First, institutions selected locations for water sampling (1.0, D1). Then they collected the samples (2.0a,b), recorded information about the water source on the sample containers or in a logbook (D2, D3), and tested the samples in the field and/or laboratory (3.0a,b). Subsequently, they recorded (D3, D4), compiled (4.0a), and transferred or transported (4.0b) test results to a location where they could be digitized (D5). Finally, they summarized data (4.0c) in reports (D6) that were passed to external entities (e.g., senior managers, regulators, ministries or other stakeholders) (6.0). In parallel, they applied the water test results (D3, D4) to guide actions (5.0) that addressed contamination: for example, communicating with water source owners/consumers, or performing corrective actions to the water source/distribution system. DFDs for individual institutions are available online at www.aquaya.org/dfds. Despite using similar structures for collecting and sharing water quality information, suppliers and surveillance agencies are generally responsible to different regulatory institutions (i.e., the Ministry of Water and Ministry of Health, respectively) and monitor different drinking water source types; water suppliers are responsible for monitoring their respective piped distribution networks, whereas surveillance agencies are responsible for monitoring all supplies of drinking water from any source type at the point of consumption within their geographical jurisdiction.

Fig. 1 Generalized DFD showing external entities, processes, data stores, and data flows representing the majority of monitoring programs included in this study. The DFD representations comprise four elements: 1) external entities (shadowed boxes), which are systems, individuals, or institutions outside of the modeled system's boundaries; 2) processes (rounded rectangles, assigned a unique number), which represent transformations of, or changes to, data; 3) data stores (open-ended rectangles, assigned a unique number, such as D1, D2, etc.), which represent physical storage of data (e.g., paper-based such as a filing cabinet or notebook, or digital such as a computer file or database); and 4) data flows (arrows), which depict movement of data.
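For readers who wish to work with such diagrams programmatically, the four DFD elements can be represented as a small directed graph. The following is a minimal sketch only, not the tooling used in this study: the class name `DFD`, the function `build_generalized_dfd`, the node identifier `EXT`, and the simplified wiring between nodes are illustrative assumptions, and the sub-steps (e.g., 2.0a,b and 4.0a-c) are collapsed into single nodes.

```python
from collections import deque
from dataclasses import dataclass, field

# Three node kinds from the DFD notation; the fourth element (data flows)
# is represented by the directed edges between nodes.
KINDS = {"external_entity", "process", "data_store"}


@dataclass
class DFD:
    nodes: dict = field(default_factory=dict)  # node id -> (kind, label)
    flows: dict = field(default_factory=dict)  # node id -> downstream node ids

    def add_node(self, node_id, kind, label):
        if kind not in KINDS:
            raise ValueError(f"unknown element kind: {kind}")
        self.nodes[node_id] = (kind, label)
        self.flows.setdefault(node_id, [])

    def add_flow(self, src, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both ends of a data flow must be declared first")
        self.flows[src].append(dst)

    def reachable(self, start):
        """Return every node that data leaving `start` can reach (breadth-first)."""
        seen, queue = set(), deque([start])
        while queue:
            for nxt in self.flows[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen


def build_generalized_dfd():
    """Hypothetical, simplified encoding of the generalized flow described above."""
    dfd = DFD()
    dfd.add_node("1.0", "process", "select sampling locations")
    dfd.add_node("2.0", "process", "collect samples")
    dfd.add_node("3.0", "process", "test samples (field and/or laboratory)")
    dfd.add_node("D3", "data_store", "recorded test results")
    dfd.add_node("4.0", "process", "compile, transfer, and summarize data")
    dfd.add_node("D6", "data_store", "reports")
    dfd.add_node("5.0", "process", "act on contamination")
    dfd.add_node("6.0", "process", "report to external entities")
    dfd.add_node("EXT", "external_entity", "regulators, ministries, senior managers")
    # Assumed wiring: a linear reporting chain plus a parallel response branch.
    for src, dst in [("1.0", "2.0"), ("2.0", "3.0"), ("3.0", "D3"),
                     ("D3", "4.0"), ("4.0", "D6"), ("D6", "6.0"),
                     ("6.0", "EXT"), ("D3", "5.0")]:
        dfd.add_flow(src, dst)
    return dfd
```

With this representation, a question such as "do test results recorded in D3 reach an external entity?" becomes a reachability check, e.g. `"EXT" in build_generalized_dfd().reachable("D3")`. Removing an edge then mimics a break in the reporting chain.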

Processes
Information flows involved a number of processes: deciding on sample locations, collecting and processing samples, responding to contamination, and reporting the data. Most institutions had at least two different personnel groups (e.g., local management and local lab staff, or local lab staff and central management) responsible for these processes. At one extreme, a municipal water supplier in Ethiopia allocated all of these steps to a single individual. In contrast, four personnel groups conducted these processes for a provincial supplier in Zambia: local management of each town's water supply system (which were under the jurisdiction of the provincial supplier) decided on the sampling locations; local laboratory staff collected and processed samples; local management transferred and transcribed the data; provincial laboratory staff summarized the data; local management took actions; and provincial management transmitted water quality information to the national regulator.
In Kenya, county public health offices typically had three staff involved in water quality monitoring and reporting: a county Public Health Officer (PHO) responsible for water quality sampling and analysis, a Monitoring and Evaluation (M&E) Officer responsible for digitizing water quality data, and the head county PHO responsible for reviewing and submitting data externally. Kenyan water suppliers typically had a laboratory technician and assistants responsible for water quality sampling, analysis, and data recording; water quality data were then reviewed by upper management (i.e., managing directors). "Monthly water quality reports are sent to the Technical Manager who will also sometimes request to see raw data if there are high levels of contamination. The Managing Director only sees the report if there is a serious contamination issue" (Kenyan Water Supplier).
Data stores
Data stores represent records that contain information regarding the water quality testing program, including sampling plans or guidelines (D1), information about the sample's location and date of sampling (D2), and written records of the contextual information and test results of a sample (D3-D6), transformed or summarized into different formats. All suppliers in Kenya (4/4) and most throughout MfSW countries (10/11) had written sampling plans with a set schedule (e.g., dates, and/or sampling locations on those dates) (Table 1). Water suppliers typically established their sampling plans to meet regulatory requirements for sampling frequencies for distribution networks, often based on the World Health Organization (WHO) recommendations 32. Water suppliers generally repeated sampling locations, but also altered sampling patterns when piped water was intermittent, resources were limited, or permissions were required (as in the case of household taps). In contrast, only one of the three surveillance agencies in Kenya and fewer than half of those that participated in MfSW (7/15) had written sampling plans, and even these rarely followed a set schedule or repeated sampling locations. The practice of varying sample collection between different water supply types follows the WHO recommendations for non-piped supplies, which state that every source should be tested every 3-5 years 32 (Table 1). In practice, many surveillance agencies selected their sample schedule based on the availability of transportation, staff, and equipment, or on indications of suspected contamination; these constraints to water quality testing are discussed in more detail elsewhere 21.
Most institutions used a combination of paper and digital records to manage data collection and recordkeeping. In the field, data were recorded on the water sample container, on blank paper, or in photocopied recording templates (D2, Table 1). Mobile phone applications for recording data were only used in one of the testing programs, and this was excluded from Table 1 because the phone application was introduced through MfSW 20. All institutions eventually transcribed microbial water quality data from paper to computer programs (often driven by the MfSW program requirement for electronic sharing of results with research staff), but this process occurred at different points in the sampling program as determined by computer or internet availability. Hardcopies of monthly and quarterly reports were often maintained in physical folders. Though most institutions responded to test results indicating that water supplies were out of compliance, they generally did not document these follow-up or mitigation actions; therefore, we excluded this activity from the generalized DFD presented in Fig. 1. Among suppliers, management (central or local) was often responsible for reporting to external entities, and among surveillance agencies, health staff or local management were generally responsible for external reporting (Table 1).
The seven Kenyan institutions that participated in follow-up interviews in 2019 all recommended improvements in data compilation. Suppliers highlighted that managerial and M&E staff spent substantial time digitizing results and would benefit from having additional computers and a database (e.g., Excel) to improve internal record-keeping and data sharing. "Hard copies of data are transferred from the lab to the Technical Manager's office, which is inefficient. This also means that the Technical Manager spends a lot of time inputting data into Excel when this could be done directly by the laboratory staff" (Kenyan Water Supplier). Public health offices faced similar challenges and expressed a desire for transferring data digitally instead of via paper records that were hand-carried from sub-county public health offices to county offices. "One opportunity is to digitize results at the subcounty and county level into an electronic reporting system so reporting can be more efficient" (Kenyan County Public Health Office). In addition, two institutions expressed desire for an internal computer or internet-based data analysis system and subsequent training that would allow them to examine temporal trends in water supply and quality.
Data flows
All institutions reported water quality information to at least one national administrative unit: a health ministry, an environment/water ministry, an independent regulator, or national boards/management bodies (Table 2). Most suppliers reported water quality results to upper management and to a national administrative unit, while surveillance agencies sent data to a wide variety of both local government units and other stakeholders, including health staff, epidemics committees, village committees, non-governmental organizations, and donors (Tables 2 and S1). In some cases, water quality data were a component of a report that included information about other topics (e.g., health or disease data for surveillance agencies, operational performance data for water suppliers). We observed a wide variety of final reporting formats, which were either required by external entities (e.g., health reporting systems) or developed by institutions themselves.
The DFDs depicted in Fig. 2 highlight the processes used by six institutions to report to stakeholders (complete DFDs for all institutions are available online at www.aquaya.org/dfds). Surveillance agencies (top row) had more reporting routes than suppliers (bottom row) (Fig. 2 and Table 2). In addition, surveillance agencies and water suppliers in the same country had differing reporting practices (two Kenyan public health offices are represented in Fig. 2a, b; two Zambian suppliers are represented in Fig. 2e, f). Although institutions regularly shared data with external entities, they rarely received feedback (such as acknowledgement of results, questions about results, or a formal response such as a written summary or rewards/penalties). In Kenya, regulators and managing directors provided feedback to suppliers, while only upper management (i.e., county and sub-county public health officers or directors) provided feedback to surveillance agencies, despite their transmission of data to many other local government units and stakeholders. Routine feedback from upper management consisted of approval of compliance reports before they were sent to external agencies. If compliance reports indicated contamination, upper management generally provided instructions to laboratory personnel and technical managers (suppliers) or public health officers (surveillance agencies) for mitigation (described in Table 3).
Kenyan institutions reported that current reporting systems did not facilitate data sharing: "Data from the sub-county public health offices is not digitized, which is inefficient. A sub-county Public Health Officer must hand deliver water quality test results to the county public health office" (Kenyan County Public Health Office). Suppliers in Kenya noted a similar challenge: water quality data were typically recorded manually in a logbook and then digitized by laboratory or management staff. Limited access to computers and the internet also prevented efficient data sharing. "We do not have a dedicated computer for our office so we share with other departments. It would be more efficient to have a computer at the laboratory so that data can be digitized immediately" (Kenyan Water Supplier). A national electronic reporting system exists to capture health data from county public health offices (the District Health Information System, DHIS), but the database only allows entry of the number of water quality tests conducted, not the actual test results: "The District Health Information System [DHIS] does not have a water quality component so we do not know the quality of water at the local and country level" (Kenyan County Public Health Office).
To improve data sharing, all three interviewed Kenyan surveillance agencies suggested a regional database or integrated national database to capture water quality data, similar to or integrated within the Ministry of Health's (MoH) DHIS. A PHO also noted that an online reporting system would standardize water quality data reporting across counties, though internet access was a challenge. "There should be a national database or reporting tool that captures water quality data from the ground" (Kenyan County Public Health Office). It is important to note that the current DHIS system does not capture data from water suppliers, who instead submit data through a different system to the Water Services Regulatory Board (WASREB) under the Ministry of Water. Kenyan institutions also recommended holding WASH stakeholder meetings, separate from regular public health meetings, to discuss water quality results and concerns with all county stakeholders, including communities reliant on point source types. A sub-county PHO emphasized meetings solely dedicated to water quality: "Regular stakeholder meetings would provide an opportunity to prioritize water quality and discuss any issues that arise" (Kenyan sub-County Public Health Office).

When contamination was detected, all institutions reported acting on the results by verifying contamination, mitigating risks, and/or engaging with consumers. All suppliers reported verifying contamination and/or mitigating risks, while surveillance agencies engaged with consumers (14/15), and, to some extent, verified contamination (4/15) and/or mitigated risks (5/15) (Table 3). As noted above, however, institutions did not document their response actions.

E. Kumpel et al.
Case study: policy and practice in Kenya
We compared the policies and regulations for water quality testing and reporting in Kenya with the actual practices of water suppliers and surveillance agencies. Licensed water suppliers are regulated by WASREB under the Ministry of Water, with water suppliers mandated to report water quality data quarterly and annually to WASREB under section 50 of the 2002 Water Act 33. WASREB has established monitoring requirements that include water quality parameters as well as testing frequency and sample numbers based on populations served and volumes of piped water supplied 34. Suppliers are required to submit a sampling plan to WASREB for each water treatment facility. According to WASREB documents 34, all water supplies must comply with drinking water quality standards established by the Kenya Bureau of Standards, although none of the water suppliers that we interviewed reported penalties for reporting results that did not meet these standards. Notably, the Kenya Bureau of Standards for drinking water 35 list many more water quality parameters than are commonly included in supplier testing programs. For Kenyan surveillance agencies, the national MoH oversees the county public health government but has limited legal authority, due to the devolved transfer of responsibilities from national to county governments under the 2010 Constitution of Kenya. County governments are responsible for water and sanitation provision and the allocation of funds for these services. The MoH does not provide water quality parameter or sampling guidelines and instead refers to the WHO's Drinking Water Quality Guidelines 32.
We examined information systems within WASREB and the MoH as they existed in 2019 (Fig. 3). WASREB had an electronic Water Regulation Information System (WARIS) with reporting requirements that included: i) the number of tests planned and conducted, and ii) number of these samples whose results met the required standard for physicochemical (i.e., turbidity, pH, and residual chlorine) and microbial parameters. Suppliers also reported additional utility performance information, such as coverage, continuity, and financial performance. These metrics were processed into annual Impact Reports that rank utility performance on nine key indicators, one of which is water quality 36 (Fig. 3a). The water quality indicator included the following metrics for chlorine residual and bacteriological testing: (i) the percentage of tests conducted (i.e., number of tests conducted divided by the number of tests planned), and (ii) the percentage of samples meeting water quality standards. Water suppliers that did not meet these standards therefore received a low score for this key indicator. Despite knowledge of WASREB's reporting frameworks, the four Kenyan water suppliers that participated in this study were not complying with WASREB's schedule for reporting microbial water quality results. Other than lowered performance ratings, none had been penalized for noncompliance; however, in theory, low indicator ratings could result in the dismissal of the water supplier's managing director.
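The two water quality metrics described above are simple ratios, and a short worked example may make them concrete. The sketch below is illustrative only and is not WASREB's implementation: the function name and the example figures are hypothetical, and it assumes that the compliance percentage is calculated relative to the samples actually tested, a denominator that the reporting description does not state explicitly. How the two metrics are combined into the final indicator score is not covered here.

```python
def water_quality_metrics(planned, conducted, compliant):
    """Return (percent of planned tests conducted, percent of tested samples compliant).

    The second value is None when no tests were conducted, because the
    compliance percentage is then undefined.
    """
    if planned <= 0:
        raise ValueError("planned must be positive")
    if not 0 <= compliant <= conducted:
        raise ValueError("compliant must be between 0 and conducted")
    pct_conducted = 100.0 * conducted / planned
    # Assumption: compliance is expressed relative to samples actually tested.
    pct_compliant = 100.0 * compliant / conducted if conducted else None
    return pct_conducted, pct_compliant


# Hypothetical reporting period for residual chlorine:
# 120 tests planned, 90 conducted, 81 meeting the standard.
print(water_quality_metrics(120, 90, 81))  # (75.0, 90.0)
```

In this hypothetical case a supplier would meet the standard in 90% of tested samples while having completed only 75% of its planned tests, which illustrates why the two metrics are reported separately.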
To improve information flows, the suppliers suggested that WASREB should conduct more audits: "If WASREB or KEBS [Kenya Bureau of Standards] audited us, we would feel more pressure to sample and submit data" (Kenyan Water Supplier). One supplier suggested adding an emergency reporting component to WARIS: "If there is a cholera outbreak, it is important to report this immediately" (Kenyan Water Supplier). WASREB personnel suggested including a feature in WARIS that allows water suppliers to attach raw data or other supporting documents, as well as redesigning WARIS to allow rural water suppliers not directly regulated by WASREB to submit less detailed water quality data.
Public health offices (surveillance agencies) in Kenya are required to enter monthly health data and the number of water samples tested (but not the results of those tests) into the MoH's DHIS. Monthly and annual reports are then generated and uploaded back into DHIS for access by a variety of other MoH departments and outside stakeholders (Fig. 2b). In practice, however, none of the surveillance agencies that we revisited in 2019 reported water quality data to the MoH.
During the MfSW program, most participating institutions conducted regular tests, although not always at the frequency required by national guidelines or standards 21 . Four years afterwards, most (5/7) institutions were no longer conducting routine microbial water quality testing (Table 4). One county PHO noted, "We rarely share data because we do not do enough testing" (Kenyan sub-County Public Health Office). "We cannot take the appropriate actions without more data [to manage water safety]" (Kenyan Water Supplier). Institutions attributed the lack of testing to insufficient funding to replace broken laboratory equipment, purchase reagents, and cover transportation (i.e., no vehicle for sampling). "Because we have limited resources, we have not conducted routine water quality testing in over a year. Our office does not have enough water quality results to produce summaries or inform decisions" (Kenyan County Public Health Office). Without regular testing, water quality information is not available to inform decisions.
Two of the four water suppliers in Kenya that had participated in MfSW were still conducting regular water quality testing, although at a lower frequency. The other two suppliers tested only when equipment and reagents were available. The three surveillance agencies only tested water in response to customer complaints or disease outbreaks, though they did not specifically document complaints or responses. "We have not conducted routine water quality sampling or analysis since MfSW. We only conduct water quality testing when there is a customer complaint" (Kenyan sub-County Public Health Office). "Water quality testing is reactionary, so we can only confirm contamination rather than fully understand water quality in our community" (Kenyan County Public Health Office). It was more common for all institutions to test water for basic physico-chemical parameters (pH, turbidity, and residual chlorine) rather than microbial parameters (Table 4). "Since running out of consumables after MfSW, we no longer test for bacteriological parameters, only pH and residual chlorine" (Kenyan Water Supplier). Two suppliers reported testing for additional physico-chemical parameters, including alkalinity, conductivity, dissolved oxygen, and various nutrients and heavy metals. All sampling data, inclusive of these parameters, were included in reports to upper management, but only those required for compliance were included in external reporting.

Table 4. Summary of sampling programs in Kenya. "-" indicates that no monitoring agency used the listed frequency, method, testing location, or staff.
a Testing usually done in the field; however, when consumables are unavailable, suppliers will default to lab (MF for bacteriological tests), if possible, or send to a government (external) lab. For operational parameters, if field and lab methods are unavailable, the agency may forego the analysis.
b Samples were sent to an external lab (typically run by County or National government) for analysis and the method could not be determined. These tallies correspond to those listed "External Lab" in Testing Location.

DISCUSSION
Flows of water quality information within institutions followed similar processes in all six countries. Since all institutions had systems for transmitting water quality data, interventions to improve the use of these data should address deficiencies in existing procedures, rather than layering on entirely new reporting systems. Our maps of information flows illustrate the complexity of personnel, activities, and data transmission methods (oral, hand-written, and digital) used to convey water quality information. Several challenges limited institutions' and stakeholders' (i.e., information producers and users) abilities to use information to manage water safety. First, when institutions were testing, they often acted on the results of single samples, and the synthesis of water quality data was uncommon. When institutions were testing frequently, data synthesis could improve understanding of water quality trends and geographic variability and facilitate the use of data for long-term or large-scale planning. Potential routes to support better synthesis include increased digitization of data (e.g., using simple spreadsheet software) and guidance for summarizing data and generating descriptive statistics and graphs. We identified opportunities to improve data literacy throughout institutions by training laboratory technicians, operational managers, and upper management to summarize, analyze, and interpret data, and by setting the expectation that they exercise these skills. Some institutions with more capacity could invest in a dedicated data management position. Second, many surveillance agencies reported data to external agencies; however, they rarely received feedback and often did not know if external entities used the data to inform their activities or programming. Previous work in water resources analyses has suggested that monitoring should be a cycle in which the information gathered is used to inform the monitoring program design 24,37.
We did not find any evidence that this occurred within the institutions that participated in this study. The same gap appeared at the local scale: while the monitoring agency may have informed households of results, the households did not have an outlet to share information back to monitoring agencies (although some water suppliers may have customer complaint mechanisms, these did not come up in conversations specifically about water quality monitoring programs). Two-way communication at all levels would be desirable. Third, weak regulatory environments resulted in poor enforcement of national guidelines or standards for water quality monitoring. Relatedly, many institutions were not testing at the frequency required by national guidelines or standards after the MfSW program ended, and therefore had limited data to use for decision-making.
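The kind of synthesis described above can be illustrated with a minimal sketch. It assumes that test results have already been digitized as simple records; the site names, months, and counts are hypothetical, and the default threshold (any detectable E. coli in a 100 mL sample) is an assumption that would need to be replaced by the applicable national standard.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (sampling site, month, E. coli count in CFU/100 mL).
RECORDS = [
    ("Kiosk A", "2015-01", 0),
    ("Kiosk A", "2015-02", 4),
    ("Kiosk A", "2015-03", 0),
    ("Borehole B", "2015-01", 12),
    ("Borehole B", "2015-02", 30),
    ("Borehole B", "2015-03", 0),
]


def summarize_by_site(records, threshold=0):
    """Return per-site sample count, median count, and share of samples above threshold."""
    by_site = defaultdict(list)
    for site, _month, count in records:
        by_site[site].append(count)

    summary = {}
    for site, counts in by_site.items():
        exceeding = sum(1 for c in counts if c > threshold)
        summary[site] = {
            "n_samples": len(counts),
            "median_cfu": median(counts),
            "share_exceeding": exceeding / len(counts),
        }
    return summary


SUMMARY = summarize_by_site(RECORDS)
for site, stats in SUMMARY.items():
    print(site, stats)
```

Grouping by month instead of by site would give a comparable view of trends over time; the same summaries can be produced with pivot tables in ordinary spreadsheet software.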
Many institutions emphasized their desire to compile and digitize data through mobile platforms. Since the time this study was initiated under the MfSW program (primarily 2013-2015), many mobile phone- and cloud-based data management applications have become available 38 . These technologies present potential advances for improving water quality data flows; however, we observed that water testing typically occurs at centralized locations where it is usually possible to maintain a computer for on-site data entry and management. Furthermore, water quality reports and presentations are commonly produced using document or presentation software, which is easier to use on a computer. Before introducing new digital tools, we suggest that it would be useful to understand the opportunities and challenges for expanding the availability of existing computers and simple spreadsheet programs. We previously made this recommendation to MfSW collaborating organizations when discussing evidence from our study of mobile phone-based applications for water quality data management 20 . Their continued demand for mobile data platforms indicates that our recommendations were not sufficient. Further efforts are needed to improve the availability and use of basic computers in these institutions.
We note several limitations to this study. First, the information flows that we documented generally represent best-case scenarios, as they were supported by the MfSW program. It is important to note that institutions no longer received financial support from Aquaya for water quality testing or monetary incentives for submitting water quality data to Aquaya, as they did during the MfSW program. During the MfSW program, water suppliers and surveillance agencies had access to funds that could be used to purchase equipment and reagents, train staff, and pay for transportation to sampling locations, which thereby increased the amount of water quality data available for internal and external reporting. At the seven Kenyan institutions that we revisited in 2019, water quality testing activities had decreased since the end of the MfSW program in 2016, and the institutions had less water quality data to report (these changes to the DFDs are available online at www.aquaya.org/dfds). Second, information flows are complex and involve many actors; while we attempted to carefully map these processes, we did not verify all of our DFDs with participating institutions. Therefore, it is possible that we misinterpreted or excluded some details. Third, this research primarily focused on the use and transmission of microbial water quality data from drinking water supplies, and we did not analyze the use of data collected on water resources, raw water supplies, and treatment process operations.
Despite these limitations, we note that our results coincide with findings from water safety management studies in other regions. For example, a comparative analysis of water safety management in Brazil, Ecuador, and Malawi identified the following common constraints: the ministries that dealt with drinking water services did not coordinate and share data; water quality regulations were not enforced; and water service providers lacked administrative and technical capacities 39 . These independent conclusions suggest universal challenges for the water sector that call for broad reforms, rather than targeted interventions such as MfSW 21 . To support these reforms, we draw several recommendations from this study for improving the management of drinking water supplies. The first is to strengthen the enforcement of water quality testing and reporting regulations (e.g., build accountability): as indicated by our analysis of water quality data reporting to national-level agencies in Kenya, stronger demand for information could promote the sustainability of data collection systems. Relatedly, reporting on water quality must also be prioritized, such that utilities face consequences, such as a lower utility performance score, for not reporting. The second is to build staff capacity, both within institutions with monitoring responsibilities and within national ministries and regulators, for managing, digitizing, and understanding data. Increased use of water quality data, for example to analyze changes over time and across geographies, could also increase demand for accurate and timely data. Finally, we identified opportunities for improving the efficiency of monitoring programs by layering the collection and analysis of multiple types of data. Water quality data are only one requirement for managing and improving water and sanitation services.
Other relevant data include water supply performance (reliability, sustainability), affordability, and access to sanitation (safety, equity, affordability, and waste management). This joint reporting of different types of water and sanitation information already occurs to some degree in Kenya: water suppliers are required to provide various types of information to WASREB, and surveillance agencies (public health offices) submit a range of health-related information to the MoH DHIS database, though surveillance and supplier data are not currently collated.
Water quality management should be increasingly linked to other functioning information systems.

METHODS

Study sites and data collection
Collaborating institutions were selected through the MfSW program, which was established to document and address constraints to water safety management in sub-Saharan Africa. The selection process, a description of the participating institutions, and the structure of the MfSW program have been described in previous publications 6,21,30,40,41 . In brief, we established research partnerships with 26 institutions from six countries (Ethiopia, Guinea, Kenya, Senegal, Uganda, and Zambia) with regulatory responsibilities for performing operational or surveillance monitoring of drinking water supplies. Eleven institutions were water suppliers responsible for operational monitoring of their sources and distribution systems, including two national suppliers, two regional suppliers, one private water operator association, and six municipal suppliers. Fifteen institutions were surveillance agencies responsible for monitoring all drinking water supplies in their jurisdiction, including one national health ministry, three regional surveillance laboratories, and 11 district health or water offices (Fig. S1 and Table S2).
From November 2012 to July 2015, we collected qualitative and quantitative data on microbial water quality monitoring activities from the 26 institutions through the following activities: (a) needs assessments; (b) midterm assessments; and (c) ongoing communication during testing. Our assessments consisted of semi-structured interviews and observations of water sample collection, testing, and reporting procedures. We conducted interviews with water quality laboratory staff as well as management involved in water testing and reporting. During the data collection period, institutions tested microbial water quality and provided the results electronically to Aquaya on a monthly basis. We followed up via email and telephone to discuss testing details and challenges. In addition, the institutions received the following resources to support their monitoring programs: (1) "start-up funds" to expand their existing water quality testing efforts (as determined to be required through a needs assessment); (2) monthly payments for each test completed above their baseline and up to a previously agreed-upon target number of tests (as determined by applicable guidelines and their baseline rate); and (3) bonus payments to institutions that met agreed-upon testing targets. Institutions were required to submit test results in a digital format to receive the monthly and bonus payments. Due to these interventions, it is likely that the documented information flows represent a 'best case scenario' for these institutions.
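The structure of the monthly incentive described above can be expressed as a short worked sketch. The function name, the per-test rate, the bonus amount, and the example numbers are all hypothetical; the text does not state actual payment values or how often bonuses were paid, so the bonus is simply added to the month in which the target is met.

```python
def monthly_payment(tests_completed, baseline, target, rate_per_test, bonus=0.0):
    """Pay for each test above the baseline, capped at the target, plus a bonus if the target is met.

    All monetary values are illustrative assumptions, not figures from the MfSW program.
    """
    if target < baseline:
        raise ValueError("target must be at least the baseline")

    # Only tests above the baseline and up to the agreed target are paid.
    paid_tests = max(0, min(tests_completed, target) - baseline)
    payment = paid_tests * rate_per_test

    # Bonus applies only when the agreed-upon target is reached.
    if tests_completed >= target:
        payment += bonus
    return payment


# Hypothetical institution: baseline of 10 tests, target of 40 tests per month.
BELOW_BASELINE = monthly_payment(8, baseline=10, target=40, rate_per_test=5.0, bonus=50.0)
PARTIAL = monthly_payment(25, baseline=10, target=40, rate_per_test=5.0, bonus=50.0)
TARGET_MET = monthly_payment(45, baseline=10, target=40, rate_per_test=5.0, bonus=50.0)
```

With these assumed values, an institution below its baseline receives nothing, one completing 25 tests is paid for 15 tests, and one exceeding the target is paid for 30 tests plus the bonus; tests beyond the target earn no additional payment.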
Between April and June 2019, four years after the conclusion of the MfSW interventions, we conducted follow-up research in Kenya with 13 institutions: seven institutions that participated in MfSW (three sub-county and county public health offices and four piped water suppliers) and six county and national ministries and agencies. Data collection consisted of semi-structured interviews as well as a quantitative assessment of water quality data sampling, testing, and reporting actions. At each institution, we interviewed staff involved in the processes of water quality testing and reporting (approximately five staff at each). At regulatory agencies, this included directors for water quality, chief public health officers, and managers. At surveillance agencies, we interviewed county PHOs; PHOs responsible for water quality, sanitation, or monitoring and evaluation; water, sanitation, and hygiene coordinators; and school health or health promotion coordinators. At water suppliers, we interviewed managing directors, technical and operations managers, heads of laboratories, and laboratory assistants.
Data flow diagrams
DFDs are tools for systems analysis to map the inputs, processes, and outputs of a system and can be used to understand how data and information flow through a system or institution 31,42 . Here, we applied the conventions described by Kendall and Kendall 31 . Using NVivo software (QSR International), we entered, coded, and accessed qualitative data (interviews, observations, written surveys, and communication notes), and used this analysis to construct the DFDs for each of the 26 MfSW institutions and two of the regulatory agencies in Kenya. We mapped the information flows within these institutions or agencies for the following: (1) water from a source (e.g., well, spring) or distribution system; (2) the water source manager, household, or consumer; and (3) any institution or group outside of the testing process who received data (e.g., government ministries or other agencies, managing directors, or other management personnel) (Fig. 1). We analyzed the DFDs by identifying patterns observed across institutions, such as the external entities institutions reported to (Table 2) or the format of data in the data stores (Table 1).
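As an illustration only, the cross-institution pattern analysis described above could be sketched by encoding each DFD as a set of external entities and a list of data flows. This is not the procedure used in the study, which relied on coded qualitative data and hand-constructed diagrams; the institution names, elements, and flows below are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str  # element the data leave (process, data store, or entity)
    target: str  # element the data reach
    data: str    # what is transmitted
    medium: str  # "oral", "hand-written", or "digital"


# Hypothetical DFDs for two institutions.
DFDS = {
    "Supplier X": {
        "external_entities": {"Regulator", "Consumers"},
        "flows": [
            Flow("Laboratory", "Results logbook", "test results", "hand-written"),
            Flow("Results logbook", "Regulator", "monthly report", "digital"),
        ],
    },
    "Health Office Y": {
        "external_entities": {"Health ministry", "Consumers"},
        "flows": [
            Flow("Laboratory", "Results logbook", "test results", "hand-written"),
            Flow("Results logbook", "Health ministry", "monthly report", "hand-written"),
            Flow("Laboratory", "Consumers", "test result", "oral"),
        ],
    },
}


def reporting_targets(dfd):
    """External entities that actually receive data in this DFD."""
    return {f.target for f in dfd["flows"] if f.target in dfd["external_entities"]}


# Number of institutions reporting to each external entity.
REPORT_COUNTS = Counter(t for dfd in DFDS.values() for t in reporting_targets(dfd))

# Number of flows using each transmission medium, across all institutions.
MEDIUM_COUNTS = Counter(f.medium for dfd in DFDS.values() for f in dfd["flows"])
```

Tallies of this kind correspond to the two patterns named in the text: which external entities institutions reported to, and the format in which data were stored or transmitted.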
The Western Institutional Review Board (WIRB) (Olympia, WA, USA) determined that this study was exempt from a full ethical review under 45 CFR 46.101(b)(2) of the Common Rule.