Machine-learned epidemiology: real-time detection of foodborne illness at scale

Machine learning has become an increasingly powerful tool for solving complex problems, but it remains underutilized in public health. The objective of this study is to test the efficacy of a machine-learned model of foodborne illness detection in a real-world setting. To this end, we built FINDER, a machine-learned model for real-time detection of foodborne illness using anonymous and aggregated web search and location data. We computed the fraction of people who visited a particular restaurant and later searched for terms indicative of food poisoning to identify potentially unsafe restaurants. We used this information to focus restaurant inspections in two cities and demonstrated that FINDER improves the accuracy of health inspections; restaurants identified by FINDER are 3.1 times as likely to be deemed unsafe during the inspection as restaurants identified by existing methods. Additionally, FINDER enables us to ascertain previously intractable epidemiological information; for example, in 38% of cases the restaurant potentially causing food poisoning was not the last one visited, which may explain the lower precision of complaint-based inspections. We found that FINDER is able to reliably identify restaurants that have an active lapse in food safety, allowing for implementation of corrective actions that would prevent the potential spread of foodborne illness.

In Las Vegas, restaurants are routinely inspected once a year and can also be inspected as a result of a complaint, but this is rare (Las Vegas received 15 complaints during the 4-month experimental period). Inspectors are given a list of restaurants to inspect at the beginning of the year and must complete all inspections by the end of the year, in whatever order they choose. A routine inspection is a risk-based process addressing the food establishment's control over the following five areas of risk for foodborne illness: personal hygiene, approved food source, proper cooking temperatures, proper holding times and temperatures, and sources of contamination. Violations are weighted based on their likelihood to directly cause a foodborne illness, and are divided into critical violations (5 demerits each, e.g., food handlers not washing hands between handling raw food and ready-to-eat food), major violations (3 demerits each, e.g., hand sink not stocked with soap), and good food management practices (no demerit value, e.g., leak in the hand sink). Demerits are converted to letter grades, where 0-10 is an A, 11-20 is a B, 21-39 is a C, and 40+ is an F (immediate closure). A repeated violation of a critical or major item causes the letter grade to drop to the next lower rank. Any establishment that receives a grade lower than an A must undergo a re-inspection to confirm that all critical and major violations have been corrected.
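The demerit arithmetic and grade thresholds described above can be sketched as follows. This is an illustrative reading of the text, not SNHD's actual scoring software: the function names are hypothetical, and capping a repeated-violation downgrade at F is an assumption, since the text only says that the grade drops to the next lower rank.

```python
GRADES = ["A", "B", "C", "F"]  # ordered from best to worst


def total_demerits(critical: int, major: int) -> int:
    """Critical violations carry 5 demerits each, major violations 3 each.

    Good food management practice items carry no demerits, so they are
    not part of the total.
    """
    if critical < 0 or major < 0:
        raise ValueError("violation counts must be non-negative")
    return 5 * critical + 3 * major


def letter_grade(demerits: int, repeated_violation: bool = False) -> str:
    """Convert demerits to a letter grade using the thresholds in the text."""
    if demerits < 0:
        raise ValueError("demerits must be non-negative")
    if demerits <= 10:
        rank = 0  # A
    elif demerits <= 20:
        rank = 1  # B
    elif demerits <= 39:
        rank = 2  # C
    else:
        rank = 3  # F (immediate closure)
    if repeated_violation:
        # Assumption: the drop is one rank and cannot go below F.
        rank = min(rank + 1, len(GRADES) - 1)
    return GRADES[rank]
```

For instance, one critical and one major violation total 8 demerits, which is an A unless one of the items is a repeated violation, in which case the grade drops to a B.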
Whenever a food establishment was identified by FINDER, the assigned inspector was instructed to conduct a standard routine inspection on two restaurants: the FINDER-flagged restaurant and a matched restaurant from the routine inspection list. Matched restaurants were selected at random based on their location and permit type to match the FINDER-flagged restaurants. The inspectors were not aware of which restaurant was flagged by FINDER, so that each facility received the same risk-based inspection. The venue owner/manager was also not aware of the experiment and was told by the inspector that a routine inspection was being conducted. Google staff was not privy to inspector assignment. After the end of the experiment, the SNHD staff collected information about the number and the type of violations found at the FINDER-identified restaurants and at the matched restaurants. Only this set of restaurants had detailed information about the count and severity of violations that could be compared across the two cities, and thus only these restaurants were used in the analysis of violation counts. Matched restaurants were included in the ROUTINE subset of the BASELINE group.
In Chicago, the Chicago Department of Public Health (CDPH) performs initial health inspections prior to the opening of a food establishment, and assigns a risk level, either 1, 2 or 3, to the establishment that will determine the frequency of future routine inspections. Establishments at risk level 1 (the highest risk) are inspected twice a year, those at level 2 are inspected once a year, and those at level 3 are inspected every other year. During an inspection, inspectors look for serious or critical violations. A serious violation indicates a "potential health hazard" that must be corrected within a timeline established by the inspector, and if it has not been remedied on re-inspection, the establishment is closed. A critical violation poses "an immediate health hazard" and must be fixed while the inspector is present or else the restaurant is closed. For a complete list of violations, see Supplementary Table S1. For the violation count analyses, critical violations in Chicago were grouped with critical violations in Las Vegas, and serious violations in Chicago were grouped with major violations in Las Vegas.
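The cross-city grouping used for the violation count analyses can be illustrated with a small lookup. This is a minimal sketch under stated assumptions: the city keys, the harmonized labels "critical" and "major", and the function name are hypothetical, and violation types outside the two compared categories (for example, items with no demerit value) are simply ignored.

```python
from collections import Counter

# Harmonized category for each (city, local violation type) pair.
# Chicago "serious" violations are counted together with Las Vegas "major".
HARMONIZED = {
    ("chicago", "critical"): "critical",
    ("chicago", "serious"): "major",
    ("las_vegas", "critical"): "critical",
    ("las_vegas", "major"): "major",
}


def count_harmonized(city: str, violation_types: list[str]) -> dict[str, int]:
    """Count violations per harmonized category for one inspection."""
    counts = Counter()
    for vtype in violation_types:
        key = (city.lower(), vtype.lower())
        if key in HARMONIZED:  # skip types that are not compared across cities
            counts[HARMONIZED[key]] += 1
    return {"critical": counts["critical"], "major": counts["major"]}
```

With this coding, a Chicago inspection that records one critical and two serious violations contributes one critical and two major violations to the pooled counts.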
In addition to these routine inspections, restaurants can also be inspected when CDPH receives a complaint. Chicago has an advanced complaint system that includes complaints generated from phone calls, Foodborne Chicago (a social media mining system), as well as a predictive analytics system. Complaints produced by any of these mechanisms receive higher inspection priority than routine inspections.
All FINDER-identified restaurants were inspected according to the standard protocol. The inspectors were not aware of the experiment, and therefore were not aware which restaurants were flagged by FINDER, so that each establishment received the same risk-based inspection.
The venue owner/manager was also not aware of the experiment and was told by the inspector that a routine inspection was being conducted. Google staff was not privy to inspector assignment.

Sensitivity Analyses
We conducted a series of sensitivity analyses to assess the robustness of our method. First, we analyzed the results for each of the cities separately, and observed similar results; FINDER restaurants were more likely to be unsafe than BASELINE restaurants (Supplementary Table S3).
In Chicago, if an issue is found during an inspection, a re-inspection is conducted shortly thereafter to determine whether the establishment has complied with the requests of the inspectors. Since this type of inspection is slightly different from a typical routine or complaint-based inspection, we conducted a sensitivity analysis where these re-inspections were excluded. The results were qualitatively similar, except the increased sensitivity of FINDER was no longer statistically significant when compared to complaints (Supplementary Table S4).
In our main experiment, we considered restaurants as unsafe if they either passed an inspection with conditions or failed outright. We also assessed FINDER's precision in identifying unsafe restaurants under a more restrictive definition of unsafe, namely, only considering restaurants with the most serious violations (grade C or worse in Las Vegas or a Closure in Chicago) as unsafe. In this analysis, FINDER again identified a higher fraction of unsafe restaurants than routine and complaint-based inspections (Supplementary Tables S5-S7).
Additionally, we compared the adjusted mean number of critical and major violations within each city and found similar results (Supplementary Tables S8 and S9). Finally, we used a multinomial logistic regression with three possible values of the dependent variable: safe (grade A in Las Vegas or Pass in Chicago), unsafe (grade C or worse in Las Vegas or Fail in Chicago), and a new semi-safe category (grade B in Las Vegas or Pass with Conditions in Chicago). In this analysis, we again found that both unsafe and semi-safe outcomes were more likely to occur in FINDER-flagged restaurants than in BASELINE restaurants (Supplementary Table S10).
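The three-level outcome coding used for the multinomial model can be written down explicitly. The sketch below is illustrative only: the function names and string labels are hypothetical, "grade C or worse" is read as grades C and F, and collapsing the semi-safe and unsafe categories into a single binary "unsafe" flag is an assumption about how the main-experiment definition maps onto Las Vegas grades.

```python
# Outcome categories per city, following the definitions in the text.
LAS_VEGAS_OUTCOMES = {"A": "safe", "B": "semi-safe", "C": "unsafe", "F": "unsafe"}
CHICAGO_OUTCOMES = {
    "pass": "safe",
    "pass with conditions": "semi-safe",
    "fail": "unsafe",
}


def outcome_category(city: str, result: str) -> str:
    """Return 'safe', 'semi-safe', or 'unsafe' for one inspection result."""
    city_key = city.strip().lower()
    if city_key == "las vegas":
        table, key = LAS_VEGAS_OUTCOMES, result.strip().upper()
    elif city_key == "chicago":
        table, key = CHICAGO_OUTCOMES, result.strip().lower()
    else:
        raise ValueError(f"unknown city: {city!r}")
    if key not in table:
        raise ValueError(f"unknown result for {city}: {result!r}")
    return table[key]


def is_unsafe_binary(city: str, result: str) -> bool:
    """Binary flag: anything other than 'safe' counts as unsafe (assumption)."""
    return outcome_category(city, result) != "safe"
```

A model such as multinomial logistic regression would then take the three-level category as its dependent variable, with "safe" as the natural reference level.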
Additionally, we conducted a sensitivity analysis where we compared the likelihood of being deemed unsafe among FINDER restaurants versus all restaurants, including both FINDER and BASELINE restaurants. Here again, we found that FINDER-identified restaurants are more likely to be closed than FINDER+BASELINE restaurants. Finally, we compared FINDER+BASELINE vs BASELINE to examine the additive effect of including FINDER in routine inspections. Here, the direction of the results is the same: adding FINDER to the baseline inspections increased the rate at which unsafe restaurants were identified. However, given the relatively small number of FINDER restaurants, the impact is also relatively small (Supplementary Table S11).

Tables
Table S1. List of all critical and serious violations that could arise during an inspection by the Chicago Department of Public Health.

Violation Number | Health Standard to Be Met | Violation Type
--- | --- | ---
1 | · All food shall be from sources approved by health authorities and safe for human consumption. · Shellfish shall be obtained from an approved source and kept in their original package until sold. · Molluscan shell stock shall be obtained in containers bearing legible source identification tags or labels. | Critical
2 | · All food establishments that prepare, sell, or store hot food shall have adequate hot food storage facilities. · All food establishments that display, prepare, or store potentially hazardous food shall have adequate refrigerated food storage facilities. | Critical
3 | · All hot food shall be stored at a temperature of 140°F or higher. · All cold food shall be stored at a temperature of 40°F or less. | Critical
4 | All food shall be protected from contamination and the elements, and so shall all food equipment, containers, utensils, food contact surfaces and devices, and vehicles. | Critical
5 | No person affected with or carrying any disease in a communicable form or afflicted with boils, infected wounds, sores, acute respiratory infection, or intestinal disorder shall work in any area of a food establishment in any capacity where there is a likelihood of that person contaminating food or food contact surfaces. | Critical
6 | All employees who handle food shall wash their hands as often as necessary to maintain a high degree of personal cleanliness and should conform to hygienic practices prescribed by the Board of Health. | Critical
7 | Hand washing of all tableware and drinking utensils shall be accomplished by the use of warm water at a temperature of 110°F to 120°F containing an adequate amount of detergent effective to remove grease and solids. | Critical
8 | Equipment and utensils should get proper exposure to the sanitizing solution during the rinse cycle. Bactericidal treatment shall consist of exposure of all dish and utensil surfaces to a rinse of clean water at a temperature of not less than 180°F. | Critical
9 | All food establishments shall be provided with an adequate supply of hot and cold water under pressure properly connected to the city water supply. | Critical
10 | In food establishments, there shall be adequate sewage and waste water disposal facilities that comply with all requirements of the plumbing section of the Municipal Code of Chicago. | Critical
11 | Adequate and convenient toilet facilities shall be provided. They should be properly designed, maintained, and accessible to employees at all times. | Critical
12 | Adequate and convenient hand washing facilities shall be provided for all employees. | Critical
13 | All necessary control measures shall be used to effectively minimize or eliminate the presence of rodents, roaches, and other vermin/insect infestations. | Critical
14 | A separate and distinct offense shall be deemed to have been committed for each Serious violation that is not corrected upon re-inspection by the health authority. | Critical
15 | Food once served to a consumer shall not be re-served, with the exception of packaged food remaining in its original, unopened package. | Serious
16 | All food should be properly protected from contamination during storage, preparation, display, service, and transportation. | Serious
17 | Thawing frozen food for further processing shall be accomplished by storage in a refrigerator at 40°F or less, or by other approved method. | Serious
18 | All necessary control measures shall be used to effectively minimize or eliminate the presence of rodents, roaches, and other vermin and insects on the premises of all food establishments, in food-transporting vehicles, and in vending machines. | Serious
19 | The area outside of the establishment used for the storage of garbage shall be clean at all times and shall not constitute a nuisance. | Serious
20 | All garbage and rubbish containing food wastes shall, prior to disposal, be stored in metal containers with tight fitting lids and shall be kept covered except when opened for the disposal or removal of garbage. | Serious
21 | A certified food service manager must be present in all establishments at which potentially hazardous food is prepared or served. | Serious
22 | All dishwashing machines shall maintain proper water pressure and must be provided with suitable thermometers, chemical test kits, and gauge cocks. | Serious
23 | Dishes and other utensils shall be rinsed or scraped to remove gross food particles and other soil before washing. | Serious
24 | All dishwashing machines must be of a type that complies with all requirements of the plumbing section of the Municipal Code of Chicago and Rules and Regulations of the Board of Health. | Serious
25 | Only such poisonous and toxic materials as are required to maintain sanitary conditions may be used in food establishments and they shall not be used in any hazardous manner. | Serious
26 | When toilet and lavatory facilities are provided for the patrons of food establishments, such facilities shall be adequate in number, convenient, accessible, properly designed, and installed according to the municipal code. | Serious
27 | In all food establishments, toilet facilities shall be kept clean and in good repair and shall include an adequate supply of hot and cold or tempered water, soap, and approved sanitary towels or other approved hand-drying devices. |