Trading contact tracing efficiency for finding patient zero

As the COVID-19 pandemic has demonstrated, identifying the origin of a pandemic remains a challenging task. The search for patient zero may benefit from the widely used and well-established toolkit of contact tracing methods, although this possibility has not been explored to date. We fill this gap by investigating the prospect of performing the source detection task as part of the contact tracing process, i.e., the possibility of tuning the parameters of the process in order to pinpoint the origin of the infection. To this end, we perform simulations on temporal networks using a recent diffusion model that recreates the dynamics of the COVID-19 pandemic. We find that increasing the budget for contact tracing beyond a certain threshold can significantly improve the identification of infected individuals but has diminishing returns in terms of source detection. Moreover, disease variants of higher infectivity make it easier to find the source but harder to identify infected individuals. Finally, we unravel a seemingly intrinsic trade-off between the use of contact tracing to either identify infected nodes or detect the source of infection. This trade-off suggests that focusing on the identification of patient zero may come at the expense of identifying infected individuals.


S1 Details of the Experimental Procedure
The detailed pseudocode of the experimental procedure is presented as Algorithm 1. In the loop in lines 1 to 4, we generate a diffusion process with source v† that infects at least 10% of the network nodes. In line 5, we select the initially infected nodes that are discovered by the party running the tracing process; these are the starting points of tracing. In lines 6 to 9, we initialize the variables used during the tracing process, which takes place in the loop in lines 10 to 23 and continues as long as the budget b* is not exhausted. In a single execution of this loop, we trace the contacts of β_tr nodes, and newly detected infections are collected in the variable D*, initialized in line 11. Tracing the contacts of a given node takes place in a single execution of the loop in lines 12 to 22. In line 14, we select the node v* whose contacts will be traced. We add v* to the set of traced nodes in line 15, and we compute the last day of the tracing window, t_0, in line 16 (we trace the contacts of v* on day t_0 and the δ preceding days). In line 17, we identify the contacts of v* within the tracing window that v* remembers. Out of these, we randomly select 10 and test them for infection in the loop in lines 18 to 21. More precisely, in line 19 we add w (a contact of v*) to the set of tested nodes H, and if it is infected (which is checked in line 20), we record this fact in line 21. After tracing the contacts of β_tr nodes, the newly detected infections are added to the set of known infections D in line 23. Only then can the contacts of the nodes in D* be traced. The results of the tracing process are returned in line 24.
Algorithm 1 (only the following fragments of the listing are recoverable):
(select v*) ▷ Node the contacts of which will be traced
t_0 ← min(τ(v*) + ω_tr, T) ▷ The last day of the tracing window
for w ∈ randomly select 10 from C \ H do
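The tracing loop described above could be sketched in Python as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that the text does not state: the function name `trace_contacts` and all parameter names are hypothetical; one unit of budget is spent per traced node; the node v* is chosen uniformly at random among known, not yet traced infections; τ(v) is passed in as a precomputed mapping `tau`; and every contact in the window is assumed to be remembered.

```python
import random

def trace_contacts(contacts, infected, tau, seeds, budget, breadth,
                   offset, delta, horizon, tests_per_node=10, rng=None):
    """Hypothetical sketch of the tracing loop.

    contacts: dict mapping day -> dict mapping node -> set of contacted nodes
    infected: set of truly infected nodes (ground truth used for testing)
    tau:      dict mapping node -> reference day of that node (assumption)
    seeds:    initially discovered infections (starting points of tracing)
    """
    rng = rng or random.Random()
    known = set(seeds)    # D: known infections
    traced = set()        # nodes whose contacts were already traced
    tested = set()        # H: nodes that were tested
    while budget > 0:
        newly_found = set()   # D*: infections found in this round
        progressed = False
        for _ in range(breadth):          # trace contacts of beta_tr nodes
            candidates = sorted(known - traced)
            if not candidates or budget == 0:
                break
            v = rng.choice(candidates)    # assumption: uniform random choice
            traced.add(v)
            budget -= 1                   # assumption: one unit per traced node
            progressed = True
            t0 = min(tau[v] + offset, horizon)   # last day of the window
            window_contacts = set()
            for day in range(max(0, t0 - delta), t0 + 1):
                window_contacts |= contacts.get(day, {}).get(v, set())
            pool = sorted(window_contacts - tested)
            for w in rng.sample(pool, min(tests_per_node, len(pool))):
                tested.add(w)
                if w in infected:
                    newly_found.add(w)
        known |= newly_found  # only now can nodes in D* be traced
        if not progressed:    # nothing left to trace
            break
    return known, tested
```

In this sketch, a larger `breadth` delays the moment at which newly detected infections become eligible for tracing, which is the mechanism through which β_tr shifts the process between breadth and depth.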

S2 Experiments with Real-Life Networks
In this section, we present the results of our experiments with real-life temporal networks. In particular, we consider the following datasets:
• Hospital [6]: a network of contacts between patients and health care workers in a geriatric unit of a university hospital, collected using wearable proximity sensors, consisting of 73 nodes and 1,381 edges;
• Dormitory [2]: a network of contacts of students living in a dormitory, collected via Bluetooth, consisting of 74 nodes and 2,516 edges;
• Office [1]: a network of face-to-face contacts of staff of an office building, collected using RFID badges, consisting of 219 nodes and 16,725 edges;
• Primary school [5]: a network of contacts of students and teachers in a primary school, collected via radio frequency identification devices, consisting of 238 nodes and 5,541 edges;
• Conference [4]: a network of face-to-face contacts between attendees of a medical conference, collected using RFID badges, consisting of 403 nodes and 65,355 edges;
• Copenhagen [3]: a network of contacts of university students, collected via Bluetooth as part of the Copenhagen Networks Study, consisting of 672 nodes and 21,318 edges.
For each of these networks we perform the same experimental procedure as for the random networks in the main article.
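The first step of that procedure, generating an outbreak that reaches at least 10% of the nodes, could be sketched as below. Note that this sketch substitutes a simple susceptible-infected process for the COVID-19 diffusion model used in the article, purely for illustration; the function name `simulate_outbreak`, the per-contact infection probability `p_infect`, and the retry limit are all hypothetical.

```python
import random

def simulate_outbreak(contacts, nodes, horizon, p_infect,
                      min_fraction=0.1, max_attempts=1000, rng=None):
    """Draw a source and spread a simple SI process over a temporal network.

    contacts: dict mapping day -> dict mapping node -> set of contacted nodes
    Returns (source, infection_day), where infection_day maps each infected
    node to the day on which it was infected.
    """
    rng = rng or random.Random()
    nodes = sorted(nodes)
    for _ in range(max_attempts):
        source = rng.choice(nodes)
        infection_day = {source: 0}
        for day in range(horizon + 1):
            # Nodes infected before today's contacts are processed.
            spreaders = list(infection_day)
            for v in spreaders:
                for w in contacts.get(day, {}).get(v, ()):
                    if w not in infection_day and rng.random() < p_infect:
                        infection_day[w] = day
        # Accept only outbreaks that reach the required share of nodes.
        if len(infection_day) >= min_fraction * len(nodes):
            return source, infection_day
    raise RuntimeError("no outbreak reached the required size")
```

Rejecting small outbreaks, as in the acceptance check above, means that the tracing process is always evaluated on an infection of non-trivial size.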
The results of our simulations are presented in Figure S1. The results are largely consistent with those presented for large random networks in Figure 3 of the main article. In particular, adjusting the tracing breadth parameter β_tr appears to have a greater impact than adjusting the tracing window offset parameter ω_tr: greater values of β_tr lead to more infections being identified, but leave the earliest detected infection further from the source.

Figure S1: The effectiveness of tracing for varying β_tr and ω_tr in real-life networks. In each plot, the x-axis corresponds to the tracing breadth parameter β_tr (with greater values indicating more focus on breadth). The y-axis corresponds to the tracing window offset parameter ω_tr (with greater values indicating a window shifted towards the future). The plots in the first and third rows present the number of infected nodes detected by the tracing process; colours closer to red indicate more effective detection. The plots in the second and fourth rows present the number of edges between the earliest detected infection and the actual source; colours closer to blue indicate more effective detection. Each pair of plots shows results for a different real-life network, with tracing budget b = 10. The results are presented as an average over 100 simulations, with a new infection process generated for every simulation.
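The source detection measure used in these plots, the number of edges between the earliest detected infection and the actual source, could be computed with a breadth-first search, as in the sketch below. The function name `distance_to_source`, the use of a static adjacency mapping, and the `detected_day` mapping that defines which detected infection counts as the earliest are assumptions made for illustration.

```python
from collections import deque

def distance_to_source(adjacency, detected_day, source):
    """Number of edges between the earliest detected infection and the source.

    adjacency:    dict mapping node -> iterable of neighbouring nodes
    detected_day: dict mapping each detected infected node -> its day
    Returns None if the source cannot be reached.
    """
    # Earliest detected infection; ties are broken by node identifier.
    earliest = min(detected_day, key=lambda v: (detected_day[v], v))
    dist = {earliest: 0}
    queue = deque([earliest])
    while queue:
        v = queue.popleft()
        if v == source:
            return dist[v]
        for w in adjacency.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None
```

A value of 0 means that the earliest detected infection is the source itself.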

[Figure panels: Barabási-Albert, Erdős-Rényi, Watts-Strogatz; axes: number of nodes, distance to the real source.]

Figure S3: Comparison of the effectiveness of source detection during contact tracing with that of other source detection algorithms. In the plot, each group of bars corresponds to a different method of source detection, while the y-axis corresponds to the distance from the detected source to the real source. The results are presented for networks with 10,000 nodes generated using different models, either Barabási-Albert, Erdős-Rényi, or Watts-Strogatz, with tracing budget b = 100 and tracing breadth β_tr = 10. The results are presented as an average over 1,000 simulations, with a new network generated for every simulation. The error bars correspond to the 95% confidence intervals.

Figure S4: The effectiveness of tracing for varying infectious period. In each plot, the x-axis corresponds to the infectious period γ^−1 of the infection expressed in days, while the y-axis corresponds to either the percentage of infected nodes that were detected or the distance to the real source. The results are presented for networks with 10,000 nodes generated using different models, either Barabási-Albert, Erdős-Rényi, or Watts-Strogatz, with tracing budget b = 100 and tracing breadth β_tr = 10. The results are presented as an average over 1,000 simulations, with a new network generated for every simulation. The coloured areas correspond to the 95% confidence intervals.
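The averages and 95% confidence intervals reported in these figures could be obtained from per-simulation results as sketched below. The text does not state how the intervals were computed, so the normal approximation used here (mean ± 1.96 standard errors) is an assumption, and the function name `mean_with_ci95` is hypothetical.

```python
import math
import statistics

def mean_with_ci95(samples):
    """Mean of the samples with a normal-approximation 95% confidence interval.

    samples: one value per simulation, e.g. the distance to the real source.
    Returns (mean, lower bound, upper bound).
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width
```

With 1,000 simulations per setting, the normal approximation is usually adequate; for much smaller samples a Student's t quantile would be the more cautious choice.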