Nested pool testing strategy for the diagnosis of infectious diseases

The progress of the SARS-CoV-2 pandemic requires the design of large-scale, cost-effective testing programs. Pooling samples provides a solution if the tests are sensitive enough. In this regard, the use of the gold standard, RT-qPCR, raises some concerns. Recently, droplet digital PCR (ddPCR) was shown to be 10–100 times more sensitive than RT-qPCR, making it more suitable for pooling. Furthermore, ddPCR quantifies the RNA content directly, a feature that, as we show, can be used to identify nonviable samples in pools. Cost-effective strategies require the definition of efficient deconvolution and re-testing procedures. In this paper we analyze the practical implementation of an efficient hierarchical pooling strategy whose optimal design we have recently derived, determining the best ways to proceed when there are impediments to the use of the absolute optimum or when multiple pools are tested simultaneously and there are restrictions on the throughput time. We also show how the ddPCR RNA quantification and the nested nature of the strategy can be combined to perform self-consistency tests that improve the identification of infected individuals and nonviable samples. These studies should be useful to those considering pool testing for the identification of infected individuals.


Background on pool testing
Group or pool testing has been extensively studied for two main applications: (i) to identify "defective" units (in the present case, infected individuals) when the prevalence, p, defined as the probability of being defective, is relatively low [1]; (ii) to estimate p [2]. In the present manuscript we mostly focus on the identification problem. This is the key aspect for the expansion of the testing capacity, the underlying assumption being that a negative result in a single reaction indicates that none of the individuals whose samples are included in the tested pool is infected.
The first proposal of a pooling strategy for this aim was made by Dorfman [3] in 1943. In this strategy, pools are formed containing a number, m, of individual samples. One test is run on each pool to detect the presence of a defective or infected sample in it. Assuming that the detection is sensitive enough, the samples in the pools that test negative at this first stage are identified as non-infected. All the samples that belong to pools that test positive are tested individually at a second stage. The samples that belong to pools that test negative at first are thus identified with only one test, while those that belong to pools that test positive require m + 1 tests instead. A large reduction in the number of tests is then achieved depending on the fraction of pools that test positive at the first stage. This scheme is an example of a class of models that are called adaptive, hierarchical or sequential. In models of this type, the actions to be taken at any given stage depend on the test results of the previous stages. Dorfman's two-stage strategy was subsequently improved by others [4][5][6][7][8] by including more stages to further reduce the number of tests that are necessary to identify the infected samples. In particular, the method of [6], which was recently optimized in [9], achieves an important reduction at the expense of having a random, in principle large, number of stages. Alternatively, non-adaptive methods can be applied, in which the way in which the samples are pooled and tested is established before having the result of any test. This definition of the pools a priori allows an easy parallelization of the strategy [10] and, depending on the method, can also help to reduce the impact of test errors by including all individual samples in more than one pool.
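For reference, Dorfman's two-stage scheme has a one-line cost per individual, 1/m + (1 − (1 − p)^m), which is easy to explore numerically. A minimal sketch, assuming error-free tests (the function names are ours):

```python
def dorfman_cost(m: int, p: float) -> float:
    """Expected tests per individual in Dorfman's two-stage scheme:
    one pooled test per m individuals, plus m individual retests
    whenever the pool is positive (probability 1 - (1 - p)^m)."""
    q = 1.0 - p
    return 1.0 / m + (1.0 - q ** m)

def best_dorfman_pool(p: float, m_max: int = 100) -> int:
    """Pool size minimizing the expected tests per individual (brute force)."""
    return min(range(2, m_max + 1), key=lambda m: dorfman_cost(m, p))

m_opt = best_dorfman_pool(0.01)
print(m_opt, round(dorfman_cost(m_opt, 0.01), 4))
```

At p = 0.01 the brute-force search returns a pool size close to the well-known rule of thumb 1/√p, with a cost of about 0.2 tests per individual.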
One of the most common such strategies is array testing, in which each sample is included in at least two pools; if both pools test negative the individual samples are labeled as not infected, otherwise they are tested individually [11][12][13]. There are also mixed strategies, called semi-adaptive, that try to exploit the advantages of the different types of approaches simultaneously [14]. According to [10], the mixed strategy of [14] is the best two-stage strategy in terms of reducing the number of tests per individual.
Various pool testing strategies have been analyzed specifically for the case of SARS-CoV-2. In [15] Dorfman's algorithm is applied, including the use of replicates to check for false negatives or positives. In [16], Dorfman's strategy is compared numerically with adaptive and non-adaptive methods that use, at some stage, binary splitting. The work in [17] evaluates numerically the performance of two-dimensional array pooling in which the individual samples are organized in an r × c array. In [18] a D-dimensional array pooling strategy is evaluated assuming that there are very few infected samples among the tested ones. This strategy is being used for large-scale testing in Rwanda [19]. In [20], both Dorfman's and array algorithms are evaluated on practical implementations. In [21] a non-adaptive group testing approach is presented which uses a combinatorial pooling strategy that is based on compressed sensing and is designed to maximize the identification of the positive samples. So far, sample pooling strategies for SARS-CoV-2 nucleic acid detection have mostly been explored for the current gold standard test, RT-qPCR [22], finding that, for large enough viral loads, the identification of a single infected individual is possible in pools of up to 100 samples (1 infected + 99 negative samples). Despite the initial excitement, it is accepted that in RT-qPCR the target can go unidentified if present in small amounts, and that samples frequently contain inhibitors of the PCR reaction [23]. Moreover, diagnostic sensitivity in symptomatic patients was reported to be in the range of 60%–90% [24]. Thus, for samples with low viral loads, the pooling-associated dilution will further reduce the detection of infected individuals to unacceptable levels. The risk of increasing the number of false negative results has raised concerns and delayed the approval of group testing protocols, limiting pool sizes to single-digit numbers.
This has reduced the impact of pooling strategies and, consequently, the expansion of the world testing capacity.

Nested pooling strategy
The hierarchical pooling strategy analyzed in the paper is characterized by a number of stages, k + 1, and a sequence of pool sizes, m = (m_1, ..., m_k), where m_1 > ... > m_k > 1 and m_j is a multiple of m_{j+1} for all j. At the last, (k + 1)-th, stage the samples are tested individually, i.e., m_{k+1} = 1. It was first introduced for k = 2 [7] and later generalized to any number of stages [8]. Given the infection probability, p, the optimal strategy is characterized by the values, k and m, that minimize the cost (the expected number of tests per individual). In [25] we derived analytic expressions for the parameters that define the optimal strategy for any given p. We present in what follows a brief description of the strategy and the main results obtained in [25] that are used in the present paper. Readers interested in our approach can try it at [26]. Given that different facilities might have different constraints, the site offers the possibility of choosing one or several constraints to start with and returns not only the optimal strategy given the constraints but several suboptimal ones as well. In this way users can choose the strategy that best meets their needs, even if it is not the absolute optimum in terms of resource savings.

Presentation of the strategy
Let us assume that N individuals have to be tested and that p is the probability that any one of them is infected. At the first stage we subdivide the N samples into disjoint groups of size m_1 and combine the m_1 samples of each group into a single pool that is tested. Once the first-stage tests are performed, we subdivide the samples of each of the pools that detected the presence of at least one infected sample (i.e., that tested positive) into m_1/m_2 pools of m_2 samples each. We perform the test on these m_2-sample pools and proceed as before: those that test positive are subdivided into m_2/m_3 pools of m_3 samples each. The procedure is iterated until the (k + 1)-th stage is reached, at which point all the individual samples of the pools that tested positive at the k-th stage are tested.
Fig. 1 (caption): The strategy with k = 3 and m = (27, 9, 3) applied to an initial pool (rectangle) with samples (circles) coming from the 27 individuals in the set W_{i_1} with i_1 = 0, three of which are infected (red circles in the figure). Each pool of each stage is subdivided into 3 pools at the following stage. The figure illustrates the labels that we assign to the different pools at the various stages and how this can be done a priori. In any practical implementation of the strategy, only the pools with infected samples pass to the following stage. In the example of the figure, the test performed on W_0 is positive (because there are samples of infected individuals in W_0). The subpools, W_00, W_01 and W_02, of W_0 are then tested at the second stage, the first two of which turn out to be positive because they contain samples from infected individuals. All the subpools of W_00, W_01 and W_02 are depicted in the third-stage row of the figure, but only those that are tested at this stage contain circles colored red or green, for infected or not infected samples, respectively. The grey circles correspond to samples identified as not infected at previous stages. Thus, only W_000, W_001, W_002, W_010, W_011 and W_012 are tested at the third stage, three of which turn out to be positive, so that 9 samples are tested individually at the last (4th) stage. The total number of tests performed in this case is 19, which is smaller than 27, and the cost is 0.7. As illustrated in Fig. 1 of the paper, this situation occurs in less than 5% of the cases for which the strategy with k = 3 and m = (27, 9, 3) is optimal.
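The counting in this example can be reproduced with a short simulation of the nested procedure. This is our own illustrative sketch (error-free tests assumed), with the infected positions chosen to mimic the figure:

```python
def nested_tests(infected, sizes=(27, 9, 3, 1)):
    """Number of tests needed to resolve one first-stage pool under the
    nested strategy with pool sizes (m_1, ..., m_k, 1).  `infected` is a
    set of sample indices in {0, ..., m_1 - 1}."""
    tests = 0
    pools = [0]  # start indices of the pools tested at the current stage
    for stage, size in enumerate(sizes):
        next_pools = []
        for start in pools:
            tests += 1  # one test per pool at this stage
            if size > 1 and any(start <= i < start + size for i in infected):
                step = sizes[stage + 1]  # positive pools are split further
                next_pools.extend(range(start, start + size, step))
        pools = next_pools
    return tests

# Three infected samples located in W_000, W_001 and W_010,
# as in the example of Fig. 1: 19 tests instead of 27.
print(nested_tests({0, 3, 9}))
```

With no infected samples a single test suffices, and with all infected samples confined to one minimal subpool the count is 10, in agreement with the distribution discussed below.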
In order to understand the optimization of the procedure it is best to introduce a labeling of the pools at all stages from the very beginning, as if none of them ever tested negative. To this end, let us call W_{i_1} the set of individuals whose samples are contained in the i_1-th pool of the first stage. We subdivide this set into m_1/m_2 subsets with m_2 individuals each. Let us call W_{i_1 i_2} the sets of individuals whose samples, at the second stage, are in the i_2-th subpool of the first-stage i_1-th pool. We repeat this labeling at all stages, so that at the k-th one we have subsets W_{i_1 ... i_k}. For the sake of simplicity, we will also refer to the pools of samples by these same names, W_{i_1 ... i_k}. To each of these subsets we assign a variable, φ, such that φ = 0 if there is at least one infected individual in the subset and φ = 1 otherwise; we write φ_{i_1 ... i_j} for the variable associated with W_{i_1 ... i_j}. We illustrate this labeling in Fig. 1, where we depict one first-stage pool, W_0, and its subdivision for an example with 4 stages (k = 3) that starts with m_1 = 27 and in which each (positive) pool is subdivided into three new pools at each successive stage. The name "nested" reflects the fact that samples that come from different positive pools at a given stage are not mixed for testing at subsequent stages.

Number of tests
Using the labeling illustrated in Fig. 1 and the definition of φ we can write the number of tests, T^{(j)}_k(m, p), that need to be performed at each stage, 2 ≤ j ≤ k + 1, of the (k + 1)-stage strategy with pool sizes m = (m_1, ..., m_k) to identify the infected individuals in each first-stage set, W_{i_1}. This number is equal to m_{j-1}/m_j times the number of pools that test positive at the (j − 1)-th stage:

T^{(j)}_k(m, p) = (m_{j-1}/m_j) Σ_{i_2, ..., i_{j-1}} (1 − φ_{i_1 i_2 ... i_{j-1}}),    (1)

where we have introduced m_{k+1} = 1. The only non-zero terms in Eq. (1) correspond to the m_{j-1}-sample pools with φ_{i_1 i_2 ... i_{j-1}} = 0 (i.e., with infected samples). The probability of this event is 1 − q^{m_{j-1}}, with q = 1 − p. Taking into account that all pools are tested at the first stage, the expected value of the total number of tests needed to identify the infected individuals in any first-stage pool with the m = (m_1, ..., m_k) strategy is then given by:

E[T_k(m, p)] = 1 + Σ_{j=2}^{k+1} (m_1/m_j) (1 − q^{m_{j-1}}).    (3)

In the case k = 1 this is just the formula computed by Dorfman [3]. The computation for all k seems to appear first in Kotz and Johnson [7]; see also the monograph by Johnson, Kotz and Wu [8]. More recently the formula appears in [11][12][13][27]. This last paper generalizes the formula to the case of individual-dependent prevalences and variable pool sizes. All these papers study the role of sensitivity and accuracy. Certain approximations hold when the number of samples that correspond to infected individuals is low enough that there is at most one infected sample in each pool. In such a case, at any given stage, only one pool "makes it" to the following stage. Thus, the total number of tests that have to be performed for each of these pools is:

T = 1 + Σ_{j=2}^{k+1} m_{j-1}/m_j,    (4)

and the number of tests per individual can be written as:

1/m_1 + p̃ Σ_{j=2}^{k+1} m_{j-1}/m_j,

with p̃ = 1/m_1. The latter is equal to the expected number of tests per individual for realizations with at most one infected sample per initial pool if we set p̃ equal to the infection prevalence, p. It also coincides with the linearization of Eq. (21) upon replacing p̃ by the infection probability, p.
We proved in [25] that the cost of the optimal nested strategy is smaller than the one given by Eq. (22). This is reflected in the example of Fig. 3 (a). Eqs. (4) and (22) with p̃ = 1/m_1 also hold when there is more than one infected sample in the initial pool, provided they all remain in the same pool up to the k-th stage. The variance of the number of tests can also be computed analytically [25]. We only quote here the result for the special case of the strategies in which the ratio of consecutive pool sizes is constant and equal to m_k. As described later, a strategy of this type (with m_k = 3) is the optimal one for most values of the infection probability, p. The variance for the strategies with m = (μ^k, μ^{k-1}, ..., μ), μ > 1, admits a closed-form expression in terms of q = 1 − p [25], from which the corresponding standard deviation of the number of tests per individual follows directly. Given a nested strategy, the probability of having to perform a certain number of tests can also be computed analytically as a function of the infection probability, p, using combinatorics. The problem becomes increasingly complicated as the size of the initial pool gets larger and the number of possible outcomes increases. We hereby describe how to proceed in the case of the strategy illustrated in Fig. 1, which has k = 3 and m = (27, 9, 3). Given an initial pool, the total number of tests, T, that will have to be performed under this strategy can take on ten possible values: {1, 7 + 3 × 1, 7 + 3 × 2, ..., 7 + 3 × 9}. The procedure consists in counting in how many ways one can end up performing a number of tests, T, equal to one of these values, for a number of infected samples that can range from 0 through 27, weighting the various cases by their probability of occurrence. We limit the description to computing the probabilities of occurrence of the first few possibilities, but the procedure can be extended to all.
Let us then call minimal subpools the sample pools associated to the nine sets, W_{i_1 i_2 i_3}, 0 ≤ i_2, i_3 ≤ 2 (third row in the scheme of Fig. 1), and maximal subpools those associated to the three sets, W_{i_1 i_2}, 0 ≤ i_2 ≤ 2 (second row in Fig. 1).
The first case, T = 1, occurs when there are no infected samples in the initial pool. This happens with probability q^27. The second case, T = 10, occurs when all the infected samples of the pool belong to the same minimal subpool. A minimal subpool can have 1, 2 or 3 infected samples. To compute the probability we first fix the number of infected samples and then count in how many ways we can distribute them so that they all belong to the same minimal subpool. Finally we sum over the situations with 1, 2 or 3 infected samples with their corresponding probability of occurrence. We obtain:

P(T = 10) = 9 Σ_{n=1}^{3} C(3, n) p^n q^{27−n},

where the factor 9 counts the minimal subpools and the binomial coefficient C(3, n) counts the ways of placing the n infected samples within the chosen one. We illustrate these computations in Figs. 3 and 4 (shown in green), where they are compared with the results of stochastic numerical simulations of the nested strategy performed over various initial pools, as explained later.
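For the strategy of Fig. 1 these probabilities can also be checked exhaustively: with independent infections, the nine minimal subpools are positive independently with probability r = 1 − q^3, so the exact distribution of T follows from the 2^9 positivity patterns. A sketch under these assumptions (the function name is ours):

```python
from itertools import product

def t_distribution(p):
    """Exact distribution of the number of tests, T, for one initial pool
    of the m = (27, 9, 3) strategy, obtained by enumerating the positivity
    pattern of the nine minimal subpools (i.i.d. with prob r = 1 - q^3)."""
    q = 1.0 - p
    r = 1.0 - q ** 3
    dist = {}
    for pattern in product([0, 1], repeat=9):
        b = sum(pattern)  # number of positive minimal subpools
        # a maximal subpool is positive iff any of its three minimal subpools is
        a = sum(1 for j in range(3) if any(pattern[3 * j:3 * j + 3]))
        # T = 1 (stage 1) + 3 (stage 2) + 3a (stage 3) + 3b (individual tests)
        t = 1 if b == 0 else 1 + 3 + 3 * a + 3 * b
        dist[t] = dist.get(t, 0.0) + r ** b * (1.0 - r) ** (9 - b)
    return dist
```

The enumeration reproduces P(T = 1) = q^27 and P(T = 10) = 9 (1 − q^3) q^24, and its mean agrees with the expected total number of tests per initial pool given by Eq. (3).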

Cost and optimization
The aim of pooling is to reduce the number of tests as much as possible. Let the cost, D_k(m, p), be the expected number of tests per individual [8], i.e., E[T_k(m, p)]/m_1, which, given Eq. (3), is equal to:

D_k(m, p) = 1/m_1 + Σ_{j=2}^{k+1} (1 − q^{m_{j-1}})/m_j.    (21)

Some properties are derived from the linearization in p of Eq. (21), which is given by:

D̃_k(m, p) = 1/m_1 + p Σ_{j=2}^{k+1} m_{j-1}/m_j.    (22)

This expression is the same as the expected number of tests per individual for realizations with at most one infected sample per initial pool, with the infection prevalence, p, in place of p̃ = 1/m_1. We proved [25] that:

D_k(m, p) < D̃_k(m, p).

This implies that we can always expect our proposed strategy to produce, on average, a larger reduction in the number of tests per individual than the one predicted by Eq. (22). The optimization problem consists in finding the value, k, and the sequence of pool sizes, m, that minimize (21) for a given p. We proved in [25] that for p ≥ 1 − 3^{−1/3} it is best not to pool and that, for p ∈ (2^{−51}, 1 − 3^{−1/3}), the optimal nested strategy is of the form (3^k, ..., 3) for most values of p, interspersed with small p-intervals over which it is (4·3^{k−1}, 3^{k−1}, ..., 3). In [25] we gave a precise description of the optimal values of k as functions of p and of the p-intervals over which one or the other scheme is best. We show the cost of the optimal strategy as a function of p in Fig. 2 (a); we conjecture that this description holds for all p ∈ (0, 1).
The costs of the strategies with m = (3^3, 3^2, 3) (black dashed curve), m = (4·3, 3) (red solid curve) and m = (3^2, 3) (black dash-dotted curve) are shown in Fig. 2 (b). These strategies are optimal for 0.0146 ≲ p ≲ 0.0380, 0.0380 ≲ p ≲ 0.0431 and 0.0431 ≲ p ≲ 0.1098, respectively [25]. For the smallest (largest) probabilities displayed in the figure the first (third) one is best; in between there is a small interval over which the best is the one with m_1 = 4·3. Not only is the interval of optimality for the strategy with m_1 = 4·3 very small compared to those of the other two (Fig. 2 (a)), but the difference between the three costs is also very small (< 0.4%) over this interval. A similar behavior is observed for all k.
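The comparison in Fig. 2 (b) can be reproduced numerically from the cost formula of Eq. (21). A minimal sketch, assuming error-free tests (the function name is ours):

```python
def cost(m, p):
    """Expected tests per individual, Eq. (21):
    D_k(m, p) = 1/m_1 + sum_{j=2}^{k+1} (1 - q^{m_{j-1}}) / m_j, m_{k+1} = 1."""
    q = 1.0 - p
    sizes = list(m) + [1]
    return 1.0 / sizes[0] + sum((1.0 - q ** sizes[j - 1]) / sizes[j]
                                for j in range(1, len(sizes)))

for p in (0.02, 0.09):
    print(p, {m: round(cost(m, p), 4) for m in [(27, 9, 3), (12, 3), (9, 3)]})
```

At p = 0.02 the strategy (27, 9, 3) has the lowest cost of the three, while at p = 0.09 the ordering is reversed, in line with the optimality intervals quoted above.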
The paper of Black et al. [27] also looked for the optimal strategy within the family of nested strategies considered here. To this end, the cost of each possible strategy was computed and then minimized by inspection. The example of a 5-stage strategy with initial pool size m_1 = 18 was worked out in detail. In [27] the optimal strategy is called ORC. Our optimal strategy corresponds to the ORC of [27] in the restricted case of no errors, equally sized groups and known homogeneous prevalence, p. Our theoretical contribution is an analytic expression for the optimal (unrestricted) strategy for any value of p. Since we have no formula for situations with further constraints, such as a maximal group size and/or a maximal number of stages, we have implemented a fast program (available at [26]) to find the 10 best strategies satisfying those restrictions. As mentioned elsewhere, two of the lemmas proved in [25] are key to accelerating the search for the optimal strategy under constraints. It is a matter of future research to understand whether our program can be used to accelerate the computation of the costs in the algorithm of [27].

Cost and variability among individual realizations of the strategy
The cost that the optimization seeks to minimize is an expected value. Any realization of the strategy, however, will result in a number of tests that will differ from the expected value. In order to study the variability among individual realizations of the nested strategy we performed stochastic numerical simulations as explained in the main paper. Some of those results are in the paper. In this note we include two additional figures that serve to illustrate two aspects of the nested strategy. Fig. 3 shows that, in the optimal case, the probability that the samples of two infected individuals that are initially in the same pool remain in the same pool throughout the application of the strategy is not negligible.
We show in Fig. 3 histograms of the number of infected samples per pool for a stochastic simulation in which the strategy with k = 3, m = (3^3, 3^2, 3), is applied to 1000 first-stage pools within which there are 544 infected samples (same simulation as in Fig. 3 (a) of the paper). We observe that ~6% of the 1545 pools that are "tested" at the third stage have two infected samples. The pools holding more than one infected sample at the k-th stage help reduce the average number of tests per individual needed to identify the infected ones. They therefore contribute to the inequality between the actual expected number of tests per individual and the one computed assuming that no more than one infected sample is present in each pool at all stages. We show in Fig. 4 (a) the results of stochastic numerical simulations in which we apply the strategy with m = (3^k, ..., 3) and k = 3, which is optimal for p ∈ (0.01347, 0.03987), approximately, to a situation with infection prevalence p = 0.00152 (41 infected samples over the 27,000 analyzed). As before, the figure shows the histogram (in log scale and as fraction of occurrences) of the number of tests performed per initial pool obtained with 1000 numerical realizations of the strategy (i.e., with 1000 initial pools of 27 samples each). We infer from the figure that, within these 1000 realizations, there are no instances in which the initial pool had more than one infected sample. The pools with exactly one infected sample are those requiring 10 tests to be resolved; the rest correspond to pools with no infected samples and only one test. In this case the resulting average number of tests per individual, 0.0507, is equal to the one given by Eq. (22) with p̃ = p = 0.00152, and the resulting standard deviation is 0.066. We contrast this distribution with the one obtained with the strategy, m = (3^k, ..., 3), that is optimal for p ≈ 0.0015. We show in Fig. 4 (b) the histogram obtained with 1000 realizations of this strategy, for which k = 6 (729,000 samples analyzed, 1133 infected). In this case, even if there are pools that need many more than 10 tests to be resolved, starting with a larger number of samples (m_1 = 729) results in a smaller average number of tests per individual, 0.0264, and standard deviation, 0.0212, than when using the suboptimal strategy with k = 3. If we restrict the number of realizations of the optimal strategy to 37, so that a similar number of individual samples (26,973) is analyzed as in the 1000 realizations of the strategy with k = 3 (27,000), a smaller number of tests per individual (average = 0.0256, standard deviation = 0.0218) is still obtained compared to that of the suboptimal case. However, in the analyzed example, the suboptimal case still produces a very noticeable reduction in the number of tests. Even though the resulting standard deviation, 0.066, is slightly larger than the expected value, 0.0507, it is clear that practically no pool will require more than 10 tests to be resolved (which is < 40% of the number of samples in the initial pool). In general, if we apply the m = (μ^k, ..., μ) strategy to a situation with p ≪ p_{μ,k+1}, not only will the expected number of tests per individual, (1 + kμ)/μ^k, be given by Eq. (22), but most likely the number of tests required per pool will never exceed 1 + kμ, which is smaller than the number of individuals tested per initial pool, m_1 = μ^k, for k ≥ 2 and μ ≥ 3.
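The simulated average of 0.0507 tests per individual quoted above can be recovered directly from the linearized cost. A minimal sketch, assuming error-free tests (the function name is ours):

```python
def linearized_cost(m, p):
    """Low-prevalence approximation of the cost (Eq. (22)):
    1/m_1 + p * sum_{j=2}^{k+1} m_{j-1}/m_j, with m_{k+1} = 1."""
    sizes = list(m) + [1]
    return 1.0 / sizes[0] + p * sum(sizes[j - 1] / sizes[j]
                                    for j in range(1, len(sizes)))

# Reproduces the simulated average quoted above for m = (27, 9, 3),
# p = 0.00152: about 0.0507 tests per individual.
print(round(linearized_cost((27, 9, 3), 0.00152), 4))
```

For m = (μ^k, ..., μ) every ratio m_{j-1}/m_j equals μ except the last, which equals μ itself as well, so the sum is kμ and the formula reduces to 1/μ^k + p·kμ.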

Algorithm to search for the optimal strategy when there are constraints
Although we have a prescription of the optimal nested strategy for every infection probability p > 2^{−51}, sometimes there are constraints that prevent the optimal from being used. In such a case, an algorithmic approach can be used to search for the best strategy under the given constraints. The idea is to build a "tree" of sequences, as described in what follows.
To simplify the description we here characterize the (candidate) nested strategies by the sequence of numbers m = (m_1, ..., m_k, m_{k+1}) with m_{k+1} = 1, i.e., with an additional 1 at the end that represents the stage of individual sample testing. The tree is then constructed as follows: 1. The root of the tree is the sequence with a single element, (m_1). 2. If the sequence (m_1, ..., m_ℓ) is a node of the tree and m_ℓ > 1, then for every proper divisor, m, of m_ℓ (including 1) the sequence (m_1, ..., m_ℓ, m) is a node of the tree, joined to (m_1, ..., m_ℓ) by a branch. 3. If the sequence (m_1, ..., m_ℓ) is a node of the tree and m_ℓ = 1, there are no branches emerging from (m_1, ..., m_ℓ). This node then corresponds to a feasible nested strategy with k = ℓ − 1.

As a consequence of this construction, the terminal nodes of the tree are all the feasible sequences that start with the number, m_1, of the root. The cost, D_k(m, p), of each of these sequences can be computed using Eq. (21). The sequence with the minimal value of D_k(m, p) is the optimal one for the given m_1 and p.
The number of nodes can be reduced by taking into consideration the results of two lemmas that were proved in [25]. Namely, at an optimal solution with m = (m_1, ..., m_k): i) the ratio m_j/m_{j+1} is not smaller than the ratio m_{j+1}/m_{j+2} for all 1 ≤ j ≤ k − 1; ii) m_j/m_{j+1} can only be equal to two for j = k. These two lemmas prune the tree and accelerate the search for the optimal strategy for a given m_1.
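A minimal sketch of the tree construction and lemma-based pruning follows (our own illustrative code, not the program of [26]; error-free tests assumed):

```python
def cost(seq, p):
    """Eq. (21) evaluated on a full sequence (m_1, ..., m_k, 1)."""
    q = 1.0 - p
    return 1.0 / seq[0] + sum((1.0 - q ** seq[j - 1]) / seq[j]
                              for j in range(1, len(seq)))

def feasible_sequences(m1):
    """Terminal nodes of the tree rooted at (m_1): every entry is a
    proper divisor of the previous one, down to the final 1."""
    stack = [(m1,)]
    while stack:
        seq = stack.pop()
        if seq[-1] == 1:
            yield seq
        else:
            stack.extend(seq + (d,) for d in range(1, seq[-1]) if seq[-1] % d == 0)

def keep(seq):
    """Pruning lemmas from [25]: consecutive ratios non-increasing,
    and a ratio equal to 2 allowed only at the last pooled stage."""
    ratios = [seq[j] // seq[j + 1] for j in range(len(seq) - 1)]
    return all(r1 >= r2 for r1, r2 in zip(ratios, ratios[1:])) and 2 not in ratios[:-1]

def best_strategy(m1, p):
    """Minimal-cost surviving terminal node for the given root m_1."""
    return min((s for s in feasible_sequences(m1) if keep(s)),
               key=lambda s: cost(s, p))

print(best_strategy(27, 0.02))  # -> (27, 9, 3, 1)
```

For m_1 = 27 the full tree has four terminal nodes; pruning discards (27, 9, 1), whose ratios increase, and the cost comparison selects (27, 9, 3, 1), in agreement with the optimality interval quoted earlier.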
The algorithm based on this description is available at [26].

Practical implementations to reduce misclassification errors
The formulas derived so far hold under the assumption that there are no testing errors. Facilities that identify virally infected individuals using RT-qPCR or related methods perform, simultaneously, several additional tests to reduce misclassifications. This is done even if samples are tested individually (i.e., without pooling), for which the use of multi-well plates, as described elsewhere in the main paper and in this Supplementary file, is very convenient. We describe in what follows the errors that can be encountered, regardless of whether the samples are tested in pools or not, and the controls that are run to reduce these errors. First of all, it is worth noting that the probability of having false positives is very low and that, if they occur, it is very unlikely that they go undetected by the control tests. Namely, false positives may only occur if the sample is contaminated with nucleic acids of the virus or with material capable of giving a signal equal to that of the virus. If this contamination is introduced between the moment when the sample is extracted from the patient and the first opening of the tube in the lab (prior to the preparation of any pool), there is no way of knowing that the sample has been contaminated, and it will be identified as positive (unless, by chance, it is misclassified as negative). This, however, is not related to a solvable misclassification and would occur regardless of whether the test is run on pools or on individual samples. Laboratories take every precaution to avoid this contamination, so these false positives are not observed in practice. A contaminant entering at any of the steps involved in the preparation and testing of the pools (or individual samples) would immediately be detected in today's automated systems, in which the PCR test is run simultaneously on several tubes that have been prepared using the same master mix of reagents.
To detect contaminations of this sort, it is mandatory that the testing facility, in all runs (regardless of whether the test is performed on pools or on individual samples), prepare and test one or more negative controls simultaneously with the tubes containing the patients' samples. The negative controls are tubes to which only sample material from uninfected people is added. Thus, if a negative control tube gives a positive result, the reagents are discarded and the entire run is processed again. It is highly unlikely that this type of contamination will occur at all in today's automated systems. The probability that it occurs, on the other hand, is independent of the infection probability and of whether the samples are tested in pools or individually. The number of tests that, on average, would have to be performed if some of them had to be repeated due to this contamination would exceed the expected number by a similar factor whether the pooling strategy were applied or the samples were tested individually. Therefore, the optimal pooling strategy derived under the assumption that the tests carry no errors would give similar savings with respect to testing individual samples in the presence or absence of this contamination.
The other aspect to be considered is the set of controls that are run to reduce the occurrence of false negatives (both for pool and for individual testing). As described in what follows, there are many possible causes of false negatives. One of them is the presence of an inhibitor of the PCR reaction in the testing tube. In order to detect the occurrence of this inhibition, the internal control that is added consists of amplifying not only the RNA of the virus but also a human RNA that comes from the cells of the individual collected when the sample is taken. This is done simultaneously, in the same tube, using a master mix with the necessary components to identify both types of RNA. The result associated with the presence of the virus is considered negative only if there is no signal for viral RNA but there is a signal for human RNA in the same tube. As mentioned in the main body of the paper, when testing individual samples, if both signals are negative the result is not reported as negative or positive, as the failure may be due to the presence of inhibitors, to the sample not being well preserved, or to the RNA extraction having failed. It is then recommended that a new sample be drawn from the patient. A similar approach should be followed if this problem is encountered when testing pools. If all the tests performed (including the positive control described in the following paragraph) happen to be negative for human RNA, it must be concluded that the reagents have deteriorated or the PCR machine is not working correctly.
Another possibility for a false negative is that the faulty component of the master mix is one of the virus-specific primers or the dye that is used to detect the products of the amplification that come from the viral RNA. In this case, the test would give a human RNA signal, but would fail to detect the viral RNA. In order to prevent this from happening, a "positive control" containing a specimen with viral and human RNA is tested simultaneously in the same run. A problem with the primers and/or the dye that gives the viral RNA signal will be noticed if this "positive control" tests negative for the viral RNA and positive for the human one.
The occurrence of false negatives due to the causes described so far can, in principle, be detected through the various controls that are mandatory in today's testing facilities. The probability of having faulty reagents or dyes is independent of the infection probability and of whether the tests are run on individual samples or on pools. The probability of having an inhibitor of the PCR in the testing tube, on the other hand, scales with the final volume that goes into the testing tube and is also independent of the number of samples or the infection probability. Thus, the need to run some tests more than once due to the occurrence of these detectable errors would arise at the same rate for individual or pooled testing. Our optimal strategy, obtained under the assumption that there were no detection errors, would then give similar savings with respect to testing individual samples in this case as in the error-free case. A similar situation holds for the problems that may arise during the RNA extraction procedure given that, for pool testing, this procedure is performed on the pools.
In the case of pool testing there is an additional possibility for false negatives which we discuss in the manuscript: the internal (human RNA) control per se cannot detect the presence of a single or a few mishandled samples in a pool given that the human RNA coming from the "good" samples would give a detectable signal. The RNA quantification that ddPCR naturally provides at the end of the test can be used to solve this problem, as we describe in the main body of the paper. Thus, we have a detection method that would allow us to correct for this error.
Finally, another possible cause of false negatives is having samples with very low viral load. In that regard, performing tests on pools of samples exacerbates this problem, as the viral RNA content of each infected individual gets diluted when the sample is combined in a pool with the samples of others, particularly of people who are not infected. As shown in the main body of the paper, in 3-sample pools the presence of a single sample coming from an infected individual with the lowest viral load reported in [28] may be detected, according to our calculations (see Eq. (6) in the paper and Sec. 5 below), in more than 97% of the cases and, in more than 70% of them, if 9-sample pools are used instead. The calculation shows that average viral loads can be detected even with 27-sample pools in over 99% of the cases. The studies of [29] show that the distribution of virion content among symptomatic or asymptomatic infected individuals is log-normal. Assuming that their estimate of the virion concentration is equal to that of RNA copies and using the mean (5.9 × 10^6 virions/ml) and median (2.5 × 10^5 virions/ml) they obtained for one of their primers, we calculate that ∼ 0.5% of the population has viral loads that, upon the D = 8 dilution of their experiments, would result in having, on average, one RNA copy or less in a 20µl testing volume (c_{R;i}/D = 50/ml). On the other hand, ∼ 93.5% and ∼ 84% of the infected population would have, respectively, c_{R;i}/D > 700/ml and c_{R;i}/D > 2500/ml, allowing the detection of a single infected sample in pools of 3 or 9 samples with probability higher than 99%. This does not imply that only ∼ 93.5% and ∼ 84% of the infected population would be identified as infected when using 3 or 9-sample pools, respectively. Namely, when using the unrestricted optimal strategy for a given p, there is a non-negligible probability that two or more infected samples be contained in the same pool, especially at the first stage.
This is illustrated by the stochastic simulations of Fig. 3. Having more than one sample corresponding to an infected individual in a pool would certainly facilitate the detection of the viral RNA in it. The numbers derived, on the other hand, depend on the assumption that the virion content estimated in [29] corresponds to RNA copies as measured by ddPCR. Comparing the calibration between Ct values of the RT-PCR test and virions/ml reported in [29] and the relationship between Ct values and RNA copies/test obtained with ddPCR in [28], we estimate that the number of RNA copies/test of [28] is between ∼ 2.5 and 5 times the number of virions derived in [29]. Using the 2.5 rescaling factor to go from virion [29] to RNA [28] content, we then obtain that 0.17% of the infected population described in [29] would have, on average, one RNA copy or less in a 20µl test, and that 97%, 91.5% and 81.8% of the same population would have numbers of RNA copies that would allow the detection of a single infected sample in pools of 3, 9 and 27 samples, respectively, with probability higher than 99%. These numbers, however, need a direct validation using ddPCR. It is worth noticing that the FDA in the US recommends that, after pooling, test performance should include ≥ 85% positive agreement when compared with the same test performed on individual samples (https://www.fda.gov/medical-devices/coronavirus-covid-19-and-medical-devices/pooled-sample-testing-and-screening-testing-covid-19#pooled). Considering that the smallest viral load that is detected with ddPCR can be up to two orders of magnitude smaller than the lowest end detected with the current gold standard, RT-qPCR, it is likely that this 85% positive agreement between pool testing strategies using ddPCR and single sample testing will occur, especially if we restrict the maximum pool size to ∼ 27 samples.
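The detection probabilities quoted above can be reproduced with the Poisson model used in Sec. 5 below: the number of viral RNA copies that the single infected sample contributes to the 20µl testing volume is Poisson with mean (c_{R;i}/D) V_t/m. A minimal sketch in Python; the two example concentrations (700/ml and 2500/ml) are the thresholds quoted in the text, while the function itself is our own illustration:

```python
import math

V_T_ML = 0.02  # testing volume V_t = 20 µl, expressed in ml

def detection_probability(c_over_d_per_ml: float, m: int) -> float:
    """P(at least one viral RNA copy in the test) when a single infected
    sample of diluted concentration c_R/D (copies/ml) contributes a
    fraction 1/m of the material in an m-sample pool; the copy number is
    modeled as Poisson with mean (c_R/D) * V_t / m."""
    mean_copies = c_over_d_per_ml * V_T_ML / m
    return 1.0 - math.exp(-mean_copies)

# Thresholds quoted in the text: c_R/D > 700/ml detectable in 3-sample
# pools, and c_R/D > 2500/ml in 9-sample pools, with probability > 99%.
print(detection_probability(700, 3))   # > 0.99
print(detection_probability(2500, 9))  # > 0.99
```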
Analyzing in detail how all the aspects we have just described interact to determine the probability of detection in pools and in individual samples will be the subject of a future study.

Cost and test errors
Based on the previous descriptions, we analyze in this subsection the strategy under the assumption that there are no false positives, but that the presence of false negatives cannot be ruled out.
Given ℓ ≥ 1, let us denote by S_e(ℓ) the probability that a pool of ℓ samples is declared positive given that at least one sample in the pool is truly positive. In the absence of errors, S_e(ℓ) = 1 for all ℓ, and formula (3) computes the mean number of tests per individual associated with a nested strategy with k + 1 stages and sequence of pool sizes m = (m_1, . . . , m_k). When S_e(ℓ) < 1 for some ℓ ≥ 1, the mean number of tests per individual associated with the same strategy becomes as computed in [8], Section 6.3. Notice that the presence of this type of error typically decreases the number of tests; this is due to the fact that a larger number of pools test negative, thereby stopping the search for infected samples in those pools. In terms of the mean number of tests per individual, the expression in (24) is less than or equal to the associated expression in (3), for the same value of q = 1 − p and the same strategy with parameters k, m.
The probability that an infected sample is identified as positive by the algorithm, on the other hand, can be computed as:

∏_{j=1}^{k+1} S_e(m_j), with m_{k+1} = 1, (25)

that is, the probability that at each stage j, with 1 ≤ j ≤ k + 1, the pool the sample belongs to is correctly identified as positive (see [8]). Instead, Kim et al. [12] consider the mean number of false negative classifications per individual. For a nested strategy with parameters (m, k), this becomes:

p (1 − ∏_{j=1}^{k+1} S_e(m_j)). (26)

Knowing the optimal strategy for a given p, with and without constraints, in the errorless case is also relevant when test errors are included. To illustrate this point, let us consider one of the examples discussed in [12]. In that paper the authors follow the recommendations of [30], which are based on results derived under the assumption that there is at most one infected sample per pool at any given stage, an assumption that we do not make. In particular, for the 3-stage algorithm (k = 2 in our notation) without error that is considered in [12], this implies taking m_2 = √m_1. The subsequent analysis is then based on this choice. Let us then consider the case of identification of individuals with acute HIV infection discussed in Section 7 of [12]. In this case the authors consider the prevalence p = 0.0002, a 99% test specificity and a 90% test sensitivity, and work under the assumption that the sensitivity and the specificity are independent of the total number of samples and of infected samples in the pool. According to our calculations with no restrictions on the number of stages, the optimal strategy for p = 0.0002 has 8 stages, the sequence of pool sizes is m = (3^7, 3^6, ..., 3, 1), and the mean number of tests per sample is ≈ 0.0045, ignoring test errors. If, on the other hand, we limit the exploration to strategies having at most 3 stages, the optimal sequence of pool sizes given by our optimization program for this prevalence is (307, 17, 1).
In this case, if we also include the given specificity and sensitivity in the computation (see formula (3) in [12]), the mean number of tests per sample is approximately 0.01, which is better than 0.016, the mean number of tests per specimen associated with the strategy with pool sizes (90, 10, 1) used in [12]. The number of tests to be expected with the strategy of [12] would thus be approximately 60% larger than with our constrained optimal strategy.
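The errorless figure quoted above (≈ 0.0045 tests per sample for the 8-stage strategy) can be checked with a short script. This is a sketch that assumes the standard nested accounting, in which a stage-(j+1) sub-pool of size m_{j+1} is tested only if its parent pool of size m_j tested positive; the exact form of formula (3) of the main text is not reproduced here:

```python
def mean_tests_per_individual(pool_sizes, p):
    """Errorless mean number of tests per individual for a nested strategy
    with pool sizes (m_1, ..., m_k), followed by a final individual stage.
    A sub-pool at stage j+1 is tested iff its parent pool of size m_j
    tested positive, which happens with probability 1 - q**m_j."""
    q = 1.0 - p
    sizes = list(pool_sizes) + [1]   # append the individual (last) stage
    expected = 1.0 / sizes[0]        # every sample enters a first-stage pool
    for parent, child in zip(sizes, sizes[1:]):
        expected += (1.0 - q ** parent) / child
    return expected

p = 0.0002
unrestricted = [3 ** j for j in range(7, 0, -1)]   # (3^7, 3^6, ..., 3)
print(round(mean_tests_per_individual(unrestricted, p), 4))  # ≈ 0.0045
print(round(mean_tests_per_individual([307, 17], p), 4))     # errorless 3-stage value
```

The 3-stage value printed here ignores test errors; including the 90% sensitivity, as in formula (3) of [12], lowers it to the ≈ 0.01 quoted in the text.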
The optimal strategy just described depends on the infection probability, p, a parameter that might be unknown a priori. As shown in the paper, applying a strategy that is optimal for a given p to a situation with a smaller infection probability still produces a noticeable reduction in the number of tests per individual that have to be performed to identify the infected samples. The fact that PCR tests can be run in parallel on several (individual or pooled) samples at the same time using multi-well plates opens the possibility of applying a sub-optimal strategy and using the first round of tests to obtain an estimate of p. One option is thus to start with the strategy that is optimal for the largest p for which the pooling of samples is advisable (p ∼ 0.3). The infected samples identified in this way could give an estimate of p with which to choose the strategy to be applied to the following batch of pools. Now, the strategy that is optimal for this largest p starts with pools of m_1 = 3 samples. Thus, if p is moderately small it is likely that, when running the strategy in parallel on a 96-well plate, no pool will test positive (∼ 1.4 pools would test positive on average for p = 0.005). In such a case the first (parallel) run would not provide an estimate of p. Another possibility is to perform a first (parallel) round solely to estimate p. In either case, these runs would allow the identification of some of the non-infected samples (those that happen to be in pools with no infected samples). We describe in what follows how we can proceed if we choose one or the other option. In this work we focus on the use of the pool testing strategy to identify infected individuals. In the case of a pandemic, like the SARS-CoV-2 one, estimating the infection prevalence within the population of a certain region is also relevant to make decisions to prevent the spread of the epidemic. The procedures described in what follows can be used for this purpose as well.
Since, for this application, it is a matter of estimating the probability and not of producing correct identifications for all individuals, the possibility of false negatives or positives has no significant effect. The probability estimate, p̂, would be obtained at much lower cost and with essentially the same effectiveness as if samples were tested individually.
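As a quick check of the figure quoted above (∼ 1.4 positive pools on average for p = 0.005 when 96 pools of 3 samples each are run in parallel), a one-line computation suffices; the function below is our own illustration:

```python
def expected_positive_pools(n_pools, pool_size, p):
    """Mean number of pools testing positive when each of n_pools pools
    holds pool_size independent samples, each infected with probability p:
    a pool is positive unless all its samples are negative."""
    return n_pools * (1 - (1 - p) ** pool_size)

print(expected_positive_pools(96, 3, 0.005))  # ≈ 1.43
```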

Running a set of parallel tests to infer p.
Let us assume that we are working with a 96-well plate. Situations with plates of other sizes can be handled analogously.

1. We form three sets of N_w = 32 pools each, one set with pools of m_1 = 3 samples, one with m_1 = 9 and one with m_1 = 27 (96 pools in total).
2.
We run the test on the 96 pools simultaneously and determine the number, W_h, of pools with m_1 = h (h = 3, 9, 27) that test positive in this first run.
3. We compute the estimator, P̂_h, of the probability that a pool with m_1 = h samples within a set of N_w = 32 pools turns out to be positive as:

P̂_h = W_h / N_w. (27)

4. We derive an estimator, p̂, of the (individual) sample infection probability, p, as:

p̂ = (1/3) Σ_{h=3,9,27} p̂_h, (28)

where

p̂_h = 1 − (1 − W_h / (N_w + (h − 1)/(2h)))^{1/h}. (29)

The term (h − 1)/(2h) is added in order to reduce the bias [31]. If any of the values, p̂_h, is too small, it is best to follow the procedure described in the next section.
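The estimator of steps 3 and 4 can be sketched in a few lines of Python. Note that the placement of the bias-correction term (h − 1)/(2h), added to the pool count in the denominator, is our assumption, following Burrows-type corrections for pooled prevalence estimation, and should be checked against [31]:

```python
def p_hat_pooled(w_positive, n_pools, h, bias_correction=True):
    """Estimate the individual infection probability p from the number of
    positive pools among n_pools pools of h samples each. Without the
    correction this simply inverts P_pos = 1 - (1 - p)**h; with it, the
    (h-1)/(2h) term enlarges the pool count in the denominator (our
    assumed placement of the bias-reduction term)."""
    denom = n_pools + (h - 1) / (2 * h) if bias_correction else n_pools
    p_pos_hat = w_positive / denom
    return 1 - (1 - p_pos_hat) ** (1.0 / h)

# Example: 32 pools of each size h = 3, 9, 27, as in the parallel design.
for h, w in [(3, 2), (9, 5), (27, 12)]:
    print(h, round(p_hat_pooled(w, 32, h), 4))
```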

Dynamic estimation of p as the strategy is applied
In this case we apply the strategy as stated in the manuscript, with some modifications depending on the outcome of the first set of parallel tests. If p is completely unknown but one wants to apply the strategy anyway, it is best to start with m_1 = 3. The test is then run simultaneously on 96 pools with 3 samples each. There are four possible types of outcomes and, thus, of subsequent steps, as described in what follows.

I) If the number of 3-sample pools that test positive in the first run, N_p^+, is such that 0 < 3N_p^+ < 96:

1. We obtain an estimate, p̂, of the individual sample infection probability as p̂ = p̂_h using Eqs. (27) and (29) with N_w = 96, h = 3 and W_h = N_p^+. We derive the optimal pool size for the estimated probability, m_1(p̂), as given by Eq. (30).

2. We accommodate, in the 96-well plate, the 3N_p^+ samples of the N_p^+ pools that tested positive in the first stage and 96 − 3N_p^+ pools with m_1 ≤ m_1(p̂) samples each. If m_1(p̂) is smaller than the maximum number allowed for a reliable detection, we set m_1 = m_1(p̂); if not, we choose the largest number of the form m_1 = 3^k that allows the reliable detection of one infected sample in an m_1-pool.
3. After the second run of 96 parallel tests is done, we know the number, N_i, of infected samples that were contained in the first set of 96 3-sample pools. We can then obtain a better estimate of the infection probability as:

p̂ = N_i / (96 h), (31)

with h = 3. Using this estimate we recompute the best pool size, m_1, using Eq. (30).
4. For the third run of tests, we use as many wells as necessary to accommodate the (m_1/3)-sample pools that come from the m_1-sample pools that tested positive in the second run and, if there is room, pools with the newly determined value of m_1. This procedure is then continued for as long as there are samples.
II) If the number of 3-sample pools that test positive in the first run, N_p^+, is such that 96 ≤ 3N_p^+ < 3 · 63 = 189:

1. We obtain an estimate, p̂, of the individual sample infection probability as in the previous case, namely using Eqs. (27) and (29) to compute p̂_h with h = 3 and then setting p̂ = p̂_h. We then use Eq. (30) to estimate the best pool size, m_1(p̂).
2. We test individually 96 of the 3N_p^+ samples contained in the N_p^+ pools that tested positive in the first run. We keep on doing this with the other individual samples of the initial N_p^+ pools as long as there are no unused wells in the plate. Once there is room, we pool the new samples in groups of m_1 samples each, with m_1 given by the estimate, m_1(p̂), derived in the previous step, if this value does not exceed the maximum allowed for a reliable detection. Otherwise, we choose, as before, the largest m_1 = 3^k that satisfies the restriction for a reliable detection.
3. Once all the samples contained in the initial 3-sample pools are identified as infected or not, we know the number, N_i, of infected samples in the initial batch, with which we can obtain a better estimate of the infection probability using Eq. (31) with h = 3 and then compute the best pool size, m_1, with Eq. (30).
III) If the number of 3-sample pools that test positive in the first run, N_p^+, is such that 3N_p^+ ≥ 3 · 63 = 189:

1. We continue working with individual samples (no pooling), both with those contained in the N_p^+ pools that tested positive in the first run and with the samples untested so far. For the latter, each run will give the number of infected individuals, N_i, from which the infection probability, p̂, can be estimated using Eq. (31) with h = 1. As long as the obtained value satisfies p̂ ≥ 1 − 1/3^{1/3} ≈ 0.307, we continue without pooling. Otherwise, the optimal m_1 can be determined with Eq. (30), and a strategy that starts with this value of m_1 (or a smaller one, as discussed before) is applied.
IV) If none of the 3-sample pools tests positive:

1. We choose m_1 as the maximum number of the form m_1 = 3^k that allows the reliable detection of one infected sample within an m_1-pool.
If an estimate, p̂, of the infection probability is available a priori, we then start with pools of size m_1 as defined in Eq. (30). If only the order of magnitude of p̂ is known, we then start with m_1 = 3^{k̂_3 − 1}, with k̂_3 as in Eq. (30). After the first run of tests is performed on the 96 pools containing m_1 samples each, we compute the estimator, P̂_h, of the probability that a pool tests positive using Eq. (27) with N_w = 96 and h = m_1. The individual sample probability is then estimated using Eq. (29) with h = m_1. This newly determined estimate, p̂, is then used to decide the size of the new pools if there is room to place them in the plate together with those that come from the 96 pools tested in the first run, as explained in II) above. Once the number, N_i, of infected samples among the 96 m_1 samples tested in the first run is known, a new estimate of the infection probability can be obtained using Eq. (31) with h = m_1. The infection probability can be re-estimated as the strategy is applied, either using Eqs. (27) and (29) or Eq. (31), depending on whether we use the information drawn from the first or the last stage of the strategy, respectively. Averaging the estimates obtained in these various ways could also be a good option.

4.1 The general case
In ddPCR the volume of each test, i.e., the testing volume, V_t, that goes into one well is divided into M sub-volumes, V_gt, of ∼ 1nl each; M varies depending on the equipment. We will use M = 20,000, V_gt = 1nl and V_t = 20µl. The testing volume, V_t, in turn, comes from a dilution of a volume, V_s, that is taken from the sample. Namely, V_t contains the nucleic acid molecules contained in a volume, V_s, taken from the original sample, plus other solutions with the material that is needed for the test to proceed. Given that this procedure does not add any new molecules of the nucleic acids that the test identifies, for the purpose of their quantification we merely think of it as introducing a dilution of factor, D, so that:

V_t = D V_s. (33)

V_s, on the other hand, is part of the volume of biological material that is taken from the individual to detect the presence of the nucleic acids of interest. We call V_o the volume of this biological material, of which V_s is a sub-volume with no dilutions involved; we clarify what we mean by this in what follows. The aim of the quantification is to determine the concentration, c_R, of the nucleic acids in V_o. To this end, we think of V_s as the sum of M = 20,000 sub-volumes, V_gs:

V_s = M V_gs, (34)

each of which, when diluted by the factor D, corresponds to one of the sub-volumes of the ddPCR test:

V_gt = D V_gs. (35)

We assume that each of the sub-volumes, V_gs, is a sufficiently small fraction of V_o so that the numbers of nucleic acid molecules contained in each of them, N_Rg, are independent identically distributed (i.i.d.) random variables. The mean of these variables is such that

⟨N_Rg⟩ = c_R V_gs = c_R V_gt / D. (36)

The first of the two equalities in Eq. (36) reflects what we meant before when we said that there was no dilution involved in the extraction of V_s from V_o. Given that the dilution that transforms V_s into V_t does not change the nucleic acid content, these i.i.d. random variables also correspond to the numbers of nucleic acid molecules in each of the tested sub-volumes, V_gt.
The variables, N_Rg, are described by a Binomial distribution:

P(N_Rg = n) = (N_R choose n) π^n (1 − π)^{N_R − n}, (37)

of parameters

N_R = c_R V_o, π = V_gs / V_o. (38)

As we show later, the Poisson approximation with mean given by Eq. (36) serves very well for our purpose too. ddPCR is an end-point method which determines (ideally) how many of the M V_gt ∼ 1nl sub-volumes contained at least one nucleic acid molecule. Under the assumption that the M numbers, N_Rg, are i.i.d. according to Eqs. (37)-(38), the fraction of sub-volumes with at least one nucleic acid molecule is an estimator of the probability, P_+g, that one such volume contains at least one of these molecules:

P_+g = 1 − (1 − π)^{N_R}. (39)

Using the second equation in (38) we can rewrite Eq. (39) as:

P_+g = 1 − (1 − V_gs/V_o)^{N_R}, (40)

or, equivalently, as:

ln(1 − V_gs/V_o) = ln(1 − P_+g) / N_R. (41)

Given Eq. (34) and the fact that V_s < V_o, it is V_gs/V_o < 1/M = 5 × 10^−5, so the l.h.s. of Eq. (41) can be approximated by −V_gs/V_o. With this approximation and using the two equations in (38), Eq. (41) can be rewritten as:

1 − P_+g = exp(−c_R V_gt / D). (42)

This equation is also obtained if the variables, N_Rg, are assumed to be Poisson distributed with mean given by Eq. (36), in which case 1 − P_+g = exp(−c_R V_gt/D).
Let us assume that the test determined that M_+ of the M sub-volumes contained molecules of the nucleic acid of interest. As explained in Sec. 4.3, if M P_+g > 5 and M (1 − P_+g) > 5, the fraction:

P̂_+g = M_+ / M, (43)

is an estimator of P_+g such that:

P̂_+g − 1.96 √(P̂_+g (1 − P̂_+g)/M) ≤ P_+g ≤ P̂_+g + 1.96 √(P̂_+g (1 − P̂_+g)/M), (44)

with 95% confidence. If not, other intervals for P_+g can be estimated as explained in Sec. 4.3. The "large sample" limit (M P_+g > 5 and M (1 − P_+g) > 5) guarantees that the borders of the interval are bounded away from 0 and 1. Another important property of this limit is that P̂_+g is normally distributed around P_+g. We can then replace P_+g in Eq. (42) by its estimator, P̂_+g, [32] and obtain an estimate, ĉ_R, of the nucleic acid concentration in the original sample, c_R, as:

ĉ_R = −(D/V_gt) ln(1 − P̂_+g). (45)

This is the usual formula with which the content of nucleic acids is quantified in ddPCR experiments (see e.g., [32]). In view of Eq. (44), the range of possible values, c_R, for each value of P̂_+g is then given by:

−(D/V_gt) ln(1 − P̂_+g + δ) ≤ c_R ≤ −(D/V_gt) ln(1 − P̂_+g − δ), δ = 1.96 √(P̂_+g (1 − P̂_+g)/M). (46)

This implies that the large sample limit can be used if the number of sub-volumes that test positive satisfies:

11 < M_+ < 19989. (47)

In terms of the estimated concentration, Eq. (47) can be rewritten as:

5.5 × 10^−4 ≲ ĉ_R V_gt / D ≲ 7.5, (48)

or, equivalently, as:

5.5 × 10^2/ml ≲ ĉ_R/D ≲ 7.5 × 10^6/ml, (49)

where we have used that the testing volume, V_t = 20µl, is such that V_t = 20,000 V_gt. As discussed in the paper, these ranges allow the quantification of viral RNA in most of the samples taken from people infected with SARS-CoV-2, with the exception of those with the very highest viral loads [23, 33-35]. As illustrated in Fig. 5, the uncertainty in the determination of the concentration, on the other hand, remains at manageable levels throughout the range for which the large sample limit holds. Even though at the lower end of the range (ĉ_R/D ∼ 5 × 10^2/ml) the uncertainty is ∼ 60%, which implies that only the order of magnitude can be determined there, at M_+/M ∼ 0.005 the uncertainty is already of the order of 20%. For larger M_+/M the uncertainty goes down and never exceeds 10%.
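The quantification formula and the propagation of the interval in Eq. (44) can be sketched as follows; the function is our own illustration (concentrations in copies/ml of the original sample), and the example with M_+ = 100 reproduces the ∼ 20% uncertainty quoted for M_+/M ∼ 0.005:

```python
import math

M = 20_000        # number of ~1 nl sub-volumes (droplets) per test
V_GT_ML = 1e-6    # V_gt = 1 nl, expressed in ml

def ddpcr_concentration(m_positive, dilution=1.0, z=1.96):
    """Point estimate and ~95% interval for c_R (copies/ml of the original
    sample) from the number of positive droplets, via
    c_R = -(D/V_gt) * ln(1 - P), propagating the normal-approximation
    interval P_hat +/- z*sqrt(P_hat(1-P_hat)/M). Intended for the
    large-sample regime 11 < m_positive < 19989."""
    p_hat = m_positive / M
    half = z * math.sqrt(p_hat * (1 - p_hat) / M)
    def to_conc(prob):
        return -dilution * math.log(1 - prob) / V_GT_ML
    return to_conc(p_hat - half), to_conc(p_hat), to_conc(p_hat + half)

# M_+/M = 0.005 (100 positive droplets) with a dilution D = 8:
lo, c, hi = ddpcr_concentration(100, dilution=8.0)
print(f"{lo:.3g} < c_R < {hi:.3g} copies/ml")  # ~20% relative uncertainty
```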

4.2 The case of pooled samples
Let us apply now the analysis just described to the case of pools of samples. To this end, let us denote by original sample the sample that is taken from an individual before it is diluted and/or mixed with any other sample to proceed with the test. Let us identify each of the original samples that go into the pool of interest with the subscript i, 1 ≤ i ≤ m_j. Each of these original samples has a volume, V_{o;i}, and a certain concentration, c_{R;i}, of the RNA that is to be detected by the test (c_{R;i} = 0 is an option). Let us assume that, for the test, a volume, V_{s;i}^{(j)}, is drawn from V_{o;i} for each 1 ≤ i ≤ m_j, and that the volume extraction is done in such a way that it guarantees a small variability among the V_{s;i}^{(j)}. The volumes, V_{s;i}^{(j)}, 1 ≤ i ≤ m_j, are at some point combined into one volume, V_s, and diluted with other volumes that contain the materials for the test to proceed. We call this final volume the testing volume, V_t ∼ 20µl. These various volumes are then related by:

V_t = D V_s = D Σ_{i=1}^{m_j} V_{s;i}^{(j)}, (51)

where D is the dilution factor. Given that V_{s;i}^{(j)} is drawn from V_{o;i} with no dilution involved, the number of RNA molecules that it contains, N_{R;i}^{(j)}, is Poisson distributed of mean:

⟨N_{R;i}^{(j)}⟩ = c_{R;i} V_{s;i}^{(j)}. (52)

The total number of RNA molecules in the m_j-sample pool that is tested is then:

N_R^{(j)} = Σ_{i=1}^{m_j} N_{R;i}^{(j)}, (53)

which is Poisson distributed with mean:

⟨N_R^{(j)}⟩ = Σ_{i=1}^{m_j} c_{R;i} V_{s;i}^{(j)} = (V_t/D) (1/m_j) Σ_{i=1}^{m_j} c_{R;i}, (54)

where we have used that all the extracted volumes are (approximately) equal, V_{s;i}^{(j)} = V_s/m_j. Thus, the concentration that can be estimated in the case of an m_j-sample pool is:

c_R^{(j)} = (1/m_j) Σ_{i=1}^{m_j} c_{R;i}. (55)

Let us now consider an m_j-sample pool with m_j > 1 that tests positive at the j-th stage so that, at stage j + 1, it is sub-divided into m_j/m_{j+1} sub-pools with m_{j+1} samples each. Given Eq. (51), the sizes of the volumes that are extracted from the original samples at this stage, V_{s;i}^{(j+1)}, are related to the previously defined volumes by:

V_{s;i}^{(j+1)} = (m_j/m_{j+1}) V_{s;i}^{(j)}. (56)

The number of RNA molecules contained in the volume, V_{s;i}^{(j+1)}, is Poisson distributed of mean:

⟨N_{R;i}^{(j+1)}⟩ = c_{R;i} V_{s;i}^{(j+1)}. (57)

The number of RNA molecules in each of the m_{j+1}-sample sub-pools, 1 ≤ ℓ ≤ m_j/m_{j+1}, is then:

N_{R;ℓ}^{(j+1)} = Σ_{i∈ℓ} N_{R;i}^{(j+1)}, (58)

and, as in Eqs. (54)-(55), their means satisfy:

⟨N_{R;ℓ}^{(j+1)}⟩ = (V_t/D) c_{R;ℓ}^{(j+1)}, with c_{R;ℓ}^{(j+1)} = (1/m_{j+1}) Σ_{i∈ℓ} c_{R;i}, (59)

where c_{R;ℓ}^{(j+1)} is the RNA concentration that is estimated for the ℓ-th sub-pool at the end of the (j+1)-stage ddPCR test. The sum of the RNA molecules over all the sub-pools that come from the same m_j-sample pool is:

N_R^{(j+1)} = Σ_ℓ N_{R;ℓ}^{(j+1)}, (60)

and, given Eqs. (56), (57), (59) and (60), its mean satisfies:

⟨N_R^{(j+1)}⟩ = (V_t/D) Σ_ℓ c_{R;ℓ}^{(j+1)} = (V_t/D) (m_j/m_{j+1}) c_R^{(j)}. (61)

Eqs. (55) and (61) imply that:

m_j c_R^{(j)} = m_{j+1} Σ_ℓ c_{R;ℓ}^{(j+1)}, (62)

which, as explained in the paper, can be used to check the self-consistency of the results.
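The self-consistency check just described amounts to comparing the RNA content of a pool, m_j times its estimated concentration, with the summed content of the sub-pools it is split into, m_{j+1} times the sum of their estimated concentrations. A minimal sketch; the concentration values in the example are illustrative, not data from the paper:

```python
def consistency_residual(c_parent, m_parent, c_children, m_child):
    """Relative mismatch between m_j * c^(j) and m_{j+1} * sum_l c^(j+1)_l.
    Values close to 0 support the results; large values flag a problem,
    e.g., a mishandled sample or a quantification error."""
    parent_content = m_parent * c_parent
    children_content = m_child * sum(c_children)
    return abs(parent_content - children_content) / parent_content

# A 9-sample pool with estimated concentration 300 copies/ml split into
# three 3-sample sub-pools (hypothetical numbers):
print(consistency_residual(300.0, 9, [450.0, 430.0, 20.0], 3))  # ≈ 0
```

In practice one would declare the results self-consistent when the residual is within the quantification uncertainty discussed in Sec. 4.1.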

4.3 Formulas to estimate probability from fraction of occurrences.
Here we compile some well-established results on the estimation of probabilities from fractions of occurrences, including a discussion of the conditions that guarantee the validity of Eq. (44). Let us assume that an observation can fall into one of two categories, + and −, with probabilities P and Q = 1 − P, respectively. Let us assume that, given M independent observations, a fraction, P̂, of them falls in the + category. Then, if M is large enough, P̂ is a point estimator of P such that:

P̂ − ζ_{α/2} √(P̂ Q̂/M) ≤ P ≤ P̂ + ζ_{α/2} √(P̂ Q̂/M), (63)

with a 100(1 − α)% confidence level (i.e., there is at least a 100(1 − α)% chance that P is contained in the interval given by Eq. (63)), where ζ_{α/2} is the critical value of the normal distribution for the confidence level [36, 37] and Q̂ = 1 − P̂. In particular, for α = 0.05 (95% confidence) it is ζ_{α/2} = 1.96 and for α = 0.01 (99% confidence) it is ζ_{α/2} = 2.58. The large sample approximation holds if M P ≥ 5 and M Q ≥ 5 [37] and is most accurate if 0.3 ≤ P ≤ 0.7 [36]. Furthermore, under this approximation, the fraction P̂ is normally distributed with mean, P, and standard deviation, σ_P = √(P Q/M). The confidence interval is sometimes enlarged to include a correction for continuity by adding (subtracting) 1/(2M) to the upper (lower) limit of the interval in Eq. (63) [37]. Outside this limit, more complicated expressions can be used for the lower, P_L, and upper, P_U, limits of the confidence interval that are always valid for a binomially distributed variable:

P_L = x / (x + (M − x + 1) F_{2(M−x+1), 2x; α/2}), (64)

P_U = (x + 1) F_{2(x+1), 2(M−x); α/2} / (M − x + (x + 1) F_{2(x+1), 2(M−x); α/2}), (65)

with x = P̂ M and where F_{a,b;α} is the 100(1 − α)th percentile of the F distribution with a and b degrees of freedom [37]. If M P ≥ 5 and M Q ≥ 5 but P < 0.3 or P > 0.7, the lower and upper limits can be computed as:

P_L = P̂ − ζ_{α/2} √(P̂ Q̂/M) − 1/(2M), (66)

P_U = P̂ + ζ_{α/2} √(P̂ Q̂/M) + 1/(2M), (67)

with ζ_{α/2} as defined before and Q̂ = 1 − P̂.
5 Minimum viral load detectability and maximum pool size using ddPCR

The description introduced in the previous subsection allows us to determine the maximum pool size for which the presence of a single infected sample can be detected with a certain probability. Assuming that ddPCR is able to detect even a single RNA particle, the presence of only one infected sample in an m_j-sample pool will be detected if there is at least one viral RNA molecule in the volume of size V_s/m_j with which the sample of the infected individual contributes to the total sample volume, V_s. Let us assume that the only infected sample of the m_j-pool is the i-th one. Using the notation and assumptions of the previous subsection, the probability that there is at least one viral RNA molecule in the testing volume is:

P = 1 − exp(−c_{R;i} V_s/m_j). (68)

Setting a lower bound to this probability we can determine the minimum detectable viral concentration for a given m_j or, conversely, the maximum pool size that allows the detection of at least a certain viral RNA concentration. Inserting Eq. (33) in Eq. (68) we obtain the expression that we use in the paper.
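Inverting Eq. (68) for the pool size gives the largest m_j compatible with a target detection probability. A sketch with V_t = 20µl, so that c_{R;i} V_s/m_j = (c_{R;i}/D) V_t/m_j by Eq. (33); the 700/ml and 2500/ml inputs are the concentration thresholds discussed above:

```python
import math

V_T_ML = 0.02  # testing volume V_t = 20 µl, expressed in ml

def max_pool_size(c_over_d_per_ml, p_min=0.99):
    """Largest pool size m for which a single infected sample of diluted
    concentration c_R/D (copies/ml) is detected, i.e. contributes at least
    one RNA copy to the test, with probability >= p_min:
    1 - exp(-(c/D) V_t / m) >= p_min  <=>  m <= -(c/D) V_t / ln(1 - p_min)."""
    return math.floor(-c_over_d_per_ml * V_T_ML / math.log(1 - p_min))

print(max_pool_size(700))    # 3
print(max_pool_size(2500))   # 10
```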
6 Computation of likelihoods for the detection of flawed samples in pools.
The viability of the tested sample and the quality of the amplification procedure are checked by amplifying the genetic material associated with a piece of human RNA that should be present in any good sample. As described in the paper, we are interested in the possibility of identifying the presence of a single flawed sample in an m-sample pool on which a ddPCR test is run. The underlying assumption is that the conditions that guarantee that this distinction is possible would also allow the identification of cases with more than one flawed sample. To this end, we assume that the number of RNA molecules of the i-th individual in a volume of size V_s, drawn (with no dilution) from the original sample of the same individual, is described by a Poisson distribution with mean λ_i. The distribution is Poisson as well for any sub-volume of V_s, but with mean rescaled by the ratio of volume sizes. Thus, when pooling together m samples whose mean numbers of human RNA molecules in V_s are, respectively, λ_1, λ_2, . . . , λ_m, the number of RNA molecules in the volume V_s of the pool is a Poisson distributed random variable of mean Σ_i λ_i/m. To proceed with the calculation we need to consider the distribution of the human RNA content within the human population. Now, the human RNA content (as well as the viral one) of the samples that arrive at the testing facility depends on multiple factors, including the type of specimen (swabs, saliva, others), the sample handling procedures, the RNA extraction methods, the reagent suppliers and the PCR calibration, besides the intrinsic variability among individuals. There is no empirical data that can support a particular choice for the distribution of human RNA in the samples that are tested.
We then decided to work under the most parsimonious hypothesis, which is to assume that this content follows a Normal distribution, i.e., that the individual means, λ_i, of the good samples are actually instances of a Normally distributed random variable of mean λ and standard deviation σ.
The results, however, are not very sensitive to this hypothesis. In fact, if instead we consider an exponential distribution with the same mean, since the important variable is the sum of the λ_i's, the central limit theorem (at least for m ≳ 20) ensures that the approximation is adequate. We also assume that flawed samples have λ_i = 0. Under these assumptions, an m-sample pool is characterized by a sequence, λ_1, λ_2, . . . , λ_m, of i.i.d. (Normal) random variables if all the samples are good. If there is one flawed sample in the pool, then the corresponding λ_i will be zero and the other m − 1 will be i.i.d. random variables. We notice that, in both cases, the number of RNA molecules is the same in V_s and in the test volume, V_t = D V_s, because the dilution does not change the relevant RNA content.
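The discrimination between an all-good pool and one with a single flawed sample can be explored with a small Monte Carlo experiment. The sketch below is our own illustration: λ = 200 and σ = 40 are arbitrary values (not taken from the paper), the classifier is a simple threshold midway between the two hypothesized means, and the Poisson count is drawn through its normal approximation, which is adequate for these means:

```python
import math
import random

def flawed_pool_accuracy(m, lam=200.0, sigma=40.0, trials=20_000, seed=1):
    """Monte Carlo estimate of the probability of correctly telling an
    all-good m-sample pool from one with a single flawed (lambda_i = 0)
    sample. The pooled human-RNA count is Poisson with mean sum(lambda_i)/m
    (drawn here via its normal approximation); classification uses a
    threshold halfway between the two hypothesized means."""
    rng = random.Random(seed)
    threshold = (m - 0.5) * lam / m   # midpoint of lam and (m-1)*lam/m
    correct = 0
    for t in range(trials):
        flawed = (t % 2 == 1)                    # alternate the hypotheses
        k = m - 1 if flawed else m               # number of good samples
        mean = sum(max(rng.gauss(lam, sigma), 0.0) for _ in range(k)) / m
        count = max(rng.gauss(mean, math.sqrt(max(mean, 1e-9))), 0.0)
        if (count < threshold) == flawed:
            correct += 1
    return correct / trials

# Discrimination degrades as the pool grows: removing one sample changes
# the pooled mean only by a factor (m-1)/m, ever closer to 1.
print(flawed_pool_accuracy(3))    # high accuracy for small pools
print(flawed_pool_accuracy(27))   # much closer to 1/2 for large pools
```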
Nevertheless, in these two particular limits, the accuracy can be calculated exactly using the Binomial distributions and it is equal to 1/2, as shown in Fig. 6(b) of the main body of the paper.
7 Running the strategy in parallel. Numerical examples.
We show in Tables 1 and 2 the results of stochastic numerical simulations performed as before, but this time assuming that at most 96 tests can be run simultaneously and that at most 32 individual samples can be mixed in any given pool. Each row of the tables corresponds to the parallel run of 96 tests on pools of the same size. The upper bound on the number of simultaneous tests comes from one of the typical sizes of the commercially available multi-well plates in which the tests are run. The maximum number of individual samples that can be mixed while still detecting the presence of a single one infected with SARS-CoV-2 using RT-qPCR has been estimated to be 32 in [38]. Pool overflow refers to the pools of the size listed in the row, derived from those that tested positive at the previous stage, which could not be tested because of lack of space in the multi-well plate. Table 1 corresponds to a situation in which the fraction of infected individuals is p = 0.02, for which the optimal number of stages for a scheme of the form (3^k, . . . , 3) is k = k_3 = 3. We then compare in this table the results of running the tests in parallel for the (3^3, 3^2, 3) scheme and for a scheme that starts with m_1 = 32. We use the same population in both cases, with some added samples in the latter case. While the former scheme is optimal for p = 0.02 and satisfies m_1 < 32, the latter puts the largest allowed number of individual samples in each of the first pools. Even if the former starts with fewer samples than the latter, it resolves more cases after the first round of the strategy (first 4 time units in the table) because fewer pools (and samples) are left for the second round of stages. Furthermore, the number of tests per individual turns out to be smaller for the (3^3, 3^2, 3) scheme. Changing the values of m_2 and m_3 in the scheme that starts with m_1 = 32 does not improve its performance.
Namely, the case with m_2 = 16 and m_3 = 4 does not resolve all cases in two parallel rounds, but needs a third one. This is consistent with the proofs of [25] that we referred to before. As for the throughput time, if only the tests included in the table were run as listed, the strategy that starts with m_1 = 32 samples would give a slightly better performance (384 samples/unit-time) than the one that starts with m_1 = 27 (370 samples/unit-time). If all 96 wells of the plate were used in each run, instead, the situation would be reversed, given that running the strategy with m_1 = 27 leaves many more empty wells in the second round (time units 5-7) in which to accommodate additional pools than the one that starts with m_1 = 32. The detailed analysis of how to optimize the use of the plates is the matter of future work. We then perform the same comparison for p = 0.008, for which the optimal scheme of the form (3^k, . . . , 3) has k = k_3 = 4. In this case we would have to start with m_1 = 3^4 = 81 samples, which is not allowed. We therefore compare the same two schemes as before and a scheme that starts with m_1 = 32 but has k = 4 instead of k = 3. We show the results in Table 2. There we see that there is no pool overflow in this case, something that we could have expected for the (3^k, . . . , 3) scheme based on the discussion of Fig. 4(b) in the main body of the paper. The comparison shows that the (3^k, . . . , 3) scheme is the one that requires the smallest number of tests per individual, even if its number of stages is not optimal for the infection probability of the population. That strategy is also the one that leaves the largest number of free wells in which to accommodate new pools or samples among the two that are run in 4 time units. Table 1: Results of stochastic numerical simulations in which the nested pool strategy is performed simultaneously on up to 96 pools, each of which can have at most 32 samples.
At the first stage there are exactly 96 pools, but at the second stage the number can be larger. This is what we call "pool overflow" in the table. These unresolved pools can be tested whenever there is room. In the table we put in the same row all the pools of equal size that are available for testing at that point. The optimal strategy for p = 0.02 (the one with m_1 = 27) produces less overflow than the optimal one among those that start with the largest allowable number of samples in a pool (m_1 = 32). In this way, the strategy with m_1 = 27 finishes the processing of the 2592 initially tested samples in 7 time units, while the one with m_1 = 32 requires 8 to process 3072. Table 2: Same as Table 1 but for a situation with a fraction of infected samples for which the optimal strategy requires that 81 samples be mixed together in the initial pools. In this case, the number of infected samples is low enough so that the average number of initial pools that test positive is very low and there is no "pool overflow".
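A stripped-down version of such a stochastic simulation can be sketched as follows. This sketch ignores the plate-size constraint (and hence pool overflow) and assumes error-free detection, so it reproduces the errorless per-individual test count of the nested scheme rather than the full numbers of Tables 1 and 2:

```python
import random

def simulate_nested(samples, pool_sizes):
    """Total number of tests needed to classify every sample (True =
    infected) with the nested strategy: first-stage pools of m_1 samples,
    positive pools split into m_2-sample sub-pools, and so on, ending with
    individual tests. A pool tests positive iff it contains an infected
    sample (error-free detection); plate-size limits are ignored."""
    m1 = pool_sizes[0]
    pending = [samples[i:i + m1] for i in range(0, len(samples), m1)]
    tests = 0
    for size in list(pool_sizes[1:]) + [1]:
        tests += len(pending)                       # test every pending pool
        positives = [pool for pool in pending if any(pool)]
        pending = [pool[i:i + size] for pool in positives
                   for i in range(0, len(pool), size)]
    tests += len(pending)                           # final, individual stage
    return tests

rng = random.Random(0)
samples = [rng.random() < 0.02 for _ in range(2592)]   # p = 0.02, 2592 samples
print(simulate_nested(samples, (27, 9, 3)) / len(samples))  # tests per sample
```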