Main

Every year, nearly three million people develop active tuberculosis (TB), but are not notified to health authorities1. Some of these individuals may spontaneously resolve their disease, die or be treated in the private sector, but many remain infectious, fuelling ongoing transmission in the community. Reaching this 'missing three million' remains one of the top priorities for global TB control2. A widely cited reason for the ongoing gap between incidence and cases notified is the lack of highly sensitive and deployable diagnostic tests for TB3. Sputum smear microscopy, the global cornerstone of TB diagnosis4, can miss half of all people with infectious TB5, whereas more sensitive tests cannot routinely be implemented at the point of treatment6,7. Nevertheless, the link between improved diagnostic sensitivity and better TB detection remains uncertain. Studies8,9,10,11 in different settings have found little or no change in the number of pulmonary TB diagnoses or deaths when comparing sputum smear microscopy and Xpert MTB/RIF, a more sensitive molecular test12. This result may reflect high levels of empirical treatment among people who test negative13,14,15. Against this backdrop, a key question remains: if novel diagnostic tests are developed and implemented at scale, what impact can we expect on TB epidemiology within populations?

The impact of TB diagnostics on transmission reflects not only the accuracy of the test, but also the way in which patients with infectious TB interact with members of the community and with health systems over time16,17. These infection pathways have at least three crucial dimensions: the transmission rate (number of transmission events per unit time), the frequency at which people contact health systems (often slower in subpopulations with poor access to care), and the probability of starting effective TB treatment after such contact18. Each of these dimensions varies through the duration of infectiousness (from onset to effective treatment, spontaneous recovery or death)19.

Mathematical models can be a useful tool in helping to demonstrate how these dimensions relate to the impact of diagnostic tests on TB transmission20,21,22. Figure 1a depicts the simplest, and most commonly used23,24,25, conceptualization of TB diagnosis in mathematical models so far. In this framework, on becoming infectious, people with TB experience a series of uniform processes. Specifically, they transmit TB at a constant rate, contact the health system at a constant rate and undergo a constant probability of successful diagnosis (leading to appropriate treatment) with each health-system contact. In this framework, the speed at which someone with TB gets treated — and the number of people they infect before that treatment — are strongly related to the sensitivity of the diagnostic algorithm. If, for example, people with TB contact the health system on average every 6 months with a 50% chance of being diagnosed at each visit, the mean duration of infectiousness will be 1 year (approximately the prevalence/incidence ratio estimated by the World Health Organization1). If a more sensitive test (for example, replacing sputum smear microscopy with Xpert MTB/RIF26,27) can increase that probability of diagnosis from 0.5 to 0.75, the mean duration of disease, and thus the transmission per active case, could be cut by one-third. As a result, the projected epidemiological impact of a more sensitive diagnostic test in this framework is tremendous. This conceptualization of the diagnostic process (constant transmission, constant health-system contact and constant probability of successful diagnosis) over time has permeated nearly all projections of expected epidemiological impact from novel diagnostic tests for pulmonary TB — and it is almost certainly wrong. Figure 1b shows an alternative conceptualization of the TB diagnostic process. 
In this framework, the transmission rate, frequency of health-system contact and probability of successful diagnosis can all change over time19. As an illustration, if patients remain infectious for an average of 10 months before seeking care and then begin to contact the health system once a month28,29, a 50% chance of successful diagnosis per visit would still result in a mean duration of infectiousness of 1 year — but increasing the probability of diagnosis from 0.5 to 0.75 would only reduce that duration to 11.3 months. Worse still, if most transmissions occur in the first 10 months, then even a perfect diagnostic test at the health facility could not avert those events. Thus, the dynamic trajectories of transmission, health-care seeking and diagnostic index of suspicion over the course of TB disease are inextricably linked to the epidemiological impact of novel diagnostic tests19,30,31,32,33 — and overly simple depictions of those trajectories may systematically overestimate that impact. Adding complexity to these simple frameworks requires additional data to inform a more nuanced understanding of the impact of diagnostic tests. Without such data, and models with sufficient flexibility to incorporate them, it is likely that projections of the impact of novel diagnostic tests on TB transmission will continue to be biased, often dramatically so.
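The arithmetic behind these two illustrations can be checked with a short calculation. The sketch below is only an approximation under stated assumptions: it treats the number of health-system visits until diagnosis as geometrically distributed (mean 1/p), and the function names are hypothetical, introduced here for illustration.

```python
def mean_duration_constant(contact_interval_months, p_diagnosis):
    """Mean months infectious when care seeking begins at onset of infectiousness.

    Assumes visits until diagnosis are geometric with mean 1/p, so the mean
    duration is the contact interval multiplied by the expected number of visits.
    """
    return contact_interval_months / p_diagnosis


def mean_duration_delayed(delay_months, contact_interval_months, p_diagnosis):
    """Mean months infectious when care seeking starts only after a delay."""
    return delay_months + contact_interval_months / p_diagnosis


# 'Standard' framework: contact every 6 months, diagnosis probability 0.5 vs 0.75.
standard_before = mean_duration_constant(6, 0.50)
standard_after = mean_duration_constant(6, 0.75)

# Alternative framework: 10 months before care seeking, then monthly contact.
delayed_before = mean_duration_delayed(10, 1, 0.50)
delayed_after = mean_duration_delayed(10, 1, 0.75)

print(f"standard: {standard_before:.1f} -> {standard_after:.1f} months")
print(f"delayed:  {delayed_before:.1f} -> {delayed_after:.1f} months")
```

Under these assumptions both frameworks start from the same 12-month mean duration, but the same gain in sensitivity shortens it by a third in the first case and by well under a month in the second.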

Figure 1: Conceptual diagrams of different tuberculosis (TB) diagnostic models.

a, The 'standard' model. So far, most models of TB diagnosis have assumed that, on becoming infectious, individuals with active TB transmit their disease at a constant rate, seek care at a constant rate and maintain a constant probability of diagnosis and treatment with each care-seeking attempt. In reality, the rate at which individuals with active TB transmit disease and seek care, as well as the probability of successful diagnosis and treatment, change over time with the disease course. This process can be more accurately represented by assuming three different stages in the TB diagnostic pathway, as represented at the bottom of b. This framework accommodates different types of variation that can be crucial in the potential impact of a test. For example, patients might transmit their disease at an increasing rate over time as bacillary burden increases, seek care more frequently as symptoms progress, and be more likely to receive ancillary diagnostic tests (or empiric treatment) as symptoms persist and other diagnoses become less likely.

So far, test accuracy (sensitivity and specificity) and, to a lesser extent, feasibility of implementation in peripheral settings have dominated thinking about the 'value' of new TB diagnostic tests. However, the impact of any novel TB diagnostic test will depend on how the health-care system incorporates it34, as well as on the dynamics of patient interactions with that health-care system (Fig. 1). Epidemiologically, therefore, a novel diagnostic assay should be evaluated not by its sensitivity and specificity, but rather by the extent to which it provides diagnostic information beyond earlier tests and practices35: its incremental value. This idea is similar to the classic concept of the expected value of diagnostic information (EVDI) promoted by Phelps and Mushlin36, who also highlighted the need to combine the EVDI with estimates of cost or resource requirements. Subsequent work has expanded on this concept37,38. In this paper, we use principles of infectious-disease modelling and diagnostic epidemiology to argue for a change in conceptual approach, from one that has focused primarily on a test's sensitivity to one that centres on its incremental value.

Methods

Quantifying the incremental value of diagnostic tests for TB. In the context of TB, there are a number of benefits that new diagnostics could provide. These include, but are not limited to, averting TB transmission, averting TB morbidity and mortality9, saving money39,40, freeing up health-care capacity for other activities, enabling better treatment of other conditions by ruling out TB41 and improving patients' economic situations42 or quality of life43. We focus here on the use of novel diagnostic tests as tools to avert TB transmission; however, the intention of some tests may be to add value in one or more of these other areas — and each test's utility should be evaluated according to its intended purpose.

To appropriately estimate the incremental value of a new diagnostic test for TB in terms of transmissions averted, one must consider its relationship to the diagnostic pathways outlined in Figure 1. Table 1 lists four defining features of TB disease and diagnosis (latency44, gradual symptom onset45,46, reliance on sputum47 and concentration of transmission among 'superspreaders'48). These features highlight a number of potential diagnostic gaps, or elements along the TB diagnostic pathway, which, if filled by a novel diagnostic test, could generate substantial incremental value.

Table 1 Four potential diagnostic gaps in tuberculosis (TB).

Specifically, in any given setting, TB transmission may occur primarily from people who are not sufficiently ill to seek care49,50; those who are seeking care, but have symptoms (for example, a mild cough) not specific to TB51; or those with severe or prolonged symptoms, but who test negative for TB and are therefore not treated (Fig. 1b). Alternatively, most transmission may occur from hard-to-reach populations in which the rate for seeking care is low52. Each of these gaps suggests a potential diagnostic solution that would have high incremental value. This may be a test to predict progression to active TB (and thus allow targeted preventive therapy), one optimized for diagnosing combinations of symptoms (such as cough and fever), one that is simply more sensitive, or one that is more deployable to peripheral and informal settings (Table 2)53. We incorporate these possibilities more formally into a mathematical model of TB transmission.

Table 2 Profiles of three illustrative diagnostic tests for tuberculosis (TB).

Model description. Figure 2 presents a simple, illustrative model of TB diagnosis and transmission that expands the constant care-seeking approach shown in Figure 1a. In this model, the population is divided into different compartments that reflect the natural history of TB and incorporate both the stages of the diagnostic pathway shown in Figure 1b and the corresponding diagnostic gaps listed in Table 1. Movement of people between these compartments can be represented by a system of ordinary differential equations, with rates of transition between compartments (for example, γ0, the rate of initiating care seeking) that reflect the inverse of the mean duration of time spent in each phase (for example, the mean duration between onset of infectiousness and beginning to seek care). As most of these durations are currently unknown (and differ from one setting to the next), we assume — for the purposes of illustration — a population that is at equilibrium, with values of TB incidence, prevalence and mortality that reflect a setting of moderate TB burden (see Supplementary Information). We then use this simplified model to estimate, in this hypothetical setting, the incremental value of diagnostic tests with different profiles under different assumptions about the relative importance of each diagnostic gap. This simplified model divides the population of individuals with active TB into three categories (Figs 1b and 2): those who are infectious, but who are not actively seeking care (I0), those who have early symptoms that trigger less frequent care seeking and who have a lower probability of correct diagnosis/empiric therapy (I1), and those who have characteristic and prolonged symptoms that trigger frequent care seeking and a likely diagnosis with each attempt (I2). 
We also assume a general population and a sub-population (I', set at 10% for the purposes of illustration) with 'poor' access to care whose rate of care seeking is a specified fraction (k, set initially at 0.5) of the rate in the general population.

Figure 2: Model structure relating diagnostic pathways to transmission load.

A representation of a simple mathematical model that incorporates the three stages of diagnosis shown in Figure 1b. Relative rates of transmission, β, can vary from one stage to the next, with γ representing the inverse of the mean duration of each stage at the population level. Upward arrows denote removal of cases through diagnosis and curative treatment, d, as well as spontaneous resolution (not shown, for simplicity). We also assume a fixed proportion of the population (10% in the base case) have 'poor' access to care, defining an 'access disparity parameter' k to reflect the relative rates of diagnosis in this population. At baseline, we assume that k = 0.5. TB, tuberculosis.

Importantly, this model captures the three dynamic processes of transmission, health-care seeking and empiric treatment shown in Figure 1b. First, the rate of transmission (the probability of a 'contact' resulting in TB transmission, multiplied by the number of potential contacts per unit time) can vary over time. For example, β0 (the number of transmissions per person-month spent in the asymptomatic infectious state I0) may be higher than β1 and β2 (transmission rate from the symptomatic states I1 and I2), because the contact rate with susceptible individuals may be highest early in the disease course (suggested by the high prevalence of TB infection in contact investigations54). Alternatively, the inverse might be true because the bacillary burden grows over time55. We capture this in the concept of the 'transmission load', which we define as the proportion of transmission events at the population level that occur in each of these three stages. Second, the rate of seeking care can increase over time as symptoms progress. Third, the probability of diagnosis with each care-seeking attempt can also increase over time, as symptoms become more suggestive of underlying TB disease56. These two processes can be combined into a single 'rate of successful diagnosis and treatment' (d) that increases over time from d0 to d1 to d2.

We explore three hypothetical settings for how transmission varies during the course of TB disease: late diagnostic gap, in which the transmission rate β is four-fold higher at each subsequent stage of TB disease (for example, constant contact rate with susceptible individuals with increasing bacillary burden); early diagnostic gap, in which β falls by a factor of four at each stage (for example, pool of susceptible individuals shrinks over time as household members and other close contacts are exposed); and high access disparity, in which those with least access to care are assumed to have a rate of diagnosis and treatment that is 10% (rather than 50%) that of the general population. Each setting is calibrated to have the same level of TB incidence (see Supplementary Information).
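To make the idea of a transmission load concrete, the following sketch computes the share of transmission arising in each stage for a single cohort. It is a deliberately reduced illustration, not the model used in this paper: it ignores spontaneous resolution and death, assumes no diagnosis in stage I0, does not calibrate the settings to a common incidence, and applies the late-gap transmission rates to the high access disparity setting. The values of `gamma1` and `d1` are hypothetical; the 6-month and 1-month stage durations, the 10% poor-access fraction and the values of k are taken from the text.

```python
def transmission_load(betas, gamma0, gamma1, d1, d2, poor_fraction=0.10, k=0.5):
    """Return {(group, stage): share of all transmission} for a simplified cohort.

    betas: relative transmission rates in stages I0, I1 and I2.
    gamma0, gamma1: monthly rates of progression I0 -> I1 and I1 -> I2.
    d1, d2: monthly rates of diagnosis and treatment in I1 and I2.
    k: access disparity parameter scaling d1 and d2 in the poor-access group.
    """
    loads = {}
    groups = (("general", 1.0 - poor_fraction, 1.0),
              ("poor_access", poor_fraction, k))
    for group, weight, scale in groups:
        time_i0 = 1.0 / gamma0                 # everyone passes through I0
        exit_i1 = gamma1 + scale * d1          # leave I1 by progression or diagnosis
        time_i1 = 1.0 / exit_i1
        reach_i2 = gamma1 / exit_i1            # probability of progressing to I2
        time_i2 = reach_i2 / (scale * d2)      # expected months spent in I2
        for stage, beta, months in zip(("I0", "I1", "I2"), betas,
                                       (time_i0, time_i1, time_i2)):
            loads[(group, stage)] = weight * beta * months
    total = sum(loads.values())
    return {key: value / total for key, value in loads.items()}


# gamma1 and d1 are hypothetical illustration values, not taken from the paper.
rates = dict(gamma0=1 / 6, gamma1=1 / 3, d1=1 / 6, d2=1.0)

late_gap = transmission_load((1, 4, 16), **rates)              # beta rises four-fold per stage
early_gap = transmission_load((16, 4, 1), **rates)             # beta falls four-fold per stage
high_disparity = transmission_load((1, 4, 16), k=0.1, **rates)

for name, loads in (("late", late_gap), ("early", early_gap),
                    ("high disparity", high_disparity)):
    shares = ", ".join(f"{g}/{s}: {v:.0%}" for (g, s), v in loads.items())
    print(f"{name}: {shares}")
```

Even in this reduced form, the qualitative pattern is visible: the stage that carries most of the load shifts with the assumed trajectory of β, and lowering k moves transmission towards the poor-access group.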

In the context of each of these settings, we explore the potential incremental value of four illustrative diagnostic tests: a 'progression biomarker' that predicts progression from latent to active TB (to facilitate preventive therapy)57; a 'triage' test that facilitates syndromic diagnosis of people presenting with cough58; a more sensitive 'replacement test' to supplant current sputum-based confirmatory tests for TB59; and a 'point-of-care test' that can replace sputum smear in peripheral settings60, thereby (unlike the other three tests) being accessible to those with poor access to care. These tests, along with their mathematical representation in our simplified modelling framework, are summarized in Table 2.

We focus on comparisons between these types of diagnostic tests when they are added to the standard of care. To illustrate the transmission contributions of different groups, we assume that progression biomarker, triage and replacement tests are deployed in the general population, whereas the point-of-care test is deployed in the poor-access population. We discuss below how different diagnostic gaps might cause each of these illustrative tests to be preferred over the others, thereby emphasizing the importance of quantifying (or at least estimating) the diagnostic gap in any given setting.

Incorporating resource constraints. Ultimately, discussions of a new diagnostic test's incremental value must also consider any constrained resources — whether economic or otherwise — that would be required to implement the test. One method for evaluating the incremental value of a diagnostic test in a given setting is to first identify any constrained resources required for test implementation. The additional resources required to change from the existing standard of care to an algorithm that augments that standard of care with the new diagnostic test can then be estimated (the incremental resource requirement)61. Finally, this is combined with estimates of the incremental number of transmissions averted under this augmented algorithm, relative to the standard of care (incremental impact). Thus, tests that aim to avert TB transmission can be compared using an inverse incremental cost-effectiveness ratio62: (incremental transmissions averted)/(incremental resource requirement), or

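$$\text{Incremental value} = \frac{\text{transmissions}_0 - \text{transmissions}_1}{\text{resources}_1 - \text{resources}_0} \qquad (1)$$

where 1 denotes the presence of the new test and 0 denotes its absence.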

In settings in which TB diagnostic tests are being compared with other interventions (for example, TB treatment or HIV diagnosis), transmissions averted can be converted into measures of health utility (such as disability-adjusted life years, or DALYs, averted)63 to estimate resources in terms of economic costs and to report this incremental value as an incremental cost–effectiveness ratio. However, when only comparing diagnostic tests with the same primary aim (to avert transmission), the formulation of incremental value in Equation 1 may be more useful; this formulation places the emphasis on impact rather than cost and does not require additional model assumptions to convert transmissions into DALYs or constrained resources (for example, human capacity) into economic costs. Therefore, we use this more direct formulation in our model results.
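The ratio of incremental transmissions averted to incremental resource requirement can be sketched in a few lines. The function names and all numerical inputs below are hypothetical, chosen only to show the calculation and the benchmarking of each test against a reference test; they are not results from the model.

```python
def incremental_value(transmissions_without, transmissions_with,
                      resources_with, resources_without):
    """Incremental transmissions averted per unit of incremental resource."""
    extra_resources = resources_with - resources_without
    if extra_resources <= 0:
        raise ValueError("the new test must require additional constrained resources")
    return (transmissions_without - transmissions_with) / extra_resources


def benchmark(values, reference):
    """Rescale incremental values so that the reference test equals 1."""
    return {name: value / values[reference] for name, value in values.items()}


# Hypothetical inputs: annual transmissions and units of the constrained resource.
values = {
    "smear_replacement": incremental_value(1000, 950, 30, 20),
    "cough_triage": incremental_value(1000, 900, 40, 20),
}
relative = benchmark(values, "smear_replacement")
print(relative)
```

Expressing every test relative to one reference keeps the comparison independent of the units chosen for the constrained resource.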

Results

Incremental value of TB diagnostic tests. Figure 3 shows how the transmission load at equilibrium (the proportion of population-level transmission contributed by each stage) differs in each transmission scenario. For example, in the late diagnostic gap scenario, 35% of all transmission originates from individuals with mild symptoms in the general population, whereas this percentage falls to 5% in the early diagnostic gap scenario. Importantly, averting transmission in the earlier stages (for example, preventing a case from developing, even before care seeking) also averts the transmission that the case would have caused in later stages, as seen in Figure 3 by the combined value of the stacked bars. Thus, for example, preventing all transmission in the latter two care-seeking stages in the general population would avert 51% (35% + 16%) of all transmission in the late diagnostic gap scenario, compared with only 5% in the early diagnostic gap scenario. A diagnostic test targeting these stages (for example, the 'cough triage' test) might therefore be expected to have greater impact in settings that more closely resemble the late diagnostic gap scenario.
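The cumulative effect described here (interrupting transmission at one stage also averts transmission in all later stages) amounts to a simple sum. The sketch below uses the two late diagnostic gap values quoted in the text for the general population; the function name is hypothetical, and the I0 load is omitted because it is not stated.

```python
STAGES = ("I0", "I1", "I2")  # pre-care seeking, mild symptoms, prolonged symptoms


def max_transmission_averted(target_stage, load_by_stage):
    """Sum the transmission load in the targeted stage and all later stages."""
    start = STAGES.index(target_stage)
    return sum(load_by_stage.get(stage, 0.0) for stage in STAGES[start:])


# Late diagnostic gap scenario, general population (values quoted in the text).
late_gap_general = {"I1": 0.35, "I2": 0.16}

averted_from_i1 = max_transmission_averted("I1", late_gap_general)
averted_from_i2 = max_transmission_averted("I2", late_gap_general)
print(f"targeting I1: {averted_from_i1:.0%}, targeting I2: {averted_from_i2:.0%}")
```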

Figure 3: Tuberculosis (TB) transmission load under three alternative scenarios.

The size of each bar denotes the transmission load, defined as the percentage of all tuberculosis transmission that occurs within a given diagnostic stage. Transmission from the general population is shown in darker colours, with that originating from the 'poor-access' population shown in lighter colours. Interrupting transmission at a given stage also averts transmission in subsequent stages (for example, diagnosing a case in stage I1 also averts the transmission that this case could have caused in stage I2); this effect can be calculated as the sum of the transmission load in the relevant stage plus all subsequent stages within that population.

A notable feature of the late diagnostic gap scenario is that, despite transmission being substantially more intense64 in the prolonged-symptom stage I2 (16 times greater per unit time than in the pre-care-seeking stage I0), the contribution of this stage to transmission remains relatively modest. This is largely due to the relatively short time that individuals spend in this late symptomatic stage. We assume here that, under the standard of care (typically using sputum smear microscopy), individuals are diagnosed on average after 1 month in this late symptomatic stage, compared with 6 months spent in the asymptomatic stage. However, the high access disparity scenario shows the potential importance of the late symptomatic stage when the rate of diagnosis is diminished. Here, transmission in the late symptomatic stage is sufficiently strong for 55% of the transmission load to occur from a high-risk (and symptomatic) subgroup that accounts for no more than 10% of the total population — a level of disproportionate transmission that is only modestly higher than has been suggested in some settings65.

Incremental value of new diagnostic tests under constrained resources. Figure 4 shows results for the incremental value (Equation 1), comparing diagnostic tests that target different stages and under different transmission scenarios. For the denominator of Equation 1, Figure 4 assumes a simple, illustrative example for which the constrained resource is the number of individuals who can be tested with a novel test, irrespective of the test type or its unit cost (see Supplementary Table 2 for further details). This might, for example, reflect a setting in which donor funding could be obtained to implement a new test, but the equipment or human resources available to conduct those tests were extremely limited. For the numerator of Equation 1, Figure 4 assumes the maximum number of transmissions averted if the diagnostic test in question (such as the cough triage test) could avert all of the transmission occurring in the stage of disease targeted (for example, I1, mild symptoms, but seeking care). In practice, owing to factors such as imperfect sensitivity and incomplete population-level implementation, an actual test would only avert a portion of that maximum transmission load; the actual incremental value of each test would therefore be proportionally lower. Thus, in dividing the maximum incremental impact by the fixed incremental resources available, Figure 4 compares the maximum incremental value for each idealized test type, leaving it to subsequent work to estimate what proportion of that maximum could actually be achieved by a given test in practice.

Figure 4: Maximum incremental impact per unit of constrained resources for four illustrative diagnostic tests in three alternative scenarios.

On the y-axis is the maximum incremental impact (number of tuberculosis (TB) transmissions averted) for each of four illustrative TB diagnostic tests, divided by the incremental resources required to implement each test (Equation 1). All measures are benchmarked to an incremental impact of 1 for the smear replacement test in the general population. Here, we assume that the constrained resources are simply proportional to the number of people needed to test to diagnose one additional case of active TB. The maximum incremental impact is the number of transmissions that would be averted if diagnosis averts all transmission associated with a given patient stage in Figure 2. Accordingly, the results presented here should be interpreted as upper bounds that illustrate the role of diagnostic gaps in each stage. In the cases illustrated here, the 'progression biomarker' (which identifies individuals at risk for progression to active TB) is clearly favoured in the early diagnostic gap scenario, whereas the point-of-care test (which replaces the smear test and is deployed only in the poor-access population) is strongly favoured in the high access disparity scenario.

Figure 4 illustrates that, where the primary diagnostic gap is early in the disease course, the maximum incremental value for tests that target earlier stages is higher than that of the smear-replacement test. By contrast, when the primary diagnostic gap is late in the disease course, the maximum incremental value of the later-stage diagnostics is far greater. Notably, where transmission is concentrated among a population with particularly poor access to care, the maximum incremental value for a test that can be implemented in this population can be considerably higher than for any other test (as in the high access disparity scenario).

Figure 5 shows an alternative scenario for the denominator of Equation 1 in which the limiting resource is financial (for example, a fixed amount of money available), assuming that the cost per test is higher when applied earlier in the diagnostic pathway. (For example, it is more costly to screen a patient for TB in a prevalence survey66 than it is in a clinic67.) The 'unit cost' of a test is also assumed to be higher per person when applied in the poor-access population (see Supplementary Table 2), as these individuals are assumed to be harder to reach than the general population. In the early diagnostic gap scenario, for example, considering this unit cost dramatically lowers the maximum incremental value of the biomarker test that could be achieved per unit of the constrained resource, relative to the cough triage and smear replacement tests. As a result, under this alternative resource constraint, the cough triage test, rather than the biomarker, would be preferred.

Figure 5: Maximum incremental value per unit of constrained resources after incorporation of a cost function.

The same illustrative tests are evaluated in the same alternative scenarios as in Figure 4, but in this case we apply a cost function that accounts for the fact that diagnosis earlier in the disease process, or among low-access populations, is generally more resource-intensive on a per-test basis (see Supplementary Table 2 for full assumptions). After considering this cost function, the 'progression biomarker' is no longer clearly favoured in the early diagnostic gap scenario, and the degree to which the point-of-care test is favoured over the smear replacement test in the high access disparity scenario is reduced by the same factor (8 in this case) by which the cost per person screened in the low-access population exceeds that in the general population. As in Figure 4, all measures are benchmarked to an incremental impact of 1 for the smear replacement test in the general population.
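The effect of such a cost function can be illustrated by dividing each test's maximum incremental value by its relative cost per person tested. In the sketch below, only the eight-fold cost ratio for the poor-access population comes from the text; the function name and the maximum incremental values are hypothetical placeholders.

```python
def apply_cost_function(max_value_by_test, relative_unit_cost):
    """Divide each test's maximum incremental value by its relative unit cost."""
    return {test: value / relative_unit_cost[test]
            for test, value in max_value_by_test.items()}


# Hypothetical maximum incremental values, benchmarked to the smear replacement test.
max_value = {"smear_replacement": 1.0, "point_of_care": 16.0}

# Relative cost per person tested; 8 is the poor-access ratio quoted in the caption.
unit_cost = {"smear_replacement": 1.0, "point_of_care": 8.0}

adjusted = apply_cost_function(max_value, unit_cost)
advantage_before = max_value["point_of_care"] / max_value["smear_replacement"]
advantage_after = adjusted["point_of_care"] / adjusted["smear_replacement"]
print(f"point-of-care advantage: {advantage_before:.1f}x -> {advantage_after:.1f}x")
```

Because the reference test has a relative cost of 1, the advantage of the point-of-care test shrinks by exactly the cost ratio, which is the behaviour described in the caption.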

Discussion

In evaluating novel diagnostic tests for TB, it is crucial that we move beyond simple considerations of elements such as sensitivity, specificity and turnaround time — and instead begin to consider the incremental value of diagnostic tests that fit certain profiles. We use a simple mathematical model to demonstrate key trade-offs in an illustrative setting. This work demonstrates how diagnostic tests for TB can be quantitatively assessed in terms of their incremental value (incremental impact divided by incremental resource requirement), and moreover how this incremental value can vary from one setting to the next.

The prevailing diagnostic gap in a given setting has a profound effect on the potential incremental impact of each diagnostic test. When most transmission occurs before patients begin to seek care, diagnostic tests that require patients to access the health system are unlikely to have substantial epidemiological impact; thus, in the early diagnostic gap scenario (Fig. 3) the only diagnostic test capable of averting the bulk of the transmission load is the progression biomarker. Similarly, when a substantial disparity exists between the high-risk and general population, diagnostics that cannot be implemented in the high-risk group are limited in their potential.

Consideration of incremental impact must also include consideration of incremental resource requirements, however. For example, the resources required to avert a transmission are generally much greater when diagnostic tests are performed early in the disease course66, or in hard-to-reach populations. As a result, those diagnostic tests with the largest maximum incremental impact may also be those that require the most resources. In estimating the incremental resource requirement of a given test, it is important to consider the resources for TB control that are constrained in a given setting. In many cases, these constrained resources will be purely financial, but in others, there may be limitations on the availability of trained staff or laboratory capacity to perform certain tests68. The per-test incremental outlay of the most constrained resources is therefore also likely to vary from one setting to the next. Ultimately, the incremental value of a TB diagnostic test depends not only on sensitivity and specificity, but also on multiple factors that will vary from one system to the next (Box 1). For any setting, all six of the elements in Box 1 should be evaluated to help to identify the type of novel test that is likely to have the greatest incremental value (avert the most TB transmission events, given the constrained resources). As assessments of these factors are performed across a variety of settings, consensus may emerge as to which tests should be prioritized for development.

Unfortunately, we currently lack the empirical data in most settings to make such an informed assessment. Specifically, it is likely that different transmission loads and diagnostic gaps — early, late or among high-risk subpopulations — predominate in different settings, and that resource constraints vary widely from one setting to the next. How can this data gap be closed?

First, we require better evidence regarding how novel diagnostic tests function when implemented under field conditions. Such data would allow us to estimate the proportion of any diagnostic gap that a new TB test could close, as well as the number of tests required to make one additional diagnosis. Unfortunately, most diagnostic tests are evaluated primarily in well-funded trials and demonstration studies, without good evidence of how they perform in the real world. For example, Xpert MTB/RIF was recommended on the basis of high-quality data about its accuracy and cost-effectiveness under controlled conditions and in a large field trial26; however, emerging evidence has suggested that, in many settings, the characteristics of Xpert may be different when implemented in the field — including its sensitivity69, calibration70, positive predictive value (owing to low pre-test probability)71, and accuracy for rifampin resistance72. To make accurate assessments of the incremental value of diagnostics, we should collect such data early after launch, and update expectations and recommendations as those data become available.

Second, we need better data on the performance of existing tests, including clinical judgement. These data would enable us to evaluate the incremental number of transmissions that a novel test might be able to avert, relative to the existing standard of care. A series of recent high-quality studies suggests that, when patients present with symptoms that are highly suggestive of TB in upper-middle income settings (for example, South Africa and Brazil), the probability of empirical diagnosis is reasonably high8,9,10,11 — but that a large number of people may be presenting to care with a cough without TB ever being considered8. Such studies are crucial to understand the likely diagnostic gaps for TB, but unfortunately, very few such analyses have been performed in settings with fewer resources (for example, most of sub-Saharan Africa73,74 and Southeast Asia) where empirical diagnosis rates (and the capacity to implement novel diagnostic tests) may be much lower. Characterizations of relative TB transmission from high-risk populations (akin to the 'low-access' population in Figure 3) compared with the general population are also sparse75, and could potentially be informed by better use of surveillance data76.

Third, and perhaps most challengingly, we need to prioritize characterizations of the transmission load and diagnostic gaps in a variety of settings. If we can describe the prevailing transmission loads in any given setting, we can then quantify the maximum incremental impact (transmissions averted) of any diagnostic test in that setting. Ultimately, for any setting, one should be able to delineate what proportion of the transmission load in each of the phases of TB (pre-care seeking, mildly symptomatic and prolonged symptomatic in the general population and in high-risk groups) is being averted using existing tests, and therefore what proportion might still be amenable to implementation of a novel diagnostic. Molecular characterization of TB (for example, through whole-genome sequencing77) in entire populations is becoming available and can be linked to conventional epidemiological investigations (for example, through contact investigations78) using increasingly discriminatory tools for analysis and data collection79. Thus, it may become possible to triangulate an infectious individual's onset of symptoms, initiation of care-seeking activities and specific transmission events. Studies that merge data on transmission, contact patterns, symptom histories, care-seeking patterns and interactions with the health-care system on a population level should be prioritized in this regard. In the meantime, simple investigation of surveillance data can help to identify geographic hotspots of transmission, and operational analyses of diagnostic test implementation can demonstrate where diagnoses are probably being missed. Although estimating the duration of an infectious episode poses significant challenges, household cohorts using currently available tools could cast some light on the 'transmission load' that occurs early in the clinical course80,81.
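
The accounting described above, in which the transmission load in each phase is split into the part already averted by existing tests and the part that remains, can be sketched as follows. All phase shares and averted fractions are hypothetical placeholders rather than estimates from any study, and the phases are collapsed across the general and high-risk populations for brevity.

```python
def residual_transmission(load_share, fraction_averted):
    """Transmission load remaining in each phase after existing diagnosis.

    load_share: share of all transmission that would occur in each phase
        in the absence of diagnosis (shares sum to 1).
    fraction_averted: fraction of each phase's load that existing tests
        and clinical judgement already avert.
    """
    return {
        phase: share * (1.0 - fraction_averted[phase])
        for phase, share in load_share.items()
    }


# Hypothetical values, for illustration only.
load_share = {
    "pre_care_seeking": 0.30,
    "mildly_symptomatic": 0.30,
    "prolonged_symptomatic": 0.40,
}
fraction_averted = {
    "pre_care_seeking": 0.0,
    "mildly_symptomatic": 0.2,
    "prolonged_symptomatic": 0.5,
}

residual = residual_transmission(load_share, fraction_averted)
ceiling = sum(residual.values())  # upper bound on further transmissions averted

for phase, value in residual.items():
    print(f"{phase}: {value:.2f} of total transmission remains")
print(f"ceiling on incremental impact: {ceiling:.2f}")
```

The total is a ceiling rather than a forecast: a test deployed only where people already seek care cannot reach the residual load in the pre-care-seeking phase, so the share that is truly amenable to a novel diagnostic may be smaller.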

Finally, we need setting-specific investigations that enumerate the resources that are genuinely constrained and quantify the amount of each resource consumed per test performed (as the equivalent of a unit cost). Although conventional economic evaluations of interventions against diseases such as TB implicitly consider money to be the most constrained resource, other studies in low-income settings have shown that human resources, laboratory capacity, regulatory infrastructure or ability to implement new interventions may be the key limiting factors68. This may be especially true in the modern era of direct assistance for health, which may supply money but not resources in the form of trained personnel82. An understanding of the most constrained resources in any given setting must then be merged with data on the number of tests required to identify an incremental case, as well as the per-test resource outlay, for any given novel diagnostic test. Only if we truly understand the resources that are most constrained in a given setting, as well as the resource outlay for each type of diagnostic test, can we identify the diagnostic tests that will optimize epidemiological impact under existing resource constraints.
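
One way to combine these pieces is to compute, for each candidate test, how many incremental cases each constrained resource could support and to take the smallest of those figures as the achievable yield. The sketch below assumes a linear resource model; the test names, budgets and per-test outlays are hypothetical and serve only to show how the binding constraint can differ between tests.

```python
def achievable_incremental_cases(budget, test):
    """Incremental cases supportable under the tightest resource constraint.

    budget: available quantity of each resource in the setting.
    test: dict with 'tests_per_case' (tests required to identify one
        incremental case) and 'outlay' (resource use per test performed).
    Returns (cases, name of the binding resource).
    """
    limits = {
        resource: budget[resource]
        / (test["outlay"][resource] * test["tests_per_case"])
        for resource in budget
    }
    binding = min(limits, key=limits.get)
    return limits[binding], binding


# Hypothetical setting and tests, for illustration only.
budget = {"usd": 100_000, "technician_minutes": 20_000}
tests = {
    "A": {"tests_per_case": 80, "outlay": {"usd": 10, "technician_minutes": 5}},
    "B": {"tests_per_case": 100, "outlay": {"usd": 15, "technician_minutes": 2}},
}

for name, test in tests.items():
    cases, binding = achievable_incremental_cases(budget, test)
    print(f"test {name}: {cases:.1f} incremental cases, limited by {binding}")
```

In this illustration, test A costs less money per incremental case but is limited by technician time, so test B identifies more incremental cases in the same setting. A ranking based on money alone would have preferred A.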

Ultimately, the only way to end TB is to diagnose and treat people with TB before transmission occurs, and novel diagnostics are an essential component of any strategy with this aim. If we are to succeed in that endeavour, we must think of, and quantify, those tests not just in terms of sensitivity, specificity and turnaround time, but also in terms of their incremental value across a variety of epidemiological settings. We present a framework for estimating this incremental value that also highlights the need for additional data to inform more appropriate prioritization of novel TB diagnostic tests across settings that may differ in their existing diagnostic gaps and resource constraints. As we continue to develop diagnostic tests with the goal of curbing TB transmission, we must think beyond accuracy and consider the broader context of patient behaviour, health systems and TB natural history.