Preface
The introduction of new diagnostic tools can help to reduce the large burden of disease in the developing world. New tests that can accurately discriminate between patients who do and do not need treatment will reduce mortality, morbidity and the waste of scarce resources. Although high-performance tests are desirable, those that are more accurate usually require greater levels of infrastructure and are therefore accessible to fewer people. Here we outline an approach for estimating the health benefits of new diagnostic tools, and examining the trade-offs between accuracy and infrastructure requirements.
Introduction
An essential component of evaluating and improving global health is access to appropriate diagnostic tools. As described in the other articles in this supplement, the current diagnostic tests for many diseases do not meet the needs of the developing world. Some tests require technological capabilities and infrastructure that are beyond the resources of developing countries, while others are too costly to be used.
Developing a rational strategy for investment in diagnostic technologies requires a means to determine the need for, and the health impact of, potential new tools. This paper outlines an approach for modelling the health benefits of new diagnostic tools. The framework was developed by the RAND Corporation in conjunction with the Bill & Melinda Gates Foundation and the partnership they formed in 2004 — known as the Global Health Diagnostics Forum — with domain experts in relevant disease areas, representatives from the diagnostics industry and technology development arena, and experts in the modelling of disease impact and the application of diagnostic technologies. The results of disease-specific interventions and the roles of new diagnostic technologies are reported in the other articles in this supplement, and are also available in a series of RAND reports (http://www.rand.org/health/feature/research/0612_global.html).
In order to determine the health impact of a new diagnostic test, our approach divides the problem into two tasks: first, we establish the effect that a specific diagnostic tool might have on the reduction of the disease burden; and second, we identify the performance characteristics and user requirements that a diagnostic tool must have to realize that reduction. The first task requires disease-specific modelling of the status quo and the changes that could occur were a new diagnostic to become available in certain settings. The product of this effort is a tool that — given the sensitivity and specificity of a potential new diagnostic, and an estimate of the proportion of people who will have access to it — can predict the health impact of a test using a number of different health outcomes. The second task involves defining the characteristics of diagnostics, such as the type of infrastructure needed to be operational, sensitivity and specificity, and estimating the proportion of people that will have access to different types of test. We refer to these characteristics as "user requirements". This task requires us to define representative health-care settings in the developing world, identify their capabilities and estimate a patient's access to different levels of care.
The methods and approaches described in this article can be applied to disease-specific problems to provide guidance for technology developers on the infrastructure and user requirements of new diagnostics, with the aim of achieving a health impact.
Methods
Modelling framework
The guiding principle behind our model is that, in order to estimate the effect of any intervention, we must begin with a good description of the status quo. The effect of an intervention is modelled by changing key parameters of the status quo and comparing the outcomes of the status quo with those of a world in which the intervention takes place.
Modelling the status quo
The status quo is modelled by determining the types of diagnostic available in a country, the proportion of individuals who have access to them and the relevant epidemiological parameters. These data are used to divide the population into mutually exclusive subgroups according to their trajectories through the health-care system, and to assign a health outcome to each.
We can describe the status quo in terms of a sequence of events, as detailed below and shown in the probability tree displayed in Fig. 1.
Figure 1: Probability tree.

The population of interest is positioned at the base of the tree. The population is then split into three different access levels, depending on whether and where its members enter the health-care system. Within each access level, individuals might be tested and experience either a positive or a negative result. They are then further divided according to their disease status, and, as a result, are assigned one of the possible four test outcomes: true positive (TP), false positive (FP), true negative (TN) and false negative (FN).
First, an event (for example, a sufficiently severe bout of illness) occurs that prompts an individual to seek care. The probability that an individual will take this course of action depends, in general, on epidemiological parameters, such as the prevalence of a condition (that is, the proportion of the population affected by it) as well as its severity distribution.
Second, individuals who seek care will enter the health-care system at different points (for example, a clinic or urban hospital), whereas others might fail to obtain care. Those who enter the system will be offered different types of tests. For our purposes, not receiving a test is equivalent to receiving a test that is 100% specific and 0% sensitive. The probability that an individual is given a specific type of test is conditional on health-care-seeking behaviour, and is determined by the type of facility that the individual accesses.
Third, for any given test, patients will experience different test outcomes (that is, true positive, false positive, false negative or true negative), with probabilities that depend on the test characteristics and the prevalence of the condition.
Fourth, depending on the test outcome, patients will follow different treatment trajectories, which will ultimately be associated with one or more health outcomes. In the simplest case, all patients who test positive will be treated; however, many alternate scenarios are possible. For example, when test results are not immediately available, some individuals might fail to return. Moreover, those who do return might not have access to available treatment, the treatment might not be 100% efficacious or its administration might be conditional on the result of a further round of testing. In all of these cases, a group of patients is split into subgroups, each of which is assigned to a particular health outcome.
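The four steps above can be sketched as a small calculation. The following Python sketch assumes a simplified tree in which every care-seeking individual falls into exactly one access level and disease prevalence is the same at every level. The function name, the access shares and the test characteristics are hypothetical illustration values, not figures from the models reported in this supplement.

```python
def tree_outcomes(population, prevalence, access_levels):
    """Split a care-seeking population into TP/FP/FN/TN counts.

    access_levels maps a level name to (share, sensitivity, specificity).
    Shares are assumed to sum to 1. A level at which no test is given is
    encoded as sensitivity 0.0 and specificity 1.0, as described in the text.
    """
    totals = {"TP": 0.0, "FP": 0.0, "FN": 0.0, "TN": 0.0}
    for share, sens, spec in access_levels.values():
        diseased = population * share * prevalence
        healthy = population * share * (1.0 - prevalence)
        totals["TP"] += diseased * sens
        totals["FN"] += diseased * (1.0 - sens)
        totals["TN"] += healthy * spec
        totals["FP"] += healthy * (1.0 - spec)
    return totals


# Hypothetical illustration values only.
levels = {
    "hospital": (0.25, 0.90, 0.90),
    "clinic": (0.50, 0.70, 0.80),
    "no care": (0.25, 0.0, 1.0),
}
outcomes = tree_outcomes(100_000, 0.2, levels)
```

Treatment trajectories (for example, failure to return for results, or imperfect treatment efficacy) would add further splits below each of the four test outcomes.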
Modelling outcomes
Outcomes are often described in terms of mortality and morbidity. In the former case, the status quo simply describes how many individuals die of a specific disease, which is computed using the fatality rates for untreated and treated individuals. In the latter case, the status quo describes outcomes in terms of disability-adjusted life years (DALYs). Another outcome of interest is a measure of the potential negative effects resulting from treatment. In fact, all treatments are typically associated with some degree of negative externalities, both for the individual, such as allergic reaction, stigma or loss of productivity, and for society at large, such as development of resistant strains of pathogens, capital and labour costs of treatment or opportunity cost (that is, the health loss due to the use of resources that could have been otherwise invested in the most cost-effective interventions).
Negative externalities are often extremely difficult to quantify. It is not sufficient to say that they are proportional to the number of treatments administered, because they are not comparable to any of the health outcomes of interest (such as mortality). It is difficult to assign each test a unique outcome that takes into account both the benefits and the negative externalities of treatment. Therefore, it is also difficult to compare and rank different tests.
Consider, for example, test A, which leads to the use of 500,000 treatments and saves 100,000 children, and test B, which leads to the use of only 300,000 treatments but saves only 80,000 lives. It is not obvious a priori which of the two tests is preferable: test A saves 20,000 more lives than test B, but does so at the price of an additional 200,000 treatments. If the negative externalities associated with treatment are sufficiently large, test B might be preferable to test A, even if it saves the lives of fewer children in the short term.
In order to solve this problem, we have introduced the concept of harm of treatment, which is a measure of all the potential negative externalities associated with treatment, is expressed in the same units as the primary health outcome and is referred to with the symbol C. In the context of the example above, each time we treat a child, a fraction (C) of a life is lost due to the negative effects of treatment. Assuming that C = 0.001, treating 1,000 children will lead to the loss of 1,000 × 0.001 = 1 life. We refer to this as an "indirect" life lost to treatment, because it summarizes the indirect effect of the treatment on society. Indirect lives are a public-health concept and cannot be matched to particular individuals. For example, 1,000 indirect lives could be lost because 100,000 individuals lost a number of life years due to the negative externalities of treatment.
In the example above, we assumed the value of C to be 0.001 for simplicity; however, this is not an unreasonable number. For instance, assuming that the only source of harm is the opportunity cost, if the cost of treatment is 50 US cents (a typical value for antibiotics), then for every 1,000 treatments administered, US$500 is spent. If there is at least one intervention that can save the life of one child at a cost of US$500, then for every 1,000 treatments administered, we miss the opportunity to save one child, and the calculated harm of treatment is C = 0.001.
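Under the single assumption that opportunity cost is the only source of harm, this arithmetic can be written compactly. The symbols c_treat (cost of one treatment) and c_life (cost of saving one life through the best alternative intervention) are introduced here for illustration only.

```latex
C = \frac{c_{\text{treat}}}{c_{\text{life}}}
  = \frac{\text{US\$}0.50}{\text{US\$}500}
  = 0.001
```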
The introduction of the harm of treatment concept allows us to assign a unique measure of benefit to a test. Therefore, if a test saves L lives and utilizes T treatments, its value V is calculated as V = L - C × T, where the term C × T represents the number of indirect lives lost due to the harm of treatment. We refer to V as the number of "adjusted" lives saved, because it adjusts the number of individual lives saved by taking into account the potential negative externalities of treatment. A similar technique can be used to define the numbers of adjusted life years saved and adjusted DALYs saved.
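As an illustration, the adjusted-lives measure (V equals L minus C times T) can be applied to tests A and B from the earlier example. The following Python sketch uses the figures given in the text; the function names and the set of harm values scanned are hypothetical choices made for this illustration.

```python
def adjusted_lives_saved(lives_saved, treatments, harm):
    """Adjusted lives saved: V = L - C * T."""
    return lives_saved - harm * treatments


def break_even_harm(lives_a, treat_a, lives_b, treat_b):
    """Harm of treatment C at which two tests save equal adjusted lives."""
    return (lives_a - lives_b) / (treat_a - treat_b)


test_a = (100_000, 500_000)  # (lives saved, treatments used)
test_b = (80_000, 300_000)

# 20,000 extra lives divided by 200,000 extra treatments.
c_star = break_even_harm(*test_a, *test_b)

# Scan a few assumed harm values to see which test is preferred.
for harm in (0.001, 0.05, 0.2):
    v_a = adjusted_lives_saved(*test_a, harm)
    v_b = adjusted_lives_saved(*test_b, harm)
    preferred = "A" if v_a > v_b else "B"
    print(f"C={harm}: V_A={v_a:.0f}, V_B={v_b:.0f}, preferred={preferred}")
```

In this example test A is preferred whenever C is below the break-even value of 0.1, and test B is preferred above it.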

One significant shortcoming of this methodology is that, although some information about negative externalities is usually available, a direct computation of the harm of treatment is extremely difficult in most cases. Therefore, we have devised an indirect way to provide limits for this quantity, using a method inspired by the revealed-preference approach of neo-classical economics1.
The basis of our method is that whenever the clinical community recommends the use of a test to determine who should receive treatment, it implicitly makes a statement about the potential harm of treatment. The fact that a test is recommended is an acknowledgment that the harm of treatment is >0, otherwise mass treatment would be preferable. However, it is also an acknowledgment that the harm of treatment is not sufficiently large that treatment would never be recommended. More formally, we can say that if a test is currently used or recommended, the number of adjusted lives saved must be larger than the number saved by tests that are 100% sensitive and 0% specific (that is, mass treatment), or 0% sensitive and 100% specific (that is, no treatment). As the number of adjusted lives saved is a function of the harm of treatment, these statements can be transformed into mathematical inequalities for the unknown quantity C, and used to provide its lower and upper limits. Further details of this method have been reported elsewhere by Girosi and colleagues2.
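To make the two inequalities concrete, the following Python sketch derives limits on C for a deliberately simplified one-step model. It is not the derivation reported by Girosi and colleagues. It assumes that every positive result leads to treatment and that each treated true case saves a fixed fraction of a life, called `benefit` here; that parameter, the function name and the input values are all hypothetical.

```python
def harm_bounds(prevalence, sensitivity, specificity, benefit):
    """Return (lower, upper) limits on C implied by recommending a test.

    Per person tested, the adjusted lives saved by a test are
        V = p*se*b - C * (p*se + (1 - p)*(1 - sp)).
    Mass treatment is se = 1, sp = 0; no treatment is se = 0, sp = 1.
    Assumes the test is informative (se + sp > 1).
    """
    p, se, sp, b = prevalence, sensitivity, specificity, benefit
    treated = p * se + (1 - p) * (1 - sp)  # fraction of people treated
    # V(test) > V(no treatment) = 0 gives the upper limit.
    upper = p * se * b / treated
    # V(test) > V(mass treatment) = p*b - C gives the lower limit.
    lower = p * b * (1 - se) / (1 - treated)
    return lower, upper


# Hypothetical illustration values only.
lower, upper = harm_bounds(
    prevalence=0.1, sensitivity=0.9, specificity=0.9, benefit=0.05
)
```

Any value of C between `lower` and `upper` is consistent with preferring the test to both mass treatment and no treatment in this simplified setting.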
Although this method of computing the potential harm of treatment is appealing in many ways, one disadvantage is that it can provide only a summary view of the collective decision of the medical community about whether a test should be used. By itself, it does not give any insight into the factors that influenced the decision. However, the findings of this method can often be corroborated with opportunity-cost calculations or by consulting an expert panel, giving the results more credibility and certainty. Additionally, in all cases, sensitivity analysis can be used to study how the estimate of the harm of treatment affects the results.
Modelling the introduction of a diagnostic
The description of the status quo can be used as the basis on which to model the introduction of a new diagnostic test, which is defined by performance characteristics and other features, such as the type of sample, cost and infrastructure needed. In order to compute the health impact of a new test, we need to determine which subset of the population will have access to it, and how many of these individuals will actually receive it. These two steps identify the target population (that is, the population that benefits from the new test). The health impact is computed as the improvement in health outcomes obtained by testing the target population with the new test instead of the status quo test.
Modelling access to a new test
The size and composition of the population that will have access to a new test primarily depends on the type of infrastructure needed to administer it. Therefore, we focus on infrastructure as a key determinant of access to a new test. Some tests might require electricity or refrigeration, as well as trained staff to administer them, while others might not need any type of infrastructure and can be performed at home by anyone able to follow pictorial instructions. We therefore define three levels of infrastructure: no infrastructure, minimal infrastructure and moderate-to-advanced infrastructure. Note that because facilities with advanced infrastructure are relatively scarce in the developing world, we combine the moderate and advanced infrastructure categories in our analyses. The features of each level are detailed in the next section and summarized in Table 1.
We use the infrastructure levels to derive an access measure (that is, a single number representing the proportion of people who will have access to a given test). By considering a diagnostic and its user requirements, we can determine the infrastructure level needed to support it, which can then be converted into the proportion of people who are likely to gain access to the test (that is, the access measure). For a given infrastructure level, the corresponding access measure will vary by country and region. For example, a test that requires refrigeration might be accessible to a small proportion of the population in Africa and to a much larger proportion of the population in Asia. We explain our method for estimating access to care and its results later in this paper.
Knowing the access measure for a new test is not sufficient to identify those individuals who will benefit from it, as it will not be randomly available within a country. Rather, we assume that when a new test is introduced, it will initially be available to the providers with the most sophisticated infrastructure, followed by those with progressively worse infrastructure.
Modelling the adoption of a new test
The last step necessary to compute the health impact of a new test is to determine, from among the individuals who will have access to it, who is actually going to use it. Depending on its performance characteristics, a new test might represent a great improvement for a village clinic, but be far from optimal in an urban hospital. Therefore, we assume that the new test will be adopted only if it represents an improvement over the status quo test that it might replace. In some models, we allow for transitional phases in which both tests are used in conjunction with one another. Defining what improvement means in this context is a non-trivial task, and we address this using the harm of treatment concept previously introduced. We consider a new test to represent an improvement if it saves more adjusted lives than would be saved in the status quo.
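The access and adoption rules described above can be combined in a short sketch. The Python code below assumes that providers are listed from the most to the least sophisticated infrastructure, that the new test fills them in that order until the access measure is exhausted, and that it is adopted only where it saves more adjusted lives than the status quo test. The data layout, the `benefit` parameter (lives saved per treated true case) and all numerical values are hypothetical.

```python
def adjusted_lives(n, prevalence, sens, spec, benefit, harm):
    """Adjusted lives saved when n people are tested and positives treated."""
    true_pos = n * prevalence * sens
    false_pos = n * (1 - prevalence) * (1 - spec)
    return benefit * true_pos - harm * (true_pos + false_pos)


def health_impact(providers, new_test, access, population,
                  prevalence, benefit, harm):
    """Gain in adjusted lives from a new test reaching a share `access`.

    providers: list of (population share, sens, spec) for the status quo
    test, ordered from the most to the least sophisticated infrastructure.
    new_test: (sens, spec) of the new test.
    """
    gain, remaining = 0.0, access
    for share, sens, spec in providers:
        covered = min(share, remaining)
        if covered <= 0:
            break
        n = population * covered
        old = adjusted_lives(n, prevalence, sens, spec, benefit, harm)
        new = adjusted_lives(n, prevalence, *new_test, benefit, harm)
        # Adopt only where the new test beats the status quo test.
        gain += max(0.0, new - old)
        remaining -= covered
    return gain


# Hypothetical illustration values only.
providers = [(0.2, 0.95, 0.95), (0.3, 0.6, 0.7), (0.5, 0.0, 1.0)]
gain = health_impact(providers, new_test=(0.85, 0.9), access=0.4,
                     population=1000, prevalence=0.2,
                     benefit=0.1, harm=0.001)
```

In this example the new test is not adopted at the best-equipped providers, whose existing test is already better, but it is adopted by part of the second tier, which is where the whole gain arises.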
Infrastructure levels and user requirements
To derive the infrastructure levels employed in defining user requirements for new diagnostic tests and developing access measures, we began by identifying common health-care settings, and the general capabilities associated with them (for example, clean water and electricity) and their staff (for example, skilled nurses and trained laboratory members). We focused primarily on the traditional avenues of health-care delivery, and did not propose any novel forms of delivery. We then identified a more detailed list of capabilities or user requirements, to assist developers in determining the right technology for a diagnostic test in a particular setting.
Because country-level data describing the availability, accessibility and characteristics of the health-care settings of developing countries are limited, we developed a detailed questionnaire to gather information on health-care settings worldwide. Our questionnaire was partly based on the Service Provision Assessment (SPA) surveys performed by ORC Macro as part of the Measure Demographic and Health Surveys (DHS) project (http://www.measuredhs.com); these are among the most complete reports available on health-care settings and their capabilities, and have so far been published for five countries (Kenya, Ghana, Rwanda, Egypt and Bangladesh). The questionnaire also drew on the draft World Health Organization (WHO) Service Availability Mapping (SAM) reports (http://www.who.int/health_mapping/about/services_SAM/en/index.html). The SAM programme works with health ministries to identify and map all of the health-care resources in a country, and projects have been completed in Rwanda, Uganda and Zambia.
The questionnaire was pilot-tested by two team members who visited health-care facilities in Uganda and Malawi. It was then used in interviews with 20 members of the Global Health Diagnostics Forum, who collectively had field experience in >35 developing countries. For each country in which a forum member had experience, we asked questions about the type of health-care settings, their basic functions and infrastructure, the level of staff training and access, and the user requirements. The part of the questionnaire addressing the user requirements was designed by modifying a document developed by the Foundation for Innovative New Diagnostics as part of their efforts to develop a molecular-based diagnostic for tuberculosis (TB).
Based on the data from the SPA and SAM reports, and interviews with forum members, we categorized the health-care settings based on a minimal set of characteristics that were identified by the experts as important for informing technology developers. The characteristics identified as most important in determining the health-care-setting capacity for diagnostics were as follows: availability of reliable power and clean water; level of training of the person performing the test (for example, nurse, laboratory technician, community health worker or family member); and physical infrastructure (that is, whether a test had to be performed in a stationary facility or could be mobile). In defining and categorizing health-care settings, we considered numerous additional variables, such as the available types of laboratory equipment (for example, polymerase chain reaction instruments, microplate readers, computers and incubators) and infrastructure-type equipment (for example, refrigerators, freezers and air conditioners). However, in developing countries, most settings have limited capabilities and the data sources describing them are also limited.
Because of the variation across regions among similarly named health-care facilities (for example, hospitals and health clinics), we defined the settings according to the infrastructure levels (that is, advanced, moderate, minimal or no infrastructure; Table 1). Therefore, one type of facility (for example, health clinics) could be associated with more than one infrastructure level (in this case, health clinics in Africa and Asia were classified in different infrastructure categories). The countries we modelled fell into three regions: Africa, Asia and Latin America. The complete set of facility capabilities and user requirements have been reported elsewhere by Olmsted and colleagues3.
The infrastructure categories highlighted the potential need for different types of test depending on the setting. For example, a tissue-culture or nucleic-acid-based test would need to be performed in a health-care setting with advanced infrastructure, and could not be used at facilities in the other categories. However, a test developed for a setting with minimal infrastructure (for example, a disposable dipstick test) could also be used in settings with more advanced infrastructure.
"Hospital" was the most consistent term used across the different data sources we analysed, and generally referred to facilities with in-patient beds. In some areas, advanced health clinics could be considered hospitals. We focused on district-level hospitals in our analysis, as they care for a much larger number of patients than central or national hospitals. Although many central hospitals do provide testing services for patients from remote areas, the forum members reported that tests that require delivery of a sample to a central hospital are not reliable (partly because of loss to follow up) in many of the countries of interest, and are not acceptable for any of the acute diseases of interest (such as malaria, diarrhoeal dis-eases and acute lower respiratory infections). District hospitals generally have fewer capabilities than central hospitals. In addition, we focused on publicly funded hospitals in our descriptions, with the understanding that privately funded hospitals (and other health facilities) tend to have better infrastructure.
We use the term "health clinic" to refer to health posts, health centres, and any other facility with a physical presence and trained medical staff that provides health-care delivery but is not a hospital. As noted in Table 1, health clinics have a broad range of capabilities across the three geographical regions.
Estimating access to care
Data describing the percentage of a population that can access a health-care setting are difficult to obtain for many developing countries. To address this gap, we developed a multinomial logistic-regression model to estimate access to care across our three regions of interest. We obtained data on health-care utilization from the Measure DHS surveys conducted from the year 2000 to 2005 for 17 African countries (Benin, Burkina Faso, Cameroon, Egypt, Ethiopia, Gabon, Ghana, Kenya, Malawi, Mali, Morocco, Mozambique, Namibia, Nigeria, Rwanda, Uganda and Zambia), six Asian countries (Armenia, Bangladesh, Indonesia, Nepal, the Philippines and Vietnam) and six Latin American countries (Bolivia, Colombia, Dominican Republic, Haiti, Nicaragua and Peru). The DHS surveys were designed to provide a representative sample of the population of each country and to collect data across a spectrum of health issues, including human immunodeficiency virus (HIV) infection, sexually transmitted infections (STIs), childhood illnesses, nutrition, family planning and maternal health. The surveys we analysed included on average >5,000 households in each country.

For our access-to-care model, we drew on four survey questions about the following aspects of health-care utilization: the person who delivered prenatal care for the last pregnancy; the source of care for the last STI; the source of care for the last fever/cough (within the past 2 weeks) in a child aged <5 years; and the source of care for the last case of diarrhoea (within the past 2 weeks) in a child aged <5 years. For each of the conditions listed, the respondents were asked whether or not they received care. Those who gave a positive response were then asked where or from whom they received care. Respondents to the prenatal care question were also asked to provide the level of training of the person who delivered the care (for example, physician, nurse, traditional birth attendant or family member). We coded each of these choices to a type of health-care setting according to the training level of the person who delivered the care, and the feedback we received from the forum and surveys on the staff in each setting. For the other three conditions, the respondents were given a detailed list of settings for the care they received, which varied across the different countries surveyed, but typically included public and private hospitals, health clinics, health centres, health posts, dispensaries, community health workers, friends, traditional healers, midwives and family members. We collapsed the health-care settings across all four conditions into the following five categories, which were consistent with those defined by the infrastructure levels: hospital, health clinic (including health posts and health centres), community health worker, other (for example, friend, traditional healer or pharmacy) and no care.
The responses to the four survey questions were combined to estimate a household level of access to care (that is, the highest level of care among the four conditions for any given household). For example, a mother might report the following: she visited a nurse midwife (clinic) for prenatal care, a friend helped her with an STI, she took her child to a hospital for fever and she did not treat her child for a recent diarrhoeal illness. In this case, the household would be assigned "hospital" as its highest access level (the dependent variable of the model). The information was then further dichotomized according to rural and urban locations, giving the following 10 possible access to care levels for the dependent variable: rural clinic, rural hospital, rural community health worker, rural other, rural no care, urban clinic, urban hospital, urban community health worker, urban other and urban no care.
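The construction of the dependent variable can be sketched in a few lines of Python. The ranking of the five care categories used below is an assumption: the text confirms only that a hospital ranks above a clinic, and the remaining order follows the sequence in which the categories are listed. The function and variable names are hypothetical.

```python
# Assumed ordering, from the lowest to the highest level of care.
CARE_ORDER = [
    "no care",
    "other",
    "community health worker",
    "clinic",
    "hospital",
]


def household_access(responses, location):
    """Return the household's highest access level, prefixed by location.

    responses: care categories reported for the four survey questions.
    location: "rural" or "urban".
    """
    highest = max(responses, key=CARE_ORDER.index)
    return f"{location} {highest}"


# The mother in the example: clinic for prenatal care, a friend ("other")
# for an STI, a hospital for fever, and no care for diarrhoea.
example = household_access(["clinic", "other", "hospital", "no care"], "rural")
```

Here `example` evaluates to "rural hospital", one of the 10 possible values of the dependent variable.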
This focus on health-care utilization is often called realized access, as opposed to potential access, which includes more of the characteristics of the health-delivery system (such as the availability and organization of health services in a community)4. Although potential access might be useful for estimating the expected benefit of a new diagnostic in an idealized world, realized access is likely to provide a more realistic estimate of the immediate benefit. Therefore, our estimates for potential benefit are likely to be conservative.
Using the 29 countries for which we obtained utilization data, the regression model was fitted to obtain the predicted values for access for all 114 countries in the regions of interest. As noted above, the dependent variable in the model was the highest level of household access to care. The independent variables, which were selected from the World Bank Group World Development Indicators (http://devdata.worldbank.org/data-query) and the WHO TB statistics (provided by J. Cunningham of the WHO and M. Perkins of the Foundation for Innovative New Diagnostics), were as follows: percentage of the population living in urban areas, rural population density, gross domestic product (GDP) per capita, health expenditure per capita, number of physicians per 1,000 people, percentage of adult pulmonary TB suspects who have a smear and percentage of adult pulmonary TB suspects who have an X-ray.
To account for regional differences in access, we also included a three-level region variable in the model, which allowed us to obtain different parameter estimates for each region. Using these estimates, we calculated the predicted percentage of the population at each access-to-care level for all countries that had non-missing values for all of the independent variables.
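For reference, a multinomial logistic-regression model in its standard textbook form assigns a probability to each of the K = 10 access levels as shown below. The notation is introduced here for illustration: x_c is the vector of country-level independent variables (including an intercept) and β_{k,r} is the parameter vector for access level k in region r. How the region variable actually enters the fitted model is not specified in the text, so the region-specific parameters are an assumption.

```latex
P(Y = k \mid \mathbf{x}_c, r)
  = \frac{\exp\!\left(\mathbf{x}_c^{\top}\boldsymbol{\beta}_{k,r}\right)}
         {\sum_{j=1}^{K} \exp\!\left(\mathbf{x}_c^{\top}\boldsymbol{\beta}_{j,r}\right)},
  \qquad k = 1, \dots, K
```

For identifiability, the parameters of one reference level are conventionally fixed at zero, so that the remaining parameters describe log-odds relative to that level.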
Table 2 provides the population weighted averages of the access to care by infrastructure and region. To illustrate the use of these data, we consider the case of a new test with a given sensitivity and specificity, which, in order to be administered, requires electricity, clean water and well-trained technicians. From Table 1, we can infer that this test falls in the moderate/advanced infrastructure level. From Table 2, we can infer that if this test were introduced in Africa, only 28% of the population would have access to it. We can then use this value to determine which subgroup of the population in the status quo would have access to the new test if it were introduced. For this subgroup, we can then determine whether the new test is better than the status quo, by determining which saves more adjusted lives. If the new test is better, its health impact can be measured according to the number of total lives saved and/or other relevant outcome measures.
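A population-weighted average of the kind reported in Table 2 can be computed as in the following Python sketch. The function name and the country figures are hypothetical and are not taken from Table 2.

```python
def weighted_access(countries):
    """Population-weighted mean access.

    countries: list of (population, predicted access share) pairs.
    """
    total = sum(pop for pop, _ in countries)
    return sum(pop * acc for pop, acc in countries) / total


# Hypothetical illustration values only: two countries in one region.
regional_access = weighted_access([(10_000_000, 0.20), (30_000_000, 0.40)])
```

With these illustrative inputs the regional value is 0.35, closer to the figure for the more populous country than the unweighted mean of 0.30 would be.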
Discussion
We have outlined a method for estimating the potential health impact of new diagnostic tests in developing countries. This process included developing a novel modelling framework, determining and describing health-care settings, and calculating access to care in these countries. We have categorized health-care settings across the developing world into a small number of infrastructure levels, so that our results keep to a minimum the number of different types of test that technology developers might need to produce.
Our results indicate that a large portion of the population in each of the regions modelled has access to some form of health-care setting. However, in some cases, the capabilities of these settings are limited in terms of both infrastructure and level of staff training. The articles in this supplement focus on improving the diagnostic tests available in each of the health-care settings and provide recommendations on improvements to the status quo tests. Although it is outside the scope of this paper, another method for improving health outcomes that could be approached in parallel to improving diagnostic tests would be enhancing the infrastructure and staffing available at these health-care settings. This approach would, in turn, allow the facilities to adopt better tests that might be available today or in the future. For instance, improving infrastructure and staffing could allow nucleic-acid-based tests for STIs to be adopted in more health-care settings.
Our definitions of infrastructure levels are, by necessity, simplified. In order to cover all developing countries, we have made assumptions and grouped countries into three regions. Owing to the limits of the published data concerning health-care settings in these countries, and the wide variety of such settings, our descriptions are basic, although they cover the important characteristics needed to determine the kinds of tests that should be developed. In addition, central hospitals do not weigh heavily in our modelling. Although these facilities generally have the most advanced capabilities in a developing country, the ability of patients to access them is severely limited. Moreover, although central testing for conditions such as TB or HIV can be done with a delay time for the results, many of the other diseases we model are acute so a delay in diagnosis is not acceptable.
Using the modelling approach described above and adding a few more layers of complexity, it is possible to generate a rich set of scenarios that describe the diagnostic landscape of a country. The limits of this approach are largely dictated by limits on the type of data available. However, one additional limitation is that the approach is static: it does not explicitly take into account the transmission patterns that are relevant for diseases such as TB and gonorrhoea. For example, we can model the number of gonorrhoea cases averted by a specific test in the status quo, but the impact on the prevalence of this and other related diseases (such as HIV) remains unclear. Transmission effects can only be brought into this type of model a posteriori, by the judicious use of multipliers that convert a static outcome (such as the number of cases averted in 1 year) into a flow of downstream outcomes (such as the number of additional cases averted in the following years).
Another limitation of our approach is that the potential harm of treatment (or negative externalities caused by treatment) is only known within relatively large limits, and we are usually unable to tell what types of harm are taken into account. While it is true that the general conclusions of this type of analysis are often robust with respect to this parameter5, 6, the uncertainty over the harm of treatment contributes to the considerable uncertainty over the number of adjusted lives saved by a new test. We note that this uncertainty will be shared by any study aiming to evaluate the benefit of a new diagnostic in the developing world. In fact, an important lesson learned in the development of our modelling approach is that analyses cannot be performed unless an estimate of the harm of treatment is available. This suggests that additional research in this area is needed to further our understanding of the benefits of new diagnostic tools.


