Visceral leishmaniasis (VL) is a severe infectious disease caused by a protozoan parasite: Leishmania donovani in East Africa and the Indian subcontinent and Leishmania infantum in Latin America and the Mediterranean basin. Not all leishmanial infections lead to overt clinical disease, but in those infected persons who do develop the disease, multiplication of the parasite in the reticulo-endothelial system causes prolonged fever, anaemia, hepatosplenomegaly and weight loss. VL is fatal if it is not adequately treated. The drugs currently used to treat VL can have severe side effects and the clinical presentation of VL is not sufficiently specific to guide treatment. Highly accurate (both sensitive and specific), cheap and simple rapid diagnostic tests (RDTs) are therefore crucial for case-management of VL. Early case detection followed by adequate treatment is also central to control of VL because, as yet, no vaccine is available and the long-term impact of vector control is unclear.
Although the need for accurate VL diagnostics is obvious, innovation in this field has been slow. Since the 1980s, the main objective of VL diagnostics development has been to replace the direct demonstration of parasites in tissue smears, a technique that is invasive and requires considerable expertise, by a 'field test' that is more appropriate for use in a VL-endemic context. Several serological tests have been developed, but none are specific for VL disease as such, although they have proved useful in combination with a clinical case definition.
New diagnostic tools are needed for more than just the confirmation of VL disease. No alternatives to parasitological methods are yet available to establish test of cure in treated VL patients. Clinicians do not have the tools to distinguish re-infection from relapse in cases of recurrence, and control programmes do not have validated assays for the surveillance of drug resistance in parasites. Furthermore, in the context of the VL elimination initiative, it would be desirable to have better markers of leishmanial infection at the population level.
Any evaluation of a new diagnostic device should carefully identify its intended purpose. Too often developers and researchers confuse a device for the detection of leishmanial infection with a device for the confirmation of VL disease, and this is particularly the case for nucleic-acid-based assays. PCR is usually highly sensitive for detection of leishmanial infection, but this does not mean PCR will be useful for the confirmation of acute VL disease in patients in endemic areas, as many carriers of the infection in these areas will be PCR-positive without developing VL disease. This article will focus specifically on the evaluation of RDTs for confirmation of VL disease.
I. Current diagnostic tests for VL disease
The World Health Organization (WHO) established the clinical case definition of VL as persistent fever (>2 weeks) and splenomegaly in a person residing in a VL-endemic area1. The combination of both signs is found in the majority of VL cases, though splenomegaly is not always present2. Some VL control programmes therefore add other clinical signs or symptoms to this definition, such as wasting, anaemia and lymphadenopathy. Unfortunately, these clinical definitions lack specificity as such signs are common in other diseases that can be prevalent in VL-endemic areas, such as malaria, hyper-reactive malarial splenomegaly, enteric fever, disseminated tuberculosis, brucellosis and haematological malignancies. Given the high cost and toxicity of the current therapeutic options for VL, starting a course of anti-leishmanial treatment solely on the basis of clinical suspicion is not acceptable. Confirmatory diagnostic tests must therefore be used, particularly in first-line health services, where the prior probability of disease is lower than in referral centres. Below, we discuss the existing options for confirmation of diagnosis, with the emphasis on those techniques that are suitable for field use.
1. Parasite-detection methods
The identification of parasite amastigotes in tissue smears or culture has been the recommended method of VL diagnosis for many years but has variable sensitivity, depending on the type of aspirate that is used. The most sensitive technique, splenic aspiration, can only be used under highly controlled conditions (see below), and is not suitable for decentralized use in first-line health services.
1.1. Direct microscopic examination. The amastigote forms of the parasite (called 'LD bodies') can be seen intracellularly in monocytes or macrophages on microscopic examination of Giemsa-stained blood or aspirates from lymph nodes, bone marrow or spleen. Amastigotes are round or oval bodies, 2–4 μm in diameter, with characteristic organelles (nucleus and kinetoplast). The identification of amastigotes requires expertise and training and the accuracy is dependent on the microscopist.
The sensitivity of direct microscopic examination varies, but it is lowest in peripheral blood smears, as parasitaemia in immunocompetent individuals with VL is low. The reported sensitivity of direct microscopic examination of lymph node aspirates ranges from 52% to 58%2, 3, and for bone marrow aspirates from 52% to 85%2, 4, 5. Enlarged lymph nodes are typically observed in VL patients in Sudan, but are rare in patients from other countries. Spleen aspiration has been shown to be the most sensitive aspirate assay (93.1%–98.7%)2, 3, 6. Parasite density in splenic or lymph node aspirate smears can be graded on a logarithmic scale (from 0 to 6+), allowing the response to treatment to be evaluated, and slow responders can be distinguished from non-responders by using sequential smears7. A safe procedure for splenic aspiration has been developed in Kenya8, but it remains an invasive and complex technique1, 5. After the procedure the patient must be observed in the recumbent position for a minimum of 8 hours in a facility where blood transfusion is available. Splenic aspiration is not possible in non-cooperative children, is difficult in those without a palpable spleen and is contra-indicated in persons with active bleeding, thrombocytopenia, severe anaemia or jaundice, those in a moribund state, non-cooperative individuals and pregnant women. There is a small risk of fatal haemorrhage7 and several authors have reported iatrogenic morbidity and mortality9, 10. One death was observed in a series of 671 splenic aspirates in Kenya8, and 3 in a series of 3,000 in India11. Two episodes of fatal bleeding occurred following 9,612 splenic aspirates (0.02%) in a specialised treatment centre in India12. In conclusion, splenic aspirate is highly sensitive and specific, but can only be carried out under strictly controlled conditions, and is not suitable for use in first-line health centres.
Fluid from tissue aspirates can be inoculated in Novy–MacNeal–Nicolle medium for culture, which increases the sensitivity, but parasite culture is costly and time-consuming, and requires expertise and expensive equipment. Its use is therefore restricted to referral hospitals or research centres.
1.2. Molecular diagnosis. Molecular approaches to diagnosis have recently been reviewed by Reithinger and Dujardin13. These techniques remain complex and expensive, and in most VL-endemic countries they are therefore restricted to a few teaching hospitals and research centres.
2. Antigen-detection methods
Recently, Sarkari et al. described a urinary leishmanial antigen, a low-molecular-weight, heat-stable carbohydrate that was detected in the urine of VL patients14. An agglutination test to detect this antigen has been evaluated in laboratory trials, using urine collected from well-defined cases and controls from endemic and non-endemic regions. This test showed 100% specificity and sensitivity between 64% and 100%15. However, the sensitivity of this test was disappointingly low in clinically suspect patients in a VL-endemic area in Nepal16. Further work is ongoing, as this technique holds promise as a test of cure, for which none of the current serological tests is appropriate.
3. Serological methods
Several antibody-detection tests have been developed for field diagnosis of VL, but, as mentioned before, none is sufficiently specific for acute VL disease to be used as a stand-alone test. In a VL-endemic region, asymptomatically infected persons can also be positive in these antibody-detection tests, but they do not require treatment. This used to be the reason why many control programmes restricted treatment to parasitologically confirmed patients. However, since the late 1990s, ample evidence has been generated that a combination of the WHO clinical case definition for VL and a positive antibody test is an adequate and safe basis for the decision to treat17. Nonetheless, the limitations of these antibody-detection tests in clinical practice should be acknowledged. Assessment of cure is necessary at the immediate end of treatment (which usually lasts 21–28 days) and also at 3 and 6 months post-treatment, at a time when antibody levels have not yet waned. This also limits the usefulness of the current antibody-detection tests in persons with a previous history of VL who present with recurrence of fever and splenomegaly, as these tests cannot discriminate between a case of VL relapse and other pathologies.
Conventional methods such as gel-diffusion immunoelectrophoresis, a complement-fixation test, indirect haemagglutination test and counter-current immunoelectrophoresis have limited diagnostic accuracy and/or feasibility for field use18, 19, 20. Indirect fluorescence antibody (IFA) tests showed acceptable estimates for sensitivity (87–100%) and specificity (77–100%)21, 22 but the need for a fluorescence microscope restricts their use to reference laboratories. So far, only two antibody-detection tests have been extensively evaluated for field use: the direct agglutination test (DAT) and the rK39 immunochromatographic test (ICT).
3.1 DAT. In 1985, El Harith et al. developed a DAT for VL with high sensitivity and specificity23, and these values have been confirmed by other laboratories21, 24, 25, 26, 27. The test is semi-quantitative and uses microtitre plates with V-shaped wells in which increasing dilutions of serum or blood eluted from filter paper are mixed with stained killed L. donovani promastigotes. As the ongoing VL epidemic in Sudan28 created a pressing demand, the DAT was rapidly taken to the field. Contradictory reports on its performance were soon published29, 30. A multi-centre study reported low reproducibility owing to problems reading the results and the heat- and shock sensitivity of the liquid antigen31. A freeze-dried version of the test was developed to circumvent the latter problem, and this version showed similar diagnostic performance to the liquid version32, 33, 34.
Since 1986, the DAT has been extensively validated in most VL-endemic areas. Thirty studies were included in a recent meta-analysis, showing sensitivity and specificity estimates of 94.8% (95% confidence intervals (CI), 92.7–96.4) and 97.1% (95% CI, 93.9–98.7), respectively17. The performance of the DAT was not dependent on the region nor on the Leishmania species. DAT antigen production was initiated in some endemic countries but the production could not always be sustained, and quality control remained an issue. The cost of the antigen is in the range of 1–2 per test. Although highly sensitive and specific, the DAT requires substantial manipulation, and can only be read after a minimum of 8 hours incubation.
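How much a positive result means depends on the prior probability of disease in the setting where the test is used. The following minimal sketch applies Bayes' rule to the pooled DAT estimates quoted above. The three prior probabilities are hypothetical values chosen only for illustration, and the function name is ours rather than part of any published protocol.

```python
def predictive_values(sensitivity, specificity, prior):
    """Return (PPV, NPV) for a test used at a given prior probability of disease."""
    tp = sensitivity * prior                # true positives (as a fraction of all tested)
    fp = (1 - specificity) * (1 - prior)    # false positives
    fn = (1 - sensitivity) * prior          # false negatives
    tn = specificity * (1 - prior)          # true negatives
    return tp / (tp + fp), tn / (tn + fn)


# Pooled DAT estimates from the meta-analysis cited in the text (ref. 17).
DAT_SENS, DAT_SPEC = 0.948, 0.971

# Hypothetical prior probabilities, e.g. a first-line service versus a referral centre.
for prior in (0.05, 0.20, 0.60):
    ppv, npv = predictive_values(DAT_SENS, DAT_SPEC, prior)
    print(f"prior={prior:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumptions the positive predictive value falls markedly as the prior probability drops, which is why a clinical case definition is applied before serological testing.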
3.2. rK39 ICT. A test based on a 39-amino-acid-repeat recombinant leishmanial antigen from Leishmania chagasi (rK39) has been introduced into an enzyme-linked immunosorbent assay (ELISA)35, 36 and, later, an immunochromatographic strip test37. The latter is easy to use in the field and results are available after 15 minutes. The initial study showed 100% sensitivity and 98% specificity37, but this particular format (Arista Biologicals, Allentown, PA, USA) is no longer commercially available. An evaluation in Sudan of an ICT from the same producer showed only 67% sensitivity38. An ICT produced by a different company (InBios, Seattle, WA, USA) proved to be a good diagnostic guide in suspected VL cases in India39; in Bangladesh, Sarker et al. found excellent sensitivity and specificity with this ICT40. In Nepal, an early prototype showed a specificity of only 71% in controls with clinical signs of VL41; however, better specificity was obtained with later generations of the InBios ICT and with an ICT produced by DiaMed AG, Switzerland22, 42.
II. The need for evaluation of VL RDTs
An expert meeting on VL diagnostics convened by TDR in Nairobi, Kenya in January 2006 identified multiple challenges in the development of VL diagnostics.
The clinical evaluation of new tests is fraught with difficulties. The lack of a practicable gold standard has made diagnostic accuracy studies for VL extremely complex43. A gold standard in VL diagnosis does exist, namely parasite culture from splenic aspirate, but obtaining splenic aspirates is invasive, and culture techniques are often not available in VL-endemic areas. The clinical presentation of the leishmaniasis syndromes varies considerably in different regions, and the current RDTs behave differently in the Indian subcontinent compared with East Africa17. It is therefore essential to evaluate any RDT in the region in which it will be used, and greater uniformity in such diagnostic evaluations is important. The fact that substandard and/or counterfeit products have been circulating in endemic regions only adds to the need for rigorous evaluation and quality assurance. Last but not least, the variable performance of VL diagnostics in VL–HIV co-infected patients poses new challenges to test evaluation44.
In addition to confirmatory tests for diagnosis, a marker indicating the prognosis in treated patients, a test of cure after therapy, a marker of asymptomatic infection and assays that allow easier surveillance of parasite drug resistance are also needed. The ideal performance and operational characteristics for the different VL diagnostic tests that are required are summarized in Table 1.
The purpose of the test being evaluated should guide the design of the trial as the operational and performance characteristics of a test can vary depending on the purpose of the test. It is of utmost importance in the evaluation of diagnostic devices for leishmaniasis to distinguish the detection of infection from the diagnosis of VL disease.
III. General issues in study design
Past evaluations of RDTs have too often concentrated only on sensitivity and specificity. A proper evaluation of an RDT should address its performance (sensitivity, specificity and reproducibility) as well as its operational characteristics (user-friendliness and stability) and cost (see Evaluation of diagnostic tests for infectious diseases: general principles in this supplement). Also, it should be acknowledged that the development of a diagnostic test involves several phases, from early proof-of-principle and laboratory-based studies on archived samples to, eventually, clinical evaluation on prospectively recruited patients. Before a VL test can be recommended for clinical use, its clinical benefits should have been demonstrated in a prospective study that evaluated the test on a representative sample of the target population. For VL RDTs, these are the patients on whom the RDT will be used in the future; that is, persons with signs and symptoms that make them clinically suspect for VL. Zhou et al. distinguish three phases in the evaluation of diagnostics; this is useful as the study design will depend on the phase of evaluation45 (Box 1).
Below, we discuss the essential elements in the design of a protocol to determine the diagnostic accuracy of RDTs for VL.
1. Rationale for the study
The introduction to the evaluation protocol should clearly state the rationale for the evaluation and the objectives of the study, describing what is already known about the issue, and how the new diagnostic test might contribute. The specific indication for the new diagnostic test should be described. Is this a test to be used in sick patients to confirm their diagnosis, or is it a marker of infection to be used for epidemiological work at the population level? Is the test a marker of acute disease? Can it be used as a prognostic marker or test of cure? In which phase of development is the test?
2. Study site
The local VL epidemiology (causative species, endemicity and most affected age groups), the climatic conditions and the workplace conditions at the study site should be described. Will the study be carried out in a research laboratory (proof-of-principle and case-control designs) or in the clinical setting (in a first-line health centre or in a specialized VL treatment centre)? Describe the type of infrastructure, and the type of staff conducting the test.
3. Study population
The choice of the study population will depend on the phase of development of the test (see above). If the study is carried out using archived samples, provide as much detail as possible on the origins of the samples, as well as the methods that were used to reach the diagnosis. Describe how these samples were stored and for how long. If the study requires prospective recruitment of patients in a clinical setting, carefully describe the inclusion and exclusion criteria. Standardized clinical case definitions should be used for enrolment, preferably the WHO case definition (see above). The minimum age of the participants should be included. Concomitant illness might confound the study results and some should be considered as exclusion criteria. The design of the evaluation should consider recent treatment of cases; although recent treatment will have little impact on the results of serological RDTs, as antibody-based tests usually remain positive for several months after treatment, it might affect the evaluation of antigen-detection tests.
4. Co-morbidities
The performance of VL diagnostic tests is highly influenced by HIV co-infection3. HIV co-infected patients typically have lower antibody and higher parasitaemia levels. Future studies of VL diagnostics should specify the HIV status of the study population and, if possible, assess the HIV status of the study subjects to allow for a separate estimate of test performance in HIV-positive and -negative patients. Due consideration should be given to all of the ethical aspects of HIV testing.
5. Recruitment process
Persons who give informed consent should undergo an interview and a physical examination according to clinical best-practice guidelines, as well as the work-up for case ascertainment if they are clinically suspect for VL. Information should be collected about sex, age, duration of illness, previous history of VL and onset of symptoms, as suggested in the sample clinic data collection form in Appendix 1.
6. Tests under evaluation
Record all details of the RDTs that will be evaluated, including: manufacturer (company name, site of manufacture), batch number, date of manufacture, packaging type and inclusion of desiccant, lancets or capillary tubes. Note whether the product is under evaluation for regulatory purposes or is already commercially available.
7. Reference standard
Several published VL diagnostic accuracy studies suffer from reference test bias. Researchers comparing a new test to a reference standard with high specificity but low sensitivity, such as bone marrow or lymph smears, will underestimate the true specificity of the new test. This kind of sub-optimal reference standard misses many true VL cases that test positive with the new test1. Moreover, lymph-node-positive VL patients probably comprise only a sub-set of all VL patients in a given region, and this might again bias the sensitivity estimates. All of the tissue aspirate assays have another inherent problem: they cannot be applied indiscriminately to healthy controls, which complicates the ascertainment of control status in Phase II studies.
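The size of this bias can be illustrated with simple arithmetic. The sketch below computes the apparent sensitivity and specificity of a new test when it is judged against an imperfect reference standard. All input values are hypothetical, and the calculation assumes that the errors of the two tests are independent given the true disease status, which may not hold in practice.

```python
def apparent_accuracy(prevalence, new_sens, new_spec, ref_sens, ref_spec):
    """Apparent (sensitivity, specificity) of a new test against an imperfect reference.

    Assumes the two tests err independently given true disease status.
    """
    d, nd = prevalence, 1 - prevalence
    # Probability that the reference is positive / negative.
    ref_pos = d * ref_sens + nd * (1 - ref_spec)
    ref_neg = d * (1 - ref_sens) + nd * ref_spec
    # Probability that both tests agree on positive / on negative.
    both_pos = d * ref_sens * new_sens + nd * (1 - ref_spec) * (1 - new_spec)
    both_neg = d * (1 - ref_sens) * (1 - new_sens) + nd * ref_spec * new_spec
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical scenario: 40% of clinical suspects truly have VL; the reference
# (e.g. a bone marrow smear) is 100% specific but only 70% sensitive; the new
# test is truly 95% sensitive and 97% specific.
app_sens, app_spec = apparent_accuracy(0.40, 0.95, 0.97, 0.70, 1.00)
print(f"apparent sensitivity = {app_sens:.3f}")
print(f"apparent specificity = {app_spec:.3f} (true value 0.970)")
```

In this scenario the true cases missed by the reference are counted as false positives of the new test, so its apparent specificity falls well below the true value even though its sensitivity estimate is unaffected.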
The demonstration of parasite amastigotes in smears or culture from splenic aspirates should be used as the reference standard in VL diagnostic accuracy studies, if the procedure can be carried out safely. Given flawless technical execution, it will be both specific (100%) and sensitive (>95%). Some centres use the sequence of lymph node and/or bone marrow aspirates, followed by splenic aspiration if the other aspirates are negative. This has the advantage of limiting the number of splenic aspirations while maintaining high sensitivity and specificity.
In cases where splenic aspiration cannot be used, researchers can opt to use either a composite reference standard (CRS) or latent class analysis (LCA)46. Both involve the use of several diagnostic tests as comparators for the test under evaluation, the former being an empirical definition of disease status and the latter a mathematical approach based on the probability of disease given the observed test pattern. Notwithstanding their inadequate specificity for acute disease, serological tests for VL can be included in the panels of tests used in CRS or LCA, but cannot be considered as a reference standard for stand-alone use. In the past, response to specific VL treatment was used to confirm that a diagnosis was correct, as antimonials have a very narrow spectrum. With other drugs, for example, amphotericin B, this criterion becomes less specific. Table 2 gives an overview of the acceptable reference standards in the evaluation of VL diagnostic tests.
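A composite reference standard amounts to a fixed, pre-specified rule that combines several comparator results into one disease status. The sketch below shows what such a rule might look like in code. The rule itself, the field names and the choice of components are purely hypothetical and are not taken from Table 2 or from any published study. The comparator tests should not include the RDT under evaluation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workup:
    """Comparator results for one clinical suspect (hypothetical field names)."""
    parasitology_positive: Optional[bool]  # bone marrow or lymph node smear; None if not done
    serology_a_positive: bool              # first comparator antibody test
    serology_b_positive: bool              # second comparator antibody test
    treatment_response: Optional[bool]     # None if the patient was not treated


def crs_positive(w: Workup) -> bool:
    """Hypothetical composite rule, fixed before the study starts.

    A suspect counts as a VL case if parasites were demonstrated, or if both
    comparator antibody tests are positive AND the patient responded to
    specific VL treatment.
    """
    if w.parasitology_positive is True:
        return True
    both_serology = w.serology_a_positive and w.serology_b_positive
    return both_serology and w.treatment_response is True


example = Workup(parasitology_positive=False, serology_a_positive=True,
                 serology_b_positive=True, treatment_response=True)
print(crs_positive(example))
```

Whatever rule is chosen, it should be written into the protocol before recruitment begins so that disease status is not defined with knowledge of the RDT results.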
Table 2 | Recommended reference test for the evaluation of an RDT for detection of active VL disease
8. Organization of testing
Consideration should be given in the protocol to who will perform the tests and, in the case of an RDT, whether the results will be read by one or multiple readers. For prospective evaluations in populations for whom the test is intended, it is important that the tests be performed by clinic staff or outreach workers who will provide the diagnostic testing in that population in the future. The protocol should describe the qualifications the staff require and the training they need.
IV. Conducting the evaluation
1. Obtaining informed consent
See the discussion of informed consent in the generic guidelines Evaluation of diagnostic tests for infectious diseases: general principles in this supplement and the sample informed consent forms in Appendices 2 and 3.
2. Specimen sampling and preparation
Venous or capillary blood or serum can be used for most RDTs. The manufacturer's instructions should be carefully respected. However, if there is evidence which allows deviation from the manufacturer's instructions, such deviations can be followed. For example, the package insert from InBios specifies the use of serum for VL detection using the rK39 ICT; however, there is now sufficient evidence that for active VL, whole blood obtained through a finger prick produces similar results. This is extremely important from a programmatic point of view, as the necessity to centrifuge the blood to obtain serum is likely to pose great problems in field conditions. Most RDTs specify that the results should be read within 15–20 minutes after the application of the specimen. This might not always be possible in a busy clinic. It might be useful to include in the evaluation protocol a reading after 1 hour to determine whether the test results remain the same. This would certainly increase the usefulness of the RDT.
3. Transport and storage of specimens for RDTs
A major effect of specimen sampling has been observed for the latex urine antigen-detection test: the test performed very poorly on stored urine. The manufacturer's storage instructions should therefore be followed carefully, tests should be kept out of direct sunlight and the cold-chain requirements should be respected. Keep records of the date of manufacture, expiry date, duration of storage on site, temperature and humidity of storage, the state and type of packaging, and the time to complete use from opening.
4. Use of test kits
The general guidelines for the use of test kits outlined in Box 2 should be adopted and implemented. All tests should be performed according to the manufacturer's instructions. Any deviation from the recommended procedure should be recorded.
As the interpretation of RDT results is subjective, it is recommended that at least two individuals read the test results independently. The results of RDTs performed in the clinic can also be evaluated against RDTs performed by trained laboratory technicians to assess the feasibility of using these tests in field settings, executed by auxiliary staff. In this type of agreement study, blinding is necessary to ensure the independence of test results in the evaluation. Laboratory staff should be blinded to the RDT results at the clinic and vice versa. To avoid any potential bias in the interpretation of the results, laboratory technicians and readers of RDTs should be blinded to the clinical status of the patient, his or her reference standard results and the results of other RDTs.
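Agreement between two independent readers is commonly summarized with Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The text does not prescribe a particular statistic, so the following is only a minimal sketch of one reasonable choice, using made-up readings.

```python
def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two equally long lists of binary readings (True = positive)."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers need the same non-zero number of readings")
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    pa = sum(reader_a) / n   # proportion read as positive by reader A
    pb = sum(reader_b) / n   # proportion read as positive by reader B
    expected = pa * pb + (1 - pa) * (1 - pb)   # chance agreement
    if expected == 1:
        # Kappa is undefined when both readers give one identical category
        # throughout; returning 1.0 here is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up readings of ten strips by two blinded readers.
reader_1 = [True, True, True, False, False, False, True, False, True, False]
reader_2 = [True, True, False, False, False, False, True, False, True, True]
print(f"kappa = {cohen_kappa(reader_1, reader_2):.2f}")
```

The same function can be applied to clinic readings versus laboratory readings of the same RDT in the agreement study described above.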
5. Training and choice of technicians, test preparation and interpretation
Training and experience of technicians can affect the test performance because reading of an RDT result is not always unequivocal. Sometimes the bands are faint, but these do indicate a positive test and it is a common mistake to read these as negative or doubtful. Similarly, if a dent is produced on the strips owing to manufacturing or handling error, a coloured line can appear but this is generally located in the wrong place on the strip or is very thin. In such circumstances, it is prudent to repeat the test. A company-prepared buffer is supplied with the strips, and it is extremely important to use that buffer only. If for some reason the buffer runs out, it is best to ask for replacement buffer.
6. Laboratory facilities and testing sites
The reference laboratory that will conduct the evaluation should establish clear Standard Operating Procedures (SOPs) for both the reference standard and the RDT being evaluated.
7. Biosafety issues
The general biosafety guidelines for clinic and laboratory staff outlined in Box 3 should be adopted and implemented.
V. Quality assurance
Teams that engage in the evaluation of RDTs for VL should subscribe to existing processes for laboratory quality assurance.
VI. Recording of results and archiving of specimens
The results of the two readings of the RDT under evaluation should be recorded in separate notebooks to ensure independent interpretation of the results. Both the results of the RDT and the results of the reference standard should then be entered into a spreadsheet, together with the information on the sex and age of the subject and a limited set of variables (including treatment status and duration of symptoms). Double entry of data is recommended to minimize errors. The collected information as well as the frozen serum samples should be kept until the study has ended and the results have been published.
VII. Analysis of results
The sensitivity, specificity and 95% CIs should be calculated for each RDT compared to the results obtained by the reference standard. In Phase III studies with prospective recruitment of patients, positive and negative predictive values of the new test should be given, but not in case-control studies as the frequency of disease in such studies is artificially determined, and does not reflect the real prevalence or allow a meaningful interpretation of predictive values.
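The calculations described above can be sketched from a 2×2 table of RDT results against the reference standard. The text does not specify a method for the confidence intervals; the Wilson score interval used here is our assumption, and the counts in the example are invented.

```python
from math import sqrt


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def accuracy(tp, fp, fn, tn):
    """Point estimates and Wilson intervals from a 2x2 table.

    tp/fn: reference-positive subjects with a positive/negative RDT.
    fp/tn: reference-negative subjects with a positive/negative RDT.
    PPV and NPV are only meaningful with prospective recruitment, not in
    case-control designs.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_ci(tp, tp + fp)),
        "npv": (tn / (tn + fn), wilson_ci(tn, tn + fn)),
    }


# Invented counts for illustration only.
for name, (estimate, (low, high)) in accuracy(tp=90, fp=5, fn=10, tn=95).items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The Wilson interval behaves better than the simple normal approximation when estimates are close to 100%, as is often the case for VL RDTs.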
Box 4 contains a checklist with all of the points that should be considered in the design and conduct of evaluations of RDTs for VL.
