Last year marked the tenth anniversary of forensic DNA typing [1]. There were scientific advances and changes of direction in the continuing debate over assessment of DNA evidence, the resolution of which has been claimed prematurely [2]. This year promises new developments which can be understood only in the light of recent events.

Scientific Advances

In most countries, single-locus profiles have replaced multilocus fingerprints for criminal trials, although not entirely for paternity cases. There is increasing use of the polymerase chain reaction (PCR) because of its feasibility with minute amounts of DNA, resistance to degradation, and short assay time. Defence lawyers have attacked its sensitivity to contamination, and this will undoubtedly promote fastidious isolation of evidentiary samples from a suspect’s DNA. Technology that permits the safe culture of anthrax is adequate to prevent contamination of DNA, but chain-of-custody evidence must support the claim that isolation has been practised.

These and other aspects of quality control over the presentation of evidence are still evolving, and exaggeration of hazards may be constructive in the long run. Precision has been increased by semi-automated electrophoresis with a standard in every lane, making results on different gels more consistent and, therefore, databases more useful. The power of these techniques has been strikingly demonstrated with the identification of the Romanov remains and exclusion of a pretender [3].

Alleles precisely defined by PCR typing of short tandem repeats can be meaningfully tested for Hardy-Weinberg equilibrium. In contrast, highly variable loci like minisatellites have so many alleles of different but similar sizes that each allelic bin is heterogeneous: increasing the arbitrary width of a bin makes expected homozygosity Σq² exceed the frequency of single-band phenotypes, and vice versa, so that apparent departure from Hardy-Weinberg proportions becomes a statistical artefact. The only reasonable approach with highly variable loci (sometimes misunderstood [2]) is to use bin width defined on the ‘radius of coalescence’ that satisfies the frequency of single-band phenotypes [4].
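The dependence of expected homozygosity on bin width can be illustrated with a minimal sketch. The locus, its allele sizes, and the fixed-width binning rule below are hypothetical choices made for illustration only; this is not the ‘radius of coalescence’ method of [4].

```python
from collections import defaultdict


def bin_frequencies(alleles, width):
    """Pool allele frequencies into fixed-width size bins.

    alleles: dict mapping fragment size (bp) to population frequency.
    width: bin width in bp (an arbitrary, hypothetical choice).
    """
    bins = defaultdict(float)
    for size, freq in alleles.items():
        bins[int(size // width)] += freq
    return dict(bins)


def expected_homozygosity(freqs):
    """Hardy-Weinberg expected homozygosity: the sum of squared frequencies."""
    return sum(q * q for q in freqs)


# Hypothetical locus: 20 equally frequent alleles spaced 50 bp apart.
alleles = {1000 + 50 * i: 0.05 for i in range(20)}

# Narrow bins keep every allele separate; wide bins pool four alleles each.
narrow = expected_homozygosity(bin_frequencies(alleles, 50).values())
wide = expected_homozygosity(bin_frequencies(alleles, 200).values())
```

In this toy case the expected homozygosity rises from about 0.05 to about 0.2 when the bin is widened, although the underlying population is unchanged. Any comparison with the observed frequency of single-band phenotypes therefore depends on the bin width chosen.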

The inference that an evidentiary sample came from the suspect (barring monozygous twins) requires rejection of three other possibilities: (1) the samples do not match (the exclusion test); (2) they match by chance (the coincidence test), and (3) they match because the evidentiary sample came from a relative (the kinship test) [5]. The exclusion test does not require and should not use bins, since it depends only on quantitative differences in fragment size. The Hardy-Weinberg assumption for the coincidence test has received strong support from the demonstration that the corresponding matching probability exceeds the value under inbreeding (and is therefore favourable to the suspect, or ‘conservative’), except for extreme levels of inbreeding or low levels of polymorphism not suitable for forensic markers [5, 6]. In practical terms, this means that inbreeding of suspect and culprit can be neglected in the coincidence test without prejudice to the suspect, but possible kinship between them has a non-negligible effect and must in fairness to the suspect be given some consideration. An expert witness should consider these different contingencies, without usurping the obligation of the court to decide among them.
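One way to read the claim about inbreeding is in terms of the chance that two unrelated individuals share the same single-locus genotype. The sketch below assumes the textbook inbreeding model for genotype frequencies, with an inbreeding coefficient `f`; the allele frequencies are hypothetical, and the calculations in [5, 6] may be formulated differently.

```python
from itertools import combinations


def genotype_match_probability(freqs, f=0.0):
    """Chance that two unrelated individuals share a single-locus genotype.

    freqs: allele frequencies summing to 1.
    f: inbreeding coefficient; f = 0 gives Hardy-Weinberg proportions.
    Genotype frequencies assumed here (textbook inbreeding model):
      homozygote AiAi:   q_i**2 + f * q_i * (1 - q_i)
      heterozygote AiAj: 2 * q_i * q_j * (1 - f)
    """
    homo = sum((q * q + f * q * (1 - q)) ** 2 for q in freqs)
    het = sum((2 * qi * qj * (1 - f)) ** 2
              for qi, qj in combinations(freqs, 2))
    return homo + het


# Hypothetical highly polymorphic locus: ten equally frequent alleles.
polymorphic = [0.1] * 10
# Hypothetical weakly polymorphic locus: one common and one rare allele.
weak = [0.9, 0.1]

hw_poly = genotype_match_probability(polymorphic)
inbred_poly = genotype_match_probability(polymorphic, f=0.05)
hw_weak = genotype_match_probability(weak)
inbred_weak = genotype_match_probability(weak, f=0.05)
```

Under these assumptions the Hardy-Weinberg value is the larger one for the polymorphic locus, where heterozygotes dominate, and the smaller one for the weakly polymorphic locus, which is the kind of marker the text describes as unsuitable for forensic use.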

Whereas the effect of a close relationship (e.g. sib, parent, cousin) can easily be deduced, an estimate of remote kinship must be based on studies of population structure. There is now strong evidence that some expressed loci like GM have high kinship within race (FRT), presumably because of divergent selection, whereas kinship is much less for non-expressed loci [4, 7]. In contrast, for a subpopulation within a race (FSR) there is no systematic difference among valid estimates of kinship from genealogy, migration, isonymy, or bioassay of different types of loci [8], presumably reflecting the predominance of mutation and migration over divergent selection. Further tests of this hypothesis of uniformity will be made as structures relevant to forensic populations continue to be bioassayed. There is already a theoretically sound and empirically supported basis for the kinship test whenever the circumstances of the crime or the eloquence of the defence make it relevant.
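For close relatives such as sibs, parents and cousins, the deduction can be sketched with the standard identity-by-descent argument. The code below is an illustration under stated assumptions (Hardy-Weinberg proportions, non-inbred relatives, full sibs, first cousins, no population substructure); the function name and the allele frequencies are hypothetical, and this is not the formulation of the kinship test in [5].

```python
# Probabilities (k0, k1, k2) that two relatives share 0, 1 or 2 alleles
# identical by descent. Standard values for non-inbred relatives.
RELATIONSHIPS = {
    "unrelated": (1.0, 0.0, 0.0),
    "parent": (0.0, 1.0, 0.0),
    "sib": (0.25, 0.5, 0.25),       # full sibs
    "cousin": (0.75, 0.25, 0.0),    # first cousins
}


def relative_match_probability(relationship, p_i, p_j=None):
    """Chance that a relative carries the same single-locus genotype.

    relationship: a key of RELATIONSHIPS.
    p_i: frequency of allele Ai in the reference genotype.
    p_j: frequency of allele Aj for a heterozygote AiAj,
         or None for a homozygote AiAi.
    """
    k0, k1, k2 = RELATIONSHIPS[relationship]
    if p_j is None:
        share_none = p_i * p_i          # both alleles drawn at random
        share_one = p_i                 # one allele drawn at random
    else:
        share_none = 2 * p_i * p_j
        share_one = (p_i + p_j) / 2
    return k0 * share_none + k1 * share_one + k2


# Hypothetical heterozygote with allele frequencies 0.1 and 0.2.
unrelated = relative_match_probability("unrelated", 0.1, 0.2)
sib = relative_match_probability("sib", 0.1, 0.2)
```

With these hypothetical frequencies the chance of a match rises from 0.04 for an unrelated person to about 0.335 for a full sib, which is why close kinship cannot be neglected when the circumstances make it relevant.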

The National Academy Tries Again

The US National Academy of Sciences (NAS) was created during the American Civil War. It is dedicated to the furtherance and use of science, and its congressional charter calls on it to advise the government whenever requested. In 1916, the National Research Council (NRC) was established as the operating organization of the Academy for such advice, and in 1990 this mechanism was invoked to form a Committee on DNA Technology in Forensic Science. Their report [9] unleashed a storm of criticism from two groups of scientists. Statisticians insisted that likelihood ratios (the optimal method for statistical decisions) should not have been dismissed [10–12]. Population geneticists were appalled that an ad hoc ‘ceiling principle’ was proposed as an alternative to well-established theory and empirical evidence [13]. Scientists normally given to understatement described the ‘ceiling principle’ as ‘arbitrary’, ‘capricious’, ‘indefensible’, ‘interest-ridden’, ‘pseudo-statistical’, ‘comic’ and as ‘having no rational basis’ [14–16].

Bowing to the storm, the NRC formed a second committee to address new developments and to ‘rectify those statements regarding statistical and population genetics issues in the previous report that have been seriously misinterpreted’ [17]. This committee includes three mathematical geneticists, but statisticians and population geneticists who had taken part in debate on the ‘ceiling principle’ were intentionally excluded. The committee cannot be considered unbiased, since it includes two signatories of the original report, but it is inexperienced. Given its composition it cannot reject the approaches advocated by statisticians and population geneticists, but it may well try to conserve some modified version of the ‘ceiling principle’, embracing science without abandoning superstition. Galileo’s tribunal made the same effort, as did Voltaire’s savant who killed swine with an ingenious mixture of arsenic and prayer.

Two aspects of this controversy are relevant to science: the scientific issues and the effect of an expert committee. In its 26-page summary, the first committee concluded that ‘interpreting a DNA typing analysis requires a valid scientific method for estimating the probability that a random person by chance matches the forensic sample at the sites of DNA variation examined … The committee recommends approaches for making sound estimates that are independent of the race and ethnic group of the subject’. Perhaps by ‘subject’ they meant ‘suspect’ or maybe ‘culprit’. They advocated that unbiased estimates of gene frequencies be replaced by the upper 97.5% confidence limit (miscalled 95% [18]). They also advocated ‘random samples of 100 persons from each of 15–20 populations that represent groups relatively homogeneous genetically. Take as the ceiling frequency the largest frequency in any of those populations or 5%, whichever is larger’. They offered as the only justification that ‘use of the ceiling principle yields the same frequency of a given genotype, regardless of the suspect’s ethnic background, because the reported frequency represents a maximum for any possible ethnic heritage’. As critics have pointed out, it is not clear that this objective is desirable or that it is achieved by the ‘ceiling principle’. In the summary there is no proposal as to how the population samples should be chosen or why ‘homogeneity’ (however defined) should be sought.
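The arithmetic of the rule quoted from the summary can be sketched as follows. This is an illustration of the recipe as described here, not an endorsement of it. The normal approximation for the confidence limit, the function names, and the allele counts are assumptions made for the example, as is the choice to count 200 genes for a sample of 100 persons.

```python
import math


def upper_confidence_limit(count, n, z=1.96):
    """Normal-approximation upper limit for an allele frequency estimate.

    count: observed copies of the allele; n: genes sampled.
    z = 1.96 gives a one-sided 97.5% limit (assumed approximation).
    """
    p = count / n
    return p + z * math.sqrt(p * (1 - p) / n)


def ceiling_frequency(population_values, floor=0.05):
    """Largest value in any sampled population, or the floor if larger."""
    return max(max(population_values), floor)


# Hypothetical counts of one allele among 200 genes (100 persons)
# in each of three sampled populations.
counts = [4, 10, 6]
limits = [upper_confidence_limit(c, 200) for c in counts]
ceiling = ceiling_frequency(limits)
```

Here the unbiased estimates are 0.02, 0.05 and 0.03, yet the value reported under the rule is about 0.08, taken from whichever population happens to show the highest limit.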

A later chapter (mislabelled ‘statistical basis for interpretation’), written by a subcommittee, proposed that the confidence limit belt be removed when fudging braces are worn and ‘the populations should span the range of ethnic groups that are represented in the USA — e.g., English, Germans, Italians, Russians, Navahos, Puerto Ricans, Chinese, Japanese, Vietnamese and West Africans’. Obviously, Amerindians are not ‘spanned’ by Navaho, nor Hispanics by Puerto Ricans. If independence from ethnic background were justified, the samples should apply to suspects in the rest of the world and limitation to the United States is illogical. If independence of ethnic background is not valid for suspects outside the US, it cannot be justified for American suspects.

Recognizing the reluctance of forensic scientists to play anthropologist, the committee proposed a ‘modified ceiling principle’: use of forensic samples of at least three major races, replacing each gene frequency by its upper confidence limit and taking the largest of these values or 0.1, whichever is greater. There is no apparent reason why the confidence limit should be removed from the ‘ceiling principle’ but kept for the more extreme ‘modified ceiling principle’. The assumption that a product of confidence limits is a confidence limit is not true in any case, and the result is not a probability [11].
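One simple way to see that the resulting values do not behave as probabilities is to apply the modified rule to every allele at a locus and add up the results. The sketch below uses a hypothetical locus and a normal-approximation confidence limit; both are assumptions for illustration, and the argument in [11] may be framed differently.

```python
import math


def upper_limit(count, n, z=1.96):
    """Normal-approximation upper confidence limit (assumed form)."""
    p = count / n
    return p + z * math.sqrt(p * (1 - p) / n)


def modified_ceiling(counts, n, floor=0.1):
    """Largest upper limit across the major groups, or 0.1 if larger.

    counts: copies of one allele observed in each group, out of n genes.
    """
    return max(max(upper_limit(c, n) for c in counts), floor)


# Hypothetical locus: 20 alleles, each observed 10 times among 200 genes
# (a frequency of 5%) in every one of three major groups.
locus_counts = [(10, 10, 10)] * 20
ceilings = [modified_ceiling(c, 200) for c in locus_counts]

# True allele frequencies at a locus sum to 1; the ceilings do not.
total = sum(ceilings)
```

In this example every allele is assigned the floor value of 0.1, so the ‘frequencies’ at the locus sum to about 2 rather than 1.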

The ferocity of scientific opposition to the ‘ceiling principle’ is understandable, and its advocacy by the committee has sown confusion in American courts. However, the impact of their report should not be exaggerated. In the US, the ‘ceiling principle’ is not used in paternity cases or other non-forensic problems in DNA identification. Mary-Claire King was a member of the NRC committee, but does not use the ‘ceiling principle’ in her own work on victims of the Argentine terror [19]. That would be irresponsible if the ‘ceiling principle’ were a valid safeguard. The ‘ceiling principle’ is not extended to genetic risks, insurance, engineering, and other applications of probability. It has had little impact on decisions in court, where valid alternatives are usually presented [2]. Outside the US, the ‘ceiling principle’ is not used for any purpose. This situation is unlikely to change even if the NRC blunders again: in recent centuries opposition between science and authority has been resolved in favour of science. The issue is not which will prevail, but whether the National Academy will become a laughingstock again.

Strange Bedfellows

When the NRC committee made its report, the statistics chapter was widely attributed to Eric Lander, prompting Bruce Budowle to complain that having Lander coordinate the chapter was like having ‘the fox guarding the hen house’ [20]. Lander is a polymath with a long history of concern about the validity of DNA evidence [21], whereas Budowle is the senior practitioner for the Federal Bureau of Investigation (FBI). They recently revised their positions and concluded that the DNA fingerprinting dispute has been laid to rest by the ‘ceiling principle’ [2], prompting an anonymous commentator (apparently unaware that the ‘ceiling principle’ is unanimously rejected by Scotland Yard, the Home Office, and similar agencies outside America) to remark that ‘there is no longer any reason to mistrust these techniques. Initial problems have been solved, standard techniques established and statistical criteria set’ [3]. This is true, but the criteria do not include the ‘ceiling principle’. Until it is laid to rest, there will be controversy about whether statistical and genetic principles established 50 years ago and supported by a large body of evidence should prevail over an arbitrary rule. The conclusion that ‘in a knowledgeable court DNA profiling is no longer exposed to risk of illogical presentation, blind acceptance or arbitrary rejection’ [22] has not been controversial for several years and is no longer newsworthy.

The Lander and Budowle commentary modified the ‘ceiling principle’ again. The stipulated number of populations has been reduced to 10–15, and justification has been sought in the supposition that the US population is descended from a set of N populations, of which the selected populations are a subset. A court in Maine may reasonably ask whether English and Navahos contributed equally to their criminal population. Since the argument no longer applies to other countries, it cannot apply to different ethnic groups in the US, and the calculation is still not a probability, a ceiling, or a principle.

The Anti-DNA Lobby Fights Back

Five weeks after this commentary, Nature published replies from two critics of DNA typing [23, 24]. Lewontin [23] cited early batch-processed validation tests and the risk of PCR contamination, which can be controlled by isolation, replication, and proper respect for the chain of custody. He argued that jurors who may bet on horses cannot be made to understand odds, and so DNA evidence should be dismissed until unique idiotypes are available. Since individuals are unique only in the limit as the number of typings increases, this is as unconvincing as his earlier suggestion that a database should be constructed for every possible ancestry of the suspect [25]. The problem of population structure was referred to in a companion letter by Hartl [24] that defends the ‘ceiling principle’ which Lewontin strongly opposes [13]. Hartl expressed delight that the FBI in the person of Budowle apparently accepts the ‘ceiling principle’, but apprehension that its acceptance may not be sincere. He called on the FBI to adopt some version based on sampling diverse ethnic groups, not a ‘modified ceiling principle’ based only on major forensic groups. Noting that the FBI has been granted authority to set up a committee to ‘make short work of the population genetics issue, by clarifying, changing, or discarding the original NRC recommendations’, he fears that the ‘modified ceiling principle’ may be abandoned, as it would be if population genetics were represented. Thus Lewontin argues that DNA typing is so unreliable that no attention to population structure will fix it, whereas Hartl contends that typing may be accepted if it uses the ‘ceiling principle’. They cannot both be right.

O.J. Simpson

The murder case against the former football star, O.J. Simpson, fascinates many Americans. Interpretation of DNA evidence is a central issue. An attack on the ‘ceiling principle’ is one of several tactics used by the defence, which cites a bonanza of scientific criticism [26]. By attacking the most vulnerable target, it is not difficult to create the false impression that critics are against DNA typing, whereas, with few exceptions, they are merely in favour of scientific ways to evaluate the evidence. Several commentators have made the point that the O.J. Simpson trial will ‘exert an immense influence on the public’s acceptance, or otherwise, of these powerful tools of justice’ [3]. Whatever the outcome, the ‘ceiling principle’ will fare badly in competition with more rigorous methods.

The Conference on DNA Fingerprinting

The Third International Conference on DNA Fingerprinting was held in Hyderabad at the end of 1994. The title is somewhat misleading, since more attention was given to profiling and idiotypes than to multilocus fingerprinting. There were pleasant surprises for European participants. The venue was ideal, with frequent conversation out of doors in a mild climate. Many participants use DNA typing for evolutionary studies, gene mapping, and identification of biological material of uncertain origin. DNA evidence is non-controversial not only for these investigators, but even in forensic work outside America. Bruce Budowle upheld the ‘ceiling principle’ as a stopgap, while Alec Jeffreys and Peter Gill spoke against it in any form, but Asians were largely unfamiliar with the controversy. PCR is much more expensive than fingerprinting in India, and so advances in statistical evaluation and databasing of multilocus fingerprints may well come from Asia.

The conference served to put residual disagreement into perspective. It affects only a small part of the globe, and the scientific issues are simple. Debate about admissibility of DNA evidence ended not with a bang but a whimper, when Lewontin rested his case on laboratory negligence and the stupidity of jurors and Hartl insisted that blunder by a committee should not be rectified. The committee did as well as it could in ignorance of population genetics and statistics. Now, as Lander and Budowle remark, ‘it is time to move on’. Discussion about the best way to present evidence must continue so long as advances are made in the molecular techniques that provide the evidence, but it may reasonably be hoped that future discussion will be confined to methods that do not violate population genetics and statistics. The kinship test has recently been made even more rigorous [27, 28]. Calculations presented as evidence should be subjected to as much quality control as laboratory performance, and the enormously expensive databases now being constructed should provide information about population structure and the operating characteristics of alternative likelihood ratios and their unvalidated competitors [29, 30].