Interpreting DNA Evidence: Statistical Genetics for Forensic Scientists

Ian W. Evett and Bruce S. Weir. Sinauer Associates Inc, Sunderland, Massachusetts. 1998. Pp. 278. Price £25.95, paperback. ISBN 0 87893 155 4.

This book sets out to provide the statistical and genetical knowledge that forensic scientists require to report and testify about DNA profiling evidence. In doing so it tells a story of much broader interest. DNA evidence can be both very strong and quantifiable, yet these apparently useful properties have exposed and challenged the way in which scientific evidence is presented and decisions are made in court. How can a court weigh the impressive odds quoted for the DNA evidence against the more conventional evidence?

Evett and Weir illustrate the issue using the case of R vs. G. Adams. In outline, the case involved a rape by a stranger. A suspect was identified by a DNA profile match with a sample obtained in connection with another incident. This suspect, Adams, had an alibi for the night of the attack. He was not picked out in an identity parade by the victim; indeed, she said at a later hearing that he did not look like her attacker and that he was appreciably older. The presentation of the evidence was understandably problematic, involving a retrial and appeals. How could the jury compare the unquantified evidence suggesting innocence with the odds calculated from the DNA evidence suggesting guilt? At the retrial, prosecution and defense experts cooperated in a remarkable innovation. They guided the jury through the calculations needed to express the identification and other non-DNA evidence in numerical form, thereby allowing direct comparison with the DNA evidence. Adams was again convicted, but at the subsequent (unsuccessful) appeal the judges recommended that juries should not normally be induced into numerical reasoning. This leaves the problem of how to present the DNA evidence to the court so that it can be evaluated appropriately by common sense.

The book leads up to these difficult and incompletely resolved problems. It starts by laying the necessary foundations in probability and population genetics. The authors take the view that forensic scientists are ‘often uncomfortable with statistics’ and so start from fundamentals such as the meaning of randomness and probability. This allows them to introduce Bayesian analysis to forensic inference, which will be a big jump for their target audience. Most forensic scientists will have been trained to assess data in the classical statistical mode, that is, to evaluate the probability of the evidence under the assumption of some null hypothesis; in this case the null hypothesis might be that the suspect did not leave the crime stain. The authors advocate assessing, instead, the effect of the genetic data on the relative probability of the hypotheses of interest to the court: perhaps the hypothesis that the suspect left the stain compared with the hypothesis that some other person did.
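
The contrast can be made concrete with the odds form of Bayes' theorem, in which posterior odds equal prior odds multiplied by the likelihood ratio. The sketch below is an illustration only and is not taken from the book: the function names, the match probability of one in a million and the pool of 200,000 possible sources are all hypothetical, and it assumes the profile is certain to match if the suspect left the stain.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# Hypothetical figures, chosen only for illustration.
match_probability = 1e-6          # P(match | some other person left the stain)
likelihood_ratio = 1.0 / match_probability  # assumes P(match | suspect left it) = 1

# Hypothetical non-DNA evidence: the suspect is one of 200,000 possible sources,
# so the prior odds are 1 to 199,999.
prior_odds = 1.0 / 199_999

post_odds = posterior_odds(prior_odds, likelihood_ratio)
post_prob = odds_to_probability(post_odds)
print(f"posterior odds about {post_odds:.2f} to 1, probability about {post_prob:.3f}")
```

Under these assumed figures a likelihood ratio of a million yields posterior odds of only about five to one, which shows why the DNA evidence has to be combined with the other evidence rather than reported in isolation.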

The first two chapters of the book appear to have been very thoughtfully constructed; they provide the necessary tools and background to make the difficult conceptual jump between the classical and Bayesian modes of reasoning. The authors use examples from forensic science at an early stage, avoid distracting issues and write clearly. The middle of the book necessarily becomes more densely written in order to cover the range of problems that occur in practical casework, and will perhaps be more useful as an expert training text.

The final chapter will again be of wider interest and is particularly stimulating. The importance of introducing the Bayesian approach is illustrated by the problem of a suspect identified by a search through a database. At first sight it seems reasonable that the evidence against a suspect is weaker if he is identified in this way; after all, forensic databases can be quite large (the UK database may eventually contain over a million people). Even if match probabilities are as small as one in several million, the chance of an innocent person matching somewhere in a large database may be non-negligible. For this reason a US National Research Council report recommended that the diminished strength of evidence should be represented by multiplying the match probability by the database size. The arguments of Balding and Donnelly show this to be logically flawed: imagine the database being extended to cover almost every likely suspect in the country; surely as the database gets bigger the single matching person is more likely to be the true culprit, not less! The book outlines how the Bayesian approach resolves this paradox in a straightforward manner.
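
A small numerical sketch may help to show why the two intuitions pull in opposite directions. It is not the book's own treatment: it assumes a deliberately simplified model in which every member of a population of possible sources is equally likely a priori to be the culprit, exactly one database member matches, every other database member is excluded, and there are no laboratory errors. The function names and all figures are hypothetical.

```python
def prob_at_least_one_chance_match(match_probability, database_size):
    """Chance that a database of innocent people contains one or more matches."""
    return 1.0 - (1.0 - match_probability) ** database_size


def posterior_source_probability(match_probability, population, database_size):
    """Posterior probability that the single matching person is the source.

    Simplified model: uniform prior over `population` possible sources,
    one match in the database and all other database members excluded.
    Only the people NOT in the database remain as alternative sources.
    """
    untested = population - database_size
    return 1.0 / (1.0 + untested * match_probability)


# Hypothetical figures, chosen only for illustration.
p = 1e-6                  # match probability for an unrelated person
population = 10_000_000   # assumed pool of possible sources

print("chance of a coincidental match in a database of one million:",
      round(prob_at_least_one_chance_match(p, 1_000_000), 3))

for n in (1, 1_000_000, 5_000_000, 10_000_000):
    print(n, round(posterior_source_probability(p, population, n), 3))
```

The first quantity grows with the size of the database, which is the intuition behind the multiplication rule. The second, which is the quantity of interest to the court, also grows under these assumptions, because each excluded database member removes one alternative source.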

What the Bayesian approach cannot resolve is the tricky issue of ensuring that the court reasons sensibly with the information provided to it by the experts about DNA profiles. The authors provide examples of expert testimony to courts which are logically incorrect (and which may be regarded as grounds for appeal). On the other hand, they present superficially similar wordings about the same evidence which are correct. They report that a judge has confided that, if the difference between the correct and incorrect sentences is so subtle, then perhaps the fallacy doesn’t matter. Alternatively, it may be that there are as many problems with courts trying to apply common sense to reasoning about forensic data as there are with introducing numerical reasoning about the other evidence.