Main

The armamentarium of diagnostic procedures in clinical medicine has become vast and sophisticated over the past decades. A systematic review of autopsy-detected diagnostic errors over the last 40 years of the 20th century disclosed a relative decrease of about 20% for major errors and of one third for class I errors per decade.1 Time series from single institutions included in the systematic review showed no decrease in diagnostic discrepancies over time2, 3, 4, 5, 6 with one exception,7 probably reflecting a combination of inadequate power and decreasing autopsy rates leading to selection bias.1 We report here the 10-year follow-up of the previously mentioned study7 to assess the impact of new diagnostic tools such as spiral computed tomography (CT) angiography,8, 9 an array of biomarkers10 and the use of electronic medical records on the frequency of diagnostic errors detected by autopsy.

Materials and methods

Selection of Cases

We retrospectively analyzed the medical and necropsy records of 100 randomly selected adult patients who died in medical clinics A and B or in the medical intensive care unit of the Department of Internal Medicine at the University Hospital Zurich, Zurich, Switzerland, in the year 2002. Data from 1972, 1982 and 1992 were published in a previous report.7 Random sampling was performed with a random-number table11 from the list of patients who died in the medical clinic and underwent necropsy. In the late 1990s, Zurich switched from tacit consent to informed consent for autopsy permission. For all patients who died in the medical clinic in 2002, informed consent for necropsy was sought.

The role of the medical clinic at the University Hospital of Zurich as the referral center for eastern Switzerland remained unchanged, as did the organization of the medical clinic. All medical inpatients of the University Hospital of Zurich were cared for in the emergency room and then admitted to medical clinic A or B or to the medical intensive care unit. There were no specialised medical wards. All patients were cared for by full-time hospitalists,12 who were in charge of the emergency room, all the wards and the medical intensive care unit. There was close collaboration with all the medical specialties in the Department of Medicine. A computer-based patient record system developed in-house was introduced stepwise from 1995 onwards. In 2002, medical and nursing reports, laboratory results and radiology reports were available at all workplaces.

Analysis of Reports

We recorded age, sex, number of admissions in the 12 months before the index hospital stay, and length of stay. Clinical diagnoses were those listed by the clinician on the necropsy request and all diagnoses that were established or assumed and led to specific treatment. Autopsy diagnoses were the diagnoses listed on the final autopsy report. All included patients underwent a complete necropsy, including histological assessment of each organ, which was performed by a junior pathologist. A staff pathologist reviewed macroscopic and microscopic findings. Each week the results of autopsies were presented and discussed with the medical clinic's clinicians. The main diagnoses were listed separately on the autopsy report, but were not classified and no cause of death was given. Clinical and necropsy diagnoses were grouped into seven categories according to the International Classification of Diseases, 9th edition (classification 1979–1983): infectious diseases (ICD-9 1–139); neoplastic diseases (ICD-9 140–239); cardiovascular diseases (ICD-9 390–459); pulmonary diseases (ICD-9 460–519); gastrointestinal diseases (ICD-9 520–579); renal diseases (ICD-9 580–629); and miscellaneous (remaining diagnoses).

We classified discrepancies between the clinical and autopsy diagnoses according to the method of Goldman et al,3 the modification of Battle et al,13 and as non-classifiable cases14 (Box 1). Major diagnoses were those involving the principal underlying cause of death and major contributors to it.13 Minor diagnoses were antecedent disorders, related diagnoses, contributing causes or other important disorders.14 If the decision to limit or stop the diagnostic or therapeutic process was made during the hospital stay, we assessed and classified the clinical diagnostic process up to the point at which the process was stopped. For example, a patient was diagnosed as having a diffuse metastasizing carcinoma and it was agreed and stated in the notes that no further investigation and treatment should be undertaken apart from analgesic treatment and supportive care. The patient died 10 days later. Pneumonia and deep-vein thrombosis seen at necropsy were not taken into account for the classification of discrepancies. If, however, the carcinoma had not been confirmed at necropsy, the diagnosis would have been classified as a major discrepancy.

A single class of discrepancy was assigned to each case. For cases with more than one class, the most severe was chosen. Discrepancies were classified by agreement of two clinicians (SSB and FS) and one pathologist (HM). All class I and II errors were assessed a second time, but in no case was a reclassification necessary.
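The one-class-per-case rule can be sketched in a few lines of Python. This is an illustration only, not the procedure used by the reviewers: the names `DiscrepancyClass` and `assign_case_class` are hypothetical, and the sketch assumes that severity decreases with the class numeral (class I most severe), which is consistent with classes I and II being the major and classes III and IV the minor discrepancies.

```python
from enum import IntEnum


class DiscrepancyClass(IntEnum):
    """Discrepancy classes I-VI (see Box 1); lower value = more severe (assumption)."""
    I = 1    # major
    II = 2   # major
    III = 3  # minor
    IV = 4   # minor
    V = 5
    VI = 6


def assign_case_class(found):
    """Return the single class recorded for a case: the most severe one found."""
    found = list(found)
    if not found:
        raise ValueError("a case needs at least one class")
    # IntEnum members compare by value, so min() picks the most severe class.
    return min(found)


# Hypothetical case with a class II and a class IV discrepancy.
print(assign_case_class([DiscrepancyClass.IV, DiscrepancyClass.II]).name)
```

Under this rule a minor discrepancy is recorded only when the case has no major one, so a fall in major errors can raise the count of minor errors without any change in how often minor discrepancies occur.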

We calculated accuracy, sensitivity, and specificity for the three most frequent clinical categories of diagnoses: cardiovascular, neoplastic and infectious diseases. Accuracy was calculated as the sum of true-positive and true-negative diagnoses in each category divided by all cases. We calculated sensitivity as the number of true positives divided by the sum of true positives and false negatives. Specificity was calculated as the number of true negatives divided by the sum of true negatives and false positives. We counted class I and II discrepancies as false-negative diagnoses when the necropsy diagnosis was not in the same diagnostic group as the clinical diagnosis. False-positive diagnoses were cases with class I and II discrepancies in which the clinical diagnosis was not in the same diagnostic group as the necropsy diagnosis. When the diagnoses were in the same diagnostic group but a major discrepancy was present, we took them to be both false positive and false negative within the assessed diagnostic group. If, for example, aortic dissection was wrongly diagnosed as myocardial infarction, the diagnosis was counted as false negative (missed aortic dissection) and as false positive (myocardial infarction not present).
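The three measures defined above can be written out as a minimal Python sketch. The container `CategoryCounts`, the function names and the example counts are hypothetical and serve only to illustrate the arithmetic; they are not the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryCounts:
    """Counts for one diagnostic category in one study year (hypothetical container)."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int
    n_cases: int  # all cases assessed in that year


def _ratio(numerator, denominator):
    if denominator == 0:
        raise ValueError("denominator is zero; the measure is undefined")
    return numerator / denominator


def accuracy(c):
    # (true positives + true negatives) / all cases
    return _ratio(c.true_pos + c.true_neg, c.n_cases)


def sensitivity(c):
    # true positives / (true positives + false negatives)
    return _ratio(c.true_pos, c.true_pos + c.false_neg)


def specificity(c):
    # true negatives / (true negatives + false positives)
    return _ratio(c.true_neg, c.true_neg + c.false_pos)


# Illustrative counts only, not taken from the study tables.
example = CategoryCounts(true_pos=45, false_pos=2, true_neg=49, false_neg=4, n_cases=100)
print(f"accuracy={accuracy(example):.2f}, "
      f"sensitivity={sensitivity(example):.2f}, "
      f"specificity={specificity(example):.2f}")
```

Because a major discrepancy within one diagnostic group is counted as both a false positive and a false negative, the four counts need not sum to the number of cases, which is why `n_cases` is passed separately.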

Diagnostic tests (except blood tests and microbiological investigations) were separated into the following groups:3, 4 standard non-contrast radiological procedures; contrast radiological procedures; endoscopies (gastrointestinal and respiratory tract), biopsies and surgical explorations; scintigraphy; ultrasonography; echocardiography; CT; and magnetic resonance imaging. As defined by Goldman et al,3 the number of different types of procedure, rather than the number of procedures, was counted.

Statistical Analysis

We calculated means and proportions of baseline characteristics. We compared the changes in the proportion of errors across the study years with the exact Cochran–Armitage test for trend. Analysis was carried out with SAS (version 9.1) and SPSS (version 12.0.1). All reported P-values are two sided and P≤0.05 was taken to be significant.
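For readers unfamiliar with the trend test, the following Python sketch shows the asymptotic (normal-approximation) form of the Cochran–Armitage statistic. It is not the exact test computed with SAS in this study, and P-values from the two versions can differ, particularly with small counts. The function name, the equally spaced default scores and the example counts are assumptions made for illustration; the counts are not the study data.

```python
import math


def cochran_armitage_trend(events, totals, scores=None):
    """Asymptotic Cochran-Armitage test for trend in proportions.

    events[i] = number of cases with an error in group i
    totals[i] = number of cases in group i
    scores[i] = ordinal score of group i (default 0, 1, 2, ...)
    Returns (z, two_sided_p).
    """
    if scores is None:
        scores = list(range(len(events)))
    if not (len(events) == len(totals) == len(scores)):
        raise ValueError("events, totals and scores must have equal length")

    n_all = sum(totals)
    p_bar = sum(events) / n_all  # pooled proportion under the null hypothesis

    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))

    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 ** 2 / n_all)
    if variance <= 0:
        raise ValueError("variance is zero; the trend test is undefined")

    z = t_stat / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Illustrative counts for four study years (hypothetical, not the study data).
z, p = cochran_armitage_trend(events=[30, 20, 14, 7], totals=[100, 100, 100, 100])
print(f"z={z:.2f}, two-sided P={p:.2g}")
```

A negative `z` indicates a proportion that falls as the group score rises, that is, a decline across successive study years.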

Results

The main characteristics of patients were similar in each year (Table 1). The necropsy rate was 94.0% in 1972, 89.2% in 1982 and 1992 and declined to 53.6% in 2002. Cardiovascular and neoplastic diseases were the largest diagnostic groups in 2002 followed by infectious diseases (Table 2).

Table 1 General data of study patients
Table 2 Distribution of clinical diagnoses

The changes in the six discrepancy classes (Box 1) over time are shown in Figure 1. Major diagnostic errors (class I and II) declined significantly from 30 to 7% (P<0.001). In the last 10 years major diagnostic errors declined from 14 to 7% with class I errors being reduced from 7 to 2% and class II errors from 7 to 5%. The major diagnostic errors in 2002 were pneumonia (two cases), myocarditis (two cases) and one case each of pulmonary embolism, intestinal ischemia and metastatic subdural empyema due to pneumococcal septicemia. Minor diagnostic errors (class III and IV) increased from 23 to 53% (P<0.001). Class III errors decreased from 25 to 16% in the last decade. Minor occult diagnoses (class IV discrepancies) increased from 21 to 37% in the same period.

Figure 1

Changes in discrepancy classes I–VI over the study years. Class I decreased from 16 to 2% (P<0.001) and class II decreased from 14 to 5% (P=0.026). Major diagnostic errors (class I and II) decreased from 30 to 7% (P<0.001). Class III increased from 13 to 16% (P=0.207) and class IV from 10 to 37% (P<0.001). Minor diagnostic errors increased from 23 to 53% (P<0.001). Class V decreased from 43 to 35% (P=0.212) and class VI from 4 to 5% (P=1.000).

Sensitivity for cardiovascular diseases increased from 69 to 92% (P=0.006), for infectious diseases from 25 to 90% (P=0.013) and for neoplastic diseases from 89 to 100% (P=0.053) (Table 3). Specificity for cardiovascular diseases increased from 85 to 98% (P<0.001) but was unchanged for infectious diseases (100–99%, P=0.245) and for neoplastic diseases (92–99%, P=0.125).

Table 3 Accuracy, sensitivity and specificity for (a) cardiovascular diseases; (b) neoplastic diseases; (c) infectious diseases

The number of diagnostic procedures increased from 144 to 281 (P<0.001) with a higher number of CT investigations and of biopsies and fine-needle aspirations in the last decade (Table 4).

Table 4 Number of patients with diagnostic procedures in study in years 1972, 1982, 1992 and 2002

Discussion

In this longitudinal study we observed a further significant reduction in major diagnostic errors at the beginning of the new millennium. A similar reduction from 15.0 to 6.1% was reported recently from another single-center study of 970 autopsies analyzed continuously over a period of 10 years from 1997 to 2006, with an average autopsy rate of 50%.15 Such studies reflect in the first instance the local efforts to improve the diagnostic and therapeutic process. But analyzing the possible contributing factors to the improvement may reveal a more general trend. The changes in diagnostic accuracy, sensitivity and specificity can give some indications.16

In the present study the largest changes in accuracy, sensitivity and specificity for cardiovascular diseases occurred between 1972 and 1992. In the study of Thurnheer et al,15 the significant increase in accuracy, sensitivity and specificity for cardiovascular diseases was seen between 1997 and 2002 coinciding with the introduction of d-dimers, troponin and spiral CT angiography for the diagnosis of cardiovascular diseases. The levels of accuracy, sensitivity and specificity for cardiovascular diseases in the present study are remarkably similar to Thurnheer et al,15 but they were reached in the year 1992 before the introduction of the abovementioned new diagnostic tools. This example shows how the same result can be achieved by different means depending on the local circumstances regarding patient characteristics, diagnostic tools and diagnostic know-how. The further increase in sensitivity for cardiovascular diseases observed in this study between 1992 and 2002 could well be the result of the new diagnostic tools.

In most autopsy studies the number of missed neoplastic diseases was low,3, 4, 15 with exceptions even in recent times.17 In contrast with our previous study,7 we now found a significant increase in diagnostic sensitivity and accuracy for tumors over the last 30 years. This improvement goes along with an increased use of diagnostic procedures such as CT and tissue sampling most pronounced in the last decade (Table 4).

Having analyzed the changes in sensitivity and specificity for cardiovascular and neoplastic diseases, it seems likely that increased use of diagnostic tools has contributed to the reduction of diagnostic errors. But as argued above, other factors could come into play. In trying to find such factors, speculation is inevitable in a retrospective study of a task as complex as the diagnostic process. Graber et al18 found that cognitive factors were responsible in 90% of the cases with diagnostic errors, whereas in cases with delayed diagnosis, system-related factors were the main cause.

Faulty information synthesis was the most frequent cause of cognitive-based diagnostic errors and premature closure the single most frequent mechanism.18 Premature closure can occur at any stage of the diagnostic process.7 The tendency to stop considering other diagnoses is independent of clinical experience19 and is associated with overconfidence in the already available findings, leading to a false-positive diagnosis. Overconfidence is therefore a main factor leading to diagnostic errors and by the same mechanism autoptic verification of diagnoses is considered unnecessary by clinicians,20 as reflected by the very low autopsy rates.21, 22, 23 This fact is rarely stated openly but often disguised as complacency.20, 24 Post-mortem case review without autopsy is often used as a substitute,25, 26 but this approach has been shown to leave 85% of main diagnostic errors undetected.27 The invaluable advantage of feedback and learning through autopsies is that uncertainty is almost eliminated and confidence in the diagnostic workup is strengthened in cases with no discrepancies, which are still the largest group. In cases with major or minor discrepancies clinicians become aware of their fallibility, a prerequisite for further improving the diagnostic process28 by correcting overconfidence.20, 29 Autopsy should therefore be an integral part of any effort to reduce diagnostic errors.30

Timely information retrieval is a fundamental part of making a correct diagnosis. Graber et al18 found that non-availability of data contributed to diagnostic errors at the system-related level. In our Department of Internal Medicine, a clinically intuitive computer-based patient record system was developed and has been used by doctors and nurses since 1995. The all-around availability of patient records may have contributed to the reduction in diagnostic errors in the present study. It is interesting to note that the introduction of a computer-based patient record system can change knowledge organization and reasoning patterns in medical decision making.31

There are several limitations in the interpretation of the present study. Most importantly there has been a drop in autopsy rate in the last decade from around 90 to 54%. This was in part due to the change in legislation in the county of Zürich from tacit to informed consent regarding autopsy request, as observed by others.22 The lower autopsy rate makes the interpretation of the reduction of diagnostic errors difficult. The distribution of main diagnostic groups showed no evidence of selection bias (Table 2). We compared 100 patients without autopsy who died in 2002 with the study population and found no difference in age, gender, length of stay, previous hospitalizations or in the number of diagnostic procedures (unpublished data). A systematic review of the relationship between autopsy-detected diagnostic errors and autopsy rates found that lower autopsy rates were associated with higher rates of major diagnostic errors.1 We have been very careful to use the same criteria to assign the discrepancy classes as in the previous study but subtle unintended shifts cannot be excluded. In contrast to the previous study, this time a pathologist was part of the team that classified the cases. He was not involved in performing and analyzing the autopsies.

The apparent increase in minor diagnostic errors is due to the classification system used, where only the most severe discrepancy was counted. With the reduction of major diagnostic errors more class III and IV errors emerged as discrepancies, with a predominance of class IV discrepancies in the last decade.

In summary we observed a further improvement of diagnostic performance assessed by autopsy from unselected patients who died in the wards and in the intensive care unit of an academic Department of Internal Medicine. This reduction of diagnostic errors is likely to be the result of new diagnostic methods, of continuous feedback and learning through autopsy and improved availability of patient information.