Normal Accidents: Living with High-Risk Technologies

  • Charles Perrow
First published by Basic Books, 1984. Reprinted by Princeton University Press, 1999. 386 pp. $35. ISBN: 978-0-691-00412-9

In publishing, much hinges on timing. So it was with Charles Perrow's influential book Normal Accidents. Its publication in 1984 was followed by a string of major technological disasters — including the Bhopal industrial chemical leak in India in December 1984, the explosion of the US space shuttle Challenger in January 1986, and the Chernobyl nuclear accident in Ukraine in April that year. Each cried out for the sort of detailed analysis that Perrow supplied. Now, more than a year after the Deepwater Horizon oil-rig blowout in the Gulf of Mexico, and in the aftermath of the nuclear disaster at Fukushima Daiichi in Japan in March, the book's message seems again prescient.

Radiation monitoring at Three Mile Island in 1979. Credit: K. MEYERS/THE NEW YORK TIMES/REDUX/EYEVINE

Normal Accidents contributed key concepts to a set of intellectual developments in the 1980s that revolutionized how we think about safety and risk. It made the case for examining technological failures as the product of complex interacting systems, and highlighted organizational and management factors as the main causes of failures. Technological disasters could no longer be ascribed to isolated equipment malfunction, operator error or random acts of God.

As one of the foremost US authorities on the sociology of large organizations, Perrow admits that he came to the topic of risk and technology almost by accident. He was invited to provide a background paper for the President's Commission on the Accident at Three Mile Island, which enquired into the 1979 nuclear incident near Harrisburg, Pennsylvania. A very small leak of water into an instrumentation system had triggered an escalating chain of events at the Three Mile Island plant, involving both component malfunctions and operator errors. The result was a major loss of coolant to the reactor, not unlike the recent tragic events in Japan.

There was no single cause of the Three Mile Island accident, nor of the Fukushima disaster. Numerous, seemingly inconsequential difficulties that had not been predicted by the plant designers combined to defeat multiple safety systems. Perrow concluded that the failure at Three Mile Island was a consequence of the system's immense complexity. Such modern high-risk systems, he realized, were prone to failure, however well they were managed. It was inevitable that they would eventually suffer what he termed a 'normal accident'.

Therefore, he suggested, we might do better to contemplate a radical redesign or, if that was not possible, to abandon such technologies entirely. Foreseeing one of the problems at Fukushima, Perrow wrote in 1984 that “nuclear plants could be made marginally less complex if the spent fuel storage pool were removed from the premises”. Such pools require constant cooling and attention, so a reactor accident that forced an evacuation of the building, or a complete loss of power to the fuel cooling system, would risk a serious fuel fire and a significant release of radiation from the storage ponds.

Normal Accidents introduced two concepts: 'interactive complexity', meaning the number and degree of system interrelationships; and 'tight coupling', or the degree to which initial failures can concatenate rapidly to bring down other parts of the system. Universities, for example, are interactively complex but only loosely coupled — decisions are often influenced by unanticipated factors but effects are felt slowly. By contrast, modern production lines are often tightly coupled, with close and rapid transformations between one stage and the next, but have simple relationships between those stages. Neither tends to suffer systemic accidents.

When systems exhibit both high complexity and tight coupling, as at Three Mile Island, the risk of failure becomes high. Worse still, according to Perrow, the addition of more safety devices — the stock response to a previous failure — might further reduce the safety margins if it adds complexity. For example, when a British European Airways Trident jet stalled shortly after take-off from London Heathrow Airport in 1972 and crashed, killing everyone on board, the pilots were unable to diagnose the fault amid at least nine cockpit warnings and alarms that went off as a result.

A history of failure

Six years before Normal Accidents, British sociologist Barry Turner published his analysis of 80 major UK system failures in the lesser-known but similarly influential book Man-Made Disasters (Wykeham Science Press, 1978; I contributed to the posthumous 1997 second edition). Turner, too, emphasized the ways in which system complexity can defeat attempts to anticipate risks. But his theory differed crucially from Perrow's — it gave a more acute description of the organizational, management and communication failings that occur before an accident.

Major accidents do not spring into life on the day of the visible failure; they have a social and cultural context and a history. The problems at Three Mile Island, as we now know, were foreshadowed by similar near-miss events in other US pressurized-water plants — notably at the Davis-Besse nuclear plant near Oak Harbor, Ohio, in 1977. This raises the question of why safety information and learning were not shared among operators. Turner's man-made disasters analysis, in contrast to that of normal accidents, implied that because the background conditions to failure incubate over time, there is some possibility of detecting them in advance, even when systems are complex.

Comparing Turner's and Perrow's accounts reveals a tension between the possibility of foresight and fatalism. This tension resurfaced several years later in debates among US scholars over 'normal accidents' versus 'high-reliability organizations', and in related work in Europe on safety culture and organizational accidents. The fundamental question, posed by influential political scientist Scott Sagan in his book The Limits of Safety (Princeton University Press, 1993), was: are normal accidents inevitable, or can the combination of interactive complexity and tight coupling be safely managed?

The high-reliability researchers believed that it could. They studied cases such as flight operations on aircraft carriers, where the conditions for normal accidents exist but the systems operate safely day after day. They identified cultural factors, such as collective decision-making and organizational learning, as key reasons why an otherwise toxic combination of complexity and risk can be managed. By contrast, critics such as Sagan pointed out that even these systems suffered serious near-misses from time to time, and that normal accidents could therefore always occur.

That debate is still unresolved. Nevertheless, the analyses of Perrow and Turner were ahead of their time and their legacy remains profound. Many subsequent accident inquiries drew on their insights — most notably the space shuttle Columbia Accident Investigation Board report in 2003. Enquiries into the Fukushima disaster will benefit too, but we need a wider appreciation of how future normal accidents might gestate, and a better understanding of the actions of organizations and people, both intended and unintended, that generate major risks.

The world still faces many systemic risk challenges, including those of runaway climate change, financial-market failure and information security. Although many advances in safety technology, engineering practice and risk management have been made over the past 30 years, organizational and technical complexity remains integral to the many systems that drive such risks. Normal Accidents is a testament to the value of rigorous thinking when applied to a critical problem.