The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines and Emerged Triumphant from Two Centuries of Controversy

  • Sharon Bertsch McGrayne
Yale University Press: 2011. 336 pp. $27.50. ISBN: 978-0-3001-6969-0

Bayes' theorem of probability was proposed by English mathematician and clergyman Thomas Bayes in the 1740s, and rediscovered in the 1770s by Pierre-Simon Laplace, a French mathematician. It states that updating an initial belief with new objective information yields an improved belief. Or as economist John Maynard Keynes put it: “When the facts change, I change my opinion.”
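The updating described above can be made concrete with a minimal sketch. It is an illustration only, not drawn from the book: the function name `bayes_update` and all the numbers are assumptions chosen for the example. It applies the standard form of the rule for a single yes-or-no hypothesis, in which the revised belief is the prior weighted by how well the hypothesis predicts the new evidence.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the revised probability of a hypothesis after one piece of evidence.

    prior               -- belief in the hypothesis before the evidence (0 to 1)
    p_evidence_if_true  -- chance of seeing the evidence if the hypothesis holds
    p_evidence_if_false -- chance of seeing the evidence if it does not
    """
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    if evidence == 0:
        raise ValueError("the evidence is impossible under both alternatives")
    return numerator / evidence


# Hypothetical numbers: start undecided, then see three observations,
# each twice as likely if the hypothesis is true as if it is false.
belief = 0.5
for _ in range(3):
    belief = bayes_update(belief, 0.8, 0.4)

print(round(belief, 4))
```

Each observation here doubles the odds in favour of the hypothesis, so the belief climbs from even odds to 8 to 1 in three steps. Yesterday's improved belief simply becomes today's initial one.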

Considering the widespread effectiveness of Bayesian inference in physics and astronomy, genetics, imaging and robotics, Internet communication, finance and commerce, it is surprising that it has remained controversial for so long. Many twentieth-century scientists who used the Bayesian approach in their work — including mathematician Alan Turing and physicists Enrico Fermi and Richard Feynman — declined to use the 'B' word in public.

A US hydrogen bomb lost in the sea off Spain in 1966 sparked a search using Bayes' theorem. It was eventually recovered after a fisherman tipped off authorities. Credit: BETTMANN/CORBIS

Sharon Bertsch McGrayne explains their reticence in her impressively researched history of Bayes' theorem, The Theory That Would Not Die. The statistical method runs counter to the conviction that science requires objectivity and precision, she writes. Bayes' theorem “is a measure of belief. And it says that we can learn even from missing and inadequate data, from approximations, and from ignorance.”

A crucial example of the application of the theorem was Turing's cracking of the German naval cipher Enigma during the Second World War, which played a key part in the Allied victory in 1945. After the war, Turing's wartime assistant, I. J. 'Jack' Good, wrote about Turing's Bayesian technique for finding pairs and triplets of letters in the cipher. To avoid censorship under the UK Official Secrets Act, he described it in terms of bird watching.


Suppose a birder spotted 180 different species, many of which were represented by only one bird. Logically, other species must have been missed. A frequentist statistician would count those unseen species as zero, as if they could never be found. Turing, by contrast, assigned them a tiny non-zero probability, thereby factoring in that rare letter groupings might not be present in his current collection of intercepted messages but could appear in a larger sample. The same technique was later adopted in DNA sequencing and by artificial-intelligence analysts.
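The bird-watching idea survives today as what textbooks call the Good-Turing estimate. The sketch below shows only its simplest form, and it is an assumption that this matches the spirit of the wartime procedure, which the book review does not spell out. The function name `good_turing_unseen_mass` and the sightings log are invented for illustration. The estimate takes the share of all sightings that belong to species seen exactly once and treats that share as the probability that the next bird belongs to a species not yet recorded.

```python
from collections import Counter


def good_turing_unseen_mass(sightings):
    """Estimate the total probability of species that have not been seen yet.

    Basic Good-Turing estimate:
    (number of species seen exactly once) / (total number of sightings).
    """
    counts = Counter(sightings)          # species -> number of birds seen
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one sighting")
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


# Hypothetical log: two common species and two seen only once.
log = ["robin"] * 5 + ["wren"] * 3 + ["heron", "kestrel"]
p_unseen = good_turing_unseen_mass(log)

print(p_unseen)  # 0.2
```

Counting raw frequencies alone would give every unrecorded species a probability of exactly zero. Here, two singletons among ten sightings instead reserve a fifth of the probability for birds, or letter groupings, that have yet to turn up.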

Turing never mentioned Bayes; he may not even have been aware of the theorem. McGrayne speculates that Turing either rediscovered the idea himself in 1940, or heard about it through geophysicist Harold Jeffreys at the University of Cambridge, UK, who published his Bayesian treatise Theory of Probability in 1939. Good told McGrayne in an interview before his death in 2009 that he had asked Turing whether he was essentially using Bayes' theorem, to which Turing apparently replied: “I suppose.”

The advent of the cold war put Bayes' theorem into cold storage. Publication of the method used to decode Enigma was embargoed under the Official Secrets Act until the 1990s. During the 1950s, the influential statistician and geneticist R. A. Fisher continued his long-standing battle against Bayes' theorem, calling it an “impenetrable jungle”. When Good discussed the theorem at Britain's Royal Statistical Society, the next speaker began: “After that nonsense ...”.

In the United States, the small group of Bayesian statisticians came under suspicion as outsiders. During the McCarthy period of anti-communist sentiment, they were even considered 'un-American'. Professors at Harvard Business School referred to their Bayesian colleagues as “socialists and so-called scientists”.

Meanwhile, the US military and government were both applying Bayes' theorem, if reluctantly. In a gripping chapter, McGrayne describes how it was used in trying to find a hydrogen bomb that fell from a B-52 jet into the sea off Spain in 1966, and a US nuclear-powered attack submarine that disappeared in the Atlantic Ocean in 1968. President Lyndon B. Johnson raged to the investigating team: “I don't want this probability stuff. I want a plan that tells me exactly when we're going to find this bomb.”

In the event, Bayes' theorem was not needed. The missing bomb was located in an ocean canyon following a tip from a fisherman who had seen a parachute splash down near his boat. As for the Bayesian search for the submarine in the Atlantic, the consensus is that it would have succeeded had faster computers been available in 1968. High-speed computing is the main reason that Bayesian methods shook off their detractors and acquired their present-day prominence, McGrayne emphasizes.

For all the book's skilful mingling of ideas and intriguing personal details, I found it sloppy on occasions. Tautologies slip in, and evidence is lacking for some claims. Nonetheless, The Theory That Would Not Die is a rollicking tale of the triumph of a powerful mathematical tool.