The Black Box Society: The Secret Algorithms That Control Money and Information

Frank Pasquale
Harvard University Press: 2015. ISBN: 978-0-6743-6827-9

Everyone who uses the Internet for entertainment, education, news or commerce is implicated in a web of data collection whose breadth surpasses ordinary awareness.

Credit: Lee Woodgate/Getty

Last May, a US Senate investigation reported that a single visit to a popular tabloid-news website triggered activity on more than 350 other web servers. Most of those contacts, including delivery of advertisements, are likely to be benign. But they typically deposit a software 'cookie' on the visitor's computer; these enable the identification and tracking of visitors, generating digital profiles of their interests and patterns of online behaviour.
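The linking mechanism described above can be illustrated with a toy model. This is a minimal sketch, not a description of any real ad server: the class `ThirdPartyTracker`, its method names and the `.example` site names are all hypothetical, and real tracking involves HTTP headers and browser cookie rules that are omitted here. The point is only that one identifier, set once and sent back on every later request, is enough to tie visits on unrelated sites into a single profile.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of a third-party server embedded on many unrelated sites."""

    def __init__(self):
        # cookie ID -> list of sites on which that browser was seen
        self.profiles = defaultdict(list)

    def handle_request(self, cookie_id, embedding_site):
        # First contact: no cookie yet, so issue a fresh random identifier.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # Later requests carry the same ID, which links the visits together.
        self.profiles[cookie_id].append(embedding_site)
        return cookie_id


tracker = ThirdPartyTracker()
browser_cookie = None  # the visitor's browser starts with no cookie

# Hypothetical sites that all embed content from the same third party.
for site in ["tabloid.example", "shop.example", "news.example"]:
    browser_cookie = tracker.handle_request(browser_cookie, site)

# The tracker now holds one profile listing every site the browser visited.
print(tracker.profiles[browser_cookie])
```

The visitor never interacts with the tracker directly; the profile accumulates as a side effect of ordinary browsing.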

The Internet relies on user data to generate the revenue from tailored advertising that supports its growth and free use. But digital profiling ultimately helps to construct what law professor Frank Pasquale calls “the black box society”. As his exposé of that name shows, this is a society in which basic functions are performed in deliberate obscurity through the collection and algorithmic manipulation of personal data.

Black-box algorithms can be used to draw plausible inferences about a subject's location, age, medical condition, political affiliation and so on. The US retail giant Target, for instance, used data about purchases of vitamin supplements and oversized bags to deduce — for marketing purposes — whether its customers were pregnant. Because applying such algorithms to specific tasks can be economically advantageous for advertisers or lenders, the algorithms typically remain undisclosed as 'trade secrets'. But Pasquale worries that they can be used to shape what we know, how we are perceived and what opportunities we will be afforded. Increasing reliance on Google or other search engines, he notes, fosters greater dependence on their operating principles of selection and prioritization — which are largely opaque, beyond the fact that they reflect (and reinforce) popularity. Information sources that search engines neglect or suppress will not be discoverable. And if a search engine's algorithm were to subtly favour the company's own corporate interests, for instance, we might never know it.

Meanwhile, our reputations are defined in important ways by automated credit scoring and other algorithmic profiling activities used to determine, for instance, credit-worthiness or suitability for employment. Data from pharmacies concerning prescription purchases have been used by health insurers to deny individuals coverage. Yet our ability to inspect, correct or challenge such profiling is significantly limited. Black-box decisions also figure prominently in the finance industry, which uses them to exploit differential access to market-related information. Finance has acquired an undeserved mystique, Pasquale believes, by adopting computationally intense procedures to model and anticipate market behaviour. By exaggerating the validity of such models and concealing their risks from investors and regulators, some Wall Street firms exacerbated the financial crisis of 2008.

The insulation of black-box practices from public inspection or evaluation is close to the root of the problem: it tends to preclude independent oversight, error correction and even free-market competition. Remarkably, the US Congress has found it easier to elicit oversight information from intelligence fortresses such as the National Security Agency than from some Silicon Valley firms.

Pasquale provides an informative survey of developments in the representative fields of search, reputation and finance to bolster his argument that a laissez-faire approach to algorithmic decision-making is taking us to places where most of us will not want to go. As the power of advertising providers such as Google AdSense grows, for example, many online publications are seeing a decline in their advertising revenue. Homeland security 'fusion centres' are integrating government data collection (which is constrained by law) with unregulated information from private data brokers, in the name of information sharing. More promisingly, Pasquale points to the US Treasury's little-known Office of Financial Research, sometimes called “the CIA of finance”, which aims to provide regulators with real-time intelligence on financial markets. The book is full of instructive anecdotes on such matters, backed by useful research.

There are occasional lapses. Pasquale's remark that “Political dissent is a routine target for surveillance by the FBI”, for instance, is not accurate. More often, the book provides appropriately sceptical insights. Did Twitter somehow block the 2011 Occupy Wall Street protests from its own list of high-profile trending topics, critics wondered? The answer is no: trending is a reflection of a relative increase in prominence, not of absolute popularity, as Twitter officials eventually deigned to explain.
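The distinction between relative increase and absolute popularity can be made concrete with a small sketch. Twitter's actual trending algorithm is not disclosed in the text, so everything here is an assumption for illustration: the scoring rule (recent mentions divided by a historical baseline, with a smoothing constant), the function names and the mention counts are all hypothetical.

```python
def trend_score(recent_count, baseline_count, smoothing=1.0):
    """Ratio of recent mentions to the usual baseline (assumed scoring rule).

    The smoothing constant avoids division by zero for brand-new topics.
    """
    return (recent_count + smoothing) / (baseline_count + smoothing)


def rank_trending(topics):
    """Order topic names by relative increase, largest first."""
    return sorted(
        topics,
        key=lambda name: trend_score(topics[name]["recent"], topics[name]["baseline"]),
        reverse=True,
    )


# Hypothetical hourly mention counts, invented for this example.
topics = {
    "steady_big_topic": {"baseline": 50_000, "recent": 55_000},
    "sudden_small_topic": {"baseline": 200, "recent": 4_000},
}

ranking = rank_trending(topics)
print(ranking)
```

Under this rule the steady topic, despite having far more mentions in absolute terms, ranks below the smaller topic whose volume has jumped sharply. A heavily discussed subject that stays at a constant high level would therefore never appear to be trending.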

The opposite of the black-box society is an “intelligible society”, and Pasquale's discussion of it features remedial proposals big and small. Why couldn't the US Library of Congress provide a public book-search function to complement the digitization project Google Books? Why not commission a public credit-scoring system based on open-source software? The underlying question is, why can't the tools of algorithmic decision-making be turned against black-box systems in an open, accountable manner?

It is tempting to say that the political process needed to enact such reforms has itself become a black box. Yet in The Black Box Society, Pasquale finds reason to believe that even some of the most secretive and unresponsive institutions can be held to account. Elucidating the problem is a first step.