Introduction

Over the past 50 years, the notion of ‘heuristics’ has gained considerable attention in fields as diverse as psychology, cognitive science, decision theory, computer science, and management scholarship. While for 1970 the Scopus database finds a meagre 20 published articles with the word ‘heuristic’ in their title, the number had risen to no less than 3783 by 2021 (Scopus, 2022).

We take this to be evidence that many researchers in the aforementioned fields find the literature that refers to heuristics stimulating and that it gives rise to questions that deserve further enquiry. While there are some review articles on the topic of heuristics (Gigerenzer and Gaissmaier, 2011; Groner et al., 1983; Hertwig and Pachur, 2015; Semaan et al., 2020), a somewhat comprehensive and non-partisan historical review seems to be missing.

While interest in heuristics is growing, the very notion of heuristics remains elusive, to the point that, e.g., Shah and Oppenheimer (2008) begin their paper with the statement: ‘The word “heuristic” has lost its meaning.’ Even if one leaves aside characterizations such as ‘rule of thumb’ or ‘mental shortcut’ and considers what Kahneman (2011) calls ‘the technical definition of heuristic,’ namely ‘a simple procedure that helps find adequate, though often imperfect, answers to difficult questions,’ one is immediately left wondering how simple the procedure has to be, what an adequate but imperfect answer is, and how difficult the questions need to be, in order to classify a procedure as a heuristic. Shah and Oppenheimer conclude that ‘the term heuristic is vague enough to describe anything’.

However, one feature does distinguish heuristics from certain other, typically more elaborate procedures: heuristics are problem-solving methods that do not guarantee an optimal solution. The use of heuristics is, therefore, inevitable where no method to find an optimal solution exists or is known to the problem-solver, in particular where the problem and/or the optimality criterion is ill-defined. However, the use of heuristics may be advantageous even where the problem to be solved is well-defined and methods do exist which would guarantee an optimal solution. This is because definitions of optimality typically ignore constraints on the process of solving the problem and the costs of that process. Compared to infallible but elaborate methods, heuristics may prove to be quicker or more efficient.

Nevertheless, the range of what has been called heuristics is very broad. Application of a heuristic may require intuition, guessing, exploration, or experience; some heuristics are rather elaborate, others are truly shortcuts, some are described in somewhat loose terms, and others are well-defined algorithms.

One procedure of decision-making that is commonly not regarded as a heuristic is the application of the full-blown theory of subjective expected utility (SEU) in the tradition of Ramsey (1926), von Neumann and Morgenstern (1944), and Savage (1954). This theory arguably spells out what an ideally rational decision would be, but was already seen by Savage (p. 16) to be applicable only in what he called a ‘small world’. Quite a few approaches that have been called heuristics have been explicitly motivated by the fact that SEU imposes demands on the decision-maker which are utterly impractical (cf., e.g., Klein, 2001, for a discussion). As a second defining feature of the heuristics we want to consider, therefore, we take them to be procedures of decision-making that differ from the ‘gold standard’ of SEU by being practically applicable in at least a number of interesting cases. Along with SEU, we also leave aside the rules of deductive logic, such as Aristotelian syllogisms, modus ponens, modus tollens, etc. While these can also be seen as rules of decision-making, and the universal validity of some of them is not entirely uncontroversial (see, e.g., Priest, 2008, for an introduction to non-classical logic), they are widely regarded as ‘infallible’. In stark contrast, it seems characteristic of heuristics that their application may fail to yield a ‘best’ or ‘correct’ result.

By taking heuristics to be practically applicable, but fallible, procedures for problem-solving, we will also neglect the literature that focuses on the adjective ‘heuristic’ instead of on the noun. When, e.g., Suppes (1983) characterizes axiomatic analyses as ‘heuristic’, he is not suggesting any rule, but he is saying that heuristic axioms ‘seem intuitively to organize and facilitate our thinking about the subject’ (p. 82), and proceeds to give examples of both heuristic and nonheuristic axioms. It may of course be said that many fundamental equations in science, such as Newton’s force = mass × acceleration, have some heuristic value in the sense indicated by Suppes, but the research we will review is not about the property of being heuristic.

Given that heuristics can be assessed against the benchmark of SEU, one may distinguish broadly between heuristics suggested pre-SEU, i.e., before the middle of the 20th century, and the later research on heuristics that had to face the challenge of an existing theory of allegedly rational decision-making. We will review the former in the section “Deliberate heuristics—the art of invention” below, and devote sections “Herbert Simon: rationality is bounded”, “Heuristics in computer science” and “Daniel Kahneman and Amos Tversky: heuristics and biases” to the latter.

To cover the paradigmatic cases of what has been termed ‘heuristics’ in the literature, we have to take ‘problem-solving’ in a broad sense that includes decision-making and judgement, but also automatic, instinctive behaviour. We, therefore, feel that an account of research on heuristics should also review the main views on how observable behaviour patterns in humans—or maybe animals in general—can be explained. This we do in the section “Automatic heuristics: learnt or innate?”.

While our brief history cannot aim for completeness, we selected the scholars to be included based on their influence and contributions to different fields of research related to heuristics. Our focus, however, will be on the more recent research that may be said to begin with Herbert Simon.

That problem-solving according to SEU will, in general, be impractical, was clearly recognized by Herbert Simon, whose notion of bounded rationality we look at in the section “Herbert Simon: rationality is bounded”. In the section “Heuristics in computer science”, we also consider heuristics in computer science, where the motivation to use heuristics is closely related to Simon’s reasoning. In the section “Daniel Kahneman and Amos Tversky: heuristics and biases”, we turn to the heuristics identified and analysed by Kahneman and Tversky; while their assessment was primarily that the use of those heuristics often does not conform to rational decision-making, the approach by Gigerenzer and his collaborators, reviewed in the section “Gerd Gigerenzer: fast-and-frugal heuristics” below, takes a much more affirmative view on the use of heuristics. Section “Critiques” explains the limitations and critiques of the corresponding ideas. The final section “Conclusion” contains the conclusion, discussion, and avenues for future research.

The evolutionary perspective

While we focus on the history of research on heuristics, it is clear that animal behaviour patterns evolved and were shaped by evolutionary forces long before the human species emerged. Thus ‘heuristics’ in the mere sense of behaviour patterns were in use long before humans engaged in any kind of conscious reflection on decision-making, let alone systematic research. However, evolution endowed humans with brains that allow them to make decisions in ways that are quite different from animal behaviour patterns. According to Gibbons (2007), the peculiar evolution of the human brain began when ancient humans discovered fire and started cooking food, which reduced the amount of energy the body needed for digestion. This paved the way for a smaller intestinal tract and meant that the excess calories fuelled the development of larger tissues and eventually a larger brain. With this organ, intelligence increased dramatically, resulting in advanced communication that allowed Homo sapiens to collaborate and form relationships that other primates at the time could not match. According to Dunbar (1998), it was in the period between 400,000 and 100,000 years ago that the ability to hunt more effectively took humans from the middle of the food chain right to the top.

It does not seem to be known when and how exactly the human brain developed the ability to consciously reflect on decisions, but it is now widely recognized that in addition to the fast, automatic, and typically nonconscious type of decision-making that is similar to animal behaviour, humans also employ a rather different type of decision-making that can be characterized as slow, conscious, controlled, and reflective. The former type is known as ‘System 1’ or ‘the old mind’, the latter as ‘System 2’ or ‘the new mind’ (Evans, 2010; Kahneman, 2011), and both systems have evolved side by side throughout the evolution of the human brain. According to Gigerenzer (2021), humans as well as other organisms evolved to acquire what he calls ‘embodied heuristics’, which can be innate or learnt rules of thumb and which supply the agility to respond to a lack of information with fast judgements. Embodied heuristics draw on mental capacities, including the motor and sensory abilities that start to develop from the moment of birth.

While a detailed discussion of the ‘dual-process theories’ of the mind is beyond the scope of this paper, we find it helpful to point out that one may distinguish between ‘System 1 heuristics’ and ‘System 2 heuristics’ (Kahneman 2011, p. 98). While some ‘rules of decision-making’ may be hard-wired into the human species by its genes and physiology, others are complicated enough that their application typically requires reflection and conscious mental effort. Upon reflection, however, the two systems are not as separate as they may seem. For example, participants in the Mental Calculation World Cup instantly perform mathematical tasks for which ordinary people would need pen and paper or a calculator. Today, many people cannot multiply large numbers or calculate a square root using only pen and paper but can easily do so using the calculator app on their smartphone. Thus, what some can do by spontaneous, effortless calculation may for others require the application of a more or less complicated theory.

Nevertheless, one can loosely characterize the heuristics that have been explained and recommended for more or less well-specified purposes over the course of history as System 2 or deliberate heuristics.

Deliberate heuristics—the art of invention

Throughout history, scholars have investigated methods of solving complex tasks. In this section, we review attempts to formulate ‘operant and voluntary’ heuristics for solving demanding problems—in particular, to generate new insights or do research in more or less specified fields. Most of the heuristics in this section were suggested before the emergence of SEU theory and the associated modern definition of rationality, and none of them deals with the kind of decision problems that are assumed as ‘given’ in the SEU model. The reader will notice that some historical heuristics were suggested for problems that, today, may seem too general to be solved. However, such attempts inspired later scholars to develop a more concrete understanding of the notion of heuristics.

The Greek origin

The term heuristic originates from the Greek verb heurísko, which means to discover or find out. The Greek word heúrēka, allegedly exclaimed by Archimedes when he discovered how to measure the volume of an irregularly shaped object by means of water displacement, derives from the same verb and can be translated as I found it! (Pinheiro and McNeill, 2014). Heuristics can thus be said to be etymologically related to the discipline of discovery, the branch of knowledge based on investigative procedures, and are naturally associated with trial techniques, including what-if scenarios and simple trial and error.

While the term heurísko does not seem to be used in this context by Aristotle, his notion of induction (epagôgê) can be seen as a method to find, but not prove, true general statements and thus as a heuristic. At any rate, Aristotle considered inductive reasoning as leading to insights and as distinct from logically valid syllogisms (Smith, 2020).

Pappus (4th century)

While a brief, somewhat cryptic, mention of analysis and synthesis appears in Book 13 of some, but not all, editions of Euclid’s Elements, a clearer explanation of the two methods was given in the 4th century by the Greek mathematician and astronomer Pappus of Alexandria (cf. Heath, 1926; Polya, 1945; Groner et al., 1983). While synthesis is what today would be called deduction from known truths, analysis is a method that can be used to try to find a proof. Two slightly different explanations are given by Pappus. They boil down to this: in order to find a proof of a statement A, one can deduce another statement B from A, continue by deducing yet another statement C from B, and so on, until one comes upon a statement T that is known to be true. If all the inferences are convertible, the converse deductions evidently constitute a proof of A from T. While Pappus did not explicitly state the condition that the inferences must be convertible, his second explanation of analysis makes it clear that one must be looking for deductions from A which are both necessary and sufficient for A. In Polya’s paraphrase of Pappus’ text: ‘We enquire from what antecedent the desired result could be derived; then we enquire again what could be the antecedent of that antecedent, and so on, until passing from antecedent to antecedent, we come eventually upon something already known or admittedly true.’ Analysis thus described is hardly a ‘shortcut’ or ‘rule of thumb’, but quite clearly it is a heuristic: it may help to find a proof of A, but it may also fail to do so.
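For illustration, consider a simple modern example of analysis with convertible steps (our example, not Pappus’): to prove the statement A that the geometric mean of two non-negative numbers never exceeds their arithmetic mean, one works backwards from A:

```latex
\sqrt{ab} \le \frac{a+b}{2}
\;\Longleftrightarrow\; 2\sqrt{ab} \le a+b
\;\Longleftrightarrow\; 0 \le a - 2\sqrt{ab} + b
\;\Longleftrightarrow\; 0 \le \left(\sqrt{a}-\sqrt{b}\right)^{2}
\qquad (a, b \ge 0).
```

The final statement is known to be true, and since every step is an equivalence, i.e., convertible, reading the chain backwards constitutes the synthesis: a proof of A. Had one of the steps been merely necessary rather than sufficient, the analysis would have failed to deliver a proof.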

Al-Khawarizmi (9th century)

In the 9th century, the Persian thinker Mohamad Al-Khawarizmi, who resided at Baghdad’s centre of knowledge, the House of Wisdom, used stepwise methods for problem-solving; the concept of the algorithm, and its name, derive from his name and findings (Boyer, 1991). Although a heuristic orientation has sometimes been contrasted with an algorithmic one (Groner and Groner, 1991), it is worth noting that an algorithm may well serve as a heuristic—certainly in the sense of a shortcut, and also in the sense of a fallible method. After all, an algorithm may fail to produce a satisfactory result. We will return to this issue in the section “Heuristics in computer science” below.

Zairja (10th century)

Heuristic methods were created by medieval polymaths in their attempts to find solutions for the complex problems they faced—science not yet being divorced from what today would appear as theology or astrology. Perhaps the first tangible example of a heuristic based on a mechanical device was the zairja, an ancient tool that Arab astrologers employed before the 11th century (Ritchey, 2022). It was designed to reconfigure notions into ideas through randomization and resonance and thus to produce answers to questions mechanically (Link, 2010). The word zairja may have originated from the Persian combination zaicha-daira, which means horoscope-circle. According to Ibn Khaldoun, ‘zairja is the technique of finding out answers from questions by means of connections existing between the letters of the expressions used in the question; they imagine that these connections can form the basis for knowing the future happenings they want to know’ (Khaldun, 1967).

Ramon Llull (1305)

The Majorcan philosopher Ramon Llull (or Raimundus Lullus), who was exposed to Arabic culture, used the zairja as a starting point for his ars inveniendi veritatem, which was meant to complement the ars demonstrandi of medieval Scholastic logic and on which he worked from around 1270 until 1305, when he finished his Ars Generalis Ultima (or Ars Magna) (Link, 2010; Llull, 1308; Ritchey, 2022). Llull transformed the astrological and combinatorial components of the zairja into a religious system that took the fundamental ideas of the three Abrahamic faiths of Islam, Christianity, and Judaism and analysed them through symbolic and numeric reasoning. Llull tried to broaden his theory across all fields of knowledge and combine all sciences into a single science that would address all human problems. His thoughts impacted great thinkers, such as Leibniz, and even the modern theory of computation (Fidora and Sierra, 2011). Llull’s approach may be considered a clear example of heuristic methods applied to complicated and even theological questions (Hertwig and Pachur, 2015).

Joachim Jungius (1622)

Arguably, the German mathematician and philosopher Joachim Jungius was the first to use the term heuretica, in a call to establish a research society in 1622. Jungius distinguished between three degrees or levels of learning and cognition: empirical, epistemic, and heuristic. Those who have reached the empirical level believe that what they have learned is true because it corresponds to experience. Those who have reached the epistemic level know how to derive their knowledge from principles with rigorous evidence. But those who have reached the highest, heuristic, level have a method of solving unsolved problems, finding new theorems, and introducing new methods into science (Ritter et al., 2017).

René Descartes (1637)

In 1637, the French philosopher René Descartes published his Discourse on Method (one of the first major philosophical works not written in Latin). Descartes argued that humans could utilize mathematical reasoning as a vehicle for progress in knowledge. He proposed four simple steps to follow in problem-solving: first, to accept as true only what is indubitable; next, to divide the problem into as many smaller subproblems as possible and helpful; then, to conduct one’s thoughts in an orderly fashion, beginning with the simplest and gradually ascending to the most complex; and finally, to make enumerations so complete that one is assured of having omitted nothing (Descartes, 1998). Building on this method, Descartes (1908) worked on heuristic rules to transform every problem, where possible, into algebraic equations, thus creating a mathesis universalis or universal science. In his unfinished ‘Rules for the Direction of the Mind’ or Regulae ad directionem ingenii, Descartes suggested 21 heuristic rules (of a planned 36) for scientific research, such as simplifying the problem, restating the problem in geometrical form, and identifying the knowns and the unknowns. Although Leibniz criticized Descartes’ rules for being too general (Leibniz, 1880), this treatise outlined the basis for later work on complex problems in several disciplines.

Gottfried Wilhelm Leibniz (1666)

Influenced by the ideas of Llull, Jungius, and Descartes, the German polymath Gottfried Wilhelm Leibniz suggested an original approach to problem-solving in his Dissertatio de Arte Combinatoria, published in Leipzig in 1666. His aim was to create a new universal language into which all problems could be translated, and a standard solving procedure that could be applied regardless of the type of problem. Leibniz also defined an ars inveniendi as a method for finding new truths, distinguishing it from an ars iudicandi, a method for evaluating the validity of alleged truths. Later, in 1673, he invented a calculating machine that could execute all four arithmetic operations and thus find ‘new’ arithmetic truths (Pombo, 2002).

Bernard Bolzano (1837)

In 1837, the Czech mathematician and philosopher Bernard Bolzano published his four-volume Wissenschaftslehre (Theory of Science). The fourth part, which he called ‘Erfindungskunst’ or the art of invention, mentions in its introductory section 322 that ‘heuristic’ is just the Greek translation of that name. Bolzano explains that the rules he is going to state are not at all entirely new, but instead have always been used ‘by the talented’—although mostly not consciously. He then explains 13 general and 33 special rules one should follow when trying to find new truths. Among the general rules are, e.g., that one should first decide on the question one wants to answer and the kind of answer one is looking for (section 325), or that one should choose suitable symbols to represent one’s ideas (section 334). Unlike the general rules, the special ones are meant to be helpful for particular mental tasks only. E.g., in order to find the reason for any given truth, Bolzano advises first analysing or dissecting the truth into its parts and then using those to form truths which are simpler than the given one (section 378). Another example is Bolzano’s special rule 28, explained in section 386, which is meant to help identify the intention behind a given action. To do so, Bolzano advises exploring the agent’s beliefs about the effects of his action at the time he decided to act, and explains that this will require investigating the agent’s knowledge, his degree of attention and deliberation, any erroneous beliefs the agent may have had, and ‘many other circumstances’. Bolzano goes on to point out that any effect the agent may have expected to result from his action will not be an intended one if he considered it neither an obligation nor advantageous. While Bolzano’s rules can hardly be considered ‘shortcuts’, he mentions again and again that they may fail to solve the task at hand adequately (cf. Hertwig and Pachur, 2015; Siitonen, 2014).

Frank Ramsey (1926)

In Ramsey’s pathbreaking paper on ‘Truth and Probability’, which laid the foundation of subjective probability theory, a final section that has received little attention in the literature is devoted to inductive logic. While he does not use the word ‘heuristic’, he characterizes induction as a ‘habit of the mind’, explaining that he uses ‘habit in the most general possible sense to mean simply rule or the law of behaviour, including instinct’, but also including ‘acquired rules’. Ramsey gives the following pragmatic justification for being convinced by induction: ‘our conviction is reasonable because the world is so constituted that inductive arguments lead on the whole to true opinions’, and states more generally that ‘we judge mental habits by whether they work, i.e., whether the opinions they lead to are for the most part true, or more often true than those which alternative habits would lead to’ (Ramsey, 1926). In modern terminology, Ramsey was pointing out that mental habits—such as inductive inference—may be more or less ‘ecologically rational’.

Karl Duncker (1935)

Karl Duncker was a pioneer in the experimental investigation of human problem-solving. In his 1935 book Zur Psychologie des produktiven Denkens, he discussed both heuristics that help to solve problems and hindrances that may block the solution of a problem—and reported on a number of experimental findings. Among the heuristics were a situational analysis aimed at uncovering the reasons for the gap between the status quo and the problem-solver’s goal, an analysis of the goal itself and of the sacrifices the problem-solver is willing to make, an analysis of the prerequisites for the solution, and several others. Among the hindrances to problem-solving was what Duncker called functional fixedness, illustrated by the famous candle problem, in which he asked participants to fix a candle to the wall and light it without allowing the wax to drip. The available tools were a candle, matches, and a box filled with thumbtacks. The solution was to empty the box of thumbtacks, fix the empty box to the wall using the thumbtacks, put the candle in the box, and finally light the candle. Participants who were given the empty box as a separate item could solve this problem, while those given the box filled with thumbtacks struggled to find a solution. Through this experiment, Duncker illustrated an inability to think outside the box and the difficulty of using a device in a way that differs from the usual one (Glaveanu, 2019). Duncker emphasized that success in problem-solving depends on a complementary combination of both the internal mind and the external problem structure (cf. Groner et al., 1983).

George Polya (1945)

The Hungarian mathematician George Polya can aptly be called the father of problem-solving in modern mathematics and education. In his 1945 book How to Solve It, Polya writes that ‘heuristic…or “ars inveniendi” was the name of a certain branch of study…often outlined, seldom presented in detail, and as good as forgotten today’, and he attempts to ‘revive heuristic in a modern and modest form’. According to his four principles of mathematical problem-solving, it is first necessary to understand the problem, then to devise a plan, then to carry out the plan, and finally to look back and search for opportunities to improve. Among the more detailed suggestions for problem-solving explained by Polya are to ask questions such as ‘can you find the solution to a similar problem?’, to use inductive reasoning and analogy, and to choose a suitable notation. Procedures inspired by Polya’s (1945) book and several later ones (e.g., Induction and Analogy in Mathematics of 1954) also informed the field of artificial intelligence (AI) (Hertwig and Pachur, 2015).

Johannes Müller (1968)

In 1968, the German scientist Johannes Müller introduced the concept of systematic heuristics while working on his postdoctoral thesis at the Chemnitz University of Technology. Systematic heuristics is a framework for improving the efficiency of intellectual work using problem-solving processes in the fields of science and technology.

The main idea of systematic heuristics is to solve repeated problems with previously validated solutions. These methods are called programmes and are gathered in a library that can be accessed by the main programme, which receives the requirements, prepares the execution plan, determines the required procedures, executes the plan, and finally evaluates the results. Müller’s team was dismissed for ideological reasons, and his programme was terminated after a few years, but his findings went on to be successfully applied in many projects across different industries (Banse and Friedrich, 2000).

Imre Lakatos (1970)

In his ‘Methodology of Scientific Research Programmes’ that turned out to be a major contribution to the Popper–Kuhn controversy about the rationality of non-falsifiable paradigms in the natural sciences, Lakatos introduced the interesting distinction between a ‘negative heuristic’ that is given by the ‘hard core’ of a research programme and the ‘positive heuristic’ of the ‘protective belt’. While the latter suggests ways to develop the research programme further and to predict new facts, the ‘hard core’ of the research programme is treated as irrefutable ‘by the methodological decision of its protagonists: anomalies must lead to changes only in the ‘protective’ belt’ of auxiliary hypotheses. The Lakatosian notion of a negative heuristic seems to have received little attention outside of the Philosophy of Science community but may be important elsewhere: when there are too many ways to solve a complicated problem, excluding some of them from consideration may be helpful.

Gerhard Kleining (1982)

The German sociologist Gerhard Kleining suggested a qualitative heuristic as the appropriate research method for qualitative social science. It is based on four principles: (1) the open-mindedness of the scientist, who should be ready to revise his preconceptions about the topic of study; (2) the openness of the topic of study, which is initially defined only provisionally and allowed to be modified in the course of the research; (3) maximal variation of the research perspective; and (4) identification of similarities within the data (Kleining, 1982, 1995).

Automatic heuristics: learnt or innate?

Unlike the deliberate, and in some cases quite elaborate, heuristics reviewed above, at least some System 1 heuristics are often applied automatically, without any kind of deliberation or conscious reflection on the task that needs to be performed or the question that needs to be answered. One may view them as mere patterns of behaviour, and as such their scientific examination has been a long cumulative process through different disciplines, even though explicit reference to heuristics was not often made.

Traditionally, the examination of the behaviour patterns of living creatures—any study concerning thoughts, feelings, or cognitive abilities—was regarded as the task of biologists. However, the birth of psychology as a separate discipline paved the way for an alternative outlook. Evolutionary psychology views human behaviour as having been shaped through time and experience to promote survival throughout the long history of human struggle with nature. With many factors to consider, scholars have been interested in the evolution of the human brain, patterns of behaviour, and problem-solving (Buss and Kenrick, 1998).

Charles Darwin (1873)

Charles Darwin himself arguably qualifies for the title of first evolutionary psychologist, as his insights laid the foundations for a field that would continue to grow over a century later (Ghiselin, 1973).

In 1873, Darwin claimed that the brain’s expressions of emotion have probably developed in much the same way as its physical traits (Baumeister and Vohs, 2007). He acknowledged that personal demonstrations or expressions have a high capacity for interaction with peers of the same species. For example, an aggressive look signals an eagerness to battle yet leaves the recipient with the option of retreating without either party being harmed. Additionally, Darwin, like his predecessor Lamarck, consistently emphasized the role of environmental factors in ‘the struggle for existence’ that could shape an organism’s traits in response to changes in its environment (Sen, 2020). The famous example of giraffes growing long necks in response to trees growing taller illustrates such a major environmental effect. Similarly, cognitive skills, including heuristics, must also have been shaped by the environment for humans to keep surviving and reproducing.

Darwin’s ideas impacted the early advancement of brain science, psychology, and all related disciplines, including the topic of cognitive heuristics (Smulders, 2009).

William James (1890)

A few years later, in 1890, the father of American psychology, William James, introduced the notion of evolutionary psychology in his 1200-page text The Principles of Psychology, which later became a standard reference on the subject and helped establish psychology as a science. At its core, James reasoned that many human actions demonstrate the activity of instincts, the evolutionarily embedded inclinations to react to specific incentives in adaptive manners. With this idea, James added an important building block to the foundation of heuristics as a scientific topic.

A simple example of such hard-wired behaviour patterns would be a sneeze, the preprogrammed reaction of convulsive nasal expulsion of air from the lungs through the nose and mouth to remove irritants (Baumeister and Vohs, 2007).

Ivan Pavlov (1897)

Triggered by scientific curiosity or the instinct for research, as he called it, the first Russian Nobel laureate, Ivan Pavlov, introduced classical conditioning, which occurs when a stimulus is used that has a predictive relationship with a reinforcer, resulting in a change in response to the stimulus (Schreurs, 1989). This learning process was demonstrated through experiments conducted with dogs. In the experiments, a bell (a neutral stimulus) was paired with food (a potent stimulus), resulting ultimately in the dogs salivating at the ringing of the bell—a conditioned response. Pavlov’s experiments remain paradigmatic cases of the emergence of behaviour patterns through association learning.

William McDougall (1909)

At the start of the 20th century, the Anglo-American psychologist William McDougall was one of the first to write about the instinct theory of motivation. McDougall argued that instincts trigger many critical social practices. He viewed instincts as extremely sophisticated faculties in which specific provocations such as social impediments can drive a person’s state of mind in a particular direction, for example, towards a state of hatred, envy, or anger, which in turn may increase the probability of specific practices such as hostility or violence (McDougall, 2015).

However, in the early 1920s, McDougall’s perspective about human behaviour being driven by instincts faded remarkably as scientists supporting the concept of behaviourism started to get more attention with original ideas (Buss and Kenrick, 1998).

John B. Watson (1913)

The pioneer of the psychological school of behaviourism, John B. Watson, who conducted the controversial ‘Little Albert’ experiment by imposing a phobia on a child to demonstrate classical conditioning in humans (Harris, 1979), argued against the ideas of McDougall, even in public debates (Stephenson, 2003). Unlike McDougall, Watson considered the brain an empty page (the tabula rasa described by Aristotle). According to him, all personality traits and behaviours directly result from the accumulated experience that starts from birth. Thus, the story of the human mind is a continuous writing process shaped by surrounding events and factors. This perception was supported in the following years of the 20th century by anthropologists who documented very different social standards in different societies, and numerous social researchers argued that this wide cross-cultural variety should lead to the conclusion that there is no mental content built in from birth and that all knowledge therefore comes from individual experience or perception (Farr, 1996). In stark contrast to McDougall, Watson suggested that human intuitions and behaviour patterns are the product of a learning process that starts blank.

B. F. Skinner (1938)

Inspired by the work of Pavlov, the American psychologist B.F. Skinner took the classical conditioning approach to a more advanced level by modifying a key aspect of the process. According to Skinner, human behaviour depends on the outcomes of past activities: if the outcome is bad, the action will probably not be repeated; if the outcome is good, the likelihood of the activity being repeated is relatively high. Skinner called this process reinforcement learning (Schacter et al., 2011). Based on reinforcement learning, Skinner also introduced the concept of operant conditioning, a type of associative learning process through which the strength of a behaviour is adjusted by reinforcement or punishment. Considering, for example, a parent’s response to a child’s behaviour, the probability of the child repeating an action will depend highly on the parent’s reaction (Zilio, 2013). In effect, Skinner argued that the intuitive System 1 may get edited, and that a heuristic cue may become more or less ‘hard-wired’ in the subject’s brain as a stimulus leading to an automatic response.
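Skinner’s principle translates naturally into the update rules of modern reinforcement learning. The following minimal sketch is our formalization for illustration, not Skinner’s own; it nudges the tendency to repeat an action upwards after rewards and downwards after punishments:

```python
def update_preference(strength, reward, learning_rate=0.1):
    """Nudge the tendency to repeat an action after a good or bad outcome.

    strength: current tendency to repeat the action, between 0 and 1
    reward:   +1 for a good outcome, -1 for a bad one
    """
    if reward > 0:
        strength += learning_rate * (1 - strength)   # reinforcement
    else:
        strength -= learning_rate * strength         # punishment / extinction
    return strength

# A behaviour that is rewarded repeatedly becomes nearly habitual,
# i.e., 'hard-wired'; one that is punished is gradually extinguished.
s = 0.5
for _ in range(20):
    s = update_preference(s, reward=+1)
print(round(s, 2))  # approaches 1.0
```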

The DNA and its environment (1953 onwards)

Today, there seems to be wide agreement that behaviour patterns in humans and other species are to some extent ‘in the DNA’, the structure of which was discovered by Francis Crick and James Watson in 1953, but that they also to some extent depend on ‘the environment’—including the social environment in which the agent lives and has problems to solve. It seems safe to say, therefore, that the methods of problem-solving that humans apply are neither completely innate nor completely the result of environmental stimuli, but rather the product of the complex interaction between genes and the environment (Lerner, 1978).

Herbert Simon: rationality is bounded

Herbert Simon is well known for his contributions to several fields, including economics, psychology, computer science, and management. His theory of bounded rationality earned him the Nobel Prize in Economics in 1978.

Bounded rationality and satisficing

In the mid-1950s, Simon published A Behavioural Model of Rational Choice, which focused on bounded rationality: the idea that people must make decisions with limited time, mental resources, and information (Simon, 1955). He clearly identified the triangle of limitations present in every decision-making process: the available information, the available time, and the cognitive ability of the decision-maker (Bazerman and Moore, 1994). Simon’s ideas are considered an inspiring foundation for many technologies in use today.

Instead of conforming to the idea that economic behaviour can be seen as rational and dependent on all accessible data (i.e., as optimization), Simon suggested that the dynamics of decision-making were essentially ‘satisficing’, a notion synthesized from ‘satisfy’ and ‘suffice’ (Byron, 1998). During the 1940s, scholars had noticed the frequent failure of two assumptions required for ‘rational’ decision-making. The first is that complete and perfect data is available: in fact, data is never complete and may be far from perfect, yet people reliably make decisions based on incomplete data. The second is that people assess every feasible option before settling on a decision: in fact, they do not, a behaviour highly correlated with the cost of data collection, since data becomes progressively harder and costlier to accumulate. Rather than trying to find the ideal option, people choose the first acceptable or satisfactory option they find. Simon described this procedure as satisficing and concluded that the human brain in the decision-making process would, at best, exhibit restricted abilities (Barros, 2010).

Since people can neither obtain nor process all the data needed to make a completely rational decision, they use the limited data they possess to determine an outcome that is ‘good enough’—a procedure later refined into the take-the-best heuristic. Simon’s view that people are bounded by their cognitive limits is usually known as the theory of bounded rationality (cf. Gigerenzer and Selten, 2001).
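In procedural terms, satisficing can be sketched as follows. This is a minimal illustration with an invented aspiration level and option stream, not Simon’s formal model:

```python
def satisfice(options, evaluate, aspiration_level):
    """Return the first option whose evaluated utility meets the
    aspiration level, examining options in the order encountered.

    Unlike optimization, the search stops as soon as a 'good enough'
    option turns up; later, possibly better, options are never seen.
    """
    for option in options:
        if evaluate(option) >= aspiration_level:
            return option
    return None  # no option satisfied the aspiration level

# Example: accept the first flat whose score reaches 7 out of 10.
flats = [("A", 5), ("B", 7), ("C", 9)]
choice = satisfice(flats, evaluate=lambda flat: flat[1], aspiration_level=7)
print(choice)  # ('B', 7) -- flat C is never inspected
```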

Herbert Simon and AI

With the cooperation of Allen Newell of the RAND Corporation, Simon attempted to create a computer simulator of human decision-making. In 1956, they created a ‘thinking’ machine called the ‘Logic Theorist’. This early smart device was a computer programme with the ability to prove theorems in symbolic logic. It was perhaps the first man-made programme that simulated some human reasoning abilities to solve actual problems (Gugerty, 2006). A few years later, Simon, Newell, and J.C. Shaw proposed the General Problem Solver or GPS, one of the earliest AI programmes. They aimed to create a single programme that could solve all problems with the same unified algorithm. However, while the GPS was efficient with sufficiently well-structured problems like the Towers of Hanoi (a puzzle in which differently sized disks are moved across three rods), it could not solve real-life scenarios with all their complexities (Newell et al., 1959).
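The Towers of Hanoi illustrates what ‘well-structured’ means here: states, moves, and goal are fully specified, and a complete recursive solution exists. A minimal sketch for illustration:

```python
def hanoi(n, source, target, spare):
    """Move n disks from source to target using spare as buffer.

    The problem is well-structured: exactly 2**n - 1 moves solve it,
    which is why programmes like GPS could handle it.
    """
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)   # clear the way
    print(f"move disk {n} from {source} to {target}")
    hanoi(n - 1, spare, target, source)   # restack on top

hanoi(3, "A", "C", "B")  # prints the 7 moves for three disks
```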

By 1965, Simon was confident that ‘machines will be capable of doing any work a man can do’ (Vardi, 2012). Therefore, Simon dedicated most of the remainder of his career to the advancement of machine intelligence. The results of his experiments showed that, like humans, certain computer programmes make decisions using trial-and-error and shortcut methods (Frantz, 2003). Quite explicitly, Simon and Newell (1958, p. 7) referred to heuristics being used by both humans and intelligent machines: ‘Digital computers can perform certain heuristic problem-solving tasks for which no algorithms are available… In doing so, they use processes that are closely parallel to human problem-solving processes’.

Additionally, the importance of the environment was clearly observed in Newell and Simon’s (1972) work:

‘Just as scissors cannot cut paper without two blades, a theory of thinking and problem-solving cannot predict behaviour unless it encompasses both an analysis of the structure of task environments and an analysis of the limits of rational adaptation to task requirements’ (p. 55).

Accordingly, the term ‘task environment’ describes the formal structure of the universe of choices and results for a specific problem. At the same time, Newell and Simon do not treat the agent and the environment as two isolated entities, but rather as highly related. Consequently, they tend to believe that agents with different cognitive abilities and choice repertoires will inhabit different task environments even though their physical surroundings and intentions might be the same (Agre and Horswill, 1997).

Heuristics in computer science

Computer science as a discipline may have the biggest share of deliberately applied heuristics. As heuristic problem-solving has often been contrasted with algorithmic problem-solving—even by Simon and Newell (1958)—it is worth recalling that the very notion of ‘algorithm’ was clarified only in the first half of the 20th century, when Alan Turing (1937) defined what was later named the ‘Turing machine’. Basically, he defined ‘mechanical’ computation as computation that can be done by a—stylized—machine. ‘Mechanical’ being what is also known today as algorithmic, one can say that any procedure that can be performed by a digital computer is algorithmic. Nevertheless, many such procedures are also heuristics, because an algorithm may fail to produce an optimal solution to the problem it is meant to solve. This may be so either because the problem is ill-defined or because the computations required to produce the optimal solution are not feasible with the available resources. If the problem is ill-defined—as it often is, e.g., in natural language processing—the algorithm that does the processing has to rely on a well-defined model that does not capture the vagueness and ambiguities of the real-life problem, a problem typically stated in natural language. If the problem is well-defined, but finding the optimal solution is not feasible, algorithms that would find it may exist ‘in principle’ but require too much time or memory to be practically implemented.

In fact, there is today a rich theory of complexity classes that distinguishes between types of (well-defined) problems according to how fast the time or memory space required to find the optimal solution increases with increasing problem size. E.g., for problem types of the complexity class P, there is a deterministic algorithm that produces the optimal solution with a running time bounded by a polynomial function of the input size, whereas for problems of the complexity class EXPTIME, the running time is bounded only by an exponential function of the input size. In the jargon of computer science, problems of the latter class are considered intractable, although the input size has to become sufficiently large before the computation of the optimal solution becomes practically infeasible (cf. Harel, 2000; Hopcroft et al., 2007). Research indicates that the computational complexity of problems can also reduce the quality of human decision-making (Bossaerts and Murawski, 2017).
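A small computation illustrates why exponential or factorial running times are considered intractable while polynomial ones are not; the functions below are generic stand-ins rather than the complexity of any particular problem:

```python
import math

# Illustrative step counts: polynomial vs exponential vs factorial growth.
for n in (10, 20, 30, 60):
    print(f"n={n:>2}  n^3={n**3:>12,}  2^n={2**n:>22,}  n!={math.factorial(n):.2e}")
```

Already at n = 60, the factorial count exceeds the estimated number of atoms in the observable universe.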

Shortest path algorithms

A classic optimization problem that may serve to illustrate the issues of optimal solution, complexity, and heuristics goes by the name of the travelling salesman problem (TSP), first introduced in 1930. In this problem, several cities with given pairwise distances are considered, and the goal is to find the shortest possible path that visits all cities and returns to the starting point. For a small input size, i.e., for a small number of cities, the ‘brute-force’ algorithm is easy to use: write down all the possible paths through all the cities, calculate their lengths, and choose the shortest. However, the number of steps required by this procedure quickly increases with the number of cities. The TSP is today known to belong to the complexity class NP, which lies in between P and EXPTIME (see footnote 1). To solve the TSP, Jon Bentley (1982) proposed the greedy (or nearest-neighbour) algorithm, which yields an acceptable result, but not necessarily the optimal one, within a relatively short time. This approach always picks the nearest neighbour as the next city to visit, without regard to possible later non-optimal steps. Hence, it is considered a good-enough solution with fast results. Bentley argued that better solutions may exist, but that the greedy result approximates the optimal one. Many other heuristic algorithms have been explored since. There is no assurance that the solution found by a heuristic algorithm will be an ideal answer to the given problem, but it is acceptable and adequate (Pearl, 1984).
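To make the contrast concrete, the following sketch runs both approaches on an invented four-city instance. At this size both are instant, but the brute-force branch examines (n − 1)! tours and soon becomes infeasible as n grows:

```python
import itertools
import math

# Hypothetical four-city instance; the coordinates are invented.
cities = {"A": (0, 0), "B": (1, 5), "C": (4, 1), "D": (6, 4)}

def dist(p, q):
    return math.dist(cities[p], cities[q])

def tour_length(order):
    return sum(dist(order[i], order[(i + 1) % len(order)])
               for i in range(len(order)))

start = "A"
others = [c for c in cities if c != start]

# Brute force: guaranteed optimal, but it examines (n-1)! tours.
best = min(((start,) + perm for perm in itertools.permutations(others)),
           key=tour_length)

# Nearest-neighbour (greedy) heuristic: fast, but not guaranteed optimal.
tour, current, unvisited = [start], start, set(others)
while unvisited:
    current = min(unvisited, key=lambda c: dist(current, c))
    tour.append(current)
    unvisited.remove(current)

print(best, round(tour_length(best), 2))                   # optimal tour
print(tuple(tour), round(tour_length(tuple(tour)), 2))     # greedy tour
```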

Heuristic shortest-path algorithms are utilized nowadays by GPS frameworks and self-driving vehicles to choose the best route from any point of departure to any destination (for example, the A* search algorithm). Further developed algorithms can also consider additional elements, including traffic, speed limits, and road quality; they may yield not only the shortest routes in terms of distance but also the fastest ones in terms of driving time.

Computer chess

While the TSP consists of a whole set of problems that differ by the number of cities and the distances between them, determining the optimal strategy for chess is a single problem of a given size. The rules of chess make it a finite game, and Ernst Zermelo proved in 1913 that it is ‘determined’: if it were played between perfectly rational players, it would always end with the same outcome: either White always wins, or Black always wins, or it always ends with a draw (Zermelo, 1913). Up to the present day, it is not known which of the three is true, which points to the fact that a brute-force algorithm that would go through all possible plays of chess is practically infeasible: it would have to explore far too many potential moves, and the required memory would quickly be exhausted (Schaeffer et al., 2007). Inevitably, a chess-playing machine has to use algorithms that are ‘shortcuts’—which can be more or less intelligent.

While Simon and Newell had predicted in 1958 that within ten years the world chess champion would be a computer, it took until 1997, when a chess-playing machine developed by IBM under the name Deep Blue defeated the reigning world champion Garry Kasparov. Although able to analyse millions of possibilities thanks to their computing power, today’s chess-playing machines apply a heuristic approach to eliminate unlikely moves and focus on those with a high probability of defeating their opponent (Newborn, 1997).
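The textbook formalization of such a shortcut is depth-limited minimax search with a heuristic evaluation function. The sketch below is generic (evaluate, moves, and apply_move are placeholders to be supplied by the game) and is not Deep Blue’s actual algorithm, which added refinements such as alpha-beta pruning and handcrafted evaluation terms:

```python
def minimax(state, depth, maximizing, evaluate, moves, apply_move):
    """Depth-limited minimax: the classic heuristic shortcut for game trees.

    Searching to the end of the game is infeasible for chess, so the
    search stops at a fixed depth and falls back on a heuristic
    evaluation function (e.g., material balance).
    """
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)  # heuristic estimate, not a proven game value
    values = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      evaluate, moves, apply_move) for m in legal]
    return max(values) if maximizing else min(values)

# Toy usage: players alternately add 1 or 2 to a counter capped at 10;
# the maximizer prefers high counters, searching three plies deep.
value = minimax(0, 3, True,
                evaluate=lambda s: s,
                moves=lambda s: [1, 2] if s < 10 else [],
                apply_move=lambda s, m: s + m)
print(value)  # -> 5
```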

Machine learning

One of the main features of machine learning is the ability of a model to predict a future outcome based on past data points. Machine learning algorithms build a knowledge base, similar to human experience, from the previous examples in the dataset provided. From this knowledge base, the model can derive educated guesses.

A good demonstration of this is the card game Top Trumps, in which a model can learn to play and keep improving until it dominates the game. It does so by following a learning path through a sequence of steps in which it picks two random cards from the deck and then analyses and compares them on random criteria. According to the winning result, the model iteratively updates its knowledge base in the same manner as a human, following the rule that ‘practice makes perfect.’ Hence, the model will play, collect statistics, update, and iterate, becoming more accurate with each increment (Volz et al., 2016).
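A minimal sketch of such a learn-by-playing loop might look as follows; the deck and categories are invented, and this is not the actual model of Volz et al. (2016):

```python
import random
from collections import defaultdict

# Hypothetical deck: each card scores on the same categories.
deck = [
    {"speed": 9, "power": 3}, {"speed": 2, "power": 8},
    {"speed": 6, "power": 5}, {"speed": 4, "power": 7},
]
wins = defaultdict(int)    # how often each category won a comparison
plays = defaultdict(int)

for _ in range(1000):
    a, b = random.sample(deck, 2)
    category = random.choice(list(a))      # explore a random criterion
    plays[category] += 1
    if a[category] > b[category]:          # first card wins on this criterion
        wins[category] += 1

# The learnt 'knowledge base': an estimated win rate per category,
# which the model can later exploit when choosing which category to play.
for cat in plays:
    print(cat, round(wins[cat] / plays[cat], 2))
```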

Natural language processing

In the world of language understanding, current technologies are far from perfect, but models are becoming more reliable by the minute. When a search phrase is entered into the Google search engine, a background model analyses and dissects it, trying to make sense of the search criteria. Word stemming, context analysis, phrase affiliation, previous searches, and autocorrect/autocomplete can be combined in a heuristic algorithm to display the most relevant results in less than a second. Heuristic methods can be utilized when creating algorithms to understand what the user is trying to express when searching for a phrase. For example, using word affiliation, an algorithm tries to narrow down the meaning of words as much as possible towards the user’s intention, particularly when a word has more than one meaning that changes with the context. A search for apple pie thus allows the algorithm to deduce that the user is interested in recipes rather than in the technology company (Sullivan, 2002).
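A toy version of this word-affiliation heuristic can be written as a simple overlap score; the sense inventories below are invented for illustration and bear no relation to Google’s actual models:

```python
# Hypothetical sense inventories: words that tend to co-occur with each sense.
senses = {
    "apple (fruit)": {"pie", "recipe", "bake", "cinnamon"},
    "Apple (company)": {"iphone", "mac", "stock", "store"},
}

def disambiguate(query):
    """Pick the sense whose affiliated words overlap most with the query."""
    words = set(query.lower().split())
    return max(senses, key=lambda s: len(words & senses[s]))

print(disambiguate("apple pie recipe"))   # -> apple (fruit)
print(disambiguate("apple stock price"))  # -> Apple (company)
```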

Search and big data

Search is a good example of the value of time, as one of the most important criteria is retrieving acceptable results within an acceptable timeframe. In a full search, especially over large datasets, retrieving the optimal result can take a massive amount of time, which makes it necessary to apply heuristic search.

Heuristic search is a type of search algorithm that is used to find solutions to problems faster than an exhaustive search would. It uses specific criteria to guide the search process and focuses on the more favourable areas of the search space. This can greatly reduce the number of nodes that must be explored to find a solution, especially in large or complex search trees.

Heuristic search algorithms work by evaluating the possible paths or states in a search tree and selecting the more promising ones to explore further. They use a heuristic function, which is a measure of how close a given state is to the goal state, to guide the search. This allows the algorithm to prioritize certain paths or states over others and to avoid exploring areas of the search space that are unlikely to lead to a solution. The solution reached is not necessarily the best one; however, a ‘good enough’ solution is found within a ‘fast enough’ time. This technique is an example of a trade-off between optimality and speed (Russell et al., 2010).
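As a concrete sketch, greedy best-first search orders its frontier by the heuristic value alone; the graph and heuristic values below are invented (the A* algorithm mentioned earlier would add the accumulated path cost to the priority):

```python
import heapq

# Hypothetical search graph and estimated distances h to the goal G.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["A", "G"], "G": []}
h = {"S": 6, "A": 2, "B": 4, "G": 0}

def best_first(start, goal):
    """Always expand the node that looks closest to the goal according to h."""
    frontier = [(h[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph[node]:
            heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None

print(best_first("S", "G"))  # -> ['S', 'A', 'G']
```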

Today, there is a rich literature on heuristic methods in computer science (Martí et al., 2018). As the problem to be solved may itself be the choice of a suitable heuristic algorithm, meta-heuristics have also been explored (Glover and Kochenberger, 2003), and even hyper-heuristics, which may serve to find or generate a suitable meta-heuristic (Burke et al., 2003). As Sörensen et al. (2018) point out, the term ‘metaheuristic’ may refer either to an ‘algorithmic framework that provides a set of guidelines or strategies to develop heuristic optimization algorithms’—or to a specific algorithm that is based on such a framework. E.g., a metaheuristic to find a suitable search algorithm may be inspired by the framework of biological evolution and use its ideas of mutation, reproduction, and selection to produce a particular search algorithm. While this algorithm will still be a heuristic one, the fact that it has been generated by an evolutionary process indicates its superiority over alternatives that were eliminated in the course of that process (cf. Vikhar, 2016).
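The evolutionary framework can be sketched in a few lines. Here the candidate solutions are plain bit strings scored by a toy fitness function; a genuine metaheuristic of the kind described above would instead score candidate heuristics on benchmark problem instances:

```python
import random

def evolve(fitness, length=20, pop_size=30, generations=100, mutation_rate=0.05):
    """Generic evolutionary loop: selection, reproduction, mutation."""
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]            # selection
        children = []
        for parent in survivors:                           # reproduction
            child = [bit ^ (random.random() < mutation_rate)  # mutation
                     for bit in parent]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

# Toy fitness: the number of ones in the bit string ('OneMax').
best = evolve(fitness=sum)
print(sum(best), best)
```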

Daniel Kahneman and Amos Tversky: heuristics and biases

Inspired by the concepts of Herbert Simon, psychologists Daniel Kahneman and Amos Tversky initiated the heuristics and biases research programme in the early 1970s, which emphasized how individuals make judgements and the conditions under which those judgements may be inaccurate (Kahneman and Klein, 2009).

In addition, Kahneman and Tversky emphasized information processing to elaborate on how real people with limitations can decide, choose, or estimate (Kahneman, 2011).

The remarkable article ‘Judgement under Uncertainty: Heuristics and Biases’, published in 1974, is considered the key that opened the door wide to research on this topic, although it was, and still is, considered controversial (Kahneman, 2011). In their research, Kahneman and Tversky identified three types of heuristics by which probabilities are often assessed: availability, representativeness, and anchoring and adjustment. In passing, Kahneman and Tversky mention that other heuristics are used to form non-probabilistic judgements; for example, the distance of an object may be assessed according to the clarity with which it is seen. Other researchers subsequently introduced further types of heuristics. However, availability, representativeness, and anchoring are still considered the fundamental heuristics for judgements under uncertainty.

Availability

According to the psychological definition, availability or accessibility is the ease with which a specific thought comes to mind or can be inferred. Many people use this type of heuristic when judging the probability of an event that may have happened or may happen in the future. Hence, people tend to overestimate the likelihood of a rare event if it comes to mind easily because it is frequently mentioned in daily discussions (Kahneman, 2011). For instance, individuals overestimate their probability of becoming victims of a terrorist attack, while the real probability is negligible. However, since terrorist attacks are highly available in the media, the feeling of a personal threat from such an attack will also be highly available in daily life (Kahneman, 2011).

This concept is also present in business: we remember the successful start-ups whose founders quit college for their dreams, such as Steve Jobs and Mark Zuckerberg, and ignore the thousands of ideas, start-ups, and founders that failed. This is because successful companies are considered a hot topic and receive broad coverage in the media, while failures do not. Similarly, broad media coverage is known to create top-of-mind awareness (TOMA) (Farris et al., 2010). Moreover, the availability heuristic has been offered as an explanation for illusory correlations, in which individuals wrongly judge two events to be related to each other when they are not. Tversky and Kahneman explained that individuals judge relationships based on the ease of envisioning the two events together (Tversky and Kahneman, 1973).

Representativeness

The representativeness heuristic is applied when individuals assess the probability that an object belongs to a particular class or category based on how much it resembles the typical case or prototype representing that category (Tversky and Kahneman, 1974). Conceptually, this heuristic can be decomposed into three parts. The first is that the ideal case or prototype of the category is considered representative of the group. The second is a judgement of the similarity between the object and the representative prototype. The third is that a high degree of similarity indicates a high probability that the object belongs to the category, and a low degree of similarity indicates a low probability.

While the heuristic is often applied automatically within an instant and may be compelling in many cases, Tversky and Kahneman point out that the third part of the heuristic will often lead to serious errors or, at any rate, biases.

In particular, the representativeness heuristic can give rise to what is known as the base rate fallacy. As an example, Tversky and Kahneman consider an individual named Steve, who is described as shy, withdrawn, and somewhat pedantic, and report that people who have to assess, based on this description, whether Steve is more likely to be a librarian or a farmer invariably consider it more likely that he is a librarian—ignoring the fact that there are many more farmers than librarians, a fact that any estimate of the probability of Steve being a librarian or a farmer must take into account.
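Mechanically, the first two parts of the heuristic can be rendered as a similarity computation that ignores base rates altogether. The following deliberately crude sketch is ours, not a model proposed by Tversky and Kahneman, and the feature sets are invented:

```python
# Hypothetical feature sets, invented for illustration.
steve = {"shy", "withdrawn", "pedantic"}
prototypes = {
    "librarian": {"shy", "withdrawn", "pedantic", "orderly"},
    "farmer": {"practical", "outdoorsy", "weather-hardened"},
}

def judged_category(features, prototypes):
    """Judge membership by similarity to the prototype, ignoring base rates."""
    similarity = {cat: len(features & proto) / len(proto)
                  for cat, proto in prototypes.items()}
    return max(similarity, key=similarity.get)

print(judged_category(steve, prototypes))  # -> 'librarian'
# A Bayesian judgement would additionally weight each category by its
# base rate, e.g., the far greater number of farmers than librarians.
```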

Another example involves a taxicab that was engaged in an accident. The data indicate that 85% of the city’s taxicabs are green and 15% are blue. An eyewitness claims that the cab involved was blue. The court then evaluates the witness’s reliability and finds that he identifies colours correctly 80% of the time and errs 20% of the time. What, then, is the probability that the cab involved was blue, given that the witness identified it as blue?

To evaluate this case correctly, people should consider both the base rate, 15% of the cabs being blue, and the witness’s accuracy rate of 80%. Of course, if the number of cabs were equally split between the two colours, the only deciding factor would be the reliability of the witness, i.e., an 80% probability.

However, regardless of the distribution of colours, most participants answer 80%. Even participants who tried to take the base rate into account estimated a probability of more than 50%, while the correct answer, obtained by Bayesian inference, is 41% (Kahneman, 2011).
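For readers who wish to verify the figure, the calculation is a standard application of Bayes' rule (the notation is ours):

\[
P(\text{blue} \mid \text{``blue''}) = \frac{0.80 \times 0.15}{0.80 \times 0.15 + 0.20 \times 0.85} = \frac{0.12}{0.29} \approx 0.41.
\]

The numerator is the probability that the cab is blue and the witness says so; the denominator adds the probability that the cab is green but the witness misidentifies it as blue.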

In relation to the representativeness heuristic, Kahneman (2011) illustrated the ‘conjunction fallacy’ with the following example: based only on a detailed description of a character named Linda, doctoral students in the decision science programme of the Stanford Graduate School of Business, all of whom had taken several advanced courses in probability, statistics, and decision theory, were asked to rank various further descriptions of Linda according to their probability. Even Kahneman and Tversky were surprised to find that 85% of the students ranked ‘Linda is a bank teller and is active in the feminist movement’ as more probable than ‘Linda is a bank teller’.
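This ranking is a fallacy because, for any two events A and B, a conjunction can never be more probable than either of its conjuncts:

\[
P(A \wedge B) \le P(A).
\]

With A the event that Linda is a bank teller and B the event that she is active in the feminist movement, ranking the conjunction above A violates this elementary rule of probability theory.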

From these and many other examples, one must conclude that even statistically sophisticated individuals use the representativeness heuristic to make probability judgements without drawing on what they know about probability.

Representativeness is used not only for probability judgements but also for judgements about causality. The similarity of A and B indicates neither that A causes B nor that B causes A. Nevertheless, if A precedes B and is similar to B, A is often judged to be B’s cause.

Adjustment and anchoring

On Tversky and Kahneman’s interpretation, the anchor is the first number introduced in a question; it forms the centre of a range, extending up and down from it, within which the best answer is expected to lie (Baron, 2000). Anchoring has been tested in a variety of academic and real-world settings, including business negotiations, where the parties anchor their prices to define the range of acceptance within which a deal can be closed, deriving the ceiling and the floor from the anchor. The effect is more pronounced when the parties lack the time to analyse the situation thoroughly.

Significantly, even an anchor far beyond any plausible range can bias the estimates of all parties without their even realizing that it does (Englich et al., 2006).

In one of their experiments, Tversky and Kahneman (1974) asked one group of participants to quickly estimate the product of the numbers from 1 to 8 and another group to do so for the numbers from 8 down to 1. Since the time was limited to 5 seconds, they had to guess. The median estimate of the group that started from 1 was 512, while that of the group that started from 8 was 2250. The correct answer is 40,320; the first few factors evidently served as an anchor, and the group that started with the smaller numbers adjusted from a lower anchor.

Anchoring is perhaps the least clear-cut of the cognitive phenomena introduced by Kahneman and Tversky, as it can be regarded as a bias just as well as a heuristic. The problem is that the mind tends to fixate on the anchor and adjust relative to it, whether the anchor was introduced implicitly or explicitly. Some scholars even believe that this bias/heuristic is unavoidable. For instance, in one study, participants were asked whether they believed that Mahatma Gandhi died before or after the age of 9, or before or after the age of 140. Unquestionably, these anchors were recognized as unrealistic by the participants. Nevertheless, when the participants were later asked to estimate Gandhi’s age at death, the group anchored to the age of 9 estimated it at 50 on average, while the group anchored to the age of 140 estimated it to be as high as 67 (Strack and Mussweiler, 1997).

Gerd Gigerenzer: fast-and-frugal heuristics

The German psychologist Gerd Gigerenzer is one of the most influential figures in the field of decision-making, with a particular emphasis on the use of heuristics. He has built much of his research on the theories of Herbert Simon and considers Simon’s theory of bounded rationality to be unfinished (Gigerenzer, 2015). Towards Kahneman and Tversky’s work, Gigerenzer takes a very different approach, challenging their ideas with a range of arguments and empirical findings.

Gigerenzer explores how people make sense of their reality with limited time and data. Since the world around us is highly uncertain, complex, and volatile, he argues that probability theory cannot stand as the ultimate standard of rationality and is incapable of settling every question, particularly when probabilities are unknown. Instead, people tend to use the effortless approach of heuristics. Gigerenzer introduced the concept of the adaptive toolbox, a collection of mental shortcuts from which a person or group can choose to solve the problem at hand (Gigerenzer, 2000). A heuristic is considered ecologically rational if it is adapted to the surrounding environment (Gigerenzer, 2015).

A bold argument of Gigerenzer’s, which very much opposes the heuristics-and-biases approach of Kahneman and Tversky, is that heuristics cannot be considered irrational or inferior to solutions by optimization or probability calculation. He explicitly argues that heuristics are not quick-and-dirty shortcuts that are faster but riskier (Gigerenzer, 2008), and points to several situations where less is more: results from frugal heuristics, which neglect some of the data, proved more accurate than results achieved by seemingly more elaborate multiple regression or Bayesian methods that try to incorporate all relevant data. While researchers consider this counterintuitive, since a basic rule of research seems to be that more data is always better than less, Gigerenzer points out that this less-is-more effect (abbreviated as LIME) has been confirmed in computer simulations. Without denying that in some situations the use of heuristics may produce biased results (Gigerenzer and Todd, 1999), Gigerenzer emphasizes that fast-and-frugal heuristics are simple, task-specific decision strategies that are part of the decision-maker’s toolbox, the available collection of cognitive techniques for decision-making (Goldstein and Gigerenzer, 2002).

Heuristics are considered economical because they are easy to execute, seek limited data, and do not include many calculations. Contrary to most traditional decision-making models followed in the social and behavioural sciences, models of fast-and-frugal heuristics portray not just the result of the process but also the process itself. They comprise three simple building blocks: the search rule that specifies how information is searched for, the stopping rule that specifies when the information search will be stopped, and finally, the decision rule that specifies how the processed information is integrated into a decision (Goldstein and Gigerenzer, 2002).
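As an illustration, the three building blocks can be rendered schematically in code. The following sketch is ours, not Gigerenzer’s; the function name and the representation of cues are purely illustrative:

    import random

    def fast_and_frugal_choice(alternatives, cues):
        """Generic schema of a fast-and-frugal heuristic (illustrative).

        `cues` is a list of functions, each mapping an alternative to a
        numeric cue value; the order of the list encodes the search rule.
        """
        for cue in cues:                      # search rule: inspect cues in order
            values = [cue(a) for a in alternatives]
            if max(values) > min(values):     # stopping rule: this cue discriminates
                best = max(values)            # decision rule: pick the favoured option
                return next(a for a, v in zip(alternatives, values) if v == best)
        return random.choice(alternatives)    # no cue discriminates: guess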

Rather than characterizing heuristics as rules of thumb or mental shortcuts that can cause biases and must therefore be regarded as irrational, Gigerenzer and his co-workers emphasize that fast-and-frugal heuristics are often ecologically rational, even if a combination of them may not be logically consistent (Gigerenzer and Todd, 1999).

According to Goldstein and Gigerenzer (2002), a decision maker’s pool of mental techniques may contain logic and probability theory, but it also embraces a set of simple heuristics. It is compared to a toolbox because just as a wood saw is perfect for cutting wood but useless for cutting glass or hammering a nail into a wall, the ingredients of the adaptive toolbox are intended to tackle specific scenarios.

For instance, there are specific heuristics for choice tasks, estimation tasks, and categorization tasks. In what follows, we will discuss two well-known examples of fast-and-frugal heuristics: the recognition heuristic (RH), which exploits the absence of knowledge, and the take-the-best heuristic (TTB), which deliberately ignores part of the data.

Both heuristics apply to choice tasks in which a decision-maker must infer which of two options scores higher on some quantitative criterion.

Typical scenarios would be inferring which of two stocks will yield a better return in the next month, which of two cars is more suitable for a family, or which of two candidates is better for a particular job (Goldstein and Gigerenzer, 2002).

The recognition heuristic

The recognition heuristic has been examined extensively in a famous experiment in which participants had to determine which of two cities has the larger population. The experiment, reported in 2002, was conducted with undergraduate students: one group in the USA and one in Germany. The question was as follows: which has more inhabitants, San Diego or San Antonio? Given the cultural difference between the student groups and their levels of information about American cities, one would expect the American students to achieve a higher accuracy rate than their German peers; indeed, most German students did not even know that San Antonio is an American city (Goldstein and Gigerenzer, 2002). Surprisingly, Goldstein and Gigerenzer found the opposite: 100% of the German students got the correct answer, while the American students achieved an accuracy rate of only about 66%. The German students who had barely heard of San Antonio thus gave more correct answers. Their lack of knowledge empowered them to use the recognition heuristic, which states that if one of two objects is recognized and the other is not, then one should infer that the recognized object has the higher value with respect to the criterion. The American students could not use the recognition heuristic because they were familiar with both cities. Ironically, they knew too much.

The recognition heuristic is a powerful tool. In many cases, it is used for swift decisions, since recognition is usually systematic rather than arbitrary. Useful applications include cities’ populations, players’ performance in major leagues, and writers’ productivity. However, the heuristic is less effective for criteria that correlate only weakly with recognition, such as a city’s altitude above sea level or the age of its mayor (Gigerenzer and Todd, 1999).
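Stated as a rule, the heuristic fits in a few lines of code. The sketch below is our illustration; `recognizes` stands in for whatever memory process underlies recognition:

    def recognition_heuristic(a, b, recognizes):
        """Infer which of two objects has the higher criterion value.

        `recognizes` is a predicate returning True if the decision-maker
        recognizes the object (an illustrative stand-in for memory retrieval).
        Returns the inferred object, or None if the heuristic does not apply.
        """
        if recognizes(a) and not recognizes(b):
            return a          # only a is recognized: infer a scores higher
        if recognizes(b) and not recognizes(a):
            return b          # only b is recognized: infer b scores higher
        return None           # both or neither recognized: heuristic is silent

A German student who recognizes San Diego but not San Antonio thus infers that San Diego is larger; an American student who recognizes both cities cannot apply the rule at all.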

Take-the-best heuristic

When the recognition heuristic is not applicable because the decision-maker has enough information about both options, another important heuristic can be used that relies on cues to arrive at a decision. The take-the-best (TTB) heuristic relies only on specific cues and does not require any complex calculations. In practice, it often boils down to a one-reason decision rule, a type of heuristic in which judgements are based on a single good reason only, ignoring other cues (Gigerenzer and Gaissmaier, 2011). According to the TTB heuristic, a decision-maker selects the attributes that are relevant to the decision and sorts these cues by importance into a hierarchy. The alternatives are then compared on the first, i.e., the most important, cue; if this cue discriminates between the alternatives, the decision is taken. Otherwise, the decision-maker moves on and checks the next cue in the hierarchy. In other words, the decision is based on the most important attribute that discriminates between the alternatives (Gigerenzer and Goldstein, 1996). Although this kind of lexicographic preference ordering is well known from traditional economic theory, it appears there mainly as a counterexample to the existence of a real-valued utility function (Debreu, 1959). Surprisingly, however, it seems to be used in many critical situations. For example, in many airports, customs officials may decide whether a traveller is singled out for a further check by looking only at the most important attributes, such as the city of departure, nationality, or luggage weight (Pachur and Marinello, 2013). Moreover, a 2012 study explored voters’ views of how US presidential candidates would deal with the single issue that each voter regarded as most significant, for example, the state of the economy or foreign policy. A model based on this attribute alone picked the winner in most cases (Graefe and Armstrong, 2012).

The TTB heuristic thus comes with a stopping rule that is triggered as soon as the search reaches a discriminating cue: if the most important cue discriminates, no further cues are considered, and the decision rests on that single cue. Otherwise, the next most important cue is examined. If no discriminating cue is found at all, the heuristic falls back on a random guess (Gigerenzer and Gaissmaier, 2011).
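Putting the search, stopping, and decision rules together, a take-the-best comparison of two alternatives might be sketched as follows (again our illustration, assuming binary cue values and a `cues` list already sorted by validity in descending order):

    import random

    def take_the_best(a, b, cues):
        """`cues`: functions mapping an alternative to 0 or 1, ordered
        from the most valid cue to the least valid (assumed given)."""
        for cue in cues:                      # search rule: most valid cue first
            va, vb = cue(a), cue(b)
            if va != vb:                      # stopping rule: first discriminating cue
                return a if va > vb else b    # decision rule: one reason decides
        return random.choice([a, b])          # no cue discriminates: guess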

Empirical evidence on fast-and-frugal heuristics

Numerous studies have been conducted on fast-and-frugal heuristics, using analytical methods and simulations to investigate when and why heuristics yield accurate results on the one hand, and experiments and observational methods to find out whether and when people actually use fast-and-frugal heuristics on the other (Luan et al., 2019). Structured examinations and benchmarking against standard models, for example, regression or Bayesian models, have shown that the accuracy of fast-and-frugal heuristics depends on the structure of the information environment (e.g., the distribution of cue validities and the intercorrelation of cues). In numerous situations, fast-and-frugal heuristics can perform well, particularly in generalization contexts, i.e., when making predictions for new cases that have not been previously experienced. Empirical examinations show that people employ fast-and-frugal heuristics under time constraints and when data are hard to obtain or must be retrieved from memory. Remarkably, some studies have examined how individuals adapt to different situations by learning: Rieskamp and Otto (2006) found that individuals apparently learn to select the heuristic that performs best in a specific domain. Moreover, Reimer and Katsikopoulos (2004) found that individuals apply fast-and-frugal heuristics when making inferences in groups.

Critiques

While interest in heuristics has been increasing, a substantial part of the literature has been critical. In particular, the heuristics-and-biases programme introduced by Kahneman and Tversky has been the target of more than one critique (Reisberg, 2013).

The arguments run mainly in two directions. The first is that the programme focuses on coherence standards of rationality, and that the detection of biases ignores the contextual and environmental factors in which judgements occur (B.R. Newell, 2013). The second is that notions such as availability or representativeness are vague and ill-defined, and say little about the cognitive processes underlying the judgements (Gigerenzer, 1996). For example, it has been argued that the replies in the acclaimed Linda-the-bank-teller experiment could be considered sensible rather than biased if one applies conversational or colloquial standards instead of formal probability theory (Hilton, 1995).

The charge that the explanations offered are vague can be illustrated by the following two phenomena. People tend to believe that a run of the same outcome will be followed by the opposite outcome (e.g., that ‘heads’ should come next in a coin-flipping game after many consecutive ‘tails’). This is called the gambler’s fallacy (Barron and Leider, 2010). By contrast, the hot-hand fallacy (Gilovich et al., 1985) consists in the belief that a run of the same outcome will continue, as on a lucky day (e.g., that a basketball player will score again after a series of successful shots). Ayton and Fischer (2004) pointed out that, although these two beliefs are opposites, both have been classified under the heuristic of representativeness. In both cases, a flawed conception of randomness leads observers to expect a short run of outcomes to be representative of the process that generates them: in the coin-flipping scenario, people believe that a long run of tails should not occur, and hence predict heads, while in the case of the basketball player, the run of successes is expected to continue (Gilovich et al., 1985). Representativeness, therefore, cannot be diagnosed without knowing in advance which results are expected. Moreover, the heuristic does not explain why people feel that a run of random events should be representative of the underlying process when in reality it need not be (Ayton and Fischer, 2004).

Nevertheless, the most common critique of Kahneman and Tversky is captured by the phrase ‘we cannot be that dumb’: the heuristics-and-biases programme is said to be overly pessimistic in its assessment of average human decision-making. Moreover, humanity has collectively accumulated many achievements and discoveries throughout history that would not have been possible if the human capacity for adequate decision-making were so limited (Gilovich and Griffin, 2002).

Similarly, the probabilistic mental models (PMM) theory of human inference, inspired by Simon and pioneered by Gigerenzer, has also been criticized (B.R. Newell et al., 2003). Indeed, the enticing feature of heuristics, that they are both easy to apply and efficient, has made them popular in different domains. However, it has also made them vulnerable to replications or variations of the original experiments that challenge the original results. For example, Daniel Oppenheimer (2003) reports that the recognition heuristic (RH) failed to yield satisfactory results in replications of the city-population experiment. He found that the participants’ judgements failed to obey the RH not just when there were cues other than, and stronger than, mere recognition, but also in circumstances where recognition would have been the best cue available. One could reply that there are numerous methods in the adaptive toolbox and that, under certain conditions, people may prefer heuristics other than the RH. However, this reply is also questionable, since many of the heuristics thought to populate the adaptive toolbox presuppose the RH as an initial step (Gigerenzer and Todd, 1999). Hence, if individuals are not using the RH, they cannot use many of the other heuristics in the adaptive toolbox either (Oppenheimer, 2003). Likewise, Newell et al. (2003) question whether the fast-and-frugal heuristics accurately describe actual human behaviour. In two experiments, they challenged the take-the-best (TTB) heuristic, which is considered a building block of the PMM framework. The outcomes of these experiments, together with others, such as those of Jones et al. (2000) and Bröder (2000), suggest that the TTB heuristic is not a reliable description of behaviour even in circumstances favouring its use. In a somewhat heated debate published in the Psychological Review in 1996, Gigerenzer’s criticism of Kahneman and Tversky, namely that many of the so-called biases ‘disappear’ if frequencies rather than probabilities are assumed, was countered by Kahneman and Tversky (1996) by means of a detailed re-examination of the conjunction fallacy (or Linda problem). Gigerenzer (1996) remained unconvinced and was, in turn, blamed by Kahneman and Tversky (1996, p. 591) for merely reiterating ‘his objections … without answering our main arguments’.

Conclusion

Our historical review has revealed a number of issues that have received little attention in the literature.

Deliberate vs. automatic heuristics

We have differentiated between deliberate and automatic heuristics, which often seem to be confused in the literature. While it is a widely shared view today that the human brain often relies heavily on the fast and effortless ‘System 1’ in decision-making, but can also use the more demanding tools of ‘System 2’, and it has been acknowledged, e.g. by Kahneman (2011, p. 98), that some heuristics belong to System 1 and others to System 2, the two systems are not as clearly distinct as it may seem. In fact, the very wide range of what one may call ‘heuristics’ shows that there is a whole spectrum of fallible decision-making procedures—ranging from the probably innate problem-solving strategy of the baby that cries whenever it is hungry or has some other problem, to the most elaborate and sophisticated procedures of, e.g., Polya, Bolzano, or contemporary chess-engines. One may be tempted to characterize instinctive procedures as subconscious and sophisticated ones as conscious, but a deliberate heuristic can very well become a subconsciously applied ‘habit of the mind’ or learnt routine with experience and repetition. Vice versa, automatic, subconscious heuristics can well be raised to consciousness and be applied deliberately. E.g., the ‘inductive inference’ from tasty strawberries to the assumption that all red berries are sweet and edible may be quite automatic and subconscious in little children, but the philosophical literature on induction shows that it can be elaborated into something quite conscious. However, while the notion of consciousness may be crucial for an adequate understanding of heuristics in human cognition, for the time being, it seems to remain a philosophical mystery (Harley, 2021; Searle, 1997), and once programmed, sophisticated heuristic algorithms can be executed by automata.

The deliberate heuristics that we reviewed also illustrate that some of them can hardly be called ‘simple’, ‘shortcuts’, or ‘rules of thumb’. E.g., the heuristics of Descartes, Bolzano, or Polya each consist of a structured set of suggestions, and, e.g., ‘to devise a plan’ for a mathematical proof is certainly not a shortcut. Llull (1308, p. 329), to take another example, wrote of his ‘ars magna’ that ‘the best kind of intellect can learn it in two months: one month for theory and another month for practice’.

Heuristics vs. algorithms

Our review of heuristics also allowed us to clarify the distinction between heuristics and algorithms. As evidenced by our glimpse at computer science, there are procedures that are quite obviously both an algorithm and a heuristic. Within computer science, they are in fact quite common. Algorithms of the heuristic type may be required for certain problems even though an algorithm that finds the optimal solution exists ‘in principle’, as in the case of determining the optimal strategy in chess, where the brute-force method of enumerating all possible plays of chess is just not practically feasible. In other cases, heuristic algorithms are used because an exhaustive search, while practically feasible, would be too costly or time-consuming. Clearly, for many problems, there are also problem-solving algorithms which always produce the optimal solution in a reasonable time frame. Given our definition of a heuristic as a fallible method, algorithms of this kind are counterexamples to the complaint that the notion has become so wide that ‘any procedure can be called a heuristic’. However, as we have seen, there are also heuristic procedures that are non-algorithmic. These may be necessary either because the problem to be solved is not sufficiently well-defined to allow for an algorithm, or because an algorithm that would solve the problem at hand is not known or does not exist. Kleining’s qualitative heuristics is an example of the former, necessitated by the ill-defined problems of research in the social sciences, while Polya’s heuristic for solving mathematical problems is an example of the latter: an algorithm that would allow one to decide whether a given mathematical conjecture is a theorem does not exist (cf. Davis, 1965).

Pre-SEU vs. post-SEU heuristics

As we noted in the introduction, the emergence of the SEU theory can be regarded as a kind of watershed for the research on heuristics, as it came to be regarded as the standard definition of rational choice. Post-SEU, fallible methods of decision-making would have to face comparison with this standard. Gigerenzer’s almost belligerent criticism of SEU shows that even today it seems difficult to discuss the pros and cons of heuristics unless one relates them to the backdrop of SEU. However, his criticism of SEU is mostly en passant and seems to assume that the SEU model requires ‘known probabilities’ (e.g., Gigerenzer, 2021), ignoring the fact that it is, in general, subjective probabilities, as derived from the agent’s preferences among lotteries, that the model relies on (cf. e.g., Jeffrey, 1967 or Gilboa, 2011). In fact, when applied to an ill-defined decision problem in, e.g., management, the SEU theory may well be regarded as a heuristic—it asks you to consider the possible consequences of the relevant set of actions, your preferences among those consequences, and the likelihood of those consequences. To the extent that one may get all of these elements wrong, SEU is a fallible method of decision-making. To be sure, it is not a fast and effortless heuristic, but our historical review of pre-SEU heuristics has illustrated that heuristics may be quite elaborate and require considerable effort and attention.
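In its standard formulation (the notation here is ours), the SEU recipe says: choose, from the set A of available actions, the action that maximizes expected utility, where p is the agent’s subjective probability over the states s of the world, c(a, s) is the consequence of taking action a in state s, and u is the agent’s utility function:

\[
a^{*} = \arg\max_{a \in A} \sum_{s \in S} p(s)\, u\bigl(c(a, s)\bigr).
\]

Each ingredient, the set of actions, the probabilities, and the utilities, can be mis-specified, which is precisely why the procedure is fallible when applied outside a ‘small world’.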

It is quite true, of course, that the SEU heuristic will hardly be helpful in problem-solving that is not ‘just’ decision-making. If, e.g., the problem to be solved is to find a proof for a mathematical conjecture, the set of possible actions will in general be too vast to be practically contemplated, let alone evaluated according to preferences and probabilities.

Positive vs. negative heuristics

To the extent that the study of heuristics aims at understanding how decisions are actually made, it is not only positive heuristics that need to be considered. It will also be necessary to investigate the conditions that may prevent an agent from adopting certain courses of action. As we saw, Lakatos used the notion of negative heuristics quite explicitly to characterize research programmes, but we also briefly reviewed Duncker’s notion of ‘functional fixedness’ as an example of a hindrance to adequate problem-solving. A systematic study of such negative heuristics seems to be missing in the literature, and we believe that it may be a helpful complement to the study of positive heuristics that has dominated the literature we reviewed.

To the extent that heuristics are studied with the normative aim of identifying effective heuristics, it may also be useful to consider approaches that should not be taken. ‘Do not try to optimize!’ might be a negative heuristic favoured by the fast-and-frugal school of thought.

Heuristics as the product of evolution

Clearly, heuristics have existed throughout the development of human knowledge, owing to the ‘old mind’s’ evolutionary roots and the frequent necessity of applying fast and sufficiently reliable behaviour patterns. However, unlike the behaviour patterns of other animals, the methods used by humans in problem-solving are sufficiently diverse that dual-process theory was suggested to provide some structure to the rich ‘toolbox’ humans can and do apply. As all human DNA is the product of evolution, it is not only the intuitive inclinations to react to certain stimuli in a particular way that must be seen as a product of evolution, but also our ability to abstain from following our gut feelings when there is reason to do so, and to reflect on and analyse the situation before we embark on a particular course of action. Quite frequently, we experience a tension between our intuitive inclinations and our analytic mind’s judgement, but both are somehow the product of evolution, our biography, and the environment. Thus, to point out that gut feelings are an evolved capacity of the brain in no way provides an argument for their superiority over the reflective mind.

Moreover, compared to the speed of problem change in our human lifetimes, biological evolution is very slow. The evolved capacities of the human brain may have been well-adapted to the survival needs of our ancestors some 300,000 years ago, but there is little reason to believe that they are uniformly well-adapted to human problem-solving in the 21st century.

Resource-bounded and ecological rationality

Throughout our review, the reader will have noticed that many heuristics have been suggested for specific problem areas. The methods of the ancient Greeks were mainly centred on solving geometrical problems. Llull was primarily concerned with theological questions, Descartes and Leibniz pursued ‘mechanical’ solutions to philosophical issues, Polya suggested heuristics for mathematics, Müller for engineering, and Kleining for social science research. This already suggests that heuristics suitable for one type of problem need not be suitable for a different type. Likewise, the automatic heuristics that both the Kahneman-Tversky and the Gigerenzer schools focused on are triggered by particular tasks. Simon’s observation that the success of a given heuristic depends on the environment in which it is employed is undoubtedly an important one; it has motivated Gigerenzer’s notion of ecological rationality and is strikingly absent from the SEU model. If ‘environment’ is taken in a broad sense that includes the available resources and the costs of time and effort, the notion seems to cover what has been called resource-rational behaviour (e.g., Bhui et al., 2021).

Avenues of further research

A comprehensive study describing the current status of the research on heuristics and their relation to SEU seems to be missing and is beyond the scope of our brief historical review. Insights into their interrelationship can be expected from recent attempts at formal modelling of human cognition that take the issues of limited computational resources and the context-dependence of decision-making seriously. E.g., Lieder and Griffiths (2020) do this from a Bayesian perspective, while Busemeyer et al. (2011) and Pothos and Busemeyer (2022) use a generalization of standard Kolmogorov probability theory that is also the basis of quantum mechanics and quantum computation. While it may seem at first glance that such modelling assumes even more computational power than the standard SEU model of decision-making, the computational power is not assumed on the part of the human decision-maker. Rather, the claim is that the decision-maker behaves as if he or she were solving an optimization problem under additional constraints, e.g., on computational resources. The ‘as if’ methodology employed here is well known to economists (Friedman, 1953; Mäki, 1998) and also to mathematical biologists, who have used Bayesian models to explain animal behaviour (McNamara et al., 2006; Oaten, 1977; Pérez-Escudero and de Polavieja, 2011). Evolutionary arguments might be invoked to support this methodology if a survival disadvantage can be shown to result from behaviour patterns that are not Bayesian-optimal, but we are not aware of research that would substantiate such arguments. However, attempting to do so by embedding formal models of cognition in models of evolutionary game theory may be a promising avenue for further research.