Introduction

Recent developments in artificial intelligence (AI), machine learning (ML), neuro- and self-tracking technologies, and social robotics have increasingly prompted debates on the conditions for successful Human–Machine Interaction (HMI), as well as its potential implications and consequences for engineering, the sciences, ethics, and politics. While there is a strong (interdisciplinary) tradition in informatics, engineering, the humanities, and the sciences of exploring the necessary conditions for interaction between humans and computers, these disciplines were for a long time primarily concerned with questions regarding appropriate user interfaces, i.e., with possible ways of adequately and effectively transferring data and information between humans (understood as users) and computers with the goal of solving certain problems. Hence, the primary focus was on developing visually, haptically, and linguistically adequate input and output devices for effectively using computers (referred to as Human–Computer Interaction (HCI)).

However, since the 1980s, following, e.g., the development of the first Brain–Computer Interfaces (BCIs), augmented and virtual reality, ML, or ubiquitous computing, there has been a tendency away from this focus on useful devices toward a more sophisticated understanding of interaction, often referring to some kind of dialogue or communication between humans and machines in a broad sense (i.e., neither are humans understood only as users, nor is the focus solely on computers anymore). This is often referred to as HMI and has ultimately resulted in the still ongoing efforts to simulate essential characteristics and conditions of human communication in machines. Even though there is still no agreement on what exactly these are (for instance, consciousness, intelligence, or embodiment) and how they could effectively be simulated, the perspectives of sociology, philosophy, psychology, and cognitive science as well as media studies and communication science increasingly came into play in the development of machines capable of interacting with humans.

It is this background against which Janlert and Stolterman state that “interactivity is one of the most commonly mentioned and prominent characteristics of digital artifacts” (Janlert and Stolterman, 2017: p. 107; cf. Rafaeli, 1988; Bucy, 2004; McMillan, 2005). At the same time, however, the authors note that although there is a colloquial idea of what “interaction” means, and, hence, what the term refers to, the concept’s usage, especially in scientific debates, is still vague and ambiguous (Janlert and Stolterman, 2017: p. 105; cf. Bucy, 2004). It can be noted that “interaction” is often used to denote “mutual or reciprocal action or influence” (Merriam-Webster Dictionary, 2022). According to this understanding, “interaction” refers to something (e.g., certain actions) taking place between two or more entities (in most cases: humans) with a view to some purpose or goal (within a certain context), and with any of these entities undertaking an active role (in the process of interacting). Given this characterization, the reason for the term “interaction” being vague and ambiguous becomes obvious: what exactly takes place in interactions, in what contexts, between what kinds of entities, and with a view to what purposes or goals can be interpreted in many different ways.

However, a vague and ambiguous concept of interaction makes it difficult to reasonably debate questions of ethics, politics, engineering, and the sciences when it comes to HMI. Strictly speaking, even (fundamental) questions like “is interaction (in this situation) actually happening? Is it good or bad? What does this mean for future design processes?” are difficult to answer (Janlert and Stolterman, 2017). This issue is further complicated by the fact that ethical and philosophical debates on HMI often refer only implicitly to the concept of interaction without analyzing or explaining it.

Against this background, in the following, we elaborate and analyze the different meanings and dimensions of the term “interaction” in the disciplines and discourses relevant to debates on modern HMI. This helps to highlight similarities and differences in disciplinary understandings and generate conceptual clarity for subsequent normative debates on HMI. For this purpose, we, first, introduce a four-dimensional model of interaction as a basis for analyzing the different meanings of “interaction”. Second, we present some important terms related to the concept of “interaction”, i.e., the concepts of interactivity, interactability, and interactiveness. Third, we elaborate on the most prominent meanings attached to the concept of interaction in the disciplines essential to the discourse on HMI. This will, fourth, be followed by an analysis of their key elements with a view to our four-dimensional model of interaction.

With this, we do not claim to elaborate a single correct definition of “interaction”. Rather, we aim to give an overview of the currently most prominent usages of “interaction” as well as their implications and presuppositions with the aim of providing a basis for a fruitful intra- and particularly interdisciplinary discourse on HMI between, e.g., philosophers of science, social scientists, linguists, engineers, or designers, that also captures novel aspects of interaction in emerging AI-based technologies. Not all approaches presented here make explicit reference to HMI. Nevertheless, they are relevant to the debate insofar as they provide frequently used background assumptions in debates about HMI.

A four-dimensional model of interaction

Recall that, according to the understanding of “interaction” as mutual or reciprocal action or influence, the term refers to something (e.g., certain actions) taking place between two or more entities with a view to some purpose or goal (within a certain context). Hence, interaction may be understood as a four-dimensional concept referring (1) to certain subjects (i.e., to the question: who interacts?), (2) to modes of interaction (how do these subjects interact?), (3) to purposes of interaction (why, or: for what reasons is interaction taking place?), and (4) to certain contexts (where, or: under what conditions is interaction taking place?). In the following, we refer to this as the SMPC model of interaction, with regard to which we analyze the different meanings of “interaction”. Ultimately, given its formal characterization, the reason for the vagueness and ambiguity of “interaction” can be further specified: it is the four dimensions of “interaction” that are interpreted differently both within and across disciplines.
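To fix ideas, the four SMPC dimensions can be rendered as a simple record type. The following Python sketch is our own illustration only; the field names and the example values are assumptions, not part of the model as defined above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interaction:
    """One instance of interaction, described along the SMPC dimensions."""
    subjects: tuple[str, ...]  # (1) who interacts?
    mode: str                  # (2) how do the subjects interact?
    purpose: str               # (3) why is interaction taking place?
    context: str               # (4) where / under what conditions?

# Hypothetical example: classical HCI, as characterized later in the text.
classical_hci = Interaction(
    subjects=("user", "computer"),
    mode="exchange of data and information",
    purpose="solving a user-centered problem",
    context="technical and mathematical settings",
)
print(classical_hci.mode)
```

Treating an interaction as a tuple of these four slots makes explicit that any disciplinary account of “interaction” must fill in all four dimensions, and that disagreement can arise in each slot independently.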

“Interaction” and related terms

The understanding of “interaction” as mutual or reciprocal action or influence has been criticized for not considering essential aspects of interaction, especially when it comes to HMI. Svanæs (2000), e.g., has pointed out that “being interactive” may not only denote some sort of ongoing process, but also some kind of potential or disposition: humans (as well as, for instance, computers) would even be considered interactive in situations where they were not actively involved in mutual or reciprocal action (humans) or not in use (computers and machines). Against this background, authors often refer to “interactivity” instead of “interaction” (Rafaeli, 1988; Svanæs, 2000; Kiousis, 2002; Bucy, 2004; Bucy and Tao, 2007), using it “(a) as a general term for the phenomenon of [or disposition for] interaction, and (b) as a term for ongoing interaction” (Janlert and Stolterman, 2017: p. 112). In the following, we are primarily interested in the different meanings and dimensions attached to processes of ongoing interaction (usage (b) of “interactivity”), whereas we understand usage (a) as delineating the conditions principally necessary for processes of interaction and hence, as being necessarily presupposed in any reference to “interactivity” in the sense of (b).

Janlert and Stolterman seem to have something similar in mind when introducing the term “interactability” to denote the “intrinsic quality of an artifact or system that allows for interactions with a user” (Janlert and Stolterman, 2017: p. 112) and, hence, to specify usage (a) of “interactivity”. According to the authors, “interactability” delineates the specific properties or qualities of specific objects (i.e., computers or machines) in specific scenarios, which (to a certain degree) enable interaction with these very objects in these very scenarios. In contrast, we understand usage (a) as designating the principal conditions that entities (computers, machines, or humans) must satisfy to be involved in interaction processes at all. Nevertheless, the term “interactability” captures an important aspect: the necessity of specifying these very conditions in view of specific scenarios and specific entities. We, thus, understand “interactability” as delineating the possibilities and limitations of specific computers or machines to interact or, in other words, the degree of a specific computer’s or machine’s disposition for interaction (i.e., usage (a) of “interactivity”) and, hence, the conditions for concrete processes of interaction with this very computer or machine (usage (b) of “interactivity”). This, however, presupposes some conceptual point of reference, i.e., an understanding of usage (a) informing us of what it means that somebody or something has a disposition for interaction. The fact that Janlert and Stolterman miss this important point might be due to their more technical and, hence, application-oriented approach to the question of the meaning of “interaction” in HMI.

They point, however, to another important concept in HMI scenarios: as “interactability” only specifies the degree of a specific computer’s or machine’s disposition for interaction, a high degree of interactability does not necessarily result in actual interaction between humans and computers or machines. What is lacking, therefore, is a term denoting “an artifact’s or system’s propensity to engage users in interactions” (Janlert and Stolterman, 2017: p. 113), i.e., a term that refers to the sense or extent to which a computer or machine stimulates its human counterpart to interact with it at all or to maintain an ongoing interaction. The authors propose the term “interactiveness” to denote this phenomenon.

To sum up, “interaction” is intricately connected to the terms “interactivity”, “interactability”, and “interactiveness” (Fig. 1).

Fig. 1: Interactivity, Interaction, Interactability, and Interactiveness.

Relation of terms connected with “interaction”.

In what follows, against the background of our SMPC model, we are primarily interested in analyzing the different meanings of “interaction” as a process, i.e., in the sense of usage (b) of “interactivity”. Where appropriate, we will, nevertheless, introduce the meanings of “interactability” and “interactiveness” as implied in the different accounts of “interaction”.

The concept of “interaction” in disciplines essential to the discourse on HMI

“Interaction” in informatics and computer science

In informatics and computer science (as the main reference point of HMI debates), several stages of addressing interaction between humans and computers or machines must be distinguished. Following Charles Babbage’s concept of the Analytical Engine in 1837 (Bromley, 1982), Ada Lovelace’s first computer program in 1842 (Charman-Anderson, 2015), or the development of the first punch-card-based data processing system by Herman Hollerith in 1889 (Heide, 2009), the focus was mainly on questions of construction and effectiveness of algorithms and computer predecessors.

This did not change significantly until the development of Konrad Zuse’s first digital computer Z3 in 1941 and the growing need to cope with the constantly increasing amount of data and information (e.g., in research and the sciences) with the help of calculating machines (Bush, 1945). Subsequently, questions regarding appropriate user interfaces aiming at accessible and understandable ways of one- or two-way data and information transfer between users and computers became increasingly pressing. These were mostly referred to as questions of HCI. In the following years, the development of the first handwriting recognition devices (Dimond, 1957), the first graphical computer system DAC-1 (Krull, 1994), and the computer game Tennis for Two (Gold, 2004) marked milestones in this regard.

Up to this point, questions of appropriate user interfaces for data and information exchange between users and computers were discussed solely in terms of users transferring pre-formulated problems to computers, having them processed, and receiving the results (e.g., transferring handwritten records or design sketches to be returned in digitized form). In the 1960s, this changed fundamentally when Joseph Licklider, in his seminal paper Man-Computer Symbiosis, brought up the vision of computers not only being used for processing pre-formulated problems but also for the development of new (technical) problems:

[…] many problems that can be thought through in advance are very difficult to think through in advance. They would be easier to solve, and they could be solved faster, through an intuitively guided trial-and-error procedure in which the computer cooperated, turning up flaws in the reasoning or revealing unexpected turns in the solution. Other problems simply cannot be formulated without computing-machine aid. Poincaré anticipated the frustration of an important group of would-be computer users when he said, “The question is not, ’What is the answer?’ The question is, ’What is the question?’” One of the main aims of man-computer symbiosis is to bring the computing machine effectively into the formulative parts of technical problems. (Licklider, 1960: p. 5).

In this regard, Licklider’s vision implied some kind of division of labor between user and computer. As such, it is often understood as having introduced the very idea of interactive systems (focusing on questions of interactability). This led to a decisive shift in considerations on user interfaces: ways of accessible and understandable data and information transfer, as well as questions regarding the possibility of and conditions for interactive communication and dialog between users and computers, increasingly came into the focus of research and development (cf. Licklider, 1960). Licklider described the basic features of such interactive interfaces as follows:

Certainly, for effective man-computer interaction, it will be necessary for the man and the computer to draw graphs and pictures and to write notes and equations to each other on the same display surface. […] With such an input-output device, the operator would quickly learn to write or print in a manner legible to the machine. […] He could correct the computer’s data, instruct the machine via flow diagrams, and in general interact with it very much as he would with another engineer, except that the “other engineer” would be a precise draftsman, a lightning calculator, a mnemonic wizard, and many other valuable partners all in one. (Licklider, 1960: p. 9).

In the subsequent years, a number of innovative interfaces were developed taking up these ideas, e.g., Ivan Sutherland’s interactive drawing program (Sutherland, 1964), the first ever Virtual Reality System (Sutherland, 1968), the NLS (oN-Line System) introducing the computer mouse (Barnes, 1997), the RAND tablet with its GRAIL (GRAphical Input Language) system (Ellis et al., 1969), and several haptic interfaces (Brooks et al., 1990). This was followed in the 1970s by the development of, for instance, the first WYSIWYG (what you see is what you get)-based word processing software BRAVO (Newman, 2012), the concepts of responsive environments and artificial reality (Krueger, 1977; 1983), and the first ever data glove (Sturman and Zeltzer, 1994). In the 1980s, commercial computer systems emerged including more sophisticated WYSIWYG-based software (Johnson et al., 1989; Perkins et al., 1997), as did new interaction concepts such as multi-touch input devices and touch screens (Buxton et al., 1985; Lee et al., 1985; Buxton, 2010).

A somewhat different step in the history of interactive technologies, however, was the development of the first BCI in 1988, which aimed at enabling locked-in patients to communicate with their outside world (Farwell and Donchin, 1988) and, thus, added a further aspect of interaction between users and machines. Yet another aspect appeared in the 1990s when the first approaches to augmented reality emerged (Thomas and David, 1992). Moreover, growing miniaturization in microelectronics allowed for the development of new forms of humans interacting with machines, exemplified, for instance, by so-called embedded systems, ubiquitous computing (Weiser, 1991), and new interfaces for mobile devices. In the early 2000s, body language came to the fore as yet another feature of interaction, leading to the development of gaming technologies like the EyeToy, the Wii Remote, or Kinect (Nowogrodzki, 2018), as well as to the first approaches of using human hand gestures in interactions with machines (Maes and Mistry, 2009). Furthermore, research in the field of brain–computer interfaces was intensified with the aim of using the human body itself as an interface for interacting with machines (Velliste et al., 2008). From then on, interaction was no longer necessarily understood as a direct and explicit exchange of data and information. This is one aspect leading to the increasing use of the concept of HMI instead of HCI.

In recent years, research in informatics and computer science has had a strong focus on AI, especially on the development of neural networks and approaches to ML (Simeone, 2018). This has led to groundbreaking developments in, for instance, online search and automated image recognition systems and social media algorithms, as well as to concepts and first applications of, e.g., autonomous vehicles (Schwarting et al., 2018), medical decision support systems (McKinney et al., 2020), and systems for determining the probability of criminal recidivism (Biddle, 2022), and thus to novel forms of HMI.

In summary, the analysis and investigation of interaction between humans and machines in informatics and computer science can roughly be divided into three stages: in the first stage, beginning in the 1840s and ending around the 1950s, the focus was primarily on questions of engineering, construction, and effectiveness of algorithms and computer predecessors.

The second stage (focusing on HCI), starting in the 1950s, was dominated by approaches to develop appropriate user interfaces for data and information exchange between users and computers, focusing on the successful transfer of pre-formulated problems to computers to be processed and the results returned. At this stage, informatics and computer science were primarily concerned with questions of interactability and interactiveness, i.e., with the disposition of computers to interact with users in view of their objectives. Interaction was understood as a direct and explicit exchange of data and information and followed purely epistemic objectives (cf. Dix et al., 2003: p. 126; Norman, 1984; 2013), i.e., computers were understood as cognitive devices “that extend or supplement human cognitive functioning by performing information processing tasks” (Brey, 2005: p. 384). Regarding our SMPC model of interaction, users and computers were understood as asymmetrical subjects of interaction (referring to the question of who interacts), interacting in the mode of exchanging data and information (referring to the question of how interaction takes place) for the purpose of solving certain user-centered problems (referring to the question of why interaction takes place), mostly in complex technical and mathematical contexts (referring to the question of where interaction takes place).

This changed fundamentally in the third stage of informatics’ and computer science’s investigation of interaction, beginning in the late 1980s and resulting in the reference to HMI instead of HCI: from now on, interaction was no longer analyzed solely from a user-centered perspective. Rather, issues of interactive and responsive communication and dialogue between humans and machines came to the fore. Informatics and computer science were no longer solely concerned with questions of computers’ disposition to interaction in view of the user’s goals and intentions, but also with issues of humans’ disposition to interaction with computers and machines as well as with (ongoing) processes of interaction. Furthermore, HMI now increasingly adhered to an ontic logic, i.e., “computers [and machines] simulate environments and tools to engage these environments” (Brey, 2005: p. 384). Analysis of such new forms of HMI thus involves investigating “the meaning of […] actions, […] goals and […] intentions. (It cannot be ‘just the data’.)” (Müller, 2011: p. 4). This is because, in the third stage,

[n]ew interactive environments are responsive, active, sensitive, and in a constant dialog with people in the environment. The environments themselves are in some sense becoming more agential and goal driven. Because interactivity is understood here as requiring agency of some sort, interactivity is not only about being reactive and responsive but also about pushing reality in a certain direction. (Janlert and Stolterman, 2017: 118).

The disposition to as well as (ongoing) processes of such new forms of interaction, according to Janlert and Stolterman, depend on several parameters such as agency, pace or time, independence, receptivity, predictability, and enforcement, which now came into the focus of informatics and computer science. With a view to the four dimensions of the SMPC model, users and computers were increasingly understood as symmetrical subjects of interaction, interacting in the modes of communication and dialogue for the purposes of developing and solving certain problems as well as stimulating and changing environments within a broad range of different contexts.

“Interaction” in game theory

Another important reference point for debates about HMI is game theory. Dating back to the proof of the min-max theorem by John von Neumann (1928), modern formal game theory has manifold fields of application, e.g., in economics (Laffont, 1997), sociology (Swedberg, 2001), philosophy (de Bruin, 2005), biology (Tomlinson, 1997), and particularly in HMI (Li et al., 2019). As a mathematical theory, it models decision-making situations in which rational participants interact with each other. Such interactive decision problems or games involve “two or more individuals making a decision in a situation where the payoff to each individual depends (at least in principle) on what every individual decides” (Webb, 2007: p. 61). The primary goal of game theory is to derive rational decision behavior in such decision problems, i.e., social conflict situations.

With a view to our SMPC model, interaction subjects are generally understood in terms of rational decision-makers, who interact in the mode of adopting strategies. That is, interaction subjects implement plans of action that they (a) have for every circumstance they can observe in a given decision problem, and that (b) have some effect on at least one other interaction subject. The purpose of interaction consists of maximizing rewards depending on the specific type of decision-making problem, i.e., the specific context (Osborne, 2011: p. 4). As regards HMI, game theoretical approaches are of particular interest, because they may help to understand fundamental processes as well as outcomes of (strategic) interactions between humans and machines and, thus, can be used to answer questions of, e.g., control or mutual “understanding” in such interactions (cf., e.g., Li et al., 2019).
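The game-theoretic understanding of interaction can be made concrete with a minimal two-player game in normal form. The payoff numbers below are our own illustrative choice (a prisoner’s-dilemma-style game), not taken from the cited literature; the sketch merely instantiates the idea that each subject’s payoff depends on what every subject decides, and that rational interaction consists in adopting mutually best-responding strategies:

```python
import itertools

# Illustrative payoff table (assumed numbers): payoffs[(row, col)]
# gives (payoff of player 0, payoff of player 1).
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}
strategies = ["cooperate", "defect"]

def best_response(player, other_strategy):
    """Strategy maximizing `player`'s payoff against a fixed opponent move."""
    if player == 0:
        return max(strategies, key=lambda s: payoffs[(s, other_strategy)][0])
    return max(strategies, key=lambda s: payoffs[(other_strategy, s)][1])

def pure_nash_equilibria():
    """Profiles in which each strategy is a best response to the other."""
    return [
        (s1, s2)
        for s1, s2 in itertools.product(strategies, strategies)
        if best_response(0, s2) == s1 and best_response(1, s1) == s2
    ]

print(pure_nash_equilibria())  # [('defect', 'defect')]
```

In SMPC terms, the dictionary keys fix the subjects and their modes (strategy choices), the payoff values encode the purpose (reward maximization), and the payoff structure itself constitutes the context of the decision problem.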

“Interaction” in sociology

Apart from mathematical game theory, both sociological and philosophical concepts of interaction are often referred to with a view to the understanding and designing of HMI. In sociology, interaction is one of the key concepts used to explain a broad variety of social phenomena. Its fundamental role in sociological theory derives from the fact that concepts of interaction can explain heterogeneous phenomena located on different societal levels (individual, organizational, and societal), like, e.g., interpersonal relations, the structuring and reproduction of social situations and institutions, as well as societal cohesion and the transgenerational transfer of knowledge. For a long time, however, the use of “interaction” was restricted to relations between (at least two) humans. With the rise of Science and Technology Studies (STS), “interaction” started to cover phenomena of both human–human interaction and HMI.

As the concept of interaction is crucial for every sociological theory, a variety of definitions can be identified throughout the history of sociological thinking (Bales, 1950; Hall, 1966; Goffman, 1967; Parsons and Ebinger, 1968). Theories differ regarding the preconditions of interaction or interactivity, which an entity must satisfy to be recognizable as a potential interaction partner. These preconditions implicitly or explicitly express different ideas of humans (or subjectivity). The variety of theories can be roughly categorized according to six criteria. 1. Intelligence: some theories assume that only rational beings are capable of interaction and argue that interaction requires intelligence or the potential to mutually recognize interactive counterparts (Müller, 2011). 2. Intentionality: another line of thinking connects interaction to the ability of mutual understanding of goals and intentions, presupposing human beings as beings characterized by goals and intentions (Müller, 2011). 3. Embodiment: according to this line of thinking, interaction is closely tied to embodiment (and intentionality). Therefore, only purposefully moving beings (and, hence, beings with a body and goals) can meet the prerequisites of interaction (Vannini, 2016). 4. Adaptability: some theories stress the importance of a shared common meaning and linguistic understanding, which they deem necessary for the potential to adapt one’s own behavior to the behavior of potential interaction partners (Krappmann, 1998). 5. Symbolic exchange: in the tradition of symbolic interactionism, interaction is defined by the mutual exchange of interpretation, which is regarded as the ground of socialization (Blumer, 1986). 6. Intensity and friendliness: in his Formal Theory of Interaction, Herbert A. Simon (1952) develops a mathematical formulation of George C. Homans’s approach to human interaction as presented in his seminal theory of human group behavior.
Simon defines the intensity of interaction over time as a function of the level of friendliness among group members, the amount of activity carried on by members within the group, and, indirectly, the amount of activity imposed on the group by its environment. Interaction is formalized as \(I\left( t \right) = a_1 \cdot F\left( t \right) + a_2 \cdot A\left( t \right)\) (where F(t) is a function denoting friendliness over time, A(t) is a function denoting the activity carried on within the group over time, and a1 and a2 are constant coefficients weighting the contributions of friendliness and activity, respectively); the activity imposed by the environment enters the model through its influence on A(t) over time.
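Simon’s formula can be instantiated directly. In the sketch below, the coefficient values and the sample trajectories for F and A are assumptions chosen purely for illustration, not values taken from Simon (1952):

```python
def interaction_level(t, F, A, a1=0.6, a2=0.4):
    """Simon's I(t) = a1*F(t) + a2*A(t) for given friendliness and
    activity trajectories F and A (callables of time t).

    a1 and a2 are constant weighting coefficients; the default
    values here are illustrative assumptions.
    """
    return a1 * F(t) + a2 * A(t)

# Assumed example trajectories: constant friendliness,
# linearly growing group activity.
friendliness = lambda t: 1.0
activity = lambda t: 0.5 * t

print(interaction_level(2.0, friendliness, activity))  # 0.6*1.0 + 0.4*1.0 = 1.0
```

The point of the formalization is that interaction intensity is not a primitive but a weighted aggregate: holding friendliness fixed, any change in group activity (e.g., as induced by the environment) changes I(t) proportionally to a2.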

Regarding our SMPC model and the question of interaction subjects (who interacts?), the sociological concept of interaction was for a long time restricted to humans. With the advent of Bruno Latour’s Actor-Network Theory (ANT) (Latour, 2005) and STS, however, the concept became broader, now also covering interaction between humans and machines, or rather between human and non-human entities. Latour (2005) as well as Castells (2010) thus shifted the theoretical perspective toward a relational one, arguing that the physical and social world should be understood as a complex arrangement of intertwined relations between different entities, which do not have to be thought of as structurally identical to be attributed agency and interactivity. Following this line of argument, Latour and Castells rewrite the history of humans and technology by pointing out that the relations between humans and technological systems, humans and animals, as well as humans and their natural environment are of the same importance for sociological analysis as interpersonal relationships. Along the same lines, Werner Rammert (2002), a protagonist of German STS, raised the question of technological agency by following a functionalist rather than a materialistic approach. By focusing on the function of technology, Rammert demonstrates the ability of technology to act and interact on different societal levels, thereby shaping situational as well as institutional and societal structures.

Regarding the question of how interaction takes place, different modes of interaction can be distinguished. Whereas Jensen (1998) argues that interactivity and interaction are bound to the mutual adaptation of behavior and action, Blumer (1986) emphasizes the importance of the mutual exchange of interpretation. From the latter point of view, human interaction is mediated by using symbols and signification, by interpretation, or by ascertaining the meaning of others’ actions. The tradition of symbolic interactionism accentuates the importance of symbolization and distinguishes two types of interaction by introducing the difference between symbolic and non-symbolic interaction. The former occurs if an interaction partner has already interpreted the actions of their counterpart, i.e., interaction takes place in mutual role-taking as well as interpretation of behavior. On the contrary, the latter is exemplified in spontaneous, reactive responses to another’s actions. Interdependence of action is also crucial for Starkey Duncan (1989), who argues that a process is interactive if it involves at least two interdependent individuals, where interdependence is understood as a state of reciprocal awareness, i.e., every interaction partner is (a) aware of the presence of the other and (b) assumes that the other is aware of his or her presence. Regarding modes of HMI, Hayes and Reddy (1983) point out that interaction takes place if fragmented input can be parsed and combined, if abilities and limitations, actions and motives can be explained, and if a dialog can be started and perpetuated by keeping track of the focus of attention.

With a view to the purposes and contexts of interaction (why is interaction taking place? Where is interaction taking place?), the concept of interaction points to the construction and reproduction of social structures on different societal levels. Hence, interaction is a multi-stable process, whose purpose is dependent on the specific context in which interaction is taking place. Following Bahr and Stary (2016), interaction primarily has to be considered as a process of exchanging material and immaterial goods between acting parties (biological or technical entities) embedded in a certain context. This emphasis on context, which is shared by ANT, for instance, opens the analytical perspective for the importance of differentiation and the impossibility of generalizing hypotheses about possible outcomes of interaction. The range of possible outcomes includes the creation of order and meaning (symbolic interactions) as well as the construction of organizational and societal structures established through behavioral and cognitive schemes that result from repeated interactions and influence future events as patterns and expectations. Hence, the specific characteristics of each interaction, context, and partner are to be identified, as the very same interaction may lead to different results depending on the conditions of its context (e.g., an interaction between a robot and a resident of a nursing home may lead to a different result than the interaction between a customer of a supermarket and the same robot). These differences must be taken into consideration when developing or evaluating concrete instances of HMI.

Summing up the different theoretical strands, sociological concepts of interaction can be differentiated based on the preconditions that interaction partners must satisfy. While most theories exclusively reserve the concept of interaction for human exchange processes and their effects, with the rise of ANT and STS, a different approach was taken, which is of particular relevance for the analysis and evaluation of HMI. By extending the range of possible interaction partners to the realm of machines (in the broadest sense of this concept), ANT and STS enable using “interaction” as a heuristic concept to explain the technologically enabled conditions of modern sociality. Therefore, these approaches seem especially fruitful for understanding modern societal structures and social relations, whereas interaction concepts derived from the interpersonal sphere should only be cautiously transferred to contexts of HMI, since their preconditions often are not satisfied by technical systems.

“Interaction” in philosophy

In philosophy of technology, the development of the concept of interaction stretches from the pre-industrial age through industrialization to the present day: in the pre-industrial age, the focus was on humans in the “mirror of their machines” (Meyer-Drawe, 1996), and thus on their comparison. During industrialization, this focus shifted to the social consequences of an increasing integration of machines into everyday life: henceforth, it was no longer primarily about human–machine comparison, but about human–machine interaction (Gehlen, 2007; Müller and Liggieri, 2019). With the rapid development of new types of technologies, philosophers of technology were increasingly asking questions about concrete interactions between humans and machines. This focus is reflected by situated and contextual considerations of individual settings of HMI, which take modes and contexts as central aspects of interaction (how does interaction take place? Where does interaction take place?). This becomes particularly clear in recent approaches in the field of post-phenomenology as well as in technoscience. Both Peter-Paul Verbeek and Karen Barad, as contemporary key thinkers of the two fields, deal critically and productively with the notion of interaction by focusing on the aspect of the “between” in HMI.

In contrast, Shaun Gallagher, a well-known thinker of enactivism, introduces his Interaction Theory (IT) in which he sheds light on the very context of concrete interactions based on the thesis that every form of interaction can only be determined by its situatedness. Finally, approaches like Luciano Floridi’s analytically oriented Philosophy of Information must be mentioned, which focuses on HMI from an ethical and epistemological perspective. In the following, these four prominent positions and research fields are presented and discussed in more detail.

Verbeek, who is one of the leading thinkers from the ranks of post-phenomenologists, is particularly concerned with the phenomenon of “human technology mediation” (Verbeek, 2005). Thus, he does not directly focus on the concept of interaction, but rather on a critical examination of concrete situations in which certain relations between humans and technologies are revealed. For Verbeek, interaction therefore names only one of several possible relations between humans and technology. With this, he draws attention to the fact that in concrete practical settings, it is through interaction that the involved entities first appear as what they are: humans and machines “are not pre-given entities but rather […] mutually shape each other in the relations that come about between them.” (Verbeek, 2015: p. 28). In this respect, post-phenomenological analyses of HMI are concerned, first, with the mode of interaction and, second, with the concrete, practical context of HMI.Footnote 4

In the field of technoscience, it is above all Karen Barad (2007) who is concerned with a critical discussion of the relationship between humans and technologies (or more generally: objects). Her starting point is a critique of the subject position within philosophical traditions. To support this, she develops the concept of intra-action—a term that is used to replace the concept of “interaction”, which presupposes pre-established bodies that participate in action with each other. In contrast, intra-action understands agency not as an inherent property of an individual or human to be exercised, but as a dynamism of forces in which all designated entities are constantly exchanging and diffracting, influencing, and working inseparably. Intra-action thus “[…] acknowledges the impossibility of an absolute separation or classically understood objectivity, in which an apparatus (a technology or medium used to measure a property) or a person using an apparatus are not considered to be part of the process that allows for specifically located ‘outcomes’ or measurement” (Stark, 2016). Rather, “[…] ’individuals’ would only exist within phenomena (particular materialized/materializing relations) in their ongoing iteratively intra-active reconfiguring” (Barad and Kleinman, 2012: p. 76).

In this respect, Barad, like Verbeek, on the one hand, focuses on the “in-betweenness” of humans and machines, while, on the other, she is interested in the concrete ways in which agents appear in interactions. With a view to our SMPC model, here, too, the focus on mode and context is at the center, which highlights the accent certain contemporary philosophies of technology place on HMI when dealing with the benefits and limits of the notion of interaction.

Although Shaun Gallagher does not focus on HMI explicitly, he nevertheless defines interaction in such a way that it can be used for the analysis of HMI. Gallagher’s concept of interaction is situated in an enactivist perspective, implemented in what Gallagher—in distinction from Theory-Theory and Simulation Theory—calls IT. For him, “IT emphasizes the importance of context and circumstance, and the role of communicative and narrative practices” (Gallagher, 2020: p. 100). Hence, Gallagher—in analogy to Verbeek and Barad—points to the specific mode and context of interaction, insofar as the mode is expressed in the focus on communicative and narrative practices and the context in the focus on the situatedness of interactions: interaction comes into view as “a mutually engaged co-regulated coupling between at least two autonomous agents”, whereby “(a) the co-regulation and the coupling mutually affect each other and constitute a self-sustaining organization in the domain of relational dynamics”, and “(b) the autonomy of the agents involved is not destroyed, although its scope may be augmented or reduced.” (Gallagher, 2020: p. 99).

In contrast to this, Luciano Floridi’s analytical account of interaction is, at least, two-fold: first, in his Ethics of Information, he states that “interactivity means that the agent and its environment (can) act upon each other” (Floridi, 2013: p. 140). With a view to our SMPC model, he understands agents, i.e., autonomous, adaptable, and situated systems, as interacting with their environment to transform or produce certain effects upon each other via information exchange. Hence, according to Floridi, not only human agents may interact, but also artificial agents (insofar as they are autonomous and adaptable), whereby both can act as moral agents (“if and only if they are capable of morally qualifiable action” (Floridi, 2013: p. 147)). This, of course, has important ethical implications for at least certain contexts of interaction.

Apart from this ethical perspective on interaction, in his Logic of Information, Floridi (2019) presents a second understanding of interaction aiming at providing us with some epistemic criterion for existence. In this regard, he defines internal interaction as the interplay and interlock of mathematically describable structures, and external interaction as our relationship (as agents) with such structures. Hence, external interactions would provide us with some metaphysical criterion for existence (“ghosts do not exist […], because there is no LoA [level of abstraction] at which you can interact with them” (Floridi, 2019: p. 94)). In this understanding, according to our scheme, “we” (as agents) interact with mathematically describable structures in our environment via information exchange to account for facts or beliefs about certain facts.

In summary, in the fields of (post-)phenomenology, technoscience, and enactivism, an awareness of the limits and possibilities of the notion of interaction is currently emerging. In these fields, thinkers turn to the concept of interaction in a fruitful and productive way, embedding it in practical and situated contexts to address particular modes and contexts of interaction. Against this background, central insights include the critical reflection on the relationship between humans and technology or machines, which highlights a mutual dependence between humans and non-humans or technology and emphasizes the concrete context in which “interaction” appears as a meaningful concept. The analytical tradition, on the other hand, aims at developing a fully fledged concept of interaction regarding the four dimensions of the SMPC model, whose dimensions are specified differently from an ethical and from an epistemological perspective.

“Interaction” in psychology and cognitive science

In psychology, the term “interaction” is used and studied in different sub-disciplines (e.g., social psychology, differential psychology, developmental psychology, ergonomics, industrial, or organizational psychology), resulting in a wide range of differing understandings. In general, as regards the first and fourth dimension of our SMPC model (who interacts? Where does interaction take place?), psychological theories refer to two (or more) variables, psycho-physiological or interpersonal states, constructs, systems, environmental conditions, persons or behaviors interacting in specific social contexts (Dix et al., 2003; Bolis and Schilbach, 2020). However, psychological theories focus more on potential modes and purposes of interaction (how does interaction take place? Why does interaction take place?) as well as on possible psychological consequences of interaction.

The general question of how interaction takes place is answered in different ways, e.g., in the sense of being closely related to each other, of exchanging information, of interdependency, of reciprocal steering or control, of a joint influence on something third, or of collective transformations of the world (Hasson and Frith, 2016; Bolis and Schilbach, 2020). In general, interaction is understood as a dynamic process that relates to continuous mutual adaptation, a dynamic coupling of interacting parties, and a related development of complementary behavior (Hasson and Frith, 2016).

A wide variety of proposals exist in psychological literature regarding concrete modes of interaction (e.g., by gesture, facial expression, communication of content, cognition, emotion, intimacy, etc.). As regards HMI, cognitive aspects or mental states (thinking, learning, remembering, attention, perception, planning, decision-making, etc.) play a particularly important role in the context of psychology and cognitive science (Sharp, 2019; Cross and Ramsey, 2021): theories of cognition (e.g., mental models, information processing, distributed cognition, embodied interaction) are deemed highly important for studying, developing and designing interactive machines (Sharp, 2019). The same holds true for emotions, which are another important component of the psychological understanding of interaction and of research in psychology related to possible modes of interaction: on the one hand, emotional responses of users to machines or their design significantly influence concrete HMIs. On the other hand, interaction with machines can be used to influence people’s emotions in various ways (cf., e.g., persuasive technologies) (Sharp, 2019).

Modes of social interaction, e.g., in collaboration, communication, and coordination, are also highly important with a view to HMI. In psychology, social interaction is generally understood as a two-way process. The preconditions for such processes, however, not only complicate the development of HMI, but also the psychological understanding of human interaction with machines. The most prominent approaches to describe the preconditions of social interaction (e.g., common ground, perspective taking, and Theory of Mind (ToM)) assume that humans are equipped with direct and implicit knowledge of other humans (regarding, e.g., certain capacities of other humans, their physiological needs, etc.). A key difficulty for HMI in comparison to human–human interactions results from the fact that at least some aspects of social interaction between humans are never made explicit (Krämer et al., 2011). Furthermore, even if all aspects of human–human interaction could be made explicit, implementing the underlying rules or knowledge would not suffice to establish successful mentalization in machines and, hence, to allow developing forms of HMI that are similar to human–human interaction (Frith and Frith, 2003; Krämer et al., 2011).Footnote 5

Another model for unfolding interaction that is difficult to apply to machines is expressed in social penetration theory (Krämer et al., 2011; Fox and Gambino, 2021). According to this account, a stepwise reciprocal self-disclosure is necessary to develop relationships. However, when machines share information, such information is not based on personal values, experience, or self-image (Fox and Gambino, 2021). Thus, from a psychological point of view, there is much to suggest that the understanding of social interaction, i.e., the how of interaction between humans, is not fully transferable to the understanding of interaction between humans and machines.

For the psychological conceptualization of interaction between humans and machines and for the question of how interaction with machines takes place, it may therefore be more important to understand, recognize, and implement different types of interactions. For instance, users may give instructions to a system (typing commands, gesturing, etc.) (instructing type of interaction), enter dialogs with systems (conversing type), move through (virtual) spaces (exploring type), or respond to system-initiated interactions (responding type) (Sharp, 2019).
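The four interaction types named above can be summarized in a small classification. The following sketch is purely illustrative and not part of the cited literature; the enum names follow Sharp (2019), while the example activities and their mapping are our own hypothetical assumptions.

```python
from enum import Enum


class InteractionType(Enum):
    """Four types of HMI, following the typology attributed to Sharp (2019)."""
    INSTRUCTING = "user issues instructions to a system (typing commands, gesturing)"
    CONVERSING = "user enters a dialog with a system"
    EXPLORING = "user moves through a physical or virtual space"
    RESPONDING = "user reacts to a system-initiated interaction"


# Hypothetical everyday activities mapped to the four types (illustrative only)
examples = {
    "typing a shell command": InteractionType.INSTRUCTING,
    "asking a voice assistant a question": InteractionType.CONVERSING,
    "walking through a virtual museum": InteractionType.EXPLORING,
    "accepting a notification's suggestion": InteractionType.RESPONDING,
}
```

Such a typology shifts the analytical question from whether a machine satisfies the preconditions of social interaction to which type of interaction a concrete HMI setting instantiates.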

A central purpose of human–human interaction (why is interaction taking place?) is to form, maintain and shape relationships. In psychology, this aspect has also been addressed regarding HMI, both as a design goal and with a view to the psychological effects of relationships between humans and machines. The motivation for forming relationships generally is located in the human need to belong (Krämer et al., 2011; Fox and Gambino, 2021): humans as social beings seek the company of others, interact with others, and form various forms of interactional relationships (family, couple relationships, friendship, etc.). Against this background, interactions with machines in principle may also lead to relationships between humans and machines. More interesting from a psychological perspective, however, is the question under which exact conditions HMI may lead to the formation of relationships. Conditions that are discussed in this context are, among others, a certain degree of attractiveness, similarity, or reciprocal liking (Krämer et al., 2011; Fox and Gambino, 2021).

Other aspects regarding possible purposes of interaction are captured by social exchange theory, equity theory, or investment theory (Krämer et al., 2011; Fox and Gambino, 2021). A basic idea of social exchange is that people need to exchange different goods (tangible or abstract and with diverse functions) and cooperate to survive. Cost-benefit considerations play a crucial role here. Accordingly, humans would prefer relationships offering more advantages than disadvantages or costs (compared to other relationships), relationships in which they have invested more compared to others, or relationships in which partners are equal in the exchange process. The longer and closer a relationship is, however, the less equality seems to be a priority. These considerations are also applied to human interactions with machines, analyzing how people use cost-benefit trade-offs to enter satisfying interactions and relationships with robots, for example.

To summarize, according to the psychological understanding, interaction is not an exclusively human phenomenon. Thus, not only do humans interact with each other, but also, for instance, with certain human features (behavior, gestures, cognition, emotions, etc.), systems, environmental conditions, or machines. Furthermore, interaction types play an important role in describing and understanding different modes of interaction. In this context, the cognitive, emotional, and social features of humans are decisive for the psychological description and analysis of HMI modes. If, however, social interaction between humans and machines is understood by reference to theories of human–human interactions, the limits of transferability quickly become apparent.

“Interaction” in media studies and communication science

With the entry of the personal computer into everyday processes, media and communication studies experienced a radical upswing. In the 1980s, for example, the field of German Media Theory was established, which—under the auspices of Friedrich Kittler—identified the operations of transmission, storage, and processing as basic media functions (Kittler, 1993). While German media studies developed out of the humanities (more precisely, literary studies) and have a historical and theoretical focus, communication studies are more empirically oriented, i.e., they study the use and application of media technologies and conduct media research. Especially in the latter approach, concepts of interactivity (referring both to processes of interaction and to their necessary conditions) play a central role. Since Sheizaf Rafaeli’s (1988) account of reactive communication, for instance, interaction has been understood as a way of thinking about communication, where it was originally assumed as an attribute of face-to-face conversation (Isotalus, 1998; Rafaeli and Sudweeks, 1998) which then was extended to mediated communication settings and, as such, must be distinguished from social interaction (Bucy, 2004). In the latter regard, interaction in media studies and communication science “is used as a broad concept that covers processes that take place between receivers on the one hand and a media message on the other” (Jensen, 1998: p. 188). In this context, Bucy (2004) distinguishes two main types of interaction concepts: first, concepts that focus on human interaction with certain (e.g., online) content, and address the control that users exercise over its selection and presentation (cf., e.g., Steuer, 1995; McMillan, 2002; Stromer-Galley, 2004); and second, interaction processes that involve person-to-person conversations mediated by technology (cf., e.g., Massey and Levy, 1999). 
Furthermore, some approaches seek to combine these two types by focusing on the aspect of control in mediated person-to-person interaction processes (cf., e.g., Williams et al., 1988; Neuman, 1991). Common to these message-centered approaches is their strong focus on users, or more generally, their focus on human senders and recipients. With a view to our SMPC model of “interaction”, in message-centered approaches, users and media are deemed asymmetrical subjects of interaction with a focus on users (referring to the question of who interacts), who select, present, and control certain content (referring to the question of how interaction takes place) for the purpose of communication (referring to the question of why interaction takes place) in mediated contexts (referring to the question of where interaction takes place).

In contrast, structural approaches consider the technological attribute or media feature, which allows users to talk to other users, engage with or manipulate media, or influence its content. Here, interactivity is located as a property of technology or media. In this regard, Jensen, e.g., states that interactivity would be “a measure of a media’s potential ability to let the user exert an influence on the content and/or form of the mediated communication” (Jensen, 2008: p. 129). With a view to this, Jensen distinguishes three principal ways of defining interaction in media studies and communication science: first, approaches that define interactivity through prototypic examples (referring, e.g., to the telephone, audio conferencing systems, or email). Second, approaches defining interactivity through certain criteria deemed necessary for a reciprocal dialog between users and systems. And third, understandings of interaction as a continuum that can be present in varying degrees with reference to 1 to n dimensions (covering, for instance, the degree of choices available, the degree of modifiability, the quantitative number of the selections, and modifications available, or the degree of linearity or non-linearity). With a view to our SMPC model, in structural approaches users and media are understood as asymmetrical subjects of interaction with a focus on media and technology, which enables users to influence certain content for the purpose of communication in mediated contexts.

Apart from structural approaches, perceptual approaches (cf., e.g., McMillan, 2002) consider user perceptions as the unit of measure: the degree of interactivity, which is now presumed to have variable effects, is reflected in the extent to which users subjectively experience interactivity. As for the SMPC model, in perceptual approaches, users and media are deemed asymmetrical subjects of interaction with a focus on users and their experience of interaction with media and technology for the purpose of communication in mediated contexts.

Kiousis, for instance, combines a structural and perceptual approach when postulating that “[i]nteractivity can be defined as the degree to which a communication technology can create a mediated environment in which participants can communicate (one-to-one, one-to-many, and many-to-many), both synchronously and asynchronously, and participate in reciprocal message exchanges […]. Regarding human users, it additionally refers to their ability to perceive the experience as a simulation of interpersonal communication and increase their awareness of telepresence” (Kiousis, 2002: p. 372).

In traditional communication science, however, structural approaches were discussed quite critically. Ha and James (1998: p. 461), for instance, state that “[i]nteractivity should be defined in terms of the extent to which the communicator and the audience respond to, or are willing to facilitate, each other’s communication needs” and, hence, claim message-centered approaches as the only plausible accounts of interaction. Schumann et al. justify this claim by postulating that “[u]ltimately it is the consumer’s choice to interact, thus interactivity is a characteristic of the consumer and not a characteristic of the medium. The medium simply serves to facilitate the interaction” (Schumann et al., 2001: p. 45).

While all these perspectives have a strong focus on human users and social interaction, German Media Theory focuses more on technology-immanent interaction processes, with human users being only one actor among others. Bernhard Siegert, for example, examines the sign practices used for representations in media systems and thus investigates the beginning of electric media (Siegert, 2003). Friedrich Kittler, on the other hand, in his seminal study Aufschreibesysteme 1800/1900 examines “the network of techniques and institutions […] that allow a given culture to address, store, and process relevant data” (Kittler, 1985: p. 519), focusing on technologies as diverse as the letterpress, the phonograph, or the gramophone. Here, it is not the human user who is at the center of interaction processes, but the diverse operations of medial mediation.

Conclusion

The terms “interaction” and “interactivity” are used in many ways in debates on HMI. Ultimately, however, this results in the concept of interaction being vague and ambiguous, which makes it difficult to reasonably discuss questions of ethics, politics, engineering, and the sciences regarding HMI. Against this background, we analyzed the different meanings attached to interaction in the scientific disciplines relevant to debates on HMI to provide a basis for a fruitful intra- and particularly interdisciplinary discourse on HMI. For this purpose, we introduced the SMPC model, which construes interaction as a four-dimensional concept referring (1) to certain subjects (i.e., to the question: who interacts?), (2) to modes of interaction (how do these subjects interact?), (3) to purposes of interaction (why, or: for what reasons is interaction taking place?), and (4) to certain contexts (where, or: under what conditions is interaction taking place?), and which is intricately connected to the terms “interactivity”, “interactability”, and “interactiveness”. In view of this, our analysis showed a broad range of understandings of interaction in the disciplines of informatics and computer science, game theory, sociology, philosophy, psychology, and cognitive science as well as media studies and communication science (Table 1). Moreover, manifold positions regarding the connected terms became obvious (Fig. 2).
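The four dimensions of the SMPC model can be rendered as a simple record structure. The following sketch is an illustration of the model's shape, not part of the article's argument; the field names follow the four questions above, and the example entry paraphrasing the message-centered approaches from communication science is our own hypothetical reading.

```python
from dataclasses import dataclass


@dataclass
class SMPCDescription:
    """One disciplinary reading of 'interaction' along the four SMPC dimensions."""
    subjects: str  # who interacts?
    mode: str      # how do these subjects interact?
    purpose: str   # why is interaction taking place?
    context: str   # where, or under what conditions, is interaction taking place?


# Hypothetical entry: the message-centered approaches of communication science,
# paraphrased along the four dimensions (illustrative only)
message_centered = SMPCDescription(
    subjects="users and media, asymmetrical, with a focus on users",
    mode="selection, presentation, and control of content",
    purpose="communication",
    context="mediated settings",
)
```

Filling in one such record per discipline would reproduce, in compressed form, the comparative overview of Table 1.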

Table 1 Interaction in the disciplines.
Fig. 2: Interactivity, interaction, interactability, and interactiveness in the disciplines.

Relation of terms connected with “interaction” and their reference in the disciplines.

This variety of understandings naturally results from the different research subjects and foci of the disciplines. Therefore, it seems impossible to develop one correct definition of interaction or interactivity. At the same time, this highlights the need for a basic understanding of the respective meaning(s) used in concrete debates on issues of HMI. For, after all, the underlying understanding of interaction influences, e.g., normative analyses or the development of recommendations for dealing with HMI. Even if no definitive understanding can be found to be consistently used across the disciplines, it is useful to be aware of the different disciplinary approaches to the phenomenon of interaction (in HMI). In this way, misunderstandings or (normative) decisions made on the wrong conceptual basis can be avoided.