Looking to recognise: the pre-eminence of semantic over sensorimotor processing in human tool use

Abstract

Alongside language and bipedal locomotion, tool use is a characterizing activity of human beings. Current theories in the field embrace two contrasting approaches: “manipulation-based” theories, which are anchored in the embodied-cognition view, explain tool use as deriving from past sensorimotor experiences, whereas “reasoning-based” theories suggest that people reason about object properties to solve everyday-life problems. Here, we present results from two eye-tracking experiments in which we manipulated the visuo-perceptual context (thematically consistent vs. inconsistent object-tool pairs) and the goal of the task (free observation or looking to recognise). We found that participants exhibited reversed tools’ visual-exploration patterns, focusing on the tool’s manipulation area under thematically consistent conditions and on its functional area under thematically inconsistent conditions. Crucially, looking at the tools with the aim of recognising them produced longer fixations on the tools’ functional areas irrespective of thematic consistency. In addition, tools (but not objects) were recognised faster in the thematically consistent conditions. These results strongly support reasoning-based theories of tool use, as they indicate that people primarily process semantic rather than sensorimotor information to interact with the environment in an agent’s consistent-with-goal way. Such a pre-eminence of semantic processing challenges the mainstream embodied-cognition view of human tool use.

Introduction

Tool use represents a fundamental facet of the human intrinsic ability to interact with the environment. Alongside bipedal locomotion and language, tool use is a founding characteristic of human beings. Therefore, the study of the cognitive mechanisms underlying the processing of tools is crucial in Cognitive Science. The critical relevance of the topic is well demonstrated by the large amount of research on perceptual and semantic processing of functional and motor properties of tools (also called “affordances”) that has been done in the last forty years1.

It should be noted that “affordance” is probably one of the most ambiguous words in experimental psychology: it has acquired a multiplicity of meanings over time and has generated confusion in the field of tool use, even among scholars. In an effort to reduce this ambiguity, in this study we endorsed the definition recently proposed by Osiurak and colleagues, i.e., “an animal-relative, biomechanical property specifying an action possibility within a body/hand-centered frame of reference” (p. 410)1. Such an action possibility pertains to the physical domain rather than to the neurocognitive domain, which instead concerns how affordances are perceived. To study the latter, eye-tracking paradigms, which investigate individuals’ gaze behaviour and their allocation of visuospatial attention, are good candidates to advance our understanding of the cognitive mechanisms associated with affordance perception2. Within this framework, we used the word “affordance”, or any synonym of it, to refer to the action possibilities prompted by the visuo-perceptual context.

As a class of objects with intrinsic action and motor features3,4,5, tools are traditionally defined as handheld physical implementations that amplify the user’s sensorimotor capabilities. Note that, in this way, it is possible to use the word “object” in order to refer to the plausible recipient of an action6. However, in everyday life, tools are rarely used in isolation. They are generally part of a broader scenario that includes other objects and the set of contextual and spatial relations7. Thus, a large number of researchers used paradigms with paired objects (i.e., object-tool pairs) to investigate how the visuo-perceptual context modulates the functional and motor properties of tools8,9.

Interest in paired-object affordances has increased after the finding that visual extinction – a phenomenon in which patients with parietal damage fail to report stimuli presented on the contralesional visual field when two objects are simultaneously prompted – is reduced for objects co-located for action10,11. Using various paradigms, a series of studies corroborated the specificity of the paired-object affordance effect. Specifically, experiments with healthy participants investigated the cognitive and neural mechanisms underlying the facilitation that co-located-for-action and functionally linked objects provide to the perception of paired objects8,12,13. To date, speeded classification responses to paired objects emerge when objects are positioned in a standard co-location for right-handed actions8. The extraction of potential interactions between objects takes place automatically, with an affordance-related activation for objects that are “active” (for action purposes) in a visual scene and an affordance-related inhibition for “passive” objects in a visual scene14.

The majority of studies related to the paired-object affordance effect used compatibility paradigms to verify whether “action features” associated with an object (e.g., “graspability”) or with the visuo-perceptual context (e.g., functional or spatial relations with the objects of an object-tool pair) may have an impact on a task for which those action features are not relevant (e.g., categorization tasks). Less is known about the relative role of the visuo-perceptual context and action-related information on object recognition. Indirect suggestions come from priming studies in which a higher naming accuracy was achieved when a single object (target) was preceded by a different object (prime) with similar motor interactions. Intriguingly, the effect disappeared when the prime was a word, suggesting an action representation based on visual object information15. However, it has long been known that object naming and phonological retrieval rely on several distinct processes involving, among others, but not exclusively, object recognition. In turn, object recognition is based on the perception of form and colour, on visual analysis of the figure-ground relation and on the activation of stored semantic memories16. Therefore, object naming should not be taken as a synonym of object recognition.

From a strictly perceptual point of view, it has been suggested that an observer decodes a visual scene by incorporating functional information derived from relations between objects17. In particular, it appears that observers tend to perceptually group objects following familiar functional relations (e.g., a pitcher and a glass), as objects and their functional relations interact for object identification12. More recently, in a behavioural experiment, Borghi and colleagues9 used black-and-white images displaying two manipulable objects linked by either a functional (e.g., knife–butter) or a spatial (e.g., knife–coffee mug) relation. Results showed faster relatedness responses when objects were functionally rather than spatially linked.

Objects’ functional properties pertain to a kind of semantic knowledge (i.e., functional knowledge) associated with the object identity (“What is it?”) and with the goals (“What can I do with?”) attainable by using the object, whereas manipulation knowledge (i.e., sensorimotor knowledge) is related to the proper handling of an object18. It appears that objects’ functional knowledge can be conceived as a component of objects’ conceptual representation19. However, a long-established neuropsychological research tradition situates manipulation knowledge as central in tool use20,21. The so-called “manipulation-based” hypothesis generated strong resonance within the embodied cognition approach, which suggests that object knowledge is constituted by information inscribed within the motor and sensory systems20,21,22,23,24,25. In the embodied cognition perspective, it seems that the main point about tool use is to know how to manipulate it (i.e., using stored sensorimotor knowledge), rather than to reason about how the tool can be used alone or in interaction with other objects. Such a manipulation-based approach appears to be rather simplistic, especially if one considers that humans use tools to solve everyday problems, i.e., as a problem-solving situation sustained by technical reasoning skills26. Accordingly, recent lines of research contrasted the well-established manipulation-based approach and proposed a reasoning-based perspective whose basic assumption is that people reason about the physical object properties to solve everyday-life problems. Thus, upon seeing a tool, people do not automatically activate manipulation knowledge. Rather, they are confronted with everyday issues (e.g., hanging a picture on the wall), so that they use mechanical knowledge (i.e., technical reasoning) to reason about how to use a tool (e.g., a hammer) and solve the problem (e.g., pounding a nail in the wall). 
In other words, the reasoning-based theoretical framework supports the idea that people do not passively learn the relationship between objects (manipulation-based approach), but they dynamically generate it in order to “act” within a context1,6,27,28,29,30,31,32.

Classical neuropsychological models posit a dissociation between the visual processing streams associated with object recognition and those associated with object-directed action. It is generally accepted that the ventral stream (vision-for-perception system) is involved in object identification and recognition whereas the dorsal stream (vision-for-action system) deals with action-related and visuospatial object information mainly involved in the localization of objects in the space33. However, recent lines of research challenged the idea of functionally-separated processing streams for object recognition and object-directed action, assuming a joint and flexible involvement of ventral and dorsal brain areas in affordance processing1,34,35,36. In particular, as an attempt to overcome the dichotomy between manipulation-based and reasoning-based approaches, the so-called Three Action-System model (3AS) has been recently proposed1. On the basis of the dorsal-system partition34, the three neurocognitive systems of the 3AS that underlie the perception of affordances, mechanical knowledge and function knowledge are supposed to be, respectively, the dorso-dorsal system (i.e., the motor control system, in particular the bilateral superior parietal cortex and the intraparietal sulcus), the ventro-dorsal system (mainly the left inferior parietal cortex) and the ventral system (mainly the left temporal cortex). Thus, on the one hand, the reasoning-based approach to human tool use is supported by classical developmental studies that considered tool use as a problem-solving occurrence supported by technical reasoning26; on the other hand, the reasoning-based approach appears to be consistent with neuropsychological evidence that highlights the involvement of a wide and complex fronto-parietal and occipito-temporal brain network in tool use and affordance processing35.

In a recent eye-tracking study, Federico and Brandimonte2, using an ecological experimental task (i.e., looking at 3D colour images depicting single tools or object-tool pairs), highlighted distinctive differences in participants’ visual exploration patterns as the degree of “action readiness” evoked by the visuo-perceptual context changed. To manipulate action readiness, the authors used thematically consistent (e.g., hammer-nail, both in the peri-personal space), thematically inconsistent (e.g., hammer-steel pot, both in the peri-personal space) and spatially inconsistent (e.g., hammer-nail, with the hammer in the peri-personal space and the nail in the extra-personal space) object-tool pairs. Results showed that single tools and tools of object-tool pairs were initially fixated longer on their functional area. However, when the time window of analysis was extended, tools of thematically consistent object-tool pairs were visually encoded in a more suited-for-action way. Indeed, the fixation pattern focused on the manipulation area of the tool (e.g., the handle of a hammer) more than on its functional area (e.g., the head of a hammer). Conversely, tools of thematically and spatially inconsistent pairs showed a reversed visuo-attentional pattern, with the functional area fixated longer than the manipulation area. It should be noted that the experimental paradigm devised by Federico and Brandimonte2 involved an ecological task in which participants were asked to look at the visual scene in a natural way. Such a free-viewing task might implicitly activate the goal of searching for potential mechanical actions between tools and objects.
Hence, differences in visuo-attentional patterns might be evocative of a function-to-mechanical-to-motor “cascade” cognitive mechanism through which participants initially visually explore the scene (object-tool pairs) to gather the tool’s function knowledge (“What is it?”), then they try to solve first the mechanical knowledge issue (“How to use the tool with the object?”) and, finally, the motor control issue (“How to grasp and manipulate the tool?”). For instance, when the visuo-perceptual context is easy to decode in terms of action readiness (e.g., hammer-nail), the mechanical knowledge issue is promptly solved and the motor control instantiated, as indicated by the increase in the fixation time of the tool’s manipulation area. Conversely, when the visuo-perceptual context does not promote action readiness (e.g., bottle-cap), the mechanical knowledge issue may not be so quickly solved, so that the motor control may not be activated, as highlighted by the increment in the fixation time of the tool’s functional area. Those results were interpreted by the authors within a reasoning-based theoretical perspective, as suggesting that the flexible visuo-attentional patterns observed in the study might reflect the engagement of different tool-use neurocognitive systems (i.e., the Three Action-System)1. Within that theoretical frame of reference, Federico and Brandimonte2 introduced the concept of “action reappraisal” to refer to the cognitive processing of multiple sources of information (e.g., affordances, mechanical knowledge, functional knowledge, abstract knowledge, etc.) that can be used by an agent in order to reason about the possibility to act within and upon a context, in a proper and agent’s consistent-with-intention way27,28. 
The action reappraisal idea appears to be supported by recent neuropsychological evidence indicating the inferior parietal cortex and the middle temporal brain areas as regions where a multimodal integration of action and semantic information takes place to generate high-level cognitive representations about tools35,37,38,39.

Despite the intrinsic appeal of the action reappraisal idea, though, many questions still remain unanswered. One basic issue refers to the nature of the processing required by the task. In fact, Federico and Brandimonte2 used an implicit low-level task (i.e., free visual exploration of 3D images) in which the “to look at” instruction was self-sufficient for the task to be performed, with no further elaborative processing involved. However, to go a step further in the knowledge about the role of reasoning-based processing in human tool use, one should test the action reappraisal idea by introducing a kind of task that needs higher-level processing to be performed. The best candidate for such a research question is a simple short-term recognition task, which explicitly requires participants to look at the images with the higher-level purpose of recognising them as being present or not in the previously seen pair. The joint measures of visual exploration patterns and recognition performance should help disentangle the relative role of reasoning vs. manipulation-based processes. Therefore, in the present article, we investigated the effects of the action readiness prompted by the visuo-perceptual context on performance in both a lower level (free visual exploration) and a higher level (object short-term recognition) task.

We ran two experiments. The first experiment aimed to replicate the effects reported in a previous work2 and to extend them to a new set of stimuli with a simplified paradigm. Hence, we used eye-tracking to analyse the visuospatial attentional patterns of participants looking at 3D images depicting object-tool pairs that could be thematically consistent or thematically inconsistent, with the object on the left and the tool on the right, both in the person’s peri-personal space. The crucial independent variable was the thematic consistency between the stimuli composing the object-tool pairs, with the assumption that affordance perception should be facilitated in the thematically consistent condition by virtue of the higher action readiness elicited by the visuo-perceptual context2. In particular, we analysed the fixation patterns related to the tools of the object-tool pairs. The Areas of Interest (AOIs) considered in Experiment 1 were the ones related to the functional (middle-top area) and to the manipulation (middle-bottom area) parts of the tool. In accordance with previous results2, we expected the visual exploration patterns of the tool to differ as the action readiness prompted by the visuo-perceptual context changed. Specifically, a longer fixation duration on the manipulation areas of the tools was expected in the thematically consistent condition.

In Experiment 2, we used eye-tracking during a yes-no short-term recognition task, in which new participants were first presented with the same object-tool pairs as in Experiment 1 (thematically consistent and thematically inconsistent pairs) and then, after each pair presentation, asked to decide whether a subsequent single object (or a single tool) was present in the original, just seen, pair. The main reason for using a short-term object recognition task was that, due to its explicit, high-level, semantic nature, such a task is instrumental in analysing differences in visual exploration patterns under object-tool thematically consistent or inconsistent conditions. In addition, a short-term recognition task should prevent participants from using recoding strategies of object-tool relations that might bias responses at test. Hence, Experiment 2 explored the novel, specific hypothesis that the activation of higher cognitive processes prompted by the goal-directed nature of the recognition task should influence the processing of the visual scene. Indeed, if the concept of action reappraisal is correct, then, contrasting a simple, spontaneous behaviour (looking at; Experiment 1) with a goal-directed behaviour (looking to recognise; Experiment 2) should make the functional-to-mechanical-to-motor cascade mechanism emerge more clearly. In particular, in Experiment 1, the activation of the implicit goal of searching for potential mechanical actions between tools and objects should produce a visuo-attentional pattern that, under higher action readiness conditions (i.e., in the thematically consistent conditions), favours the tool’s manipulation area, as a consequence of the motoric nature of the implicit task (i.e., a simpler resolution of the cascade mechanism). 
In Experiment 2, the higher level, goal-directed nature of the recognition task should promote functional/semantic processing over motor activations, with more fixations on the functional areas of the tools also in the thematically consistent condition. In other words, the cognitive nature of the recognition task should prevent a reasoning-based agent from proceeding toward the mechanical and then the motor processing, hence lingering in the functional/semantic processing.

Furthermore, in order to investigate the temporal dynamics of tool’s visual exploration, for both experiments we used two different time windows of eye-tracking data analysis (500 ms and 1000 ms). We expected a visual-attentional pattern initially (first 500 ms of visual exploration) focussing on the tools’ functional area in both the experiments and in all experimental conditions2. Finally, as regards recognition performance, in accordance with some recent literature9,12,15,17,18,37, faster recognition was predicted under thematically consistent conditions.
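The two time windows described above can be illustrated with a minimal sketch of how fixation time on an AOI might be accumulated within a window. This is not the authors' analysis pipeline: the `Fixation` and `AOI` types, the rectangular AOI geometry, the pixel coordinates and the choice to clip a fixation that straddles the end of the window are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    onset_ms: float      # time from stimulus onset (hypothetical field)
    duration_ms: float
    x: float             # screen coordinates, y grows downward
    y: float


@dataclass(frozen=True)
class AOI:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def fixation_time_in_window(fixations, aoi, window_ms):
    """Total fixation time (ms) on `aoi` within [0, window_ms].

    Assumption: a fixation that extends past the window is clipped
    at the window boundary rather than dropped.
    """
    total = 0.0
    for f in fixations:
        if not aoi.contains(f.x, f.y):
            continue
        start = max(f.onset_ms, 0.0)
        end = min(f.onset_ms + f.duration_ms, window_ms)
        total += max(end - start, 0.0)
    return total


# Illustrative, made-up geometry: functional area above, manipulation area below.
functional = AOI("functional", left=0, top=0, right=100, bottom=49)
manipulation = AOI("manipulation", left=0, top=51, right=100, bottom=100)

# Illustrative, made-up fixation sequence for one trial.
fixations = [
    Fixation(onset_ms=0, duration_ms=200, x=50, y=20),    # functional
    Fixation(onset_ms=250, duration_ms=300, x=50, y=80),  # manipulation
    Fixation(onset_ms=600, duration_ms=500, x=50, y=25),  # functional
]

for window in (500, 1000):
    print(window,
          fixation_time_in_window(fixations, functional, window),
          fixation_time_in_window(fixations, manipulation, window))
```

With this made-up trial, the balance between the two AOIs differs across the 500 ms and 1000 ms windows, which is why the two windows are analysed separately.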

Results

Experiment 1

Experiment 1 aimed to assess whether the action readiness evoked by the visual scene could modify tools’ fixation patterns. The mean time participants spent fixating the manipulation and functional AOIs of the tool, within a time window of analysis of 1000 ms, is summarized in Table 1. Data for the first 500 ms of visual exploration are reported in Table 2.

Table 1 Experiment 1 – Mean Fixation Time of Manipulation and Functional AOIs of the Tool. Time window of analysis: 1000 ms.
Table 2 Experiment 1 – Mean Fixation Time of Manipulation and Functional AOIs of the Tool. Time window of analysis: 500 ms.

As regards the extended time window of analysis (1000 ms), a repeated-measures ANOVA revealed an interaction between Thematic Consistency and AOIs, F(1, 14) = 30.47, p < 0.001, ηp2 = 0.69. Post-hoc pairwise comparisons revealed that fixation duration for the manipulation AOI was longer in the thematically consistent than in the thematically inconsistent condition (p < 0.01). Conversely, the functional AOI was fixated longer in the thematically inconsistent than in the thematically consistent condition (p < 0.05). In the thematically consistent condition, the manipulation AOI of the tool was fixated longer than its functional AOI (p < 0.05), whereas in the thematically inconsistent condition the functional AOI was fixated longer than the manipulation AOI (p < 0.01). The interaction effect is shown in Fig. 1. No main effects of Thematic Consistency or AOIs were found.

Figure 1

Experiment 1 – Tool’s visual exploration. In the thematically consistent condition, tools were fixated longer on their manipulation area whereas in the thematically inconsistent condition they were fixated longer on their functional area.

As regards the tool’s initial visual exploration (i.e., with a time window of analysis of 500 ms), a second repeated-measures ANOVA revealed a main effect of the AOIs (Functional vs. Manipulation) on the tool’s fixation duration, F(1, 14) = 30.35, p < 0.001, ηp2 = 0.68. The Functional AOI (M = 76.65 ms, SD = 49.97) obtained longer fixations than the Manipulation AOI (M = 20 ms, SD = 23.31). Neither a main effect of Context nor an interaction was found.
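As a reading aid, the effect sizes reported for Experiment 1 can be cross-checked against the F statistics and their degrees of freedom, using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is only a consistency check on the reported values; the function name is an assumption, and nothing here reproduces the authors' software.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported effect size), values taken from the text
reported = [
    ("Thematic Consistency x AOIs, 1000 ms", 30.47, 1, 14, 0.69),
    ("AOIs main effect, 500 ms", 30.35, 1, 14, 0.68),
]

for label, f_value, df_effect, df_error, eta_reported in reported:
    eta_computed = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: computed {eta_computed:.2f}, reported {eta_reported:.2f}")
```

Both computed values round to the reported ones, so the reported F statistics and effect sizes are mutually consistent.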

As hypothesised, participants’ visuo-attentional patterns focused on the manipulable part (e.g., the handle of a hammer) more than on the functional part (e.g., the head of a hammer) of the tools in the thematically consistent condition. Conversely, the functional area of the tools obtained more fixations in the thematically inconsistent condition (Figs. 1 and 2). Thus, tools’ specific areas obtained a reversed allocation of visuospatial attention as the action readiness evoked by the visuo-perceptual context changed. Importantly, the tool’s visual exploration started from the tools’ functional areas, as emerged from the analysis of a time window of 500 ms. These results confirm and extend previous findings regarding the emergence of peculiar tools’ fixation patterns as an effect of action reappraisal2.

Figure 2

Experiment 1 – Visual exploration heatmaps. Example of heatmaps related to participants’ visual exploration of object-tool pairs used in Experiment 1. (A) A heatmap of a thematically consistent object-tool pair (bowl-whip) that highlights a visuo-attentional focus on the manipulable part of the tool. (B) A heatmap of a thematically inconsistent object-tool pair (shoe-whip) that highlights a visuo-attentional focus on the functional part of the tool. For both (A) and (B) the time window of the eye-tracking analysis was 1000 ms.

Experiment 2

In Experiment 2 we used a higher-level, goal-directed short-term object recognition task to explore whether functional/semantic processing may take over motor activations, and how this is reflected in the visual exploration of the tool. In addition, Experiment 2 aimed to assess differences in object recognition performance as the action readiness of the visual context changed.

Eye-tracking data

The mean time participants spent fixating the manipulation and functional AOIs of the tool within the extended time window of analysis (1000 ms) is summarized in Table 3, whereas data for the restricted time window of analysis (500 ms) are reported in Table 4.

Table 3 Experiment 2 – Mean Fixation Time of Manipulation and Functional AOIs of the Tool. Time window of analysis: 1000 ms.
Table 4 Experiment 2 – Mean Fixation Time of Manipulation and Functional AOIs of the Tool. Time window of analysis: 500 ms.

A repeated-measures ANOVA revealed a main effect of the AOIs on tool fixation duration within the extended time window of analysis (1000 ms), F(1, 28) = 66.62, p < 0.001, ηp2 = 0.70. This main effect was due to longer fixation durations on the Functional AOI (M = 247.29 ms, SD = 107.35) than on the Manipulation AOI (M = 79.86 ms, SD = 57.37). This effect is shown in Fig. 3. A main effect of Thematic Consistency on tool fixation duration was also found, F(1, 28) = 5.02, p = 0.033, ηp2 = 0.15. This main effect was due to longer fixation durations in the thematically inconsistent condition (M = 183 ms, SD = 117.25) than in the thematically consistent condition (M = 152.78 ms, SD = 123.49). No interaction was found.

Figure 3

Experiment 2 – Mean Fixation Time of tool’s functional and manipulation AOIs. In Experiment 2, tools were fixated longer on their Functional AOI in all experimental conditions (p < 0.001). Vertical bars denote 0.95 confidence intervals.

A second repeated-measures ANOVA revealed a main effect of the AOIs (Functional vs. Manipulation) on tool fixation duration within the first 500 ms of visual exploration, F(1, 28) = 82.88, p < 0.001, ηp2 = 0.75. Functional AOIs (M = 82.18 ms, SD = 47.24) were fixated longer than Manipulation AOIs (M = 19.39 ms, SD = 23.2). Neither a main effect of Context nor an interaction was found.

As predicted, these results highlight a visuo-attentional pattern that emphasises the tool’s functional areas in all experimental conditions (Figs. 3 and 4). Participants looked at the functional part of the tools longer in both the thematically consistent and the thematically inconsistent conditions. In addition, the tool’s mean fixation time was significantly longer in the thematically inconsistent condition. As regards the tool’s initial visual exploration (time window of analysis of 500 ms), no differences from Experiment 1 were found, with the tools’ functional area fixated longer than the manipulation area.

Figure 4

Experiment 2 – Visual exploration heatmaps. Example of heatmaps related to participants’ visual exploration of object-tool pairs used in Experiment 2. In both thematically consistent (A; e.g., bowl-whip) and thematically inconsistent (B; e.g., shoe-whip) conditions, the visuo-attentional focus was on the functional part of the tool. For both (A) and (B) the time window of the eye-tracking analysis was 1000 ms.

Behavioural data

Reaction times and accuracy are summarized in Table 5.

Table 5 Experiment 2 – Object recognition: mean reaction times and accuracy.
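The behavioural analyses that follow are reported separately for Hits and Correct Rejections. The sketch below shows one way yes-no recognition trials could be scored and their reaction times averaged per condition. The field names, the trial values and the helper functions are hypothetical and serve only to make the response categories concrete; they are not the authors' data or code.

```python
from statistics import mean


def classify(was_in_pair: bool, said_yes: bool) -> str:
    """Standard yes-no recognition outcome for a single trial."""
    if was_in_pair:
        return "hit" if said_yes else "miss"
    return "false_alarm" if said_yes else "correct_rejection"


def mean_rt_by_outcome(trials):
    """Mean RT (ms) for correct responses only, keyed by
    (outcome, thematic consistency, object type)."""
    groups = {}
    for t in trials:
        outcome = classify(t["was_in_pair"], t["said_yes"])
        if outcome not in ("hit", "correct_rejection"):
            continue  # errors (misses, false alarms) are excluded from RT means
        key = (outcome, t["consistency"], t["object_type"])
        groups.setdefault(key, []).append(t["rt_ms"])
    return {key: mean(rts) for key, rts in groups.items()}


# Made-up trials, for illustration only.
trials = [
    {"consistency": "consistent", "object_type": "tool",
     "was_in_pair": True, "said_yes": True, "rt_ms": 600.0},
    {"consistency": "consistent", "object_type": "tool",
     "was_in_pair": True, "said_yes": True, "rt_ms": 700.0},
    {"consistency": "inconsistent", "object_type": "tool",
     "was_in_pair": True, "said_yes": True, "rt_ms": 800.0},
    {"consistency": "inconsistent", "object_type": "tool",
     "was_in_pair": False, "said_yes": False, "rt_ms": 900.0},
    {"consistency": "inconsistent", "object_type": "object",
     "was_in_pair": False, "said_yes": True, "rt_ms": 650.0},
]

for key, rt in mean_rt_by_outcome(trials).items():
    print(key, rt)
```

Keeping Hits and Correct Rejections in separate cells mirrors the structure of the analyses below, where each response category is submitted to its own ANOVA.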

Hits.

A repeated-measures ANOVA revealed a main effect of Thematic Consistency on RTs, F(1, 25) = 6.93, p = 0.014, ηp2 = 0.22. A significant interaction between Thematic Consistency and Object Type was also found, F(1, 25) = 4.27, p = 0.049, ηp2 = 0.15. Post-hoc pairwise comparisons revealed that reaction times for tools were longer in the thematically inconsistent than in the thematically consistent condition (p < 0.05). The effect is shown in Fig. 5a. No main effect of Object Type was found.

Figure 5

Experiment 2 – Mean RTs for Hits and Correct Rejections. (A) The interaction between Object Type and Thematic Consistency. Tools of object-tool pairs, but not objects, were recognised faster in the thematically consistent than in the thematically inconsistent condition. Vertical bars denote 0.95 confidence intervals. (B) The main effect of Object Type, with tools being rejected slower than objects. Vertical bars denote 0.95 confidence intervals.

Correct Rejections.

A repeated-measures ANOVA revealed a main effect of Object Type on RTs, F(1, 25) = 5.01, p = 0.034, ηp2 = 0.17. The effect is shown in Fig. 5b. Neither a main effect of Thematic Consistency nor an interaction was found.

The analysis of the behavioural data showed a main effect of Thematic Consistency on object recognition. When a thematically consistent pair preceded a single tool or object, participants were faster at recognising the second stimulus as part of the pair. In particular, single tools, but not objects, were recognised faster after the presentation of thematically consistent object-tool pairs (Fig. 5a). Importantly, a mirror effect of Object Type emerged for the Correct Rejections, with objects being rejected faster (Fig. 5b) than tools.

Overall, the results of Experiment 2 showed that the tool’s functional area was fixated longer irrespective of the experimental condition. In addition, tools of thematically consistent object-tool pairs were fixated for a shorter time and recognised faster.

Discussion

In two experiments, we explored whether and, if so, how the visuo-perceptual context and the goal of the task may influence both the tool’s visual exploration patterns and object recognition performance. In particular, in Experiment 1, we analysed by eye-tracking the fixation patterns of tools that were part of object-tool pairs by using an implicit, low level, “looking at” visual-exploration task, while in Experiment 2 we combined eye-tracking with an explicit, higher level, “looking to recognise” short-term yes-no recognition task. By contrasting these two distinct kinds of tasks, we aimed to highlight differences in the processing of the visual scene as the action readiness elicited by the visuo-perceptual context and the goal directedness of the task changed.

The results of Experiment 1 showed that when the visuo-perceptual context prompted high action readiness (i.e., under thematically consistent conditions), a distinctive fixation pattern of the part of the tool involved in its use (i.e., manipulation area) emerged. Conversely, when the context elicited lower action readiness (i.e., in the thematically inconsistent scenes), tools were fixated longer on their functional part. The results of Experiment 1 confirm and extend previous findings2, by using a new set of stimuli, a simplified paradigm and a different time-window of analysis.

Recent evidence has indicated that the initial visual exploration of a tool is at least in part aimed at gaining its function, with longer fixations to the tool’s functional area (“What is it?”)2,40. Then, a mental simulation of the action can be produced as an effect of mechanical knowledge (“How to use it?”). Hence, visual attention shifts towards the tool’s manipulation area1,2,27,28. In line with these claims, when we considered a restricted time window of analysis, results highlighted a visuo-attentional pattern that focussed on the tools’ functional areas during the first 500 ms, in both experiments, regardless of the experimental conditions. This kind of function-to-mechanical-to-motor cascade mechanism is reflected in a distinctive fixation pattern reasonably generated by the interactions between functional knowledge, mechanical knowledge and motor control neurocognitive systems, as suggested by the Three Action-System model1. Such a flexible and dynamic mechanism pertains to a neurocognitive level of environment-information gathering that engages peculiar operational strategies, hence producing different cognitive outputs.

In Experiment 1, the looking-at task presumably activated the implicit goal of searching for potential mechanical actions between objects. Such a goal promoted a visuo-attentional pattern consistent with its implicit motoric nature. Hence, when the visuo-perceptual context suggested higher action readiness (i.e., thematically consistent condition) the simplest resolution of the cascade mechanism was reflected in a visuo-attentional pattern that emphasised the tool’s manipulation area in order to actualise the action (i.e. using the tool on the object). Conversely, the core assumption of Experiment 2 was that the higher level, goal-directed, short-term recognition task should prevent an agent from proceeding toward the mechanical and then the motor processing, hence persisting in the functional/semantic processing. Indeed, as predicted, participants exhibited a fixation pattern focused on the tool’s functional area in all experimental conditions. Additionally, in contrast with the results of Experiment 1, in Experiment 2 the tool’s mean fixation time was significantly longer under the thematically inconsistent condition. The most straightforward interpretation of this result is that thematic inconsistency made extraction of information harder41.

As regards object recognition performance, in Experiment 2, the analysis of the hits revealed that action-prompting tools (i.e., tools of the thematically consistent object-tool pairs) were recognised faster. Conversely, the analysis of the correct rejections showed an object-type effect, with tools being rejected more slowly than objects. Such an effect cannot be attributed to an alignment effect given by the spatial disposition of the objects42. Indeed, if that were the case, the effect should also emerge for the hits. The presence of an object-type effect only for the correct-rejection responses indicates that action and/or motoric information associated with the tools is costly in terms of rejection performance. In contrast, when the visual context stimulates the action as in the thematically consistent condition, the same tools’ action-related information improves recognition performance. Thus, a hammer is recognised faster when it is seen after a hammer-nail pair as compared to when it is seen after a hammer-scarf pair, probably because the “hammering” action possibility prompted by the visuo-perceptual context is congruent with the action-related information conveyed by the vision of the tool. In contrast, when a screwdriver is seen after the same hammer-nail pair, the “hammering” information – which conflicts with the “screwing” information prompted by the screwdriver – is reasonably no longer useful to discriminate the previous stimulus, hence producing a cost in terms of rejection performance. This process would also explain the absence of effects for the objects of the pairs.

Overall, these results suggest that action readiness and the cognitive nature of the task may influence the way in which information is collected by an observer as well as object recognition performance. These results support previous evidence in the literature that emphasised the relevance of objects’ functional relations in object recognition9,12,15,17, while indicating – within a cognitive theoretical framework – how an agent can utilise the set of available information in order to interact with the environment in a reasoning-based way1,2,27,28.

Recently, Federico and Brandimonte2 introduced the concept of action reappraisal to refer to a multidimensional cognitive process that utilises multiple sources of information and distinct neurocognitive systems (e.g., function knowledge, mechanical and technical knowledge, abstract knowledge, motor system, etc.) to exploit the environment in terms of action2. In that theoretical perspective, tools are seen as manipulable, physical implements that amplify the user’s sensorimotor capabilities6 in order to solve everyday problems. Tool use is thereby conceptualised as an instance of a problem-solving situation sustained by mechanical knowledge and technical reasoning1,2,26,27,28,43. Accordingly, here we endorse a theoretical approach for which individuals use tools to deal with everyday circumstances, thus reasoning about the most appropriate use of them within a context, rather than passively learning and actualising the actions (also called “gesture engrams”) that can be performed with them (i.e., the manipulation-based approach20,21,22,23,24). This theoretical perspective is in line with the working memory hypothesis of affordances as regards the claim that the nature of the task modifies the cognitive workload and, as a consequence, affordance perception44,45,46. Furthermore, taking into account the affordance-competition hypothesis47 according to which various affordances are “pre-activated” before being selected, action reappraisal would be explained as a cognitive process – presumably supported by the frontal lobes – that, from the multiple environment-available affordances, selects only those that are relevant to the individual’s intentions through an inhibitory mechanism sustained by high-level executive functions2,27.

The assumptions underlying action reappraisal appear to be substantiated by the most recent evidence revealing a multiplicity of distinct neurocognitive systems involved in tool use1,35. In particular, the recently proposed Three Action-System model1 provides a theoretical framework within which the idea of action reappraisal can be easily incorporated. For instance, the differences observed in the fixation patterns of the two experiments reported here and the facilitation effects in tool recognition found in Experiment 2 might be interpreted as reflecting the interactions between functional knowledge, mechanical knowledge and motor control neurocognitive systems. Namely, the ecological nature of the task in Experiment 1 (looking at) may have implicitly activated the cascade mechanism such that participants first solved the functional knowledge issue (“What is it?”), then the mechanical knowledge issue (“How to use it?”) and finally they activated the motor system (i.e., mental simulation of the action), hence finalizing the whole cascade process. On the other hand, in Experiment 2, the goal-oriented nature of the task (looking to recognise) may have allowed participants to skip the mechanical/motor processing and to keep focused on the functional/semantic aspects of the recognition task. Consistent with this interpretation, the behavioural data revealed that the same mechanical/motor information (i.e., action readiness) evoked by the visuo-perceptual context had a facilitation effect (shorter RTs) on hits under thematically consistent conditions, but an interference effect (longer RTs) on correct rejections.

Mechanical knowledge (the ventro-dorsal system) might be considered as a bridge system connecting higher-level semantic information associated with object function and identity (the ventral system) and the motor-control system (the dorso-dorsal system). This bridge system would generate a simulation of an action related to tool use, hence handling the perception of the related affordances in a proper and agent’s consistent-with-intention way2. Such a kind of dynamic synergy among the neurocognitive systems, on the one hand, produces an effect on the way action-related information is visually encoded, coherently with the direct-visual-route-to-action theoretical view for which vision guides action;33 on the other hand, it substantiates how this effect might reverberate on such high-level processes as object recognition.

Recent evidence indicates that human tool use relies on a large and composite interplay of brain areas pertaining to the fronto-parietal and occipito-temporal networks3,4,5,35,36,48,49,50,51,52,53,54,55. In particular, mechanical knowledge seems to be stored in the left inferior parietal cortex, specifically within the PF cytoarchitectonic area of the supramarginal gyrus (SMG), whereas the neurocognitive system associated with affordance perception (i.e., the motor control system) appears to be the one composed of the bilateral superior parietal cortex and the intraparietal sulcus (putative human anterior intraparietal sulcus area and the anterior dorsal IPS). The left anterior portion of SMG, extending to the PFt cytoarchitectonic area of SMG, seems to be an integrative neurocognitive layer between mechanical knowledge and the motor control system35,56,57. Although the neural correlates of functional knowledge are still debated in the literature49,50,58, recent evidence indicates the left temporal cortex, the left posterior middle temporal gyrus (pMTG) and the lateral occipital complex (LOC) as plausible neural substrates of functional knowledge35,36,52,58.

The left inferior parietal lobule might represent a neural substrate largely implicated in forming object-related action representations. In fact, a recent fMRI study highlighted compulsory access to abstract action information in the left inferior parietal lobe for object-directed actions, irrespective of task context53. Coherently, in the context of tool-related action processing and in the epistemological domain of others’ action understanding59, a most recent meta-analysis54 reported the activations of both the PF cytoarchitectonic area (within the left inferior parietal lobe) and the left inferior frontal gyrus in observational tool-use contexts. As we detailed before, the PF area is involved in the storage of mechanical knowledge (i.e., the ability to reason about physical object properties). Hence, it appears that observing tools to use them and observing others’ tool-use actions share such a specific neural counterpart. These findings clearly suggest that even observing others’ tool-use actions requires supplementary cognitive skills and support the idea that tool-use action understanding might be much more “dis-embodied” than habitually supposed54.

In both experiments, we used object-tool pairs that were thematically consistent or inconsistent. Notably, increasing evidence from studies using thematically related vs. unrelated pairs of objects suggests an involvement of the posterior parieto-temporal cortex (specifically, the temporo-parietal junction, the inferior parietal lobe, and the middle and superior temporal gyri) in objects’ thematic relations processing60,61,62,63. Moreover, an increased left occipital cortex activation (ventral stream) has been reported when objects are correctly positioned for action, while the anterior regions of the dorsal stream (e.g., supplementary motor area) have been reported to be activated when the task required an action decision but objects were not in the correct position for the action36. Crucially, a recent TMS study has suggested that the action readiness evoked by the visuo-perceptual context facilitates the elaboration of taxonomic semantic relations among objects, indicating the inferior parietal cortex and the middle temporal areas as regions where a multimodal integration of action and semantic information takes place to generate high-level cognitive representations about tools37. In the same direction, a recent fMRI study used both verbal stimuli and video to assess whether specific brain areas are involved in tool-related action processing, independently of the stimulus type. By using a multi-voxel pattern analysis, that study provided compelling evidence in favour of the identification of the lateral posterior temporal cortex as a crucial brain region where a cross-modal integration of action-related information is executed. Conversely, unimodal representations produced widespread and overlapped activations in the fronto-parietal network, in the regions that were not implicated in the cross-modal integration of action-related information39.

The overlap of activation in the temporal and parietal brain areas reported by the above-mentioned studies appears to be particularly intriguing when analysed from the perspective of the recently proposed Hub-and-Spoke hypothesis of Semantic Memory38,64. As an attempt to bridge the gap between embodied and unembodied theories of conceptual knowledge (see Meteyard, Cuadrado, Bahrami and Vigliocco65 for a strong-to-weak embodied theories comparison), the Hub-and-Spoke hypothesis assumes that semantic memories originate from the interaction of modality-specific sources of information, called spokes (e.g., visual features, praxis or somatosensory information, sounds, etc.), with a trans-modal semantic hub that provides a further modality-invariant representational resource. Spokes rely on distributed modality-specific cortical regions (e.g., motor areas for objects’ motor properties), whereas the cortical regions within the anterior temporal region (ATL) underpin the central representational hub. The interplay between spokes and hub gives rise to consistent and generalizable concepts.

Developed in the epistemological domain of semantic memory studies, the Hub-and-Spoke hypothesis is consistent with the concept of action reappraisal as an effect of multidimensional, information-gathering processes. Following this theoretical perspective, object-related action information might be conceptualised as modality-specific motoric information that adds to the representational resources typically involved in object recognition. Importantly, such a kind of motoric information potentially emanates from some (though not all) specific spokes included in the Hub-and-Spoke hypothesis38.

We considered human tool use as a kind of problem-solving situation actively sustained by technical and mechanical reasoning skills2,26,27,28. In this sense, neuroimaging evidence of a left inferior prefrontal cortex involvement in action planning and execution appears to be rather promising for interpreting our results35. These cortical regions are extensively implicated in executive functions as well as in motor timing, sequencing and simulation66,67,68. In particular, the rostro-lateral prefrontal cortex appears to be involved in reasoning tasks such as relational integration, i.e., considering different relations simultaneously69. The involvement of the frontal lobes in human tool use might also signal an inhibitory mechanism that modulates the selection of the most appropriate affordance among those that are simultaneously available in the environment. Such a selection mechanism would occur in accordance with the intentions and the action possibilities of the agent27,47. However, given the lack of neuropsychological studies on the correlation between frontal lesions and tool-use impairments, the role of the frontal cortex in human tool use requires further investigation in future research.

To sum up, by extending previous findings2, the present results converge towards recent lines of research that support a reasoning-based approach to human tool use. As we detailed above, the convergence between lower-level (Experiment 1) and higher-level (Experiment 2) measures plausibly reflects the involvement of distinct and complex neurocognitive systems engaged in human tool-use processing. Although the present results do not speak directly to questions about the neural counterparts of these effects, we believe that applying the present methods to patient studies or combining them with neuroimaging procedures may help address those questions.

The concept of action reappraisal – conceived as a wide-range cognitive theoretical perspective that links together different epistemological reservoirs (e.g., psychology of perception, memory studies, neuropsychological studies, etc.) – emphasises how adopting a transversal approach might be extremely fruitful for studying such a human-characterizing and complex activity as tool use. It is worth noticing that our results add to the growing literature suggesting that higher-level, semantic information is activated earlier than lower-level perceptual information and can affect visual perception, although the magnitude of top-down processes is modulated by the context and expectations30,31,32,70,71. In this respect, one might argue that the emphasis given in this article to semantic processing would seem in contrast with the neuropsychological literature that has shown how patients with semantic deficits can still use tools55,72,73. However, we examined healthy participants engaged in free-observation and object-recognition ecological tasks, rather than evaluating tool-use tasks in clinical settings. Therefore, the results of those studies55,72,73 are hardly comparable to the present findings.

As we discussed, mechanical knowledge appears to be necessary to use tools (in terms of actual utilization)1,2,6,27,28,29, whereas functional/semantic knowledge might be acting as a complementary addendum in order to construct object-related action representations12,13,15,16,17,18,19,30,31,32,39. Thus, for patients with semantic deficits, the integrity of the neurocognitive systems associated with mechanical knowledge would guarantee their adequate tool-use abilities55,72,73. However, while, on the one hand, functional/semantic knowledge might be neither necessary nor sufficient in order to reason about the technical properties of a tool or to actualise the actions linked to its use, on the other hand, it might be indispensable in order to create generalizable and abstract concepts (representations) linked to tools and objects37,38,39,64. Such representations might be usable in everyday life by an agent, in the context of a cognitive-oriented functioning, as we have recently suggested. Here, we wish to emphasise the dynamicity and automaticity of higher-level cognitive processes through which a reasoning-based agent may elaborate multiple object-related information (i.e., action reappraisal). Hence, we investigated how the magnitude of such high-level cognitive processes may be modulated by the visuo-perceptual context and the individuals’ goals. Considering tool use as a human ability that stands at the intersection of multiple cognitive processes, the action reappraisal idea might constitute a useful starting point in order to provide broader answers regarding the mechanisms that underlie tool processing in everyday life. While recent and converging evidence seems to support the action reappraisal idea1,2,6,27,28,29,30,31,32,35,36,37,38,39,44,45,46,53,54,59,64,71, further studies are clearly necessary in order to explore its deepest implications.

Methods

The experiments were conducted in the Laboratory of Experimental Psychology at Suor Orsola Benincasa University (Naples, Italy). The experiments were performed following the ethical standards laid down in the 1964 Declaration of Helsinki. The study received approval from the Ethics Committee of the Department of Educational, Psychological and Communication Sciences of Suor Orsola Benincasa University.

Experiment 1

Participants

Fifteen participants (8 females; mean age = 23.07 years, S.D. = 2.19) with normal or corrected-to-normal vision took part in the experiment. All were right-handed based on the Edinburgh Handedness Inventory74, had no history of neurological or psychiatric disorders and gave informed consent to their participation.

Materials

Twenty three-dimensional (3D) computer-graphics generated stimuli were used in Experiment 1. Two different classes of stimuli were used. The first group of stimuli was composed of 3D colour images depicting pairs of objects (a tool on the right – e.g., a screwdriver – and an object on the left, e.g., a screw) that were thematically consistent, placed on the part of a table closest to the observer, in the participant’s peri-personal space. There were the following ten object-tool pairs: nail-hammer, bowl-whip, carton box-cutter, bottle-bottle opener, screw-screwdriver, salami-knife, coffee cup-teaspoon, notebook-pen, glass-bottle, padlock-key. The second group of stimuli was composed of 3D colour images depicting pairs of thematically inconsistent objects (a tool on the right – e.g., a hammer – and an object on the left, e.g., a scarf) placed on the part of the table that was closest to the observer, in the participant’s peri-personal space. This group comprised ten object-tool pairs: scarf-hammer, women’s shoe-whip, alarm clock-cutter, notebook-bottle opener, nut-screwdriver, bolt-knife, Christmas ball-teaspoon, men’s shoe-pen, cap-bottle, baseball-key.

Both the objects of the object-tool pairs (Fig. 6a,b) appeared placed directly on a table, with a mean perceived distance between object and tool (calculated from the centres) of approximately 20 cm and an angle of approximately 180 deg (centre-to-centre, by considering the horizontal line of the table). Some examples of the stimuli are illustrated in Fig. 6(a,b).

Figure 6

Example of stimuli used in both the experiments. (A) Thematically consistent object-tool pair (a bowl and a whip). (B) Thematically inconsistent object-tool pair (a shoe and a whip). (C) Single tool (a whip). (D) Single object (a shoe). Experiment 1 included (A) and (B). Experiment 2 included all stimuli.

Procedure

The air-conditioned room used in the experiment was maintained at a constant temperature of 24 °C for the entire duration of the study. Light conditions of the room were kept stable for all participants and for the entire duration of the experiment. Before starting, the participants signed informed consent. They were asked to self-report their right-handedness, their adequate visual acuity and the absence of any neurological and psychiatric diseases at the date of the experiment. The Edinburgh Handedness Inventory74 was administered to participants in order to verify that they were actually right-handers. Then, a classic optometric test with participants placed three meters away from the test stimuli was administered to evaluate visual acuity. The participants were seated on a chair and a headrest was used to prevent head movements in order to allow a precise eye-tracking recording. Participants were seated at a distance of 54 cm from the monitor (23 inches, with a horizontal viewing angle of approximately 55 deg and a vertical viewing angle of approximately 15 deg) and were asked to keep their right hand motionless on the desk. In this way, the right hand was resting on the right side of the monitor, becoming peripherally visible to the participants in the context of the visual scene. More specifically, the right hand was located at an angle between 35 deg and 40 deg of the right visual space (mid-peripheral vision). Then, the experimental instructions were given. Participants were asked to complete an eye-tracking software calibration procedure by following with their eyes a white cross that sequentially appeared on nine parts of the screen (black background). Afterwards, participants were asked to “observe what appeared on the screen in the most natural way as possible” and the experiment started. A single trial of ten images related to each experimental condition was administered.
Thus, twenty images were randomly presented according to the experimental visual flow (Fig. 7): before each stimulus, a fixation point (white cross over black background in the centre area of the screen) of 500 ms duration was shown. Then, the stimulus appeared for 3000 ms. After each stimulus, a black screen appeared for 4000 ms in order to permit retina relaxation. Each single presentation lasted 7.5 seconds (500 ms + 3000 ms + 4000 ms). Globally, the stimuli presentation lasted 150 seconds (7500 ms x 20 stimuli). At the end of the stimulation, a reachability task was administered using the same stimuli as those used during the experiment, which required participants to indicate if tools and objects in the visual scene were graspable with their right hand, according to their perspective. Participants correctly reported that both the tools and the objects were reachable with their right hand. Most relevant, in the thematically consistent condition, participants reported that the tool was potentially usable on the object (e.g., the hammer was effectively usable on the nail), whereas, in the thematically inconsistent condition, tools and objects were reported to be not immediately usable together in a proper way (e.g., the bottle opener was not considered usable on the glass despite their spatial proximity). For each participant, the overall duration of the experiment was 13 minutes. At the end of the experiment, participants were debriefed regarding the purposes of the study and the methods used. No participant was excluded from the sample.
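The presentation timing reported above can be checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are hypothetical and are not taken from the authors' experiment software; the durations and the number of stimuli are the ones stated in the procedure.

```python
# Durations in milliseconds, as stated in the Experiment 1 procedure.
FIXATION_MS = 500
STIMULUS_MS = 3000
BLANK_MS = 4000
N_STIMULI = 20  # 10 thematically consistent + 10 thematically inconsistent pairs


def trial_duration_ms() -> int:
    """Duration of one presentation: fixation point + stimulus + black screen."""
    return FIXATION_MS + STIMULUS_MS + BLANK_MS


def session_duration_s(n_stimuli: int = N_STIMULI) -> float:
    """Total stimulation time, in seconds, for n_stimuli presentations."""
    return trial_duration_ms() * n_stimuli / 1000


if __name__ == "__main__":
    print(trial_duration_ms())   # 7500
    print(session_duration_s())  # 150.0
```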

Figure 7

Experiment 1 - Experimental flow. For each trial, a fixation point appeared for 500 ms, then an object-tool pair appeared for 3000 ms. This pair could be thematically consistent or thematically inconsistent and it was followed by a black screen that appeared for 4000 ms to permit retina relaxation.

Apparatus and software paradigms

We used a Full-HD Webcam (Logitech HD Pro C920, sampling rate: 30 Hz) as eye-tracking hardware and custom Python/JavaScript software to manage the experiment and to acquire gaze behaviour data. The eye-tracking technology used in the present study is based on WebGazer, an eye-tracking JavaScript library75. All the software paradigms were conceived and programmed using the Python programming language and the PsychoPy framework76 with custom code optimization. In order to extract and analyse the data collected through the software paradigms, distinct ad-hoc, custom-made scripts were engineered and developed using the PHP programming language and the MySQL Database Management System. All stimuli were presented on a 23-inch monitor (Dell S2319H) at a resolution of 1920 × 1080 px. All experimental software was executed by using an Apple MacBook Pro (13-inch, 2017) running macOS Catalina (version 10.15).

Gaze-behaviour data

Participants’ visual-exploration patterns were analysed in terms of mean fixation duration (milliseconds) on different Areas of Interest (AOIs). We defined two AOIs: the manipulation area of the tool (i.e., the middle-bottom area where to put the hand in order to use it) and the functional area of the tool (i.e., middle-top area of the tool through which it is possible to understand its function; Fig. 8). Taking into account the technical limits of the eye-tracking technology used, both the AOIs were computed with a perimeter increased by 64 pixels in all directions. Mean fixation durations to the AOIs were averaged per condition. For each stimulus, the first 250 ms of eye-tracking data were excluded from the gaze-behaviour analysis in order to reduce the error produced by the fixation point in participants’ visual-exploration patterns. Only the first 1000 ms of visual-exploration data for each stimulus were analysed in order to reduce data dispersal effects due to participants’ visual-scene exploration. Within the first 1000 ms of visual exploration, two different time windows of analysis were considered (first 500 ms and first 1000 ms). An at-a-glance qualitative indication of differences in participants’ fixation patterns may be appreciated in the visuo-attentional heatmaps (Fig. 2).
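The AOI criteria described above (64-pixel margin, exclusion of the first 250 ms, analysis windows ending at 500 ms and 1000 ms) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' extraction scripts: the AOI is modelled as a hypothetical rectangle, gaze data are assumed to be `(t_ms, x, y)` samples at the 30 Hz webcam rate, and time inside the AOI is approximated by counting samples rather than by detecting fixations. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

AOI_MARGIN_PX = 64      # margin added around each AOI (from the text)
SKIP_MS = 250           # initial data excluded for each stimulus (from the text)
SAMPLE_MS = 1000 / 30   # duration covered by one sample at 30 Hz


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangular AOI in screen pixels."""
    left: float
    top: float
    right: float
    bottom: float

    def expanded(self, margin: float) -> "Rect":
        """Return a copy of the AOI enlarged by `margin` pixels on every side."""
        return Rect(self.left - margin, self.top - margin,
                    self.right + margin, self.bottom + margin)

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def dwell_time_ms(samples: Iterable[Tuple[float, float, float]],
                  aoi: Rect, window_end_ms: float) -> float:
    """Approximate gaze time inside the enlarged AOI.

    Only samples with SKIP_MS <= t_ms < window_end_ms are counted, where
    t_ms is measured from stimulus onset.
    """
    area = aoi.expanded(AOI_MARGIN_PX)
    hits = sum(1 for t, x, y in samples
               if SKIP_MS <= t < window_end_ms and area.contains(x, y))
    return hits * SAMPLE_MS
```

Calling `dwell_time_ms(samples, functional_aoi, 500)` and `dwell_time_ms(samples, functional_aoi, 1000)` would give the two time windows for one AOI; whether the windows are counted from stimulus onset or from the end of the excluded 250 ms is an assumption of this sketch.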

Figure 8

Areas of Interest considered in the experiments. The AOIs considered in the experiments were the manipulation area of the tool (circled in red, labelled as “M”) and the functional area of the tool (circled in green, labelled as “F”).

Data analysis

To analyse how participants looked at the tools of the object-tool pairs as the visuo-perceptual context changed, we performed a 2 × 2 repeated-measures ANOVA with AOIs (manipulation vs. functional area) as a 2-level factor and Thematic Consistency (thematically consistent vs. thematically inconsistent pairs) as a 2-level factor on tool fixation duration (milliseconds) for each time window of analysis (500 ms and 1000 ms). An alpha level of 0.05 was used for all the analyses. We used Bonferroni correction for multiple comparisons. All the analyses were conducted by using the open-source statistical software “R” (v. 3.6.1; GUI v. 1.70; build 7684) for Apple macOS operating system (Catalina, v. 10.15).
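For readers who wish to see the structure of this analysis, the sketch below computes the three F ratios of a fully within-participant 2 × 2 design in plain Python. It relies on the fact that, when both factors have two levels, each effect has one degree of freedom and its F(1, n − 1) equals the squared one-sample t statistic of the corresponding per-participant contrast. The analyses reported here were run in R; this code is not the authors' script, the function names are hypothetical, and the numbers in the usage example are invented for illustration only.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (AOI, thematic consistency)


def f_from_contrast(scores: List[float]) -> float:
    """F(1, n - 1) for a 1-df within-participant effect.

    `scores` holds one contrast value per participant; F is the squared
    one-sample t statistic testing whether the mean contrast is zero.
    Raises ZeroDivisionError if all contrast values are identical.
    """
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t


def rm_anova_2x2(data: Dict[Cell, List[float]]) -> Dict[str, float]:
    """F ratios for both main effects and the interaction.

    `data` maps each design cell to per-participant mean fixation
    durations, with participants in the same order in every cell.
    """
    rows = list(zip(data[("manipulation", "consistent")],
                    data[("manipulation", "inconsistent")],
                    data[("functional", "consistent")],
                    data[("functional", "inconsistent")]))
    return {
        "AOI": f_from_contrast([(mc + mi) - (fc + fi) for mc, mi, fc, fi in rows]),
        "Consistency": f_from_contrast([(mc + fc) - (mi + fi) for mc, mi, fc, fi in rows]),
        "AOI x Consistency": f_from_contrast([(mc - mi) - (fc - fi) for mc, mi, fc, fi in rows]),
    }


if __name__ == "__main__":
    # Invented values for four hypothetical participants (not study data).
    example = {
        ("manipulation", "consistent"): [300, 320, 310, 330],
        ("manipulation", "inconsistent"): [250, 260, 255, 275],
        ("functional", "consistent"): [240, 250, 260, 250],
        ("functional", "inconsistent"): [310, 300, 320, 330],
    }
    for effect, f_value in rm_anova_2x2(example).items():
        print(f"{effect}: F(1, 3) = {f_value:.2f}")
```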

Experiment 2

Participants

Twenty-nine participants (11 females; mean age = 21.16 years, S.D. = 3.62) with normal or corrected-to-normal vision were included in this experiment. All were right-handed based on the Edinburgh Handedness Inventory74 and had no history of neurological or psychiatric disorders. Participants gave their informed written consent.

Materials

Forty three-dimensional (3D) computer-graphics generated stimuli were used in the experiment. Four different classes of stimuli were used. The first two groups of stimuli were the same as in Experiment 1 (10 thematically consistent and 10 thematically inconsistent object-tool pairs). The third group of stimuli was composed of 3D colour images of a single tool placed at the centre of a table in the participant’s peri-personal space. This group was composed of ten stimuli: hammer, whip, cutter, bottle opener, screwdriver, knife, teaspoon, pen, bottle, key. The fourth group of stimuli was composed of 3D colour images of a single object placed in the centre of a table in the participant’s peri-personal space. This group comprised ten objects: nail, bowl, carton box, bottle, screw, salami, coffee cup, notebook, glass, padlock. Single objects and tools (Fig. 6c,d) appeared placed on the centre of a table, with the centre of the stimulus aligned with the centre of the table. Some examples of the stimuli are illustrated in Fig. 6(a–d).

Procedure

The experimental setting and general procedure were the same as in Experiment 1 except that participants were engaged in a short-term recognition task. Namely, participants were seated at a distance of 54 cm from the monitor (23 inches, with a horizontal viewing angle of approximately 55 deg and a vertical viewing angle of approximately 15 deg) and were asked to keep their right hand motionless on the desk and the left hand on a keyboard with the index finger placed on the “E” key and the middle finger placed on the “W” key. In this way, the right hand was resting on the right side of the monitor, becoming peripherally visible to the participants in the context of the visual scene. More specifically, the right hand was located at an angle between 35 deg and 40 deg of the right visual space (mid-peripheral vision). Then, the experimental instructions were given. Participants were asked to complete an eye-tracking software calibration procedure by following with their eyes a white cross that sequentially appeared on nine parts of the screen (black background), then the experiment started. Each trial began with a fixation point (+) that remained in view for 500 ms. Then, participants had to observe a first stimulus consisting of an object-tool pair that remained in view for 2000 ms, while their gaze behaviour for the stimulus was analysed through eye-tracking technology. Then, a second fixation point (+) was shown for 500 ms, followed by the target stimulus consisting of a single object or a single tool. Participants had to indicate, by pressing the appropriate key on the keyboard (“W” key for yes responses; “E” key for no responses), whether the latter stimulus (single tool or single object) was present in the previously observed object-tool pair. At the end of each trial a black screen (blank) appeared for 4000 ms in order to permit retina relaxation. Stimuli were randomly selected.
The experimental paradigm followed a 2 (Thematic Consistency: 10 thematically consistent vs. 10 thematically inconsistent object-tool pairs) × 2 (Object Type: tools vs. objects) × 2 (Response Type: yes vs. no) within-factor design. Overall, the experiment consisted of 80 trials (10 × 2 × 2 × 2) without repetitions. The reachability task administered at the end of the experiment once again showed that participants correctly reported that both the tool and the object of the object-tool pairs were reachable using their right hand. In addition, the tool was considered usable on the object only in the thematically consistent condition. The experiment lasted approximately 30 minutes per participant. At the end of the experiment, participants were debriefed regarding the purpose of the study and the methods used as in Experiment 1. No participant was excluded from the sample. The experimental flow is summarised in Fig. 9.
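The factorial structure of the 80 trials can be illustrated by enumerating the design cells. The sketch below is a hypothetical reconstruction, not the authors' PsychoPy code: items are represented by placeholder indices rather than by the actual images, and the field names and seeded shuffle are assumptions made for the example.

```python
import itertools
import random
from typing import Dict, List, Optional

ITEMS = range(10)                        # placeholder indices for the ten pairs per condition
CONSISTENCY = ("consistent", "inconsistent")
OBJECT_TYPE = ("tool", "object")         # type of the single target stimulus
RESPONSE = ("yes", "no")                 # correct answer for the trial


def build_trials(seed: Optional[int] = None) -> List[Dict[str, object]]:
    """Enumerate every design cell once per item and return them in random order."""
    trials = [
        {"item": item, "consistency": cons, "target_type": obj, "correct_response": resp}
        for item, cons, obj, resp in itertools.product(
            ITEMS, CONSISTENCY, OBJECT_TYPE, RESPONSE)
    ]
    random.Random(seed).shuffle(trials)  # seeded only to make the example reproducible
    return trials


if __name__ == "__main__":
    print(len(build_trials(seed=1)))  # 80
```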

Figure 9

Experiment 2 – Experimental flow. A fixation point appeared for 500 ms, then an object-tool pair appeared for 2000 ms. This pair could be either thematically consistent or thematically inconsistent, and it was followed by a second fixation point of 500 ms. Finally, a single tool or a single object appeared. Participants pressed the “yes” key or the “no” key to indicate whether or not the single tool or single object had been present in the object-tool pair. At the end of each trial, a black screen appeared for 4000 ms to permit retina relaxation.
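The timing of a single trial can be written down as a small schedule. The following Python sketch is an illustration under one explicit assumption: the target display is taken to remain on screen until the key press, which the text does not state. The names `TRIAL_EVENTS`, `fixed_duration_ms` and `event_onsets` are hypothetical.

```python
# Event durations in ms, taken from the procedure. None marks the target
# display, ASSUMED here to stay on screen until the participant responds.
TRIAL_EVENTS = [
    ("fixation_1", 500),
    ("object_tool_pair", 2000),
    ("fixation_2", 500),
    ("target", None),
    ("blank", 4000),
]


def fixed_duration_ms(events=TRIAL_EVENTS):
    """Sum the durations that do not depend on the participant's response."""
    return sum(d for _, d in events if d is not None)


def event_onsets(events=TRIAL_EVENTS, response_time_ms=0):
    """Return (name, onset_ms) pairs, charging response_time_ms to the target."""
    onsets, clock = [], 0
    for name, duration in events:
        onsets.append((name, clock))
        clock += response_time_ms if duration is None else duration
    return onsets


N_TRIALS = 80
print(fixed_duration_ms())                       # 7000
print(fixed_duration_ms() * N_TRIALS / 60_000)   # about 9.3 (minutes)
```

Under these assumptions the fixed-duration events account for roughly nine minutes of the session; response times, instructions, calibration and the reachability task are not included in that figure.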

Apparatus and software paradigms

In Experiment 2 we used the same apparatus and software paradigms as in Experiment 1.

Gaze-behaviour data

The same criteria as in Experiment 1 guided the analysis of data related to participants’ gaze-behaviour. Differences in participants’ visuo-attentional patterns may be qualitatively appreciated in the fixation heatmaps (Fig. 4).

Data analyses

First, we analysed eye-tracking data in the same way as in Experiment 1. Secondly, we analysed behavioural data related to participants’ recognition performance. Three male participants were excluded from the behavioural data analysis because their object recognition performance deviated from the mean by more than 2.5 SD (outliers). An alpha level of 0.05 was used for all the analyses, with Bonferroni correction for multiple comparisons. All the analyses were performed using R (v. 3.6.1; GUI v. 1.70; build 7684) for the Apple macOS operating system (Catalina, version 10.15).
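The exclusion rule and the multiple-comparison correction can be illustrated with a short Python sketch. This is not the authors' R code. It assumes a two-sided criterion and the sample standard deviation, neither of which is specified in the text, and the function names and example scores are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(scores, threshold=2.5):
    """Return participant ids whose score lies more than `threshold` SDs from the mean.

    ASSUMPTIONS: two-sided criterion, sample (n - 1) standard deviation.
    """
    values = list(scores.values())
    m, s = mean(values), stdev(values)
    return [pid for pid, x in scores.items() if abs(x - m) > threshold * s]


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical scores: twenty typical participants and one extreme value.
scores = {f"p{i:02d}": 500 + i for i in range(20)}
scores["p_extreme"] = 2000
print(flag_outliers(scores))  # ['p_extreme']
```

An adjusted p-value from `bonferroni` is then compared against the alpha level of 0.05, which is equivalent to comparing each raw p-value against 0.05 divided by the number of comparisons.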

Eye-tracking data analysis

We performed a 2 × 2 repeated-measures ANOVA with AOIs (manipulation vs. functional area) as a 2-level factor and Thematic Consistency (thematically consistent vs. thematically inconsistent object-tool pairs) as a 2-level factor on tool fixation duration (expressed in milliseconds).
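In a 2 × 2 fully within-subject design, each effect has one numerator degree of freedom, so its F statistic equals the squared one-sample t statistic computed on a per-participant contrast score. The Python sketch below uses that equivalence to illustrate the structure of the analysis; it is not the authors' R code, it returns F values without p-values, and the function name and the ordering of the cell means are assumptions made for the example.

```python
from statistics import mean, variance


def rm_anova_2x2(cells):
    """F statistics for a 2 x 2 repeated-measures design.

    cells: one tuple per participant with the four cell means, ordered
    (a1b1, a1b2, a2b1, a2b2); e.g. A = AOI, B = Thematic Consistency.
    Returns {effect: (F, df_effect, df_error)}.
    Raises ZeroDivisionError if a contrast has zero variance.
    """
    n = len(cells)
    contrasts = {
        "A": [(a1b1 + a1b2) - (a2b1 + a2b2) for a1b1, a1b2, a2b1, a2b2 in cells],
        "B": [(a1b1 + a2b1) - (a1b2 + a2b2) for a1b1, a1b2, a2b1, a2b2 in cells],
        "AxB": [(a1b1 - a1b2) - (a2b1 - a2b2) for a1b1, a1b2, a2b1, a2b2 in cells],
    }
    results = {}
    for effect, scores in contrasts.items():
        # F(1, n - 1) = t^2 = n * mean^2 / sample variance of the contrast scores
        f_value = n * mean(scores) ** 2 / variance(scores)
        results[effect] = (f_value, 1, n - 1)
    return results


# Hypothetical fixation durations for four participants (arbitrary units).
demo = [(1, 2, 3, 5), (2, 2, 4, 7), (1, 3, 2, 6), (3, 4, 6, 7)]
for effect, (f_value, df1, df2) in rm_anova_2x2(demo).items():
    print(f"{effect}: F({df1}, {df2}) = {f_value:.2f}")
```

Obtaining p-values requires the F distribution with (1, n − 1) degrees of freedom, which a statistics package would supply.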

Behavioural data analysis

Mean reaction times (RTs) and accuracy were calculated. As typically observed with this kind of recognition task, accuracy was above 0.95 in all conditions; hence, we analysed only mean RTs. For the mean RTs, we performed two separate 2 × 2 repeated-measures ANOVAs, one for Hits and one for Correct Rejections, each with the 2-level factor Thematic Consistency (thematically consistent vs. thematically inconsistent object-tool pairs) and the 2-level factor Object Type (tools vs. objects).
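The split into Hits and Correct Rejections can be sketched as follows. The key mapping (“W” for yes, “E” for no) comes from the procedure; the trial record layout, the field names and both helper functions are hypothetical and serve only to illustrate how per-cell mean RTs might be prepared for the two analyses.

```python
from collections import defaultdict
from statistics import mean

YES_KEY, NO_KEY = "w", "e"  # key mapping stated in the procedure


def classify(target_present, key):
    """Label a response as hit, miss, correct_rejection or false_alarm."""
    if target_present:
        return "hit" if key == YES_KEY else "miss"
    return "correct_rejection" if key == NO_KEY else "false_alarm"


def mean_rts(trials, outcome):
    """Mean RT per (consistency, target type) cell for one outcome class."""
    cells = defaultdict(list)
    for t in trials:
        if classify(t["present"], t["key"]) == outcome:
            cells[(t["consistency"], t["target"])].append(t["rt_ms"])
    return {cell: mean(rts) for cell, rts in cells.items()}


# Hypothetical trial records for one participant.
example = [
    {"consistency": "consistent", "target": "tool", "present": True, "key": "w", "rt_ms": 610},
    {"consistency": "consistent", "target": "tool", "present": True, "key": "w", "rt_ms": 650},
    {"consistency": "inconsistent", "target": "tool", "present": False, "key": "e", "rt_ms": 700},
    {"consistency": "inconsistent", "target": "object", "present": True, "key": "e", "rt_ms": 900},
]
print(mean_rts(example, "hit"))
print(mean_rts(example, "correct_rejection"))
```

Misses and false alarms are labelled but left out of the RT means, in line with the analysis being restricted to correct responses.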

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

Software code and scripts are available from the corresponding author upon reasonable request.

References

1. Osiurak, F., Rossetti, Y. & Badets, A. What is an affordance? 40 years later. Neuroscience & Biobehavioral Reviews 77, 403–417 (2017).
2. Federico, G. & Brandimonte, M. A. Tool and object affordances: An ecological eye-tracking study. Brain and Cognition 135, 103582 (2019).
3. Chao, L. L. & Martin, A. Representation of manipulable man-made objects in the dorsal stream. Neuroimage 12(4), 478–484 (2000).
4. Johnson-Frey, S. H. The neural bases of complex tool use in humans. Trends in Cognitive Sciences 8(2), 71–78 (2004).
5. Króliczak, G. & Frey, S. H. A common network in the left cerebral hemisphere represents planning of tool use pantomimes and familiar intransitive gestures at the hand-independent level. Cerebral Cortex 19(10), 2396–2410 (2009).
6. Osiurak, F., Jarry, C. & Le Gall, D. Grasping the affordances, understanding the reasoning: toward a dialectical theory of human tool use. Psychological Review 117(2), 517 (2010).
7. Mizelle, J. C. & Wheaton, L. A. Why is that hammer in my coffee? A multimodal imaging investigation of contextually based tool understanding. Frontiers in Human Neuroscience 4, 233 (2010).
8. Yoon, E. Y., Humphreys, G. W. & Riddoch, M. J. The paired-object affordance effect. Journal of Experimental Psychology: Human Perception and Performance 36(4), 812 (2010).
9. Borghi, A. M., Flumini, A., Natraj, N. & Wheaton, L. A. One hand, two objects: Emergence of affordance in contexts. Brain and Cognition 80(1), 64–73 (2010).
10. Riddoch, M. J., Humphreys, G. W., Heslop, J. & Castermans, E. Dissociations between object knowledge and everyday action. Neurocase 8(1), 100–110 (2002).
11. Riddoch, M. J. et al. I can see what you are doing: Action familiarity and affordance promote recovery from extinction. Cognitive Neuropsychology 23(4), 583–605 (2006).
12. Green, C. & Hummel, J. E. Familiar interacting object pairs are perceptually grouped. Journal of Experimental Psychology: Human Perception and Performance 32(5), 1107 (2006).
13. Roberts, K. L. & Humphreys, G. W. Action relationships concatenate representations of separate objects in the ventral visual system. Neuroimage 52(4), 1541–1548 (2010).
14. Xu, S., Humphreys, G. W. & Heinke, D. Implied actions between paired objects lead to affordance selection by inhibition. Journal of Experimental Psychology: Human Perception and Performance 41(4), 1021 (2015).
15. Helbig, H. B., Graf, M. & Kiefer, M. The role of action representations in visual object recognition. Experimental Brain Research 174(2), 221–228 (2006).
16. Price, C. J., Moore, C. J., Humphreys, G. W., Frackowiak, R. S. J. & Friston, K. J. The neural regions sustaining object recognition and naming. Proceedings of the Royal Society of London. Series B: Biological Sciences 263(1376), 1501–1507 (1996).
17. Green, C. & Hummel, J. E. Functional interactions affect object detection in non-scene displays. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 26, No. 26) (2004).
18. Garcea, F. E. & Mahon, B. Z. What is in a tool concept? Dissociating manipulation knowledge from function knowledge. Memory & Cognition 40(8), 1303–1313 (2012).
19. Ni, L., Liu, Y. & Yu, W. The dominant role of functional action representation in object recognition. Experimental Brain Research 237(2), 363–375 (2019).
20. Buxbaum, L. J. Ideomotor apraxia: a call to action. Neurocase 7(6), 445–458 (2001).
21. Thill, S., Caligiore, D., Borghi, A. M., Ziemke, T. & Baldassarre, G. Theories and computational models of affordance and mirror systems: an integrative review. Neuroscience & Biobehavioral Reviews 37(3), 491–521 (2013).
22. Buxbaum, L. J. & Kalénine, S. Action knowledge, visuomotor activation, and embodiment in the two action systems. Annals of the New York Academy of Sciences 1191, 201 (2010).
23. Borghi, A. M. Object concepts and action: Extracting affordances from objects parts. Acta Psychologica 115(1), 69–96 (2004).
24. Borghi, A. M. & Riggio, L. Sentence comprehension and simulation of object temporary, canonical and stable affordances. Brain Research 1253, 117–128 (2009).
25. Barsalou, L. W. Grounded cognition. Annual Review of Psychology 59, 617–645 (2008).
26. Beck, S. R., Apperly, I. A., Chappell, J., Guthrie, C. & Cutting, N. Making tools isn’t child’s play. Cognition 119(2), 301–306 (2011).
27. Osiurak, F. & Badets, A. Tool use and affordance: Manipulation-based versus reasoning-based approaches. Psychological Review 123(5), 534 (2016).
28. Osiurak, F. & Badets, A. Use of tools and misuse of embodied cognition: Reply to Buxbaum (2017). Psychological Review 124(3), 361–368 (2017).
29. Osiurak, F. What neuropsychology tells us about human tool use? The four constraints theory (4CT): mechanics, space, time, and effort. Neuropsychology Review 24(2), 88–115 (2014).
30. Bar, M. Visual objects in context. Nature Reviews Neuroscience 5(8), 617 (2004).
31. Bar, M. et al. The contribution of context to visual object recognition. Journal of Vision 5(8), 88–88 (2005).
32. Bar, M. et al. Top-down facilitation of visual recognition. Proceedings of the National Academy of Sciences 103(2), 449–454 (2006).
33. Milner, A. D. & Goodale, M. A. Two visual systems re-viewed. Neuropsychologia 46(3), 774–785 (2008).
34. Rizzolatti, G. & Matelli, M. Two different streams form the dorsal visual system: anatomy and functions. Experimental Brain Research 153(2), 146–157 (2003).
35. Reynaud, E., Lesourd, M., Navarro, J. & Osiurak, F. On the neurocognitive origins of human tool use: A critical review of neuroimaging data. Neuroscience & Biobehavioral Reviews 64, 421–437 (2016).
36. Roux-Sibilon, A., Kalénine, S., Pichat, C. & Peyrin, C. Dorsal and ventral stream contribution to the paired-object affordance effect. Neuropsychologia 112, 125–134 (2018).
37. De Bellis, F. et al. Left inferior parietal and posterior temporal cortices mediate the effect of action observation on semantic processing of objects: evidence from rTMS. Psychological Research, 1–14 (2018).
38. Lambon Ralph, M. A. L., Jefferies, E., Patterson, K. & Rogers, T. T. The neural and computational bases of semantic cognition. Nature Reviews Neuroscience 18(1), 42 (2017).
39. Wurm, M. F. & Caramazza, A. Distinct roles of temporal and frontoparietal cortex in representing actions across vision and language. Nature Communications 10(1), 289 (2019).
40. Ambrosini, E. & Costantini, M. Body posture differentially impacts on visual attention towards tool, graspable, and non-graspable objects. Journal of Experimental Psychology: Human Perception and Performance 43(2), 360 (2017).
41. Just, M. A. & Carpenter, P. A. Eye fixations and cognitive processes. Cognitive Psychology 8(4), 441–480 (1976).
42. Tucker, M. & Ellis, R. On the relations between seen objects and components of potential actions. Journal of Experimental Psychology: Human Perception and Performance 24(3), 830 (1998).
43. Mounoud, P., Duscherer, K., Moy, G. & Perraudin, S. The influence of action perception on object recognition: a developmental study. Developmental Science 10(6), 836–852 (2007).
44. Randerath, J., Li, Y., Goldenberg, G. & Hermsdörfer, J. Grasping tools: effects of task and apraxia. Neuropsychologia 47(2), 497–505 (2009).
45. Randerath, J., Goldenberg, G., Spijkers, W., Li, Y. & Hermsdörfer, J. From pantomime to actual use: how affordances can facilitate actual tool-use. Neuropsychologia 49(9), 2410–2416 (2011).
46. Randerath, J., Martin, K. R. & Frey, S. H. Are tool properties always processed automatically? The role of tool use context and task complexity. Cortex 49(6), 1679–1693 (2013).
47. Cisek, P. Cortical mechanisms of action selection: the affordance competition hypothesis. Philosophical Transactions of the Royal Society B: Biological Sciences 362(1485), 1585–1599 (2007).
48. Beauchamp, M. S. & Martin, A. Grounding object concepts in perception and action: evidence from fMRI studies of tools. Cortex 43(3), 461–468 (2007).
49. Boronat, C. B. et al. Distinctions between manipulation and function knowledge of objects: evidence from functional magnetic resonance imaging. Cognitive Brain Research 23(2-3), 361–373 (2005).
50. Canessa, N. et al. The different neural correlates of action and functional knowledge in semantic memory: an FMRI study. Cerebral Cortex 18(4), 740–751 (2007).
51. Creem-Regehr, S. H. & Lee, J. N. Neural representations of graspable objects: are tools special? Cognitive Brain Research 22(3), 457–469 (2005).
52. Orban, G. A. & Caruana, F. The neural basis of human tool use. Frontiers in Psychology 5, 310 (2014).
53. Chen, Q., Garcea, F. E., Jacobs, R. A. & Mahon, B. Z. Abstract representations of object-directed action in the left inferior parietal lobule. Cerebral Cortex 28(6), 2162–2174 (2018).
54. Reynaud, E., Navarro, J., Lesourd, M. & Osiurak, F. To Watch is to Work: a Review of NeuroImaging Data on Tool Use Observation Network. Neuropsychology Review, 1–14 (2019).
55. Goldenberg, G. & Spatt, J. The neural basis of tool use. Brain 132(6), 1645–1655 (2009).
56. Caspers, S. et al. The human inferior parietal lobule in stereotaxic space. Brain Structure and Function 212(6), 481–495 (2008).
57. Caspers, S. et al. The human inferior parietal cortex: cytoarchitectonic parcellation and interindividual variability. Neuroimage 33(2), 430–448 (2006).
58. Goldenberg, G. Apraxia: The cognitive side of motor control. (Oxford University Press, 2013).
59. Thompson, E. L., Bird, G. & Catmur, C. Conceptualizing and testing action understanding. Neuroscience & Biobehavioral Reviews 105, 106–114 (2019).
60. Kalénine, S. et al. The sensory-motor specificity of taxonomic and thematic conceptual relations: A behavioral and fMRI study. Neuroimage 44(3), 1152–1162 (2009).
61. Kalénine, S. & Buxbaum, L. J. Thematic knowledge, artifact concepts, and the left posterior temporal lobe: Where action and object semantics converge. Cortex 82, 164–178 (2016).
62. Sass, K., Sachs, O., Krach, S. & Kircher, T. Taxonomic and thematic categories: Neural correlates of categorization in an auditory-to-visual priming task using fMRI. Brain Research 1270, 78–87 (2009).
63. Tsagkaridis, K., Watson, C. E., Jax, S. A. & Buxbaum, L. J. The role of action representations in thematic object relations. Frontiers in Human Neuroscience 8, 140 (2014).
64. Rogers, T. T. & McClelland, J. L. Semantic cognition: A parallel distributed processing approach. (MIT press, 2004).
65. Meteyard, L., Cuadrado, S. R., Bahrami, B. & Vigliocco, G. Coming of age: A review of embodiment and the neuroscience of semantics. Cortex 48(7), 788–804 (2012).
66. Koechlin, E. & Summerfield, C. An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences 11(6), 229–235 (2007).
67. Bortoletto, M. & Cunnington, R. Motor timing and motor sequencing contribute differently to the preparation for voluntary movement. Neuroimage 49(4), 3338–3348 (2010).
68. Stadler, W. et al. Predicting and memorizing observed action: differential premotor cortex involvement. Human Brain Mapping 32(5), 677–687 (2011).
69. Christoff, K. et al. Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage 14(5), 1136–1149 (2001).
70. Meteyard, L., Bahrami, B. & Vigliocco, G. Motion detection and motion verbs: Language affects low-level visual perception. Psychological Science 18(11), 1007–1013 (2007).
71. Bar, M. A cortical mechanism for triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience 15(4), 600–609 (2003).
72. Buxbaum, L. J., Schwartz, M. F. & Carew, T. G. The role of semantic memory in object use. Cognitive Neuropsychology 14(2), 219–254 (1997).
73. Osiurak, F. et al. Object utilization and object usage: A single-case study. Neurocase 14(2), 169–183 (2008).
74. Oldfield, R. C. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9(1), 97–113 (1971).
75. Papoutsaki, A. et al. Webgazer: Scalable webcam eye tracking using user interactions. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence-IJCAI 2016 (2016).
76. Peirce, J. et al. PsychoPy2: Experiments in behavior made easy. Behavior Research Methods 51(1), 195–203 (2019).

Author information

Contributions

G.F. and M.A.B. conceived and designed the study and the experiments; G.F. developed the experimental software and scripts, conducted the experiments, pilots and setup, analysed the data and prepared the figures; G.F. wrote the paper; M.A.B. revised the manuscript and provided critical comments and theoretical contribution.

Corresponding author

Correspondence to Giovanni Federico.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Federico, G., Brandimonte, M.A. Looking to recognise: the pre-eminence of semantic over sensorimotor processing in human tool use. Sci Rep 10, 6157 (2020). https://doi.org/10.1038/s41598-020-63045-0
