The Size Congruity Effect Vanishes in Grasping: Implications for the Processing of Numerical Information

Namdar, Gal; Ganel, Tzvi; Algom, Daniel

doi:10.1038/s41598-018-21003-x


Article
Open access
Published: 09 February 2018

Gal Namdar¹,
Tzvi Ganel¹ &
Daniel Algom²

Scientific Reports volume 8, Article number: 2723 (2018)

Abstract

Judgments of the physical size in which a numeral is presented are often affected by the task-irrelevant attribute of its numerical magnitude, the Size Congruity Effect (SCE). The SCE is typically interpreted as a marker of the automatic activation of numerical magnitude. However, a growing literature shows that the SCE is not robust, a possible indication that numerical information is not always activated in an automatic fashion. In the present study, we tested the SCE via grasping by way of resolving the automaticity debate. We found results that challenge the robustness of the SCE and, consequently, the validity of the automaticity assumption. The SCE was absent when participants grasped the physically larger object of a pair of 3D wooden numerals. An SCE was still recorded when the participants perceptually indicated the general location of the larger object, but not when they grasped that object. These results highlight the importance of the sensory domain when considering the generality of a perceptual effect.

Introduction

According to Bertrand Russell’s famous epigram, pure mathematics is the subject in which we do not know what we are talking about, or whether what we say is true. The truism granted, mathematics, too, has its pawns, tokens, and moves, and it is the numerals that comprise the abstract, dimension-less tools of the game of mathematics. However, in contrast with their abstract nature in mathematics, in the empirical world every numeral comes dressed in a multitude of physical attributes. Every numeral you have ever seen, heard, or touched came in a certain physical size, shape (font), loudness, or texture. Numerals can even come in the form of graspable three-dimensional objects. Inevitably, a numeral is the union of semantic and physical features. If so, it is only appropriate to ask: How do these features interact in cognitive processing? Suppose that you judge the physical size of a numeral presented for view. Does the numerical magnitude of the numeral influence your perception of size? Such an influence is consistent with the popular idea that numerical information is activated in an automatic fashion whenever a numeral is presented for view for any purpose. However, the effect of irrelevant numerical magnitude is often absent in judgments of size, thereby challenging the alleged automaticity of numerical perception. A novel attempt at resolving the nature of numerical processing forms the overriding theme of the present study.

The issue is not new, of course, but here we address it by combining ideas and methods from two separate domains. The first domain is that of visual perception, in particular visual selective attention: Can people focus exclusively on physical size while ignoring numerical magnitude? If they can, then they select the physically larger member of the pair 7 2 in which the 7 is printed larger (congruent stimulus) as fast as in the pair 7 2 in which the 2 is printed larger (incongruent stimulus). If they cannot, performance is better with the first than with the second pair, yielding the typical measure of the size congruity effect (SCE). Because the SCE frequently obtains, it is considered by many to be a marker of the automatic activation of numerical information. The problem is that the SCE is absent in a fair number of studies. Moreover, its absence is not haphazard or random, but rather the orderly outcome of the experimental design used. Slight variations of the stimuli and the design are capable of producing an inflated SCE, of eliminating the SCE altogether, or of reversing it such that physical size now intrudes on numerical perception. The upshot is that the question of automatic activation of numerical magnitude is not fully settled within visual perception and attention. It is at this juncture that evidence from a second domain, grasping, becomes valuable.

It is increasingly recognized that our sense of vision subserves two separate functions: perception and action. Vision is used in the first sense when you assess (subjectively) the size of a chair in the shop to decide if it fits your room. Vision is used in the second sense when you reach to the chair with your hands (e.g., to lift it after purchase). To underscore the difference, the function of vision in the second instance is to guide motor action by the viewer, rather than to enable mere perception or a perceptual report. Now, all SCE studies to date are based on visual perception, not visually guided action – the pertinent studies did not include a motor component by which the participant extends her hands to hold the stimulus. The present study fills this lacuna by subjecting the hypothesis of automaticity to testing by grasping. This test is singularly severe because grasping has been shown to be immune to effects of task-irrelevant features or factors of context. For example, grasping has been shown to be exempt from such compelling visual (perceptual!) illusions as the height-width illusion or the Ponzo illusion. If the SCE survives responding via grasping, the hypothesis that numerical information is activated in a mandatory fashion is lent powerful support.

In the remainder of the introduction, we first establish the perception-action partition of visual functions. We then summarize relevant research from vision-for-perception and from vision-for-action by way of motivating the present study. We believe that our approach and findings are bound to revitalize the debate on the key cognitive question of how people process semantic information.

The Perception-Action Complementarity

The distinction between the two visual systems is supported by a plausible anatomical substrate and systematic differences in operation characteristics. According to an influential idea suggested by Goodale and Milner¹, the visual system is segregated, anatomically and functionally, into two visual pathways. In this view, the ventral pathway supports visual perception, while the dorsal pathway subserves goal-directed action. The evidence supporting this model is strong, based on multiple instances of double-dissociation^{2,3,4,5,6,7,8} (but see^{9,10}). Corresponding to the anatomical distinction are characteristic differences in modus operandi. The perceptual system is notorious for its susceptibility to contextual information. For example, people are unable to ignore the meaning of a (color) word when naming its print color (the Stroop effect, see^{11,12} for reviews), are unable to ignore the object’s height when judging its width (Garner interference¹³, see¹⁴ for review), and are unable to ignore the object’s size when estimating its weight (the size-weight illusion¹⁵). Familiar visual-perceptual illusions (e.g., the Ponzo or Ebbinghaus) are further instances of contextual influence on perception. The SCE, when it is present, provides another example of the difficulty of ignoring task-irrelevant information in perception, this one in the numerical domain. It takes people longer to select the physically larger member of a pair of numerals when this member is numerically smaller than when it is also numerically larger.

In contrast with mere perception, actions guided by vision seem largely exempt from contextual influences. It is repeatedly found that reach-to-grasp kinematics are unaffected by context because the dorsal stream visual network acts via exclusive focusing on the to-be-held object, preserving its absolute size (e.g.,¹⁶ and references therein). It is due to this effector-specific absolute mode of operation that grasping is largely resistant to visual illusions. In fact, mere intention to grasp suffices to activate the action system. By contrast, viewing passively the same stimuli fools the perceptual system because the ventral system relies on relative metrics within scene-based reference frames.

Concerning visual illusions in particular, Vishton et al.¹⁷ showed that the magnitude of the Ebbinghaus illusion decreases when the observer is asked to grasp or reach for the target rather than merely report its size. If the observer is asked to throw balls at the target, the perception of the target’s size is correlated with the number of successful hits¹⁸. Visual search for features such as object orientation has been shown to require fewer saccades when participants intend to act upon the searched items¹⁹. In this study, we tested the effect on performance of both action and the intention to act.

Vision-for-Perception: Is Numerical Magnitude Processed in a Mandatory Fashion?

The SCE is regularly observed in studies of numerical perception (e.g.,^{20,21,22,23,24,25,26,27}). Given its recurrence, the effect is often taken to reflect the mandatory processing of numerical magnitude (e.g.,^{28,29,30}). However, systematic biases in the standard experimental design call into question the robustness of the SCE, and consequently, the obligatory nature of numerical processing. A little stimulus alchemy suffices to eliminate the SCE or reverse the effect such that physical size intrudes on perception of numerical magnitude more than vice versa (the reverse-SCE²³). Algom et al.²⁰ (see also¹⁴) have identified two critical biases prevalent in published SCE studies. First, there is a glaring asymmetry in the number of stimuli used for the numerical and the physical dimensions. Typically, the numbers 1 to 9 (inclusive) are used for the former, but only two or three values (small, medium, large) are used for the latter. As a result, virtually all pertinent research pitted a finely grained numerical dimension against a coarse physical dimension. This asymmetry itself can determine the observed interaction (=SCE). Melara and Mounts³¹ (see also^{12,32}) have shown that the mere number of stimuli on an irrelevant dimension affects classification performance on the relevant dimension.

For another bias, the relative discriminability of values along the number and the size dimensions was not matched. Matched discriminability or salience means that the time and accuracy needed to tell apart values along the number dimension are the same as those needed to tell apart values along the size dimension. However, mismatched discriminability favoring numbers was present in virtually all studies of the SCE. Values along the number dimension were much more discriminable from one another than values along the size dimension (e.g., it took participants much longer to tell the small and large physical sizes apart than to tell the numbers 4 and 5 apart). The presence of this asymmetry is critical because the more discriminable dimension will disrupt performance on the less discriminable dimension more than vice versa^{31,32}. In the present case, irrelevant numerical values affected performance with physical size (=SCE) not because they are activated in an automatic fashion but simply because the values of number differed perceptually from one another more than did the values of physical size. Notably, when care was taken to match discriminability (and number of values along the two dimensions), the SCE collapsed. And, when physical size was purposely made more salient than numerical value, a reverse SCE emerged^{12,20,23,24,26,27,32}. The malleability of the SCE casts doubt on the automatic nature of processing numerical magnitude. Recently, Sobel and his coworkers³³ have found that number and size are processed in an independent fashion, a result inconsistent with automaticity. Are there further means to resolve the issue of mandatory processing of numerical magnitude?

Perception-for-Action: Context Independence when Grasping Numbers?

Grasping a part or some property of a multidimensional object has been found free of interference from task-irrelevant features of the same stimulus, although such interference is present in visual perception^{3,13}. Major visual-perceptual illusions diminish or disappear altogether when the responses are motor actions under direct visuomotor control^{7,34,35,36,37,38,39,40} (but see^{10,41,42}). Grasping is immune to contextual information to the extent that it has been shown to violate even such a fundamental psychophysical principle as Weber’s Law^{36,43,44,45}. If context independence prevails, then the SCE should vanish when people grasp numbers of different physical size, i.e., numerical magnitude should not affect the grasping response. Here we asked: Does the alleged automaticity in perceiving numerical information survive the change in response mode? If numerical magnitude is always processed, then the SCE should also emerge in grasping.

The Present Study

The main goal was to subject the SCE to a stringent test by a modality known to be immune to task-irrelevant information. Observing the vanishing of the SCE in grasping would add to the growing doubts about its obligatory nature⁴⁶. Simultaneously, this observation would further support the context-independent nature of the action system. This much granted, a stringent test invites equally stringent controls to exert its full impact. The control condition is required by the nuanced relation between grasping and motor responding. Note that grasping always involves motor responding mediated by the dorsal action system, but that the reverse is not true. Not all motor responses are grasping responses, and not all are mediated by the dorsal action system. Suppose that you are presented with two bottles and asked to select the larger one. When you do so by grasping the larger bottle, it is likely that your object-directed response is mediated by the dorsal action system. When you do so by pointing in the direction of the larger bottle with your finger (without holding it and without any plan of holding it), it is likely that your response is mediated by the ventral perceptual system. In point of fact, the latter example is on a par with oral or keypress responding conveying the person’s perception. Therefore, it is important to dissociate the influence of actual grasping from mere motor reaction without grasping or any intention of grasping the object. Again, this latter condition is actually that of perception, the motor movement notwithstanding.

In Experiment 1, we pioneered the crucial examination of the SCE by grasping. The participants literally grasped the larger real-life object depicting a numeral. In the perceptual control condition, the participants made the same decision (i.e., selecting the physically larger object), but they conveyed their response by tapping the thigh on the side of the selected object. This perceptual control condition was also included to test for the possibility that people simply ignore the numerical information embedded in the objects.

In Experiment 2, we used a modified perceptual size-congruency paradigm aimed at reducing reaction times in order to equate RTs with the grasping condition in Experiment 1. This was done in order to test whether the pattern of results obtained in Experiment 1 in the perceptual task could have been attributed to potential differences in reaction times between the grasping and the perceptual task. In Experiment 3, we expanded the arsenal of measures to include on-line kinematic recording of the full trajectory of movements in the two conditions. To anticipate the main finding, the results of all three experiments converged on the conclusion that the SCE is evident in perceptual estimations but vanishes in grasping.

Experiment 1: Grasping Congruent and Incongruent Objects

Method

Participants

A group of 24 right-handed students from Ben-Gurion University of the Negev, who gave their informed consent, participated in the experiment (11 males; mean age = 25.04, SD = 2.21). The experimental protocol was approved by the ethics committee of the Department of Psychology in Ben-Gurion University of the Negev. The study adhered to the ethical standards of the Declaration of Helsinki.

All participants signed a consent form prior to their participation in the experiment. The manuscript contains no information or images that could lead to identification of a study participant.

Stimuli and Design

The stimuli consisted of wooden objects cut in the form of the Arabic digits 2 and 8. Each digit-form was produced in two values of size, large (50 × 32 mm in height and width) and small (30 × 18 mm). The third dimension was constant at 5 mm. Factorial combination of number and size created the current set of 4 stimuli: a physically large object in the shape of a numerically large number, a physically large object in the shape of a numerically small number, a physically small object in the shape of a numerically large number, and a physically small object in the shape of a numerically small number. Note that the first and last of these objects comprise congruent stimuli, whereas the two in between comprise incongruent stimuli.

On each trial, a pair of objects was placed in front of the participant and she indicated which of the two was physically larger (see Fig. 1). The experiment included two conditions. In the object-directed movement condition (ODM), the participant reached for the physically larger object with the right hand, grasped and lifted it using the index finger and the thumb. In the non-directed movement condition (NDM), the participant used her index finger and thumb to touch either the right or left thigh, corresponding to the location of the physically larger digit. Note that the NDM condition entailed neither grasping nor intention of grasping. Each condition consisted of a block of 96 trials (half congruent, the other half incongruent), preceded by 10 practice trials. The order of conditions and the order of stimulus presentation within a condition were random and different for each participant.
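
The design just described can be sketched in code. The following Python snippet is a minimal illustration only and is not the software used in the study (the experimenter placed the objects by hand). It crosses number and size to obtain the four objects and assembles one 96-trial block, half congruent and half incongruent, in random order. The function names, the dictionary fields, and the composition of each pair (one large and one small object showing different digits, with the larger object equally often on each side) are illustrative assumptions that are not stated in the text.

```python
import itertools
import random

DIGITS = (2, 8)              # numerical values of the wooden objects
SIZES = ("small", "large")   # 30 x 18 mm and 50 x 32 mm


def make_stimuli():
    """Cross number and size to obtain the four objects."""
    stimuli = []
    for digit, size in itertools.product(DIGITS, SIZES):
        # Congruent: numerically large and physically large, or small and small.
        congruent = (digit == max(DIGITS)) == (size == "large")
        stimuli.append({"digit": digit, "size": size, "congruent": congruent})
    return stimuli


def make_block(n_trials=96, seed=None):
    """Build one block: half congruent, half incongruent, random order.

    Assumption (hypothetical, not from the text): each pair holds one
    large and one small object with different digits, and the larger
    object appears equally often on the left and on the right.
    """
    rng = random.Random(seed)
    trials = []
    for congruent in (True, False):
        large_digit, small_digit = (8, 2) if congruent else (2, 8)
        for i in range(n_trials // 2):
            trials.append({
                "large_digit": large_digit,
                "small_digit": small_digit,
                "large_side": "left" if i % 2 == 0 else "right",
                "congruent": congruent,
            })
    rng.shuffle(trials)  # random order, different for each seed
    return trials
```

Calling `make_block(seed=participant_id)` with a different seed per participant would give each participant a different random order, in the spirit of the randomization described above.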

Procedure

The participant was sitting in front of a black table top with the tips of the index finger and thumb of the right hand closed together and pressing on the Spacebar key as the starting point. The participant wore a set of LCD goggles (Translucent Technologies, Toronto, ON), with liquid-crystal shutter glasses that were used to control for stimulus exposure time. The experimenter was present in the room and placed the stimuli on the top of the table to start each trial. The stimulus objects became visible (i.e., the trial started) only with the opening of the goggles. Two stimulus objects were placed side by side, 120 mm apart. The midpoint between the stimuli was located at a distance of 150 mm from the starting point (see Fig. 1 again). The tasks were speeded and the participants were asked to respond as speedily and accurately as possible.

Data Analysis

The dependent measure was reaction time (RT). We measured the time to release the spacebar following presentation of the stimulus objects (i.e., from the opening of the goggles). For each participant under each condition, trials with RTs more than 2.5 SD below or above the mean were removed from the analysis (a total of 3% of the trials). RTs were collected using MATLAB (Mathworks, Natick, MA) and Psychtoolbox (http://www.psychtoolbox.org/). Accuracy levels in both tasks were near 100%.
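
The trimming rule above can be illustrated with a short Python sketch. It is not the authors' analysis code (the data were collected in MATLAB); the function name `trim_rts` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which SD estimator was used. The function would be applied separately to the RTs of each participant in each condition.

```python
import statistics


def trim_rts(rts, n_sd=2.5):
    """Drop RTs more than n_sd standard deviations from the mean.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the text does not specify the estimator.
    """
    mean = statistics.mean(rts)
    sd = statistics.stdev(rts)
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [rt for rt in rts if lower <= rt <= upper]


# Made-up RTs (ms) for one participant in one condition, with one slow trial.
example = [330, 340, 350] * 6 + [340, 900]
kept = trim_rts(example)
```

In this made-up example, the 900 ms trial lies beyond the upper cutoff and is removed, while the remaining 19 trials are kept.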

Results

Figure 2 gives the results. The crucial data are presented in the left-hand half of Fig. 2. When the observer grasped the physically larger object, the number imparted by its shape did not make a difference. Incongruent and congruent objects were separated by a negligible difference of less than 2 ms, with average response times of 336.5 and 335.2 ms, respectively (F < 1). Clearly, the SCE vanished in grasping. A Bayesian analysis further supported the parity with a Bayes factor (BF₀₁) of 2.77, indicating that the data were 2.77 times more likely under the null hypothesis than under the alternative. That our observers noticed the number under the current presentation is revealed by the results in the condition in which the observer merely indicated the lateral location of the physically larger object (a perceptual response, right-hand half of Fig. 2). In the NDM condition, the responses took longer for incongruent than for congruent objects [388 and 373 ms, respectively, F(1,23) = 19.27, p < 0.001, η_p² = 0.45, BF₁₀ = 94]. The SCE thus emerged in this condition. The interaction of task (grasping, directing) and object congruity (congruent, incongruent) further supported the vanishing of the SCE in grasping [F(1,23) = 18.7, p < 0.001, η_p² = 0.45]. A main effect was also found for task [F(1,23) = 20.3, p < 0.001, η_p² = 0.47, BF₁₀ > 1000] with RTs for grasping faster by 44 ms, on average, than those in the perceptual indication task. We discuss the source and implications of this difference between the two tasks in Experiment 2 and in the results section of Experiment 3 in which a similar difference is observed.
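
For readers who wish to retrace the arithmetic behind these statistics, the Python sketch below computes the SCE as the difference between mean incongruent and mean congruent RT, and recovers partial eta squared from an F ratio and its degrees of freedom through the standard identity η_p² = (F × df_effect) / (F × df_effect + df_error). The function names are hypothetical and the snippet is an illustration, not the analysis pipeline used in the study.

```python
def size_congruity_effect(rt_incongruent, rt_congruent):
    """SCE in ms: mean incongruent RT minus mean congruent RT."""
    return rt_incongruent - rt_congruent


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Cell means reported in the text (ms).
sce_ndm = size_congruity_effect(388, 373)        # perceptual (NDM) condition
sce_grasp = size_congruity_effect(336.5, 335.2)  # grasping (ODM) condition

# Task x congruity interaction, F(1,23) = 18.7.
eta_interaction = partial_eta_squared(18.7, 1, 23)
```

With the reported means, the SCE amounts to 15 ms in the NDM condition and to about 1.3 ms in grasping. The interaction F ratio yields a partial eta squared of about 0.45, in line with the reported value.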

Discussion

The evaporation of the SCE in grasping is the signature of the present results. The implications are far-reaching. At minimum, the results indicate that the SCE may be confined to visual perception; it is certainly missing when the response entails direct visuomotor control. It would be easy to dismiss the significance of these results as another example of the insensitivity of grasping to competing task-irrelevant information. On this view, it can be argued that the observers did not even notice the number information. However appealing, this reasoning is not valid. The data in our non-grasping, perceptual condition (done with the same stimuli under the same viewing conditions) show that the same participants were eminently aware of the numbers embedded in the objects (see^{47,48} for similar results). We conclude that the breakdown of the SCE in this experiment is a genuine result.

Is the SCE, hence the activation of numerical magnitude, mandatory under all contexts? Our results show that it is not. Our data thus join several demonstrations from visual perception itself in challenging the automaticity assumption in numerical perception. Automatic processing of numerical magnitude is a contingent phenomenon, not a necessary one.

One obvious reservation with respect to the conclusions drawn is the difference in absolute RTs between perception and action. Those for perception were longer than those for grasping. Consequently, the absence of the SCE in grasping can possibly be attributed to this faster responding, rather than to the different modality employed. Experiment 2 was planned to test for this possibility.

Experiment 2: Fast Perceptual Estimation of Congruent and Incongruent Images

Our goal in this experiment was to test for the presence of the SCE in perception when responding is as fast as that observed in grasping in Experiment 1. To generate very fast perceptual responses, we introduced three modifications of the task used in Experiment 1. First, we used a computerized version of the SCE task with the images of the same 3D objects appearing on the computer screen. Second, the participants responded to the larger physical image by pressing the appropriate key. Note that these features comprise the standard way of assaying the SCE in visual perception (when there is no interest in another modality). Third, we motivated our participants to respond fast by introducing a time-window at the end of which a disagreeable sound was presented to signal “sluggish” reaction. We asked: Would an SCE still be observed under very fast perceptual responding?