Introduction

Information processing systems have evolved over time to facilitate survival and reproduction, allowing primates to interact with the physical reality that surrounds them. Perceptual systems extract the information about the environment that is needed to guide action. Many different objects may be present in a visual field, yet information specific to just one of these objects must uniquely determine the spatiotemporal coordinates of the end-point of the reaching gesture, the orientation and opening of the hand and so on. Some type of selective process seems to be at work, mapping only those aspects of the visual array specific to the targeted object onto appropriate action control parameters. Due to the pressure to survive in complex environments, highly efficient attention systems have evolved linking actions with targeted objects1,2,3.

By necessity, these systems must, nevertheless, represent more than just the object targeted for action. Consider, for example, the moment when someone chooses a piece of fruit from a bowl containing a rich variety: while many pieces are visible and within reaching distance, only the desired one governs the pattern and the direction of the hand's movement. Coherent action propelling the hand directly to the chosen piece would be difficult, if not impossible, if competing objects were not fully represented. As the hand is clearly able to move around and/or above the irrelevant objects (in this case, pieces of fruit), these must - by necessity - be internally represented4.

A number of studies examining the properties of selective reaching-to-grasp actions in humans have shown that the motor system does not process information regarding a target's surroundings in some circumstances, while it does in others. A circle surrounded by other circles will, in fact, appear smaller/larger if the circles surrounding it are enlarged/reduced (the Titchener illusion5). The resistance of grasping to this illusion has been considered evidence that at times the motor system uses a representation of object size that is unaffected by context and quite different from that used by perceptual processing6. Other ‘illusion’ studies7,8,9 and ‘selection-for-action’ paradigms4,10,11,12,13,14,15,16,17,18,19,20 have, instead, revealed that the perceptual features of objects surrounding a target do indeed determine interference effects. In these circumstances, the simultaneous activation of responses to both a target and a distractor produces cross-talk, during which the kinematic properties of the grasping movement evoked by a distractor contaminate those evoked by the target21,22. This ‘fusion’ process has been explained in terms of selective attention mechanisms, which mediate the selection of particular objects for overt action, with one specific mechanism apparently acting to inhibit competing internal representations of non-targets23,24,25. The interference effects caused by the presence of non-targets observed during those experiments may reflect inhibitory mechanisms in real time.

In contrast with the wealth of psychophysical data available concerning action selection in humans21,22, it is as yet unclear whether this process also applies to motor programming in non-human primates. The study outlined here was undertaken with the intent of ascertaining whether these animals (in this case Macaca fascicularis monkeys) apply principles of action selection while they go about their normal, often feeding-linked activities, such as grasping objects. The arena of prehensile actions was considered the most favorable condition in comparative terms to investigate action selection mechanisms in non-human primates, given the similarities between humans and macaques in their kinematic patterns of reaching and grasping26,27,28,29,30,31,32,33 and the possibility of capitalizing on a paradigm already successfully utilized by one study to analyze these mechanisms in humans12. The individuals participating in that particular study were instructed to grasp an object flanked by a distractor that was smaller or larger than the target (i.e., the object to be grasped). In the former case, the amplitude of the hand aperture, that is, the distance between the index finger and the thumb, was smaller with respect to a no-distractor situation, while in the latter the opposite was true. The results indicate that the intrinsic features (in this case, the size) of a distractor elicit competing responses and have a selective influence on kinematic parameterization.

In light of these findings, the present study was designed to systematically investigate reaching-to-grasp movement kinematics in free-ranging macaque monkeys as they grasped objects presented alone or flanked by smaller or larger distractors. We predicted that, just as in humans, if monkeys employ action selection principles for their reaching-to-grasp movements, their grip sizes should vary depending on the features of the distractors that were present and/or viable. We found that monkeys represent distractors in terms of their corresponding grasping actions and these representations compete for action control. These findings suggest that, again just as in humans, action selection mechanisms in monkeys have access to action-directed representations.

Results

Procedure

Twenty adult Old World monkeys (Macaca fascicularis) that were members of a single free-ranging troop made up of 65 animals living in Pulau Besar, Langkawi, Malaysia were observed. The animals were filmed from a distance while they reached for and grasped objects naturally found in that environment. Kinematic analysis of those movements was performed post hoc using in-house software. In order to compare the results with human studies12,15, our analysis focused on precision and power grips (Fig. 1) and specifically on the amplitude of the maximum hand aperture (i.e., the maximum distance between the index finger and the thumb). As previously demonstrated, this dependent parameter efficaciously measures the effects of distractors during grasping movements22.

Figure 1

Grip posture, marker positioning and stimuli layout.

Schematic drawing showing the grip posture adopted by an animal and positioning of the markers for the purpose of digitalization. Markers were located (post hoc) on the wrist and on the distal phalanx of the thumb and index fingers. In the upper close up, a precision grip (involving the tip of the forefinger and the thumb) used to grasp small objects and a power grip (the monkey's fingers are wrapped around an object in opposition to the thumb) are represented. In the lower one, an example of the incongruent condition in which a large sized target (T) is flanked by a small distractor (D) is represented.

Test conditions

Out of the 4500 grasping movements filmed, 600 that were performed in a context considered suitable to test our experimental hypotheses were extracted post hoc. One hundred and eighty of these were randomly chosen, 30 for each of the six experimental conditions being studied: 1) There was a ‘small control’ condition in which a monkey grasped a small object using a precision grip. 2) There was a ‘large control’ condition in which a monkey grasped a large object using a power grip. In both control conditions no other objects were present within reaching distance. 3) In the ‘small congruent’ condition the monkey grasped a small object in the presence of other small-sized objects within reaching distance. 4) In the ‘large congruent’ condition the monkey grasped a large object in the presence of other large-sized objects within reaching distance. 5) In the ‘small incongruent’ condition the monkey grasped a small object in the presence of other large-sized objects within reaching distance. 6) In the ‘large incongruent’ condition the monkey grasped a large object in the presence of other small-sized objects within reaching distance.
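The random selection of 30 movements per condition can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_per_condition`, the condition labels, the seed, and the even spread of the 600 suitable movements across conditions are hypothetical and not taken from the study.

```python
import random
from collections import defaultdict

# Labels for the six conditions described in the text (names are illustrative).
CONDITIONS = [
    "small control", "large control",
    "small congruent", "large congruent",
    "small incongruent", "large incongruent",
]

def sample_per_condition(movements, per_condition=30, seed=0):
    """Randomly pick `per_condition` movement ids from each condition.

    `movements` is an iterable of (movement_id, condition) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_condition = defaultdict(list)
    for movement_id, condition in movements:
        by_condition[condition].append(movement_id)

    selected = {}
    for condition, ids in by_condition.items():
        if len(ids) < per_condition:
            raise ValueError(f"not enough movements for {condition!r}")
        selected[condition] = sorted(rng.sample(ids, per_condition))
    return selected

# Hypothetical pool: 600 suitable movements, assumed to be evenly
# spread over the six conditions (the paper does not report the split).
pool = [(i, CONDITIONS[i % len(CONDITIONS)]) for i in range(600)]
selected = sample_per_condition(pool)
total_selected = sum(len(ids) for ids in selected.values())
```

Sampling within each condition, rather than from the pooled 600, is what keeps the design balanced at 30 movements per cell.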

When an analysis of variance (ANOVA) with object size (small, large) and condition (control, congruent, incongruent) as within-subjects factors was performed, the object size × condition interaction was found to be significant (F(1,19) = 52.28, P < 0.0001). Post hoc tests indicated that the hand aperture scaled with object size in the ‘control’ conditions. That is, the maximum hand aperture was significantly smaller for the small than for the large objects (Ps < 0.05; Fig. 2). These findings are in accordance with classical kinematic descriptions of reaching-to-grasp movements in both humans and macaques in cases where objects were presented without flankers12,26,27,28,31,32,34,35. Similar effects were found for the ‘congruent’ conditions (Ps < 0.05; Fig. 2). In agreement with human studies employing parallel experimental procedures, the presence of a flanker eliciting a hand aperture similar to the one needed for the target did not modify the relationship between the hand aperture and the object size12. That is, the maximum hand aperture was again significantly smaller for the smaller than for the larger objects (Ps < 0.05; Fig. 2). Notably, and as previously reported in human studies14,15,20, in the incongruent conditions information gained from a flanker did not go unnoticed: the hand aperture used to grasp the target shifted toward the one that would have been used for the flanker. When the animal grasped a large target flanked by an object evoking an incongruent small grasp, the amplitude of the maximum hand aperture was smaller than what would have been used if the target had been flanked by a congruent object or presented alone (Ps < 0.05; Fig. 2). The opposite took place when the animal grasped a small target flanked by a large object (Ps < 0.05; Fig. 2). Although the grip aperture for the ‘large incongruent’ condition was smaller, it was, nevertheless, sufficient for the ‘large’ target to be grasped.

Figure 2

Graphic representation of the interaction “condition by stimulus size” for the test conditions.

Grip apertures for power and precision grip movements for the control, congruent and incongruent experimental conditions are represented. Bars represent the standard error of means.
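For readers reproducing this kind of plot, a minimal sketch of how a condition mean and its standard error could be computed from a set of maximum apertures is given below. The helper name `mean_and_sem` and the three aperture values are hypothetical and do not come from the study's data.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    if len(values) < 2:
        raise ValueError("at least two values are needed for a standard error")
    mean = statistics.mean(values)
    # SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Hypothetical maximum grip apertures (mm) for one condition.
hypothetical_apertures_mm = [28.0, 30.0, 32.0]
mean_mm, sem_mm = mean_and_sem(hypothetical_apertures_mm)
```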

Control conditions

To examine whether the number of distractors affected the trial outcomes, we classified the movements depending on the number of distractors present within reaching distance (see Table 1). When an ANOVA including condition (congruent, incongruent), object size (small, large) and number of distractors (1, 2, 3) as the main factors was performed, the condition × object size interaction was found to be significant (F(1,19) = 35.21, P < 0.0001). Post hoc tests for this interaction led to the same results as for the main analysis. The amplitude of the maximum grip aperture for the small congruent condition was smaller than that for the large congruent condition (30 ± 2 vs. 58 ± 3 mm; P < 0.05). It was also significantly smaller for the large incongruent than for the large congruent condition (44 ± 2 vs. 58 ± 4 mm; P < 0.05). The opposite effect was found when the small congruent condition was compared with the small incongruent condition (30 ± 2 vs. 53 ± 2 mm; P < 0.05). Significant differences between the large incongruent and the small incongruent conditions were also found (44 ± 2 vs. 53 ± 3 mm; P < 0.05). The congruent and incongruent conditions appeared to be unaffected by the number of distractors present, as the ‘number of distractors’ main factor did not interact significantly with the other factors.

Table 1 Movements classified according to the number, type, location and reaching distance of distractors

To examine whether highly salient food items used as targets and/or distractors biased the strength of the effects, we classified the movements depending on the type of items utilized: in the first situation both items were food, in the second both were inedible, in the third the target was a food item and the distractor was inedible and in the fourth the target was inedible and the distractor edible (see Table 1).

When an ANOVA including condition (congruent, incongruent), object size (small, large) and edible/non-edible combination (1, 2, 3, 4) was performed, the condition × object size interaction was found to be significant (F(1,19) = 27.12, P < 0.001). Post hoc tests for this interaction confirmed the results of the original analysis. For the congruent conditions, the amplitude of the maximum grip aperture scaled with object size (small congruent = 30 ± 3 mm; large congruent = 57 ± 4 mm; Ps < 0.05). The amplitude of the maximum grip aperture was smaller for the large incongruent condition than for the large congruent one (45 ± 2 vs. 57 ± 4 mm; P < 0.05). The opposite effect was found when the small incongruent condition was compared with the small congruent condition (55 ± 4 vs. 30 ± 3 mm; P < 0.05). Significant differences between the large incongruent and small incongruent conditions were also noted (45 ± 2 vs. 55 ± 4 mm; P < 0.05). The congruent and the incongruent conditions were unaffected by the type of edible/non-edible combination presented, as the ‘combination’ main factor did not interact significantly with the others.

To examine whether the distractor's location affected experimental outcomes, we classified the movements depending on where the distractor was located relative to the target (see Table 1). An ANOVA including condition (congruent, incongruent), object size (small, large) and location (right, left, behind the target, in front of the target) was performed. The condition × object size interaction was significant (F(1,19) = 46.03, P < 0.0001). Post hoc tests for this interaction confirmed the results of the original analysis. The amplitude of the maximum grip aperture scaled with object size for the congruent conditions (small congruent = 31 ± 2 mm; large congruent = 59 ± 3 mm; Ps < 0.05). The amplitude of the maximum grip aperture was smaller for the large incongruent than for the large congruent condition (45 ± 2 vs. 59 ± 3 mm; P < 0.05). The opposite effect was found when the monkeys grasped a small object flanked by a large one (small congruent vs. small incongruent: 31 ± 2 vs. 51 ± 3 mm; P < 0.05). Significant differences between the large incongruent and small incongruent conditions were found (45 ± 2 vs. 51 ± 3 mm; P < 0.05). The congruent and incongruent conditions were found to be unaffected by the distractor's location, as the ‘location’ main factor did not interact significantly with the others.

To examine whether distractors located beyond reaching distance elicited effects similar to those within reach, additional controls were run. We randomly selected 20 more movements for each experimental condition (‘small congruent’, ‘large congruent’, ‘small incongruent’ and ‘large incongruent’) in which the distractor(s) was/were beyond the animal's reach. Efforts were made to match the conditions in which a target was surrounded by within-reach distractors as far as the number of distractors, distractor types and locations were concerned. When an ANOVA including object size (small, large), condition (congruent, incongruent) and distractor distance (within reach, beyond reach) as within-subjects factors was performed, the object size × condition × distractor distance interaction was found to be significant (F(1,19) = 38.12, P < 0.0001). Post hoc tests indicated that the maximum hand aperture was significantly smaller for the small than for the large congruent condition (within reach = 30 ± 3 vs. 59 ± 4 mm; beyond reach = 29 ± 3 vs. 59 ± 4 mm; Ps < 0.05). When the distractor was ‘within reach’, the maximum grip aperture amplitude for the large incongruent condition was smaller than that for the large congruent one (45 ± 4 vs. 59 ± 4 mm; P < 0.05). The opposite effect was noted when the monkey grasped a small object flanked by a large one (small congruent vs. small incongruent: 30 ± 3 vs. 55 ± 4 mm; P < 0.05). Information gained from the distractor did not, however, influence the target action pathways in the ‘beyond reach’ incongruent conditions. No significant differences were found between the small congruent and incongruent conditions (29 ± 2 vs. 28 ± 2 mm; P > 0.05) or between the large congruent and incongruent conditions (59 ± 4 vs. 60 ± 3 mm; P > 0.05).

Discussion

This study focused on some aspects of action selection mechanisms in macaque monkeys living in totally unconstrained situations. The paradigm utilized here was, nevertheless, characterized by real world interactions within an established, natural, verifiable structure. It was our intent to observe the primates reaching for an object (such as a stone) when other objects were also available and to verify if and how distractors can interfere with an action plan.

Generally speaking, the monkeys' behavior was comparable with what is classically observed in humans carrying out similar tasks: there were indeed interference effects in movement kinematics when the monkeys went to grasp a target in the presence of distractors12. The fact that these effects were noted in Old World monkeys provides, moreover, further insight into our understanding of how action selection mechanisms have evolved in primates within perception action systems. Some have postulated that action selection processes may have evolved to mediate the selection of particular objects for overt action and one of these could be a mechanism inhibiting competing internal representations of non-targets23. The interference linked to the presence of non-targets noted here may reflect these inhibitory mechanisms in real time.

We hypothesized that objects in an action space can be processed in a parallel way during an initial perceptual analysis of non-targets. As these perceptual inputs are capable of automatically activating their associated responses, this initial perceptual processing flows continuously into brain areas that represent and subsequently initiate action. In view of this highly efficient, automatic conversion of perceptual inputs into object-directed actions, different objects in a scene can evoke parallel actions36,37. In other words, the type of representation created for a distractor contains information about the action that the object prompts and that action, if incompatible, competes with the one programmed for the target. This hypothesis is consistent with the affordance theory38 according to which perceptions can flow directly into actions even if there is little or no intention to act36,38,39,40. Monkeys, then, just as humans, could be sensitive to non-targets' effects in view of their potential role as targets for action.

This conclusion might contrast with the model according to which the motor system uses different pathways to represent the visual world and to plan action6, a concept supported by findings suggesting that grasp calibration is refractory to the perceptual features surrounding a target5,41. We need, nevertheless, to remember that most of the distractors used in those experiments were two-dimensional (2-D) while the targets themselves were three-dimensional (3-D). In this respect, interference effects on the kinematics of reaching-to-grasp movements appear to be present only for 3-D distracting information42. Two-dimensional photographic print distractors, which do not share functional features with the target, do not appear to affect hand grasp formation. ‘Dimensionality’ may then be the reason why in some circumstances motor resistance to surrounding visual information has been noticed43. Unfortunately, the differences between 2-D and 3-D distractors cannot be tested in non-human primates, at least in the fully ecological conditions considered in the present study.

The distractors in the incongruent conditions utilized here required a different type of prehension with respect to the target objects. Parallel computations for different types of grasps - one for the target and one for the distractor - may have caused the variations noted in the movement kinematics directed to grasp a target. In macaque monkeys, various types of grasping movements or actions are subserved by different neural populations44 and the kinematics change depending on the type of grasp needed26,31,32. In light of these observations, it is possible that conflicts arise when the distractor and the target require different prehensile patterns to be grasped or manipulated. Even though the grip aperture for the small incongruent condition was larger and a slight movement of the remaining fingers was noted during the pre-shaping phase, the monkey still used a precision grip. Likewise, although the grip amplitude was smaller for the large incongruent condition, the monkey still used a power grip to grasp the large object.

The results presented here seem to favor the hypothesis that, as previously demonstrated45, monkeys and humans not only share a number of kinematic features and neural responses with regard to grasping actions but also some selection mechanisms, such as inhibition, specifically linked to action control. When attention is focused on a target, the representation of potential distractor/s is inhibited. As both the target and the distractor evoke parallel actions, the competition between simultaneous responses is resolved by inhibitory mechanisms. This hypothesis is largely compatible with theories emphasizing the importance of attention in shaping behaviour by affecting motor output2.

The findings outlined here also seem to support the affordance competition hypothesis46 according to which sensory stimuli tend to directly evoke the actions afforded by them and that competing or interfering evoked actions are eliminated by means of selective attention mechanisms that reduce the amount of information that is transformed into action-related representation. The action control role of selective attention also suggests that one of the primary functions of neural connections from the motor to the sensory areas observed in many species47 could be to facilitate the selection between competing actions.

The main implication of these findings is that monkeys seem to be able to link perception to action through internal representations, implying that non-human primates possess the ability to form mental representations of space and objects48. The actions an object prompts appear to be activated automatically, while inhibitory attentional processes channel the action into meaningful goal-directed behavior. Taken together, our findings mirror those described in humans and indicate that the basic cognitive operations allowing for action selection have deep evolutionary roots.

Methods

Study species

Twenty adult macaque monkeys (Macaca fascicularis), all belonging to a single free-ranging troop made up of 65 animals living in Pulau Besar, Langkawi, Malaysia, were studied. The sample included 10 males and 10 females, all with an estimated age of no less than four years.

Data collection

A total of 10 hours of video footage was filmed during daylight hours between 10:00 and 14:00 in the period between November 2 and November 27, 2008. The video was filmed ad libitum using a digital camcorder. In view of the difficulty of filming any given monkey grasping an object for any length of time before it moved away or turned its back, ad libitum rather than all-occurrence sampling was considered the most appropriate method to assess their behavior in natural conditions49. The monkeys were filmed while they were standing or sitting on the ground as they grasped objects during their normal activities. As all contact with them was avoided, the video footage was filmed using a zoom lens from a distance. Only reaching and grasping movements performed on a plane perpendicular to the camera axis and with the animal located in the central part of the image were selected for analyses. As is documented in the literature concerning both humans34 and macaques28,31, many reaching/grasping movements take place in the sagittal plane; restricting the analysis to movements filmed side-on in this way was intended to avoid motion artifacts.

Grip classification

Two experts unaware of the study rationale and blind to the experimental conditions analyzed the footage frame-by-frame (frame duration: 20 milliseconds) using in-house software developed to perform post hoc kinematic analysis. Inter-rater reliability was high (Cohen's κ = 0.85). For comparison purposes, the movements most closely resembling those studied in the laboratory26,28,29,30, in semi-ecological27,50,51 or in fully-ecological31,32 settings were selected for further analysis. Video frame sequences were analyzed for grasping movements that could be unambiguously identified and classified as a power or a precision grip on the basis of the skin surface areas contacting the object. As explained above, although all grasping movements were analyzed, our study focused on precision and power grips (Fig. 1). In a precision grip, used to manipulate small objects such as seeds and soil fragments, the distal pad of the thumb is opposed to the radial side of the index finger. In a power grip, used to manipulate large objects such as stones or pieces of fruit, all four fingers and the palm are wrapped around an object in one direction while the thumb is wrapped around it in the opposite one. Obviously, spontaneous movements in natural environments do not necessarily fit into classical categories: at times three fingers are involved, at others various finger combinations are utilized, often changing fluidly from one configuration to another.
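The agreement statistic reported above can be illustrated with a short sketch. The function `cohens_kappa` implements the standard two-rater formula, κ = (p_o − p_e) / (1 − p_e); the grip labels assigned by the two raters below are hypothetical and are not the study's classifications.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical classifications of 20 movements; the raters disagree on one.
rater_a = ["precision"] * 10 + ["power"] * 10
rater_b = ["precision"] * 9 + ["power"] * 11
kappa = cohens_kappa(rater_a, rater_b)  # about 0.9 for these labels
```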

Experimental stimuli

The stimuli considered fell under two main categories, namely ‘small’ and ‘large.’ ‘Small’ objects were small pieces of fruit, soil fragments and small rocks (diameter ~ 1 cm) requiring a precision grip movement implying a small grip aperture (i.e., maximum distance between the index finger and the thumb). ‘Large’ objects were large pieces of fruit, clay balls, large rocks (diameter ~ 4 cm) requiring a power grip movement implying the opposition of the thumb with the other fingers and a large grip aperture. These objects were chosen because their shapes resembled those used in previous studies on humans and macaques (i.e., spherical objects). All of the objects that were assessed were indigenous to that area and were not introduced by the experimenters. The experimental stimuli could be clearly isolated from the background as in all cases the terrain was either sand or mud.

Movement analysis

The video footage was transferred to in-house software developed to perform two-dimensional (2-D) kinematic analysis. Only those movements that were carried out while the animals were in a sitting position (i.e., with the elbow flexed and the torso bent forward) and that were characterized by similar hand-object distances (20 cm; ±0.3 cm) were compared. That position (Fig. 1) was chosen because it facilitated the comparison of kinematic parameters across human34 and macaque26,27,28,31,32 studies. To avoid any skewing effect, only time frames in which reaching movements were performed along a plane that was perpendicular to the camera axis and the animal was located in the central part of the image were selected and analyzed. The positioning of the video camera axis and the plane of motion were verified by measuring the length of selected bone elements (e.g., arm). This procedure was utilized to guarantee a constant point of reference during movements taking place on a plane perpendicular to the camera axis. A frame of reference identifying X and Y axes as horizontal (ground) and vertical directions was manually set by an operator. A known length, selected case by case, in the camera's field of view and in the same plane as the movement was used as the measurement reference unit. As shown in Fig. 1, markers were placed post hoc on the digitized images of each subject's wrist and on the nails of the index finger and thumb to indicate the grip aperture as a function of time. The starting position was defined as the right hand resting on the ground in between the legs. The hand starting area for the selected movements was similar across subjects (±0.3 cm²). Initiation of movement was defined as zero wrist velocity. The end of the movement was defined as the moment when the hand grasped the object. Analyses were carried out manually post hoc by a single analyst. Movement tracking procedures were then performed in order to extract the kinematic parameter of interest. More specifically, in accordance with previous selection-for-action grasping studies in humans12, the analysis focused on the maximum grip aperture amplitude (the maximum distance between the thumb and the index finger). In accordance with the observation protocol, the laterality quotient (LQ) was 71 (±10), with an LQ of 100 reflecting a full right-hand preference. In order to facilitate comparisons with human data, only right-hand grasping movements were considered.
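The extraction of the maximum grip aperture from digitized 2-D marker positions can be sketched as follows. This is not the in-house software used in the study: the function names, the pixel coordinates and the reference length are hypothetical, and the only values taken from the text are the 20 ms frame duration and the definition of grip aperture as the thumb-index distance.

```python
import math

FRAME_DURATION_S = 0.020  # frame duration reported in the text (20 ms)

def pixel_scale_mm(ref_start_px, ref_end_px, ref_length_mm):
    """Millimetres per pixel, from a known length lying in the movement plane."""
    ref_px = math.dist(ref_start_px, ref_end_px)
    if ref_px == 0:
        raise ValueError("reference points must be distinct")
    return ref_length_mm / ref_px

def max_grip_aperture(thumb_px, index_px, mm_per_px):
    """Return (maximum aperture in mm, time of the maximum in seconds).

    `thumb_px` and `index_px` hold one (x, y) marker position per frame.
    """
    if len(thumb_px) != len(index_px) or not thumb_px:
        raise ValueError("marker tracks must be non-empty and of equal length")
    # Grip aperture per frame: thumb-index distance converted to millimetres.
    apertures = [math.dist(t, i) * mm_per_px for t, i in zip(thumb_px, index_px)]
    peak_frame = max(range(len(apertures)), key=apertures.__getitem__)
    return apertures[peak_frame], peak_frame * FRAME_DURATION_S

# Hypothetical example: a 50 mm reference object spans 100 px in the image.
scale = pixel_scale_mm((0, 0), (100, 0), 50.0)
thumb_track = [(0, 0), (0, 0), (0, 0), (0, 0)]
index_track = [(0, 20), (0, 60), (0, 116), (0, 80)]
peak_mm, peak_time_s = max_grip_aperture(thumb_track, index_track, scale)
```

Calibrating each clip against a known length in the movement plane is what allows apertures measured at different zoom levels and filming distances to be compared in millimetres.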

Statistics

A generalized linear mixed model implemented in the SPSS statistical package was utilized. Bonferroni corrections were applied (alpha level = 0.05).
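As an illustration of the correction mentioned above, a Bonferroni adjustment multiplies each p-value by the number of comparisons before testing it against the alpha level. The sketch below is generic and does not reproduce the SPSS procedure; the function name `bonferroni` and the raw p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and significance flags."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    # Multiply each p-value by the number of comparisons, capped at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant

# Hypothetical raw p-values from three post hoc comparisons.
raw_p = [0.001, 0.012, 0.2]
adjusted_p, is_significant = bonferroni(raw_p)
```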