Introduction

To say that the self is a construction allows for the possibility of selfless states, experiential states in which the sense of self is highly attenuated or absent. But how to conceptualize such states is tricky.

(Thompson, 2020, p. 114)

Reflection on the problem of the “self” has a long history in the philosophical tradition. Among ancient Greek philosophers such as Heraclitus, Parmenides, and Socrates, discourse on the self centered on the exploration of knowledge, consciousness, and the nature of reality. Plato and Aristotle contributed insights into the intricate connection between the self and the physical and metaphysical realms. In Eastern philosophy, luminaries such as Confucius, Laozi, and the Buddha explored the concept of the self, emphasizing self-cultivation, impermanence, and the illusion of a permanent, unchanging identity. In recent decades, as philosophy of mind, cognitive science, and consciousness science have developed and intersected, the “self” has become a much-debated topic in these fields, and many interdisciplinary studies and discussions have emerged, with thinkers such as Derek Parfit, Thomas Metzinger, and Daniel Wegner examining facets of personal identity, the essence of consciousness, and the effects of social constructs on self-perception. Among these discussions, the substance theory and the illusion theory are the two most representative views, and they contradict each other. The core claim of the former is that the self is real, an independent entity; this claim rests on the fact that most of us intuitively perceive our personality and memories as remaining the same, as if the subject “I” were always present. The latter holds that the substantial, continuous self is an illusion because our experience is always in flux. Descartes’ theory of the self is the progenitor of the substance theory, and the idea that the self is somehow the “center” of the mental world resonates strongly with common sense (Searle, 2005); thus, many researchers hope to find a unique structure or region in the brain that could serve as its basis.
Some researchers believe that a specific region in the brain can demonstrate the existence of a unique physiological mechanism behind the self. For example, Searle (2004) highlighted the necessity of positing a rational self or agent, one that is free to act and is capable of being held responsible for its actions. This notion is a composite of the concepts of free action, explanation, responsibility, and giving reasons for things. However, numerous studies attempting to locate the functional regions associated with the self in the human brain have found them to be widely distributed. Although self-referential processing has been shown to be mediated by cortical midline structures (Northoff et al., 2006), the whole “self-referential network” is considered to include multiple brain regions, such as the prefrontal cortex, posterior cingulate cortex, precuneus, and temporoparietal junction (Damasio, 2012). In other words, emerging brain imaging results have not helped us to locate the expected “center” corresponding to the substantive self. From a neuroscientific perspective, we can see how one neuron affects the others and how one state leads to another, which does not seem to require an “I”. Thus, attention again turns to the illusionist position, the idea that the substantial, continuous self is an illusion. This idea was first proposed by Hume (1896), who suggested that the self is not an entity but rather a “bundle of sensations.” Illusion implies that we reject the self as a conscious entity and accept that the “self” does not represent anything lasting and real. This claim, although counterintuitive, has profoundly influenced the study of the self in logical empiricism, on which the influential “neuronihilism” of contemporary conceptualizations of the self is based (Zhang and Chen, 2016).

The greatest dilemma for substance theorists, as highlighted by illusionists, is that they cannot provide more direct evidence for the existence of a physical self, because when we look at ourselves, we seem to find only a series of interconnected processes without an intrinsic independent entity. However, illusionists cannot satisfactorily explain why we experience the self as a unified self-subject that controls what we think and do (Zhang et al., 2015). Unfortunately, classical Western philosophy either assumes a self beyond experience or neglects the issue altogether. Some approaches even fail to notice this contradiction. For example, the self-concept theory in psychology barely discusses the abovementioned problem. The main, and perhaps only, tradition we know of that addresses this paradox directly and has long studied it is the Eastern research tradition of meditative practices of mindfulness/awareness (Thompson, 2014). The constructivism of the self inherits the ideas of the Eastern research tradition but rejects the extreme claims of both substance theory and illusion theory, stating that the self is neither an entity nor an illusion. Constructivism considers the self an ongoing process, in which the “I” is equivalent to the process itself; in other words, we are not different from the process. We differ as individuals not because we have a unique metaphysical quality but because different selves emerge from specific and irreducible conditions, and each person’s self is constructed in the process of change. For this reason, it is necessary to return to the traditional Eastern roots of our understanding of self-constructionism.

Eastern tradition roots of self-constructionism

Five aggregates and the self

The five aggregates are material form, feeling, perception (or cognition), inclination (or volition), and consciousness. They are a set of categories common to all schools of Buddhism, and the Sanskrit equivalent of “aggregate” is skandha, which literally means “accumulation or heap” (Davids and Stede, 2015). It is said that when the Buddha first taught this framework for examining experience, he used the image of “a heap of grain” to represent each aggregate. According to Buddhist philosophy, the five aggregates together shape our sensory experiences and everything within our subjective world; they cover five aspects spanning the material and mental levels. From the perspective of cognitive psychology, the five aggregates correspond to the systems of information input, emotion encoding, information integration, information processing power, and information recognition (Thompson, 2020). The five aggregates constitute the psychophysical complex that forms each moment of individual experience and the distinctive person (Varela et al., 2017). Therefore, the five aggregates are a natural and logical place to look for the self.

The material form refers to the body’s composition and the activity of all material objects. The Buddhist classification of matter focuses more on directly understanding the interaction between our sensory organs and matter, which means the intuitive experience is vital. Therefore, material form refers to the body, and the physical stimuli at the level of the sadindriya or six sense organs—eye (cakkhu), ear (sota), nose (ghāna), tongue (jivhā), body (kāya), and mind (mano)—and their corresponding sensory experiences (color, sound, fragrance, taste, touch, and dharma). The material form is also a mental activity that registers certain distinguishable sensory qualities. However, neither the body nor the physical environment is the same as the self. While the body is inarguably vital to us, as we would not want to lose parts of it and would undoubtedly suffer if anything unfortunate happened to it, a scenario such as a “mind transplant” that is commonly depicted in science fiction indicates that the body is not the same as the self; thus, the search for the self does not stop at the material form. Beyond the material form, the other four aggregates belong to the mind.

According to early Buddhist taxonomies, feeling is a kind of mindfulness that arises from the internal reception of the environment, producing three different reactions (happiness, suffering, and neutrality, that is, neither suffering nor happiness) in response to positive or negative stimuli from the external environment (Anālayo, 2015). Feeling exists in the direct experience of the different sensory events that are registered. According to contemporary neuroscience, internal and external sensory stimuli cause a series of bodily changes and reactions, which in turn cause the individual to have pleasant, unpleasant, or neutral feelings (Damasio, 2003). These three types of feelings, which arise from the six indriyas, matter to us and affect us profoundly, but hardly anyone would consider these feelings (either physical or mental) to be the self. We regard them instead as the individual’s direct, low-level affective appraisal of mental events.

Like feeling, perception (or cognition) arises from the six indriyas; it refers to our perceptual and cognitive identification of what is registered and felt. In psychological terms, if the feeling aggregate refers primarily to sensation, then the perception aggregate is perception proper, and they answer the questions “what is happening to me” and “what is happening in the outside world,” respectively (Humphrey, 2022). Perception or cognition is not the same as the self; it concerns our understanding of the world, and it also bears on inclination (or volition).

Inclination or volition refers to habitual patterns of thinking, feeling, perceiving, and acting; it is the tendency to respond in a particular way to what is perceived as pleasant, painful, or neutral. Specifically, it manifests as the individual’s subjective activity in response to external stimuli, which results in corresponding speech or behavior. This aggregate comes closer to the self, but the self’s stability over time cannot be equated with the contents of the volition aggregate, such as habits, motivations, and emotional tendencies, which can change significantly over time.

The last aggregate, consciousness, contains all the other aggregates and is the experience of the mind that accompanies the other four. Each indriya, or sense, receives a different input, so a different experience and a different object of experience arise at each moment; we therefore cannot find our actual self in the consciousness aggregate either. Having examined each aggregate in turn, we can understand why Nāgārjuna, the founder of the Middle Way (Madhyamaka), held that the self is neither a separate entity nor entirely nonexistent, and that inquiry into the nature of the self requires recourse to the middle way, which transcends both the substance theory and the illusion theory (Thompson, 2020).

The middle way and constructivism

Nāgārjuna presents a paradox that challenges the substance theory. If the self were identified with the aggregates, it would undergo processes of arising and ceasing, like other impermanent phenomena. Conversely, if the self were distinct from the aggregates, it would lack the defining characteristics of the aggregates (Garfield, 1995). In other words, if the self can be found in the five aggregates, or is equivalent to the five aggregates, or even to one or some of them, then, since material forms, feelings, perceptions, and other such factors are constantly changing, arising, and ceasing, the self will also be constantly changing, arising, and ceasing. The ever-changing five aggregates are insufficient to constitute a self that remains the same and exists independently from one moment to the next. If this first hypothesis does not hold, then the self would have to be composed of something distinct from the five aggregates; such a self would no longer depend on any experience and would thus be ultimately unknowable (Thompson, 2020), which is also clearly unworkable. Nāgārjuna’s argument can be summarized as follows: if the self is an independently real thing, then, by the defining characteristics of being independent, immanent, and absolute, it should not depend on anything else. However, an analysis of the five aggregates reveals nothing real and independent in the related experiences that satisfies this criterion. At the same time, Nāgārjuna argued that nothing can be separated from the conditions of its emergence, formation, and extinction, and he claimed, based on a critique of the two extreme positions, that the self should be described in terms of dependent arising, as expressed in the early Abhidharma tradition (Varela et al., 2017).

In Madhyamaka, the concept of dependent arising includes three levels of dependence: causal, whole/part, and conceptual (Newland, 2009). Causal dependence is the dependence of a phenomenon on its causes and conditions, not only of its creation but also its cessation and extinction. The whole/part dependence is more often emphasized in Madhyamaka as the dependence of the whole on the part, but the complex systems theory goes further and states that the dependence of the part on the whole is also essential. Conceptual dependence is the most subtle and, therefore, the most important of the three levels of dependence. Conceptual dependence means that our identification of something as a whole depends on how we conceptualize it and use a particular word to refer to it (Garfield, 1995). Nāgārjuna dismantled various philosophical positions and demonstrated that all phenomena are empty of intrinsic nature. He used logical reasoning to show that any attempt to establish an independent, enduring essence for phenomena leads to contradictions (Siderits and Katsura, 2013).

Thompson builds on Nāgārjuna’s advancement with the idea that “the self is constructed in process.” He argues that our ordinary or everyday notion of the self is not a concept of the inner and substantive nature of the person but rather a concept of the experiencing subject and the acting self-subject. When we look closely at what we call the “self,” we find a collection of interconnected processes rather than any separate entity or inherent object. These physical, physiological, mental, and psychological processes have causes and conditions for their emergence and termination, from which they cannot be separated. Thompson proposes an enactive approach to creating the self based on the core idea of the disembodied theory of dependent arising. The most basic concept of this approach is “I am,” which originates from Indian philosophy (Billington, 1997) and which Thompson calls “I-making” to express the feeling of being an “I.” This “I” is the thinker of thoughts and the doer of deeds, and it has temporal continuity. Enactive cognition understands cognition from the perspective of the living organism as a biodynamic system, a high-level, self-organizing, autonomous system in which the self is not reducible to the underlying mental or physical elements that constitute it but emerges from the wholeness of this self-organizing system. From a non-reductionist standpoint, the enactive approach advocates a constructive view of the self: the self is neither a being nor a non-being but rather a “selfless self” that is autonomous and organized to achieve sameness. The self is neither an entity nor an absence but a process of constructing sameness, and it is this process that generates an “I,” the self being the process itself (Thompson, 2014).

Thompson begins the defense of his enactive claims with the notion of a “self-specifying system.” Since the self is a process, constructing sameness to generate an “I” requires a collection of processes that specify each other and, through this mutual specification, constitute as a whole a self-perpetuating system that is distinct from its environment. This system draws a functional distinction between self and non-self in its interactions with the environment. The boundary between the self and the other is crucial, from the simplest amoeba to the highly complex human. This distinction is the basis for processing “what is happening to the self” and “what is happening in the outside world.” Nevertheless, the key to the eventual emergence of a sense of self is the move from a rudimentary self-specifying system to a full-fledged system of selfhood. Thompson argues that moving from a simple system that can distinguish between the self and the other to a system that has a constant sense of being the thinker of my thoughts and the performer of my actions requires a crucial component, a “self-designating system.” The self-specifying system distinguishes between perception and action, generating a boundary between self and non-self; the self-designating system generates a cognitive, action-self-subject perspective based on the changing body-mind state. This means that individuals can access their changing states of experience and view themselves as the subjects of these states. It is within these self-specifying and self-designating systems that the self is generated; thus, the self is neither a separate thing nor a purely mental concept but is constructed as a process (Zhang and Chen, 2016).

The claim of self-constructionism is gradually gaining attention and importance in contemporary times (Thompson, 2014), thanks to the revival of the Eastern research tradition on the one hand and experimental philosophical progression on the other.

Experimental philosophical progression of self-constructionism

One of the fundamental experiences of bodily self-awareness is the awareness of a boundary. In Thompson’s enactive approach to the self, the self-specifying system distinguishes between the self and the other by matching motor commands against sensory stimuli and judging accordingly. The phenomenologist Gallagher (2000) also values the role of the boundary, arguing that the study of the self should focus on distinguishing and defining the minimal self and the narrative self. The minimal self, phenomenologically speaking, is a consciousness of oneself as the immediate subject of experience, and it is not extended in time. The minimal self depends on brain processes and an ecologically embedded body, but having self-experience does not require one to be aware or conscious of it. The narrative self is a more or less coherent self, with a past and a future made up of the various stories that we and others tell about ourselves. This division can help us better understand that “even after all the unnecessary features of the self have been stripped away, we still have an intuition that there is a basic, immediate or primitive ‘something’ that we would like to call ‘the self’,” which makes it easier to apply scientific evidence to philosophical discourse and to carry out empirical tests within a more straightforward framework (Gallagher and Zahavi, 2020).

Construction of the minimal self

According to Gallagher, the minimal self has two core components, the sense of ownership and the sense of agency, which help us to identify the bodily self and understand the construction of the minimal self. It is hypothesized that the individual’s awareness of the minimal self is achieved through the experiences of ownership and agency. The former is usually defined as the feeling that one’s body (or body part) belongs to oneself, while the latter is the feeling that one is the initiator or cause of an action. In terms of experimental philosophical approaches, the emergence of the rubber hand illusion research paradigm has contributed significantly to the naturalization of self-constructionism (Botvinick and Cohen, 1998; Zhang and Hommel, 2016).

By synchronizing visual and tactile stimuli, the classical rubber hand illusion experiment can make participants experience the illusion of owning an external object that is not actually part of their own body (e.g., a rubber hand). The experiment was conducted as follows: the participant sat at a table and placed their left or right hand and forearm on the table according to a cue. A rubber hand was placed in front of them, and a shield was set up next to the participant’s arm to block their view, thus ensuring that they could see only the rubber hand and not their real hand during the experiment. After 10 min, the participant was asked to answer a questionnaire on whether they felt a sense of ownership over the rubber hand and to complete a proprioceptive drift test, in which, without visual feedback, they used the hand that was not brushed to indicate the position of the other hand. The questionnaire and proprioceptive drift results showed that simultaneous visual and tactile stimuli allowed participants to experience ownership of the rubber hand (Zhang and Hommel, 2016; Zhang et al., 2018). This phenomenon directly raises the question of whether body imagery is stable. Body imagery in healthy adults is usually considered relatively stable, and many of our self-relevant cognitive activities depend on this stability to function correctly. However, the rubber hand illusion experiment not only calls the stability of body imagery into question but also directly raises the question of the plasticity of the self. Several studies have shown that bottom-up stimuli can induce changes in participants’ bodily representations (Armel and Ramachandran, 2003). If this is the case, then the plasticity of the self seems conclusive.

However, further research on the illusion of ownership found a negative correlation between the time interval separating the visual and tactile stimuli and the degree of ownership illusion experienced by the participants; that is, the illusory experience weakened as the interval increased. Specifically, an interval of 300 ms or less hardly affected the participants’ strong experience of ownership of the rubber hand, whereas the illusory experience decreased significantly when the interval was between 400 and 500 ms. When the interval exceeded 600 ms, the participants tended to stop feeling that the rubber hand was part of their body (Shimada et al., 2009). A negative correlation was also found when the distance between the real and fake hands was manipulated, with the intensity of the ownership illusion decreasing as this distance increased. The range commonly used in most experiments is 10–15 cm. When the distance between the real hand and the rubber hand exceeded 27.5 cm, the illusion of ownership decreased significantly, implying that spatial consistency is also a factor that affects the strength of the illusory experience (Lloyd, 2007). In addition, feature consistency has been suggested to influence the illusion of ownership. The illusion disappeared when the rubber hand was rotated to an angle of 90° from the real hand, when the rubber hand was replaced by a block of wood that bore no resemblance to a human hand, or even when the rubber hand and the real hand did not belong to the same side (Guterstam et al., 2011). These findings suggest that the mechanism by which visual and tactile sensations shape body ownership may not be so simple and that their specific role may be influenced by several factors, as illustrated by the top-down intervention of body imagery in the results of the above studies. Nevertheless, the results of the numerous studies on the rubber hand illusion have shaken the traditional notion that bodily self-imagery is stable and unchanging and replaced it with a constructive and plastic one.

The role of kinesthesia in the experience of body illusions has also been explained to some extent by studies that introduce motion into the classical rubber hand illusion to investigate the bidirectional separation of the sense of ownership and the sense of agency. For example, in the moving rubber hand illusion experiment, a special setup allowed the participants to experience active motion, by controlling the motion of a simulated wooden hand, and passive motion, by having the simulated hand drive the motion of the real hand. Consistent with the classical rubber hand illusion results, participants developed a sense of ownership over the fake hand when the real and fake hands were in the same position. However, in contrast to the classical study, participants developed a sense of agency, as well as a stronger sense of ownership of the rubber hand, only when the movement of the real hand was initiated by the participants themselves rather than passively (Kalckert and Ehrsson, 2014). Thus, in addition to visual and tactile information, kinesthesia also plays an essential role in the experience of ownership. One study found that when tactile stimuli were applied separately to specific fingers of the participant, significant proprioceptive shifts occurred only in the stimulated finger and not in the unstimulated fingers; that is, the ownership illusion was differentiated across fingers. However, when participants were allowed to move actively, proprioceptive shifts appeared on all fingers, regardless of whether they received simultaneous visuo-tactile stimulation. This result suggests that the sense of agency helps integrate different body parts into a continuum, forming a unified body perception (Tsakiris, 2017).

Although there remains disagreement as to whether multisensory integration alone, or a combination of multisensory integration and bodily representation, influences the sense of body ownership or agency, the critical role of integrating visual, tactile, and kinesthetic sensations in shaping an individual’s sense of body ownership is evident (Zhang and Chen, 2016). Traditional studies of body representation and self-identification tend to assume relatively stable imagery; however, the results of a series of rubber hand illusion studies suggest a constructive and generative possibility. Although these studies cannot overturn the original assumptions, they have at least somewhat shaken the assumption that the self is a single, unchanging entity. Zhang and Chen (2016) found that when the virtual hand was first presented in front of the participants and then moved to the middle position, the participants did not experience a sense of ownership over the virtual hand as strongly as when the virtual hand was first presented away from them and then moved to the middle position. This study suggests the possibility that the self is constructed according to certain principles and norms during our interactions with the outside world. In addition to exploring how the minimal self is represented and constructed, a comprehensive understanding of the self must also encompass how the narrative self is constructed.

Construction of the narrative self

The first experiment that explores the construction of the narrative self based on a multisensory integration perspective is the enfacement illusion. In the experiment, participants receive the visual stimulus of an unfamiliar face being brushed by a small cotton swab on the screen they are facing and, at the same time, experience the tactile stimulus of a simultaneous or unsynchronized brush by the experimenter on their face; the simultaneous appearance of both visual and tactile stimuli causes changes in self-face recognition. By comparing participants’ self-identification tasks before and after the experiment, the researchers found that synchronized interpersonal multisensory stimulation caused participants to judge the face of an unfamiliar person as resembling their face (compared to the condition in which the visual and tactile stimuli were not synchronized) and affected their performance in distinguishing the differences between the physical features of the other person’s face and their own (Tajadura-Jiménez et al., 2012). This effect is reflected in the mental representation of self-other face features, emotional understanding, social cognition, and other aspects.

When the stimuli used were the hands or faces of individuals of a different race than the participants, synchronous interpersonal multisensory stimulation induced some changes in the participants’ implicit social cognitive attitudes; for example, after synchronous visual and tactile stimulation, the participants judged the person in the video as more trustworthy and more similar to themselves in terms of personality traits. In one such experiment, White participants’ implicit attitudes toward dark-skinned individuals were assessed with the Implicit Association Test (IAT) before the experiment began. Participants were then presented with a scene of a Black person’s hand or face being brushed while synchronous visual and tactile stimuli were applied, and their implicit attitudes were measured again after the experimental treatment. The results showed that synchronized interpersonal multisensory stimuli significantly reduced White participants’ implicit biases toward the Black group. These changes are thought to occur through a process of self-association, which initially occurs in the physical or bodily domain and then extends to the conceptual domain, ultimately giving the narrative self a continuously constructed plasticity (Maister et al., 2015). In addition, interpersonal multisensory stimulation plays an important role in emotional cognition: participants’ speed in recognizing the facial expressions of strangers (especially fearful expressions) increased significantly after experimental treatment with the enfacement illusion, in which they were able to control the movement of the virtual face by moving or touching their own face, further demonstrating that mindfulness is migratory (Maister et al., 2013). The experimental results showed that when the movement of the virtual face was consistent with the participant’s active movement, they accepted the emotion expressed by the virtual face on the computer screen as their own. This acceptance was reflected not only in the Including Other in the Self (IOS) scale but also in performance on a divergent-thinking task, which can be influenced by emotions (Ma et al., 2016). In addition, changing the participants’ affective states changed their familiarity with the stranger’s face, indicating that the stability of the narrative self is vulnerable to the surrounding environment and current personal emotions (Zhang and Hommel, 2022).

In summary, body illusions affect the identification and construction of the individual’s minimal self and change many aspects of the higher representations of the self, emotion, and social cognition. Zhang and colleagues (2018) investigated the embodied construction of the minimal and narrative selves by examining how the sense of ownership and the sense of agency influence anxiety. The results showed that different experimental conditions could trigger different levels of ownership and agency experiences, and that the participants’ anxiety levels were influenced by the type of task (obtaining reward or avoiding punishment) and the virtual image (cat’s paw or human hand) under different states of ownership and agency, suggesting that there are both top-down and bottom-up influences between the minimal self and the narrative self. Moreover, the minimal self and the narrative self are likely further connected through the critical constituent of emotion (Zhang et al., 2018). The experimental philosophical approach thus shows that the self is influenced by both top-down and bottom-up processing, and it provides rich scientific data within a testable theoretical framework.

When the Eastern and Western research traditions are compared, the two approaches form two relatively distinct types of argument. However, a common issue emerges from both, one that may offer a breakthrough for the development of self-constructionism: the question of first- versus third-person approaches in contemporary studies of the self and consciousness.

First- and third-person approaches to the self and their integration in Eastern and Western dialogues

In contemporary philosophical and scientific research on consciousness and the self, there has been a separation and opposition between the first-person and third-person approaches (Varela, 1996): the first-person approach points to subjective experience, with data corresponding to the mind, while the third-person approach points to objective observation, with data corresponding to matter. The root of this separation may be the mind-body dualism of traditional philosophy. “The trouble with dualism is that it explains both too much and too little, and few philosophers are satisfied with it” (Humphrey, 1999). However, the various monistic claims are hardly more convincing. Physicalism, for example, as the most extreme kind of monism, claims that particular subjective feelings are equivalent to particular physical brain processes, yet this explanation does not easily or satisfactorily account for why there is a problem of the subjectivity of self-experience at all (Humphrey, 2022). The purely discursive approach of traditional philosophy cannot cope with this problem; at the same time, the subjectivity of self-experience lies beyond the scope of physical explanation, so the problem of the self cannot be solved solely through the objective observational methodology of science. The constructivist claim about the self arises from the irreconcilable debate between substance theory and illusion theory. However, further argumentation and advancement of the constructivist theory of the self require methodological attention to integrating the first- and third-person approaches and to the reciprocal constraints between them. Reciprocal constraints, in particular, involve specific training exercises that improve participants’ self-awareness so that they can become aware of experiences that would otherwise go unnoticed and express them in verbal reports, thereby better guiding the researcher in observing, from the third-person perspective, the changes that occur during training.

The first-person method refers to any method that helps the researcher or the subject access their own conscious experience or subjectivity, such as introspection in psychology, reduction in phenomenology, and cessation in Buddhism. The corresponding data are the vivid experiences of human cognition and mental events. In contrast, the third-person method refers to the experimental method of natural science, which includes scientific observation, experiment, and induction. It comprises the standard research methods used in contemporary cognitive science and cognitive neuroscience, such as skin conductance response, positron emission tomography, and functional magnetic resonance imaging. The corresponding data are the behavioral or neural correlates of first-person experience. The third-person approach is more in line with our preconceptions about scientific research.

With the introduction of natural science research methods, such as those of cognitive psychology and neuroscience, into the study of the self, the third-person approach has directly provided a large amount of objective data for exploring the self and has contributed significantly to the development of self-constructive claims. Another critical reason why contemporary psychological and cognitive science research prefers the third-person approach is that first-person reports are often considered inaccessible and thus not intersubjectively verifiable (Morten, 2008). Thus, the preference for the third-person approach is less an active choice by researchers than a consequence of the lack of a sufficiently precise first-person approach. Although the first- and third-person approaches correspond to two very different ways of viewing the self, this does not necessarily mean that they are opposed to each other; in fact, a comprehensive and objective understanding of the self requires a more integrated approach and a more refined first-person approach. This view echoes Chalmers’ call for more sophisticated methods of analyzing first-person or phenomenological data (Choifer, 2018). A focus on first-person methods requires an emphasis on the development of mindful reflection and introspection, and the Buddhist tradition has accumulated a great deal of knowledge and experience in these areas; thus, revisiting the Eastern Buddhist tradition can help to develop a better, more integrated, and more precise approach to the study of questions about the self.

The Buddhist tradition differs from ordinary introspection in that it has a method for systematically increasing an individual’s ability to become aware of their own phenomenal experience (Otani, 2003). It is claimed that much of the content on mental phenomena contained in the Buddhist texts is derived from meditators’ reflection on their inner phenomena during the practice of “meditation.” Among the many types of meditation currently practiced, the central features of almost every one remain its “quiescence” (Samatha) and “insight” (Vipasyana) qualities. “Quiescence” emphasizes the stability of the meditation experience and the intensity of attention, and its training is closer to the “focused attention” form of meditation. The practice of “quiescence” requires unwavering focus on a target object. The practitioner’s mind may wander during the practice, and the goal is to stay alert to this and redirect attention to the target object. Therefore, the focus of “quiescence” is on the stability of attention. “Insight,” on the other hand, emphasizes awareness and observation of experiential mental phenomena. Its training is akin to the “open monitoring” form of meditation. The practice of “insight” requires a posture of suspension, a turning of attention to one’s own internal experience, thus reducing attachment to the object of attention and striving to maintain a state of “knowing without following”; open monitoring is then performed on this basis, observing experience while the flow of awareness and alertness is maintained. Thus, the focus of “insight” is on the clarity of the content of attention (Laukkonen and Slagter, 2021). Through practice, the meditator can gradually grasp a purer self-awareness, thereby increasing the accuracy and objectivity of first-person reports.

Varela (2001) suggested the following specific integration of first- and third-person methods in the study of consciousness: (1) To produce more accurate first-person data in the phenomenological sense, researchers need to create specific experimental contexts so that subjects can actively engage in identifying and describing the phenomenal invariants or categories of their experience. (2) These phenomenal invariants and categories of experience can be used to detect the dynamic neural signals associated with them and the structural invariants of brain activity. Neuroscientists can therefore rigorously constrain, analyze, and interpret physiological data associated with consciousness, thus coming closer to, and eventually establishing, the relationship between brain activity and subjective experience. (3) Neuroscientific analyses enriched by phenomenological descriptions can help researchers further test the plausibility of first-person data and accordingly revise and improve phenomenological interpretations, thus helping participants to become more fully aware of what was previously inaccessible or to get closer to the phenomenologically inaccessible dimensions of experience (Chen, 2011). Based on this suggestion, the integrative research program advocated by self-constructivism can be advanced in the following ways.

First, in rubber hand illusion experiments, one of the most common instruments for measuring the sense of self is a self-report questionnaire on the sense of ownership or agency, and there is bound to be some individual variation in the accuracy of the results. The reliability of questionnaire results depends mainly on participants’ ability to capture experiences related to their sense of self. The accuracy of data analyzed by third-person methods can be significantly improved if participants have special training in attentional, metacognitive, and bodily perceptual skills. The improvement of subjective perception may be advanced by the study of interoception, which has received more attention in recent years (Chen et al., 2023; Gao et al., 2019). Interoception, in short, refers to an individual’s perception of signals that originate from within the body and is commonly measured with heartbeat perception tasks (Pollatos and Herbert, 2018). For example, the level of interoception can effectively predict the degree of ownership experienced in the rubber hand illusion experiment, and individuals with higher interoception are less likely to experience ownership of external objects (Tsakiris et al., 2011). The role of interoception varies across contexts (Quigley et al., 2021). For example, when interoceptive indicators such as heartbeat are not presented simultaneously, interoception plays a more invariant role in maintaining the stability of an individual’s original sense of body ownership. In contrast, when interoceptive indicators are presented simultaneously with external objects, individuals with higher interoception are more likely to develop a sense of ownership of external objects (Suzuki et al., 2013). 
In this regard, interoception may be an essential link between first-person internal perception and third-person external observation, and an in-depth study of interoception issues could better facilitate the integration of first- and third-person approaches.
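To make concrete how a heartbeat perception task can turn a first-person report into a third-person score, the following is a minimal sketch of one scoring rule widely used for heartbeat counting tasks. The text above does not specify which scoring the cited studies used, so the formula, the function name `heartbeat_counting_accuracy`, and the example numbers are illustrative assumptions rather than the authors' procedure.

```python
from statistics import mean


def heartbeat_counting_accuracy(recorded, counted):
    """Return a heartbeat counting accuracy score averaged over trials.

    Assumed scoring rule (illustrative, not taken from the article):
        score = mean over trials of 1 - |recorded - counted| / recorded
    where `recorded` holds the objectively measured heartbeats per trial
    (third-person data) and `counted` holds the heartbeats the participant
    reports having felt (first-person data).
    A score of 1.0 means the reports match the recordings exactly.
    """
    if len(recorded) != len(counted) or not recorded:
        raise ValueError("need the same non-zero number of trials in both lists")
    scores = []
    for r, c in zip(recorded, counted):
        if r <= 0:
            raise ValueError("recorded heartbeats must be positive")
        # Per-trial accuracy: relative counting error subtracted from 1.
        scores.append(1 - abs(r - c) / r)
    return mean(scores)


# Hypothetical participant: exact on the first trial, half the beats on the second.
example_score = heartbeat_counting_accuracy([40, 60], [40, 30])
```

In this sketch the participant's subjective count and the recorded heartbeat enter the same formula, which is one simple way the first-person report and the third-person measurement can constrain each other.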

Second, one of the major dilemmas in current research on the self is the lack of a unified normative theoretical framework that can be tested by empirical research. More accurate first-person data could facilitate the formation of more complete theoretical hypotheses that can be tested empirically, thus refuting the notion that the experimental philosophical approach to the problem of the self can do nothing more than conduct a piecemeal, bottom-up investigation of some influencing factors. The free-energy principle proposed by Friston and his colleagues attempts to construct such a unified theoretical framework. The idea takes its starting point from the second law of thermodynamics: because disorder naturally tends to increase, the brain, as the organ by which the self-subject assesses the internal and external environment and maintains individual equilibrium, must keep its entropy relatively low. This goal is achieved either by acting on the environment to change the input or by updating the assessment of the input to reduce surprise (Friston, 2018). Specifically, the brain maintains self-stability through predictive coding. This model supports a hierarchical complementarity between top-down and bottom-up processing. Top-down information flows between layers in the form of predictions about sensory events and is accessible through first-person reports; bottom-up information flows between layers reflecting the effects of sensory events and is accessible through third-person observations. The highest level of this structure is the multisensory area, which is responsible for the integration and representation of both types of information. This theory has received increasing attention in recent years and, although its explanatory power still needs to be subjected to more tests, it at least provides a theoretical framework that can be tested.
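The link between free energy and surprise mentioned above can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the article: $o$ stands for sensory input, $s$ for the hidden causes of that input, $p$ for the brain's generative model, and $q$ for its current assessment (approximate belief) about the causes.

```latex
\begin{aligned}
F(q, o) &= \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right] \\
        &= D_{\mathrm{KL}}\!\left[\,q(s)\,\|\,p(s \mid o)\,\right] - \ln p(o) \\
        &\ge -\ln p(o)
\end{aligned}
```

Because the Kullback-Leibler divergence is never negative, free energy $F$ is an upper bound on surprise, $-\ln p(o)$. Under this reading, the two routes described in the text correspond to the two arguments of $F$: acting on the environment changes the input $o$, while updating the assessment changes $q$ and brings it closer to $p(s \mid o)$.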

The problem of the “self” has always been a central topic in the study of the human mind, addressed both by the Eastern Buddhist tradition of the Five Aggregates and by the Western experimental philosophical approach, each of which concerns the nature of the self. Although there are significant differences in the origins and specific methods of the two approaches, further exploration of the problem of the self requires us to establish a context of dialogue between the Eastern and Western research traditions and to transcend the current controversy and dilemma by integrating the first- and third-person approaches. The dynamic “reciprocal constraint” (Chen, 2011; Varela, 1996) between the first-person approach and the third-person approach makes the theoretical framework testable: hypotheses can be confirmed or falsified by the third-person approach, with the former constantly enriching the theoretical system and the latter promoting the revision and updating of the original hypotheses.