Introduction

Indirect reciprocity refers to a mechanism for the evolution of giving behavior wherein a cooperator receives help not from its beneficiary but from a third party1,2,3. Social information about others, such as reputation or gossip, plays a central role there in distinguishing between helpful and non-helpful individuals. In our society, it is common for individuals to give good or bad reputations to each other according to how they behaved in previous social encounters. In particular, in a large-scale society, in which we interact not only with our relatives but inevitably with many strangers, knowing the reputations of such strangers is essential. For example, it has been suggested that two thirds of our conversation is about social topics4,5,6, implying the importance of reputation in our daily life. Consequently, a complex structure of mutual evaluation can emerge in a society where a variety of individuals exist. For example, some may receive good reputations from many individuals, and others may receive bad reputations from many individuals. There may also be intermediate ones who receive good and bad reputations to a roughly equal extent.

Reputation in indirect reciprocity is a moral assessment of individuals, namely who is good and who is bad, in a world of binary reputations. Many theoretical models of indirect reciprocity have considered a situation where all individuals in the population give the same reputation to a given individual7,8,9,10,11,12,13,14,15. One of the reasons for this treatment is that the model becomes analytically tractable. Such reputation is called “public reputation”. Under a public reputation model, we need to know how a given individual is evaluated but not by whom, which considerably simplifies the system. The reputation state of all individuals in the population is given by a one-dimensional array, each component of which represents how an individual (say, i) is evaluated.

To consider a more realistic and general situation, however, we suppose another setting in which each individual independently evaluates a given individual. A reputation given under such a system is called “private reputation”. Under the assumption of private reputation, opinions on the same person may not agree between individuals, and hence we need to know not only how a given individual is evaluated but also by whom. The reputation state of all individuals in the population is, thus, generally represented by a two-dimensional matrix (called “image matrix”16,17,18,19,20), each component of which represents how an individual (say, i) is evaluated by another individual (say, j).

Reputations can be private if only a part of the individuals in the population can observe a specific interaction7,16,18,19,20,21, if there is a possibility of individually committing errors in assigning reputations to others8,18,20,21,22, and/or if different individuals adopt different rules of reputation assessment16,21,23,24. In recent years, models of private reputation have been used in studies that seek to explain various aspects of human nature, such as empathetic behaviors25,26,27, prejudicial attitudes28, and so on29,30,31,32. Most of those studies have so far been based on individual-based computer simulations (but see references18,23,30,33,34), primarily because of the difficulty of their analytical treatment. In that respect, the study33 is notable in the sense that it makes a strict, but extreme, assumption that a social interaction is observed by a single observer, which allows the authors to avoid solving infinitely many equations of joint probabilities. However, for a more general setting where many observers independently observe the same social interaction, the nature of the “image matrix”, that is, the opinion distribution of who evaluates whom and how, has been studied only through computer simulations. To our knowledge, no analytical insights have been provided so far.

In this study, we analytically tackle the question of private reputation. We assume that all individuals adopt the same “discriminator strategy” (explained in Model section in detail). Following a widespread convention in studies of indirect reciprocity, we consider a world of binary reputation; an individual is deemed either good or bad. A rule of how to assign a reputation is called “social norm”, and we assume that all individuals in the population share the same social norm. However, as a source of disagreement between different individuals on the reputation of the same target, we consider individual errors in reputation assignment; that is, each individual can independently commit an error in assigning a reputation to others. Thus, the same person (say, i) can be deemed good by some individuals and deemed bad by the other individuals in the population at the same time. The “goodness” of individual i is then defined as the proportion of those who regard i as good among all in the population. Under this setting, we derive an integro-differential equation that describes how the frequency distribution of goodnesses in the population changes over time, and calculate its equilibrium distribution. When the population is sufficiently large, we demonstrate that the equilibrium distribution is approximated by a summation of Gaussian functions. Furthermore, we reveal that the equilibrium distribution of goodness very much differs between social norms adopted in the population. We then give intuitive interpretations to each equilibrium. We believe that this study provides a fundamental advance in the study of indirect reciprocity. In addition, the results of this study can lead to unraveling complex relationships among individuals through reputations in a society.

Model

Let us consider a population where there are a certain number, N, of individuals. Suppose that at each time t, either a good or bad reputation is given from every individual to every individual. This corresponds to a case of “private reputation”, where each individual independently assigns a reputation toward the same target. Let \(\beta _{ji}\) be one (resp. zero) if individual i is good (resp. bad) in the eyes of individual j. Matrix \(\{\beta _{ji}\}\) is called the image matrix. As noted in the introduction, the “goodness” of individual i, denoted by \(p_{i}\), is defined as the proportion of individuals who give a good reputation to individual i in the population. Thus, it is given as \(p_i=N_i/N\), where \(N_{i}=\sum _{j=1}^{N} \beta _{ji}\) is the total number of individuals who give a good reputation to individual i.

At each elementary step of update, we randomly select a donor and a recipient from among N individuals (see Fig. 1 for schematics). They may be the same individual, but such a case occurs with probability 1/N and can safely be neglected in the following analysis, which assumes a large N. The donor takes one of two actions toward the recipient: cooperation or defection. The donor has a rule for choosing whether to cooperate or defect, which can be conditional on the reputation of the recipient in the eyes of the donor. This study supposes that all individuals have the same rule, called the “discriminator strategy”. A donor with this strategy chooses to cooperate (resp. defect) with a recipient whose reputation is good (resp. bad) in the eyes of the donor. In terms of the image matrix, donor (\(i_D\)) chooses cooperation (resp. defection) with recipient (\(i_R\)) if \(\beta _{{i_D}{i_R}}=1\) (resp. \(\beta _{{i_D}{i_R}}=0\)). We suppose that with probability \(0 \le e_1 \le 1/2\), the donor takes the opposite action to the intended one, in which case we say that an “error in action” occurred.

Figure 1

Schematics of indirect reciprocity with private reputation. In every round, a donor (\(i_D\)) and a recipient (\(i_R\)) are randomly chosen. A goodness of the recipient in the present round is given by \(p_{i_R}\). In other words, the recipient’s reputation in the eyes of a random observer is good (resp. bad) with probability \(p_{i_R}\) (resp. \(1-p_{i_R}\)). The donor chooses cooperation (resp. defection) with the recipient if the recipient’s reputation in the eyes of the donor is good (resp. bad). After the interaction, each observer independently assigns a new reputation to the donor by taking into account whether the donor took cooperation or defection and whether the recipient’s reputation in the eyes of that observer was good or bad before the interaction. As a result, the goodness of the donor is updated to \(p'_{i_D}\).

After an action, intended or unintended, is taken by the donor, all individuals, as observers, independently update the donor’s reputation (see Fig. 1). In terms of the image matrix, the reputation of the donor (\(i_D\)) in the eyes of each observer j, which is denoted by \(\beta _{j{i_D}}\), is updated for all j simultaneously. How each observer updates the donor’s reputation follows the social norm adopted by the observer. In this study, we consider “second-order” social norms, which are mappings that assign an updated reputation to the donor from a combination of the donor’s actual action (first-order information) and the recipient’s reputation in the eyes of the observer (second-order information)35. We assume that all individuals adopt the same social norm. Here, we suppose that with probability \(0< e_2 < 1/2\), an observer assigns to the donor the opposite reputation to the intended one, in which case we say that an “error in assessment” occurred.

Furthermore, among the \(2^4=16\) possible second-order social norms, we focus on four: Stern-Judging (SJ), Simple Standing (SS), Shunning (SH), and Scoring (SC), which have often been the main targets of studies in the literature of indirect reciprocity7,12,14. Table 1 shows how these four norms assign a reputation. Observers with these norms assign a good (resp. bad) reputation to a donor when the donor cooperates (resp. defects) with a recipient who is good from the observer’s point of view. The norms differ, however, in how they assign reputations when the recipient is bad from the observer’s point of view. First, SC7 gives the same reputation independent of whether the recipient is good or bad. Thus, strictly speaking, SC is a first-order norm. Second, SJ14 (also known as “Kandori” after Kandori36) conversely assigns a bad (resp. good) reputation to a donor who cooperates (resp. defects) with a bad recipient. Third, SS12 always gives a good reputation to a donor when the recipient is bad. Fourth, SH3 always gives a bad reputation to a donor when the recipient is bad.

Table 1 How observers with four social norms, \({\text{{SJ}}}\), \({\text{{SS}}}\), \({\text{{SH}}}\), and \({\text{{SC}}}\), assign a reputation to a donor when an error in assessment does not occur. Rows indicate whether the donor takes cooperation (C) or defection (D) with a recipient. Columns indicate whether the recipient’s reputation is good (G) or bad (B) in the eyes of the observer.

We are interested in what type of structure of reputation assessment between individuals emerges in the population, and why. To this end, we will analytically derive the equilibrium distribution of “goodness” of individuals in the population.

An overview of simulation results

First, we have conducted individual-based computer simulations. Fig. 2-A shows a snapshot of reputation assignment between all the individuals after a sufficiently long time has passed in a simulation. We note that, for social norm SJ, a similar pattern has been observed in the study18 (see their Fig. 2). Hilbe et al.20 have obtained the image matrix for eight different social norms, including SS and SJ (see their Fig. 2). Figure 2-B (colored area) is a frequency distribution of goodness, \(p_i\), in an equilibrium state obtained by computer simulations. The four panels clearly differ from each other, depending on what social norm is employed by the population. Below we will develop a theory that explains those patterns shown in Fig. 2-B.
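The individual-based simulation described in the Model section can be sketched as follows. This is a minimal illustration rather than the code used for Fig. 2: the names `GOOD` and `simulate`, the random initial image matrix, and the seed are assumptions made here for the example.

```python
import numpy as np

# GOOD[norm][action, recipient_rep] = 1 if the intended reputation is good.
# action: 0 = defect, 1 = cooperate; recipient_rep: 0 = bad, 1 = good
# (recipient's reputation in the observer's eyes), as described in the text.
GOOD = {
    "SJ": np.array([[1, 0], [0, 1]]),
    "SS": np.array([[1, 0], [1, 1]]),
    "SH": np.array([[0, 0], [0, 1]]),
    "SC": np.array([[0, 0], [1, 1]]),
}

def simulate(norm, N=100, e1=0.1, e2=0.1, t_max=100, seed=0):
    """Return the image matrix beta[j, i] after t_max units of time."""
    rng = np.random.default_rng(seed)
    # Assumption: the initial image matrix is drawn uniformly at random.
    beta = rng.integers(0, 2, size=(N, N))
    for _ in range(t_max * N):  # N elementary steps per unit time
        donor, recipient = rng.integers(0, N, size=2)
        # Discriminator strategy: cooperate iff the recipient is good
        # in the donor's eyes; flip with probability e1 (error in action).
        action = beta[donor, recipient]
        if rng.random() < e1:
            action = 1 - action
        # Each observer j uses its own opinion beta[j, recipient].
        intended = GOOD[norm][action, beta[:, recipient]]
        # Error in assessment: each observer flips independently with prob. e2.
        flip = rng.random(N) < e2
        beta[:, donor] = np.where(flip, 1 - intended, intended)
    return beta

if __name__ == "__main__":
    for norm in GOOD:
        goodness = simulate(norm).mean(axis=0)  # p_i = N_i / N
        print(norm, "mean goodness:", goodness.mean())
```

The column means of the returned matrix are the goodnesses \(p_i\), so a histogram of `goodness` gives a single-snapshot counterpart of Fig. 2-B.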

Figure 2

(A) Reputations between all individuals. The image matrix \(\{\beta _{ji}\}\) is drawn, where each row represents who evaluates (j) and each column represents who is evaluated (i). Colored and uncolored dots indicate good (\(\beta _{ji}=1\)) and bad (\(\beta _{ji}=0\)) reputations, respectively. From the top, each panel indicates that individuals employ norms SJ, SS, SH, and SC, respectively. One might easily see the vertical stripes on the panels of SS and SC, which mean that various goodnesses coexist among individuals. For all the panels, computer simulations are performed with parameters \(N=100\), \(e_1=e_2=0.1\). In our computer simulations, we assume that N elementary steps of updates occur per unit time. These snapshots are taken at time \(t=100\) (sufficiently long time passed). (B) Frequency distribution of goodness, \(p_i\), at an equilibrium calculated from computer simulation results. The horizontal and vertical axes indicate goodness p and equilibrium frequency \(\phi ^*(p)\), respectively. Computer simulations are performed with parameters \(N=500\), \(e_1=e_2=0.1\). The equilibrium frequency distribution, represented by colored areas in each panel, is calculated by taking the time average of 1000 snapshots during time \(101\le t\le 1100\). Curves in black represent our analytical approximations using mixture Gaussian distribution fitting (details explained in the main text), and they show excellent fits to the results of computer simulations (see insets for minor deviations). Numbers next to each peak represent labels of each Gaussian distribution, which shall be introduced later in the main text.

Formulation of macroscopic dynamics of reputation

We now consider a single update of goodness \(p_i\) (see Fig. 1 for schematics). The update is a process in which a donor and a recipient are randomly chosen from the population, the donor takes an action to the recipient, and all individuals in the population update the donor’s reputation in their eyes. Suppose that the donor is individual \(i_D\) and that the recipient is individual \(i_R\). In the following we denote the social norms employed in the population as \(A={\text{SJ}},{{\text{SS}}},{\text{SH}},{\text{SC}}\).

Because the goodness of the recipient is \(p_{i_R}\) and because the donor is randomly sampled from the population, the probability that the recipient is good in the eyes of the donor is \(p_{i_R}\). Given this, there are two possibilities for the donor’s actual action toward the recipient.

In the first possibility, the donor cooperates with the recipient. This occurs with probability

$$\begin{aligned}&h(p_{i_R}):=p_{i_R}(1-e_1)+(1-p_{i_R})e_1 \end{aligned}$$
(1)

Here, the first term of Eq. (1) represents the case in which the recipient’s reputation in the eyes of the donor is good (with probability \(p_{i_R}\)) and the donor succeeds in performing cooperation as intended (with probability \(1-e_1\)). The second term represents the other case, in which the recipient’s reputation in the eyes of the donor is bad (with probability \(1-p_{i_R}\)) but the donor erroneously cooperates (with probability \(e_1\)). When the donor cooperates with the recipient, the number of those who assign a good reputation to the donor at the next time step (i.e. after this donor’s cooperation), denoted as \(N_{i_D}'\), is given by

$$\begin{aligned} \begin{aligned}{}&N_{i_D}' = X_{1} + X_{2}{,} \\&X_{1} \sim {{\mathscr {B}}}(N_{i_R},a^{\text{GC}}_A){,} \\&X_{2} \sim {{\mathscr {B}}}(N-N_{i_R},a^{\text{BC}}_A), \end{aligned} \end{aligned}$$
(2)

or, in an equivalent shorthand notation,

$$\begin{aligned}{}&N_{i_D}'\sim {{\mathscr {B}}}(N_{i_R},a^{\text{GC}}_A)+{{\mathscr {B}}}(N-N_{i_R},a^{\text{BC}}_A) \end{aligned}$$
(3)

Here, \({{\mathscr {B}}}(n,p)\) represents a binomial distribution with success probability p and trial number n. The first term on the right side of Eq. (3) is the number of individuals who assign good reputations to donor \(i_D\) at the next time step among the \(N_{i_R}\) observers who assign good reputations to recipient \(i_R\) at the present time step. There, \(a^{\text{GC}}_A\) indicates the probability that an observer who assigns a good (G) reputation to the recipient at the present time step assigns a good reputation at the next time step to the donor who cooperates (C) with that recipient under social norm A. The values of \(a^{\text{GC}}_A\) can be calculated for each social norm A, and they are shown in Table 2. The second term on the right side of Eq. (3) is the number of individuals who assign good reputations to donor \(i_D\) at the next time step among the \(N-N_{i_R}\) observers who assign bad reputations to the recipient at the present time step. There, \(a^{\text{BC}}_A\) indicates the probability that an observer who assigns a bad (B) reputation to the recipient at the present time step assigns a good reputation at the next time step to the donor who cooperates (C) with that recipient under social norm A (see Table 2). For the calculation of \(a^{\text{GC}}_A\) and \(a^{\text{BC}}_A\), compare Table 1 with Table 2: each entry G in Table 1 becomes \(1-e_{2}\) in Table 2, because a good reputation is assigned to the donor as intended, without an error in assessment, with probability \(1-e_{2}\); each entry B in Table 1 becomes \(e_{2}\) in Table 2, because a good reputation is erroneously assigned to the donor with probability \(e_{2}\).

Table 2 Probabilities with which an observer assigns a good reputation to a donor, given the donor’s action toward the recipient and the observer’s evaluation of the recipient at the present time. Rows indicate whether the donor chooses to cooperate (C) or defect (D) with the recipient, and columns indicate whether the observer assigns a good (G) or bad (B) reputation to the recipient at the present time step.
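The conversion from intended reputations to the probabilities \(a^{XY}_A\) can be written out explicitly. The sketch below encodes the assignment rules as described in the text (the dictionary `INTENDED` and the function `good_probabilities` are hypothetical names introduced for this example) and applies the rule that G becomes \(1-e_2\) and B becomes \(e_2\).

```python
# Intended reputation of the donor, keyed by
# (recipient's reputation in the observer's eyes, donor's action),
# following the verbal description of SJ, SS, SH and SC in the text.
INTENDED = {
    "SJ": {("G", "C"): "G", ("B", "C"): "B", ("G", "D"): "B", ("B", "D"): "G"},
    "SS": {("G", "C"): "G", ("B", "C"): "G", ("G", "D"): "B", ("B", "D"): "G"},
    "SH": {("G", "C"): "G", ("B", "C"): "B", ("G", "D"): "B", ("B", "D"): "B"},
    "SC": {("G", "C"): "G", ("B", "C"): "G", ("G", "D"): "B", ("B", "D"): "B"},
}

def good_probabilities(norm, e2):
    """Return a[(X, Y)]: probability that an observer assigns good to the donor."""
    # G is assigned correctly with prob. 1 - e2; B turns into G by error with prob. e2.
    return {key: (1.0 - e2 if rep == "G" else e2)
            for key, rep in INTENDED[norm].items()}

if __name__ == "__main__":
    for norm in INTENDED:
        print(norm, good_probabilities(norm, e2=0.1))
```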

The expected value and the variance of \(p_{i_D}'=N_{i_D}'/N\) are given by

$$\begin{aligned} \text {E}[p_{i_D}']&=\frac{\text {E}[N_{i_D}']}{N}=(\underbrace{a^{\mathrm{{GC}}}_A-a^{\mathrm{{BC}}}_A}_{=:\Delta f_A^{\mathrm{{C}}}})p_{i_R}+a^{\mathrm{BC}}_A&(=:f_A^{\mathrm{{C}}}(p_{i_R})), \end{aligned}$$
(4)
$$\begin{aligned} \text {Var}[p_{i_D}']&=\frac{\text {Var}[N_{i_D}']}{N^2}=\frac{p_{i_R}a^{\mathrm{GC}}_A(1-a^{\mathrm{GC}}_A)+(1-p_{i_R})a^{\mathrm{BC}}_A(1-a^{\mathrm{BC}}_A)}{N}=\frac{e_2(1-e_2)}{N}&(=:s^2) . \end{aligned}$$
(5)
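Equations (4) and (5) can be checked numerically by sampling the two binomial variables of Eq. (2) directly. The following is a small Monte Carlo sketch under assumed example values (\(N=500\), \(N_{i_R}=300\), \(e_2=0.1\), norm SJ with a cooperating donor); the function name is hypothetical.

```python
import numpy as np

def sample_updated_goodness(N, N_R, a_G, a_B, size, rng):
    """Draw p' = (X1 + X2) / N with X1 ~ B(N_R, a_G) and X2 ~ B(N - N_R, a_B)."""
    x1 = rng.binomial(N_R, a_G, size=size)
    x2 = rng.binomial(N - N_R, a_B, size=size)
    return (x1 + x2) / N

rng = np.random.default_rng(0)
N, N_R, e2 = 500, 300, 0.1
a_GC, a_BC = 1 - e2, e2  # SJ, donor cooperated

p_new = sample_updated_goodness(N, N_R, a_GC, a_BC, 200_000, rng)

mean_theory = (a_GC - a_BC) * (N_R / N) + a_BC  # Eq. (4)
var_theory = e2 * (1 - e2) / N                  # Eq. (5)

if __name__ == "__main__":
    print("mean:", p_new.mean(), "theory:", mean_theory)
    print("var: ", p_new.var(), "theory:", var_theory)
```

The variance does not depend on \(p_{i_R}\) because every \(a^{XY}_A\) equals either \(e_2\) or \(1-e_2\), so \(a(1-a)=e_2(1-e_2)\) in both binomial terms.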

In the second possibility, the donor defects with the recipient. This occurs with the complementary probability to Eq. (1), that is

$$\begin{aligned}&1-h(p_{i_R})=p_{i_R}e_1+(1-p_{i_R})(1-e_1). \end{aligned}$$
(6)

The value of \(N_{i_D}'\) at the next time step follows

$$\begin{aligned}{}&N_{i_D}'\sim {{\mathscr {B}}}(N_{i_R},a^{\mathrm{GD}}_A)+{{\mathscr {B}}}(N-N_{i_R},a^{\mathrm{BD}}_A), \end{aligned}$$
(7)

where we have used the same shorthand notation as Eq. (3). Here, \(a^{\mathrm{GD}}_A\) indicates the probability that an observer who assigns a good (G) reputation to the recipient at the present time step assigns a good reputation at the next time step to the donor who defects (D) with that recipient under social norm A. Similarly, \(a^{\mathrm{BD}}_A\) indicates the probability that an observer who assigns a bad (B) reputation to the recipient at the present time step assigns a good reputation at the next time step to the donor who defects (D) with that recipient under social norm A. See Table 2 for their values.

The expected value and the variance of \(p_{i_D}'=N_{i_D}'/N\) are given by

$$\begin{aligned} \text {E}[p_{i_D}']&=\frac{\text {E}[N_{i_D}']}{N}=(\underbrace{a^{\mathrm{GD}}_A-a^{\mathrm{BD}}_A}_{=:\Delta f_A^{\mathrm{D}}}) p_{i_R} +a^{\mathrm{BD}}_A&(=:f_A^{\mathrm{D}}(p_{i_R})), \end{aligned}$$
(8)
$$\begin{aligned} \text {Var}[p_{i_D}']&=\frac{\text {Var}[N_{i_D}']}{N^2}=\frac{p_{i_R}a^{\mathrm{GD}}_A(1-a^{\mathrm{GD}}_A)+(1-p_{i_R})a^{\mathrm{BD}}_A(1-a^{\mathrm{BD}}_A)}{N}=\frac{e_2(1-e_2)}{N}&(= s^2) . \end{aligned}$$
(9)

Two linear functions \(f_A^{\mathrm{C}}\) (defined in Eq. (4)) and \(f_A^{\mathrm{D}}\) (defined in Eq. (8)), as well as their slopes \(\Delta f_A^{\mathrm{C}}\) and \(\Delta f_A^{\mathrm{D}}\), will be of particular importance in the analysis below. In the following, we call \(f_A^{\mathrm{C}}\) and \(f_A^{\mathrm{D}}\) “C-map” and “D-map”, respectively.
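For reference, the C-map, the D-map, and the cooperation probability \(h\) can be coded in a few lines. In this sketch, `norm_probs`, `c_map`, `d_map` and `h` are illustrative names, and the tuple order \((a^{\mathrm{GC}}, a^{\mathrm{BC}}, a^{\mathrm{GD}}, a^{\mathrm{BD}})\) is a convention chosen here.

```python
def norm_probs(norm, e2):
    """Return (a_GC, a_BC, a_GD, a_BD) for the given norm."""
    g, b = 1.0 - e2, e2
    return {
        "SJ": (g, b, b, g),
        "SS": (g, g, b, g),
        "SH": (g, b, b, b),
        "SC": (g, g, b, b),
    }[norm]

def h(p, e1):
    """Probability that the donor cooperates with a recipient of goodness p (Eq. (1))."""
    return p * (1.0 - e1) + (1.0 - p) * e1

def c_map(norm, p, e2):
    """Expected updated goodness of a donor who cooperated (Eq. (4))."""
    a_GC, a_BC, _, _ = norm_probs(norm, e2)
    return (a_GC - a_BC) * p + a_BC

def d_map(norm, p, e2):
    """Expected updated goodness of a donor who defected (Eq. (8))."""
    _, _, a_GD, a_BD = norm_probs(norm, e2)
    return (a_GD - a_BD) * p + a_BD

if __name__ == "__main__":
    for norm in ("SJ", "SS", "SH", "SC"):
        print(norm, [round(c_map(norm, p, 0.1), 3) for p in (0.0, 0.5, 1.0)],
              [round(d_map(norm, p, 0.1), 3) for p in (0.0, 0.5, 1.0)])
```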

Time change of reputation distribution

For simplicity we start with the case of \(N\rightarrow \infty\), where the variance \(s^2\) in Eqs. (5) and (9) is ignored. Let us define \(\phi (p)\) as a frequency distribution of individuals with goodness p in the population. Then, its time evolution is given by

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\phi (p)=-\phi (p)+\int _{0}^{1}\{h(p')\delta (p-f_A^{\mathrm{C}}(p'))+(1-h(p'))\delta (p-f_A^{\mathrm{D}}(p'))\}\phi (p')\mathrm {d}p' \end{aligned}$$
(10)

Here, \(\delta (\cdot )\) denotes the Dirac delta function. In Eq. (10), the first term on the right side represents a loss of individuals with goodness p due to updates of their reputations. The first (resp. second) term inside the integral on the right side represents donors who acquire an updated goodness p after meeting a recipient with goodness \(p'\) and cooperating (resp. defecting) with him/her.

Next, we consider the case of \(1\ll N<\infty\), and replace the delta functions in Eq. (10) with Gaussian functions, because a binomial distribution is well approximated by a Gaussian distribution for large N. In the following, we represent a Gaussian function with mean \(\mu\) and variance \(\sigma ^2\) by

$$\begin{aligned} g(p;\mu ,\sigma ^2):=\frac{1}{\sqrt{2\pi \sigma ^2}}\exp \left[-\frac{(p-\mu )^{2}}{2\sigma ^2}\right] . \end{aligned}$$
(11)

Accordingly, \(\delta (p-f_A^{\mathrm{C}}(p'))\) and \(\delta (p-f_A^{\mathrm{D}}(p'))\) in Eq. (10) are replaced with \(g(p;f_A^{\mathrm{C}}(p'),s^2)\) and \(g(p;f_A^{\mathrm{D}}(p'),s^2)\), respectively. Thus we obtain

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\phi (p)=-\phi (p)+\int _{0}^{1}\{h(p')g(p;f_A^{\mathrm{C}}(p'),s^2)+(1-h(p'))g(p;f_A^{\mathrm{D}}(p'),s^2)\}\phi (p')\mathrm {d}p'. \end{aligned}$$
(12)
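Equation (12) can also be integrated numerically on a grid of goodness values, which gives an independent check on the analysis that follows. The sketch below is one possible discretization, not the method used in the paper: the grid size `M`, the explicit Euler step `dt`, the uniform initial condition, and the renormalization at each step are all assumptions of this example.

```python
import numpy as np

def norm_probs(norm, e2):
    """Return (a_GC, a_BC, a_GD, a_BD); the tuple order is a convention of this sketch."""
    g, b = 1.0 - e2, e2
    return {"SJ": (g, b, b, g), "SS": (g, g, b, g),
            "SH": (g, b, b, b), "SC": (g, g, b, b)}[norm]

def gaussian(p, mu, var):
    return np.exp(-(p - mu) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def equilibrium(norm, N=500, e1=0.1, e2=0.1, M=1001, dt=0.5, steps=400):
    """Integrate Eq. (12) by explicit Euler steps until it is close to stationary."""
    a_GC, a_BC, a_GD, a_BD = norm_probs(norm, e2)
    p = np.linspace(0.0, 1.0, M)
    dp = p[1] - p[0]
    s2 = e2 * (1.0 - e2) / N
    h = p * (1.0 - e1) + (1.0 - p) * e1
    f_C = (a_GC - a_BC) * p + a_BC
    f_D = (a_GD - a_BD) * p + a_BD
    # K[m, n]: density of updated donor goodness p[m] given recipient goodness p[n]
    K = (h * gaussian(p[:, None], f_C[None, :], s2)
         + (1.0 - h) * gaussian(p[:, None], f_D[None, :], s2))
    phi = np.ones(M)
    phi /= phi.sum() * dp
    for _ in range(steps):
        phi = phi + dt * (-phi + (K @ phi) * dp)
        phi /= phi.sum() * dp  # guard against mass lost outside [0, 1]
    return p, phi

if __name__ == "__main__":
    p, phi = equilibrium("SJ")
    dp = p[1] - p[0]
    mean = (p * phi).sum() * dp
    var = ((p - mean) ** 2 * phi).sum() * dp
    print("SJ mean:", mean, "variance:", var)
```

For SJ with these parameters the computed mean is close to 1/2 and the variance is close to \(1/(4N)\).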

A calculation of equilibrium state

Again, we start with the case of \(N\rightarrow \infty\). When \(\mathrm{d}\phi /\mathrm{d}t=0\) is satisfied in Eq. (10), an equilibrium state \(\phi =\phi ^*\) of the equation is given by

$$\begin{aligned} \phi ^*(p)=\int _{0}^{1}\{h(p')\delta (p-f_A^{\mathrm{C}}(p'))+(1-h(p'))\delta (p-f_A^{\mathrm{D}}(p'))\}\phi ^*(p')\mathrm {d}p'. \end{aligned}$$
(13)

We assume that this equilibrium state is described by a summation of delta functions with peak \(\mu _j\) and mass \(q_j\) (\(j=1, \ldots )\), i.e.,

$$\begin{aligned} \begin{aligned}{}&\phi ^*(p)=\sum _jq_j\delta (p-\mu _j),\\&\sum _jq_j=1. \end{aligned} \end{aligned}$$
(14)

By substituting Eq. (14) in Eq. (13), we obtain

$$\begin{aligned} \begin{aligned} \sum _jq_j\delta (p-\mu _j)&=\int _{0}^{1}\{h(p')\delta (p-f_A^{\mathrm{C}}(p'))+(1-h(p'))\delta (p-f_A^{\mathrm{D}}(p'))\}\sum _jq_j\delta (p'-\mu _j)\mathrm {d}p'\\&=\sum _jq_j\{h(\mu _j)\delta (p-f_A^{\mathrm{C}}(\mu _j))+(1-h(\mu _j))\delta (p-f_A^{\mathrm{D}}(\mu _j))\}. \end{aligned} \end{aligned}$$
(15)

Thus, the problem in the case of \(N\rightarrow \infty\) is to derive pairs \(\{(q_j,\mu _j)\}_{j=1, \ldots }\) which satisfy Eq. (15).

Next, we consider the case of \(1\ll N<\infty\). From Eq. (12), an equilibrium state \(\phi =\phi ^*\) is given by

$$\begin{aligned} \phi ^*(p)=\int _{0}^{1}\{h(p')g(p;f_A^{\mathrm{C}}(p'),s^2)+(1-h(p'))g(p;f_A^{\mathrm{D}}(p'),s^2)\}\phi ^*(p')\mathrm {d}p', \end{aligned}$$
(16)

where we assume that this equilibrium state is given by a summation of Gaussian functions:

$$\begin{aligned} \begin{aligned}{}&\phi ^*(p)=\sum _jq_jg(p;\mu _j,\sigma _j^2),\\&\sum _jq_j=1. \end{aligned} \end{aligned}$$
(17)

We further assume that the standard deviations \(\sigma _j\) are negligible compared with O(1); specifically, we assume that

$$\begin{aligned} \sigma _j=O(N^{-1/2}) \end{aligned}$$
(18)

holds. When we substitute Eq. (17) into Eq. (16), we obtain

$$\begin{aligned} \begin{aligned} \sum _jq_jg(p;\mu _j,\sigma _j^2)&=\int _{0}^{1}\{h(p')g(p;f_A^{\mathrm{C}}(p'),s^2)+(1-h(p'))g(p;f_A^{\mathrm{D}}(p'),s^2)\}\sum _jq_jg(p';\mu _j,\sigma _j^2)\mathrm {d}p'\\&=\sum _jq_j\int _{0}^{1}\{h(p')g(p;f_A^{\mathrm{C}}(p'),s^2)+(1-h(p'))g(p;f_A^{\mathrm{D}}(p'),s^2)\}g(p';\mu _j,\sigma _j^2)\mathrm {d}p'\\&\simeq \sum _jq_j\int _{-\infty }^{\infty }\{h(\mu _j)g(p;f_A^{\mathrm{C}}(p'),s^2)+(1-h(\mu _j))g(p;f_A^{\mathrm{D}}(p'),s^2)\}g(p';\mu _j,\sigma _j^2)\mathrm {d}p'\\&=\sum _jq_j\{h(\mu _j)g(p;f_A^{\mathrm{C}}(\mu _j),s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2)+(1-h(\mu _j))g(p;f_A^{\mathrm{D}}(\mu _j),s^2+(\Delta f_A^{\mathrm{D}})^2\sigma _j^2)\} \end{aligned} \end{aligned}$$
(19)

Here, from the second to the third line, we have used the following two approximations. One is that the interval of integration \(0\le p'\le 1\) is replaced with \(-\infty< p' < \infty\). The other is that some but not all occurrences of \(p'\) are replaced with \(\mu _j\); namely, \(h(p')\) is replaced with \(h(\mu _j)\), whereas \(p'\) inside the Gaussian functions is kept. The rationale behind these approximations is that \(g(p';\mu _j,\sigma _j^2)\) is almost zero outside the interval \(\mu _j - O(N^{-1/2})< p' < \mu _j + O(N^{-1/2})\), the width of which is as small as \(O(N^{-1/2})\). From the third to the fourth line in Eq. (19), we have calculated an integral of a product of two Gaussian functions by completing the square with respect to \(p'\), as

$$\begin{aligned} \begin{aligned}{}&\int _{-\infty }^{\infty }g(p;f_A^{\mathrm{C}}(p'),s^2)g(p';\mu _j,\sigma _j^2)\mathrm {d}p'\\&=\int _{-\infty }^{\infty }\frac{1}{\sqrt{2\pi s^2}}\exp \left[ -\frac{(p-f_A^{\mathrm{C}}(p'))^2}{2s^2}\right] \frac{1}{\sqrt{2\pi \sigma _j^2}}\exp \left[ -\frac{(p'-\mu _j)^2}{2\sigma _j^2}\right] \mathrm {d}p'\\&=\frac{1}{\sqrt{2\pi s^2}}\frac{1}{\sqrt{2\pi \sigma _j^2}}\int _{-\infty }^{\infty }\exp \left[ -\frac{(p-(\Delta f_A^{\mathrm{C}}p' + a_A^{\mathrm{BC}}) )^2}{2s^2}-\frac{(p'-\mu _j)^2}{2\sigma _j^2}\right] \mathrm {d}p'\\&=\frac{1}{\sqrt{2\pi s^2}}\frac{1}{\sqrt{2\pi \sigma _j^2}}\exp \left[ -\frac{(p-(\Delta f_A^{\mathrm{C}}\mu _j + a_A^{\mathrm{BC}}) )^2}{2(s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2)}\right] \\&\quad \underbrace{\int _{-\infty }^{\infty }\exp \left[ -\frac{s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2}{2s^{2}\sigma _j^2}\left( p'-\frac{s^2\mu _j+(\Delta f_A^{\mathrm{C}})(p-a_A^{\mathrm{BC}})\sigma _j^2}{s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2} \right) ^2\right] \mathrm {d}p'}_{=\displaystyle \sqrt{2\pi \frac{s^2\sigma _j^2}{s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2}}}\\&=\frac{1}{\sqrt{2\pi (s^2+\sigma _j^2(\Delta f_A^{\mathrm{C}})^2)}}\exp \left[ -\frac{(p-f_A^{\mathrm{C}}(\mu _j))^2}{2(s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2)}\right] \\&=g(p;f_A^{\mathrm{C}}(\mu _j),s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2). \end{aligned} \end{aligned}$$
(20)
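The identity in Eq. (20) can be confirmed numerically. The sketch below compares a brute-force Riemann sum of the left side with the closed form on the right side; the parameter values (slope 0.8 and intercept 0.1, corresponding to a C-map with \(e_2=0.1\), and the chosen \(s^2\), \(\mu_j\), \(\sigma_j^2\)) are arbitrary examples.

```python
import numpy as np

def gaussian(p, mu, var):
    return np.exp(-(p - mu) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

slope, intercept = 0.8, 0.1        # f(p') = slope * p' + intercept
s2, mu, sig2 = 1.8e-4, 0.4, 5.0e-4  # example values only

# A wide, fine grid stands in for the integral over (-inf, inf).
p_prime = np.linspace(mu - 1.0, mu + 1.0, 200_001)
dp = p_prime[1] - p_prime[0]

def lhs(p):
    """Riemann sum of g(p; f(p'), s^2) * g(p'; mu, sig2) over p'."""
    integrand = gaussian(p, slope * p_prime + intercept, s2) * gaussian(p_prime, mu, sig2)
    return integrand.sum() * dp

def rhs(p):
    """Closed form: g(p; f(mu), s^2 + slope^2 * sig2)."""
    return gaussian(p, slope * mu + intercept, s2 + slope ** 2 * sig2)

if __name__ == "__main__":
    for p in (0.38, 0.42, 0.45):
        print(p, lhs(p), rhs(p))
```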

Thus, the problem in the case of \(1\ll N<\infty\) is to derive triples \(\{(q_j,\mu _j,\sigma _j)\}_{j=1, \cdots }\) which satisfy Eq. (19).

Now we give an intuitive interpretation of Eq. (19). The left side of Eq. (19) represents a summation of Gaussian functions with mean \(\mu _j\) and variance \(\sigma _j^2\), whereas the right side represents another summation of Gaussian functions, which have been transformed from the original summation. Let us call individuals represented by the jth Gaussian function \(g(p;\mu _j,\sigma _j^2)\) with mass \(q_{j}\) “class-j” individuals. Eq. (19) tells us that among those donors who interact with class-j recipients, the fraction \(h(\mu _j)\) of them cooperate with their recipients, and the distribution of their updated goodness becomes \(g(p;f_A^{\mathrm{C}}(\mu _j),s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2)\). The transition of mean, \(\mu _j\mapsto f_A^{\mathrm{C}}(\mu _j)\), is governed by the C-map. As for the transition of variance, \(\sigma _j^2\mapsto s^2+(\Delta f_A^{\mathrm{C}})^2\sigma _j^2\), the first term \(s^2\) represents newly generated variance due to errors in assessment and to the finiteness of the population size. The second term \((\Delta f_A^{\mathrm{C}})^2\sigma _j^2\) means that the variance \(\sigma _j^2\) that originally existed in the distribution of goodness of class-j recipients is damped by the C-map (recall that its slope is \(\Delta f_A^{\mathrm{C}}\)). Similarly, among those donors who interact with class-j recipients, the fraction \(1-h(\mu _j)\) of them defect with their recipients, and the distribution of their updated goodness becomes \(g(p;f_A^{\mathrm{D}}(\mu _j),s^2+(\Delta f_A^{\mathrm{D}})^2\sigma _j^2)\). Similar explanations are possible for the transition of mean, \(\mu _j\mapsto f_A^{\mathrm{D}}(\mu _j)\) and for the transition of variance, \(\sigma _j^2\mapsto s^2+(\Delta f_A^{\mathrm{D}})^2\sigma _j^2\).

Equilibrium state for each social norm

We now give an overview of our calculation of the equilibrium state for each social norm, \(A=\mathrm{SJ},\mathrm{SS},\mathrm{SH},\mathrm{SC}\). Fig. 2-B shows that analytical solutions to Eq. (19) excellently fit results of computer simulations (see SI for a more detailed calculation of \((q_j, \mu _j,\sigma _j)\)).

As seen in Eq. (19), the C-map (\(f_{A}^{\mathrm{C}}\)) and the D-map (\(f_{A}^{\mathrm{D}}\)) play an important role in the transition of each peak position \(\mu _j\). As Fig. 3-A illustrates, the C-map and the D-map differ depending on the social norm that the population adopts. If there were only one map f, sequentially applying this map would lead to a fixed point, which is the crossing point of map f and the identity map (represented by a 45-degree line), as Fig. 3-B illustrates, and this fixed point would correspond to the peak position of the single Gaussian distribution at the equilibrium state. In our case, we have two maps, \(f_{A}^{\mathrm{C}}\) and \(f_{A}^{\mathrm{D}}\), so the situation is different, but analyzing the fixed point of each map is still crucial for analyzing Eq. (19).

Below we will study each social norm.

Figure 3

(A) C-map \(f_{A}^{\mathrm{C}}\) (solid line) and D-map \(f_{A}^{\mathrm{D}}\) (broken line). We have used \(e_1=e_2=0.1\). From the left to right, the panels show cases of \(A=\mathrm{SJ},\mathrm{SS},\mathrm{SH},\mathrm{SC}\). Black solid line indicates an identity map. (B) Illustration of reaching a fixed point \(p^*=f(p^*)\) by sequential application of a map f. Because slopes of all C-maps and D-maps are less than 1 and greater than \(-1\), the fixed point is always stable.
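The convergence illustrated in Fig. 3-B is easy to reproduce for a linear map \(f(p)=ap+b\) with \(|a|<1\), whose fixed point is \(p^*=b/(1-a)\). The sketch below uses the SJ maps at \(e_2=0.1\) as an example; `iterate` and `fixed_point` are names introduced here.

```python
def iterate(slope, intercept, p0, n):
    """Apply the linear map p -> slope * p + intercept n times, starting from p0."""
    p = p0
    for _ in range(n):
        p = slope * p + intercept
    return p

def fixed_point(slope, intercept):
    """Closed-form fixed point of a linear map with slope != 1."""
    return intercept / (1.0 - slope)

if __name__ == "__main__":
    e2 = 0.1
    sj_c = (1 - 2 * e2, e2)         # C-map of SJ: slope, intercept
    sj_d = (-(1 - 2 * e2), 1 - e2)  # D-map of SJ: slope, intercept
    for name, (a, b) in (("C-map", sj_c), ("D-map", sj_d)):
        print(name, iterate(a, b, 0.0, 200), fixed_point(a, b))
```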

When the social norm is SJ: As Fig. 3-A shows, both C-map and D-map have the same fixed point, \(p=1/2\). Thus, the only possible peak position of Gaussian distributions at the equilibrium state is at

$$\begin{aligned} \mu _1=\frac{1}{2}. \end{aligned}$$
(21)

The equilibrium distribution is given by a single Gaussian distribution.

When the social norm is SS: As Fig. 3-A shows, the C-map is a constant map, \(f_{\mathrm{SS}}^{\mathrm{C}}(p)=1-e_2\), so this position is one of the peaks of the Gaussian distributions at the equilibrium state;

$$\begin{aligned} \mu _1=1-e_2{,} \end{aligned}$$
(22)

(see an illustration in Fig. 4-A). The other peaks can be obtained by repeatedly applying the D-map. More specifically, the \((j+1)\)-th peak position \(\mu _{j+1}\) is obtained by

$$\begin{aligned} \mu _{j+1}=f_{\mathrm{SS}}^{\mathrm{D}}(\mu _j)\ \ (j\ge 1), \end{aligned}$$
(23)

(see an illustration in Fig. 4-B). These infinitely many classes characterize the population.
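The peak positions of Eqs. (22) and (23) can be generated by direct iteration. The sketch below also computes masses \(q_j\) by matching the delta peaks on both sides of Eq. (15), which for SS gives \(q_{j+1}=(1-h(\mu _j))q_j\) because class \(j+1\) consists of donors who defected against class-j recipients. This mass recursion is our reading of Eq. (15), not a quotation of the SI, and the truncation at `J` classes is an assumption of the example.

```python
def h(p, e1):
    """Probability of cooperation toward a recipient of goodness p (Eq. (1))."""
    return p * (1.0 - e1) + (1.0 - p) * e1

def ss_classes(e1, e2, J=60):
    """Return peak positions mu_j and masses q_j for SS, truncated at J classes."""
    def d_map(p):  # D-map of SS: a_GD = e2, a_BD = 1 - e2
        return -(1.0 - 2.0 * e2) * p + (1.0 - e2)

    mu = [1.0 - e2]  # Eq. (22)
    weight = [1.0]   # unnormalized masses, relative to q_1
    for _ in range(J - 1):
        weight.append(weight[-1] * (1.0 - h(mu[-1], e1)))
        mu.append(d_map(mu[-1]))  # Eq. (23)
    total = sum(weight)
    q = [w / total for w in weight]
    return mu, q

if __name__ == "__main__":
    mu, q = ss_classes(e1=0.1, e2=0.1)
    for j in range(5):
        print(f"class {j + 1}: mu = {mu[j]:.4f}, q = {q[j]:.4f}")
```

The peaks alternate around, and converge to, the fixed point of the D-map, which is 1/2 for any \(e_2\).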

Figure 4

An illustration to interpret the equilibrium state for \(A=\mathrm{SS}\). (A) In the left panel, the yellow solid line shows the C-map, which maps any value p to a constant value, \(1-e_2\). This mapped value is labeled as \(\mu _1\). The right panel (same as a panel in Fig. 2-B) shows the equilibrium state \(\phi ^*(p)\) for \(A=\mathrm{SS}\), where the peak positions of all the classes \(\mu _{1}, \mu _{2}, \mu _{3}, \cdots\) are mapped to \(\mu _1\) by the C-map. (B) In the left panel, the yellow broken line shows the D-map, which sequentially maps the 1st peak to 2nd, 3rd, 4th peaks, and so on. The right panel illustrates how the peak position \(\mu _j\) of class-j is mapped to the peak position \(\mu _{j+1}\) of class-\((j+1)\) by the D-map.

When the social norm is SH: As Fig. 3-A shows, the D-map is a constant map, \(f_{\mathrm{SH}}^{\mathrm{D}}(p)=e_2\), so this position is one of the peaks of the Gaussian distributions at the equilibrium state;

$$\begin{aligned} \mu _1=e_2. \end{aligned}$$
(24)

The other peaks can be obtained by repeatedly applying the C-map, for the same reason as in the case of \(A=\mathrm{SS}\). Thus, the \((j+1)\)-th peak position \(\mu _{j+1}\) is obtained by

$$\begin{aligned} \mu _{j+1}=f_{\mathrm{SH}}^{\mathrm{C}}(\mu _j)\ \ (j\ge 1). \end{aligned}$$
(25)
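
Because Eq. (25) iterates an affine map, the SH peaks can also be written in closed form. The Python sketch below is a hedged illustration, not the authors' calculation: it assumes that the C-map under SH is \(f_{\mathrm{SH}}^{\mathrm{C}}(p)=(1-e_2)p+e_2(1-p)\), a form not written out in this section, and checks the iteration against \(\mu _j=\frac{1}{2}\left[ 1-(1-2e_2)^j\right]\), which follows from that assumed map. The function names are hypothetical.

```python
def c_map_sh(p, e2):
    """Assumed C-map under SH (not stated explicitly in this section).

    A cooperating donor is judged good only by observers who regard the
    recipient as good; each assessment is flipped with probability e2.
    """
    return (1.0 - e2) * p + e2 * (1.0 - p)


def sh_peaks(e2, n_peaks):
    """Return [mu_1, ..., mu_n] with mu_1 = e2 and mu_{j+1} = C-map(mu_j)."""
    peaks = [e2]
    while len(peaks) < n_peaks:
        peaks.append(c_map_sh(peaks[-1], e2))
    return peaks


def sh_peak_closed_form(j, e2):
    """Closed form implied by the assumed affine C-map (j starts at 1)."""
    return 0.5 * (1.0 - (1.0 - 2.0 * e2) ** j)


# Compare the iterated peaks with the closed form for e2 = 0.1.
for j, mu in enumerate(sh_peaks(e2=0.1, n_peaks=5), start=1):
    print(j, round(mu, 6), round(sh_peak_closed_form(j, 0.1), 6))
```

For \(0<e_2<1/2\) the factor \((1-2e_2)^j\) decays geometrically, so under this assumption the peaks increase monotonically toward 1/2 without reaching it.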

When the social norm is SC: Both C-map and D-map are constant maps; \(f_{\mathrm{SC}}^{\mathrm{C}}(p)=1-e_2\) and \(f_{\mathrm{SC}}^{\mathrm{D}}(p)=e_2\). Thus, there are two possible peak positions of Gaussian distributions at the equilibrium state. They are at

$$\begin{aligned} \begin{aligned}{}&\mu _1=1-e_2,\\&\mu _2=e_2, \end{aligned} \end{aligned}$$
(26)

and the equilibrium distribution is given by a summation of two Gaussian distributions.

We can also derive \(\sigma _j^2\) and \(q_j\) for each social norm. See SI for the detailed calculation. Here we only summarize the results in Table 3.

Table 3 Analytical solutions to Eq. (19). h is defined as \(h(p)=p(1-e_1)+(1-p)e_1\) (see Eq. (1)). We employ the convention \(\prod _{k=1}^{0} \cdot = 1\). From this table, we see that, for the SJ norm, neither the error rate in action \(e_1\) nor the error rate in assessment \(e_2\) influences the stationary distribution. For SS and SH, \(e_1\) influences only the masses \(q_j\), whereas \(e_2\) influences the masses \(q_j\), the means \(\mu _j\), and the variances \(\sigma _j^2\). For SC, \(e_1\) influences nothing, but \(e_2\) influences the means \(\mu _j\) and the variances \(\sigma _j^2\).

Based on Table 3, we now describe the equilibrium distribution of goodness in the population for each social norm.

When the social norm is SJ: All individuals receive good reputations from about half of the population and bad reputations from the other half. The average goodness in the population is 1/2. Surprisingly, the two error rates \(e_1\) and \(e_2\) do not affect the equilibrium distribution at all.

When the social norm is SS: There are an infinite number of peaks in the equilibrium distribution. The highest one is at \(\mu _{1}=1-e_{2}\), and individuals in this class-1 are those who receive the most good reputations. The second highest peak is at \(\mu _{2} = 2e_{2}(1-e_{2})\), and individuals in this class-2 are those who receive the most bad reputations. The positions of the third highest peak, the fourth highest peak, and so on alternate around 1/2 as \(\mu _2< \mu _4< \cdots< 1/2< \cdots< \mu _3 < \mu _1\). The average goodness in the population is relatively high compared with the other three social norms.

When the social norm is SH: There are an infinite number of peaks in the equilibrium distribution. The highest one is at \(\mu _{1}=e_{2}\), and individuals in this class-1 are those who receive the fewest good reputations. The positions of the second highest peak, the third highest peak, and so on increase monotonically as \(\mu _1< \mu _2< \mu _3< \cdots < 1/2\). The average goodness in the population is relatively low compared with the other three social norms.

When the social norm is SC: Half of the individuals receive good reputations from a majority of individuals (i.e., high goodness, \(\mu _1=1-e_2\)), and the other half receive bad reputations from a majority of individuals (i.e., low goodness, \(\mu _2=e_2\)). The average goodness in the population is 1/2, which is the same as in the case of SJ. However, the frequency distribution of goodness differs greatly between SJ and SC, as shown in Fig. 2-B. The action error rate, \(e_1\), does not affect the equilibrium distribution at all.

Remarks on SS: The equilibrium distribution of goodness under SS is especially interesting because there are some individuals with low goodness (such as class-2) although the average goodness in the population is high. Here, we explain how such an equilibrium distribution is formed under SS. First of all, SS tends to generate many individuals with high goodness, labeled as class-1. This is because once a donor cooperates with a recipient, observers assign good reputations to the donor under SS regardless of whether the recipient’s reputation in the eyes of those observers is good or bad, unless observers commit an error in assessment (see Fig. 5-A). On the other hand, SS also generates a small number of individuals with low goodness, labeled as class-2. Such individuals emerge when a donor defects with a recipient in class-1, either because the donor belongs to the minority of individuals who think the recipient is bad or because the donor thinks the recipient is good but erroneously chooses defection against his/her intention. In either case, such a donor receives bad reputations from almost all observers and descends to class-2, because in the eyes of those observers the donor’s defection is seen as a defection against a good recipient (see Fig. 5-B). The mechanism by which individuals in class-\((j+1)\) are generated is similar; a donor who defects with a recipient in class-j moves to class-\((j+1)\).

Figure 5

An interpretation of the equilibrium state for SS. (A) When a donor cooperates with a recipient, the donor receives good reputations from a lot of observers, independent of classes of the donor and the recipient. Such a donor moves to class-1. Because this process frequently occurs, SS generates a majority of individuals with high goodness. (B) When a donor defects with a recipient in class-1, the donor receives bad reputations from a lot of observers and such a donor moves to class-2. This process does not frequently occur, but SS definitely generates a minority of individuals with low goodness.

One might expect that the goodness of a randomly sampled individual from a population that employs SS should always be higher than the goodness of a randomly sampled individual from a population that employs SJ, because SS assigns a good reputation in more cases than SJ (compare SS and SJ in Table 1; if a donor receives a good reputation under SJ, such a donor would also receive a good reputation under SS). However, this naive expectation is not true, because class-2 individuals (and more generally, class-2j individuals) under SS have goodness below 1/2, whereas all individuals under SJ have goodness of about 1/2. This apparently paradoxical phenomenon is explained as follows. Observers under SS assign good reputations more frequently than those under SJ, and thus generate many individuals with high goodness (i.e., class-1 individuals). However, the existence of such individuals in turn causes the emergence of a minority of individuals with low goodness (such as class-2 individuals). As a result, a large divide in goodness arises among individuals in a population that employs SS; extremely good individuals and extremely bad individuals coexist there.
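
The coexistence of a high average goodness with a low-goodness minority under SS can be illustrated numerically. The Python sketch below is a rough reading of the mechanism described in the remarks on SS, not a reproduction of Table 3. It assumes (i) the D-map \(f_{\mathrm{SS}}^{\mathrm{D}}(p)=e_2 p+(1-e_2)(1-p)\), (ii) that a donor facing a class-j recipient defects with probability \(1-h(\mu _j)\) and then moves to class-\((j+1)\), so that the stationary masses satisfy \(q_{j+1}=q_j\,(1-h(\mu _j))\), and (iii) truncation of the infinite sequence at a finite number of classes. All function names are hypothetical.

```python
def h(p, e1):
    """Probability that a donor cooperates with a recipient of goodness p
    (definition quoted in the caption of Table 3)."""
    return p * (1.0 - e1) + (1.0 - p) * e1


def d_map_ss(p, e2):
    """Assumed D-map under SS (not stated explicitly in this section)."""
    return e2 * p + (1.0 - e2) * (1.0 - p)


def ss_classes(e1, e2, n_classes=200):
    """Return (mu, q): peak positions and normalized class masses under SS.

    Assumption: a donor who defects with a class-j recipient moves to
    class-(j+1), so the unnormalized mass obeys w_{j+1} = w_j * (1 - h(mu_j)).
    The infinite sequence is truncated at n_classes.
    """
    mu = [1.0 - e2]
    weight = [1.0]
    for _ in range(n_classes - 1):
        weight.append(weight[-1] * (1.0 - h(mu[-1], e1)))
        mu.append(d_map_ss(mu[-1], e2))
    total = sum(weight)
    q = [w / total for w in weight]
    return mu, q


mu, q = ss_classes(e1=0.1, e2=0.1)
mean_goodness = sum(m * w for m, w in zip(mu, q))
print(f"mu_2 = {mu[1]:.3f}, q_1 = {q[0]:.3f}, mean goodness = {mean_goodness:.3f}")
```

Under these assumptions and with \(e_1=e_2=0.1\), class-2 sits at goodness 0.18 while the population average stays above 1/2, which matches the qualitative picture given in the text.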

Discussion and conclusion

This study theoretically analyzed the question of how reputation relationships among individuals are established, using a model of indirect reciprocity. Under the assumption of private reputations, this question has so far been addressed mainly by computer simulations, except for a few studies18,33. Here we formulated the change in the “goodness” of an individual, defined as the proportion of individuals who regard the focal individual as good, as a stochastic process (Eqs. (3) and (7)). Then, we formulated the time evolution of the frequency distribution of goodness in the population as a deterministic integro-differential equation (Eq. (12)). By employing an approximation that uses a Gaussian mixture distribution and assumes a large population size, we obtained a closed equation that the equilibrium distribution must satisfy (Eq. (19)). We succeeded in calculating the equilibrium distribution of goodness (Table 3) and interpreted its meaning.

As far as we know, this is the first study that has analytically derived the equilibrium frequency distribution of goodness for a model of indirect reciprocity that assumes private reputations, and we believe that our study provides a major advance in theoretical studies of indirect reciprocity. Below, we compare our approach with two recent works that have analytically studied a model of private reputations.

Uchida and Sasaki18 analyzed the average goodness in the population under the SJ social norm for a model of private assessment, and reached the conclusion that it is 1/2. In contrast, our study has derived the full distribution of goodness. From this distribution it is easy to calculate the average goodness under SJ, which is 1/2. Moreover, we have analyzed three other social norms, SS, SH, and SC. By using the approach developed in this paper, it is possible to analyze in a similar manner the other 12 second-order social norms that have not been studied here.

Okada et al.33 studied cases where there is always only one observer who updates his/her private reputation of a donor. By making such an extreme assumption, the authors successfully avoided calculating higher-order correlations between reputations of the same individual among observers. In contrast, we have assumed that all individuals in the population act as observers and update their private reputations of the same donor simultaneously. Such an approach explicitly considers correlations in opinions among observers. It would be interesting to develop a theoretical framework similar to ours for a model in which only a part of the population (say, a proportion \(0< \theta < 1\)) become observers and simultaneously update their private reputations of the same donor. We leave this for future work.

As significant progress in the analysis of reputation structure, this study treated a model in which all players adopt (1) the same (2) second-order social norm under (3) random interactions between a donor and a recipient. However, this simple model may not perfectly reflect a real human society. First, society consists of various kinds of people who have different viewpoints, i.e., different social norms. This extension raises another question, namely which social norms can be evolutionarily advantageous, which relates to studies on the emergence of cooperation11,37,38,39 and exploitation40,41,42. Second, real people may take more information into account than second-order social norms do when they assign reputations to others, such as third-order information (i.e., the current reputation of a donor) (e.g., social norms named standing43, staying44 and consistent standing20) or more35,45. Such additional pieces of information will bring more complexity to the reputation structure among people35,45. Third, real people interact mainly with neighbors or friends. Such biased interactions are often modeled by introducing lattices or complex networks46,47,48,49,50,51,52,53,54,55,56,57. Our approach can be applied to the analysis of reputation structure in such extended situations in the future.

There are various kinds of people in a society, from those who receive good reputations from many people to those who receive good reputations from a few. Such diversity is established by complex dynamics of mutual evaluation of their behavior. This study theoretically approached such complex dynamics. Although there are some differences between our simple model and a real society, our findings give some basic insight into the mechanism of how good and bad individuals emerge in the context of indirect reciprocity (corresponding to “generalized exchange”58,59,60,61 in sociology). In conclusion, this study provides a new theoretical approach to investigate reputation structure in the population where individuals privately assess each other.