Behavioral consequences of second-person pronouns in written communications between authors and reviewers of scientific papers

The psychological underpinnings and behavioral consequences of pronoun usage have long fascinated researchers, with much attention paid to second-person pronouns such as “you,” “your,” and “yours.” While the effects of these pronouns are well understood in many contexts, their role in bilateral, dynamic conversations (especially those outside of close relationships) remains less explored. This research bridges this gap by examining 25,679 instances of peer review correspondence at Nature Communications using the difference-in-differences method. Here we show that authors who address reviewers using second-person pronouns receive fewer questions, shorter responses, and more positive feedback. Further analyses suggest that this shift in the review process occurs because “you” (vs. non-“you”) usage creates a more personal and engaging conversation. Employing the peer review process of scientific papers as a backdrop, this research reveals the behavioral and psychological effects of second-person pronouns in interactive written communications.


Supplementary Note 4. Robustness checks
We conducted the following six robustness checks to further buttress our findings; their details are reported below.
Robustness Check 1: More "you" usage is associated with stronger effects. Robustness Check 1 examines whether the frequency of "you" usage influences its effectiveness. To do this, we categorized responses into groups based on the frequency of "you": a few (one or two), moderate (three through five), and many (six or more) instances. Responses without any "you" serve as the reference group. We then re-estimated the benchmark model and report the results in Supplementary Table 3. In a nutshell, the effects of "you" usage on nearly all outcomes amplify as the frequency of "you" rises, except for reviewers' positivity as captured by the Python package TextBlob (though not by the R package sentimentr).
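For concreteness, the grouping rule can be sketched as follows (a minimal illustration; the function name and the exact pronoun pattern are our assumptions, not the paper's actual implementation):

```python
import re

def you_bucket(response):
    """Bucket an author response by its number of second-person pronouns,
    mirroring the grouping in Robustness Check 1."""
    n = len(re.findall(r"\b(?:you|your|yours)\b", response.lower()))
    if n == 0:
        return "none"      # reference group
    if n <= 2:
        return "few"       # [1, 2]
    if n <= 5:
        return "moderate"  # [3, 5]
    return "many"          # [6, max]

print(you_bucket("We thank you for your comments"))  # few
```

Responses in the "none" bucket form the reference group against which the three "you" buckets are compared.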
Robustness Check 2: Excluding courteous usage of "you." In the main regression (Table 3), a paper is categorized into the treatment group so long as the word "you" appears in the 1st round author response, regardless of the context or circumstance in which "you" is applied. The treatment group can thus contain both "courteous you" (such as "Thank you") and "non-courteous you." However, since courteous "you" usage is frequently deployed as a mere formality (or even cliché), it may not necessarily produce a personal, engaging conversation.
Robustness Check 2 addresses this possibility. Specifically, to construct a cleaner treatment group, we include a paper in the treatment group only if its "you" usage is conversational (as opposed to courteous, e.g., "thank you"). Of all 5,042 "you" papers, 1,847 (36.63%) contain only courteous "you." We excluded these 1,847 courteous-"you" papers from our dataset and re-estimated our regression models. The results, displayed in Supplementary Table 4, echo our main findings.
Robustness Check 3: Using third-person addresses (e.g., "the reviewer") as the control group. While Robustness Check 2 attempts to build a cleaner treatment group, Robustness Check 3 aims to build a cleaner control group. In our sample, the non-"you" author responses can be further divided into two categories, wherein authors either (a) instead used third-person language (i.e., "the reviewer") to address the reviewer, or (b) did not use second- or third-person addresses at all, perhaps only engaging with the questions themselves. Here, we exclude category (b) from the dataset and re-estimate our models. The corresponding estimates in Supplementary Table 5 are again consistent with the main findings in Table 3 of the main text.
Robustness Check 4: Propensity score matching (PSM) to establish comparable treatment and control groups. To alleviate the concern that authors' "you" usage may not be sufficiently random (exogenous), we utilized propensity score matching (PSM) to establish comparable treatment and control groups for model estimation. By combining the DID approach with PSM, we aim to obtain "matched" treatment and control groups that possess comparable observable characteristics. Most covariate imbalances between the two groups (Supplementary Table 6) are no longer statistically significant after matching, indicating that the 1:1 nearest-neighbor PSM algorithm effectively reduces the bias associated with observable characteristics. Kernel density plots of the propensity scores before and after matching are provided in Supplementary Fig. 3. Again, consistent results are obtained through the PSM-DID approach (Supplementary Table 7).
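To illustrate the matching step, here is a minimal sketch of 1:1 nearest-neighbor matching with replacement on precomputed propensity scores (the function and the example scores are hypothetical; in the actual analysis the scores come from a model of "you" usage on the observable covariates):

```python
def nearest_neighbor_match(treated, control):
    """1:1 nearest-neighbor matching with replacement: pair each treated
    unit with the control unit whose propensity score is closest.
    Inputs map unit id -> propensity score; returns treated id -> control id."""
    return {
        t_id: min(control, key=lambda c_id: abs(control[c_id] - t_score))
        for t_id, t_score in treated.items()
    }

# Hypothetical propensity scores for illustration only.
treated = {"paper_A": 0.62, "paper_B": 0.35}
control = {"paper_X": 0.60, "paper_Y": 0.33, "paper_Z": 0.90}
print(nearest_neighbor_match(treated, control))
# {'paper_A': 'paper_X', 'paper_B': 'paper_Y'}
```

The DID model is then re-estimated on the matched sample only.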
Robustness Check 5: Heckman model. Because authors may use "you" in response to the initial use of "you" by reviewers, one may be concerned that our estimation suffers from potential self-selection bias.
Robustness Check 5 implements a Heckman two-stage model to alleviate this concern1,2. Specifically, the first-stage model predicts authors' "you" usage based on covariates such as author features (including gender and rank) and the reviewers' "you" usage in the 1st round (i.e., "initial reviews").
For gender determination, we probabilistically inferred the gender of authors from their names, utilizing the Social Security Administration (SSA) database3–5. As for rank, because it is impractical to manually retrieve precise professorship details for more than ten thousand authors in our dataset, we approximate author rank using the H-index (a citation-centric metric denoting scientific impact) obtained from the Web of Science database. The first stage generates the inverse Mills ratio (IMR), which represents unobserved determinants of authors' "you" usage.
The second-stage main model then formally estimates the effects of authors' "you" usage, correcting for self-selection by controlling for the IMR from the first stage. The resulting estimates remain consistent and robust, offering further confidence in our findings (Supplementary Tables 8 and 9).
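For intuition, the IMR is the standard normal hazard function, lambda(z) = phi(z) / Phi(z), evaluated at the fitted probit index from the first stage; a minimal sketch using only the standard library:

```python
import math

def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def std_normal_cdf(z):
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def inverse_mills_ratio(z):
    """IMR for a treated observation: lambda(z) = phi(z) / Phi(z),
    where z is the fitted probit index from the first-stage model."""
    return std_normal_pdf(z) / std_normal_cdf(z)

print(round(inverse_mills_ratio(0.0), 4))  # 0.7979
```

In the second stage, this quantity enters the main regression as an additional control.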
Robustness Check 6: Placebo test. To further enhance the credibility of our results, we conducted a placebo test. Specifically, we assigned Response with "You" to our observations at random (referred to as "placebo 'you' usage"): the treatment group is randomly generated, with replacement, keeping the group size unchanged at 38% of the papers. Subsequently, we replicated the baseline DID regression using the generated "pseudo-you" data, obtaining estimates of the beta coefficient of the key variable (the interaction term in Equation 2). This process was repeated 500 times, yielding a distribution of 500 beta coefficient estimates, illustrated in Supplementary Fig. 4 (where the vertical dashed line corresponds to the coefficient from Table 3).
Since the placebo "you" usage is generated randomly, the expected value of the beta coefficient estimates should be close to zero. Supplementary Fig. 4 reveals that the distribution of the placebo estimates is indeed centered around zero and, as expected, our benchmark estimate clearly lies outside the range of the placebo estimates. This further bolsters our confidence that our DID findings are not driven by unobservable factors.
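The logic of the placebo test can be sketched as follows. For brevity, the data are simulated, a treated-vs-control difference in means stands in for the full DID interaction coefficient, and treatment is re-drawn without replacement (the actual test redraws with replacement):

```python
import random
import statistics

random.seed(42)

# Simulated outcomes for 100 "papers".
outcomes = [random.gauss(0, 1) for _ in range(100)]
n_treated = 38  # group size held at ~38% of papers, as in the actual test

placebo_estimates = []
for _ in range(500):
    treated_ids = set(random.sample(range(100), n_treated))
    treated = [y for i, y in enumerate(outcomes) if i in treated_ids]
    control = [y for i, y in enumerate(outcomes) if i not in treated_ids]
    placebo_estimates.append(statistics.mean(treated) - statistics.mean(control))

# With purely random assignment, the placebo estimates center on zero.
print(round(statistics.mean(placebo_estimates), 3))
```

A benchmark estimate lying far in the tail of this placebo distribution indicates the observed effect is unlikely to arise from arbitrary group labels.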

Supplementary Note 5. Description of two indicators of engaging conversations
Subjectivity. Following the definition by Bravo6, "subjectivity" captures the extent to which a text contains personal opinions rather than factual information. In this study, the subjectivity of each peer review report was obtained using the TextBlob Python package. TextBlob employs a built-in lexicon to determine how subjective a text is on a scale ranging from 0 (very objective) to 1.0 (very subjective). Higher subjectivity scores indicate that the text is more opinionated in nature, whereas lower scores suggest that the text is predominantly objective, without much personal opinion.
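To convey the mechanics, here is a toy lexicon-averaging sketch in the spirit of TextBlob's approach (the mini-lexicon and its weights are invented for illustration; TextBlob uses its own, far larger built-in lexicon):

```python
# Invented mini-lexicon of word -> subjectivity weight in [0, 1], seeded
# with the top subjective markers found in our reviewer comments.
TOY_LEXICON = {
    "different": 0.6, "important": 0.75, "interesting": 0.5,
    "clear": 0.4, "new": 0.4,
}

def subjectivity(text):
    """Average the weights of lexicon words found in the text; 0.0 if none."""
    weights = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(weights) / len(weights) if weights else 0.0

print(subjectivity("This is an interesting and important question"))  # 0.625
```

A sentence containing no lexicon words scores 0.0, the fully objective end of the scale.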
To give readers a sense of which linguistic markers TextBlob deems "subjective" in our dataset, we present the top 100 subjective marker words identified in reviewer comments (Supplementary Fig. 5). Notably, "different," "important," "interesting," "clear," and "new" are the top five subjective markers found in our reviewer comments, highlighted in red.
The word cloud above lists key subjective markers and their relative weights in our data. However, these words are presented in an isolated, decontextualized fashion. Therefore, in Supplementary Table 10, we hand-picked a few sentences from our reviewer comments and calculated their subjectivity scores using TextBlob. We believe that this in situ presentation makes the subjectivity scores more tangible and relatable. Note that these example sentences are for illustrative purposes only, as our data analyses calculated subjectivity scores based on full reviewer comments rather than single sentences.
Word Complexity. Intuitively, words with more syllables tend to be more complicated and difficult to understand. In our analysis, we measured a reviewer comment's word complexity by calculating the average number of syllables per word in that comment.
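A minimal sketch of this measure, using the common vowel-group heuristic for syllable counting (the heuristic is our illustrative assumption, not necessarily the paper's exact syllable counter; it miscounts some silent vowels and diphthongs):

```python
def count_syllables(word):
    """Approximate syllables as the number of contiguous vowel groups."""
    vowels = "aeiouy"
    groups = 0
    prev_was_vowel = False
    for ch in word.lower():
        is_vowel = ch in vowels
        if is_vowel and not prev_was_vowel:
            groups += 1
        prev_was_vowel = is_vowel
    return max(groups, 1)

def word_complexity(comment):
    """Average syllables per word across a reviewer comment."""
    words = [w.strip(".,;:!?") for w in comment.split()]
    words = [w for w in words if w]
    return sum(count_syllables(w) for w in words) / len(words)

print(round(word_complexity("The methodology is appropriate"), 3))  # 2.75
```

Longer, multi-syllable vocabulary thus pushes a comment's score above the dataset mean of 1.954.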
In our dataset, the mean Word Complexity of the reviewer comments is 1.954 syllables (SD = 0.133). As with subjectivity scores, in Supplementary Table 11 we showcase examples of complex and simple sentences from our reviewer comments, accompanied by their respective complexity scores. Again, please note that our actual analyses calculated word complexity at the reviewer comment (instead of sentence) level.

The low prevalence of the engagement topic (1.1%) is to be expected: after all, as many as 40 topics exist in the text, and the majority of the text is more likely to consist of language oriented toward the substantive matters of the research than language employed to engage with people. On the other hand, the distribution reveals considerable diversity in the topic, as evidenced by a standard deviation of 2.5% and a maximum value of 41.9%. This variability provides ample opportunity for discerning the impact of "you" usage.
Several robustness checks for LDA. Considering the potential impact of the topic count (n = 40 in our study) on LDA results, we conducted a first robustness test by re-estimating our LDA models with different numbers of topics (35 and 45 topics, respectively; Columns (1) and (2) in Supplementary Table 13); the results remained robust.
Second, in addition to employing LDA to uncover latent topics within the peer review reports, we manually compiled a list of "high-engagement" words (e.g., "exciting," "interesting," and "enjoy"; a total of 116 words listed in Supplementary Table 15) and measured engagement via word counts. We used this new measure to reassess the impact of "you" usage on engagement. The results once again support a positive association between "you" usage and engagement (Column (3) in Supplementary Table 13), albeit with slightly reduced significance. We suspect that this diminished significance stems from the challenge of formulating a predetermined (as opposed to data-driven) word list that effectively captures spontaneous, real-world engagement. For example, while a predetermined word list might use descriptive terms like "engage" or "collaborate," organic conversations often involve context-specific phrases like "your revision addressed my concern" or "following your recommendations."

Third, the topic model is estimated using both the treatment and control groups, and it is possible that the control group has a greater influence on the determination of topics. Therefore, we alternatively constructed a structural topic model (STM) that considers the source of a review comment (treatment or control group) as a factor in the model's estimation. As shown in Column (4) of Supplementary Table 13, although the estimated effect is not as substantial (p = 0.168) as in the original LDA model, the direction of the effect remains the same, which overall aligns with the proposed account.
Lastly, we also tested our models with the engagement topic (topic 11) replaced by unrelated topics. These unrelated topics, used as "placebos," included topics 13 and 26, which pertain to specific scientific fields (presumably electromagnetism and ecology), and topics 18 and 36, which relate to manuscript evaluation (likely in terms of exposition and methodology). The results in Supplementary Table 14 reveal no significant correlation between the use of "you" and these unrelated topics. This further strengthens our confidence in the validity of the proposed engagement mechanism.

Supplementary Note 8. Dictionaries for the measurements of three variables
Three variables, namely Negativity (Hand Coded), "High-engagement" Words, and Friendliness of Authors (1st Round), were constructed using manually created dictionaries. A list of 92 negative words was employed to create the variable Negativity (Hand Coded). As outlined in the methodology, this variable is computed as the frequency of these 92 negative words in our peer review reports (normalized by dividing by 100 for scaling purposes).
The "High-engagement" Words lexicon was developed by tallying words in peer review reports that were pertinent to reviewer engagement, combined with existing engagement lexicons (including Oxford Languages, Cambridge Dictionary, Merriam-Webster Dictionary, Collins Dictionary, Thesaurus.com,and WordHippo), yielding a total of 116 terms highly relevant to reviewer engagement.In the same vein, the Friendly of Authors lexicon was also created using a comparable approach, encompassing 178 words used to calculate the frequency of friendly words used by authors in the first round of review (Supplementary Table 15).
We also conducted a mediation analysis. The result shows that the relationship between "you" usage and positivity is fully mediated by participants' perception of a personal and engaging conversation (unstandardized indirect effect = 0.17, SE = 0.03, 95% CI = [0.11, 0.23]; 5,000 bootstrap resamples).
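The bootstrap logic behind such an indirect-effect interval can be sketched on simulated data (everything below is illustrative: the coefficients and sample size are arbitrary, 1,000 rather than 5,000 resamples are drawn, and the b path is simplified by omitting the control for treatment used in a full mediation model):

```python
import random
import statistics

random.seed(1)

def ols_slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

# Simulated data with a built-in path: "you" usage -> perceived engagement
# -> positivity (path coefficients 0.5 and 0.8 are arbitrary).
n = 200
you = [random.randint(0, 1) for _ in range(n)]
engage = [0.5 * t + random.gauss(0, 1) for t in you]
positive = [0.8 * m + random.gauss(0, 1) for m in engage]

# Percentile-bootstrap the indirect effect a * b.
boot = []
for _ in range(1000):
    idx = [random.randrange(n) for _ in range(n)]
    t = [you[i] for i in idx]
    m = [engage[i] for i in idx]
    p = [positive[i] for i in idx]
    boot.append(ols_slope(t, m) * ols_slope(m, p))

boot.sort()
print((round(boot[24], 2), round(boot[975], 2)))  # 95% percentile CI
```

A confidence interval excluding zero, as in the reported [0.11, 0.23], indicates a reliable indirect effect.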
Taken together, these results replicate those of the main-text experiment, thus bolstering our confidence in our experimental findings.
In a nutshell, we observe no statistically significant effect of "you" usage on contention, personal connection, or perceived duty. Including any of these three variables as a covariate does not significantly change the results of the aforementioned ANOVA or the mediation analyses.
For this replication study, following the SAGER guidelines, we refrained from conducting post hoc gender-based analyses, as the sample size may be insufficient to enable meaningful conclusions.
For further details of the experimental design, we share the experimental materials in the Open Science Framework repository (https://doi.org/10.17605/OSF.IO/XWYS4).

Supplementary Table 2. Detailed description and summary statistics of four control variables

Supplementary Table 3. DID estimates with continuous usage of "you"
* The five official Nature Communications categories.
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variables Response with "You" [1, 2], Response with "You" [3, 5], and Response with "You" [6, max] are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors in parentheses are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001. Two-sided t tests with a 95% confidence interval are employed here and throughout the tables of the Supplementary Information.

Supplementary Table 4. DID estimates with treatment group excluding courteous usage of "you"
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors in parentheses are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 5. DID estimates with the usage of "the reviewer" as the control group
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors in parentheses are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 7. Estimation results with the PSM-DID approach
Notes: Each column in the table represents a PSM-DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 10. Qualitative examples of sentences with more and less subjective language
Notes: Each column in the table represents a Heckman regression with control variables and paper fixed effects. The results hold with paper fixed effects included. Standard errors in parentheses are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 12. Different use of singular and plural first-person pronouns by reviewers
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 13. Estimation results with alternative numbers of topics
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 14. Estimation results with alternative (placebo) topics among the 40 latent topics
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 16. Estimates for a "you" conversation initiated by reviewers
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 17. Estimates for a "you" conversation not initiated by reviewers
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 18. Estimates for a "you" conversation initiated by reviewers on six behavioral outcomes
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.

Supplementary Table 19. Estimates for a "you" conversation not initiated by reviewers on six behavioral outcomes
Notes: Each column in the table represents a DID regression with control variables and paper fixed effects. The coefficients of the variable Response with "You" are not included in this table, in that the pure effect of "you" usage is absorbed by paper fixed effects. Standard errors are reported in parentheses and are clustered at the paper level. * p < 0.05, ** p < 0.01, *** p < 0.001.