Abstract
The psychological underpinnings and behavioral consequences of pronoun usage have long fascinated researchers, with much attention paid to second-person pronouns like “you,” “your,” and “yours.” While these pronouns’ effects are understood in many contexts, their role in bilateral, dynamic conversations (especially those outside of close relationships) remains less explored. This research attempts to bridge this gap by examining 25,679 instances of peer review correspondence with Nature Communications using the difference-in-differences method. Here we show that authors addressing reviewers using second-person pronouns receive fewer questions, shorter responses, and more positive feedback. Further analyses suggest that this shift in the review process occurs because “you” (vs. non-“you”) usage creates a more personal and engaging conversation. Employing the peer review process of scientific papers as a backdrop, this research reveals the behavioral and psychological effects that second-person pronouns have in interactive written communications.
Introduction
In written communications, one can address the other conversational party using either second-person pronouns or their third-person counterparts. For instance, during the peer review process of a scientific paper, an academic may address the reviewers either using “you” (e.g., “the issue you brought up”) or a third-person reference instead (e.g., “the issue the reviewer brought up…”). Whether this choice matters, however, is less known. This question is embedded within the recent research investigating the behavioral and psychological consequences of personal pronoun usage1,2,3, which in turn falls under the broader research category of the social function of language usage4,5,6. Building upon this growing literature, the present research aims to investigate how the usage of second-person pronouns (“you,” “your,” and “yours”; hereinafter, we use the terms second-person pronoun usage and “you” usage interchangeably) impacts the outcome of written communications.
A wealth of research has investigated the impact of “you” usage on individuals’ mental states and behavior. For instance, “you” can draw the attention of a conversational party and hence evoke higher involvement7,8,9. Moreover, generic “you,” as in “you shall not murder,” signals normative behavior and hence impacts persuasion10,11,12,13. Furthermore, “you” usage in lyrics like “I will always love you” or movie quotes like “here’s looking at you, kid” can remind one of somebody in their own life (a loved one in these examples)14. Despite their important insights, however, most such investigations focus on one-way, one-off communications. While another body of literature does investigate “you” in two-way communications, it is largely limited to close relationships, mainly focusing on how pronoun usage reflects a party’s self- or other-focus4,15,16,17. Therefore, the field’s knowledge remains limited about the role of second-person pronouns in bilateral, dynamic, and interactive conversations, especially beyond close relationships.
To bridge this gap, in the present paper, we examine the behavioral and psychological consequences of second-person pronoun usage in interactive, conversational settings. Specifically, by analyzing 25,679 instances of revision correspondence with Nature Communications, we focus on how “you” (vs. non-“you”) usage in authors’ responses to reviewers may influence reviewers’ behavior. This dataset is ideal for our investigation, because the peer review process allows us to compare naturally occurring instances of both “you” and non-“you” responses.
The extant literature has shown that by directly addressing a conversational party, second-person pronouns can evoke the listener’s attention, personal relevance, and involvement in the communication7,14. Other personal pronouns do not possess this feature. For instance, in stark contrast to “you,” third-person pronouns often function to signal objectivity and minimize the involvement or even the existence of the speaker18,19,20. Building on this literature, we contend that in a communicative setting, addressing the other party as “you” (vs. not as “you”) should be associated with a more personal and engaging conversation, in contrast to an impersonal, businesslike exchange.
This feature of “you” usage may, in turn, lead to observable behavioral patterns in peer review outcomes. First, the personal and engaging conversational tone stimulated by “you” usage may in and of itself make the reviewer like the responses more, as individuals tend to favor things that are personally relevant8,14. Second, communicative norms that govern such conversations may call for greater politeness, civility, and embarrassment avoidance (“face-saving”) in communications21,22,23,24, making the comments more favorable (or less harsh) than they otherwise would be and resulting in greater positivity and fewer questions in reviewer comments.
Building on this perspective, here we show that when the authors use (vs. do not use) second-person pronouns to address the reviewers, they also see less lengthy reviewer comments, encounter fewer questions, and receive more positive and less negative feedback. We further link this shift in the review process to a more personal and engaging conversation prompted by “you” usage: First, when authors address reviewers using “you,” the reviewer responses tend to include fewer first-person singular pronouns, suggesting decreased self-focus25,26,27,28, and to use less complex words, a staple feature of in-person conversation29,30,31,32. Second, thematic analyses conducted using Latent Dirichlet Allocation (LDA) show that second-person pronouns are indeed associated with increased reviewer engagement in their comments. Core findings from our dataset are also causally supported by a pre-registered behavioral experiment (N = 1601). Specifically, when participants assuming the role of reviewers are addressed in second person (vs. third person), they evaluate an otherwise identical author response as more positive. This effect is mediated by the extent to which the conversation is perceived as personal and engaging. Taken together, this research investigates the behavioral consequences and psychological underpinnings of second-person pronoun usage employing field and lab data. In so doing, we contribute to the literature on language usage (and pronoun usage in particular) and shed light on the collective understanding of the peer review process and the science of science in general.
Results
Data and design
We analyzed revision correspondence of all papers published in Nature Communications between April 2016 (when the journal first began publishing reviewer reports) and April 2021. This dataset contains 13,359 published papers that account for a total of 29,144 rounds of review. In the present research, a “round” of review is defined as one exchange between the editorial/reviewer team (hereinafter simply “reviewers”) and the authors, with the reviewer comments being followed by the author responses. For instance, the “1st round of review” begins with the initial comments from the reviewers and the authors’ responses to those comments, the 2nd round of review consists of the next batch of reviewer comments and the authors’ responses to them, and so on. In our analysis, we focus on the authors’ usage of second-person pronouns in addressing the reviewer team in the first review-response-review process (i.e., reviewer comments in the 1st round, author responses in the 1st round, and reviewer comments in the 2nd round). We focus on this process because it constitutes most of our observations (25,679, or 88.11% of 29,144 review rounds) and, more importantly, affords a difference-in-differences (DID) design, which we elaborate below. Figure 1 illustrates our focal data and study design (full details regarding the number of papers and rounds of review can be found in Supplementary Note 1).
In examining the impact of “you” usage, it is important to consider some distinctive features of the peer review process. As an illustration, Fig. 2 shows the authors’ response to the reviewers in the 1st round of review (marked by the vertical dashed line), as well as the number of questions reviewers posed before and after this response (i.e., question counts in the 1st and 2nd review rounds). We then compare question counts following both “you” and non-“you” usage in a quasi-experimental fashion. Specifically, we categorize a paper into the treatment group if its authors used “you” in their 1st-round responses (which, in our context, can be considered as the treatment administered to reviewers), and the control group if they did not. Note that the treatment and control groups here are not in the strict experimental sense, as the papers are not randomly assigned to them.
Importantly, as illustrated in Fig. 2, to estimate the effect of “you” usage, it could be misleading to simply contrast the number of questions reviewers raised after the author responses with “you” (5.82) and without (4.25). This is because empirically, the treatment and control groups may begin with different question counts (which happens to be the case in our data—29.87 and 24.30, respectively, as per Fig. 2). To offset this initial discrepancy, we instead measure the decline in question count from the 1st to the 2nd round. Specifically, authors who used “you” saw a decrease of 24.05 questions in the subsequent round, while those who did not use “you” saw a decrease of 20.05.
Here, the control group reduction of 20.05 questions reflects a “natural” progression in our data, such that question count dwindles as the review process progresses, regardless of whether the author used “you” (see Fig. 2). On the other hand, the treatment group reduction of 24.05 also encompasses our focal effect of “you” usage, in addition to the overall trend. Therefore, the difference between the two reductions provides a relatively precise estimation of the effect of “you” usage. Specifically, compared to the control group, the treatment group experienced a steeper decline in question counts (by a margin of 4).
This approach to estimating the effect of “you” usage constitutes a DID framework, a quasi-experimental method widely used in observational data analysis. Specifically, the first “differences” here are the differences in question counts before and after author response, within both the treatment and control groups. These differences serve to offset initial discrepancies between the groups. The second “difference” then contrasts these two differences to estimate the effect of “you” usage—hence the name “difference-in-differences.”
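The arithmetic behind this estimate can be reproduced directly from the group means reported above (a minimal sketch using the question counts from Fig. 2):

```python
# Question-count means reported in the text (Fig. 2).
pre_treat, post_treat = 29.87, 5.82  # "you" papers: rounds 1 and 2
pre_ctrl, post_ctrl = 24.30, 4.25    # non-"you" papers: rounds 1 and 2

# First differences: within-group change from round 1 to round 2.
diff_treat = post_treat - pre_treat  # -24.05
diff_ctrl = post_ctrl - pre_ctrl     # -20.05

# Second difference: the DID estimate of the "you" effect.
did = diff_treat - diff_ctrl         # -4.00
```

The control group's decline absorbs the "natural" progression of the review process, so only the excess decline in the treatment group is attributed to "you" usage.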
As depicted in Fig. 1, during the 1st round of review, authors of 5042 papers (37.74% of all 13,359 papers) used “you” in their responses to the reviewer comments (i.e., treatment group), whereas authors of 8317 (62.26%) papers did not use “you” (i.e., the control group). We then estimate the effect of “you” on various behavioral and psychological outcomes by comparing the average change in such outcomes before and after a response with “you” versus without “you”. In what follows, this DID model enables us to examine more closely the effect of “you” usage on total number of questions from the reviewers, total length of reviewer comments, and positivity/negativity of the 2nd-round reviewer comments.
In addition, we also employ this DID model to examine the impact of “you” usage on how personal and engaging the reviewer-author communication is. Measurements of interest include the subjectivity of the language used in the reviewer comments, the frequency of reviewers’ use of first-person pronouns, the complexity of the vocabulary used by the reviewers, and the extent to which the reviewers engage with the authors.
Equation (2) in the Methods section formulates the DID model summarized above. To ensure the robustness of our analysis, a variety of control variables are also included in this model. The summary statistics of our dependent and control variables are presented in Table 1.
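The interaction structure of such a DID regression can be sketched in a few lines. The toy numbers and variable layout below are hypothetical, and the paper's actual model additionally includes control variables and paper fixed effects:

```python
import numpy as np

# Hypothetical toy panel: one row per paper per round.
treated = np.array([1, 1, 0, 0, 1, 1, 0, 0])  # 1 if authors used "you"
post    = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # 0 = round 1, 1 = round 2
questions = np.array([30.0, 6.0, 24.0, 20.0, 28.0, 5.0, 25.0, 21.0])

# Regress question counts on treated, post, and their interaction;
# the interaction coefficient is the DID estimator.
X = np.column_stack([np.ones_like(questions), treated, post, treated * post])
coef, *_ = np.linalg.lstsq(X, questions, rcond=None)
did_effect = coef[3]  # coefficient on treated * post
```

With this saturated design the interaction coefficient equals the difference of the two within-group changes, mirroring the arithmetic described above.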
Reviewers wrote less and asked fewer questions following authors’ “you” (vs. non-“you”) usage
Table 2 summarizes the DID estimates on two review outcomes: the total number of questions the reviewer raised, and the total number of words the reviewer wrote. The first non-header row is of particular interest, as it reports the DID estimator, or the effect of the interaction between “you” usage and time (i.e., before vs. after the author’s response).
In Column (1), the significant, negative coefficient (−4.0019) indicates that “you” usage has a negative effect on the total number of questions the reviewer asked. Specifically, when exposed to “you” (vs. non-“you”) author response in the 1st round of review, reviewers raised fewer questions in the 2nd round (t(25675) = −10.01, p < 0.001, B = −4.00, 95% CI = [−4.79, −3.22]). This result remains robust when the control variables (see Table 1) and paper fixed effects are included in the DID model (“you” usage sees 3.34 fewer questions; t(12319) = −9.40, p < 0.001, B = −3.34, 95% CI = [−4.03, −2.64]; Column (2)). Similarly, reviewers addressed by “you” (vs. non-“you”) language also wrote 172.15 fewer words as estimated by the basic DID model (t(25675) = −9.54, p < 0.001, B = −172.15, 95% CI = [−207.50, −136.79]; Column (3)), or 135.59 fewer words when the control variables and paper fixed effects are included (t(12319) = −9.36, p < 0.001, B = −135.59, 95% CI = [−163.98, −107.20]; Column (4)).
Reviewer comments are more positive (and less negative) following authors’ “you” (vs. non-“you”) usage
In addition, we find that authors using “you” also receive more positive (and less negative) reviewer comments during the review process. To assess positivity in a reliable and robust manner, we employed multiple widely-adopted automated text analysis techniques to analyze the reviewer comments (see Sentiments of Reviewers’ Comments in the Methods section for more details on these measurements).
Table 3 summarizes the corresponding DID estimates, with control variables and paper fixed effects included. Columns (1) and (2) reflect the positivity of reviewer comments employing the Python package TextBlob and R package sentimentr, respectively. Columns (3) and (4), on the other hand, assess the negativity of reviewer comments employing the Python package NLTK and a hand-coded lexicon of common negative words, respectively. As indicated in the first non-header row of Table 3, the findings are consistent across all measurements of review positivity/negativity, such that authors’ use of “you” in the 1st round is significantly associated with increased positivity and decreased negativity of the reviewer comments in the 2nd round.
To further validate these findings, we conducted six additional robustness checks, the detailed results of which are reported in the Supplementary Information for succinctness. To briefly summarize: First, we demonstrate that more “you” usage is associated with a stronger effect on the variables above (Supplementary Note 4 and Supplementary Table 3). Second, to construct a cleaner treatment group, we included a paper in the treatment group only when its “you” usage is conversational (as opposed to courteous; e.g., “thank you”). Of all 5042 “you” papers, 1847 (36.63%) contain only courteous “you,” and are thus excluded from analysis during this robustness check (Supplementary Note 4 and Supplementary Table 4). Third, to construct a cleaner control group, we only included a paper in the control condition if it explicitly addresses the reviewer in third person (e.g., “the reviewer”; Supplementary Note 4 and Supplementary Table 5). Fourth, we employed a matching technique (propensity score matching, PSM) to obtain matched treatment and control groups with comparable observable characteristics (Supplementary Note 4; Supplementary Tables 6, 7; Supplementary Fig. 3). Fifth, we employed a two-stage Heckman model (Supplementary Note 4; Supplementary Tables 8, 9) to capture authors’ “you” usage in response to the initial use of “you” by reviewers. In doing so, we allow for a more reciprocal, dynamic view of the impact of “you.” Sixth, to further account for the non-randomness in “you” usage, we employed placebo (non-parametric permutation) tests to validate that our DID findings are not spurious (Supplementary Note 4 and Supplementary Fig. 4). Our results remain robust to all these robustness checks.
More personal and engaging conversation following “you” (vs. non-“you”) usage: “I” usage, word complexity, and reviewer engagement
Thus far, we have demonstrated that addressing reviewers as “you” is associated with fewer questions and less writing from the reviewers, as well as more positive and less negative reviewer comments. In postulating the underlying mechanism of the effect, we contend that addressing the other party in second person is also associated with a more personal and engaging conversation, which is in turn responsible for these marked effects on the review process. Below, we examine several potential indicators of personal and engaging conversations to test this hypothesis employing the same DID model.
A potential indicator is the subjectivity of reviewer comments—the extent to which comments reflect personal opinions rather than factual information33. High subjectivity in language indicates that the text or utterance is more opinionated and personal (as opposed to factual and unbiased) in nature. The usage of subjective language is a marked feature of interpersonal conversations34,35,36,37. We assess language subjectivity using the Python package TextBlob38 (see Supplementary Note 5 for method details, and Supplementary Fig. 5 for the most frequently used words in our data indicating subjectivity). However, our prediction that authors’ “you” usage is associated with increased subjectivity in reviewer responses does not reach the 0.05 level of significance (see Column (1) in Table 4).
One piece of evidence for a more personal and engaging conversation involves first-person pronoun usage. In our data, authors’ “you” usage is associated with reviewers’ decreased usage of first-person singular pronouns (e.g., “I,” “me,” “my”; see Column (2) in Table 4), which can indicate self-focused attention25,26,27,28. On the other hand, there is no statistically significant difference for first-person plural pronouns (i.e., “we,” “us,” “our”), which often indicate a communal focus39 (Supplementary Note 6 and Supplementary Table 12). This result suggests that following authors’ “you” usage, reviewers may show less self-focus, hence making fewer “I” statements.
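A frequency measure of this kind can be sketched with a simple tokenizer (an illustrative implementation, not necessarily the one used in the paper):

```python
import re

FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}

def first_person_singular_rate(text):
    """Share of word tokens that are first-person singular pronouns."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in FIRST_PERSON_SINGULAR for t in tokens) / len(tokens)
```

For example, `first_person_singular_rate("I think my concern was addressed")` returns 2/6, since two of the six tokens are first-person singular pronouns.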
Additional evidence of a more personal conversation is found in word complexity. A reviewer comment is more complex if the words in it contain more syllables on average. We find that authors’ second-person pronoun usage is associated with decreased word complexity in reviewer comments (see Column (3) in Table 4). This result suggests that the reviewers, when addressed using second-person pronouns, favored more plain, readable language over complex and formal written language, a choice often made to facilitate a conversation29,30,31,40,41.
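Average syllables per word can be approximated with a vowel-group heuristic. This is a rough sketch; the exact syllable-counting algorithm used in the paper may differ:

```python
import re

def avg_syllables_per_word(text):
    """Mean syllables per word, estimating syllables as vowel groups."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0

    def count_syllables(word):
        # Each run of consecutive vowels counts as one syllable (min. 1).
        return max(1, len(re.findall(r"[aeiouy]+", word)))

    return sum(count_syllables(w) for w in words) / len(words)
```

Under this heuristic, plain conversational sentences score near 1, while formal academic prose with polysyllabic vocabulary scores higher.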
Yet more evidence of a personal, engaging conversation comes from the text-mining technique Latent Dirichlet Allocation (LDA), which identifies hidden topics in reviewer comments that may indicate reviewers’ engagement with the authors. The R package topicmodels was applied to the 1st- and 2nd-round reviewer comments and revealed 40 hidden topics at the best model fit (see Supplementary Note 7 and Supplementary Fig. 6). Here, we focus on one topic (topic 11) that consists of numerous words reflecting communication and engagement during the review process (Fig. 3). All 40 identified topics and their top 10 marker words are displayed in Supplementary Note 7 and Supplementary Fig. 7. The distribution of document-level topic proportions within the chosen topic (i.e., reviewer engagement) is presented in Supplementary Note 7 and Supplementary Fig. 8.
We measure the engagement level of a review comment by the frequency of words associated with the identified engagement topic. This frequency is equal to the probability of a review containing the engagement topic, multiplied by the total word count of said review. We find that the topic of engagement appears significantly more frequently in reviewer comments if “you” was used (vs. not used) by the authors. This result again suggests that the use of “you” may have triggered greater engagement in the subject matter of the paper (see Column (4) in Table 4). We further experimented with alternative LDA models with 35 and 45 topics, as well as the use of a subset of “high-engagement” words (e.g., “exciting,” “interesting,” and “enjoy”; see all 116 words in Supplementary Note 8 and Supplementary Table 15) and the adoption of a structural topic model42,43. In all alternative models, we obtain results consistent with our predictions (see Supplementary Note 7 and Supplementary Table 13). However, no statistically significant difference was observed between the treatment and control groups when the selected topic of reviewer engagement was replaced with other, unrelated topics (see Supplementary Note 7 and Supplementary Table 14).
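The paper fit LDA with the R package topicmodels; a minimal Python analogue using scikit-learn illustrates the engagement-frequency computation (the toy comments, two topics, and topic index here are all illustrative):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-ins for reviewer comments.
comments = [
    "thank the authors for an interesting and exciting revision",
    "the method section lacks detail on the statistical model",
    "we enjoyed reading this engaging and interesting response",
    "the regression model and control variables need clarification",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(comments)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)

doc_topics = lda.transform(dtm)   # per-document topic proportions (rows sum to 1)
word_counts = dtm.sum(axis=1).A1  # words per comment after stop-word removal

# Engagement frequency = engagement-topic proportion x document length.
engagement_freq = doc_topics[:, 0] * word_counts
```

In the actual analysis, the engagement topic would be the one whose marker words reflect communication and engagement (topic 11 in the paper), rather than an arbitrary column.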
Effect of the usage of second-person pronouns by the reviewers on engagement measurements
If an author’s “you” usage can render conversations more personal and engaging, it follows that this effect should grow even stronger when both parties employ “you” to address each other. In this section, we examine how both parties’ “you” usage jointly impacts indicators of a personal and engaging conversation. Adding reviewers’ usage of “you” into our analyses yields a difference-in-difference-in-differences (DDD) model. This DDD model is best viewed as splitting our original DID model into two separate yet comparable DIDs: one with reviewers who used “you” in the 1st round and the other without. This design thus allows us to examine the impact of reviewers’ “you” usage by contrasting the two separate DID models. Indeed, as demonstrated in Supplementary Note 9 and Supplementary Tables 16, 17, we find that when the reviewer initiates a “you” (vs. non-“you”) conversation in the first place, most of our DID estimates (save for subjectivity) yield larger effect sizes. In other words, the effect of “you” usage is most evident when both parties use “you” language. Table 5 formally compares the effects of the two DIDs, forming a third differential impact based on reviewers’ initial “you” usage. The spirit of our analysis echoes that of Kenny and colleagues’ seminal work on dyadic data analysis, which factors the role of both parties into the analysis44,45.
Recall that author’s usage of “you” is sufficient to elicit significant behavioral consequences (i.e., question numbers, word counts, positivity, negativity), irrespective of whether the reviewer used “you” first or not (refer to Supplementary Note 9; Supplementary Tables 18, 19). What we attempt to demonstrate here is the amplifying effect of mutual “you” usage on our mechanism—that is, creating a personal, engaging conversation.
Note that although the focus of this research lies in authors’ “you” usage, our DDD model, together with the previously discussed Heckman Model, affords a reciprocal perspective into how reviewers’ “you” usage also impacts the author. Specifically, reviewers’ “you” usage can not only stimulate authors’ “you” usage (Heckman Model) but also strengthen the contribution of “you” usage to boosted engagement (DDD). Also note that our DDD analysis can also cascade into the remaining rounds, and we direct interested readers to Supplementary Note 2 for more information.
Additionally, we report DDD results for number of questions, number of words, positivity, and negativity in Supplementary Tables 18, 19. Although the DID effect sizes are generally larger when reviewers used “you” in the 1st round, these DDD results are not statistically significant.
Behavioral experiment
The above analyses provide converging evidence that “you” usage is associated with more personal and engaging communication. However, secondary data have a limited capacity for establishing psychological mechanisms and, crucially, causality. To address this, we conducted a controlled, pre-registered (https://aspredicted.org/9yw2f.pdf) experiment to supplement our field data. In this study, 1601 participants were asked to play the role of reviewers and evaluate an author’s response. Of all participants, 901 (56.3%) self-identified as female, 676 (42.2%) as male, and 24 (1.5%) as non-binary or chose not to disclose their gender; mean age = 41.9 years.
Participants were randomly assigned to one of two conditions, in which they were addressed by the author using either “you” or non-“you” language. Participants then responded to a battery of questions regarding the author’s response. Detailed design and procedures are outlined in the Methods section. Key findings are summarized below, while secondary analyses are available in Supplementary Method 1.
First, an ANOVA reveals that participants addressed with “you” rated the author’s response more positively (M = 5.77, SD = 0.98) than did those who were not (M = 5.61, SD = 1.01; F(1, 1599) = 10.62, p = 0.001, Cohen’s d = 0.16, 95% CI = [0.06, 0.26]). Furthermore, “you” (vs. non-“you”) usage also led participants to perceive their exchange with the author as more personal and engaging (M = 5.13, SD = 1.10 vs. M = 4.76, SD = 1.24; F(1, 1599) = 40.78, p < 0.001, Cohen’s d = 0.32, 95% CI = [0.22, 0.42]). Figure 4 illustrates these findings.
Second, a mediation analysis shows that the relationship between “you” usage and positivity is fully mediated by participants’ perception of a personal and engaging communication (unstandardized indirect effect = 0.19, SE = 0.03, 95% CI = [0.14, 0.26]; 5000 bootstrap resamples).
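The bootstrap logic behind such an indirect-effect estimate can be sketched on simulated data (the coefficients and sample size below are hypothetical; the actual analysis used the experimental measures):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated experiment: condition -> mediator -> outcome.
cond = rng.integers(0, 2, n).astype(float)   # 0 = non-"you", 1 = "you"
mediator = 0.8 * cond + rng.normal(0, 1, n)  # perceived personal/engaging tone
outcome = 0.5 * mediator + rng.normal(0, 1, n)

def indirect_effect(c, m, y):
    # a-path: condition -> mediator (simple regression slope).
    a = np.polyfit(c, m, 1)[0]
    # b-path: mediator -> outcome, controlling for condition.
    X = np.column_stack([np.ones_like(c), c, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]
    return a * b

# Percentile bootstrap of the indirect effect (a * b).
boot = []
for _ in range(5000):
    idx = rng.integers(0, n, n)
    boot.append(indirect_effect(cond[idx], mediator[idx], outcome[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
```

A 95% bootstrap confidence interval for a*b that excludes zero is the usual criterion for a significant indirect effect.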
Taken together, “you” usage indeed makes the reviewer–author communication more personal and engaging, which in turn leads to more positive reviewer comments. To further validate these results, we also replicated the main effects and the mediation effect above in a separate sample (N = 1200) employing the same experimental design. In this second experiment, we also find that these findings cannot be attributed to alternative processes such as contention, personal connection, or obligation. Refer to Supplementary Method 2 for detailed results.
Discussion
This work examines the correspondence in the peer review process and finds that when author responses use (vs. do not use) second-person pronouns (e.g., “you”), reviewers ask fewer questions, provide briefer responses, and offer more positive and fewer negative comments. Both lab and field evidence converge to demonstrate that this is the case because “you” (vs. non-“you”) usage fosters a more personal and engaging conversation.
An apparent practical implication of this work is, of course, that authors of academic papers can employ second-person pronouns strategically during the review process to their benefit. However, we believe that our findings extend beyond academic contexts and could be relevant for other forms of (formal) written communication. For example, businesses might utilize “you” in their marketing materials to nudge consumer attitudes; likewise, professionals or politicians could use “you” to foster greater engagement. While the effectiveness of these applications requires further empirical validation, the real-world implications of our findings are both intriguing and potentially impactful.
Conceptually, our study first contributes to the broad literature on language usage, particularly pronoun usage. Researchers have long known that nuances in language use matter. For example, the presence or absence of future tense in a language affects its users’ future orientation6, and word choice can signal political stance46. Within this field, pronoun usage has fascinated theorists for decades, as it can reflect individuals’ mental states such as narcissism27 or lead to various mental processes or behaviors (such as introducing independence/interdependence self-construal)47,48. Recent technological advancements have significantly fueled research on pronoun usage, enabling the collection of large amounts of data from various online platforms49,50,51.
With respect to second-person pronouns, while their usage has been studied in unidirectional, one-off communication7,8,9,10,11,12,13,14, understanding “you” usage in dynamic, bilateral, reciprocal contexts remains critical. Thus far, important work has explored the bilateral usage of “you” in close relationships4,15,16,17. Additionally, methods like the Actor-Partner Interdependence Model have further enriched our understanding of communications between comparable parties44,45. Nonetheless, current insights into mutual “you” usage are mostly confined to close relationships whose parties are of relative equal stations. Hence, there remains a need to explore more diverse contexts such as familial, professional, or adversarial communications, particularly those between unequal parties like superiors and subordinates, professors and students, or, in our case, reviewers and authors. Extant work has shown, for instance, that high-power individuals tend to use “I” less often, instead favoring more “we” and “you” usage52. In a similar vein, our study enriches our understanding of “you” usage in two-way communications that are both professional and hierarchical.
Moreover, by revealing the link between language and review outcomes, we contribute to the emerging field of science of science, which scientifically probes the practice of science itself53,54. Regarding the peer review process, several often-overlapping science of science sub-fields, such as bibliometrics, scientometrics, and metascience, have accumulated important insights into how scientific publication works, what potential biases exist, and how to ensure rigorous, transparent outcomes55,56,57. Through the present work, we underscore that perspectives and methods of language study can bear promising fruit in science of science, and we contribute to the few extant works that have already begun to explore this front (finding, e.g., that scientific papers often use generic, overgeneralized language that signals impact at the cost of precision)5.
Several limitations in our data should be noted. To begin, the lack of pre-1st-round reviewer comments prevents direct verification of the parallel trend assumption for DID analysis. As a result, the non-randomness of “you” and non-“you” usage poses a limitation in our data (we have, however, addressed this issue using methods such as PSM, the Heckman model, permutation tests, and a behavioral experiment). Moreover, our dataset comprises only papers eventually published, leading to potential selection biases due to the absence of review reports from rejected submissions or those authors opted not to pursue. Additionally, since publishing review correspondence in Nature Communications was optional before November 2022, our data (April 2016 to April 2021) only include authors who opted for publication. These limitations could hinder our ability to analyze “you” usage in, say, more conflictual communications, despite its well-established potential to convey confrontation (e.g., challenging, blaming, or finger-pointing)2,16,17,58. Likewise, selection biases in our data also prevent us from comparing “you” usage in accepted versus rejected manuscripts, or between authors who did versus did not choose to publish their review records. Thus, we encourage future research to explore diverse datasets to expand on our findings.
Furthermore, in this study, we interpret the decreased “I” usage by reviewers following authors’ “you” usage as indicative of a reduction of self-focused attention. However, we recognize the complexities around this inference59, as “I” language may also signify language concreteness1 and self-disclosure15, contributing to a more personal conversation. While this alternative account is unlikely to contradict our findings due to extensive triangulation, we nevertheless call on future research to delve deeper into first-person usage in written communication.
Methods
Ethics
This research is approved by the Office of Research and Knowledge Transfer at Lingnan University and complies with all pertinent ethical regulations.
Peer review data
We sourced peer review data for all papers from April 2016 to April 2021 directly from Nature Communications. Each paper’s Supplementary Information section typically hosts its peer review file, which we downloaded using a custom Python (v3.7) script. These files, originally in PDF format, include both reviewer comments and author responses. To create a paper-level peer review dataset, we first separated reviewer comments from author responses for every review round and created separate TXT files for each. We then generated the variables used in our analysis for each paper and review round, employing text-mining techniques.
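The exact layout of the parsed review files is not specified here; as an illustration of the round-by-round separation step, the following sketch splits a plain-text file on hypothetical section markers (the actual Nature Communications PDFs use their own formatting):

```python
import re

# Hypothetical section markers; the real files use their own layout.
ROUND_PATTERN = re.compile(
    r"REVIEWER COMMENTS \(Round (\d+)\):\n(.*?)\n"
    r"AUTHOR RESPONSES \(Round \1\):\n(.*?)(?=\nREVIEWER COMMENTS|\Z)",
    re.DOTALL,
)

def split_review_file(text):
    """Return {round_number: (reviewer_comments, author_responses)}."""
    return {
        int(m.group(1)): (m.group(2).strip(), m.group(3).strip())
        for m in ROUND_PATTERN.finditer(text)
    }

sample = (
    "REVIEWER COMMENTS (Round 1):\nThe method is unclear.\n"
    "AUTHOR RESPONSES (Round 1):\nWe thank the reviewer and have clarified it.\n"
    "REVIEWER COMMENTS (Round 2):\nThank you, this is now clear.\n"
    "AUTHOR RESPONSES (Round 2):\nWe appreciate your positive assessment.\n"
)
rounds = split_review_file(sample)
```

Each round’s reviewer comments and author responses can then be written to separate TXT files for downstream text mining.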
To construct the panel data for studying our proposed effects, we employed several automated text analysis techniques to generate the desired variables. Specifically, we leveraged Python packages such as TextBlob and NLTK, as well as R packages including sentimentr and topicmodels. These methods are well-established in the fields of natural language processing and computer science and are widely adopted in social science studies.
Sentiments of reviewers’ comments
We generated the following four sentiment metrics for reviewer comments. Two of these capture positivity, while the other two capture negativity.
Python-based positivity
Positivity, known as “polarity” in the TextBlob Python package, captures how positive a reviewer comment is on a scale ranging from −1.0 (highly negative) to 1.0 (highly positive). The calculation relies on TextBlob’s built-in lexicon, which contains a collection of words and their part-of-speech meanings.
R-based positivity
Using the sentimentr package in R, we obtained an alternative metric of review positivity, likewise gauged on a −1.0 (highly negative) to 1.0 (highly positive) scale.
Python-based negativity
Utilizing Python’s NLTK package, we derived the negativity of a review. This approach leverages the VADER (Valence Aware Dictionary and Sentiment Reasoner) sentiment analyzer to evaluate each review’s negative emotion scores on a 0 (not negative at all) to 1 (very negative) scale.
Manually coded negativity
Following Delgado et al.60, we incorporated the 30 negative words most frequently employed by our sampled reviewers. We also introduced other negative words that recurrently appeared in our dataset, resulting in a compilation of 92 negative terms. To measure negativity, we determined the occurrence rate of these negative words (scaled by dividing by 100). The scale thus starts at 0 (not negative at all) and increases by 0.01 (or 1%) each time one of the 92 negative words is used.
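The study’s 92-word lexicon is not reproduced here; the sketch below uses a few illustrative entries and scores 0.01 per occurrence, per the description above:

```python
import re

# A few illustrative entries only; the study's actual lexicon contains 92 terms.
NEGATIVE_WORDS = {"unclear", "flawed", "incorrect", "weak", "unconvincing"}

def manual_negativity(text):
    """0.01 per negative-word occurrence, as described in the Methods."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(token in NEGATIVE_WORDS for token in tokens) / 100

score = manual_negativity("The argument is weak and the figures are unclear.")
```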
Indicators of personal and engaging conversations
The following four variables serve as indicators of a personal and engaging conversation:
Subjectivity
Assessed using the TextBlob Python package and its built-in lexicon. This measure scales from 0 (very objective) to 1 (very subjective). For illustrative examples of varying subjectivity in reviewer comments, see Supplementary Note 5 and Supplementary Table 10.
First-person singular pronoun usage
Quantified by counting occurrences of terms like “I,” “me,” “my,” and “mine” within a reviewer report.
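A minimal sketch of this count (an illustration of the tokenization, not the study’s exact script):

```python
import re

FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine"}

def count_first_person_singular(text):
    """Count first-person singular pronouns in a reviewer report."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(t in FIRST_PERSON_SINGULAR for t in tokens)

n = count_first_person_singular("I think my concern was addressed, which satisfies me.")
```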
Word complexity
Captured by the average number of syllables per word in a peer review report. A higher average syllable count indicates a more complex vocabulary. For examples of complex and simple words, refer to Supplementary Note 5 and Supplementary Table 11.
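The paper does not specify its syllabification algorithm; the sketch below uses a simple vowel-group heuristic as a stand-in:

```python
import re

def approx_syllables(word):
    """Heuristic syllable count: vowel groups, minimum one per word.
    This is an assumed approximation, not the study's actual algorithm."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def word_complexity(text):
    """Average syllables per word across a report."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(approx_syllables(w) for w in words) / len(words) if words else 0.0

complexity = word_complexity("The data are robust")
```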
Reviewer engagement
Deduced from the proportion of the “engagement topic” in a reviewer report. This proportion is obtained by employing the Latent Dirichlet Allocation (LDA) model, a well-established method in natural language processing that uncovers latent topics within a collection of texts.
In our context, the texts in question are the reviewer reports. The LDA model assumes that each report comprises several topics (with the combined probability of all topics being 1) and that every topic is a discrete probability distribution over all words. By implementing the LDA model with a predetermined number of topics, document–topic and topic–word pairs can be formed based on the words included in each reviewer report, allowing us to identify latent topics.
To implement the LDA model, we followed a data preprocessing approach similar to those used in recent studies61. Initial steps involved the removal of stop words (e.g., “and,” “or”), numbers, and punctuation. We also stemmed and lower-cased all words for consistency. We then employed the R package topicmodels to assess model performance and estimate an appropriate number of topics. Specifically, after experimenting with topic counts ranging from 10 to 100 (at 10-topic intervals), we determined 40 to be the optimal number, as it yielded the lowest perplexity score. With the topic number set to 40, the engagement level was subsequently formulated as:

$$Engagement\; level=\%\; of\; engagement\; topic\times Number\; of\; words\; in\; the\; reviewer\; report$$

where % of engagement topic is the probability or proportion of the engagement-related topic in the review text calculated by the LDA analysis. Number of words in the reviewer report is the total word count in each review text.
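A toy sketch of the topic-modeling step using scikit-learn (the study used R’s topicmodels with 40 topics chosen by perplexity; the corpus, topic count, and choice of “engagement topic” below are illustrative assumptions, and the proportion and word count are combined as a product per the formulation described above):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

reports = [
    "the method is sound and the results are convincing",
    "please clarify the statistical analysis and report effect sizes",
    "i enjoyed reading this engaging and thoughtful discussion",
    "the discussion is engaging and the authors respond thoughtfully",
]
vec = CountVectorizer(stop_words="english")
dtm = vec.fit_transform(reports)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(dtm)  # each row sums to 1: per-report topic proportions

# Which topic counts as the "engagement topic" must be judged from its top
# words; topic 0 here is a placeholder assumption.
engagement_topic = 0
word_counts = [len(r.split()) for r in reports]
engagement = [doc_topic[i, engagement_topic] * word_counts[i] for i in range(len(reports))]
```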
Model
We employ the difference-in-differences (DID) model to identify the impact of second-person pronouns on various outcome variables of interest. Specifically, we estimate the following model:

$$y_{it}=\beta_{0}+\beta_{1}\,{Response\_with\_you}_{it}\times {After\_response}_{it}+\beta_{2}\,{After\_response}_{it}+\gamma {X}_{it}+{\delta }_{i}+{\varepsilon }_{it}$$

where \({y}_{{it}}\) represents the outcome variable of paper i in round t. \({Response\_with\_you}_{it}\) denotes whether the author(s) of a paper responded with “you”, taking the value of 1 if the response includes “you” and 0 otherwise. \({After\_response}_{it}\) denotes whether the observation period is after the response, taking the value of 1 if so and 0 otherwise. \({X}_{it}\) is a vector of control variables for a paper, including (1) the number of pages62; (2) the number of references63; (3) the title length; (4) the number of authors64; (5) the H-index of the first author65; (6) the gender of the first author61; (7) the last initial of the first author66; (8) the positivity of authors in the 1st round of review; (9) the friendliness of authors in the 1st round of review; (10) the positivity of reviewers in the 1st round of review; (11) the month the paper was published67; (12) the year the paper was published67; and (13) the discipline to which the paper belongs (Nature Communications identifies five disciplines: biological sciences, physical sciences, health sciences, earth and environmental sciences, and scientific community and society). \({\delta }_{i}\) is the paper fixed effects, controlling for potentially unobserved paper-level factors. \({\varepsilon }_{{it}}\) is a random error term. The coefficient β1 is our coefficient of interest, examining the differential effect of responses with and without “you” (on various outcomes) before and after the response. We find that the residuals of the DID models approximate a normal distribution, and the variance of the residuals is stable across different levels of the independent variables, as exemplified in Supplementary Fig. 9.
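As a minimal numerical illustration of the before/after contrast that β1 captures, the sketch below computes the DID estimate on synthetic data (the actual estimation additionally includes the control vector and paper fixed effects; all numbers here are made up):

```python
import numpy as np

# Synthetic before/after outcomes (e.g., reviewer positivity) for papers whose
# authors did ("you") vs. did not (non-"you") use second-person pronouns.
rng = np.random.default_rng(0)
n = 500
true_effect = 0.3
you_pre = rng.normal(0.0, 1.0, n)
you_post = rng.normal(0.5 + true_effect, 1.0, n)  # common trend 0.5 + treatment effect
non_pre = rng.normal(0.0, 1.0, n)
non_post = rng.normal(0.5, 1.0, n)                # common trend only

# DID estimate: (treated post - pre) minus (control post - pre).
did = (you_post.mean() - you_pre.mean()) - (non_post.mean() - non_pre.mean())
```

The common 0.5 trend cancels out of the double difference, leaving an estimate close to the injected 0.3 treatment effect.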
Behavioral experiment
Pre-registration
The behavioral experiment was pre-registered on April 28, 2023 (Pacific Time) with AsPredicted (https://aspredicted.org/9yw2f.pdf). Here, we disclose a total of two deviations from the pre-registration protocol. First, the reported mediation analysis was not originally included in the protocol and was added in response to a review comment. Second, the actual sample size exceeded the pre-registered target by one participant, as explained below.
Participants
We recruited 1601 Amazon Mechanical Turk panelists via the CloudResearch platform, who participated in the study for monetary compensation. No statistical method was used to predetermine sample size. All participants provided informed consent before participating in the study.
The pre-registered target sample size was 1600. However, due to CloudResearch’s process for determining sample size, which is outside our control, the study eventually yielded 1601 participants. This deviation was anticipated and noted in our pre-registration.
The participant gender distribution is as follows: Of all 1601 participants, 901 (56.3%) self-identified as female, 676 (42.2%) as male, and 24 (1.5%) as non-binary or chose not to disclose their gender. While we did not plan for a priori gender-based analysis, we have included the results of post hoc analyses in Supplementary Method 1, in compliance with the editorial policies of the Nature Portfolio (as of November 12, 2023).
Data exclusion
Our pre-registration dictates that data would be excluded from analysis if flagged as fraudulent by Qualtrics, the survey platform used for our study. However, Qualtrics’ Expert Review function did not detect any fraud that would warrant data exclusion. Consequently, no data were excluded from the analyses.
Procedure
After providing informed consent, all participants were asked to read a brief introduction to the peer review process. The introduction read “Peer review is a process all academics need to go through if they want to get their research work published. When a researcher submits a research paper to an academic journal, the paper is subject to an independent assessment by other field experts called the reviewers (whose role we ask you to play here).”
All participants were then asked to imagine that they had recently reviewed a manuscript for an academic journal. To provide sufficient realism, this hypothetical manuscript was very loosely adapted from a 2020 paper published in Nature Communications68, selected due to its subject matter being easily understandable for laypersons. Specifically, participants were told that “This work examines the possibility that people with more emotional experience (joy, anger, distress, etc.) also have richer emotional vocabulary (i.e., words describing states of emotions) in their language usage.”
All participants were then instructed to imagine that after reviewing the manuscript, they wrote the following comments to the author of the paper:
-
Overall, the paper presents an interesting theory and is well-written.
-
The studies included in the paper are well designed and the interpretation of data is generally convincing.
-
That being said, detailed criteria on what counts as “emotional vocabulary” are lacking. For instance, the usage of such words as “alone” or “bad” does not necessarily carry emotional connotations. As a result, the inclusion of such words in data analysis may prove problematic.
-
The contribution of the work is insufficiently elaborated. To this end, the paper needs to better explain why this work helps advance what the field already knows.
Note that no “you” language was presented in these comments.
All participants were then informed that they had now received the author’s responses. The responses were otherwise identical, save for how the participants (i.e., the reviewers) were addressed. By this design, participants were randomly assigned to one of the two conditions (i.e., “you” and non-“you”). Specifically, participants in the “you” [non-“you”] condition read:
We appreciate your [the reviewer’s] comments, which we find very useful. With regard to the questions you [the reviewer] raised:
-
You [The reviewer] advised us to provide details on how emotional vocabulary is determined. Building on your [the reviewer’s] advice, we now include a thorough discussion of your [the reviewer’s] concern over this issue, and lay out the selection procedure of those words in the manuscript.
-
In this discussion, we also address your [the reviewer’s] concern that some words are not applied solely to emotional experience.
-
You [The reviewer] suggested that the contribution of this work be differentiated from existing research. Following your [the reviewer’s] suggestion, we explain how this work advances the understanding of emotions and affective language.
-
As per your [the reviewer’s] recommendation, in this revision we also further elaborate the contribution of this work in the discussion section.
The participants were unaware of their assigned condition and were not cognizant of the existence of the alternate condition to which they were not assigned. The investigators, on the other hand, were not blinded to allocation during experiments and outcome assessment.
Participants were then prompted to evaluate how personal and engaging they found the conversation to be on a 4-item, 7-point Likert scale (1 = strongly disagree; 7 = strongly agree; Cronbach’s α = 0.86): “In general, I find the conversation between the parties engaging,” “The author is engaging in a personal conversation with me,” “The correspondence between the reviewer and the author feels conversational,” and “I find the author personable.” Participants also rated the positivity of the author’s response on a single-item Likert scale: “My overall impression of the author’s response is positive.”
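The reliability of such multi-item scales is conventionally summarized with Cronbach’s α; below is a sketch of the standard formula applied to made-up ratings (the study’s four-item scale yielded α = 0.86 on the real data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a participants x items matrix of Likert ratings."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()       # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the scale total
    return (k / (k - 1)) * (1 - item_var / total_var)

# Made-up ratings from 5 hypothetical participants on a 4-item scale.
ratings = [[6, 6, 5, 6], [2, 3, 2, 2], [7, 6, 7, 6], [4, 4, 5, 4], [3, 2, 3, 3]]
alpha = cronbach_alpha(ratings)
```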
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data necessary for reproducing the results presented in this paper have been deposited in OSF (https://doi.org/10.17605/OSF.IO/XWYS4)69.
Code availability
All code necessary to reproduce our analyses is available at OSF (https://doi.org/10.17605/OSF.IO/XWYS4)69.
References
Yin, Y., Wakslak, C. J. & Joshi, P. D. “I” am more concrete than “we”: Linguistic abstraction and first-person pronoun usage. J. Pers. Soc. Psychol. 122, 1004–1021 (2022).
Biesen, J. N., Schooler, D. E. & Smith, D. A. What a difference a pronoun makes: I/We versus you/me and worried couples’ perceptions of their interaction quality. J. Lang. Soc. Psychol. 35, 180–205 (2016).
Packard, G., Moore, S. G. & McFerran, B. (I’m) happy to help (you): the impact of personal pronoun use in customer-firm interactions. J. Mark. Res. 55, 541–555 (2018).
Seraj, S., Blackburn, K. G. & Pennebaker, J. W. Language left behind on social media exposes the emotional and cognitive costs of a romantic breakup. Proc. Natl Acad. Sci. USA 118, e2017154118 (2021).
DeJesus, J. M., Callanan, M. A., Solis, G. & Gelman, S. A. Generic language in scientific communication. Proc. Natl Acad. Sci. USA 116, 18370–18377 (2019).
Chen, M. K. The effect of language on economic behavior: evidence from savings rates, health behaviors, and retirement assets. Am. Econ. Rev. 103, 690–731 (2013).
Brunyé, T. T., Ditman, T., Mahoney, C. R., Augustyn, J. S. & Taylor, H. A. When you and I share perspectives: pronouns modulate perspective taking during narrative comprehension. Psychol. Sci. 20, 27–32 (2009).
Escalas, J. E. Self-referencing and persuasion: narrative transportation versus analytical elaboration. J. Consum. Res. 33, 421–429 (2007).
Mildorf, J. Reconsidering second-person narration and involvement. Lang. Lit. 25, 145–158 (2016).
Orvell, A., Kross, E. & Gelman, S. A. “You” and “I” in a foreign land: the persuasive force of generic-you. J. Exp. Soc. Psychol. 85, 103869 (2019).
Orvell, A., Kross, E. & Gelman, S. A. How “you” makes meaning. Science 355, 1299–1302 (2017).
Orvell, A., Kross, E. & Gelman, S. A. Lessons learned: young children’s use of generic-you to make meaning from negative experiences. J. Exp. Psychol. Gen. 148, 184–191 (2019).
Orvell, A., Kross, E. & Gelman, S. A. “You” speaks to me: effects of generic-you in creating resonance between people and ideas. Proc. Natl Acad. Sci. USA 117, 31038–31045 (2020).
Packard, G. & Berger, J. Thinking of you: how second-person pronouns shape cultural success. Psychol. Sci. 31, 397–407 (2020).
Slatcher, R. B., Vazire, S. & Pennebaker, J. W. Am “I” more important than “we”? Couples’ word use in instant messages. Pers. Relatsh. 15, 407–424 (2008).
Simmons, R. A., Gordon, P. C. & Chambless, D. L. Pronouns in marital interaction: What do “you” and “I” say about marital health? Psychol. Sci. 16, 932–936 (2005).
Williams-Baucom, K. J., Atkins, D. C., Sevier, M., Eldridge, K. A. & Christensen, A. “You” and “I” need to talk about “us”: Linguistic patterns in marital interactions. Pers. Relatsh. 17, 41–56 (2010).
Smoke, T. A Writer’s Workbook: A Writing Text with Readings. (Cambridge University Press, 2005).
Webb, C. The use of the first person in academic writing: objectivity, language and gatekeeping. J. Adv. Nurs. 17, 747–752 (1992).
Hinkel, E. Objectivity and credibility in L1 and L2 academic writing. in Culture in second language teaching and learning (ed Hinkel, E.) (Cambridge University Press, 1999).
Brown, P., Levinson, S. C. & Gumperz, J. J. Politeness: Some Universals in Language Usage (Cambridge University Press, 1987).
Hwang, K. Face and favor: the Chinese power game. Am. J. Sociol. 92, 944–974 (1987).
Watts, R. J., Ide, S. & Ehlich, K. Politeness in Language (De Gruyter, 1992).
DeBono, A., Shmueli, D. & Muraven, M. Rude and inappropriate: the role of self-control in following social norms. Pers. Soc. Psychol. B 37, 136–146 (2011).
Martínez, I. A. Native and non-native writers’ use of first person pronouns in the different sections of biology research articles in English. J. Second Lang. Writ. 14, 174–190 (2005).
Davis, D. & Brock, T. C. Use of first person pronouns as a function of increased objective self-awareness and performance feedback. J. Exp. Soc. Psychol. 11, 381–388 (1975).
Raskin, R. & Shaw, R. Narcissism and the use of personal pronouns. J. Pers. 56, 393–404 (1988).
Zimmermann, J., Wolf, M., Bock, A., Peham, D. & Benecke, C. The way we refer to ourselves reflects how we relate to others: associations between first-person pronoun use and interpersonal problems. J. Res. Pers. 47, 218–225 (2013).
Lewis, M. L. & Frank, M. C. The length of words reflects their conceptual complexity. Cognition 153, 182–195 (2016).
Koppen, K., Ernestus, M. & Van Mulken, M. The influence of social distance on speech behavior: formality variation in casual speech. Corpus Linguist. Ling. 15, 139–165 (2019).
DuBay, W. H. The Principles of Readability (Impact Information, 2004).
Goel, V., Grafman, J., Tajik, J., Gana, S. & Danto, D. A study of the performance of patients with frontal lobe lesions in a financial planning task. Brain 120, 1805–1822 (1997).
Bravo, G., Grimaldo, F., López-Iñesta, E., Mehmani, B. & Squazzoni, F. The effect of publishing peer review reports on referee behavior in five scholarly journals. Nat. Commun. 10, 322 (2019).
Bybee, J. L. & Hopper, P. J. Frequency and the Emergence of Linguistic Structure Vol 45 (John Benjamins Publishing Company, 2001).
Kärkkäinen, E. Stance taking in conversation: from subjectivity to intersubjectivity. Text. Talk. 26, 699–731 (2006).
Du Bois, J. W. & Kärkkäinen, E. Taking a stance on emotion: affect, sequence, and intersubjectivity in dialogic interaction. Text. Talk. 32, 433–451 (2012).
Baumgarten, N., House, J. & Du Bois, I. Subjectivity in Language and in Discourse (Brill, 2012).
Loria, S. TextBlob Documentation Release 0.16.0. TextBlob (2020).
Pennebaker, J. W., Booth, R. J., Boyd, R. L. & Francis, M. E. Linguistic Inquiry and Word Count: LIWC2015 (Pennebaker Conglomerates, 2015).
Oppenheimer, D. M. Consequences of erudite vernacular utilized irrespective of necessity: Problems with using long words needlessly. Appl. Cogn. Psychol. 20, 139–156 (2006).
Flesch, R. A new readability yardstick. J. Appl. Psychol. 32, 221–233 (1948).
Roberts, M. E., Tingley, D., Stewart, B. M. & Airoldi, E. M. The structural topic model and applied social science. in Advances in neural information processing systems workshop on topic models: computation, application, and evaluation (2013).
Roberts, M. E., Stewart, B. M. & Tingley, D. stm: An R package for structural topic models. J. Stat. Softw. 91, 1–40 (2019).
Kenny, D. A. Interpersonal Perception: The Foundation of Social Relationships (Guilford Press, 2019).
Kenny, D. A., Kashy, D. A. & Cook, W. L. Dyadic Data Analysis (Guilford Press, 2020).
Gentzkow, M., Shapiro, J. M. & Taddy, M. Measuring group differences in high‐dimensional choices: method and application to congressional speech. Econometrica 87, 1307–1340 (2019).
Kühnen, U. & Oyserman, D. Thinking about the self influences thinking in general: cognitive consequences of salient self-concept. J. Exp. Soc. Psychol. 38, 492–499 (2002).
Gardner, W. L., Gabriel, S. & Lee, A. Y. ‘I’ value freedom, but ‘we’ value relationships: Self-construal priming mirrors cultural differences in judgment. Psychol. Sci. 10, 321–326 (1999).
Iliev, R., Dehghani, M. & Sagi, E. Automated text analysis in psychology: methods, applications, and future developments. Lang. Cogn. 7, 265–290 (2015).
Pennebaker, J. W. & Chung, C. K. Counting little words in big data: the psychology of individuals, communities, culture, and history. in Social cognition and communication (eds Forgas, J. P., Vincze, O. & László, J.) 25–42 (Psychology Press, 2014).
Humphreys, A., Wang, R. J.-H., Fischer, E. & Price, L. Automated text analysis for consumer research. J. Consum. Res. 44, 1274–1306 (2018).
Kacewicz, E., Pennebaker, J. W., Davis, M., Jeon, M. & Graesser, A. C. Pronoun use reflects standings in social hierarchies. J. Lang. Soc. Psychol. 33, 125–143 (2014).
Fortunato, S. et al. Science of science. Science 359, eaao0185 (2018).
Wang, D. & Barabási, A.-L. The Science of Science (Cambridge University Press, 2021).
Belter, C. W. Bibliometric indicators: opportunities and limits. J. Med. Libr. Assoc. 103, 219–221 (2015).
Huisman, J. & Smits, J. Duration and quality of the peer review process: the author’s perspective. Scientometrics 113, 633–650 (2017).
Elson, M., Huff, M. & Utz, S. Metascience on peer review: testing the effects of a study’s originality and statistical significance in a field experiment. Adv. Methods Pract. Psychol. Sci. 3, 53–65 (2020).
Rogers, S. L., Howieson, J. & Neame, C. I understand you feel that way, but I feel this way: the benefits of I-language and communicating perspective during conflict. PeerJ 6, e4831 (2018).
Carey, A. L. et al. Narcissism and the use of personal pronouns revisited. J. Pers. Soc. Psychol. 109, e1–e15 (2015).
Delgado, A. F., Garretson, G. & Delgado, A. F. The language of peer review reports on articles published in the BMJ, 2014–2017: an observational study. Scientometrics 120, 1225–1235 (2019).
Leung, F. F., Gu, F. F., Li, Y., Zhang, J. Z. & Palmatier, R. W. Influencer marketing effectiveness. J. Mark. 86, 93–115 (2022).
Card, D. & DellaVigna, S. Page limits on economics articles: evidence from two journals. J. Econ. Perspect. 28, 149–168 (2014).
Vieira, E. S. & Gomes, J. A. N. F. Citations to scientific articles: its distribution and dependence on the article features. J. Informetr. 4, 1–13 (2010).
Freeman, R. B. & Huang, W. Collaborating with people like me: ethnic coauthorship within the United States. J. Labor Econ. 33, S289–S318 (2015).
Hirsch, J. E. Does the h index have predictive power? Proc. Natl Acad. Sci. USA 104, 19193–19198 (2007).
Huang, W. Do ABCs get more citations than XYZs? Econ. Inq. 53, 773–789 (2015).
Ma, C., Li, Y., Guo, F. & Si, K. The citation trap: papers published at year-end receive systematically fewer citations. J. Econ. Behav. Organ. 166, 667–687 (2019).
Vine, V., Boyd, R. L. & Pennebaker, J. W. Natural emotion vocabularies as windows on distress and well-being. Nat. Commun. 11, 4525 (2020).
Cao, C. C. Behavioral consequences of second-person pronouns in written communications between authors and reviewers of scientific papers. OSF. https://doi.org/10.17605/OSF.IO/XWYS4 (2023).
Acknowledgements
C.C. is supported by the Research Grants Council of Hong Kong (13501722) and the Lam Woo Research Fund (F871223) at Lingnan University. Y.L. is supported by the Research Grants Council of Hong Kong (13503323), the Lam Woo Research Fund (LWP20020) and Faculty Research Grant (DB23A5) at Lingnan University, and the National Natural Science Foundation of China (72271060). C.M. is supported by the General Project of National Natural Science Foundation of China (72074045).
Author information
Contributions
C.M., Y.L., Z.S. and C.C. designed research; Z.S., C.C., Y.L. and C.M. performed research; Z.S., C.C. and S.L. collected and analyzed data; and Z.S., C.C. and Y.L. wrote the paper. All authors wrote, edited, and revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Reagan Mozer, James Pennebaker, Stephen Pinfield, Ariana Orvell and Yang Wang for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, Z., Cao, C.C., Liu, S. et al. Behavioral consequences of second-person pronouns in written communications between authors and reviewers of scientific papers. Nat Commun 15, 152 (2024). https://doi.org/10.1038/s41467-023-44515-1