Introduction

Relationships are a central element of human sociality. Here, we present and test a tool designed to estimate the subjectively perceived quality of a relationship between two agents (“relationship closeness”). Extensive literatures study the determinants of relationship closeness and investigate its impact on wide-ranging dimensions of human well-being including health, the incidence and resolution of conflict, and economic productivity1,2. Based on existing research, the study of relationship closeness can offer important insights into the human condition and contribute to public understanding of pressing contemporary issues such as how to build healthier, more resilient, productive, and inclusive societies3,4. Our current contribution is to introduce an improved technique for measuring relationship closeness that is low-cost to implement and well-suited to a wide range of applications.

Influential work in psychology dating back several decades has developed a range of techniques for quantifying relationship closeness. Prominent examples include the Relationship Closeness Inventory (RCI)5, the Subjective Closeness Index (SCI)5, the Love and Liking scale (LLS)6, and the Personal Acquaintance Measure (PAM)7. While these methods focus on different types or aspects of relationships and differ in their conceptual foundations, they share the common feature that their implementation requires responses to multi-item questionnaires that are sometimes quite extensive.

Our primary concern is with an offshoot from this literature, which has sought to develop more compact tools for measuring relationship closeness that can serve as valid substitutes for extensive multi-item questionnaires. Two well established and highly cited tools are the Inclusion of Other in the Self (IOS) scale8 and the Oneness scale9 which we describe in detail in the next section. Both techniques are well-known and the two key papers that introduced and popularized them had, at the time of writing, accumulated almost 9000 citations between them8,9, with only a minority of papers citing both articles. Both tools are quick and easy to implement and have been shown to accurately estimate relationship closeness as measured by extensive survey instruments5,8,7. This holds across a wide range of relationship classes, from acquaintances to close friends10. The tools have been widely used across the social and behavioral sciences especially in the disciplines of psychology and sociology11,12,13,14,15,16 and in various applied fields such as health17,18,19; there is also growing interest in new areas of application (e.g., research in economics20,21,22 or computer science23,24) where, until recently, these tools had barely been used at all.

To date, however, researchers considering using one of these tools have faced a tradeoff. Specifically, the IOS scale is more “convenient” to implement (it requires a response on just one item instead of two), but comparative testing by Gächter et al.10 has shown that the Oneness scale is the more “predictive” tool in that it correlates more strongly with other, more complex, measures of relationship closeness. Since the publication of that study10, several studies25,26,27,28,29,30,31,32 have relied on its evidence to motivate the use of the IOS scale as a good predictor of relationship closeness even though it is not the best available tool in this respect. While sacrificing accuracy for simplicity or convenience may have been a defensible tradeoff, as we demonstrate below, it is no longer necessary.

In this paper, we propose an estimation instrument which builds closely on the original IOS scale. A key feature is that we extend the tool’s response range from a 7-point to an 11-point scale. Based on this feature, we refer to our tool as the “IOS11 scale”. The primary motivation for extending the response range is that it provides a more nuanced measurement tool, with its degree of granularity more comparable to that of the two-item Oneness scale. To see why, consider a participant who responds with scores of, say, 3 and 4 on the two Oneness items. This participant receives a score of 3.5, a value not measurable on the original IOS scale. If the advantage of Oneness derives from this finer implied scale, the expanded IOS11 scale should substantially close that gap. We do not presume that finer granularity is the only plausible explanation of the differential performance between the IOS scale and the Oneness scale, however. Other contending possibilities, for example, are that the two items of the Oneness scale pick up somewhat distinct aspects of relationship closeness or that two-item estimation is inherently less noisy33,34. We address the former possibility further in the Results section. While our data shed some light on what factors may be at play, our primary objective was to test the conjecture that finer granularity might reduce the gap between the predictive performance of the Oneness scale and our IOS11 scale.
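The granularity argument can be illustrated with a short enumeration. The sketch below is only an illustration: it assumes that both Oneness items are coded 1 to 7 and that raw IOS11 responses are coded 1 to 11 (the coding is our assumption), and it counts the distinct scores each tool can produce.

```python
from itertools import product

# Assumed coding: each Oneness item (IOS and We) takes integer values 1..7.
ITEM_LEVELS = range(1, 8)

# Oneness is the arithmetic mean of the two item responses.
oneness_scores = sorted({(ios + we) / 2 for ios, we in product(ITEM_LEVELS, repeat=2)})

ios_scores = list(ITEM_LEVELS)       # original single-item IOS scale
ios11_scores = list(range(1, 12))    # assumed raw coding of the 11-point scale

print(len(ios_scores), len(oneness_scores), len(ios11_scores))  # 7 13 11
print(3.5 in oneness_scores, 3.5 in ios_scores)                 # True False
```

Under these assumptions the Oneness scale distinguishes 13 values in half-point steps, compared with 7 for the IOS scale and 11 for the IOS11 scale.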

Mindful of the rapid growth of data collection, often on a very large scale, in online environments35,36, a second innovative feature of the IOS11 scale is that we implement it via an interactive, computerized interface. The result is a simple and intuitive task suited to a range of computerized environments, from the lab to online participant pools such as Amazon MTurk or Prolific.

Following Gächter et al.10, we test the performance of the IOS11 scale by examining its correlation with a set of other well-established but more elaborate estimates of relationship closeness (RCI5, SCI5, LLS6, and PAM7) and we benchmark the performance of our tool against Oneness and the original IOS scale. The IOS11 scale thereby complements other work developing the IOS scale to suit an online study environment37,38,39. We also include a pre-registered replication of Gächter et al.’s10 Study 3 alongside our validation of the IOS11 scale. We find that the IOS11 scale elicits relationship closeness more accurately than the IOS scale and just as well as the more complex Oneness scale. We argue that our tool with its combination of high accuracy and cost-effectiveness is an attractive new approach for fast, convenient, and effective estimation of relationship closeness.

Methods

The IOS11 scale

The left-hand side of panel (a) in Fig. 1 presents the original IOS scale8. A respondent is required to say which of the seven pairs of circles best represents their relationship with another identified individual. As noted in the introduction, responses to this simple task correlate (Spearman’s \(\rho \in [0.514, 0.820]\), p < 0.001) with estimates based on considerably more complex measurement approaches10. However, the Oneness scale, which takes the average of responses on two items (the IOS scale and the We scale40, top right of Fig. 1), has been shown to outperform the basic IOS scale in its correlation with other estimates of relationship closeness10.

Figure 1

Graphical comparison of the interfaces of the IOS scale and our IOS11 scale. Panel (a) depicts the IOS scale, the We scale, and the Oneness scale. Panel (b) illustrates the IOS11 scale. The initial screen participants see when entering the elicitation is blank. For illustration purposes, we are depicting the slider at a central position in this figure.

In developing the IOS11 scale, panel (b), and for reasons already explained, we conjectured that extending the 7-point response scale of the original IOS scale might enhance its predictive accuracy. Extending the number of pairs of circles from which participants can choose, however, creates two obvious challenges. The first is how to visualize an increased number of overlapping circles without their presentation becoming too cluttered, complicated, or confusing. Secondly, we needed to decide by how many options the answer range should be extended.

We addressed the first of these challenges by developing our tool as a computerized version of the IOS scale using an interactive screen that allows participants to intuitively adjust the degree to which circles overlap. Our layout is displayed in the bottom panel (b) of Fig. 1. Participants move a slider below the circle diagram to adjust the degree to which the circles overlap. These changes to the scale do not affect the portability, ease of explanation, or the time it takes to complete the task compared to the original IOS task. The resulting tool also has the obvious attraction that the IOS11 scale can be implemented in a wide range of computerized environments supporting easy use in online surveys and online or lab experiments (it can be accessed under https://doi.org/10.17605/OSF.IO/9DBR6).

This leads us to the second consideration of how many degrees of overlap to offer. The move to a computerized environment allows, in principle, the implementation of a very fine-grained (quasi-continuous) scale.

However, some authors have suggested that using a continuous or ‘visual-analogue’ scale can be a source of noise if respondents “[are] unable to reliably make meaningful and valid fine-grained distinctions”41. Moved by this consideration, we stick with a discrete version of the task. To enhance comparability to previous studies, we kept the maximum and minimum overlap of circles identical to the IOS scale. We then chose the number of levels such that the change in distance between the centers of the circles is approximately linear and so that the original IOS levels form a subset of the extended version (see online Appendix A.2 for details). This leads to a setup with 11 relationship closeness levels as shown in the middle column of Table 1. The left-hand column of Table 1 shows how scores on the original IOS scale map into a subset of scores on the new tool. Additionally, the rightmost column of Table 1 shows how the IOS11 scale can be recoded to a 7-point scale with endpoints matching the original IOS scale for comparability.

Table 1 Comparison of the IOS and IOS11 Scales.

Procedures

We test convergent validity of the IOS11 scale by examining how well it correlates with scores obtained through a range of other measures of relationship closeness and we benchmark its performance against the original IOS scale and the Oneness scale. We employ a between-participant design, where each participant either performs the two tasks necessary to estimate Oneness (i.e., the average responses on the IOS and We scales) or completes our IOS11 task. We then explore the within-participant correlation of scores from each of the IOS, Oneness, and IOS11 scales to a series of well-established survey instruments designed to capture relationship closeness. As noted above, the different scales that we use are the RCI5, the SCI5, the LLS6 as well as the PAM7.

Note that some of these measures were constructed to capture different specific degrees of relationships (e.g., the RCI5 explicitly refers to romantic relationships, whereas the PAM7 was designed for acquaintances). However, from a behavioral scientist’s perspective, it is useful to have a general-purpose and portable measurement tool that can be reliably used in a range of relationships. For that reason, following Gächter et al.10, we employ a between-subject variation where participants were asked to either consider a very close person; a friend; or an acquaintance across all of the core questions within the study. Hence, our main experiment can be considered a two-by-three treatment design varying Oneness and IOS11 tasks on the one hand and the type of relationship considered on the other. Since we borrow from Gächter et al.10 when testing the validity of the IOS11 scale, our hypotheses as well as the statistical analyses closely follow their work.

We presented the instruments eliciting relationship closeness in random order, followed by questions regarding demographics and other individual attributes. Further, to ensure salience of the considered person throughout the study, we asked participants at the beginning of the experiment to provide the initials of the person they were thinking of. These initials were then inserted in all parts where the instructions explicitly refer to another person. We also asked each participant to rate a stranger via either the Oneness scale or the IOS11 scale to examine individual-level variation in interpretation of the scale. This showed limited evidence of any consistent demographic determinants (see online Appendix A.1). The full instructions and details of the various measures of relationship closeness employed as benchmarks are in the online Appendix B.

We pre-registered our study (https://www.socialscienceregistry.org/trials/7947) and collected data online in July 2021 using the survey software Qualtrics42. Our pre-registration includes a description of the experimental design, the targeted sample size as well as the key variables of interest. Although the pre-registration did not set out a detailed plan for data analyses, as our approach replicates and extends Gächter et al.10 we follow their statistical analyses. The study was approved by the Nottingham School of Economics’ Research Ethics Committee.

We recruited 751 participants with N ≈ 125 per treatment using Prolific’s UK sample (the exact numbers of participants in each treatment are in Fig. 2). All participants completed an informed consent form at the start of the study and all methods in this study were conducted in accordance with relevant guidelines for the ethical treatment of human participants. The mean age of our participants is 35.22 years (SD = 13.86, Mdn = 32, Min = 17, Max = 75) with 501 (67%) identifying as female, 242 (32%) identifying as male, and 8 participants not revealing their gender. The sample includes 29% students and 56% of the participants are in either full- or part-time employment. Using an online participant pool such as Prolific therefore provided us with a more heterogeneous demographic than a student sample would have. We also obtained additional survey data on other demographics directly from Prolific, including age, gender, education levels, and details about the participant’s household. We paid a flat fee of £1.20 per participant and the study took about 15 minutes to complete.

Figure 2

Relationship levels, elicitation tools and recorded scores. In each panel, we present scores of the IOS scale, the We scale, the Oneness scale and the IOS11 scale from top to bottom. The Oneness scale is the arithmetic mean of responses on the IOS and We scale. The IOS11 scores are recoded as defined in Table 1. The boxplots capture the median and the interquartile range. The whiskers range from the 10th to the 90th percentile. Each circle in the distribution plot captures a unique observation. Different relationship levels are presented in three distinct panels. (a) Close person; (b) friend; (c) acquaintance.

Results

As a first descriptive benchmarking of the IOS11 scale against the IOS and Oneness scales, we examine the reported relationship closeness scores across different treatments. All analyses below utilize the recoded scores for the IOS11 scale (as per final column of Table 1) to allow for direct comparisons between methodologies. However, our results are also robust when using the IOS11 scale without recoding the scores.

Figure 2 plots scores of the IOS and We scale, the Oneness scale (the arithmetic mean of responses on the IOS and We scale), and the IOS11 scale for each level of relationship. The box plots capture the interquartile range for each estimate and the underlying distributions are indicated by the circles above the boxes. The different colors indicate whether the person thought of was a close person (dark blue), a friend (blue) or an acquaintance (light blue). The different scales (IOS, We, Oneness, and IOS11) for each relationship level are then presented in separate bars from top (close person, panel a) to bottom (acquaintance, panel c).

Figure 2 shows that for all four instruments, there is clear and coherent variation in reported closeness comparing different relationship levels. Based on pairwise Kolmogorov–Smirnov (KS) tests, participants who considered a close person reported significantly higher scores than those who considered a friend (\(D_{\text{IOS}} = 0.379\); \(D_{\text{We}} = 0.440\); \(D_{\text{Oneness}} = 0.446\); \(D_{\text{IOS11}} = 0.340\); p < 0.001) and scores for those considering an acquaintance were lower still (\(D_{\text{IOS}} = 0.532\); \(D_{\text{We}} = 0.352\); \(D_{\text{Oneness}} = 0.431\); \(D_{\text{IOS11}} = 0.589\); p < 0.001). Moreover, the figure also shows that reported levels of closeness are similar across methods. Notwithstanding this general coherence, Fig. 2 reveals some differences across the distributions of scores for different methods, most notably in the comparison of IOS and We scale scores.

Notice that for ratings of a close person, the interquartile range and median value for the We scale lie to the right of that for the IOS scale reflecting, in part, a markedly stronger tendency for participants to record maximum values on the We scale, relative to the IOS scale (D = 0.237; p = 0.001 for KS test comparing the two distributions). This is suggestive evidence that IOS and We scales may, to some extent, be capturing different aspects of relationship closeness and, if they are, this could be part of the explanation for why the Oneness scale, which combines the two scales, has tended to psychometrically outperform the IOS scale alone. Notice, however, that relative to the IOS scale, at the eyeball level the distribution of the IOS11 scale more closely resembles the distribution of the Oneness scale.

Based on KS tests, the IOS11 and Oneness scores are statistically indistinguishable from each other for a close person (D = 0.142; p = 0.164); a friend (D = 0.063; p = 0.966); and an acquaintance (D = 0.158; p = 0.095). To the extent that the Oneness scale outperforms the IOS scale in tracking other estimates of relationship closeness, these results suggest the possibility that the IOS11 scale might close some of that performance gap.
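For readers unfamiliar with the D statistic reported above, the following is a minimal sketch of the two-sample KS statistic: the largest absolute gap between the two empirical cumulative distribution functions. The function name and the example responses are hypothetical and are not taken from our data; the sketch does not compute the associated p-value.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: maximum absolute difference between the ECDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is sufficient.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Made-up 7-point responses, for illustration only.
close_person = [7, 6, 7, 5, 6, 7]
friend = [4, 5, 3, 5, 4, 6]
print(round(ks_statistic(close_person, friend), 3))  # 0.667
```

A value of D near 0 indicates nearly identical distributions, whereas a value near 1 indicates almost no overlap between them.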

Table 2 reports within-participant Spearman’s rank correlations between IOS, Oneness and IOS11 (columns) and a set of nine benchmark scores obtained from distinct scales (rows) with darker shades of blue indicating stronger correlations. Columns 1 to 3 display the results for the IOS scale, the Oneness scale, and the IOS11 scale from our study, whereas columns 4 and 5 reproduce results for the IOS scale and the Oneness scale from Gächter et al.10 for comparison. The first row reports correlations with the overall RCI benchmark score and the next three rows report correlations with its three sub-components (frequency, diversity, and strength)5. “Love” and “Like” scores are two elements of LLS6. The final row reports correlations with an Index of relationship closeness (IRC); this is a single index developed by Gächter et al.10 but derived from the set of other benchmark scores5,8,7 using a principal components analysis10.
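As a reminder of how a Spearman rank correlation is obtained, the sketch below ranks both variables (tied values share their average rank, which matters for short discrete scales) and then computes the Pearson correlation of the ranks. The function names and the example values are illustrative assumptions and do not reproduce our analysis code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up closeness scores and benchmark scores, for illustration only.
print(round(spearman_rho([1, 2, 2, 3], [1, 3, 2, 4]), 3))  # 0.949
```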

Table 2 Correlations across scores obtained by relationship scales.

Across the table, we find moderately strong to strong correlations throughout; all are statistically significant at the 1% level. Table 3 reports pairwise tests of differences between correlation coefficients (IOS scale vs. Oneness scale; IOS scale vs. IOS11 scale; IOS11 scale vs. Oneness scale).

Table 3 Pairwise comparisons of correlation coefficients.

Table 2, combined with the tests presented in Table 3, reveals three broad patterns. First, correlations between Oneness and the various benchmark scores tend to be systematically higher than those between the benchmarks and the original IOS scale (in Table 3, comparing the IOS scale with the Oneness scale, there are two cases where the correlation is significantly higher for the Oneness scale, at the 5% level or better, and none in the opposite direction). Second, the IOS11 scale outperforms the original IOS scale (in Table 3, there are three cases where the IOS11 scale has a significantly higher correlation with a comparator benchmark, at the 5% level or better, and no cases where IOS performs better). Third, we find no significant differences when comparing the correlations between the Oneness scale and the IOS11 scale for each of the nine benchmark scales (in Table 2, across the nine benchmarks, differences go in both directions, but they are never significantly different at the 5% level and few of the p-values in the final column of Table 3 are close to significance at any conventional level).

The three broad patterns just identified each hold for the IRC: this is meaningful because the IRC is arguably the most informative of the benchmarks (by virtue of being the principal component of the larger set of estimates). More specifically, based on results reported in the final row of Table 3, we replicate the finding of Gächter et al.10 that the Oneness scale outperforms the IOS scale in terms of its correlation with the IRC (z = −2.085; p = 0.037 in Table 3); we see that the correlation of the IOS11 scale with the IRC is stronger than that for the original IOS scale (z = −2.252; p = 0.024); and it is statistically indistinguishable from the Oneness scale (z = −0.155; p = 0.876). Since scores three to five are identical in the IOS and the IOS11 scale, we replicate Table 3 by excluding participants with these scores. We find that all our results are robust (see online Appendix A.3).
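To make the link between a z statistic and its p-value concrete, the sketch below converts z to a two-sided p-value under the standard normal distribution. It also includes the textbook Fisher r-to-z test for two correlations from independent samples. This is a hedged illustration only: the text above does not state which test underlies Table 3, the function names are hypothetical, and the Fisher test shown would not be appropriate for correlations computed on the same participants (such as IOS versus Oneness).

```python
import math

def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two correlations from
    independent samples, using Fisher's r-to-z transformation (assumed test)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error

# z statistics quoted in the text for the IRC benchmark.
for z in (-2.085, -2.252, -0.155):
    print(z, round(two_sided_p(z), 3))
```

The p-values produced this way agree with those quoted in the text up to rounding of the reported z statistics.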

It is also worth noting that, overall, we replicate the evidence from Gächter et al.10 in finding correlation coefficients that very closely mimic the original results. This is noteworthy as we utilized a different study population (US vs. UK) on different platforms (MTurk vs. Prolific), and a substantive amount of time has passed since the original data collection (2014 vs. 2021).

Based on these results, we summarize our main finding as follows: In terms of convergent validity, our tool, the IOS11 scale, matches the performance of the Oneness scale in terms of its correlation with a set of scores obtained through established estimates of relationship closeness, but it does so whilst maintaining the simplicity of the single-item IOS scale.

Discussion and conclusion

In this paper, we have introduced the IOS11 scale as a tool for eliciting relationship closeness. The primary advantage of the IOS11 scale lies in addressing the issue that, until now, researchers considering using IOS-like scales have faced a tradeoff between the simplicity of the single-item IOS scale and the added accuracy of the two-item Oneness scale. The IOS11 scale resolves this tension by offering a new 11-point version of the IOS scale which, according to our results, is statistically indistinguishable from the Oneness scale in terms of its ability to track a range of more complex questionnaire-based estimates of relationship closeness5,8,7. For those considering the use of some IOS-style tool, the IOS11 scale provides a convenient, highly portable, and efficient method for the elicitation of relationship closeness in any computerized environment.

Our study also complements ongoing research developing estimation techniques for relationship closeness37,38,39. Two of these studies develop online versions of the IOS scale using a continuous scale and, like us, conjecture that a more fine-grained tool may increase precision38,39. A third study compares scores obtained from the standard IOS scale with a continuous version and a step-choice version37. Using a within-participant design, the authors conclude that a continuous version is least likely to suffer from a no overlap bias, where participants avoid selecting the pair of circles without overlap. However, none of the three papers benchmark to the Oneness scale or the RCI5, SCI5, LLS6, or PAM7.

Previous studies utilizing scales from the IOS family have also investigated other psychometric properties, such as test–retest reliability, convergent validity, and predictive validity. In the original paper that introduced the IOS scale8, the authors find a high correlation (r = 0.83) across a two-week test–retest, and strong evidence of convergent validity (0.09 ≤ r ≤ 0.45 with other estimates of relationship closeness) and discriminant validity (r = 0.09 with a methodologically similar, but conceptually unrelated measure)8. Similarly, for the Oneness scale, previous work found strong evidence of test–retest reliability (r = 0.93)21 across two weeks and convergent validity (0.36 ≤ Spearman’s ρ ≤ 0.58) with other estimates of relationship closeness10. Whilst we find clear similarities between the estimates of relationship closeness as revealed by the IOS11 and IOS scales and the Oneness scale, in terms of their correlations with other estimates of relationship closeness, future work could usefully explore other psychometric properties of the IOS11 scale, including test–retest reliability, discriminant validity, convergence of self- and partner-report, or its validity in predicting other meaningful behavior.