Introduction

Diversity in scientific teams is often a catalyst for creativity and innovation (Misra et al., 2017; Smith-Doerr et al., 2017), and numerous studies have documented that gender diversity, the equitable representation of genders, is important for the development, process, and outcomes of scientific teams (Bear and Woolley, 2011; Hall et al., 2018; Misra et al., 2017; Riedl et al., 2021; Smith-Doerr et al., 2017; Woolley et al., 2010). Furthermore, research has found evidence that a higher proportion of women on a team increases collective intelligence (Riedl et al., 2021; Woolley et al., 2010), and that gender-balanced teams lead to the best outcomes for group process (Bear and Woolley, 2011; Carli, 2001; Taps and Martin, 1990). When scientists hear that the proportion of women influences team performance, they often ask “What proportion is needed, and why does the proportion of women impact team success?”

The answers to these questions remain unclear. To date, most research on the impact of gender composition on scientific teams uses only quantitative metrics (e.g., comparing team rosters and bibliometric data) (Badar et al., 2013; Lee, 2005; Lerback et al., 2020; Pezzoni et al., 2016; Wagner, 2016; Zeng et al., 2016). Although these quantitative metrics provide a reasonable starting point, they emphasize the presence of women rather than their levels of integration or participation, which may perpetuate tokenism on scientific teams. As Smith-Doerr et al. (2017) reported:

Our journey through the literature demonstrated a critical difference between diversity as the simple presence of women and minority scientists on teams and in workplaces, and their full integration (p. 140).

Similarly, Bear and Woolley (2011) conducted a meta-analysis of the literature from multiple disciplines and found that when diverse team members were integrated holistically, team diversity contributed to innovation. Conversely, in studies where teams had diverse membership but failed, these teams were often relying on token members and did not have authentic and full integration of those diverse members. Bear and Woolley (2011) suggest that the proportion of women on a team roster should be studied as follows:

It is not enough to simply examine the number of women in a particular institution or role. … In order to be truly effective, the role that women play in scientific teams should also be taken into consideration and promoted in order to yield the substantial benefits of increased gender diversity (p. 151).

These recent studies signal a paradigm shift in how the literature perceives diversity on teams, because historically, diversity on teams was perceived as negative. Baugh and Graen (1997) described how teams with women and minorities were perceived to be less effective. Benschop and Doorewaard (1998) described how teams simply (re)produce gender inequality, and they did not see a future in teams providing opportunities for women. Guimerà et al. (2005) claimed that while diversity may potentially spur creativity, it typically promotes conflict and miscommunication. Today, studies in the science of team science (SciTS), knowledge innovation, creativity, and related fields have documented that diversity in teams is important for the process, interactions, and outcomes needed to create new knowledge and solve complex global problems (Bear and Woolley, 2011; Hall et al., 2018; Misra et al., 2017; Riedl et al., 2021; Soler-Gallart, 2017; Ulibarri et al., 2019; Woolley et al., 2010).

Numerous researchers have called for varied approaches to the study of women on teams. Madlock-Brown and Eichmann (2016) wrote that we “need a multi-pronged approach to deal with the persisting gender gap issues” (p. 654). Bozeman et al. (2013) explained that we understand collaboration from a bibliometric standpoint, but much more qualitative research is needed about the meaning of collaboration and the more informal side of collaboration, including mentoring, ingrained biases, and balancing collaborations (Reardon, 2022). Further, many of the existing studies about women on teams were conducted with undergraduate students within curricular settings, not with real-world scientific teams. Fundamentally, to understand gender patterns in scientific collaborations, qualitative and mixed methods research approaches are needed that study the process of scientific team development and not just team outcomes (Keyton et al., 2008; Wooten et al., 2014).

Hypotheses

This study focused on 12 interdisciplinary university scientific teams that were part of an institutional team science program from 2015 to 2020 aimed at cultivating, integrating, and translating scientific expertise. Team science is research conducted collaboratively by small teams or larger groups (Cooke and Hilton, 2015). The program included multiple forms of evaluation, including participant observation, focus groups, interviews, and surveys at multiple time points. More specifically, gender diversity was explored by using mixed-methods data from team interactions to investigate two primary research questions: (1) what is the role of women on scientific teams? and (2) how do women impact team interactions?

Members of the 12 teams completed social network surveys about their relationships, including who they seek advice from, who is a mentor, who serves on student committees, who they learn from, and who they collaborate with. Social network analysis studies the behavior of the individual at the micro level, the pattern of relationships (network structure) at the macro level, and the interactions between the two (Stokman, 2001). In the context of team science, social network analysis provides insights into how interactions are related to team success and how the social processes teams use support the knowledge-creation process (Cravens et al., 2022; Giuffre, 2013; Granovetter, 1977; Love et al., 2021; Zhang et al., 2020). Using these data, we calculated the indegree of each team member in each relationship network. Indegree quantifies the number of other team members who stated they had the selected relationship with the given individual. For example, the advice indegree counts the number of other team members who reported receiving advice from that person. To compare results across the teams, the indegree and outdegree measures were scaled by the number of respondents to account for the total number of possible connections for individuals. These social network measures allowed us to test five hypotheses, based on the current literature in team science and other disciplines, about how women impact team interaction and collaboration.

Hypothesis 1: Women faculty will have a higher indegree than men faculty within the mentoring and student committee networks. Men faculty members will have a higher indegree than women faculty members in the advice and leadership networks.

Hypothesis 2: Men at all career stages will be more likely to be considered a leader on the team than women, measured by having a higher average scaled indegree in the leadership network.

Hypothesis 3: Various networks will be correlated as follows:

  a. Leadership and advice networks will be positively correlated.

  b. Mentoring networks will not be positively correlated with leadership or advice networks.

  c. Mentoring and student committees will be correlated.

Hypothesis 4: The social and collaboration relations will be more positively correlated for women than for men.

Hypothesis 5: Non-faculty team members will have more social connections on teams with a senior woman relative to those on teams without a senior woman.

These hypotheses are grounded in the literature on the persistent, latent, and subtle ways gender inequality is reproduced within organizations (Acker, 1992; Benschop and Doorewaard, 1998; Cole, 2004; Fraser, 1989; Gaughan and Bozeman, 2016; Madlock-Brown and Eichmann, 2016; Sprague and Smith, 1989). Many theories regarding the impact of gender diversity assume that teams reproduce socialized patterns of behavior. West and Zimmerman (1987) wrote that gender is not a biological concept but a social construction that “involves a complex of socially guided perceptual, interactional, and micropolitical activities that cast particular pursuits as expressions of masculine and feminine ‘natures’.” Gender is thus created by social organization and performed in our everyday lives and the ways we interact with one another (Butler, 1988). Gender, albeit a social construct, is an influential schema that impacts behaviors and interactions in society (West and Zimmerman, 1987).

According to West and Zimmerman (1987) and Butler (1988), the process of gender socialization includes ideas about who is a leader, how leaders should act, and even what leaders should look like. Many studies have found that women may not be perceived as leaders even when their status or contributions to the team are high (Bunderson, 2003; DiTomaso et al., 2007; Humbert and Guenther, 2017; Joshi, 2014). Other studies have found that men were more influential in groups, even when they were in the minority (Craig and Sherif, 1986), and that teams with women and minorities were perceived to be less effective (Baugh and Graen, 1997). Furthermore, although leadership responsibilities often become attached to specific roles, they can also be conferred and performed based on the perception of the individual qualities or capabilities of team members (Butler, 1988). For example, if a woman is a principal investigator (PI), a man on the team may also be considered a leader and vice versa. These conferred roles may impact individual responsibilities and further solidify the perception of who is the team leader.

Perceptions about the roles of women and men can also impact the responsibilities they are assigned during meetings and the duties they are expected to perform in the workplace. In academia, faculty are frequently expected to engage in service work to support the university, the discipline, and the community. Service work may include mentoring, advising, and serving on committees. Recent studies support what has long been perceived within academia: when controlling for rank, race/ethnicity, and discipline, women spend significantly more hours on service work than their male colleagues (Guarino and Borden, 2017; Misra et al., 2011; Urry, 2015). In STEM disciplines, women spend a higher percentage of their time on mentoring than their male counterparts (21% for women vs. 15% for men) (Misra et al., 2011). Researchers have not yet explored whether team science exacerbates or mitigates this disparity in service work.

The literature has documented that collaboration patterns are different for women and men. Women faculty and students participate in more interdisciplinary research in almost all fields at every career stage (Rhoten and Pfirman, 2007). In addition, women tend to have more collaborators than men (Bozeman and Gaughan, 2011), and studies have found that being well-connected correlates with success for women (Madlock-Brown and Eichmann, 2016). Is it possible that having a senior woman on the team creates a culture of collaboration, such that non-faculty, who might traditionally be marginalized on a team, are better connected? We evaluate this here by comparing the connectedness of non-faculty on teams with and without a senior woman.

In part, the lack of understanding about why gender diversity matters on scientific teams results from primarily studying member demographic profiles rather than studying how teams function, including exchanges of knowledge, power dynamics, and the team development process, which is critical to team success (Smith-Doerr et al., 2017). This study moves beyond team composition to examine real-world scientific teams through analysis of relational data to answer two questions: What is the role of women on scientific teams? How do women impact team interactions?

Methods

Sample

This study was conducted at a land grant, R1 University in the western region of the United States. The primary sample for this study was 12 self-formed, interdisciplinary scientific teams with varied research foci, who were participants in a competitive university-funded team science program from 2015 to 2020. To apply for funding, each team submitted a written application and competed in a pitch fest (a brief oral presentation of their proposed project) that was followed by an intensive question and answer session by the review team. The topics for the interdisciplinary teams that were selected were broadly defined across STEM-related fields. The teams were expected to contribute to high-level program goals, which included:

  1. Increase university interest in multi-dimensional, systems-based problems

  2. Leverage the strengths and expertise of a range of disciplines and fields

  3. Shift the funding landscape towards investing in team science/collaborative endeavors

  4. Develop large-scale proposals; high caliber research and scholarly outputs; new, productive, and impactful collaborations

These overarching goals were measured by having the teams report on a variety of outcome metrics, including publications, proposals submitted, and awards received.

Participation in the team science program occurred through two cohorts and lasted 24–30 months for each cohort. However, a team in the second cohort left the program after 12 months. During the program, teams met with administrative leadership, the team science research team, and some external partners every 3–4 months to provide progress updates on stated milestones and receive feedback and mentorship. Additional support was provided through individualized trainings/workshops approximately every few months throughout the program. These sessions provided additional instruction on team science principles, social network analysis interpretation, marketing/branding, diversity and inclusion, opportunity identification, philanthropic fundraising, technology transfer, visioning, and team management/leadership. Some of the training was attended by multiple teams, but often these were specifically designed for the needs and developmental stage of each team. An additional team volunteered to participate in the study but was not part of the formal program. This team, also self-formed, was an interdisciplinary team that had received a large award through a federal grant. The 13 teams were randomly assigned a number from 1 to 13 to maintain anonymity and are referred to in this study by their team number. Team 2 was excluded from the study altogether because two of the authors were members of this team.

Data collection

Multiple types of evaluation data were collected at multiple points in time throughout the university-based team science program, including participant observation, focus groups, turn-taking data, rosters, interviews, and surveys. This study utilized the resulting data from rosters, participant observation, field notes, and responses to a social network survey. The social network data for this article are from surveys administered at the conclusion of the program or the closest associated data point. Selecting data from a similar timepoint follows the recommendations of Wooten et al. (2014), who differentiated between development, process, and outcome metrics for scientific teams.

Rosters

Teams submitted rosters with demographic information including name, email, self-identified gender, title, college, department, and role on the team (e.g., PI, member, graduate student). Rosters were updated annually during the program and provided the data used to define senior woman, junior faculty, and other demographic categories.

Social network survey

Each team member on the roster was sent an email after the program end date and was asked to complete an online social network survey that had two sections: demographic and social network relational questions (see Appendix Table 2). Following IRB protocol #19-8622H, participation was voluntary, and all subjects were identified by name on the social network survey to allow for the complete construction of social networks. Names were deleted prior to data analysis and result reporting.

To ensure that respondents had the option to select a self-identified gender, the social network survey included a demographic question that asked participants to self-identify their gender by filling in a blank space rather than choosing from a prescribed drop-down list. This was the gender attribute used for analysis in this article. Two respondents did not answer the gender demographic survey question, and the roster data was used for these participants. There was no variability in the level of missingness across questions. Respondents either completed the survey or did not.

The network survey’s relational questions asked about the presence and absence of interactional mentoring, advice, leadership, and collaborative relationships with other members of the team. The first set of questions was developed by the research team primarily to collect information about scientific collaborations since joining the team. The survey asked, who have you:

  • talked about possible joint research/ideas/concepts/connections

  • worked on research, collaborations, tech projects, or consulting projects

  • worked on joint publications, presentations, or conference proceedings

  • worked on or submitted a grant proposal

  • sat on a student’s committee together (or is a member of your thesis/dissertation committee)

The second set of questions focused on social relationships within the team, including:

  • I learn from [this person]

  • I seek advice from [this person]

  • I hang out with [this person] for fun

  • [this person] is a leader on the team

  • [this person] is a mentor to me

  • [this person] is a friend

  • [this person] energizes me

Participant observation and field notes

A researcher attended two to six team meetings for each team to collect observational data. There were two exceptions to this as Team 1 did not have face-to-face team meetings, precluding participant observation; and Team 5 did not consent to observation at their meetings. After the meetings, the researcher recorded field notes to provide qualitative insights into the progress of the team development, their patterns of collaboration, and gender interactions as suggested by Marvasti (2004). The field notes supported the development of the senior women classification (see Appendix Table 1 for classification definitions). In addition to roster information, many teams had separate leadership teams that met and determined the scientific direction of the team. If a team had a woman on the leadership team, as recorded in field notes, then they received the designation of having a senior woman.

Statistical analysis

RStudio (RStudio Team, 2020) was used to analyze the social network data. The data were summarized using outdegree, indegree, and average degree. The outdegree of an individual is a measure of how many other team members they indicated receiving advice, mentorship, etc. from on the team. Conversely, the indegree of an individual is a measure of how many other team members reported receiving advice, mentorship, etc. from that person. Average degree is the average number of immediate connections (i.e., indegree plus outdegree) for a person in a network (Giuffre, 2013; Hanneman and Riddle, 2005). To compare results across the teams, the indegree and outdegree measures were scaled by the number of respondents to account for the total number of possible connections for individuals (which is a function of both team size and response rate). The scaled indegree is thus the proportion of the responding team members who named that team member for a given category. For example, if a team member has a scaled mentor indegree of 0.10, then 10% of the responding team members consider this individual to be a mentor. Confidence intervals for scaled indegrees were calculated using a t-distribution due to limited sample size.
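As an illustration of the scaling described above, the following minimal Python sketch computes a scaled indegree from survey nominations. It is not the R code used in the study. The function name, the toy roster, and the choice to divide by the full number of respondents (rather than, for example, excluding the nominee from the denominator) are assumptions made for the example.

```python
from collections import Counter

def scaled_indegree(nominations, roster):
    """Return each roster member's indegree divided by the number of respondents.

    nominations: dict mapping each respondent to the set of team members
                 they named for one survey question (hypothetical structure).
    roster:      list of all team members, including non-respondents.
    """
    n_respondents = len(nominations)
    counts = Counter()
    for respondent, named in nominations.items():
        for person in named:
            if person != respondent:  # ignore self-nominations
                counts[person] += 1
    return {member: counts[member] / n_respondents for member in roster}

# Toy team of five; member "E" did not respond to the survey.
roster = ["A", "B", "C", "D", "E"]
mentor_nominations = {
    "A": {"B"},
    "B": set(),
    "C": {"A", "B"},
    "D": {"B"},
}
mentor_scaled = scaled_indegree(mentor_nominations, roster)
print(mentor_scaled["B"])  # 0.75: three of the four respondents named B
```

Dividing by the number of respondents rather than by the roster size keeps teams with different response rates comparable, which is the motivation given in the text.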

The social relation question set responses were also analyzed separately and then combined for further statistical analysis. Three measures were created: collaboration, social, and professional support. To create the measure called collaboration, the following questions were combined: worked on research, collaborations, tech projects, or consulting projects; worked on joint publications, presentations, or conference proceedings; worked on or submitted a grant proposal. To create the measure called social, the measures I hang out with [this person] for fun and [this person] is a friend were combined. Finally, to create the measure called professional support, the measures I seek advice from [this person], [this person] is a mentor to me, and sat on a student’s committee together (or is a member of your thesis/dissertation committee) were combined (see Appendix Table 2 for Terms and Associated Survey Questions).
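The text does not state the exact rule used to combine questions into a measure. One plausible reading, sketched below in Python, is that a tie exists in the combined measure if the respondent reported it on any of the constituent questions (a set union). The function name and toy data are hypothetical, and the union rule is an assumption rather than the documented method.

```python
def combine_networks(*networks):
    """Union several nomination networks into one combined measure.

    Each network maps a respondent to the set of members they named.
    A tie is kept if it appears in at least one network (assumed rule).
    """
    combined = {}
    for network in networks:
        for respondent, named in network.items():
            combined.setdefault(respondent, set()).update(named)
    return combined

# Toy responses to the two questions that make up the "social" measure.
hang_out = {"A": {"B"}, "B": {"C"}}
friend = {"A": {"C"}, "B": {"C"}, "C": set()}

social = combine_networks(hang_out, friend)
print(social["A"] == {"B", "C"})  # True: A named B on one question and C on the other
```

An intersection rule (a tie must appear on every constituent question) would be stricter and would give sparser combined networks.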

In addition, data from the social network relational questions were used to construct multiple social network diagrams, wherein nodes represent the team members, and an edge exists from participant A to participant B if A perceived a relation with B. For example, in the mentorship network, a link from A to B signified that A considered B to be a mentor.

Field notes were analyzed using a constant comparative method (Mathison, 2013) to provide qualitative insights into the progress of overall and individual team development, patterns of collaboration, and gender interactions as suggested by Marvasti (2004).

Classifications

For analysis purposes, three classifications were created from the demographic data. Senior woman indicates there was a woman PI or a woman on the leadership team. Faculty were defined as assistant, associate, and full professors. Non-faculty were defined as undergraduate students, graduate students, postdocs, research associates, community partners, and project managers. In the study, 78.5% of faculty and 77.6% of non-faculty completed the survey (see Appendix Fig. 1 for more details on response rate and Appendix Table 1 for terms and definitions).

Results

Demographic data

Most of the 204 team members, 160 (78.2%), completed the survey. By gender, 84% of women and 73% of men completed the survey. Table 1 provides demographic data by team number. Team size ranged from 6 to 30 members, with an average of 15. The university had seven colleges, and all teams had representation from three to seven colleges.

Table 1 Team demographic information and survey response rates.

Hypotheses testing

Test results of the five study hypotheses are presented below.

Hypothesis 1: Women faculty will have a higher indegree than men faculty within the mentoring and student committee networks, and men faculty members will have a higher indegree than women faculty members in the advice and leadership networks.

The first hypothesis was designed to investigate if women were perceived to be doing more service work and emotional labor (mentoring and student committee networks), and men were perceived as being leaders (leader and advice networks) (Guarino and Borden, 2017; Misra et al., 2011; Urry, 2015).

Figure 1 compares the average indegree values of men and women on each team in four social network diagrams (mentoring, student committees, leader, and advice). The data in Fig. 1 do not support the hypothesis that more team members went to women faculty for mentoring and for serving on student committees. Further, the data did not support that more team members went to men faculty for advice or reported viewing them as leaders.

Fig. 1: For each team and each social network, the average scaled indegree was computed for the women and the men team members. These are plotted against one another, where the size of the dot reflects the number of team members that completed the survey. When the number of respondents is low (a small dot), the scaled indegree is expected to be more variable, whereas when the number of respondents is high (a large dot), the scaled indegree is expected to be less variable and more representative of the whole team’s perceptions. Each graph reports a different social network question (mentor, student committee, advice, and leader).

The Fig. 1 mentoring network does, however, illustrate that teams in the study either engaged or did not engage in mentoring. On teams where women had a high mentoring indegree, men also had a high indegree in the mentoring network. This indicates that mentoring was team-specific rather than gender-specific. This aligns with other studies about team processes that found team norms (like mentoring) impact the behaviors and processes of teams (Duhigg, 2016; Winter et al., 2012).

Hypothesis 2: Men at all career stages are more likely to be considered a leader on the team than women, measured by having a higher average scaled indegree in the leadership network (Table 2).

Table 2 Average scaled indegree of faculty and non-faculty in the who is a leader social network, accompanied with a 95% confidence interval.

Literature in business, political science, and sociology reports that men are more likely to be perceived as leaders (Baugh and Graen, 1997; Bunderson, 2003; Craig and Sherif, 1986; DiTomaso et al., 2007; Humbert and Guenther, 2017; Joshi, 2014). Based on this, we hypothesized that these perceptions would also be present in scientific teams (Table 2, Fig. 2). In the study, both men faculty and men non-faculty were more likely to be reported as a leader on the team; however, this finding was not statistically significant based on a 95% confidence interval (CI) (Table 2).

Fig. 2: For each team, the average scaled indegree in the leader network for women and men, and faculty and non-faculty was computed. The values for men and women for each of the faculty types are plotted against one another. Faculty were more likely to be considered leaders than non-faculty, but there were no significant differences between reporting men or women as leaders on scientific teams.

Figure 2 illustrates the scaled indegree for women and men faculty and non-faculty, which shows faculty are more likely to be considered leaders than non-faculty. Nevertheless, there were no significant differences in whether team members reported men or women as leaders on scientific teams.

Hypothesis 3: Based on socialized gendered perceptions, various networks will be correlated as follows:

  1. Leadership and advice networks will be positively correlated.

  2. Mentoring networks will not be positively correlated with leadership or advice networks.

  3. Mentoring and student committees will be correlated.

The third hypothesis focused on whether gendered perceptions resulted in certain network diagrams being correlated. Previous studies have found that men are more likely to be perceived as leaders (Baugh and Graen, 1997; Bunderson, 2003; Butler, 1988; Craig and Sherif, 1986; DiTomaso et al., 2007; Humbert and Guenther, 2017; Joshi, 2014) and women are more likely to be perceived as mentors or caretakers (Guarino and Borden, 2017; Misra et al., 2011; Urry, 2015). These perceptions are sedimented in the language used to describe men and women (Sprague and Massoni, 2005).

Fig. 3: Each node (circle) represents one of the social networks, and the thickness of the edge between two circles is proportional to the average Pearson correlation between the given networks across all teams. The advice, leader, and mentor networks were highly correlated with one another but only weakly correlated with the student committee network.

Based on this literature, we hypothesized that the leadership and advice networks would be correlated because both leading and giving advice suggest a greater power differential. Second, the mentoring network would not be correlated with leadership or advice networks because mentoring is more closely aligned with caregiving activities, which are considered more feminine. Third, the mentor and student committee networks would be correlated because these acts are associated with caretaking. Here, we tested if the networks related to leadership were correlated and if networks related to mentorship and service work such as serving on student committees were correlated.

Figure 3 illustrates the correlations for four of the network diagrams (mentoring, student committee, advice, and leadership) and reports the significance. The first gendered perception, that the leadership and advice networks would be correlated, was validated by the data. In the study, the leadership and advice networks were correlated (0.83). However, the hypothesis that the mentoring network would not be correlated with leadership (0.82) and advice (0.84) was not supported. These network diagrams were correlated, indicating team members who reported other team members as being leaders also reported that they received advice and mentoring from them. Finally, the hypothesis that mentoring and student committee diagrams would be correlated was also not validated by the data (0.32). One factor that could be contributing to these results comes from studies that show perceived organizational support, as well as perceived leader support, correlate with creativity and satisfaction in the workplace (Handley et al., 2015; Moss-Racusin et al., 2012; Smith et al., 2015). On the teams, members that are perceived as leaders are likely to provide support to others on the team. Notably, these studies did not explicitly examine gender in their findings.
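The text reports average Pearson correlations between networks but does not spell out how two networks were compared. One common approach, sketched here in Python under that assumption, treats each possible respondent-to-member tie as an observation coded 0 or 1 in each network and correlates the two resulting vectors for a single team. The function names and toy data are hypothetical, and the published values may have been computed differently (for example, from indegree vectors).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def network_correlation(net_a, net_b, respondents, roster):
    """Correlate tie indicators (0/1) of two networks over all ordered pairs."""
    x, y = [], []
    for respondent in respondents:
        for member in roster:
            if respondent == member:
                continue  # self-ties are not possible in the survey
            x.append(1 if member in net_a.get(respondent, set()) else 0)
            y.append(1 if member in net_b.get(respondent, set()) else 0)
    return pearson(x, y)

# Toy team of three in which everyone responded.
roster = ["A", "B", "C"]
leader = {"A": {"B"}, "B": {"A"}, "C": {"A", "B"}}
advice = {"A": {"B"}, "B": set(), "C": {"A"}}

r_leader_advice = network_correlation(leader, advice, roster, roster)
print(round(r_leader_advice, 2))  # 0.5
```

Under this reading, the per-team correlations would then be averaged across teams to obtain a single value for each pair of networks.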

Hypothesis 4: The social and collaboration relations will be more positively correlated for women than for men.

A growing body of literature seeks to understand the connection between interpersonal relationships and knowledge innovation (Reference Blinded). We investigate this by considering how three types of interactions (collaborative, social, and professional) are intertwined on scientific teams. The purpose of this hypothesis was to closely examine the collaboration patterns of men and women and the connection between interpersonal relationships and knowledge creation. To create the measures in this hypothesis, social network survey questions were combined. For example, the measure social is a combination of: I hang out with [this person] for fun and [this person] is a friend (see the Statistical analysis section for descriptions of all the measures).

To test what proportion of team members collaborate, given that they are also social with these individuals, we identified the team members that the person was social with and then calculated what proportion of those members they were also collaborating with. The results for this measure are given in Table 3 as proportion collaboration given social. Other items in Table 3 were developed in a similar manner.
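The conditional proportion described above can be sketched in Python as follows. For each respondent, the function takes the members named in the conditioning network (here, social) and returns the share of them who are also named in the target network (here, collaboration). The function name and toy data are hypothetical, and how the per-person values were aggregated into Table 3 is not specified in the text.

```python
def proportion_given(target, given):
    """Per-respondent share of 'given' ties that are also 'target' ties.

    Respondents with no ties in the 'given' network are skipped because
    the proportion is undefined for them.
    """
    result = {}
    for respondent, given_ties in given.items():
        if not given_ties:
            continue
        overlap = given_ties & target.get(respondent, set())
        result[respondent] = len(overlap) / len(given_ties)
    return result

# Toy networks for a team of three.
social = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
collaboration = {"A": {"B"}, "B": {"A", "C"}, "C": {"A"}}

collab_given_social = proportion_given(collaboration, social)
print(collab_given_social)  # {'A': 0.5, 'B': 1.0}
```

Swapping the two arguments gives the reverse measure (proportion social given collaboration), which is generally a different number.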

Table 3 Proportion of overlap between the social, collaboration, and professional support networks.

Although our results indicate no statistical differences between men and women, we found that both men and women have intertwined relationships. If a team member is in one of a person’s networks (e.g., collaboration), it is likely that the team member is also in another of that person’s networks (e.g., social). Furthermore, men had higher overlap proportions than women across their collaboration, social, and professional support networks for every measure except proportion social given professional support (Table 3).

Hypothesis 5: Non-faculty team members will have more social connections on teams with a senior woman relative to those on teams without a senior woman.

Numerous studies have attempted to tease apart gendered approaches to collaboration and whether these have any impact on scientific collaborations (Bozeman et al., 2013; Madlock-Brown and Eichmann, 2016; Misra et al., 2017; Zeng et al., 2016). To build on this body of literature, this hypothesis tests whether the presence of a senior woman on a team has any impact on social connections within the team network, particularly for non-faculty members.

Figure 4 illustrates the scaled average indegree on the whole team when there are women in senior positions. A high average indegree for the team indicates that more team members are interacting and socializing on the team. The average scaled indegree on teams with a senior woman was 0.28 and without a senior woman was 0.20 (t-test p = 0.44; Cohen’s d effect size 0.51). The second graph in Fig. 4 illustrates the scaled average indegree of non-faculty when there are women in senior positions. The average scaled indegree of non-faculty on teams with a senior woman was 0.27 and without a senior woman was 0.16 (t-test p = 0.42; Cohen’s d effect size 0.55). Thus, there was no evidence to conclude that senior women influenced the social interactions on the team.
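To make the two quantities in this comparison concrete, the following Python sketch computes a scaled indegree for each team member and a Cohen’s d between two groups of team-level averages. It rests on assumptions the text does not confirm: indegree is scaled here by the maximum possible value (n − 1), and Cohen’s d uses the pooled standard deviation. The function names, the toy edge list, and the example values are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import mean, variance


def scaled_indegree(edges, members):
    """Indegree of each member divided by n - 1 (assumed scaling)."""
    n = len(members)
    indegree = {m: 0 for m in members}
    # A set removes duplicate nominations; self-nominations are ignored.
    for source, target in set(edges):
        if source != target and target in indegree:
            indegree[target] += 1
    return {m: indegree[m] / (n - 1) for m in members}


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy social network: (source, target) means source named target.
members = ["A", "B", "C", "D"]
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("A", "C")]
per_member = scaled_indegree(edges, members)
team_average = mean(per_member.values())

# Hypothetical team-level averages for the two groups of teams.
with_senior_woman = [0.2, 0.3, 0.4]
without_senior_woman = [0.1, 0.2, 0.3]
effect = cohens_d(with_senior_woman, without_senior_woman)
print(per_member, team_average, effect)
```

With only a handful of teams per group, a moderate effect size can coexist with a large p-value, which is the pattern reported above.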

Fig. 4: The average scaled indegree in the social network was computed for each team and for the non-faculty on each team.

These average scaled indegree measures were then separated based on whether there was a senior woman leader on the team, and the average across all teams was marked by a black horizontal bar. Based on these data, there appears to be no systematic difference in the social interactions of teams with a woman in a senior position and teams without a woman in a senior position. Average scaled indegree of non-faculty on teams without a senior woman = 0.16. Average scaled indegree of non-faculty on teams with a senior woman = 0.27. (t-test) p-value = 0.42.

Discussion

This study explored the impact of gender diversity on 12 scientific teams by analyzing team development and process data. It investigated two primary research questions: What is the role of women on scientific teams? and How do women impact team interactions? We initially believed that the primary reason previous research had been unable to adequately explain the role of women on scientific teams and how women impact team interactions was, in part, the lack of qualitative and mixed methods studies. We based our initial hypothesis on the assumption that scientific teams reproduce existing patterns of inequality (Butler, 1988; West and Zimmerman, 1987). However, through the development of the five hypotheses for this study and the subsequent analysis of relational data, we learned that our assumption was in large part not supported.

Numerous studies have found evidence of systematic discrimination and bias in awarding grants (Ginther et al., 2011), acceptance of publications (Lerback et al., 2020; Salerno et al., 2019), language to describe women (Ross et al., 2017), promotion decisions (Régner et al., 2019), rewards (Mitchneck et al., 2016), and access to resources for research (Misra et al., 2017) in addition to other obstacles and forms of marginalization that are invisible and unacknowledged (Rhoten and Pfirman, 2007; Urry, 2015). Why did our data not replicate these findings? We conclude with the following possible explanations.

Preliminary studies in the SciTS literature have found that team science principles may support the advancement of women in scientific fields and that, in turn, the inclusion of women on scientific teams may increase the success of these teams (McKean, 2016; Woolley et al., 2010). Further, including women and underrepresented populations on scientific teams has the potential to “serve as a strong entry point into scientific studies for women” (Rhoten and Pfirman, 2007, p. 72). Similarly, in sociology, Soler-Gallart (2017) found positive benefits for the whole team when scientists engaged in dialogic relations and interaction with the intention of overcoming gender barriers and discrimination. Could team science advance women in their scientific careers? If high-functioning scientific teams disrupt rather than reproduce existing hierarchies and gendered patterns of interactions, then team science may be a tool not only for accelerating the creation of knowledge but also for the advancement of a more empowered, just, and equitable profession.

Literature has documented how including historically underrepresented identities in the ingroup changes attitudes and behaviors (Soler-Gallart, 2017). Allport et al. (1954) found that when members of an ingroup were in close contact and built connections with members of an outgroup, prejudice decreased. Initially, the theory about ingroups and outgroups was devised to describe race and ethnic relations; however, recent studies have generalized the findings to other topics including gender bias and discrimination (Pettigrew and Tropp, 2006). Today, numerous studies have documented that intergroup contact and connections can improve intergroup attitudes (Allport et al., 1954; Brewer, 2007; Dovidio et al., 2012; Pettigrew and Tropp, 2006). Is it possible that scientific teams create ingroups that include rather than exclude women?

The teams in this study were neither created nor developed in isolation. These teams had access to team development resources such as SciTS literature, team science training, and administrative expertise and support. The promotion and tenure package of the selected university for this study allowed faculty to include interdisciplinary and team accomplishments. Structures were in place to fund, train, build, and reward these teams. Many of these resources, interventions, and structures were designed and led by a group of nine women and one man. The women, especially, emphasized diversity, equity, and inclusion from team formation to building and rewarding successes. In addition, many of the sessions were customized to meet the needs of individual teams. Did these facilitators create an ingroup? Although we did not test the impact of these interventions and structures, other studies have previously hypothesized that modifying existing and often outmoded structures will positively impact outcomes for women (Gibbons et al., 1994; Hansson, 1999; Rhoten and Pfirman, 2007). Another study found that when team members participated in dialogic relations and interactions instead of using prestige to gain power, they were more willing to rethink concepts when presented with new information (Soler-Gallart, 2017). Specifically, in terms of women in science, Rolison (2000, 2004) developed a hypothesis recommending explicitly applying Title IX principles to support women in academia. She posited that providing equal funding opportunities and resources for women would result in equal opportunities for success. Another study attributed its team’s success to the inclusion of women, the community, and other diverse perspectives (Soler-Gallart, 2017). Our findings suggest that the handful of women on our teams may have joined the ingroup in academia, albeit only for a short time.

It is important to note that we do not believe our results accurately reflect the university of study as a whole or academia in general. Team observations and resulting field notes documented numerous accounts of gender inequality and inequity where women were disempowered and had limited opportunities to contribute to the team. Moreover, we are confident that women on these teams have had individual experiences that would contradict our findings. A lack of evidence does not indicate that there is equality. Nevertheless, these results do suggest that scientific teams, developed with intention, may provide greater opportunities for women to amplify their contributions to science (McKean, 2016; Rhoten and Pfirman, 2007; Woolley et al., 2010).

Limitations

Previous studies on gender and scientific teams have used bibliometric data to understand patterns of collaboration. Other studies on teams have created teams in the lab using students and other volunteers. Although this study is unique and contributes to the literature, as the data are based on real-world scientific teams, we identified six limitations.

First, several teams were apprehensive about participating in SciTS research, and one team left the program after year one, resulting in limited data from those teams. Second, teams may have experienced the so-called Hawthorne effect (K. Baxter et al., 2015) and performed differently because they were part of a research study and a researcher regularly attended team meetings. All participant observations related to the positionality of the researcher were well-documented in field notes (P. Baxter and Jack, 2008; Greenwood, 1993; Marvasti, 2004).

Third, we defined senior women in a manner that would be inclusive of women with and without formal titles. The senior woman designation was given based on both formal titles and field notes. Some of the teams in our study had women who were the PI or in a designated leadership position with formal titles, and other teams had women on the leadership team. It is possible that the women on these teams were seen as leaders because of their position on the team, but that their leadership came without the titles, awards, and recognition that might have been associated with those titles.

Fourth, it is possible that study participants had varying definitions of mentor, advice, and leader. We anticipated different interpretations in our study plan and, as a result, combined data in hypothesis four to detect and account for potential differences in definitions. Nevertheless, we acknowledge that lived experiences, in general, give individuals different perspectives. Literature in political science has found that when people imagine a leader, many of the traits are more masculine (e.g., wearing a suit, being tall and bigger) (Butler, 1988). Fifth, we did not measure the success of the teams in this study; thus, we were unable to determine how different interaction patterns translated into team performance. Ongoing funding was, however, contingent on performance as measured by pre-determined metrics including numbers of grants, publications, invention patents, and other markers of success.

Finally, a limitation of cross-sectional social network studies, including ours, is that data are collected at a single point in time. Thus, temporal changes in team interactions cannot be accounted for in our sample. For example, we cannot discern whether social relationships or scientific collaborations came first. We only know that they were both happening at the time the survey was administered. Further, at the time the survey was completed, it is possible that a person had not yet established a relationship, or had forgotten about a previous relationship.

Conclusion, recommendations, and future research

We offer three key recommendations for future research. First, scientific results that are not statistically significant are rarely shared in the literature. Therefore, it is critical that all efforts to expand research be published to broaden and accelerate the understanding of the role of women in scientific teams (Bammer et al., 2020; Oliver and Boaz, 2019).

Second, the landscape of science is changing rapidly as a result of private and federal funders requiring the inclusion of the science of team science experts as PIs in grant applications. We recommend that researchers expand their focus and examine how scientific teams change the culture of science. Research questions might include: How does support for diverse teams translate to culture changes in science and the academy? Do scientific interdisciplinary teams provide more access for historically marginalized and disenfranchised groups? In addition, to create a comprehensive understanding of elements that contribute to expertise in scientific teams, we recommend that research be conducted with a theoretical focus on team development and processes. This would include studies that explore science facilitation, learning-by-doing, and other tacit forms of expertise that lead to integration and implementation of knowledge (rather than a focus on recruitment and demographics).

Third, existing studies define gender as a binary (man/woman). This short-sighted perspective is no longer relevant in society. Gender is not a biological concept but a social construct: “It involves a complex of socially guided perceptual, interactional, and micropolitical activities that cast particular pursuits as expressions of masculine and feminine ‘natures’” (West and Zimmerman, 1987, p. 125). Gender is thus created by the social organization of our everyday lives and the way we interact with one another. People often see this difference as natural, and society is structured as a response to these differences in terms of men and women. Because of this, researchers like us continue to expend time and resources asking research questions rooted in binary gender. Future research should broaden definitions of diversity and gender to include non-binary definitions of gender, expand how we measure inclusivity, explore how power imbalances block expertise, and study how a balance of power promotes expertise.

In conclusion, the lack of evidence for gender impacting team roles and behaviors in our study aligns with other SciTS studies that found team composition is not the silver bullet that automatically leads to knowledge creation and innovation (Duhigg, 2016; Oliver and Boaz, 2019). Numerous SciTS studies have documented the importance of processes over team composition and relationships to build successful teams (Boix Mansilla et al., 2016; Gaughan and Bozeman, 2016; Hall et al., 2018; Zhang et al., 2020). Perhaps the reason scientific teams produce more citations and have a greater impact than siloed investigators (Wuchty et al., 2007) is that they are leveraging the available expertise through the authentic integration of all members.

In the future, when scientists ask, “What proportion of women is ideal on a team?” consider responding with “It is not about the number of women, but rather how women on teams are integrated and empowered.”