Stereotypical descriptions of computer science career interests are not representative of many computer scientists

Using responses from a large respondent-initiated online survey, we find that the career interests of many current and aspiring computer scientists in the United States diverge from a popular and official depiction of computer scientists’ interests used for career and workforce development worldwide. Distinct profiles of career interests emerged from the data. These profiles suggest that many women in the field value social and artistic expression in a way not currently recognized by established depictions of computer scientists’ interests. Better capturing the diversity of interests in computer science might help to boost women’s, and men’s, engagement in this STEM field.

proposes that individuals make decisions about their desired participation in roles based upon the perceived gendered nature of those roles.If women perceive that CS professions are stereotypically masculine in nature, they might be dissuaded from participating regardless of their career interests.There is evidence regarding genderbased associations in the field of computer science.For example, stereotypes that women are less interested in computer science than men exist 19,20 .In addition, there are gendered stereotypes regarding persons engaged in computer science (e.g., that they are socially isolated) and stereotypes about the nature of computer science work activities (e.g., that such work is non-collaborative in nature) 21 .Providing evidence for the potentially problematic nature of such stereotypes for women, there is evidence that exposure to stereotypical physical environments and role models (e.g., those indicating a focus on video games or science fiction), can have detrimental influences on women's interest in pursuing computer science occupations 6,22 .There are indications that gender bias is negatively associated with women's retention in the field.For example, it has been observed that women studying CS in college who considered dropping their major tended to experience gender bias in the classroom 23 .
Importantly, depictions of STEM occupational characteristics and environments, including computer science occupations, are often based on O*NET 3,4,24 .Accurate information about the characteristics of jobs and the workers who fill them is critical for career guidance, reemployment counseling, workforce development, and human resource management [25][26][27] .To supply such information, O*NET periodically surveys and updates information on over 900 occupations in the U.S. and publishes it in a publicly available database (www.oneto nline.org) 26 .While the database characterizes occupations in the U.S., these descriptions are used by individuals, private sector organizations, and governments worldwide 4 .For example, the Australian government has used O*NET data as part of their own workforce and career development efforts.Moreover, the database has been translated for use in European and Central American countries 4 .Indeed, O*NET has been recognized as a leading source of information about work internationally 24 .
One important component of the O*NET database is information regarding the inferred interests of job incumbents in engaging in certain types of work activities or operating in certain work contexts 26 .Such information regarding the assumed interests of job incumbents is often used to assist the career development of adolescents and adults in helping them to choose occupations that might be a good fit for their preferences [27][28][29] .In addition, such information is believed to facilitate career exploration and development 30 .Such information may also impact women's retention in the field, as information about the work environment can either strengthen women's persistence in CS or make them realize they do not want to work in the industry 31 .Indeed, O*NET data on the inferred interests of incumbents in over 1,000 occupations is used in a popular online survey (used by millions of persons) that delivers an estimation of the correspondence between survey-takers' interests and the inferred interests of incumbents 32 .
As described by O*NET, the inferred career interest ratings are indirectly based on information regarding that occupation, such as the typical work activities performed on the job 33 .Information on work activities is predominately collected through the stratified sampling of workers in the United States 26 .Based on this information, and other information gathered through alternate means, teams of raters make judgments based on these data regarding the appropriateness of interests as defined by the Holland model 34 .This model is composed of six "RIASEC" interest types: Realistic (interest in practical and hands-on tasks), Investigative (interests in analytic/scientific activities), Artistic (interests in creative or self-expressive work), Social (interests in working or communicating with others), Enterprising (interests in selling and/or leading), and Conventional (interests in working with details and/or routines) 34 .Within this model, interests in things tend to correspond most with Realistic interests while Social interests tend to correspond most with an interest in working with people 35 .

The present study.
In light of the ongoing uncertainty regarding the role of preferences and gender stereotypes in shaping women's representation in CS professions alongside the importance of information on career interests from O*NET in assisting workforce and career development 30 , the present study explores the possibility that, along one or more of the RIASEC dimensions, the inferred interests of those in CS professions might not represent the interests of all persons employed in or aspiring to CS occupations.To accomplish this, we compare overall O*NET estimations of CS interests with responses to a large (over 80,000 responses) and recent (2016) respondent-initiated online survey of career interests.

Results
To characterize heterogeneity in the interests of persons employed in or aspiring to CS professions, we conducted a latent profile analysis (LPA) of responses to a respondent-initiated online career-interest survey.We conducted two LPAs using samples of 500 survey responses randomly selected from the following two groups: respondents self-reporting that they are unemployed but identifying a CS profession as their "dream job" (herein those aspiring to CS) and respondents self-identifying as employed in a computer science profession (herein referred to as those employed in CS).We were able to replicate our findings using the larger sample (see SI Appendix).
We found four distinct profiles for those employed in CS (AIC = 9827.48,BIC = 10,029.78)and three profiles for those aspiring to CS (AIC = 10,045.42,BIC = 10,218.22).Descriptive labels and overall characterizations of the structures of those profiles are provided in Table 1.We estimated the inferred interests of those in CS professions using O*NET interests scores by using an average of CS professions-an average weighted by the representation of each profession in our random sample of 1,000 responses.Furthermore, for overall comparative purposes we include average interest scores for the 974 occupations with available interest information from O*NET (herein the U.S. occupational average).Profile scores, the overall O*NET CS interest estimate, and the U.S. occupational average are displayed in Figs. 1 and 2.
Profiles roughly resembling the O*NET profile in low estimates of Social and Artistic interests were found among both those employed in CS and those aspiring to CS.We refer to these profiles as "Stereotypical."As described in further detail in the Supplementary Information, these profiles exhibited a relatively high correspondence to the O*NET estimates in terms of distance scores and correlational comparisons.We note however that the Stereotypical profiles' low levels of Social and Artistic interests were above the U.S. occupational average.Despite our characterization of these profiles as stereotypical, only 30% (for those employed in CS) and 36% (for those aspiring to CS) of responses fell into these profiles.In contrast, the majority of responses were part of one of three following profiles: Artistic (detected among those employed in and aspiring to CS), Uninterested (detected among those employed in and aspiring to CS), and Multi-Interested (detected only among those employed in CS).The Artistic profiles exhibited high Artistic interests relative to other dimensions.The Multi-Interested profile similarly exhibited high Artistic interests alongside higher Realistic and Investigative interests relative to other profiles.Finally, the Uninterested profile exhibited lower levels of interests than other profiles-except in Conventional interests in comparison to the Artistic profile among those employed in CS.Overall, we note that all seven profiles exhibited higher ratings on the Social interest dimension than the O*NET estimates of CS Social interests.Moreover, Social interest scores for five out of seven profiles were higher than the U.S. occupational average.
Chi-square tests were used to determine whether significant gender differences existed across profiles.Gender was significantly related to profile membership for those employed in CS (χ2(3) = 58.35,p < 0.001) and those aspiring to CS (χ2(2) = 21.65,p < 0.001).We note higher representation of women (72% for those employed in CS/60% for those aspiring to CS) among responses from Artistic profiles and lower representation in Stereotypical profiles (30% for those employed in CS/36% for those aspiring to CS).    and 2).In relation to Social interests, our results correspond to well-established differences in gendered preferences among men and women with women often expressing stronger interests in working with people 15 .However, we note that the Multi-Interested (Fig. 1) profile among those employed in CS also exhibited relatively high levels of Artistic and Social interests (related to the O*NET and U.S. occupationalaverage estimates) and that this profile was mostly characterized by men.This result is in alignment with findings that indicate that non-stereotypical depictions of CS professions might be of relevance to both women and men 6 .Altogether, our results provide nuance to characterizations of CS career interests.Many responses among those in CS professions were characterized by profiles with Social interests that were higher than the U.S. occupational average.This qualifies stereotypical characterizations of CS as occupied by persons preferring social isolation 20 and supports reasoning that CS is an increasingly diverse field that has grown to engage with the growing social and collaborative nature of computer platforms 17 .We note that it is possible that our results are due to the non-representative sample that we relied upon to characterize CS interests.That is, it is possible that responses in our survey came from individuals engaged in, for example, programming within creative industries and/or in organizational contexts where social interaction is rewarded and/or encouraged.Nevertheless, our results do appear to indicate that a non-trivial share of women and men employed in CS professions do not clearly fit O*NET's depictions of interests in CS.
The broad pattern observed among those in CS was largely replicated among those aspiring to CS.This is an important finding given that individuals early in their careers or at times when they are considering new career seek out or are often presented with information on occupational interests like that from O*NET to guide their career exploration.Based upon our findings, it might be that those interested in CS with higher Artistic and/or Social interests might be less likely to pursue CS professions after being exposed to characterizations of the CS field as relatively low on Artistic and/or Social interests.While speculative, this would be in alignment with the tendency of people to pursue careers they perceive to be alignment with their personal preferences [9][10][11][12] and would align with research that has found that exposure to stereotypical physical environments and role models can influence women's interest in computer science occupations 6,21 .Consequently, we call for additional research to explore the potential effects of stereotypical, and non-stereotypical, career interest information on the attitudes of potential CS career prospects.For example, it has been observed that women studying CS in college report fewer positive internship experiences than men, some stating that their internship experiences helped them realize they might want to do something else 23 .Future research could investigate the different types of career-related information provided during CS-related internships and the effects of such information on women's retention in the field.The present study focused on the field of CS and future research should explore whether the findings extend to other STEM disciplines.Other STEM fields that are viewed as unsociable include engineering and physics 36 .The male to female ratio among students pursuing college majors in physics, engineering, and computer science have plateaued at about 4:1, and these majors are well known for being underrepresented by women 36 .Future research is needed to understand whether the interest profiles of engineers, physicists, and other STEM fields also deviate from what is conventionally assumed.Such findings could have implications for broadening women's participation in other STEM fields.
Additional research into the effects of stereotypical and non-stereotypical information might have important practical implications for women's inclusion in CS professions.There have been a variety of efforts to help qualify stereotypical characterizations of CS.For example, initiatives, like Girls Who Code and INTech Camp for Girls, have sought to help women and girls to successfully enter the CS profession 37,38 .However, such programs often rely on interventions among relatively small populations in certain geographic settings.In contrast, online databases of information about careers, like O*NET, are widely used internationally.Future research should test whether our findings replicate across different countries and cultures.The validity of using O*NET profiles outside of U.S. contexts is questionable given potential variability in work activities internationally 39 .Nevertheless, other English-speaking countries might hold similar occupational stereotypes.For example, the United Kingdom's National Career Services database (https:// natio nalca reers.servi ce.gov.uk/) provides occupational descriptions for computing, technology, and digital jobs (e.g., 3D printing technician, software developer, solutions architect).If such job descriptions overemphasize technical aspects of the job (e.g., "write or amend computer code"), this might indicate that social tasks are not as important/relevant to such jobs as they could be in practice.

Conclusion
Women's lack of representation in CS careers not only denies society the advantage of their abilities and perspectives but also limits their participation in many well paid, high-growth professions 17 .A more nuanced and comprehensive picture of computer scientists' interests might help address women's underrepresentation in CS and it is possible that prominent online databases of career information help play a role in this creating this picture.

Methods
Our characterization of CS interests was drawn from responses to an online survey posted to Time Magazine's website (see www. time.com; herein the career interest survey).These data were collected in June 2016 and are available from the Open Science Framework (https:// bit.ly/ 3nCBO UN).The first author obtained institutional review board (IRB) approval for the use of archival data generated from this survey for this study (IRB protocol #14,112).As of 2020, the magazine reported having a readership of over 26 million adults, with about 80% of readers living in the United States.Time Magazine's digital audience is roughly split by gender, with slightly more women (56%) visiting their website than men 40 .The survey was advertised on the magazine's website as a way for readers to learn more about their career interests and the match of these interests with their occupational aspirations or occupational characteristics.As of December 2021, this survey can be found by going to https:// time.com/ 43437 67/ job-perso nality-work/.
Within the survey, respondents were asked to self-declare several demographic characteristics (age, gender, educational attainment, employment status, and personal income).They were then asked to identify their job if they self-identified as being currently employed and identify their aspirational ("dream") job if they self-identified as unemployed.Finally, they were asked to answer questions about their interests in twenty different types of work activities.At the end of the survey, respondents were given a visual comparison of their interest in different tasks to the characteristics typical of their current or dream job.

Matching of respondents from the career interest survey to occupations.
The occupational title tied to respondent's job or dream job were calculated using a dynamic keyword search that matched their entered job title in lay terms (e.g., teacher, farmer) with the overarching occupational titles used by O*NET (e.g., elementary school teachers, farmworkers).

Categorization of CS occupations.
We recruited 442 participants from Amazon Mechanical Turk (Mturk) to categorize 707 occupational titles available from the O*NET occupational classification system into one of the following groups: (1) CS (i.e., those requiring some form of experience in computing), (2) STEM (i.e., those requiring some form of experience in science, technology, engineering, and/or mathematics), (3) Non-STEM (i.e., those requiring experience in neither STEM nor computing), or (4) Unsure (i.e., job titles that were unfamiliar with and unsure how to sort).This research was approved by the Institutional Review Board (IRB protocol # 14,159) from North Carolina State University.Informed consent was obtained from all participants, and this research was performed in accordance with the approved protocol and relevant guidelines.
Participants were asked to categorize 80 job titles randomly selected from the overall set of 707 occupational titles.Each job was categorized by 5 to 26 raters, with most jobs assigned 14 raters.Jobs were ultimately classified as CS if more participants classified a job as "CS" than any of the other three categories.Jobs with the same number of participants classifying them as "CS" and another category (e.g., "STEM") were also classified as CS.The average interrater agreement 41 for raters' individual categorization of job titles was strong (rwg = 0.87) 42 .Of the 707 jobs classified, 47 were identified as CS and the remainder were classified as STEM or neither.Of these 47 job titles, 46 were included in the random sample.The 46 job titles included in our final sample (N = 1,000) are provided in Table 2

Employed in CS.
As described above, we randomly selected 500 responses from those employed in CS occupations.Of these 500 responses, 56% were from men and 44% were from women.Most responses came from those with an undergraduate (49%) or postgraduate (33%) degree.Only 11% of responses were from individuals between the ages of 18-25 years old, 28% between 26-33 years old, 32% between 34-45 years old, 17% between 46-55 years old, 11% between 56-65 years old, and only 2% were 66 years old or older.
Aspiring to CS.We also randomly selected 500 responses from those aspiring to CS occupations.Of these 500 responses, 52% were from men and 48% were from women.Most responses came from those with either an undergraduate (43%) or postgraduate (29%) degree.Different ages were also well represented: 26% of responses were from individuals between 18-25 years old, 17% between 26-33 years old, 19% between 34-45 years old, 12% between 46-55 years old, 13% between 56-65 years old, and 12% were 66 years old or older.

Measures. Interests estimated from the online career-interest survey. A shortened version of the popular
Personal Globe Inventory (PGI) 43 was used to assess respondents' career interests.This mini version of the PGI was developed using item response theory 44 and has been validated for use in the United States by past research [44][45][46] .Respondents were asked how much they enjoyed 20 different work activities on a scale from 1 ("Strongly dislike") to 7 ("Strongly like").Each activity was tied to one of the following six RIASEC career interest types: Realistic, Investigative, Artistic, Social, Enterprising, and Conventional.Like other research using the PGI mini 46 , we utilized item weightings in past research 45 .For each interest type, the corresponding definitions, reliability coefficients, and items are listed below.Reliability was calculated from the overall U.S. sample with Omega total values which provide conservative reliability estimates 45 .Except for the Investigative and Social Scales, Omega total values were at acceptable levels.However, we do not consider these low reliabilities to be problematic.Items composing the scale were selected based upon item response theory analyses of the extent to which a combination of items represented the overall content of each construct.Therefore, in such short scales, low reliabilities can likely be attributed to item's coverage of different dimensions within the same construct.
• Realistic (omega reliability coefficient = 0.78): Involves concrete practical activities and the use of machines, tools, and materials.Realistic interests were measured by asking respondents how much they would enjoy the following tasks: Interests estimated from O*NET.As described in greater detail in two technical reports from O*NET on the estimation of interest scores 29,47 , O*NET used two teams of three raters to estimate interest scores for each occupation included in O*NET's database.These teams reviewed information about each occupations' tasks, requirements, and generalized work activities to provide RIASEC ratings for over 900 jobs.Some of the information provided to raters (e.g., job's tasks, requirements, work activities) was gathered by a stratified randomized sampling and surveying of actual job incumbents and occupational experts across the United States 26 .
Demographic variables.Gender was measured using a dichotomous self-report item, asking whether respondents were "male" (coded as 0 and interpreted as "men") or "female" (coded as 1 and interpreted as "woman").Employment status was measured using a single, dichotomous item asking, "Are you currently employed?".Respondents were also asked to disclose their age and level of educational attainment (less than high school-postgraduate degree).

Analysis.
Profiles were defined solely based on self-reported career interests, which allowed us to determine which profiles naturally emerged for those aspiring to and employed in CS.Model fit indices were used to determine the optimal number of profiles.The number of profiles can differ between samples and analyses, with some models being more parsimonious (having fewer profiles) than others.Similar to previous research [48][49][50][51] , we estimated models ranging from two to ten profiles, preferring solutions that did not have classes with a small number of individuals.We used Akogul and Erisoglu's (2017) analytic hierarchy process (AHP) to select the model with the optimal number of profiles.The AHP takes a holistic approach when considering information criteria, such as Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC), to select the best model.The AHP has been tested on common real and synthetic datasets and found to produce more accurate results than relying on one type of information criteria alone 49 .

Table 1 .
Descriptions of latent profiles.Profiles are listed in descending order of representation for women.See the Online Appendix for values of each profile along all six interest score dimensions and for formalized comparisons of profiles to the O*NET estimate of CS interests.

Figure 1 .
Figure 1.Interest Profiles of Those Employed in CS Compared to O*NET Estimations of CS Interests.Interests were assessed in both the online career interest survey and O*NET according to a scale from 1 to 7, with 7 indicating a stronger preference for/a greater relevance of that interest to the occupation.

Figure 2 .
Figure 2. Interest Profiles of Those Aspiring to CS Compared to O*NET Estimations of CS Interests.Interests were assessed in both the online career interest survey and O*NET according to a scale from 1 to 7, with 7 indicating a stronger preference for/a greater relevance of that interest to the occupation.
(1) Install electrical wiring and (2) Oversee building construction.• Investigative (omega reliability coefficient = 0.65): Involves analytical or intellectual activity aimed at trou- bleshooting, creative or knowledge use.Investigative interests were measured by asking respondents how much they would enjoy the following tasks: (1) Categorize different types of wildlife and (2) Write a scientific article.• Artistic (omega reliability coefficient = 0.92): Involves creating work in music, writing, performance, sculp- ture, or unstructured intellectual endeavors.Artistic interests were measured by asking respondents how much they would enjoy the following tasks: (1) Sculpt a statue, (2) Paint a portrait.• Social (omega reliability coefficient = 0.63): Involves working with others in a helpful or facilitative way.Social interests were measured by asking respondents how much they would enjoy the following tasks: (1) Seat patrons at a restaurant, (2) Interview people for a survey, (3) Help children with learning problems, and (4) Teach people to dance.• Enterprising (omega reliability coefficient = 0.70): Involves selling, leading, and manipulating others to attain personal or organizational goals.Enterprising interests were measured by asking respondents how much they would enjoy the following tasks: (1) Oversee a hotel, (2) Manage an office, (3) Interview people for a survey, and (4) Seat patrons at a restaurant.• Conventional (omega reliability coefficient = 0.85): Involves working with things, numbers, or machines to meet predictable organizational demands or standards.Conventional interests were measured by asking respondents how much they would enjoy the following tasks: (1) Prepare financial reports, (2) Oversee a data analyst group, (3) Maintain office financial records, and (4) Manage an electrical power station.
. U.S. who self-identified as being unemployed and aspiring to CS (603 responses).Responses were randomly selected using the "sample()" function in R (https:// www.R-proje ct.org/). the