With the rise in the number of enterprises run by women in Brazil, the number of studies on the subject has also increased (Gomes et al., 2014). Scientific studies of this nature are an essential source for formulating public policies and defining lines of financing that can benefit women. They can also be used to support women’s entrepreneurial possibilities. In Brazil, women have fewer funding possibilities than men, tend to start businesses out of necessity more than opportunity, and mostly run businesses related to fashion, food, beauty, and care (Malheiros & Padilha, 2015). In a way, businesses are delimited by the gender stereotypes that prevail in our society, which limit, for example, female ventures in areas considered “male areas”.

In addition, even though most micro and small entrepreneurs do not read the scientific literature, scientific discourse is known to be powerful enough to influence society. As Gramsci (1977) wrote, words are not naïve: each word carries in itself a world of meanings and senses that has the power to shape our reality (Nascimento & Sbardelotto, 2008), especially the reality of women and their possibilities. Thus, this article adopts a critical perspective and reflects on the possible implications of what is said about women entrepreneurs.

Gender conceptions (what is considered masculine or feminine) have shaped the sectors of economic activity and professional careers, with consequences for the career paths of men and women (Jaime, 2011). In Latin languages such as Portuguese, grammatical sexism predominates: masculine forms are used to indicate both the male gender and humankind in general. Accordingly, the words that name positions and professions (Mäder, 2015) are given in the masculine but are considered “neutral”.

Aspects such as the composition of words may seem unimportant when it comes to the professional lives of women. Still, they have a significant consequence on the design of society and the treatment of women in it (Gonçalves, 2018). In the sexist context propagated in Brazil, individuals are already born with task divisions and professions that must be followed, classifying them into “boy activities” and “girl activities”.

The volume of symbolic and cultural meanings disseminated by different cultural environments and vehicles impacts the composition of society. It further influences the conception that men are superior to women, since “boy activities” are usually geared towards intellect and strength, whereas “girl activities” are caring and housework (Araujo, 2007).

The intersection of all the cultural and linguistic aspects already mentioned is essential for the theme of entrepreneurship when one considers that society grows together with its language and reinforces established stereotypes about women, their professions, and careers (Viana, 2016).

In the Brazilian entrepreneurship universe, around 30% of businesses are led or created by women. Enterprises created by women are, as a rule, smaller and have less access to financing; it is more difficult for women than for men to access credit (Sebrae Nacional, 2018).

This research aimed to analyse how female entrepreneurship has been described in scientific papers. To that end, the present work catalogued the abstracts of 88 articles on the topic of female entrepreneurship in Brazil. From these abstracts, which cover the period from 1999 to 2019, one corpus was compiled in English and one in Portuguese.

As the lexicometric tools of Iramuteq can handle only one language per text-mining analysis, the two corpora were processed separately. This also made it possible to compare two distinct discursive constructions, one from scientific production in a Lusophone context and one at an international level, so as to bring out the social representations underlying not only common sense but also scientific communities, according to Moscovici (2013).

The size of the two corpora of textual data, N1 (Portuguese abstracts) and N2 (English abstracts), was defined as the sum of all absolute frequencies (occurrences) that constituted the lexical database. All supplementary forms were removed beforehand from this pre-processed textual repertoire. Specifically, the following forms were not considered in the text-mining analysis: demonstrative, indefinite, possessive, and additional adjectives; adverbs; articles; digits; conjunctions; onomatopoeia; pronouns; prepositions.
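The definition of corpus size given above can be illustrated with a minimal Python sketch. It is not the Iramuteq implementation: the list of supplementary forms, the tokenising rule, and the toy sentence are all assumptions made for illustration, and no lemmatisation is applied.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of supplementary forms; a real run
# would rely on the software's own dictionaries for each language.
SUPPLEMENTARY_FORMS = {"the", "a", "an", "of", "in", "and", "or", "to", "this", "their"}

def active_form_counts(text):
    """Count alphabetic tokens, skipping digits and supplementary forms."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tok for tok in tokens if tok not in SUPPLEMENTARY_FORMS)

def corpus_size(counts):
    """N: the sum of the absolute frequencies of all retained forms."""
    return sum(counts.values())

# Invented sentence, used only to exercise the functions.
toy_abstract = "The women in this study run a business and the women finance the business in 2019."
counts = active_form_counts(toy_abstract)
n = corpus_size(counts)
print(n, dict(counts))
```

Applied to each corpus separately, the same sum would give N1 and N2.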

Abstracts were used because they are the first part of a paper consulted when a literature review is initiated. They are also usually written in English, even when this is not the language of the document as a whole. They help give the paper international visibility and offer a portrait of what has been accomplished.


A literature review was conducted, covering publications from the last twenty years and following a qualitative approach, analysing the content of the abstracts (in English and Portuguese) of 88 articles published on female entrepreneurship in Brazil. ProQuest, Web of Science, and Lilacs were used for data collection.

The following terms were chosen as keyword combinations: female entrepreneurship in Brazil, women entrepreneurship in Brazil, Brazilian entrepreneurial women, and their versions in Portuguese. The terms were entered in the above-referenced databases. All results returned by these databases were used, that is, all papers that appeared and were peer-reviewed. Dissertations were not used. The authors were from Brazil and abroad. The inclusion criterion was that the manuscript dealt with female entrepreneurship in Brazil.

This study analyses the abstracts of all retrieved publications to bring out the ways in which the scientific literature deals with female entrepreneurship in Brazil. No specific exclusion criteria were defined because of the small number of papers.

In the first instance, articles produced in English have been separated from those published in Portuguese. Then, the abstract content of each article was collected, and a data corpus was created. As they were two different languages, a corpus was created in Portuguese and another in English.

In that regard, a text-mining analysis has been carried out by adopting the statistical package Iramuteq for performing assisted lexicometrical analysis. Among the different tools offered by the software, the following statistics have been chosen.

The two investigation functions used for processing the textual material were the following:

A hierarchical descending classification (HDC) regarding frequency and proximity of the overall occurrences (lexical forms) (Reinert, 1990).

A word cloud analysis (WCA) according to the absolute frequency of the lexical forms (Atenstaedt, 2017).

The HDC consists of a chi-square analysis based on identifying the most frequent occurrences (lexical forms) across the textual corpus of data (the selected abstracts), which are then distributed within so-called stable classes according to lexical correspondences across textual segments (TS). The software identifies these segments as Elementary Units of Context (EUCs) according to a specific vocabulary, English or Portuguese respectively, that constitutes the abstracts of the publications retrieved for the current literature review.

The WCA organises the lexical forms according to their overall frequency across the corpus of textual data. This simplified factor analysis (which does not consider semantic proximity) highlights the most frequent terms at its centre, along with other peripheral forms related to the meanings of these key terms.

The set of lexical forms characterised the corpus of all abstracts (English and Portuguese) retrieved for the current review. The number of identified forms V was 412 for the Brazilian corpus of abstracts (Portuguese) and 185 for the international one (English), calculated by adding the number of different forms according to the following formula, originally identified by Reinert (1990):

$$V = \sum_{i=1}^{F_{\max}} V_i$$

The lexical relevance has been calculated by the type-token ratio (TTR) as a percentage ratio between the size of the vocabulary (V) and the size of the corpus (N). These calculations evidence the lexical richness of the processed textual material.
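In the notation used above, with V the size of the vocabulary and N the size of the corpus, the ratio can be written as follows (the symbol TTR for the percentage itself is a label assumed here for readability):

$$\mathrm{TTR} = \frac{V}{N} \times 100$$

A higher value indicates a more varied vocabulary relative to the length of the corpus.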

A frequency class consists of the set of words that show the same number of occurrences throughout the overall corpus. To calculate the frequency classes, all words were listed in descending order of their number of occurrences. The most frequent form is assigned rank “one.” Furthermore, the keywords refer to uncommon terms, appearing with higher frequency throughout the processed corpus of textual data (for both English and Portuguese abstracts).
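The grouping into frequency classes and the assignment of ranks can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the toy counts are invented, and the counts do not come from the analysed corpora.

```python
from collections import Counter, defaultdict

def frequency_classes(counts):
    """Group forms by number of occurrences and rank the groups.

    Returns (rank, frequency, forms) tuples in descending order of
    frequency, so the most frequent forms receive rank 1.
    """
    by_frequency = defaultdict(list)
    for form, frequency in counts.items():
        by_frequency[frequency].append(form)
    ordered = sorted(by_frequency.items(), key=lambda item: item[0], reverse=True)
    return [(rank, frequency, sorted(forms))
            for rank, (frequency, forms) in enumerate(ordered, start=1)]

# Invented counts, for illustration only.
toy_counts = Counter({"woman": 5, "business": 3, "gender": 3,
                      "credit": 1, "policy": 1, "risk": 1})
classes = frequency_classes(toy_counts)

# Adding up the number of different forms in every frequency class
# gives back the vocabulary size V.
vocabulary_size = sum(len(forms) for _, _, forms in classes)
print(classes, vocabulary_size)
```

Summing the sizes of the frequency classes, as in the last step, mirrors the way V is obtained by adding the number of different forms.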

The algorithm underlying such an analysis relies on several bi-partitions obtained from a correspondence analysis conducted on a binary table (absence/presence) across all Elementary Units of Context. Each bi-partition was performed in two steps:

A correspondence factor analysis (CFA) was conducted on the table. The inter-class inertia was calculated for all partitions according to the first factor of the CFA. An initial cutoff was carried out for such a partition to maximise the inter-class inertia.

Each Elementary Unit of Context was switched from one class to the other, and the inter-class inertia was recalculated. If the recalculated value was higher than the previous inter-class inertia, the permutation was retained. This part of the algorithm was repeated until no permutation increased the inter-class inertia.

The principal aim of the algorithm was to single out those lexical forms that maximised inter-class inertia. In this context, the coordinates of lexical units on the first two axes were reported together with the percentage of cumulative variance explained by the factors.
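The two steps of a single bi-partition can be sketched as follows. This is a simplified illustration under stated assumptions, not the software's code: the first factor is approximated by reciprocal averaging, the chi-square of the two-class table stands in for the inter-class inertia (the two differ only by a constant factor, the table total), and the function names and toy table are hypothetical.

```python
from math import sqrt

def inter_class_chi2(table, in_first):
    """Chi-square of the 2 x forms table obtained by summing the units of each class."""
    n_cols = len(table[0])
    agg = [[0] * n_cols, [0] * n_cols]
    for row, flag in zip(table, in_first):
        target = agg[0] if flag else agg[1]
        for j, value in enumerate(row):
            target[j] += value
    row_tot = [sum(agg[0]), sum(agg[1])]
    col_tot = [agg[0][j] + agg[1][j] for j in range(n_cols)]
    total = sum(row_tot)
    chi2 = 0.0
    for k in range(2):
        for j in range(n_cols):
            expected = row_tot[k] * col_tot[j] / total
            if expected > 0:
                chi2 += (agg[k][j] - expected) ** 2 / expected
    return chi2

def first_axis_scores(table, iterations=200):
    """Approximate first-factor unit scores by reciprocal averaging.

    Assumes every unit and every form occurs at least once.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_tot)
    x = [float(i) for i in range(n_rows)]
    for _ in range(iterations):
        mean = sum(m * v for m, v in zip(row_tot, x)) / total
        x = [v - mean for v in x]  # remove the trivial constant solution
        norm = sqrt(sum(m * v * v for m, v in zip(row_tot, x)) / total)
        x = [v / norm for v in x]
        y = [sum(table[i][j] * x[i] for i in range(n_rows)) / col_tot[j]
             for j in range(n_cols)]
        x = [sum(table[i][j] * y[j] for j in range(n_cols)) / row_tot[i]
             for i in range(n_rows)]
    return x

def bipartition(table):
    """Split the units into two classes; returns (membership flags, chi-square)."""
    n = len(table)
    scores = first_axis_scores(table)
    order = sorted(range(n), key=lambda i: scores[i])
    best, best_chi2 = None, -1.0
    # Step 1: cut the ordered units where the inter-class chi-square is largest.
    for k in range(1, n):
        trial = [False] * n
        for i in order[:k]:
            trial[i] = True
        chi2 = inter_class_chi2(table, trial)
        if chi2 > best_chi2:
            best, best_chi2 = trial, chi2
    # Step 2: move single units across classes while the criterion improves.
    improved = True
    while improved:
        improved = False
        for i in range(n):
            trial = best.copy()
            trial[i] = not trial[i]
            if all(trial) or not any(trial):
                continue  # never empty a class
            chi2 = inter_class_chi2(table, trial)
            if chi2 > best_chi2 + 1e-12:
                best, best_chi2, improved = trial, chi2, True
    return best, best_chi2

# Invented presence/absence table: 6 units (rows) x 4 forms (columns).
toy_table = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
classes, chi2 = bipartition(toy_table)
print(classes, chi2)
```

In a descending classification, the same bi-partition would then be applied again to the resulting classes; that recursion is left out of the sketch.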

The analyses were performed twice in the software: once on the 37 abstracts in English and once on the 51 abstracts in Portuguese.

The frequency statistics covered the number of abstracts, the total number of words contained in the corpus, the number of active and supplementary words, the number of words that appear only once, and the arithmetic mean of occurrences per text.
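Most of these descriptive statistics can be reproduced with a short Python sketch. The function name, the dictionary keys, and the toy texts are assumptions made for illustration. The texts are taken as already tokenised, and the split between active and supplementary words is left out because it depends on the software's dictionaries.

```python
from collections import Counter

def corpus_statistics(texts):
    """Summarise a corpus given as a list of token lists, one list per abstract."""
    counts = Counter(token for text in texts for token in text)
    occurrences = sum(counts.values())
    return {
        "texts": len(texts),
        "occurrences": occurrences,
        "forms": len(counts),
        # Forms that appear only once in the whole corpus.
        "hapax": sum(1 for frequency in counts.values() if frequency == 1),
        "mean_occurrences_per_text": occurrences / len(texts),
    }

# Invented token lists, for illustration only.
toy_texts = [
    ["woman", "business", "credit"],
    ["woman", "gender", "business", "woman"],
    ["policy"],
]
stats = corpus_statistics(toy_texts)
print(stats)
```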

The word cloud makes identifying keywords in the corpus easy, as this analysis organises words graphically according to their frequency (Kami et al., 2016).

In addition, the word tree identifies the corpus’s structure since IRAMUTEQ analysis proposes indicators of connectivity between words (Krug, 2017).


The following tree diagram (see Fig. 1) shows the existing links between lexical classes. The distribution was calculated according to the proximity and weight of each reduced form of the analysed vocabulary. The correlations between the classes were identified through repeated chi-square tests. The publications analysed with this statistical procedure were written in English.

Fig. 1. Tree diagram of links between lexical classes of words in papers published in English.

Two linguistic positionings emerge from this diagram: the utterance woman in the lower part of the graph and the occurrence female in the upper part. In response to the proposed question, that is, to offer a reflection on what has been produced in Brazilian research on female entrepreneurship, it is emphasised that, of these two terms, the word woman shows a higher frequency (58 times in the overall corpus of textual data) than the noun female (41 times in the overall corpus of textual data). Such a statistical spread between two near-synonyms is due to an epistemological distinction used in social sciences between sex and gender:

“Sex, we told students, was what was ascribed by biology: anatomy, hormones, and physiology. Gender, we said, was an achieved status: that which is constructed through psychological, cultural, and social means.”

(West & Zimmerman, 1987, p. 125).

As underlined by former studies (Scott, 1986; Hochdorn et al., 2016), the reifying properties of Western languages, especially Romance languages, promote a mostly masculine-driven representation of reality. Above all, these languages replaced the neuter grammatical gender, once characteristic of classical Latin, with the masculine one. Thus, general and universal meanings are mainly expressed through the masculine gender. Although such a semantic structure seems to depend on a merely lexical organisation, as language is considered a logical-grammatical system (Wittgenstein, 2013), it also sustains a wider Weltanschauung, or Social Representation, grounded in a masculinised vision of society, which has been maintained over the centuries and survives even today (Bourdieu, 2001). It follows, therefore, that such a lexical matrix does not simply reflect a pragmatic choice. Rather, it imposes an intra-, inter-, and extra-subjective perception, which fosters a psychological, cognitive, and cultural structure in which men are highly advantaged compared to women. Neo-Latin languages, in that regard, split linguistic processes into two clusters of symbolic meanings, which ‘transformed an ascribed status into an achieved status, moving masculinity and femininity from natural, essential properties of individuals to interactional, that is to say, social, properties of a system of relationships’ (West & Zimmerman, 2009, p. 114).

Despite the dichotomous structure of Romance languages, especially Italian and Portuguese, the semantic constraints of nearly all linguistic systems worldwide have been implicitly permeated by a patriarchal conception of human civilisation, according to which priority and overriding meanings are identified with the male as the first and most important of the sexes. Language, in this sense, represents a co-constructor of sexualised meanings; its “grammatical usage involves formal rules that follow the masculine or feminine designation” (Scott, 1986, pp. 1053–1054).

Indeed, it is interesting to notice that the women’s scenario is linked to gender (42%), future (36%), and innovation (32%), respectively, underlining the importance of promoting women’s activities in Brazilian entrepreneurship in order to modernise this enormous post-colonial Latin American national context.

By contrast, the noun “female” is linked to the nouns economic (42%), entrepreneurship (81%), and country (37%), which suggests that the status of women in the entrepreneurship scenario is related to economic and entrepreneurial aspects of the country, their productivity, and their economic development. These narrative figures, which emerged from the sample of papers written in English, are confirmed even more strongly by those published in Portuguese, where the noun “woman (mulher)” appeared 102 times throughout the overall corpus of textual data, whereas the noun “female (femea)” was cited just 39 times. Indeed, the noun “femea (female)” is less common in the Portuguese language, in both its European and Brazilian varieties. Such a nearly dichotomous distinction could be explained by the unfair status of women in the Brazilian business world compared to their male counterparts. As shown by former studies on gender inequalities conducted in South European contexts (Bimbi, 2009), the word “female” used as a noun results in an objectification and naturalisation of women, more than men, in culture and society. Such an improper superimposition of gender and sex emerges out of “some declination of ‘nature’: fixed about the sexual dimorphism male-female or a genetic causality” (p. 6).

As emerged from recent research (Calile, 2019), such a hyper-professional attitude toward women and LGBTQ+ people is due to a heteronormative hegemony (Schilt, 2006), in which individuals in these categories, like women in general, often have to produce more to achieve similar or lesser results than a white man in the same activity.

The sociocultural female gender has traditionally been linked to weakness, justifying the intra-domestic role of women as exclusively caregiving figures. Such a social representation has been maintained until today (Bourdieu, 2001), although women have become integral to public breadwinning activities. Interesting, in that regard, is a study conducted by Schilt (2006) showing that FtM transgender people achieve a higher income after their transition: “This internalised insider/outsider position allows some transmen to see clearly the advantages associated with being men at work” (p. 569).

In the word cloud (see Fig. 2), produced by a simplified factor analysis of absolute frequencies across the overall corpus of textual data, the word “woman” both occupies a central position and presents the highest number of occurrences. As in the former graph (see Fig. 1), this form splits the cloud into two halves: in the lower part, gender (frequency 28) is linked to country (f27), economic (f22), and entrepreneurial (f27), underlining the importance of female activities for promoting the technological and industrial growth of Brazil.

Fig. 2. Word cloud of absolute frequency of words in papers published in English.

In the upper part of the cloud, similarly to the analysis of similitude, female (frequency 41) is linked to activity (f18), business (f36), and personality (f13). The last utterance points to the personality traits of people in leading positions, occupations traditionally held only by men. Such data, indeed, underline how women in Brazilian society must claim a personality that conforms to heteronormative, masculine domination (Schilt, 2006).

Some evident differences emerge when comparing the findings for the corpus of abstracts in English (see Fig. 1) with those for the abstracts in Portuguese (see Fig. 3). The central key term in the latter graph is “empreendedor”, a word in the masculine. Such data underline, unlike in the English-language corpus, how the Portuguese language, as a neo-Latin idiom, reifies a widespread vision in which men claim a central, universal, and predominant position in Brazilian society. Such data are in line with former studies (Hochdorn et al., 2016), showing that languages like Portuguese and Italian construct two discursive universes, a male and a female one. As Joan Scott (1989) shows, the latter is relegated to a historical category, semantically and lexically circumscribed, and consequently subversive compared to the leading male gender in these Romance languages.

Fig. 3. Tree diagram of links between lexical classes of words in papers published in Portuguese.

Word cloud analysis further emphasises how this dichotomy compromises the organisation of society (see Fig. 4). The male term “empreendedor” (frequency 108) divides the discursive universe into two nearly opposite dimensions: mulher (woman, f102), pesquisa (research, f55), and objetivo (goal, f36) on the one hand, and empresa (business, f29), feminine (female, f39), and estudo (study, f67) on the other. The predominant term “empreendedor”, in the masculine, is constantly applied also to purely feminine dimensions, as can be seen from the emergence of the words “mulher” and “feminine” and as underlined by the terms “como” (as, f76) and “objetivo”. Such a discursive positioning sheds light on how women’s status in the Brazilian business world is still considered a goal. Much empirical research remains to be done (Correa, 2001; Bruschini, 2007; Medeiros & Zanello, 2018) in order to comprehend how to promote a paradigmatic change in South America’s most powerful country in economic, political, and military terms.

Fig. 4. Word cloud of absolute frequency of words in papers published in Portuguese.

Finally, the descending hierarchical classification (see Fig. 5) results in five interrelated categories (called stable classes), defined as follows: 1 (red): Training of Businesswoman; 2 (grey): Business Sociocultural Context; 4 (blue): Results of Gender Difference Between Entrepreneurs; 5 (purple): Fear Related to the Business; and 3 (green): Critical Discussion of the Brazilian Model.

Fig. 5. Descending hierarchical classification of words in papers published in English.

The most representative class across the scientific papers published in English is class 4, which represents 28% of the analysed occurrences (the reduced forms). Its most frequent word is the noun “result”, which correlates significantly and directly with the substantives “gender”, “difference,” and finally “personality”, which seems to show how the disparity between women and men emerges among “entrepreneurs”. Implicitly, according to a critical perspective of discourse studies (Van Dijk, 2006), the noun “personality” alludes to a condition found in so-called blue-collar professions (Schilt, 2006), where it is more suitable to claim, as much as possible, a masculinised representation of society.

Otherwise, some structural and organisational differences could be observed when considering the distribution and frequency of the most significant occurrences across the overall amount of papers published in Portuguese (see Fig. 6).

Fig. 6. Descending hierarchical classification of words in papers published in Portuguese.

First of all, an explicit discourse on how women stake their claim in the business world seems nearly absent. Indeed, none of the six stable classes contains references to gender, sex, men, or women. Interestingly, the second most representative class, namely class 6, contains exclusively occurrences alluding to methodological questions and research procedures. These studies thus pay far more attention to research assessment than to epistemological, intersectional, and critical issues.

Furthermore, as the second most representative class contains just words regarding methods and procedures, it shows a lack of coherency among methods, on the one hand, and the study subject itself, on the other. The statistical evidence that reality has been adapted to the methodological procedures is also shown by how class 6 is the only one correlating with all other classes, underlining the priority given to formal and organisational variables.

The most representative class, class 4, contrary to the English sample of publications, contains words alluding mainly to professional roles (papel profissional), which generate conflicts (conflito) within the working context (contexto). Such conflicts are due to diversity (multiplicidade), which, considering the topics on which the retrieved papers focused, consists of gender differences, though this is not explicitly defined.

Such a conclusion is also supported by the link between class 4 and classes 2 and 3, respectively. Class 3 in particular contains as its most representative occurrences the words político (political), humano (human), contribuição (contribution), social (social), and finally prático (practical). Considering that most of the papers focused on gender differences, it is interesting to note how classes 3 and 4 are linked together, highlighting the conflict of interest between the professional role and the political and human aspects. This seems to reveal a conflict of interest between the political and commercial spheres in Brazil. The political dimension is more interested in human contribution (contribuição humana), while in business terms, socially more practical aspects turn out to be much more relevant.


The salient result of the present study is the absence of the word “women” in the last analysis, even though it appears in the word cloud. This result allows us to infer that even when the theme is female entrepreneurship and women are the “object” of analysis, there is no mention of the feminine, feminine at work or feminine when undertaking a professional activity.

As in other realities in the labour market, the discourse is centred on masculine terms, and men have been its focus (Ferreira, 2012). Not even in a research context like this is the woman the axis of analysis. She does not appear in the discourse, not even in an attempt to understand how female entrepreneurship occurs or how women behave when starting a business.

Although it may seem that the defence of the need to build a female vocabulary does not, strictly speaking, follow from the results of this research, Fig. 4 presents the word “entrepreneur” (empreendedor) in its masculine singular form, whereas in Portuguese it is possible to use the term in the feminine, in both the singular and the plural, that is, “empreendedora” and “empreendedoras”. Again, we call on Gramsci (1977) to defend that words have meaning, significance, and purport. Using masculine or feminine terms makes a difference when there is no gender-neutral form. More than that, the patriarchal system materialises in the standard use of masculine terms as neutral ones.

Concerning female behaviour in the professional world, masculinising oneself, or representing “savoir-faire”, in whole or in part, still seems to be an alternative for entering the job market, remaining in it, and achieving positions of relevance in entrepreneurship, as well as in other activities. Entrepreneurship also seems to impel women to act out and represent predominantly male behaviour, which suggests that this is another barrier women face in this activity (Alperstedt et al., 2014).

Even when “(un)making gender” in the masculine (Butler, 1999), there is no guarantee that a woman will be inserted into the labour market, nor that her competencies will be recognised. Somehow, there seems to be no right response, and women occupy a “non-place” characterised by masculinity (Bourdieu, 2001).

In a way, the nomenclatures used in the masculine gender in neo-Latin languages indicate “how gender and norms are conveyed by discursive universes, culturally situated” (Hochdorn et al., 2016), that is, the communicative processes through which the margins between the social and individuals are negotiated (Gambirasio & Martins, 2019).

Therefore, in addition to addressing the methodological and conceptual challenges, the area of entrepreneurship in Brazil must create a feminine vocabulary based on the speech of the women who respond to its studies. To build such a vocabulary, women need to be heard. It seems obvious, but today we are dealing with a job market, a science, and a society that still use the masculine to express plurality (Defendi & Gomes, 2019).

To approach the feminine is to open up to the new, to what is too little discussed and studied, to specificity and particularity. The road to producing relevant knowledge for women in entrepreneurship is long. The central aim of this article is to demonstrate the terms used when speaking of female entrepreneurship, specifically in the scientific literature. Notably, in Portuguese, as in other Latin languages, it is possible to use feminine forms. Even so, authors opt for the masculine to describe a reality known to be feminine, or mainly or solely feminine, such as the reality of the participants in the studies in question. In this context, we did not seek to deal with the sociocultural and political aspects of the gender issue but to shed light on the use of the chosen terms, both in Portuguese and in English, and from there to generate a discussion that could be relevant.

Entrepreneurship by necessity is more real and less glamorous than the media and the academy indicate. There are enormous gaps in the knowledge available today. Based on the idea that “where there’s a will, there’s a way”, this knowledge does not seem to support women, especially the less fortunate. Women from the suburbs, with several children, without a partner or family member to help and support them (Braga et al., 2019), have not been listened to. The research field must listen to their voices, especially if it intends to produce knowledge about entrepreneurship in Brazil that has impact and social relevance. Those women must be heard so that they can produce for themselves and for society.

Finally, in response to the question that originated this article, namely how scientific literature shapes the sociocultural construction of gender inequalities, it is of paramount importance that the scientific community understands its role in the production of knowledge and the political and social impacts of its discourse, so that we can begin to address the differences as they appear and no longer make them invisible through a pseudo “neutrality” in the use of language. It is important to emphasise that language is built over time and history, since each world produces a discourse, and each discourse is produced by a world (Lacan, 1953). As a future research agenda, we aim to extend the approach to other scenarios, such as the Latin American, North American, or even European, African, and Asian ones.