Elitism in Mathematics and Inequality

The Fields Medal, often referred to as the Nobel Prize of mathematics, is awarded every four years to no more than four mathematicians under the age of 40. In recent years, its conferral has come under scrutiny from math historians for rewarding the existing elite rather than pursuing its original goal of elevating mathematicians from under-represented communities. Prior studies of elitism focus on citational practices and sub-fields; the structural forces that prevent equitable access remain unclear. Here we show the flow of elite mathematicians between countries and lingo-ethnic identities, using network analysis and natural language processing on 240,000 mathematicians and their advisor-advisee relationships. Through analysis of the elite circle formed around Fields Medalists, we found that the Fields Medal helped integrate Japan after WWII. Arabic, African, and East Asian identities remain under-represented at the elite level. Through analysis of inflow and outflow, we rebut the myth that minority communities create their own barriers to entry. Our results demonstrate that concerted efforts by international academic committees, such as prize-giving, are a powerful force for giving equal access. We anticipate that our methodology of academic genealogical analysis can serve as a useful diagnostic for equality within academic fields.

Although mathematics is often framed as objective and egalitarian, access to it is not equally conferred. Recent attention has been given to the Fields Medal, one of the most prestigious awards in math, and its elite community. When the award was first conceived in 1930, it was in part designed to assuage international tensions [1]. The award was intentionally given to individuals who would otherwise not receive any recognition, rather than to the best young mathematician.
Using social network analysis (SNA) and neural-based natural language processing (NLP), this paper analyses the flow of elite mathematicians between nations and lingo-ethnic categories. Analysis was performed on the Mathematics Genealogy Project, one of the most complete advisor-advisee databases maintained today, with more than 240,000 mathematicians.
Results demonstrate self-reinforcing behavior at the elite level in mathematics. This contrasts with prior conferral of the Fields Medal, which was a positive force in mending international relations, such as integrating Japan and Germany after World War II [6]. We propose the Fields Medal can be used today to improve accessibility of mathematics to minority groups. Because the classifier for lingo-ethnic identity is textual, it would be more accurate to say we classify specific languages that overlap significantly with ethnic or cultural identity. While the use of lingo-ethnic categorization as identity is shallow, our principal aim is to show that, even at the most basic definitions of ethnicity or culture through language, we find evidence of inequality. This paper also offers a methodological contribution. We show that combining network analysis, neural-based natural language processing (NLP), and well-maintained academic databases can serve as a powerful diagnostic for access and equity, and improve the practice of science.
Several studies on elitism within the production of mathematical knowledge have been conducted. Methods draw predominantly from the complex network perspective [7], leveraging network repositories such as citation and bibliometric networks. Gargiulo et al. studied the entire connected giant component of the Mathematics Genealogy Project, enriching the data using data mining techniques [5]. Their work focused on integrating math history with temporal network analysis, noting the field's evolution based on country, discipline, and the structure of scientific families.
Prior work investigated the relationship between scientific mentorship and winning the Fields Medal or Wolf Prize, but results were inconclusive. Rossi et al. studied the role of advisor-advisee relationships [4]. They propose the genealogy index, adapted from the h-index, which was initially developed by Hirsch [3]. Malmgren et al. studied the role of mentorship on protégé performance, focusing on metrics of academic success like publication record [8]. Beyond scholarship, studies have also considered hiring practices [9] and departmental prestige [10]. The lack of metadata in these genealogies has limited the scope of investigation. This paper places elite community network flow as the focal point, contrasting the historical focus on the nation-state with the modern focus on identity.

HISTORICAL NETWORKS OF ELITE MIGRATION
We begin with a sketch of history. Figure 1a) captures the migration of elite mathematicians between five key countries. The subgroup of elites was created by aggregating the shortest paths between Fields Medalists. This ensures that the full graph is connected, and conceptually, denotes a minimal graph that connects all the medalists together. Here, migration is determined by comparing where a mathematician earned their Ph.D. and where their students earned their Ph.D. It is reasonable to assume primary advisors have moved to the same country as their advisees.
Prior to WWII, Western European countries were the strongholds of mathematical thought. Notably, France and Germany contained the highest proportion of elite mathematicians. Many Japanese mathematicians studied in Germany before returning to Japan, as part of modernization during the Meiji Restoration. Examples include Rikitaro Fujisawa, who studied at the University of Strasbourg with Elwin Christoffel before returning [11]. He was instrumental in reforming mathematics education in Japan.
Figure 1. Mobility patterns of mathematicians among countries (traditionally strong en masse). 1a), the migration of elite mathematicians from 1800 to the present. 1b), the net flow of elite mathematicians between 7 key countries. 1c), the flow analysis of elite mathematicians between countries. Exporting means mathematicians flow out of a country, importing means they flow in, and selfish means they are retained. Many countries only import at the elite level (see the bottom right). A full list is available in the appendix.
The flow chart reveals mass flows of researchers due to historical events. By 1932, the Holocaust led to mass migration from Germany to the United States and other European countries, including that of the prominent scientist Albert Einstein, which accounts for the drop in green volume. Similarly, we observe large amounts of outflow from Russia after the Cold War, greatly diminishing the presence of Russian mathematicians after the 1990s, and the second Italian mass diaspora after WWII. Beyond forced immigration, flow analysis also reveals the movement of reintegration. Japanese mathematicians immigrated to the United States following WWII, and this continued from the 60s to the 90s. Twenty years later, Japanese mathematicians flowed back toward Japan.
France is not shown in the Sankey flow chart (1a), but is historically one of the countries that produces the most elite mathematicians. The chord graph in 1b) shows the net flow of mathematicians over all time, with the color of the chord indicating net exports. The USA-GER chord is orange, which indicates a net outflow from the USA to Germany. Only France exports more mathematicians to the USA than it imports from the USA. In all other cases, the USA exports more to other countries. Figure 1c) shows the flow dynamics on a country level. In-flow is defined as the number of incoming edges, out-flow as the number of outgoing edges, and self-flow as the number of loops.
These results are similar to Gargiulo et al. [5] with two striking differences. First, the United States is a selfish and importing country at the elite level, whereas in general it is selfish and exporting. Secondly, there are many more importing countries compared to the general case, where most countries are exporting and selfish. Notice that many of the countries that are exporting and selfish are Western or part of the Soviet Union, where there were strong programs in mathematics. Other countries appear to import more at the elite level, because their "exports" are not as competitive as mathematicians exported from other countries. These two points allow us to infer three things. First, elite mathematicians have more mobility, and in many cases can begin work in foreign countries. Second, the United States imports more compared to the general case, attracting more elite members. Third, countries considered traditional mathematics strongholds can be observed in the lower left corner.
What this analysis tells us, beyond an exposé of diasporic history, is that the Fields Medal served as a way to mediate tensions. Much as the Olympics were held in Rome, Berlin, and Tokyo, the medal served to include internationally marginalized nations.

THE FLOW OF MARGINALIZED IDENTITIES
Having analyzed the history of elite communities in mathematics, we turn to the present. As 1a) shows, today there is significant flow between countries. Lingo-ethnic categories of identity serve as a useful construct for understanding network flow. Figure 2a) shows the representation of identities within three subgroups: all mathematicians (blue), mathematicians within the medalist subgroup (green), and the medalists themselves (red). Fig. 2 compares elite representation of subgroups relative to their actual proportions. For instance, there is a higher proportion of French medalists (14%) compared to the general proportion (8%). In contrast, there is a significant number of East Asian mathematicians (14%) but very low representation in both the medalist family and medalists themselves (5% each). On the level of flow, Figure 2b) characterizes identities in terms of in-flow, out-flow, and self-flow. High in-flow means a higher likelihood of being mentored. High out-flow corresponds to a greater likelihood of mentoring others. High self-flow means a higher likelihood of mentoring your own identity. The identity with the most self-flow is Japanese. However, when all mathematicians are considered, the Japanese are shown as green, that is to say the opposite of selfish. This indicates that reinforcing behavior only occurs at elite levels.
However, once these groups are aggregated into larger groups (Greater European, Asian, African and Arabic), differences become evident. European names have high self-reinforcing behavior, whereas Asian names and African and Arabic names have far fewer self-loops. This dispels a common myth that minority groups, due to homophily, tend to group together. This myth insinuates that barriers to entry are self-inflicted. However, as we see from 2b), most minority groups are far away from the selfish pole, with a healthy balance of in-flow and out-flow. Rather, increases in the quantity of self-loops occur in the greater European subgroup.

OLD STRONGHOLDS, NEW POSSIBILITIES
It is understandable that, when considering all mathematicians, there are high levels of self-flow: studying in elite and often foreign institutions is a privilege. However, the high self-flow in identity at the elite level suggests institutions can do more to open access, given their greater access to resources. This has been the case for Japan.
Japan is unique among Asian countries and identities in that there are many Japanese Fields Medalists (3), with high representation at elite levels. Japan has been known for its rapid westernization during the Meiji Restoration relative to its Asian counterparts. As early as 1872, its traditional form of mathematics, wasan, was replaced by Western science. Prussia, rather than the United Kingdom, was the primary source of westernization, and led directly to the establishment of the University of Tokyo [6]. After WWII, mathematicians sought to re-establish international ties and formed the International Congress of Mathematicians and a new International Mathematics Union (IMU). Marshall Stone, a proponent of this movement, said it clearly: "in considering American adherence to a Union, it must be borne in mind that we want nothing to do with an arrangement which excludes Germans and Japanese as such." Indeed, we find the ten founding members well-represented in the ternary diagrams, and not long after its founding, the Soviet Union joined. Revisiting Fig. 1a), we discover that the density of elite mathematicians in Japan increases after 1945.
What this says is that the Fields Medal can improve the status of marginalized populations. Mathematics historian Barany captures this aspiration, believing the Fields Medal should help "sculpt the future, rather than reward the past" [1]. What we observe is the opposite, where the elite perpetuate the elite. Fig. 3 demonstrates this clearly, showing French Fields Medalist Laurent Schwartz and his lineage.
Within 5 generations after Schwartz, 7 Fields Medalists emerge. In particular, Schwartz-Grothendieck-Deligne form a direct chain, as do Lions-Villani-Figalli. Note that Lions' father, Jacques-Louis Lions, was also a student of Schwartz. In other words, 13.3% of all Fields Medalists descended directly from Schwartz. Broadly, each of these made contributions to some form of algebraic geometry or functional analysis. Fig. 4 further shows that all medalists can be traced to 9 connected components, with the largest one holding 44 out of 60 listed Fields Medalists. A mathematician who is French and attends a Top 50 institution is 6.4 times more likely to gain membership in the elite circle. Here, the top 50 is defined as the top institutions attended by those in the elite group. Note that we defined our Fields Medalist subgroup minimally, such that any other definition of subgroup would yield a higher power ratio. On the other hand, being East Asian and attending a Top 50 institution only affords 1.4 times the likelihood of gaining membership in this elite circle.
From this diagram, we infer that institution plays a large role in elite membership. However, notice that an East Asian mathematician at a top 50 school is 4.5 times less likely to be included than a French mathematician attending a top 50 school. An Indian mathematician educated outside top 50 schools is 6 times less likely to be included than a French mathematician with the same education. Amongst non-elite institutions, being Japanese gives the best chance of inclusion, an after-effect of the efforts by the IMU.

CONCLUSION
In 2014, the late Iranian mathematician Maryam Mirzakhani won the Fields Medal. A talented star herself, her groundbreaking work on dynamics and geometry was encouraged by her Ph.D. advisor Curtis McMullen, also a Fields Medalist, at the elite institution Harvard University. This is by no means downplaying her achievements; rather, it serves to show the power that recognition and elite communities have, membership in which she rightly earned. Although the Fields Medal should serve to recognize under-represented researchers, the proper cultivation of talent through mentorship and institutional support should be the starting point.
In our evaluation of the present, there is a large under-representation of minority groups not just among Fields Medalists, but also in the elite circle of mathematics. While institutional prestige is a big factor, lingo-ethnic identity is also found to be highly relevant, the widest gap being 4.5 times the power ratio even at elite institutions. Given that elite institutions have more resources, they can take a bigger role in generating higher access for marginalized groups. Flow analysis also dispels the myth that under-representation arises from homophily-driven self-selection.
Although the French stronghold shows the old forces that govern mathematical knowledge remain strong, the presence of Japanese scholars also shows concerted effort can be used as an integrating force. Concerted efforts by international academic committees, such as prize giving, are a powerful force to confer equal rights for knowledge production to traditionally marginalized groups. Beyond analysis, this network analytical methodology is a call for scientific communities to use advisor-advisee databases to open knowledge production and scientific access.

Graph Construction
The graph was constructed using the Mathematics Genealogy database. Nodes are mathematicians, and directed edges represent advisor-advisee relationships. The data set contained information (listed in order of completeness) on the academic, advisor-advisee links, school, PhD graduation year, country, and dissertation title and topic. The IDs of medalists were identified, then the shortest path was computed in a pairwise fashion. Analysis was conducted primarily using the NetworkX package [12]. The subgroup of elites was created by taking the union of shortest paths between Fields Medalists. The full graph is then connected, and denotes some form of minimal graph that connects all the medalists together. While it is possible to produce a minimal spanning tree, given the forest-like structure of the genealogy, the shortest paths have more interpretive value.
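The union-of-shortest-paths construction can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name `elite_subgraph`, the toy node labels, and the choice to search paths on an undirected view of the genealogy (so that two medalists can be linked through a common ancestor) are all assumptions introduced here.

```python
from itertools import combinations

import networkx as nx


def elite_subgraph(genealogy, medalists):
    """Union of pairwise shortest paths between medalists (hypothetical helper).

    Assumption: paths are searched on an undirected view of the directed
    advisor -> advisee graph; original edge directions are restored afterwards.
    """
    undirected = genealogy.to_undirected(as_view=True)
    present = [m for m in medalists if m in genealogy]
    elite = nx.DiGraph()
    elite.add_nodes_from(present)
    for a, b in combinations(present, 2):
        try:
            path = nx.shortest_path(undirected, a, b)
        except nx.NetworkXNoPath:
            continue  # the two medalists sit in different components
        for u, v in zip(path, path[1:]):
            # keep whichever advisor -> advisee direction exists in the source graph
            if genealogy.has_edge(u, v):
                elite.add_edge(u, v)
            if genealogy.has_edge(v, u):
                elite.add_edge(v, u)
    return elite


# Toy genealogy with made-up labels; edges point from advisor to advisee.
toy = nx.DiGraph([("A", "B"), ("A", "C"), ("C", "D"), ("B", "E"), ("X", "Y")])
elite = elite_subgraph(toy, medalists=["B", "D", "Y"])
print(sorted(elite.edges()))
```

In the toy graph, B and D are joined through their common ancestor A, the non-medalist leaf E is left out, and Y stays isolated because no path reaches it, which mirrors the several connected components reported for the real data.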

Identity Classifier
Since lingo-ethnic identity is not included in the Mathematics Genealogy Project, a separate classifier is required. The identity categories were labeled using the ethnicolr package, which is a long short-term memory (LSTM) neural network trained on Wikipedia and the census [13]. Specifically, the LSTM was based on the seminal work of Graves and Schmidhuber [14]. This package has found use in evaluating under-representation in other STEM fields such as biomedicine [15]. It achieves between 78% and 81% accuracy. A potential shortcoming of neural methods for categorization is the accuracy level. However, for 13 individual categories (which would result in 7.7% accuracy if truly random), 81% is quite high. Additionally, since we are interested in comparison within individual demographics, any bias would be carried forward, since the group of all mathematicians is a superset of the medalist subgroup and the medalists themselves. The goal of using this classifier is not to flatten definitions of identity, but to use the best available tools for inference in the absence of concrete data.
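As a quick check of the baseline quoted above, and assuming "truly random" means guessing uniformly among the 13 categories, the chance accuracy and the relative improvement work out as:

```latex
\[
\text{chance accuracy} = \frac{1}{13} \approx 0.077 = 7.7\%,
\qquad
\frac{0.81}{1/13} \approx 10.5 .
\]
```

Under that assumption, the reported 81% is roughly ten times the uniform-guess baseline.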

Flow Analysis
Meso-graphs were constructed on attributes of each mathematician. To turn attributes into nodes, we constructed a mapping from mathematician to the meso-categories (lingo-ethnic and nationality of doctoral degree). Edges between meso-categories were simply the original directed edges between mathematicians. Each edge is then weighted by the number of advisor-advisee relations between meso-categories.
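A minimal sketch of this collapse, using only the standard library, might look like the following. The function name `build_meso_graph`, the toy identifiers, and the decision to skip mathematicians whose attribute is missing are assumptions made for illustration.

```python
from collections import Counter


def build_meso_graph(edges, category_of):
    """Collapse advisor -> advisee edges into weighted meso-category edges.

    `edges` is an iterable of (advisor, advisee) pairs and `category_of` maps
    each mathematician to a meso-category (for example, country of doctorate).
    Assumption: pairs with a missing category are skipped.
    """
    weights = Counter()
    for advisor, advisee in edges:
        src = category_of.get(advisor)
        dst = category_of.get(advisee)
        if src is None or dst is None:
            continue
        weights[(src, dst)] += 1  # one advisor-advisee relation
    return weights


# Hypothetical toy data: five relations across two countries.
edges = [("a1", "a2"), ("a1", "b1"), ("b1", "b2"), ("b1", "a3"), ("a2", "a3")]
country = {"a1": "FRA", "a2": "FRA", "a3": "FRA", "b1": "USA", "b2": "USA"}
meso = build_meso_graph(edges, country)
print(dict(meso))
```

Pairs with the same source and target category, such as `("FRA", "FRA")`, are the self-loops that later count as self-flow.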
Constructing Ternary Diagrams. We constructed the ternary diagrams through analysis of the meso-network. Every meso-network can be represented by its adjacency matrix, which we denote M. The diagonal then accounts for self-loops, the rows excluding the diagonal elements for the outgoing edges, and the columns excluding the diagonal elements for the incoming edges. Explicitly, for the meso-category indexed by i, we have the following definitions for in-flow (IF), out-flow (OF), and self-flow (SF):
$$\mathrm{IF}_i = \sum_{j \neq i} M_{ji}, \qquad \mathrm{OF}_i = \sum_{j \neq i} M_{ij}, \qquad \mathrm{SF}_i = M_{ii}.$$
We then normalize these values to represent each meso-category as a point in three-dimensional space.
Note that all points lie on the plane described by x + y + z = 1. We then transform this planar section onto the two-dimensional plane using a translation and two rotations, where R_1 rotates the plane up to the XY-plane, and R_2 aligns the simplex to the x-axis.
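The flow definitions and the projection to a ternary diagram can be illustrated with the sketch below. Several parts are assumptions rather than the paper's procedure: the coordinate order (x, y, z) = (in-flow, out-flow, self-flow), the function names, and above all the 2D mapping, which swaps in the common closed-form ternary layout (corners at (0, 0), (1, 0), and (1/2, √3/2)) in place of the translation and rotation matrices R_1 and R_2, whose entries are not given in the text.

```python
import math


def flows(M, i):
    """In-, out-, and self-flow of meso-category i, where M[i][j] counts edges i -> j."""
    n = len(M)
    in_flow = sum(M[j][i] for j in range(n) if j != i)   # column i, off-diagonal
    out_flow = sum(M[i][j] for j in range(n) if j != i)  # row i, off-diagonal
    self_flow = M[i][i]                                  # diagonal entry
    return in_flow, out_flow, self_flow


def ternary_point(M, i):
    """Normalized (x, y, z) on the plane x + y + z = 1 and an assumed 2D position."""
    in_f, out_f, self_f = flows(M, i)
    total = in_f + out_f + self_f
    if total == 0:
        return None  # category with no edges cannot be placed
    x, y, z = in_f / total, out_f / total, self_f / total
    # Assumed layout: in-flow corner (0, 0), out-flow (1, 0), self-flow (1/2, sqrt(3)/2).
    u = y + 0.5 * z
    v = (math.sqrt(3) / 2) * z
    return (x, y, z), (u, v)


# Hypothetical 3-category adjacency matrix; category 2 only sends edges out.
M = [[2, 1, 0],
     [1, 1, 0],
     [3, 0, 0]]
for i in range(len(M)):
    print(i, flows(M, i), ternary_point(M, i))
```

With this layout, a purely exporting category lands on the out-flow corner and a purely self-mentoring one on the top corner; a different corner assignment would only rotate or reflect the diagram.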