The disparate impacts of college admissions policies on Asian American applicants

There is debate over whether Asian American students face additional barriers, relative to white students, when applying to selective colleges. Here we present the results from analyzing 685,709 applications submitted over five application cycles to 11 highly selective colleges (the “Ivy-11”). We estimate that Asian American applicants had 28% lower odds of ultimately attending an Ivy-11 school than white applicants with similar academic and extracurricular qualifications. The gap was particularly pronounced for students of South Asian descent (49% lower odds). Given the high yield rates and competitive financial aid policies of the schools we consider, the disparity in attendance rates is likely driven, at least in part, by admissions decisions. In particular, we offer evidence that this pattern stems from two factors. First, many selective colleges give preference to the children of alumni in admissions. We find that white applicants were substantially more likely to have such legacy status than Asian applicants. Second, we identify geographic disparities potentially reflective of admissions policies that disadvantage students from certain regions of the United States. We hope these results inform discussions on equity in higher education.


Introduction
Over the last several decades, questions have been raised over whether selective colleges in the U.S. discriminate against Asian American applicants in admissions decisions [Arcidiacono et al., 2022, Chun and Zalokar, 1992, Espenshade and Radford, 2009, Espenshade et al., 2004, Gelman et al., 2019, Long, 2004, Park, 2019, SFFA v. Harvard, 2019, Takagi, 1992]. In the 1980s, Brown and Stanford formed committees to audit their own admissions policies and practices [Chun and Zalokar, 1992, Takagi, 1992]. Brown found evidence of discrimination in its admissions process; Stanford did not find clear evidence of bias, but could not fully explain its lower acceptance rates of Asian American applicants relative to white students. A 1990 report by the U.S. Department of Education's Office for Civil Rights (OCR) investigated allegations that Harvard capped the number of Asian American students it admitted [Chun and Zalokar, 1992]. OCR found no evidence of an Asian quota, but concluded that Asian American applicants were less likely to be admitted than white students with similar academic qualifications. OCR further found that this disparity largely disappeared once recruited athletes and the children of alumni ("legacies") were excluded from its analysis, suggesting the gap in acceptance rates was driven by Harvard's stated preference for admitting students from these two groups [Chetty et al., 2023, Hurwitz, 2011, Park, 2019]. Most recently, in a 2023 decision, the Supreme Court ruled that Harvard engaged in unconstitutional racial balancing, holding the Asian American share of admitted students to approximately 20%, though Harvard denied doing so. In the more than 30 years since the OCR investigation, there have been limited third-party, applicant-level empirical analyses of potential discrimination in college admissions decisions against Asian American applicants.
Over this time span, both the demographics of the United States and the educational landscape have changed substantially. Asian American representation among K-12 public school students has more than doubled, increasing from 3% in 1993 to 7% in 2020 [Nowicki, 2022], and the overall admission rate to Harvard has dropped from 18% in 1990 to 5% in 2020 [Fu and Kim, 2020, Lee, 1993]. These changes suggest a need to reexamine college admissions policies for potential disparate impacts on Asian American applicants.
Here we analyze 685,709 first-year college applications submitted by 292,795 Asian American and white students to a subset of U.S. institutions with relatively low admit rates and relatively high yield rates. All of the applications we consider were submitted via a national postsecondary application platform over five application cycles, from the 2015-2016 cycle to the 2019-2020 cycle. We exclude students who attend a high school outside of the United States or who report primary citizenship outside of the United States. Given the complex patterns of immigration and marked heterogeneity in experiences across subgroups, we disaggregate our analysis by three regions of origin self-reported by the Asian American applicants in our dataset: South Asia, East Asia, and Southeast Asia. To preserve confidentiality, we focus on broader patterns rather than on individual institutions, and we report aggregate results across the combined set of colleges and universities we consider. In particular, our main outcome of interest is whether applicants were admitted to at least one of these institutions. One limitation of our analysis is that we do not directly observe admissions decisions, and so we infer these decisions based on enrollment choices, as described below.
After excluding students who we infer to be recruited athletes, we estimate that South Asian applicants had 49% lower odds of admission to the subset of schools we consider than white applicants with comparable test scores, high school grade-point averages, and extracurricular activities. We estimate that both East Asian and Southeast Asian applicants had 17% lower odds of admission to these schools. After additionally adjusting for whether a student applied early to any considered college or university, the student's high school, and whether the student is a legacy applicant, we estimate that Southeast Asian students were accepted at similar rates to white students, and that East Asian students had 10% lower odds of admission than white students. However, we estimate that South Asian applicants still had 30% lower odds of acceptance to these institutions than white students after adjusting for all available information in our data. We note that we do not have access to all materials submitted by and about applicants, such as essays, letters of recommendation, alumni interviews, and admission officer ratings. Finally, we explore how the relative share of Asian American and white enrollees might change at the colleges and universities we consider under various hypothetical admission policies. Under a policy that admits students solely on the basis of standardized test scores and participation in extracurricular activities, and holding fixed the combined number of enrolled Asian American and white students, we estimate that enrollment of South Asian students and East Asian students would increase substantially, while the number of Southeast Asian students would remain approximately the same.
Concerns about the disparate impacts of college admissions policies on Asian American students are often entangled with discussions about affirmative action [Antonovics and Sander, 2013, Gelman et al., 2019, Gersen, 2017, Hughes et al., 2016, Karabel, 2005, Kim, 2022, Park et al., 2023, Takagi, 1992, West-Faulcon, 2016]. At their core, however, these two issues (affirmative action and differences in the admission rates of similarly qualified white and Asian American students) are conceptually distinct. In particular, during the time period we consider, institutions could have admitted Asian American applicants at rates comparable to similarly qualified white students while still giving preference to applicants from groups underrepresented in higher education.

Data description
Our analysis is based on applications submitted through a national postsecondary application platform. The data we use contain detailed, anonymized information on each student, including race and gender; standardized test scores (ACT and/or SAT); high school grade-point average (GPA); Advanced Placement (AP) exam scores; structured descriptions of their extracurricular activities (e.g., the number of hours they spent participating in various clubs or sports); the location and other characteristics of the high school they attended; whether their parents attended college, and, if so, the colleges they attended; whether they received an application fee waiver (a proxy for financial need); the set of colleges to which they applied via the platform; and whether they applied early action or early decision to any of the institutions we consider (Table A4). If a student took the SAT, we convert their SAT score to an equivalent ACT score to facilitate comparisons between applicants and aid interpretation. Although we have quite detailed individual-level data, we do not have access to the full set of application materials, including student essays, letters of recommendation, or intended major. We also do not have access to internal college evaluations, such as interviewer ratings.
We approximate admissions decisions by first inferring enrollment decisions. We infer enrollment by observing the school to which a high school counselor sent a student's official high school transcript, information that is collected by the platform. (NB: official transcripts typically are required by colleges to formalize acceptance decisions.) We then infer that students were admitted to at least one of the schools we consider if, and only if, they sent a transcript to (i.e., ultimately enrolled in) one of those schools. This inference rests on an assumption that students who were admitted to at least one of the schools we consider ultimately attended one of those schools. While imperfect, three points suggest this process yields results that are suitably accurate for our purposes. First, we assessed the quality of our enrollment inference by matching 5,000 randomly selected applicants to the schools we consider to their true enrollments as reported by the National Student Clearinghouse. We find that the estimated precision of our enrollment inference strategy is 97%, with an estimated recall of 91%. We further find that accuracy is comparable across race groups (see the Methods section in the Appendix). Second, the schools we consider have relatively high yield rates, suggesting that admission to these schools is strongly correlated with enrollment. Finally, we find qualitatively similar results with an estimation strategy that holds under the weaker assumption that enrollment is independent of race, conditional on acceptance and other observed student characteristics (see the Estimating Admission Rates section in the Appendix for details).
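To make the validation metrics concrete, the following is a minimal sketch of how precision and recall could be computed when inferred enrollment (from transcript sends) is compared against ground-truth enrollment records. The function name and the toy data are hypothetical and are not drawn from the study.

```python
def precision_recall(inferred, actual):
    """Compare inferred enrollment flags against ground-truth flags.

    Both arguments are parallel lists of booleans (True = enrolled at one
    of the schools under consideration).
    """
    pairs = list(zip(inferred, actual))
    true_pos = sum(1 for i, a in pairs if i and a)
    false_pos = sum(1 for i, a in pairs if i and not a)
    false_neg = sum(1 for i, a in pairs if not i and a)
    # Precision: share of inferred enrollees who truly enrolled.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    # Recall: share of true enrollees that the inference recovers.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    return precision, recall


# Hypothetical toy sample of eight matched applicants.
inferred = [True, True, True, False, False, True, False, False]
actual = [True, True, False, True, False, True, False, False]
precision, recall = precision_recall(inferred, actual)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this framing, high precision means that few students are wrongly labeled as enrolled, while recall below 100% means that some true enrollees are missed by the transcript signal.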
Our study pool comprises 685,709 applications submitted by 292,795 students to the colleges and universities we consider in the 2015-2016 through the 2019-2020 application cycles. We include Asian and white applicants who attended a U.S. high school, excluding students from high schools for which we cannot reliably infer college enrollment (see the Methods section in the Appendix and Table A2). We cannot identify athletic recruits with certainty, but we exclude from our sample students who appear to be athletic recruits based on the timing of their applications and their reported extracurricular activities (see the Methods section in the Appendix). Within our study pool, 36% of applicants self-identify as Asian, with 51%, 15%, and 34% of these students self-identifying as East Asian, Southeast Asian, and South Asian, respectively. Finally, we supplement our data from the platform with public high school data from the Common Core of Data (CCD), private high school data from the Private School Universe Survey (PSS), and rurality data at the ZIP code level from the Economic Research Service of the U.S. Department of Agriculture.

Results
Among applicants to the colleges and universities we consider, we estimate that 16% of East Asian, 8% of Southeast Asian, and 10% of South Asian students were admitted to at least one of these institutions, compared to 12% of white applicants. While these aggregate admissions rates differ by race and ethnicity, they do not account for differences in qualifications across groups. For example, Asian American applicants had, on average, higher standardized test scores than white applicants (Table A3). As a first step to account for these differences, in Figure 1 we show estimated admissions rates by standardized test score for Asian American applicants and white applicants. We find that Asian American students were admitted at consistently lower rates than white applicants with comparable test scores, with the largest gap for South Asian applicants. For instance, among applicants with an ACT (or ACT-equivalent) score of 34, placing them in the 99th percentile of test takers, we estimate that 16% of white students were admitted compared to 9% of South Asian students, a relative gap of 43%.

Figure 1: Estimated rate of admission to at least one of the selective institutions we consider as a function of standardized test score, for Asian American applicants and white applicants in the study pool. Asian American applicants typically were admitted at lower rates than white applicants with identical test scores, with the largest gap for South Asian students. Among admits in our study pool who report ACT or SAT scores, 93% have ACT (or ACT-equivalent) scores at or above 32. Percentiles are derived from all students who took the ACT in 2018 [ACT, Inc., 2018]. Point sizes are proportional to the number of applicants in each group.

Standardized test scores are one factor among many that colleges consider when determining whom to admit. Additional criteria that we are able to observe include high school grade-point average (GPA), participation in extracurricular activities, legacy status, and the state in which each applicant's high school is located. To understand the extent to which these other considerations may explain the observed disparities in admissions rates, we fit a series of nested logistic regression models of the following form:

$$\Pr(Y_i = 1) = \operatorname{logit}^{-1}\left(\beta_S \mathbb{1}_S + \beta_E \mathbb{1}_E + \beta_{SE} \mathbb{1}_{SE} + X_i^\top \beta_X\right),$$

where $Y_i$ is a binary variable indicating whether applicant $i$ was admitted to any college or university we consider; $\mathbb{1}_S$, $\mathbb{1}_E$, and $\mathbb{1}_{SE}$ indicate whether the applicant identified as South Asian, East Asian, or Southeast Asian, respectively; and $X_i$ is a vector of additional covariates (e.g., test scores and GPA) that we vary across models, with $\beta_X$ the corresponding vector of coefficients. Our key coefficients of interest are $\beta_S$, $\beta_E$, and $\beta_{SE}$, which yield estimates of the gap in admissions rates between white applicants and Asian American applicants in the three Asian subgroups that we consider. We find similar results if we fit separate models comparing white applicants to applicants in each Asian subgroup individually (Tables A13-A15). Table 1 shows, for nine models that include different subsets of control variables, the fitted coefficients for each of the three Asian subgroups (see also Tables A5-A12). Coefficients are exponentiated for ease of interpretation as odds ratios.
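As a small illustration of how odds ratios relate to admission rates, the sketch below computes a raw (unadjusted) odds ratio and a relative gap from the 16% and 9% admission rates quoted above for applicants with a score of 34. The helper names are hypothetical. Because the inputs are rounded, the outputs need not match the reported figures exactly: the rounded rates give a relative gap of about 44%, whereas the text reports 43%. The model-based estimates in Table 1 also adjust for covariates, which this calculation does not.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds of admission in a group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


def percent_lower_odds(ratio):
    """Express an odds ratio below 1 as 'X% lower odds'."""
    return 100.0 * (1.0 - ratio)


def relative_gap(p_group, p_reference):
    """Relative difference in admission rates (not odds)."""
    return (p_reference - p_group) / p_reference


# Rounded rates quoted in the text for applicants scoring 34.
white_rate, south_asian_rate = 0.16, 0.09
ratio = odds_ratio(south_asian_rate, white_rate)
gap = relative_gap(south_asian_rate, white_rate)
print(f"raw odds ratio: {ratio:.2f}")
print(f"percent lower odds: {percent_lower_odds(ratio):.0f}%")
print(f"relative gap in rates: {gap:.1%}")
```

An exponentiated logistic regression coefficient plays the same role as `ratio` here, except that it compares applicants who are similar on the covariates included in the model.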
The first model includes only fixed effects for the application season and the subset of colleges (or application "basket") to which the student applied, among the full set of colleges we consider, facilitating comparisons among groups of students who applied in the same year and to the same subset of colleges. The corresponding coefficients are thus akin to raw admissions odds ratios across groups, without adjusting for differences in applicant credentials.
The second and third models in Table 1 additionally adjust for measures of academic preparation, including SAT/ACT alone (Model 2) and, additionally, GPA, AP test scores, and SAT II subject test scores (Model 3). These academic-preparation models corroborate the visual pattern in Figure 1: we estimate that Asian American students, especially South Asian students, had substantially lower odds of admission than white students with similar test scores and related academic credentials. These disparities largely persist when we progressively adjust for extracurricular activities (Model 4); gender and family characteristics, like whether the student received an application fee waiver (Model 5); and whether the student applied early (Model 6).
Next, with Model 7, we account for whether a student is the child of an alum. After adjusting for legacy status, in addition to all of the above-mentioned factors, we see large reductions in the estimated disparities in acceptance rates for all three Asian subgroups we consider. Figure 2 helps explain this result. The top panel of the figure shows estimated admission rates for Asian American applicants and white applicants conditional on legacy status and test scores. For a given test score, we estimate that applicants with legacy status, both white and Asian American, were more than twice as likely to gain admission as applicants without legacy status. In the bottom panel of Figure 2, we present the prevalence of legacy status among applicants with an ACT-equivalent test score of 32 or above, mirroring the focus of the upper panel. Here, we observe that white applicants were approximately three times more likely to have legacy status than East Asian and Southeast Asian applicants, and almost six times more likely than South Asian students. Thus, even though estimated acceptance rates conditional on test score and legacy status were similar across race and ethnicity, white students appear to benefit from being substantially more likely to have legacy status.
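The composition effect described above can be illustrated with a short calculation. In the sketch below, two groups face identical admission rates conditional on legacy status, yet their overall rates differ because legacy status is more common in one group. All numbers are hypothetical and chosen only to mirror the qualitative pattern (legacy applicants admitted at more than twice the rate of non-legacy applicants, and a six-fold difference in legacy prevalence); they are not estimates from the study.

```python
def overall_admit_rate(legacy_share, rate_legacy, rate_non_legacy):
    """Overall admit rate as a mixture of legacy and non-legacy rates."""
    return legacy_share * rate_legacy + (1.0 - legacy_share) * rate_non_legacy


# Hypothetical conditional admit rates, identical for both groups.
RATE_LEGACY = 0.30
RATE_NON_LEGACY = 0.12

# Hypothetical legacy prevalence: six times higher in group A than in group B.
rate_a = overall_admit_rate(0.12, RATE_LEGACY, RATE_NON_LEGACY)
rate_b = overall_admit_rate(0.02, RATE_LEGACY, RATE_NON_LEGACY)

print(f"group A overall admit rate: {rate_a:.4f}")
print(f"group B overall admit rate: {rate_b:.4f}")
```

Even with no difference in how legacy and non-legacy applicants from each group are treated, the group with more legacy applicants ends up with the higher overall rate, which is why adjusting for legacy status shrinks the estimated gap.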
In theory, the higher estimated admissions rates that we observe for legacy applicants may stem both from admissions practices that favor the children of alumni and from the potentially greater social capital of legacy students. We note, however, that Model 5 adjusts for whether an applicant had a parent who attended a top-50 institution (based on 2019 U.S. News rankings) not included in the subset of colleges on which we focus, or attended one of the colleges in our subset to which the student did not apply. These are proxies for having high social capital distinct from legacy status specifically. The change in disparities that we observe moving from Model 5 to Model 7 thus appears attributable to the specific benefits of having legacy status, rather than the more generalized benefits of high social capital.

Table 1: Estimated conditional odds of admission to at least one college or university we consider for Asian American applicants in the study pool relative to white applicants. Outcome: inferred acceptance to at least one college or university we consider. Model columns: Basket+Year, SAT/ACT, GPA+AP+SAT2, Activities, Sex+Family, Early App, Legacy, Location+HS, All. Coefficients are estimated via logistic regression, and are exponentiated for ease of interpretation as odds ratios. After adjusting for test scores and extracurricular activities (Model 4), South Asian students had 48% lower estimated odds of admission relative to white students, with East Asian and Southeast Asian applicants exhibiting smaller but statistically significant gaps (17% lower odds of admission). These disparities appear to be explained in part by legacy preferences (Model 7) and geography (Model 8). The "Aggregated Asian" coefficients are computed from separate models that do not separate students into Asian subgroups.

Figure 2: Estimated rate of admission to at least one college or university we consider for white applicants and Asian American applicants with high ACT or SAT scores. Across test scores, we estimate that applicants with a parent who attended one of the selective institutions we consider as an undergraduate are more than twice as likely to be admitted as non-legacy applicants with the same test scores. The bottom panel (axis: high-scoring applicants with legacy, %) shows the proportion of applicants with high test scores who have legacy status, disaggregated by race. High-scoring white applicants are three to six times more likely to have legacy status than high-scoring Asian American applicants, suggesting white applicants disproportionately benefit from a boost in admission rates afforded to those with legacy status.

Finally, we examine the relationship between estimated acceptance rates and geography. For each state, Figure 3 plots the estimated admission rate of high-achieving applicants (those with ACT-equivalent scores of 32 or above) against the fraction of applicants from that state who were Asian American. In computing this proportion, we limit to white applicants and Asian American applicants, and point sizes are proportional to the total number of high-scoring white and Asian American applicants in each state. The negatively sloped regression line shows that states with a larger fraction of Asian American applicants tended to have lower estimated admission rates. Further, states with a higher proportion of Asian American applicants tended to have higher average test scores, suggesting the geographic trend is not driven by a gap in academic achievement (Figure A2). This geographic pattern also persists when we exclude applicants from California, and when we disaggregate the data to the level of high school instead of state (Figures A1 and A3).

Figure 3: For each U.S. state, overall estimated admission rate to at least one institution among the subset of selective schools we consider for white applicants and Asian applicants with an ACT-equivalent score at or above 32 (vertical axis: high scorer admit rate, %), with the proportion of high-scoring white and Asian applicants who identify as Asian on the horizontal axis (Asian-identifying share of high scorers, %). Point sizes are proportional to the number of high-scoring white and Asian applicants from the state who applied to one of the institutions we consider. The red least-squares regression line is weighted by the same count of applicants. States with a greater share of Asian American applicants have, on average, lower estimated admission rates for high-scoring applicants.
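For readers unfamiliar with weighted least squares, the following sketch shows one standard way to fit a line in which each state is weighted by its number of high-scoring applicants, as described for the regression line in Figure 3. The function name and the three data points are hypothetical; the sketch illustrates the technique and does not reproduce the figure.

```python
def weighted_least_squares(xs, ys, ws):
    """Fit y = intercept + slope * x, minimizing sum of w * residual^2."""
    w_sum = sum(ws)
    # Weighted means of x and y.
    x_bar = sum(w * x for x, w in zip(xs, ws)) / w_sum
    y_bar = sum(w * y for y, w in zip(ys, ws)) / w_sum
    # Weighted covariance and variance terms.
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in zip(xs, ys, ws))
    s_xx = sum(w * (x - x_bar) ** 2 for x, w in zip(xs, ws))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical states: Asian-identifying share of high scorers (x),
# admit rate of high scorers (y), and applicant counts used as weights.
asian_share = [0.10, 0.30, 0.50]
admit_rate = [0.24, 0.22, 0.20]
n_applicants = [100, 400, 900]

slope, intercept = weighted_least_squares(asian_share, admit_rate, n_applicants)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Weighting by applicant counts means that states with many high-scoring applicants influence the fitted line more than states with few.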
Model 8 in Table 1, which adjusts for location as well as academic and extracurricular performance but not legacy status, shows that these apparent geographic preferences account for much of the admissions gap between white and Asian American applicants. Model 9, the last one we consider, adjusts for all application information available to us, including both legacy status and geography. After adjusting for this rich set of covariates, we see that the estimated admissions gap between Southeast Asian and white applicants largely disappears, though we still find that white students have higher estimated odds of admission than otherwise similar East Asian and South Asian applicants. It is unclear what may account for these remaining disparities, though it bears repeating that admissions officers have access to more complete application materials than we do, including letters of recommendation, essays, and interview assessments.
We conclude our analysis by exploring how the relative share of Asian American students at the institutions we consider might change under various hypothetical admissions policies.
In line with our analysis above, we restrict our attention to white students and Asian American students. Specifically, we hold fixed the combined number of students in these groups (approximately mirroring historical admissions outcomes, as shown in Figure A4), and so any increases in Asian American enrollment necessarily imply decreases in enrollment of white students. Any exercise of this sort is inherently speculative, in part because changes in admissions policies could alter application behavior, but we still believe it is informative to gauge the approximate magnitude of effects.
As a baseline, the top row of Figure 4 shows the estimated share of enrollees in our data from the three Asian subgroups of interest. The rest of the figure shows the estimated share of enrollees from these subgroups under eight hypothetical admissions policies that are divided into four categories. In the first category, which we call "top-k" policies, we imagine admitting students with the highest ACT-equivalent scores, with ties broken randomly. In the second category, "random above threshold," we consider policies that randomly admit students above an ACT-equivalent score t such that admitted students have a mean score equal to that of actual enrollees [Sandel, 2020]. For both of these categories we consider two variants: the "ACT" variant selects from the entire applicant pool of the schools we consider, while the "ACT+ECs" variant selects only from applicants with at least as many hours of reported extracurricular (EC) activities over four years of high school as the median of the hours reported by all enrollees. Under all four policies, we estimate the same or larger shares of Asian American students compared to what we observe in the data. Asian American students report, on average, fewer extracurricular hours than white applicants, so the ACT+ECs policy variant admits fewer Asian American applicants than the ACT variant.
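The two policy categories just described can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: applicants are dictionaries with hypothetical `score` and `ec_hours` fields, the threshold t is chosen as the score cutoff whose eligible pool has a mean score closest to a target mean, and the "ACT+ECs" variant is modeled as a pre-filter on extracurricular hours.

```python
import random
import statistics


def top_k(applicants, k, rng):
    """Admit the k highest scorers, breaking ties at random."""
    shuffled = applicants[:]
    rng.shuffle(shuffled)  # random order first, so the stable sort breaks ties randomly
    return sorted(shuffled, key=lambda a: a["score"], reverse=True)[:k]


def random_above_threshold(applicants, k, target_mean, rng):
    """Sample k applicants uniformly from those at or above a threshold t.

    Assumption: t is the cutoff whose eligible pool has a mean score
    closest to target_mean (e.g., the mean score of actual enrollees).
    """
    best_pool, best_diff = None, float("inf")
    for t in sorted({a["score"] for a in applicants}):
        pool = [a for a in applicants if a["score"] >= t]
        if len(pool) < k:
            continue  # not enough applicants above this cutoff
        diff = abs(statistics.mean(a["score"] for a in pool) - target_mean)
        if diff < best_diff:
            best_pool, best_diff = pool, diff
    if best_pool is None:
        raise ValueError("k exceeds the number of applicants")
    return rng.sample(best_pool, k)


def ec_eligible(applicants, enrollee_ec_hours):
    """'ACT+ECs' variant: keep applicants at or above the enrollee median EC hours."""
    cutoff = statistics.median(enrollee_ec_hours)
    return [a for a in applicants if a["ec_hours"] >= cutoff]


# Hypothetical applicant pool: scores 28..36, ten applicants per score.
rng = random.Random(0)
applicants = [
    {"id": i, "score": 28 + i % 9, "ec_hours": 100 * (i % 5)} for i in range(90)
]

admits_top_k = top_k(applicants, 15, rng)
admits_random = random_above_threshold(applicants, 15, target_mean=34, rng=rng)
eligible = ec_eligible(applicants, enrollee_ec_hours=[200, 300, 400])
admits_top_k_ecs = top_k(eligible, 15, rng)
```

Comparing the subgroup composition of `admits_top_k`, `admits_random`, and their EC-filtered counterparts against observed enrollees is the kind of comparison that Figure 4 summarizes.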
The final two categories we consider investigate outcomes under hypothetical policies that maintain both the current number of enrollees from each state and the total number of enrollees with legacy. Specifically, we first divide our historical data into 102 (2 x 51) cells consisting of legacy and non-legacy applicants from each U.S. state and Washington, D.C.; we then in turn apply each of the four policies described above to each of the 102 cells, ensuring for each cell that the number of students enrolled under the hypothetical policies matches the historical enrollment numbers. With these added legacy and geographic constraints, the share of Asian American enrollees is smaller than under the unconstrained analogs, as expected given our results above. But, even with these constraints, the number of Asian American enrollees across policies is still similar to or larger than the status quo.
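The cell-wise procedure described above can be sketched as follows. This is an illustrative simplification under assumptions: applicants carry hypothetical `state`, `legacy`, and `score` fields, the quota for each (state, legacy) cell stands in for that cell's historical enrollment count, and a simple top-k rule stands in for whichever of the four policies is applied within each cell. Only two states are used here, giving 4 cells rather than 102.

```python
import random
from collections import defaultdict


def top_k(pool, k, rng):
    """Admit the k highest scorers in a pool, breaking ties at random."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    return sorted(shuffled, key=lambda a: a["score"], reverse=True)[:k]


def admit_within_cells(applicants, quotas, policy, rng):
    """Apply a policy separately within each (state, legacy) cell.

    quotas maps (state, legacy) -> number of seats, so the number admitted
    from every cell matches its (hypothetical) historical enrollment.
    """
    cells = defaultdict(list)
    for a in applicants:
        cells[(a["state"], a["legacy"])].append(a)
    admitted = []
    for cell, seats in quotas.items():
        pool = cells.get(cell, [])
        if seats > len(pool):
            raise ValueError(f"cell {cell} has fewer applicants than seats")
        admitted.extend(policy(pool, seats, rng))
    return admitted


# Hypothetical pool: two states, 10 legacy and 40 non-legacy applicants each.
rng = random.Random(0)
applicants = [
    {
        "id": i,
        "state": "CA" if i % 2 == 0 else "NY",
        "legacy": i % 10 in (0, 1),
        "score": 28 + i % 9,
    }
    for i in range(100)
]
quotas = {("CA", True): 3, ("CA", False): 8, ("NY", True): 2, ("NY", False): 6}

admitted = admit_within_cells(applicants, quotas, top_k, rng)
```

Because seats are fixed within each cell, a policy can only reshuffle who is admitted among applicants sharing the same state and legacy status, which is why the constrained variants move subgroup shares less than the unconstrained ones.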

Discussion
Based on a large-scale analysis of applications to a subset of selective U.S. colleges and universities, it appears that Asian American students were less likely to be admitted than white students with comparable academic credentials and extracurricular activities, a disparity that is particularly pronounced for South Asian students. It further appears that much, though not all, of this gap is attributable to admissions practices that favor the children of alumni and apparent geographic preferences. These disparities likely stem from a complex set of objectives that universities work to balance, and are not necessarily driven by explicit or implicit racial preferences. Nonetheless, our results prompt questions about the equitable design of college admissions policies.

Figure 4 (caption excerpt): In all cases, we consider only the subset of Asian American students and white students, and so increases in Asian American enrollment correspond to decreases in the enrollment of white students. In most instances, the hypothetical policies we consider lead to an increase in enrollment of Asian American students, including those that preserve the number of legacy students and the number of enrollees from each state in the historical data.
In our primary analysis, we excluded applicants who we inferred were recruited athletes, under the assumption that filling sports teams is a hard constraint for many universities, and that doing so involves qualitatively distinct admissions criteria. We note, though, that athletic recruits are disproportionately likely to be white rather than Asian American: in our study pool, white applicants outnumber Asian American applicants by a factor of about two to one, but among inferred recruits, white applicants outnumber Asian American applicants by a factor of four to one. As a result, if we do not proactively exclude recruited athletes from our analysis, we find an even larger gap in the estimated admissions rates between Asian American students and white students with comparable academic credentials (Tables A13-A15).
Our results are subject to two key limitations. First, we have imperfect information on college admissions decisions. In our analysis, we infer admissions decisions from enrollment choices, where we assume that students who applied to but did not ultimately attend one of the selective schools we consider were not admitted to any of those schools. This assumption only allows us to approximately reconstruct admissions decisions. However, given the relatively high yield rates of the universities we consider, we believe this assumption is suitably accurate for our analysis. Further, we find qualitatively similar results under an alternative estimation strategy that rests on the weaker assumption that enrollment decisions are independent of race, conditional on acceptance and other observed student characteristics (see the Estimating Admission Rates section in the Appendix). Finally, our results remain largely the same if we eliminate any one school from our analysis (Tables A13-A15), suggesting the robustness of our results to the exact subset of schools we consider.
Second, we do not have access to each student's complete application materials. Specifically, we do not observe a student's intended major, essays, teacher recommendations, transcripts, interview ratings, or admission officer ratings. It is thus possible that students who we observe to have similar academic and extracurricular credentials are in fact different in important ways that are revealed in these other materials. We note, however, that results made public through litigation suggest that, at least in the case of Harvard's admissions practices, the disparities we identify persist after adjusting for several additional markers of academic and extracurricular excellence, including admission officer ratings of each applicant's academics, extracurriculars, teacher recommendations, and counselor recommendations (cf. Tables B.7.1 and B.7.2, which show coefficients). 7 Further, Kim [2022] finds that Asian American and white college applicants with similar academic credentials receive letters of recommendation that are "broadly similar in content and tone."

Discussions of college admissions practices impacting Asian Americans often revolve around affirmative action. But, as we noted at the start, these issues are conceptually distinct. In theory, one can implement affirmative action policies that maintain the share of students on campus from groups that are underrepresented in higher education while simultaneously admitting Asian American students at the same rate as white students with similar academic and extracurricular credentials. In such a case, we would expect the number of enrolled white students to decrease, not the number of racial minorities. During the time period we examined, affirmative action was widely used for shaping the diversity of college campuses, meaning the scenario described above was an option available to college administrators.
Thus, at the very least, our results shed light on past admissions choices and their consequences for Asian American college applicants. Now that affirmative action is legally prohibited, institutions will need to reconsider how applicants are evaluated in order to ensure equitable admissions processes and to maintain diverse campuses. For example, existing decision-making processes that afford preference to the children of alumni appear to disadvantage not only Asian Americans but also other racial minorities (Figure A5). Looking ahead, we hope our findings facilitate ongoing discussions about the design and implementation of equitable admissions policies.

7 Expert testimony provided in the Harvard case indicates that disparities in admission rates at Harvard are reduced after adjusting for admission officers' assessments of an applicant's "personal qualities" and admission officers' "overall rating" of an applicant. There is worry, however, that assessments of "personal qualities" are more subjective than ratings of academic and extracurricular achievements, are less clearly connected to merit, and may be influenced by implicit or explicit racial biases. Further, "overall ratings" are so closely tied to the final admissions decision that we would expect adjusting for them would mask any disparities [Jung et al., 2018].

Methods
• The first subsection describes the data filters used to construct the study pool.
• The second subsection describes how a record of a sent transcript is deemed a reliable or unreliable signal of enrollment.
• The third subsection describes our validation of sent transcripts as a signal of enrollment using true records of enrollment from the National Student Clearinghouse.
• The fourth subsection describes how we attempt to identify potential athletic recruits among the applicant pool.
"Estimating Admission Rates" summarizes a complementary analysis contingent on the weaker assumption that enrollment is independent of race, conditional on acceptance and other observed student characteristics. Table A1 shows the proportion of applications, as publicly reported by the institutions in the main analysis, that were submitted via the national postsecondary application platform. Table A2 shows summary statistics for applicants from high schools with reliable and unreliable records of sent transcripts. Table A3 shows summary statistics for white, East Asian, South Asian, and Southeast Asian applicants in the study pool.

Data filtering
We begin our core analysis with the 551,292 South Asian, East Asian, Southeast Asian, and white students who submitted at least one application to one of the selective schools we consider via the national postsecondary application platform in the 2015-2016 application cycle through the 2019-2020 cycle. We then filter to the 449,564 applicants who attended a high school in a U.S. state or the District of Columbia, and who did not report citizenship outside of the United States. We next limit to the 297,417 applicants who attended high schools with official transcript sends that, to the best of our knowledge, accurately reflect an intention to enroll. Finally, for our main analysis, we restrict to the 292,795 applicants who, to the best of our knowledge, are not athletic recruits.
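To make the attrition at each filtering step easier to see, the following minimal sketch tabulates the four applicant counts reported above. The step labels and the helper `funnel_report` are illustrative names, not part of the original analysis.

```python
# Applicant counts reported in the text after each successive filter.
# Labels are paraphrases; `funnel_report` is a hypothetical helper.
funnel = [
    ("White and Asian American applicants, 2015-2016 through 2019-2020", 551_292),
    ("U.S. high school and no non-U.S. citizenship reported", 449_564),
    ("High school with reliable transcript sends", 297_417),
    ("Not flagged as a potential athletic recruit", 292_795),
]


def funnel_report(steps):
    """Return (label, count, share of previous step, share of first step)."""
    rows = []
    first = steps[0][1]
    prev = first
    for label, count in steps:
        rows.append((label, count, count / prev, count / first))
        prev = count
    return rows


for label, count, of_prev, of_first in funnel_report(funnel):
    print(f"{count:>8,}  {of_prev:6.1%} of previous  {of_first:6.1%} of initial  {label}")
```

Under these counts, the largest single reduction comes from the transcript-reliability filter, and roughly 53% of the initial pool remains in the main analysis.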

Identifying high schools with reliable transcript sending behavior
Typically, when an applicant intends to enroll in a particular college to which they were admitted, their high school must submit an official transcript to the college. Many high schools use the same portal to submit official transcripts, and the platform observes when an applicant's transcript is sent via this portal. While the platform does not observe acceptances or enrollments, high school transcript sends serve as a highly accurate enrollment proxy for the subset of applicants who meet the following conditions:
• First, the applicant's high school must use the transcript sending platform. In other words, we only include applicants whose high school sent at least one transcript via the platform in the same year the applicant applied.
• Second, transcript sends must be targeted to specific colleges. If a high school counselor does not track the intended enrollment of a particular applicant, they may indiscriminately send final transcripts to every college to which the applicant submitted an application.
We define a high school's transcript sending behavior as "reliable" in a given year if the high school submits the same number of transcripts as applications for fewer than 5% of applicants who submit at least two applications. In this definition, we do not consider applicants who submit one application and one transcript, as we cannot reliably guess whether the student intended to enroll or the transcript was sent indiscriminately by their counselor. In Tables A13-A15, we replicate the main results with thresholds other than 5%, finding qualitatively similar results.
Among the applicants from high schools that exhibit reliable transcript sending behavior, we further exclude the applicants who submitted the same number of transcripts as applications and submitted at least two applications, since these students' counselors likely sent the transcripts indiscriminately.
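The reliability rule and the follow-up exclusion can be sketched as below. This is a minimal illustration under stated assumptions, not the code used in the analysis: the record keys (`school`, `year`, `n_apps`, `n_transcripts`) and both function names are hypothetical, and treating a school-year with no applicants who submitted two or more applications as having a 0% share is an assumption the text does not address.

```python
from collections import defaultdict


def classify_school_years(applicants, threshold=0.05):
    """Label each (school, year) pair as reliable (True) or not (False).

    `applicants` is an iterable of dicts with hypothetical keys
    'school', 'year', 'n_apps', and 'n_transcripts'.
    """
    eligible = defaultdict(int)   # applicants with at least two applications
    matched = defaultdict(int)    # ...who sent as many transcripts as applications
    sent_any = defaultdict(bool)  # school sent at least one transcript that year
    for a in applicants:
        key = (a["school"], a["year"])
        sent_any[key] = sent_any[key] or a["n_transcripts"] > 0
        if a["n_apps"] >= 2:
            eligible[key] += 1
            if a["n_transcripts"] == a["n_apps"]:
                matched[key] += 1
    reliable = {}
    for key, used_platform in sent_any.items():
        # Assumption: with no eligible applicants, the share is taken to be 0.
        share = matched[key] / eligible[key] if eligible[key] else 0.0
        reliable[key] = used_platform and share < threshold
    return reliable


def in_study_pool(a, reliable):
    """Keep applicants from reliable school-years whose sends look targeted."""
    indiscriminate = a["n_apps"] >= 2 and a["n_transcripts"] == a["n_apps"]
    return reliable.get((a["school"], a["year"]), False) and not indiscriminate
```

Changing `threshold` corresponds to the alternative cutoffs explored in the robustness checks (Tables A13-A15).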

Verifying transcript send enrollment signal with NSC data
Using a stratified random sample of 5,000 enrollments obtained from the National Student Clearinghouse (NSC), we find that our enrollment heuristic has nearly perfect precision (97%) and high, but not perfect, recall (91%). The 5,000 sampled applicants were selected from the study pool, which includes only those applicants whose high school met the threshold for reliable transcript sending behavior in the given application year.
Among the first stratum of 2,500 applicants with a transcript sent to at least one of the selective schools we consider, we were able to match 2,336 (93%) to NSC enrollments. We find that 2,271 actually enrolled in one of the selective schools we consider within a year of admission. Thus, the estimated precision of the enrollment proxy is 97%. Precision is nearly identical across race groups: white students have a precision of 97%, while Asian American students have a precision of 97.1%. Precision is also similar across application years, high school states, and application fee waiver status.
We are ultimately interested in the likelihood of admission to our specific set of selective schools. Thus, the transcript sending signal is arguably superior to knowledge of actual enrollment, as an admitted student with the intention to enroll may later decide not to enroll. If we were to use such a student's true enrollment as a signal of admission to one of the selective schools, we would falsely conclude that the student was not admitted. Thus, the true precision of our enrollment signal, viewed as a proxy for admission, may be even higher than 97%.
For the second stratum of 2,500 applicants who applied to one of the selective institutions we consider but did not send a transcript to one of these institutions, we were able to match 2,294 (92%) to NSC enrollments. Of the 2,294 matched applicants, 43 (2%) ended up enrolling at one of the selective institutions we consider within a year of applying. We attribute this discrepancy to a number of potential factors: students may be admitted off the waitlist at one of the selective schools we consider; individual counselors may not use the transcript sending portal even if the rest of the high school uses the portal; or students may transfer to one of the selective schools we consider after initially being rejected. To determine the source of the discrepancy, we disaggregate by whether the matched applicant sent a transcript to any school on the platform. Of the 2,294 matched applicants, 1,247 sent a transcript to a school on the platform outside of the specific set of selective schools we consider. Of these 1,247 applicants, 13 ended up enrolling at one of these selective schools within a year of applying, but 10 of these 13 enrolled first at the school to which they sent a transcript. We assume that these 10 students transferred to one of the selective schools we consider after initially being rejected, so we exclude them from the error calculation, as our proxy for rejection is assumed to be correct for the application year. The remaining three applicants may have been admitted off the waitlist at one of the schools we consider after initially committing to a different school. In the study pool, 139,888 applicants sent a transcript to a school on the platform outside of the specific set of colleges and universities we consider. Thus, from these applicants, we estimate 139,888 × (3/1,247) ≈ 337 unobserved enrollments at the considered schools.
The remaining 1,057 of the 2,294 matched applicants did not send a transcript to any school on the platform. Of these 1,057 applicants, 30 (3%) ended up at one of the selective schools we consider. We attribute these 30 enrollments to the idiosyncratic counselor behavior described above. In the study pool, 117,138 applicants did not send a transcript to any school on the platform. Thus, we estimate 117,138 × (30/1,057) ≈ 3,325 unobserved enrollments from these applicants.
In sum, in the study pool, we observe 35,769 enrollments to the selective schools we consider. We estimate 337 + 3,325 = 3,662 unobserved enrollments. Thus, our estimated recall is 35,769/(35,769+3,662) = 91%. Given that our estimated recall is based on discrepant enrollments of only 33 matched applicants, we cannot meaningfully evaluate recall across groups.
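The precision and recall estimates can be reproduced from the counts reported in this subsection. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Stratum 1: sampled applicants with a transcript sent to a considered school.
matched_senders = 2_336    # matched to NSC enrollment records
enrolled_senders = 2_271   # enrolled at a considered school within a year
precision = enrolled_senders / matched_senders

# Stratum 2: scale the sampled miss rates up to the study pool.
pool_sent_elsewhere = 139_888  # sent a transcript only to other schools
pool_sent_nowhere = 117_138    # sent no transcript via the platform
missed_elsewhere = round(pool_sent_elsewhere * 3 / 1_247)
missed_nowhere = round(pool_sent_nowhere * 30 / 1_057)
unobserved = missed_elsewhere + missed_nowhere

observed = 35_769  # enrollments observed via transcript sends
recall = observed / (observed + unobserved)

print(f"precision = {precision:.1%}")
print(f"estimated unobserved enrollments = {unobserved:,}")
print(f"recall = {recall:.1%}")
```

Because the recall estimate rests on only 33 discrepant sampled applicants (3 + 30), small changes in those counts would move the scaled-up total noticeably.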

Identifying potential athletic recruits
In order to field competitive athletic teams, universities often recruit students with exceptional athletic ability. University admission offices may have a hard constraint of filling athletic teams with a sufficient number of talented student athletes. Admissions decisions for student athletes are primarily the choice of athletic coaches, who are incentivized to offer admission to recruits with the greatest athletic ability who meet or exceed the minimum academic qualifications for admission. 8 Further, student athletes are typically admitted early. 9 We thus attempt to exclude students who, to our knowledge, may be athletic recruits, as the admission process for student athletes differs considerably from that of typical applicants. As a robustness check, we repeat the main analysis without excluding potential recruits, finding qualitatively similar results (Tables A13-A15).
While we do not observe true athletic recruitment status, we have access to detailed information about each applicant's extracurricular participation. We know not only the extracurricular activities of each applicant, but also the number of years participated in each activity during high school, the order in which those activities are reported in their application, and whether the applicant intends to continue participation in the activity in college. We assume that student athletes will list their athletic participation as the first activity in their application. We further assume that student athletes will have participated in their first-listed sport during all four years of high school, and that they intend to continue participating in their first-listed sport in college. Finally, we assume that student athletes will only apply to one college or university in an early round, and that they will always send a transcript to that one institution.
Among the 40,391 white and Asian American applicants who submitted a transcript to one of the institutions we consider in the main analysis and who also attended a high school in the U.S. with reliable transcript-sending behavior, 4,622 applicants were potential recruits (11%). While we cannot formally verify that these students were actually recruited by the schools we consider, 11% is in line with public estimates of the fraction of selective university enrollees who are student athletes.
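The assumptions used to flag potential recruits can be written as a single predicate. The sketch below is illustrative only: the `Activity` and `Applicant` classes, their field names, and the `"sport"` category label are hypothetical stand-ins for the platform's actual fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Activity:
    category: str              # hypothetical label, e.g. "sport"
    years: int                 # years of participation during high school
    continue_in_college: bool  # intends to continue the activity in college


@dataclass
class Applicant:
    activities: List[Activity]     # in the order listed on the application
    early_applications: List[str]  # colleges applied to in an early round
    transcripts_sent: List[str]    # colleges that received a transcript


def is_potential_recruit(a: Applicant) -> bool:
    """Apply the stated assumptions about how recruited athletes apply."""
    if not a.activities:
        return False
    first = a.activities[0]
    athletic = (
        first.category == "sport"
        and first.years >= 4
        and first.continue_in_college
    )
    one_early = len(a.early_applications) == 1
    sent_there = one_early and a.early_applications[0] in a.transcripts_sent
    return athletic and sent_there


example = Applicant(
    activities=[Activity("sport", 4, True), Activity("music", 2, False)],
    early_applications=["College X"],
    transcripts_sent=["College X"],
)
print(is_potential_recruit(example))
```

The rule is deliberately conservative in one direction: it can flag non-recruits who happen to fit the pattern, which is why the main analysis is repeated without this exclusion as a robustness check.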

Estimating Admission Rates
In our main analysis, we assume that students who were admitted to one of the selective institutions we consider ultimately attended one of those institutions. In this way, we could infer admissions decisions from enrollment choices, which we can in turn accurately impute by looking at the institution to which a student sent their final high school transcript. Here we describe an alternative estimation strategy that holds under the weaker assumption that enrollment choices are independent of race conditional on acceptance and other observable student characteristics.
Denote by A the event that a particular applicant is admitted to one of the schools we consider, where A = 1 if the applicant is admitted and A = 0 if the applicant is not admitted. Denote by E the analogous enrollment event. Finally, denote by R the race of the applicant, and by W a set of non-race covariates. Now, suppose we are interested in comparing the admission probability of an applicant with race R and another applicant of race R′ with identical non-race covariates W. We can express this comparison as a risk ratio:

\[
\frac{\Pr(A = 1 \mid R', W)}{\Pr(A = 1 \mid R, W)}.
\]

Without observing admission outcomes, the above ratio cannot be estimated directly. But suppose we assume that $E \perp R \mid A = 1, W$. In other words, conditional on acceptance and all observed non-race covariates, the decision to enroll in one of the considered institutions is independent of race. Then, because a student can enroll only if admitted,

\[
\Pr(E = 1 \mid R, W) = \Pr(E = 1 \mid A = 1, R, W)\,\Pr(A = 1 \mid R, W) = \Pr(E = 1 \mid A = 1, W)\,\Pr(A = 1 \mid R, W).
\]

Applying this result to the acceptance risk ratio:

\[
\frac{\Pr(A = 1 \mid R', W)}{\Pr(A = 1 \mid R, W)}
= \frac{\Pr(E = 1 \mid R', W) \,/\, \Pr(E = 1 \mid A = 1, W)}{\Pr(E = 1 \mid R, W) \,/\, \Pr(E = 1 \mid A = 1, W)}
= \frac{\Pr(E = 1 \mid R', W)}{\Pr(E = 1 \mid R, W)}.
\]

Thus, by assuming that enrollment is independent of race conditional on acceptance and non-race covariates, we can estimate the acceptance ratio using only data on enrollment.
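A small numeric check may help make the cancellation concrete. The sketch below uses made-up probabilities for a single covariate profile W; none of the numbers come from the study. When the yield among admitted students is the same for both groups, the enrollment ratio equals the admission ratio; when the yield differs by race, it does not.

```python
def enrollment_prob(admit_prob, yield_prob):
    """Pr(E=1 | R, W) = Pr(E=1 | A=1, R, W) * Pr(A=1 | R, W)."""
    return yield_prob * admit_prob


# Hypothetical admission probabilities for one covariate profile W.
admit = {"R": 0.10, "R_prime": 0.06}
admit_ratio = admit["R_prime"] / admit["R"]

# Case 1: yield does not depend on race (the assumption in the text).
common_yield = 0.70
enroll_ratio = (
    enrollment_prob(admit["R_prime"], common_yield)
    / enrollment_prob(admit["R"], common_yield)
)

# Case 2: yield differs by race, violating the assumption.
biased_ratio = (
    enrollment_prob(admit["R_prime"], 0.60)
    / enrollment_prob(admit["R"], 0.70)
)

print(f"admission ratio:               {admit_ratio:.3f}")
print(f"enrollment ratio, equal yield: {enroll_ratio:.3f}")
print(f"enrollment ratio, unequal:     {biased_ratio:.3f}")
```

The second case illustrates why the conditional independence assumption matters: any race-specific difference in yield among admitted students is absorbed into the estimated ratio.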
Averaging over W, we have

\[
\mathbb{E}_W\!\left[\frac{\Pr(A = 1 \mid R', W)}{\Pr(A = 1 \mid R, W)}\right]
= \mathbb{E}_W\!\left[\frac{\Pr(E = 1 \mid R', W)}{\Pr(E = 1 \mid R, W)}\right]. \tag{1}
\]

Importantly, the right-hand side of Eq. (1) can be estimated directly from enrollment choices, as done with the main models in our analysis (Table 1). In particular, take R to be white students and R′ to be, in turn, the three Asian subgroups we consider. Then, after adjusting for test scores, GPA, and extracurricular activities (i.e., by using Model 4 in the main text), we estimate that the average acceptance ratio is 0.58 for South Asian applicants, 0.85 for East Asian applicants, and 0.89 for Southeast Asian applicants. These estimates align with the results we report in Table 1, corroborating our main analysis.

Table A2: Summary statistics for the 'Included' applicants who attend high schools with reliable transcript-sending behavior, the 'Excluded' applicants who do not, and the combined set of 'All' applicants. On average, the 'Included' applicants submit more applications, apply early with a greater likelihood, are more likely to have legacy status, have higher standardized test scores, have more extracurricular hours, are more likely to play sports, are less likely to use application fee waivers, are more likely to attend urban and private high schools, and have smaller graduating class sizes. We re-run the main regression, weighting each applicant by the inverse of the probability that the applicant attends a high school with reliable transcript behavior, finding qualitatively similar results (Tables A13-A15, 'Reweighted' model variant).

Table A3: Summary statistics for the race and ethnicity groups included in the analysis. White applicants are more likely to have legacy status than Asian applicants, have a greater number of extracurricular hours, on average, and are more likely to attend smaller and private high schools. East and South Asian applicants have, on average, higher standardized test scores and take more AP tests than white and Southeast Asian applicants.

| | Platform and our data | Platform, but not our data | Neither platform nor our data |
|---|---|---|---|
| 1 | Unique applicant identifier | Full name | Athletic recruitment eligibility |
| 2 | Gender | High school transcript(s) | True admission outcome(s) |
| 3 | Race, ethnicity, and region(s) of origin | Academic honors | True enrollment outcome(s) |
| 4 | Age | Letters of recommendation | Ratings of admission officers |
| 5 | Citizenship status | Essays and written responses | Alumni interview ratings |
| 6 | High school name and location | Intended career | Official test scores |
| 7 | High school graduation date | College-specific fields (e.g., major) | Family income and assets |
| 8 | Self-reported test scores | | |
| 9 | Self-reported GPA, GPA weighting, and class rank | | |
| 10 | Highest educational attainment of parents | | |
| 11 | Institutions attended and degrees obtained by parents | | |
| 12 | Extracurricular categories, years participated, hours participated per year, leadership positions, and free text description | | |
| 13 | Application submission status at individual colleges | | |
| 14 | Application timing (e.g., restrictive early action) | | |
| 15 | Application fee waiver status at individual colleges | | |
| 16 | Receipt(s) of official transcript submission to individual colleges sent via the platform | | |

| 1 | | If SAT score reported, converted to equivalent ACT score |
| 2 | Equivalent ACT Composite Score Squared | |
| 3 | Missing ACT Score | Student did not report an ACT or SAT score |

Table A10: Variables included in Model 6, 'Early App'. Variables from Models 1 through 5 are also included.

| | | Additional description |
|---|---|---|
| 1 | Subset Double Legacy Undergrad | Both parents were undergraduates at the same considered school to which the student applied |
| 2 | Subset Double Legacy Grad | |
| 3 | Subset Double Legacy Mixed | One parent was an undergraduate and the other parent was a graduate student at the same considered school to which the student applied |
| 4 | Subset Single Legacy Undergrad | Exactly one parent was an undergraduate at a considered school to which the student applied |
| 5 | Subset Single Legacy Grad | |
| 6 | Subset Two Separate Legacy Undergrad | Each parent was an undergraduate at a considered school to which the student applied and both attended a different considered school |
| 7 | Subset Two Separate Legacy Grad | |

Table A13: Robustness checks of the main specification. Each variant of the main specification lists the corresponding value of the exponentiated East Asian coefficient for each of the nine models in the main analysis. Exponentiated coefficients are qualitatively similar across all specifications. Detailed descriptions of each variant are on the next page.

Detailed descriptions of each model variant:
• The 'E. Asian and white' variant fits the main specification only on East Asian and white applicants in the study pool, mimicking the effect of interacting race with each variable in the main model.
• The 'Include recruits' specification does not remove applicants who we believe may be recruited athletes.
• The '2015 only' model fits the main model on only the 2015-2016 academic year application data, with a similar interpretation for the other variants whose names end in 'only'.
• The 'Real ACT/GPA' model excludes applicants who do not report an ACT/SAT score and/or a high school GPA.
• The 'ACT ≥ 27' model removes applicants with an equivalent ACT below 27, as very few enrollees at the schools we consider have ACT scores below 27.
• The 'Remove legacy' model removes legacy applicants from the study pool, following a similar model choice in the SFFA v. Harvard court case.
• The 'Transcript senders' model includes only those applicants who sent a transcript to a specific college on the platform. These applicants have the strongest enrollment signal, as the precision of our transcript heuristic is 97%.
• The 'Regular decision' model excludes applicants who applied early to only one college, sent a transcript to that college, and did not apply anywhere else. This is a likely signal of enrollment at the school to which the student applied early.
• The 'No transcript thres.' model allows all high school-years to be included in the analysis, and only excludes students with at least one application who sent the same number of transcripts as applications. The '20% transcript thres.' model allows only applicants from high school-years for which fewer than 20% of applicants who submitted more than one application sent the same number of transcripts as applications. This model also removes all students with more than one application who sent the same number of transcripts as applications. The '0% transcript thres.' model does not allow high school-years with any applicants who submitted more than one application and sent the same number of transcripts as applications.
• The 'Reweighted' model reweights the main model by the inverse likelihood that the given applicant attended a high school-year where no more than 5% of applicants with more than one application sent the same number of transcripts as applications. The corresponding propensity model is fit using the same covariates as Model 9, excluding the state-year-basket fixed effects.
• The 'Leave one out' variants assess the sensitivity of the Asian region coefficients to the particular set of schools considered in the analysis. The exponentiated coefficients of the 'Leave one out max' and 'Leave one out min' variants are derived from fitting the 9 models n_schools times, where n_schools is the number of selective schools considered in the main analysis. To preserve confidentiality, the exact value of n_schools is not provided. For each of the 9 model specifications, we report the maximum and minimum observed values of the exponentiated Asian region coefficient across n_schools datasets. Each dataset excludes application data from one of the selective schools considered in the main analysis.

Table A15: Robustness checks of the main specification. Each variant of the main specification lists the corresponding value of the exponentiated Southeast Asian coefficient for each of the nine models in the main analysis. Exponentiated coefficients are qualitatively similar across all specifications. Model variants are described in the caption of the corresponding figure for the exponentiated East Asian coefficient.

Table A16: Aggregated counts of white applicants and admits across groups defined by geography, equivalent ACT score, and legacy status. Admission is proxied by observing whether a final transcript is sent to one of the schools we consider. "ZIP 1" refers to the first digit of the student's high school ZIP code. To preserve confidentiality, legacy and non-legacy applicant cell counts with fewer than 50 applicants are redacted, along with the corresponding count of admits. Further, legacy and non-legacy admit cell counts of 0 are redacted, along with the corresponding count of applicants.

Table A17: Aggregated counts of Asian American applicants and admits across groups defined by geography, equivalent ACT score, and legacy status. "ZIP 1" refers to the first digit of the student's high school ZIP code. To preserve confidentiality, legacy and non-legacy applicant cell counts with fewer than 50 applicants are redacted, along with the corresponding count of admits. Further, legacy and non-legacy admit cell counts of 0 are redacted, along with the corresponding count of applicants.

Table A18: Aggregated counts of South Asian applicants and admits across groups defined by geography, equivalent ACT score, and legacy status. Admission is proxied by observing whether a final transcript is sent to one of the schools we consider. "ZIP 1" refers to the first digit of the student's high school ZIP code. To preserve confidentiality, legacy and non-legacy applicant cell counts with fewer than 50 applicants are redacted, along with the corresponding count of admits. Further, legacy and non-legacy admit cell counts of 0 are redacted, along with the corresponding count of applicants.

Table A19: Aggregated counts of Southeast Asian applicants and admits across groups defined by geography, equivalent ACT score, and legacy status. Admission is proxied by observing whether a final transcript is sent to one of the schools we consider. "ZIP 1" refers to the first digit of the student's high school ZIP code. To preserve confidentiality, legacy and non-legacy applicant cell counts with fewer than 50 applicants are redacted, along with the corresponding count of admits. Further, legacy and non-legacy admit cell counts of 0 are redacted, along with the corresponding count of applicants.

Table A20: Aggregated counts of East Asian applicants and admits across groups defined by geography, equivalent ACT score, and legacy status. Admission is proxied by observing whether a final transcript is sent to one of the schools we consider. "ZIP 1" refers to the first digit of the student's high school ZIP code. To preserve confidentiality, legacy and non-legacy applicant cell counts with fewer than 50 applicants are redacted, along with the corresponding count of admits. Further, legacy and non-legacy admit cell counts of 0 are redacted, along with the corresponding count of applicants.

[Figure A1. Horizontal axis: Asian-identifying share of high scorers (%). Vertical axis: High scorer admit rate (%).]
Figure A1: For each U.S. state, estimated rate of admission to the selective schools we consider for white applicants and Asian applicants with an ACT-equivalent score at or above 32, with the proportion of high-scoring white and Asian applicants who identify as Asian on the horizontal axis. Point sizes are proportional to the number of high-scoring white applicants and Asian applicants to the considered schools whose high school is located in the given state. The red least-squares regression line is weighted by the same count of high-scoring white and Asian American applicants from each state. The blue line excludes applicants from California. States with a greater share of Asian American applicants have, on average, lower admission rates for high-scoring applicants. This pattern holds even if applicants from California are excluded.

[Figure A2. Horizontal axis: Asian-identifying share of high scorers (%). Vertical axis: Average equivalent ACT score.]
Figure A2: For each U.S. state, mean equivalent ACT score among applicants who reported an ACT score, with the proportion of high-scoring white and Asian applicants who identify as Asian on the horizontal axis. Hawaii is excluded from the plot due to its exceptionally high share of Asian American applicants. Hawaii's mean equivalent ACT score is 31.6. Point sizes are proportional to the number of high-scoring white applicants and Asian applicants to the considered schools whose high school is located in the given state. The red least-squares regression line is weighted by the same count of white and Asian American applicants from each state.

[Figure A3. Horizontal axis: Asian-identifying share of high scorers (%). Vertical axis: High scorer admit rate (%).]
Figure A3: For each high school in the study pool, rate of admission to any of the selective schools we consider for white applicants and Asian American applicants with an ACT-equivalent score at or above 32, with the proportion of high-scoring white and Asian American applicants who identify as Asian American on the horizontal axis. Point sizes are proportional to the number of high-scoring white applicants and Asian applicants to the considered institutions who attend the given high school. The red least-squares regression line is weighted by the same count of high-scoring white and Asian American applicants from the given high school. High schools with a greater share of Asian American applicants have, on average, lower admission rates for high-scoring applicants.

[Axis label: High-scoring applicants with legacy (%)]
Figure A5: The proportion of applicants with one or more parents who attended one of the selective schools we consider as an undergraduate, by race/ethnicity and fee waiver status. The pool of applicants in this figure is the same as the main analysis, but does not apply the filters for race or ethnicity. White applicants are the most likely to have legacy, and Asian American applicants are the least likely.