Introduction

Childhood obesity has reached epidemic proportions in the United States (US). From 1978 to 2016, the prevalence of obesity among youth aged 2–19 years in the US rose from 5% to 18.5% [1], and 1 in 6 children (16.1%) was overweight in 2017–2018 [2]. It is well established that children with obesity have a higher risk of chronic physical and psychosocial health problems [3,4,5,6,7,8,9]. Childhood obesity might be associated with poorer academic performance, which may, in turn, negatively impact children’s long-term professional and economic prospects. Potential mechanisms by which higher body mass index (BMI) may hinder academic performance include social isolation [10], disrupted brain development and cognition [11], school absenteeism related to poorer health [12], low-quality sleep due to breathing issues [13], and weight-based stigmatization by teachers when assigning course grades [14].

Literature on the association between obesity and academic achievement is inconclusive. Two systematic reviews of literature on the obesity-achievement relationship in youth found inconsistent evidence [15, 16]. One review found a consistent negative relationship only for girls’ math achievement [15], while the other concluded that the relationship between obesity and academic performance was uncertain in most studies after controlling for covariates including socioeconomic status (SES) and physical activity (PA) [16]. A 2019 meta-analysis of the relationship between BMI and academic achievement found a weak negative correlation using data from 60 studies enrolling 164,049 participants and published from 1999 to 2017, but the included studies did not all control for SES or cardiorespiratory fitness (CRF) [17].

It is difficult to compare study results due to their use of different measures of academic achievement, confounders, and body physique. Many studies use standardized tests to measure academic performance, while others rely on unstandardized outcomes such as teacher- or self-reported grades or grade point average [15, 17]. Although many studies use objectively measured weight and height, some rely on self-reported weight and height [15, 16]. Across studies, SES is identified as an important confounder of the relationship between obesity and academic achievement [15, 17]. In the US, childhood obesity declines as SES rises and higher SES is associated with higher academic achievement [18, 19]. Some common measures of SES include parent level of education, parent income, parent occupation or employment status, and (in the US) free/reduced-price lunch (FRL) eligibility.

In summary, research on the relationship between weight status and academic achievement remains inconclusive. If there is an association, studies suggest it is stronger among girls and older students. Significant research gaps remain. Multiple systematic reviews have called for better incorporation of CRF [15, 16]. Research that controls for confounders like SES is also needed, as are longitudinal studies that account for change in obesity status over time [15, 16]. In the US, studies that account for race/ethnicity and SES are particularly important given the country’s experiences with systemic racism and income inequality. The present study addresses existing research gaps and limitations by examining longitudinal data from a large, diverse sample of elementary schoolchildren, and by adequately controlling for confounders. The study seeks to answer the following questions: (1) Is longitudinal overweight or obesity associated with academic performance among children?; and (2) Does the relationship between overweight or obesity and academic performance differ across sexes, race/ethnicity, and level of CRF?

Materials/subjects and methods

Study design

Data used to answer the research questions are from a cluster-randomized controlled trial conducted in 40 elementary schools (20 intervention schools; 20 control schools) in a large suburban school district in Georgia, US. Students were prospectively followed from Grade 4 to Grade 5 including Grade 4 Fall (Fall 2018), Grade 4 Spring (Spring 2019), Grade 5 Fall (Fall 2019), and Grade 5 Spring (Spring 2020), though study activities ended midway through Grade 5 Spring in March 2020 because of the COVID-19 pandemic. School selection and randomization are described in a previous manuscript [20]. The school district administration, district IRB, and Emory University IRB (CR001-IRB00095600) approved this study. This study was registered with the National Institutes of Health ClinicalTrials.gov system, with ID NCT03765047.

The intervention employed components from the evidence-based Health Empowers You! program, which was designed using the Comprehensive School PA Program approach promoted by the Centers for Disease Control and Prevention (CDC) [21]. The multilevel intervention aims to shift school PA practices and culture and help students reach at least 45 min of PA during the school day. Prior evaluations of Health Empowers You! document improvements in average daily steps, moderate-to-vigorous PA (MVPA) levels in physical education (PE) classes, and student fitness and BMI [22, 23]. The intervention was implemented with the goal of sustainably elevating student school-day MVPA, which was measured with ActiGraph wGT3X-BT accelerometers (ActiGraph LLC, Pensacola, FL). Intervention status was ultimately not included in this analysis because differences in MVPA between intervention and control students were small; intervention students had approximately 3 more daily minutes of MVPA in Grade 4 Fall, 4.5 min more in Grade 4 Spring, and 5 min more in Grade 5 Fall. Details about the intervention are provided in a previous manuscript [20].

Before study implementation, consent/assent forms were distributed through district and school protocol with a brief informational video to obtain guardian consent and student assent to measure PA via accelerometry, and authorization for the school district to share de-identified demographic, standardized test score, course grade, FitnessGram, and attendance data with the research team.

Study population

Participating elementary schools included diverse student race/ethnicity and a mix of higher and lower SES. The school selection procedure ensured the schools were representative of the school district [20]. Of 6525 fourth graders in the 40 study schools, 4966 (76%) returned consents. Special education teachers participated in training and received resources for the implementation of the intervention at their discretion in the intervention schools, but students in special education classrooms were not included in data collection because these classes include multiple grade levels, and students in special education classes received teacher-assigned grades based on unique grading criteria. After removing students in special education classrooms from the analytic sample, 4936 students were eligible for analysis.

Data sources

The study used routinely collected school district data to obtain information about demographics, attendance, FitnessGram, course grades, and standardized test scores.

Demographic data included parent/guardian-reported student sex and race/ethnicity and school-reported students with disabilities (SWD), English language learners (ELL), and participation in FRL during the Grade 4 school year.

Attendance data included the number of days students were absent, tardy, and enrolled during the Grade 4 school year.

FitnessGram data documented students’ performance on the FitnessGram, an assessment developed by The Cooper Institute [24]. The district’s PE instructors are routinely trained in FitnessGram data collection, and the intervention’s Physical Activity Specialists (PASs) delivered a refresher training on FitnessGram to PE instructors in both years of the study. Students complete the FitnessGram in September/October and May/June each year. PE instructors measured student height and weight to calculate student BMI. Results from the FitnessGram PACER, a 20-m shuttle run, were used to estimate CRF. Full FitnessGram data were collected in Grade 4 Fall and Spring and Grade 5 Fall. FitnessGram data were not collected in Grade 5 Spring due to COVID-19. The PACER test was also not completed in Grade 3 because it has not been validated among third-grade students, but BMI data were collected in the Grade 3 Fall FitnessGram.

Semesterly course grades data included mathematics, reading, spelling, and writing grades from Grade 3 Fall to Grade 5 Fall.

Georgia Milestones Test data included student scores for Grade 3 Spring and Grade 4 Spring for English language arts (ELA), mathematics, and Lexile reading level [25]. The Milestone test is designed to assess whether students’ knowledge and skills meet state-adopted content standards for each academic subject [26]. Standardized tests were not administered in Grade 5 due to COVID-19.

Study measures

Exposure

The exposure for this analysis is longitudinal weight status based on BMI. CDC age and sex-specific growth charts [27] were used to categorize participants as obese, overweight, healthy weight, and underweight. Children with a BMI at or above the 95th percentile for their age and sex had obesity, those from the 85th–95th percentile had overweight, those from the 5th–85th percentile had a healthy weight, and those below the 5th percentile had underweight [28].
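The percentile cutoffs above can be illustrated with a minimal sketch. The function name weight_status is hypothetical, and the sketch assumes the BMI-for-age percentile has already been computed from the CDC growth charts (that step is not shown). It also assumes each lower bound is inclusive and each upper bound exclusive.

```python
def weight_status(bmi_percentile: float) -> str:
    """Map a BMI-for-age percentile (0-100) to a weight status category.

    Assumption: lower bounds are inclusive and upper bounds exclusive,
    e.g. exactly the 85th percentile counts as overweight.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if bmi_percentile >= 95:
        return "obesity"
    if bmi_percentile >= 85:
        return "overweight"
    if bmi_percentile >= 5:
        return "healthy weight"
    return "underweight"
```

For example, under these assumptions a child at the 90th percentile would be classified as having overweight.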

Longitudinal weight status was based on obesity status at two time points and had four categories. Students who were obese at baseline and at follow-up were assigned “persistent obesity,” those who were not obese at baseline but were at follow-up were assigned “developed obesity,” those who were obese at baseline but not at follow-up were assigned “former obesity,” and those who were not obese at both time points were assigned “persistent non-obesity.” For analyses examining Grade 4 standardized test scores as outcomes, baseline BMI was Grade 3 Fall and follow-up was Grade 4 Spring. For analyses examining Grade 5 fall course grades as outcomes, baseline BMI was Grade 3 Fall and follow-up was Grade 5 Fall.
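As an illustration only, the four-category assignment described above could be expressed as follows. The function and argument names are hypothetical; the category labels are those given in the text.

```python
def longitudinal_obesity_status(obese_baseline: bool, obese_followup: bool) -> str:
    """Combine obesity status at two time points into one of four categories."""
    if obese_baseline and obese_followup:
        return "persistent obesity"
    if not obese_baseline and obese_followup:
        return "developed obesity"
    if obese_baseline and not obese_followup:
        return "former obesity"
    # Not obese at either time point.
    return "persistent non-obesity"
```

The same two-flag logic would apply to any binary weight status definition measured at baseline and follow-up.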

Supplemental analyses also considered the association between longitudinal overweight/obesity status and academic achievement. For these analyses, students with overweight or obesity at baseline and at follow-up were assigned “persistent overweight/obesity,” those who were not overweight or obese at baseline but were at follow-up were assigned “developed overweight/obesity,” those who were overweight/obese at baseline but not at follow-up were assigned “former overweight/obesity,” and those who were not overweight/obese at both time points were assigned “persistent non-overweight/obesity.”

Outcomes

Two different types of academic achievement measures were assessed. The first was Grade 4 Spring ELA, math, and Lexile Georgia Milestones standardized test results. Participant math scale scores ranged from 394 to 715, ELA scale scores ranged from 357 to 775, and Lexile scores ranged from 190 to 1300. Analyses were conducted with Milestones scores as continuous variables.

The second type of academic achievement measure was teacher-assigned course grades for reading, writing, spelling, and math. Course grades for Grade 3 Fall to Grade 5 Fall were collected and ranged from 0 to 100, with 100 indicating the highest achievement.

Covariates

Variables examined as confounders included sex (male or female), race/ethnicity (Asian, Black, Latino, White, or Other), FRL, SWD, ELL, prior achievement, and CRF. FRL status was dichotomized as “receiving” or “not receiving” and was used as a proxy for poverty status since only students whose families earn less than 185% of the federal poverty level are eligible. SWD included those with physical or learning disabilities and was dichotomized as “yes” or “no.” Current ELL was also dichotomized as “yes” or “no.” Student prior achievement was defined as the previous year’s course grade or standardized test score, in accordance with the outcome assessed in analyses. For example, the analysis using Grade 4 Georgia Milestones math standardized test scores controlled for each student’s Grade 3 Georgia Milestones math standardized test score. PACER laps were converted to an estimated CRF using the Cooper Institute’s standard formula [29]. The median CRF across Grade 4 Fall, Grade 4 Spring, and Grade 5 Fall was assigned to each student. The “healthy fitness zone” cutoff for CRF in this age group is 40.2 [30]. A dichotomous CRF variable using this cutoff categorized students’ median CRF as “fit” or “unfit.”
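The dichotomization of CRF can be sketched as below. This is a simplified illustration rather than the study's code: the name classify_crf is hypothetical, a median at or above the cutoff is assumed to count as “fit,” and missing measurements are simply skipped here, whereas the study handled missing data through multiple imputation. The conversion of PACER laps to estimated CRF is not shown.

```python
from statistics import median
from typing import Optional, Sequence

HFZ_CUTOFF = 40.2  # "healthy fitness zone" cutoff reported in the text


def classify_crf(
    crf_values: Sequence[Optional[float]], cutoff: float = HFZ_CUTOFF
) -> Optional[str]:
    """Classify a student as "fit" or "unfit" from repeated CRF estimates.

    Assumptions: None marks a missing measurement and is ignored;
    a median exactly equal to the cutoff is treated as "fit".
    """
    observed = [v for v in crf_values if v is not None]
    if not observed:
        return None  # no usable measurements
    return "fit" if median(observed) >= cutoff else "unfit"
```

With hypothetical estimates of 39.0, 41.0, and 42.5 across the three measurement periods, the median is 41.0 and the student would be categorized as “fit.”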

Analysis

Variables were missing data either because students were not enrolled in the participating schools for the entirety of the study or because their observation did not meet inclusion criteria. Multiple imputation addressed missing data. Twenty imputed datasets were created using the multilevel multiple imputation program Blimp [31]. Implausible imputed values were set to variables’ upper or lower bounds, depending on the nature of the recorded implausible value.
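The handling of implausible imputed values amounts to clipping each value to its variable's plausible range. A minimal sketch follows, assuming a hypothetical helper clip_to_bounds; the bounds shown in the usage note are the 0 to 100 course grade range described under Outcomes, and the imputed values are invented for illustration.

```python
def clip_to_bounds(value: float, lower: float, upper: float) -> float:
    """Return value limited to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return max(lower, min(value, upper))
```

For instance, a hypothetical imputed course grade of 104.3 would be set to 100, and a hypothetical imputed grade of -2.5 would be set to 0.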

Descriptive statistics were computed on the non-imputed data. Variance in academic outcomes was similar across longitudinal overweight/obesity and longitudinal obesity subgroups. Two-level multilevel models were then fit with students nested within schools and synthesizing data across the 20 imputed sets. The teacher level was not included in multilevel analyses since students with departmentalized teachers rotated across teachers for core subjects. All models were first run with longitudinal obesity status as exposure. First, models assessed crude associations between longitudinal weight status and academic outcomes (Model A). Then the same associations were assessed but adjusted for prior achievement, FRL, sex, race/ethnicity, SWD, and ELL (Model B). For analyses with Grade 4 standardized test outcomes, Grade 3 standardized test scores were used for prior achievement. For analyses with Grade 5 Fall course grade outcomes, average Grade 3 course grade was used for prior achievement. Model C was further adjusted for dichotomized CRF. Fixed and random effects were aggregated across imputations using Rubin’s rules [32]. The same analyses were then run using longitudinal overweight/obesity as the exposure. On the basis of Model C, moderation analyses were also conducted for sex, race/ethnicity, and dichotomized CRF.
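For readers unfamiliar with Rubin’s rules, the pooling of a single coefficient across imputed datasets can be sketched as follows. This is a simplified illustration for one scalar estimate, not the study's analysis code: the function name pool_rubin is hypothetical, degrees-of-freedom adjustments are omitted, and the pooling of random effects is not covered.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the estimates.
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)
```

The total variance combines the average sampling variance with the variability of the estimate across imputations, so the pooled standard error is at least as large as the standard error implied by the within-imputation variance alone.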

It was critical to adequately control for SES since it is a key confounder of the association between weight status and academic achievement. FRL participation is an imperfect proxy for SES: in some instances the family of a student participating in FRL is not impoverished, and in others the family of a student not participating in FRL is impoverished. We therefore conducted a bias analysis using values of sensitivity and specificity of poverty classification that were derived from the CDC’s National Health and Nutrition Examination Survey conducted from 2011 to 2018. On the survey, parents indicated their race, their household income, and their children’s FRL status, which allowed for the estimation of sensitivity and specificity of poverty classification by FRL. We used these sensitivity and specificity values to calculate positive and negative predictive values of poverty status based on FRL participation. We in turn used those values to run a jackknife-weighted multilevel regression across the 20 imputed sets to see whether uncertainty about FRL as an SES proxy could substantially change findings.
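The step from sensitivity and specificity to predictive values follows from Bayes’ theorem and can be sketched as below. The function name predictive_values is hypothetical, and the sketch assumes that a poverty prevalence is supplied as an additional input, which the text does not describe. The numbers in the usage note are invented for illustration and are not the study's values.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) of a binary classifier such as FRL for poverty.

    All inputs are proportions; prevalence must lie strictly between 0 and 1.
    """
    for name, p in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0 <= p <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv
```

With a hypothetical sensitivity of 0.8, specificity of 0.9, and prevalence of 0.5, the positive predictive value would be about 0.89 and the negative predictive value about 0.82.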

Results

Descriptive statistics

The 4936-student study population was evenly split by sex and racially/ethnically and socioeconomically diverse (12% Asian, 25% Black, 33% Latino; 52% eligible for FRL) (Table 1). Median age in Grade 4 Fall was 9. From Grade 3 Fall to Grade 5 Fall, about 21% of participants had persistent obesity, 5% developed obesity, 3.5% no longer had obesity, and 70% had persistent non-obesity. Median CRF was relatively consistent across measurement periods, ranging from 40.8 to 41.8 VO2 max.

Table 1 Demographic characteristics, physical fitness attributes, and academic outcomes for study participants, Grades 3–5 (n = 4936).

Multilevel models – longitudinal obesity status and standardized test scores

In the unadjusted models (Model A), there were small, negative associations between persistent obesity and all standardized test outcomes (Math −8.1 {−11.7, −4.4}, ELA −6.9 {−10.6, −3.2}, Lexile −23.9 {−38.9, −8.9}) (Table 2). Students who developed obesity had negative associations of a larger magnitude. For example, for Lexile scores, the association was −63.7 (−95.7, −31.7) for those that developed obesity, −23.9 (−38.9, −8.9) for those with persistent obesity, and 14.3 (−14.0, 42.6) for those who no longer had obesity.

Table 2 Associations between longitudinal obesity status from Grades 3–4 and academic performance measured by Grade 4 standardized test scores.

When adjusting for sociodemographic covariates (Model B), all negative associations for standardized test scores migrated nearer to the null. Stronger negative associations remained between students who developed obesity and standardized ELA and Lexile scores. When also controlling for dichotomized CRF (Model C), all associations became near null for students with persistent obesity. Associations for students who developed obesity remained negative and further from the null for ELA (−4.9 {−10.1, 0.27}) and Lexile (−22.5 {−45.6, 0.59}).

Additional analyses investigated modification by CRF and found that interaction coefficients were consistently negative for Grade 4 standardized test scores, but meaningful modification by CRF was not indicated. Likewise, modification by sex or race/ethnicity was not indicated. Results are available upon request. Bias analyses suggested a slight bias of findings away from the null, without meaningfully changing interpretation.

Multilevel models – longitudinal obesity status and course grades

In the unadjusted models (Model A), students who either had persistent obesity or developed obesity had consistently lower academic performance across all teacher-assigned grade outcomes (Table 3). The magnitude of this relationship was stronger for those with persistent obesity (e.g., math −2.1 {−2.9, −1.4}) compared to those who developed obesity (−1.0 {−2.5, −0.5}) and those who formerly had obesity (0.3 {−1.5, 2.1}).

Table 3 Associations between longitudinal obesity status from Grades 3–5 and academic performance measured by Grade 5 Fall course marks.

When adjusting for sociodemographic covariates (Model B), all negative associations between persistent obesity and academic grades migrated toward the null. Small, negative associations remained between persistent obesity and Grade 5 Fall math and writing grades. When also controlling for dichotomized CRF (Model C), all associations were effectively null across those with persistent obesity, those who developed obesity, and those with former obesity. Additional analyses of modification by CRF found generally positive interaction coefficients, but meaningful modification was not indicated. Similarly, additional analyses did not indicate modification by sex or race/ethnicity (results available upon request). Bias analyses suggested a slight bias of findings away from the null, without meaningfully changing interpretation.

Multilevel models – longitudinal overweight/obesity status and standardized test scores/course grades

Analyses examining associations between longitudinal overweight/obesity and standardized test scores and adjusting for sociodemographic characteristics and dichotomized CRF (Model C) found some indication of an association for math test scores among students who developed overweight or obesity (−2.8 {−6.5, 0.91}) (Supplementary Table 1). Model C analyses also found that students who developed overweight or obesity had lower grades across subjects (−1.1 {−2.1, −0.13} in math, −0.72 {−1.5, 0.084} in reading, −0.70 {−1.6, 0.25} in spelling, and −0.95 {−1.7, −0.17} in writing) (Supplementary Table 2). Analyses examining modification by CRF, sex, and race/ethnicity did not indicate modification. Results for modification analyses are available upon request.

Discussion

The present study found only marginal negative associations of persistent overweight/obesity with academic performance. The associations did not differ across sexes or racial/ethnic groups. Analyses did not indicate an association between persistent obesity and academic outcomes over and above sociodemographic factors including SES (proxied by FRL participation), sex, race/ethnicity, disability, and speaking English as a second language. These findings point to the influence on academic outcomes of socioeconomic factors associated with childhood obesity in the US, including lower parent income, lower parent education, and poorer neighborhood conditions, as well as the influence of persistent racial inequality in the US on health outcomes and academic achievement. Low SES and racial inequality have clear negative consequences for both health and academic achievement [18, 33,34,35].

Evidence of an association between overweight/obesity and academic achievement was stronger among those students who developed overweight or obesity during the study period. This was particularly true for standardized test scores among students who developed obesity and for teacher-assigned course grades among students who developed overweight/obesity. This suggests interventions on weight status could be especially important for students experiencing a change in weight status.

The limited evidence of a strong, consistent negative association between weight status and academic achievement in this study aligns with existing research. A 2019 meta-analysis of 164,049 participants found only a weak negative correlation between BMI and academic achievement, though this meta-analysis did not account for SES or CRF [17]. A 2017 systematic review of 23 cross-sectional and 11 longitudinal studies found that the association between obesity and academic performance was uncertain after controlling for covariates including SES and PA [16].

The lack of modification by sex diverges from previous studies, but may reflect the age of participants in this study’s cohort. A 2017 systematic review found evidence of a consistent negative association among girls, but only among adolescents [15]. More broadly, the limited meaningful negative associations in this study align with prior research suggesting a weaker association between weight status and academic achievement among pre-adolescent children. A 2019 meta-analysis found the smallest pooled effect size for BMI on academic achievement among elementary school students compared to middle school and high school students [17]. This near-null association could be due to cognitive development processes wherein pubertal prefrontal cortex development is particularly important for the development of executive function [11, 15]. The finding of a more consistent negative association for students who developed overweight/obesity aligns with prior studies. One large national US cohort study found that among girls, those who moved from non-overweight to overweight status during kindergarten to third grade experienced a decline in standardized reading and math test scores [36]. Another analysis from the same cohort found lower math scores in girls who developed obesity from kindergarten to fifth grade [37].

There is less research on differences in this association across racial and ethnic groups, but the lack of modification by race/ethnicity aligns with some prior literature. Studies among racial minorities in the US have generally found other factors beyond weight status to be more important for academic achievement. In one study in Massachusetts, CRF was especially important for Black students; Black students who had high CRF achieved the same performance as high-SES, low-CRF Black students [38]. Another study in a predominantly Latino school system found that grit (a construct representing perseverance) was more important than BMI or CRF for predicting Latino students’ academic performance in English [39].

Previous literature suggests a stronger association between CRF and academic achievement than between weight status and achievement. A 2017 systematic review of 45 studies examining the relationship between various physical fitness components and academic achievement found strong evidence for a positive CRF-achievement relationship [40]. This was again found in a 2018 systematic review of 51 studies [41]. A 2020 systematic review and meta-analysis also identified a positive relationship and noted that the relationship was stronger in boys compared to girls. This meta-analysis also noted a stronger positive CRF-achievement relationship among children than adolescents [42], which could explain CRF’s larger role in this pre-adolescent sample.

This study has at least four strengths. First, it has a large sample of nearly 5000 students across 40 elementary schools. Second, the sample is highly diverse, reflecting national diversity in the US. Third, the study is longitudinal. Fourth, data collection from a single school district ensured greater consistency in recording data.

Despite these strengths, this study has at least three limitations. First, the 20-m shuttle run is not a perfect measure of CRF since student performance could be influenced by motivation. Nevertheless, it is a standard measure of children’s CRF. Second, analyzing Grade 5 Spring standardized test scores and course grades would have given the study a longer follow-up time, but this became impossible due to COVID-related disruptions. Finally, most variables had some missing data, but this was addressed through multiple imputation.

Future research should prioritize longitudinal designs to understand how changes in weight status and CRF interact to affect academic achievement across age groups. Longitudinal studies should also investigate how school-day PA contributes to CRF and academic achievement in the long term. Future studies should also compare findings across different measures of academic performance, including standardized tests and teacher-assigned course grades.

This study has important implications for education policy. It is already established that overall child health is associated with academic performance – healthier children are better prepared to learn [43, 44]. Both CRF and weight status factor into children’s health status, which in turn affects academic achievement. Interventions can and should target both CRF and weight status, especially since they are often correlated. Schools can boost PA by implementing active breaks, recess activities, and other initiatives during the school day that focus on higher-intensity PA and have benefits for both CRF and weight status. Interventions targeting CRF and weight status could be particularly impactful for academic achievement among students who risk shifting from a healthy weight to overweight/obesity.