Large studies reveal how reference bias limits policy applications of self-report measures

There is growing policy interest in identifying contexts that cultivate self-regulation. Doing so often entails comparing groups of individuals (e.g., from different schools). We show that self-report questionnaires—the most prevalent modality for assessing self-regulation—are prone to reference bias, defined as systematic error arising from differences in the implicit standards by which individuals evaluate behavior. In three studies, adolescents (N = 229,685) whose peers performed better academically rated themselves lower in self-regulation and held higher standards for self-regulation. This effect was not observed for task measures of self-regulation and led to paradoxical predictions of college persistence 6 years later. These findings suggest that standards for self-regulation vary by social group, limiting the policy applications of self-report questionnaires.


Study 2
A. Note on items assessing self-regulation standards. We created two standards items for this investigation based on our prior studies of self-regulation in adolescence. Specifically, the item "I forgot something I needed for class" from the Domain-Specific Impulsivity Scale (4) inspired this question: "If a student in your grade says they 'sometimes' forget something they need for class, how often would you guess they mean?" Seven response options ranged from once a month to three times or more per day. Likewise, hours of homework was a focal measure in "Self-Discipline Outdoes IQ in Predicting Academic Performance of Adolescents" (5) and inspired this question: "If a student in your grade says they did 'a lot of homework' on a weeknight, how long would you guess they mean?" Eight response options ranged from 15 minutes to 3 or more hours.

Histograms of the absolute number of peers within each school that constituted near (blue) and far (red) peer variables. C. Example similarity matrix depicting the number of shared core courses between any two students. Each colored square is a pair of students. The diagonal represents the number of core courses each student is taking. Lighter colors represent more shared core courses.

We do not show partial correlations between peer GPA and other variables because the peer terms are collinear with the school dummy indicators. * p < .05, ** p < .01, *** p < .001.
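The similarity matrix described in the caption above can be illustrated with a minimal sketch. It assumes each student's core courses are available as a set. The function name `shared_course_matrix`, the student identifiers, and the course names are hypothetical and are not taken from the study's data or code.

```python
from typing import Dict, List, Set


def shared_course_matrix(enrollment: Dict[str, Set[str]]) -> List[List[int]]:
    """Count shared core courses for every pair of students.

    Rows and columns follow the sorted student identifiers. Each diagonal
    entry equals the number of core courses that student is taking, because
    a set intersected with itself is the set.
    """
    students = sorted(enrollment)
    return [
        [len(enrollment[a] & enrollment[b]) for b in students]
        for a in students
    ]


# Hypothetical enrollment data, for illustration only.
enrollment = {
    "s1": {"math", "english", "science"},
    "s2": {"math", "english", "history"},
    "s3": {"history"},
}

matrix = shared_course_matrix(enrollment)
for row in matrix:
    print(row)
```

In this toy example, s1 and s2 share two courses, s2 and s3 share one, and s1 and s3 share none. How shared-course counts were then turned into the near- and far-peer variables is not specified in this section, so the sketch stops at the matrix.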

Table S9. Differences across schools and intraclass correlation coefficients in Study 2
Columns: Not controlling for student demographics; Controlling for student demographics.

Notes. For each variable we report 5 models: (1) school-wide peers and no student characteristics; (2) near- and far-peers and no student characteristics; (3) school-wide peers, no student characteristics, and school fixed effects; (4) near- and far-peers, no student characteristics, and school fixed effects; (5) near- and far-peers, student characteristics, and school fixed effects. Robust standard errors clustered by school.
Notes. * p < .05, ** p < .01, *** p < .001.

another. To get acquainted with the software, students first completed a one-minute practice block of single-digit subtraction problems but without the option to watch videos or play Tetris. Students read: "First we are going to do some math problems for one minute. Try to solve as many problems as quickly and accurately as you can. To choose an answer, click on the bubble next to the answer you would like to select." After completing the practice block, students read an FAQ-style cover story that described specifically what they would be doing. Students first read that they would be solving math problems similar to the practice block. They were again encouraged to try to solve as many problems as quickly and accurately as possible.
Next, the general utility of completing the subtraction problems was emphasized to students. Students read that practicing basic math skills, like single-digit subtraction, can help make them better at problem solving ("You are doing this activity because it can make you smarter. Research shows that practicing basic math skills, like simple math, makes you a better problem solver . . . The more problems that you do, the better you will be at solving problems in the future."). Thus, if they desired, students could reasonably see completing the math problems as useful to their academic skills.
Students then read that, if they felt like it, they were free to click on the opposite side of the screen to watch fun videos or play Tetris, but were also reminded that the more problems they completed, the more likely it was that their problem-solving ability would improve. Thus, the instructions presented students with a choice: they could either spend their time solving simple but "good for you" math problems or, alternatively, watch frivolous but entertaining videos and play video games. It is important to note that students were free to do whatever they preferred; students were in no way obligated to do the math if they did not want to, nor would they be punished if they decided to watch the videos or play Tetris (indeed, about 5% of students exclusively watched videos or played Tetris).
After reading the cover story, students answered the question "Why are you going to do math problems?" by selecting one of the following response options: (a) Because I'm in school, (b) Practicing math can make me smarter, and (c) My teacher told me to. We also included this question to rule out the alternative hypothesis that performance on the Academic Diligence Task merely involved compliance with school rules and authority (as indicated by response options (a) and (c)). About 79% (n = 726) of students selected the expected response (Practicing math can make me smarter), suggesting both that they understood the message of the cover story and that they found it credible.
Next, the basic cover story was reiterated to students a final time: "You are about to begin the activity! Remember, you will be able to watch videos or play Tetris whenever you feel like it. To watch videos or play Tetris, click on the right side of the screen. To switch back to math, click on the left side of the screen. If you have any final questions before you begin please raise your hand now and an experimenter will assist you."
Students then began the 20-minute test phase of the ADT. The test phase consisted of five four-minute blocks in which the student could toggle between completing the single-digit subtraction problems and watching videos or playing Tetris. Math problems were presented one at a time, and students selected the correct answer among four response options. Once a response was selected, another math problem was immediately displayed; students did not receive performance feedback during the task. After each task block students were asked, "How bored were you by the math during the last session?" They responded using a 5-point scale from 1 = not at all bored to 5 = very bored. From start to finish (including the practice block, reading the cover story, answering subjective experience questions, completing the 20-minute test phase, etc.), the entire ADT lasted about 30 minutes.
Unknown to the students, the software recorded information regarding their interaction with the task, from which we derived two indices of academic diligence: (a) productivity and (b) time on task. Productivity represents the total number of math problems solved correctly, summed across all five task blocks. Time on task represents the total percentage of time students spent solving the math problems, averaged across all five task blocks. Time on task excludes time students spent on the videos/Tetris or time spent idling, which was coded as the amount of time it took for the student to

Students are given the choice to "Do math" or "Play game or watch movie." If they click "Do math," they solve single-digit subtraction problems. If they click "Play game or watch movie," a pull-down menu is displayed that contains various video clips or the option to play the video game Tetris. At any point during the activity students are free to either productively focus on the skill-building task or pass the time by engaging with the distractions, although the program restricts engagement to one activity at a time.

(1), students choose between "Do math" or "Play game or watch movie." If they click "Do math," they solve single-digit subtraction problems. If they instead click "Play game or watch movie," a pull-down menu is displayed that contains various video clips or the option to play the video game Tetris. At any point the students can toggle between math or entertainment, but the program restricts engagement to one activity at a time.

Notes. For each variable we report 2 models: (1) unadjusted; (2) controlling for student characteristics. Level 2 is the empirical Bayes estimate of the school mean. Level 1 is the deviation of each individual relative to the estimated school mean. Numeric variables were standardized prior to estimation, so coefficients are betas. Categorical variables are not standardized. † p < .10, * p < .05, ** p < .01, *** p < .001.
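The two diligence indices defined above, productivity and time on task, can be illustrated with a minimal sketch. The `Block` record, its field names, and the sample values are hypothetical and do not come from the study's software. The sketch also assumes that idle time has already been subtracted from `math_seconds`, because the rule for coding idling is not fully specified in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One task block of the ADT (hypothetical record layout)."""
    correct: int                  # math problems solved correctly in this block
    math_seconds: float           # seconds on math, excluding videos/Tetris and idling
    block_seconds: float = 240.0  # each block lasts four minutes


def productivity(blocks: List[Block]) -> int:
    """Total number of problems solved correctly, summed across blocks."""
    return sum(b.correct for b in blocks)


def time_on_task(blocks: List[Block]) -> float:
    """Percentage of time spent on math, averaged across blocks."""
    percentages = [100.0 * b.math_seconds / b.block_seconds for b in blocks]
    return sum(percentages) / len(percentages)


# Hypothetical data for one student across the five blocks.
blocks = [
    Block(correct=40, math_seconds=240),
    Block(correct=35, math_seconds=180),
    Block(correct=30, math_seconds=120),
    Block(correct=20, math_seconds=60),
    Block(correct=25, math_seconds=120),
]

print(productivity(blocks))  # 150
print(time_on_task(blocks))  # 60.0
```

Because all blocks have the same length, averaging the per-block percentages gives the same result as dividing total math time by total task time. The two would differ only if block lengths varied.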