Introduction

Reading plays an essential role in students’ academic lives as they need to understand, interpret, synthesize, and evaluate a great deal of information from texts (Grabe, 2009). This is especially important for students at the university level, as reading demands are quite high. To meet this challenge, students need to be fluent readers (NRP, 2000). Thus, reading fluency is of great importance. One crucial area of fluency is reading speed: the rate at which students read (Grabe, 2010).

Discussions of reading rates often begin with LaBerge and Samuels’ (1974) automaticity theory (AT), which explains that decoding at the word level is fast and easy for the fluent reader. It appears so automatic that the reader hardly seems aware of the process (Samuels, 2002). Slow readers, however, often spend too much effort decoding word meanings, which overtaxes their short-term memory and leaves few resources available for overall meaning construction. Readers who have difficulty with automaticity therefore read slowly, and comprehension suffers because they have forgotten what they read by the time they finish the passage and so cannot grasp the text’s overall meaning (Anderson, 1999b). Slower readers thus need to develop their reading rates so that they can allocate their attention to active comprehension processes.

To address this, Samuels (1979) proposed improving L1 learners’ reading rates by operationalizing AT through a repeated reading (RR) technique in which learners reread a passage several times in one of two ways: unassisted repeated reading (UARR) or audio-assisted repeated reading (AARR). The first involves rereading the passage silently. The second involves students reading the passage repeatedly while listening to an audio accompaniment. Following Samuels’ (1979) work, an abundant literature has theoretically and empirically explored RR in L1 settings, illustrating that the procedure successfully improves reading rates (Samuels, 2012).

Reading rates are also a major concern for English as a foreign language (EFL) learners. L2 learners often read well below the recommended rate of 200 words per minute (wpm), in some cases as slowly as 88 wpm, yet the L2 area has received much less empirical attention (Grabe, 2010; Taguchi et al., 2012).

Accepting that L2 research can benefit from an understanding of L1 studies, discussions in L2 literature similarly begin with LaBerge and Samuels’ (1974) AT and Samuels’ (1979) operationalization of the theory as UARR or AARR. Discussions of UARR often begin with and follow Anderson (1983), a major proponent of unassisted RR, and his interpretation of the procedure (Anderson, 1993, 1999a, 1999b, 2003, 2006, 2012). Applications can also be found in the works that followed (Baker, 2015; Chang and Millett, 2013). Explorations of AARR typically begin with and follow Taguchi and his colleagues’ (1997, 2002, 2004, 2006, 2008, 2010) interpretation of the procedure and continue with subsequent work (Altun, 2017; Lynn, 2021; Yeganeh, 2013).

These investigations have generally shown that the UARR procedure offers convenience, as preparation and equipment are unnecessary (Baker, 2015; Samuels, 1979), whereas AARR is labor-intensive, requiring the preparation of commercial audiotapes and/or recordings of teachers’ voices or teachers reading out loud, equipment (e.g., audio players), and possibly headphones if students listen individually (Baker, 2015; Samuels, 2002). In addition, AARR is susceptible to technical failure, and the audio component can impede comprehension as it pushes students along when they wish to pause or slow their reading to understand a passage (Taguchi et al., 2012).

These investigations have also generally produced significant yet incongruous results due to heterogeneous methods (sample sizes, matching students between groups, dissimilar pre/post-test and treatment materials, limited overlapping treatment vocabulary, and use of comprehension questions). Moreover, these investigations have focused on each method separately, and thus no single investigation has explored and compared these methods’ effects.

Literature review

Unassisted repeated reading in EFL contexts

Shortly after the publication of Samuels’ (1979) article, Anderson (1983) operationalized the UARR technique with EFL learners as a five-step procedure, describing and elaborating on it in future publications as follows:

  1. Give students one minute to read as much material as they can. Time them.
  2. After a minute, tell them to stop and write the number 1 where they are in the text.
  3. Then, have the students return to the beginning of the passage and read again for another minute.
  4. After the second minute, have them write the number 2 where they are in the text. The goal is to read more material in the second minute than in the first.
  5. Repeat this procedure a third and fourth time. Each time, have the students record the number (e.g., 1, 2, 3, 4).

Anderson noted that UARR can be applied to various materials and that he had empirically tested it, reporting significant increases for the experimental group (EG) but providing no details. He also noted the importance of maintaining an ideal 70% comprehension level, although no comprehension assessments were administered.

Anderson reiterated the benefits and operationalization of UARR in several publications and repeatedly provided practical demonstrations with conference audiences using plenary proceedings as reading material, explaining that the technique could be applied without equipment (e.g., audio assistance) and with any type of materials (Anderson, 1993; 1999a; 1999b; 2002; 2003; 2006; 2012). Yet no further empirical results were found to have been reported.

Following Anderson’s work, other researchers empirically tested the UARR procedure and added variations by experimenting with matching students’ reading levels to texts, the type of material used, the number of sessions, the number of passages read at each session, the number of RRs for each passage, and how and whether to address comprehension. Each provided positive but incongruent results. For instance, Chang and Millett (2013) used graded readers to test and assess reading speed and comprehension with EFL learners (Chinese, Japanese) but employed dissimilar materials for treatment (vocabulary and grammar-controlled reading speed textbook passages). Their 13-week design used a vocabulary test to match students to texts, two passages per session (26 in total), five RRs per passage, and comprehension questions after each RR. As indicated by parametric statistics (independent samples t test), the EG outperformed the control group (CG) in both reading rate (47 wpm; 13 wpm) and comprehension. However, these findings may have been affected by two variables. First, the EG was asked to record their rate after each RR, which may have affected their motivation to advance their rates (Samuels, 1979). Second, comprehension questions were included in the pre and post-tests and after the RRs, which may have encouraged participants to read at a slower pace (Carver, 1992; Gorsuch and Taguchi, 2008; Taguchi et al., 2012).

Baker (2015), using a single-group design with Taiwanese EFL learners and referencing the importance of applying Samuels’ easy material requirement through an i minus 1 approach to select texts, employed Betts’s (1946) five-finger technique to match texts (graded readers) to students’ reading levels. Baker added to Anderson’s method by clarifying the importance of beginning the procedure at the beginning of chapters and natural points in passages’ plots and of taking a short break between passages. Applying Anderson’s (1983) procedure with a large sample (N = 48) for 4 weeks (one session per week), with three passages per day and 4 RRs per passage for a total of 48 readings, Baker used the first reading of the first and final RR sessions, with the same treatment materials, as the pre and post-test. Applying a parametric test (paired-samples t-test), Baker reported a significant mean increase (43.55 wpm) and noted that comprehension was intentionally unassessed to avoid interrupting the RR procedure or negatively impacting rate gains, as had been reported in other studies.

Lynn (2021) also furthered UARR literature. Responding to claims that excessive RRs can be counterproductive owing to participant fatigue (Taguchi et al., 2012, 2016) and that excessive sessions can negatively impact gains (Millet, 2008), Lynn experimented with different numbers of RRs (3, 5) per session with a small sample (EG = 16; CG = 15) and an 18-session (one per week) treatment regime using 18 separate graded texts and comprehension questions. Using the first reading of the first and last sessions, with the same materials, as pre and post-tests and a parametric independent samples t-test, Lynn reported negligible, insignificant gains (3 RRs, 1.9 wpm; 5 RRs, 5.7 wpm). These results may be attributed to the inclusion of comprehension questions and to limited overlapping vocabulary when using multiple texts, which could reduce repeated exposure and slow the development of automaticity.

Audio-assisted repeated reading with EFL learners

Similar to UARR literature, these investigations employed RR in various ways (e.g., research designs, sample sizes, methods, materials, treatment regime, and comprehension assessment), which likewise resulted in mostly positive, yet incongruous results.

Taguchi (1997) began this trajectory by matching texts (graded readers) to students’ levels using a complicated combination of TOEFL scores and a lengthy (40-minute) cloze procedure. Taguchi additionally described the readability level of the texts. Afterward, Taguchi employed the following procedure, which, like Anderson’s (1983), would guide later research:

  1. Students read the previous passage to remember what they had read in the last session. This step is skipped only when they start a new story.
  2. They time their first reading of a passage with a stopwatch.
  3. They read the passage three times while listening to the exact taped version with headphones.
  4. They read the passage silently three more times and time each reading with a stopwatch (p. 107).

Taguchi employed a 14-week regime of 28 treatment sessions (presumably two per week), in which students read each passage seven times per session and were encouraged to read quickly but maintain comprehension, although no comprehension assessment was reported. Using the first readings of the first and last sessions as the pre and post-test, a nonparametric instrument (Mann–Whitney U-test) showed a 21-wpm gain, although gain transfer to new texts was not significant, most likely due to the small sample size.

Overall, Taguchi (1997) made several essential contributions to AARR literature. Aside from noting the importance of matching students with texts, he also outlined a pre and post-test procedure, the importance of attending to, but not necessarily assessing comprehension, and suggested materials (i.e., graded readers).

In the years that followed, Taguchi experimented with further variations. For instance, in contrast to the previous study’s single-group design, Taguchi and Gorsuch (2002) utilized an experimental one (EG = 9; CG = 9) with Japanese EFL learners. With pre and post-test passages from a reading test bank, comprehension questions at the first, third, and seventh RR, and graded readers (matched to students) for ten treatments (28 sessions), they reported that the nonparametric test (Mann–Whitney U-test) showed the EG outperformed the CG (26, 11 wpm), though not significantly. Additionally, both groups showed significant comprehension gains, but intragroup differences were insignificant. However, Taguchi and Gorsuch suggested that including comprehension questions could have adversely affected results, since students might have lowered their reading level below the rauding level (Carver, 1992; Gorsuch and Taguchi, 2008). They also noted the differences in readability between pre and post-tests and the small sample size as problematic.

In 2004, Taguchi et al. conducted a follow-up study and expanded the research area to compare the effects of AARR and extensive reading (ER) on rate and comprehension gains using the previous procedures with similar demographics and sample sizes. With similar pre and post-tests (passages from a reading inventory with comprehension questions) and a 42-week treatment session schedule (5 RRs per session) with graded readers for the AARR group (n = 10) and a 17-week treatment session schedule for the ER group (n = 10), the nonparametric Mann–Whitney U-test showed that both groups made significant rate gains, but the AARR group significantly outperformed the ER group. Moreover, they reported problems that may have adversely affected the results, such as pre and post-test comprehension questions, text readability levels, and small sample sizes.

In 2006, Taguchi et al. summarized and analyzed the theoretical and administrative challenges that may have affected their previous studies’ findings, i.e., sample sizes, unequally matched participant groups, and comprehension questions. These points again laid the basis for future research.

Gorsuch and Taguchi (2008) addressed challenges raised in previous studies. Besides using a larger sample (EG = 26; CG = 28) and addressing sample size comparability, they also discussed comprehension differently by using short stories for pre- and post-tests and comprehension questions that included short answers and recall procedures. They then used different materials (two graded readers) for an 11-week treatment (1 session per week; 5 RRs per session). According to parametric paired samples t tests, the EG showed a 54-wpm gain during treatment. However, for the pre and post-tests, where comprehension questions were applied, only the EG showed a slight increase (11 wpm), while the CG showed a reduction (3 wpm), insignificant comprehension gains, and inconclusive comprehension recall results. These disparate findings were attributed to participants slowing their reading rate in anticipation of post-reading comprehension questions (Carver, 1992). Gorsuch and Taguchi also noted a power outage that forced instructors to read the texts instead of using an audio tape.

Continuing to experiment with variations, sample size, and matching, Taguchi et al. (2012) conducted a case study (one Japanese EFL learner). In contrast to previous studies’ lengthy and complicated matching procedures, Taguchi et al. used a procedure similar to Betts’s (1946) five-finger method to confirm 98% vocabulary comprehension. In addition, Taguchi et al. tested a 13-week treatment regimen (70 sessions, 5 RRs each). Again, however, they employed a short story and open-ended comprehension questions for the pre and post-test and graded readers for the treatment. They found that both rate (24 wpm) and comprehension improved but that the audio component hampered comprehension since the participant had to skip over areas she needed to explore more deeply to keep up with the audio. Too many RRs (i.e., 5) were also reported as demotivating.

Other studies have furthered Taguchi and his coauthors’ work. For example, Chen and Ying (2009) employed Taguchi’s AARR procedure with Chinese EFL learners, two EGs and one CG (n = 30 each). Each group read passages from a textbook for 25 sessions. The EGs used the AARR procedure, whereas the CG read the text once. Chen and Ying also assessed comprehension (pre- and post-test questions) with additional materials. All three groups showed rate gains (49, 57, 21 wpm), but, according to the parametric paired samples t test, only the EGs’ gains were significant. As with Taguchi, Chen and Ying noted that the groups were unequally matched. They also did not mention overlapping vocabulary between passages or matching students to texts, problems that may have affected results.

Yeganeh (2013) repeated Taguchi’s AARR procedure with two EGs (one monolingual and one bilingual group; n = 20 each) for 18 weeks, using graded readers. Yeganeh employed a cloze procedure to match the students, and a third short story with open-ended questions was used to gauge rate and comprehension for the pre and post-test. Utilizing parametric measures (paired samples t test), Yeganeh reported that each group made significant gains (49 wpm, 55 wpm) but noted that limited overlapping vocabulary in the texts could have negatively affected the results.

Altun (2017), citing Taguchi’s AARR procedure, used a single group design with 11 undergraduates in Turkey and magazine articles from the British Council’s language learning website for eight sessions, presumably eight weeks. Altun used similar texts for pre and post-tests. Applying parametric analysis (paired samples t-test), Altun reported minor comprehension gains for 8 of the 11 students but no rate gains. Altun explained that the limited number of treatment sessions may have affected the results but made no mention of matching texts and students’ reading levels or overlap of the texts’ vocabulary, both of which could have affected the findings.

Thang and Ngoc (2020) applied Taguchi’s AARR procedure in a single group design with 23 Vietnamese undergraduates for 16 sessions over eight weeks (two per week), using TOEIC reading materials matched to the students’ levels by their TOEIC scores. Using parametric analysis (paired samples t-test), they found that reading rate (2 wpm) and comprehension increased significantly, but the gains were negligible. Additionally, these results could have been negatively affected by the TOEIC reading tests having little vocabulary overlap.

Research gap

In the decades since Samuels (1979) operationalized LaBerge and Samuels’ (1974) AT as the RR method and explained that it could be used with or without audio support, Anderson’s interpretation of UARR and Taguchi and his colleagues’ interpretation of AARR, along with subsequent research, have generally produced significant yet incongruous results. Moreover, no investigation has explored and compared these two methods’ effects on students’ reading rate gains in the course of one investigation. This study aims to address this crucial gap in the literature. It is hoped that by addressing this lacuna, the study may contribute valuable insights that can inform educators, policymakers, and researchers in the field of facilitating reading rate increases.

Methodology

To investigate and compare the effectiveness of Anderson’s interpretation of UARR and Taguchi and his colleagues’ interpretation of AARR, an experimental research design was employed (Fig. 1). This included three hypotheses and relevant sub-hypotheses:

Fig. 1 The research design flow of the study.

H1 UARR significantly affects students’ reading rate gains.

H2 AARR significantly affects students’ reading rate gains.

H3 There is a significant difference between UARR and AARR’s effects on students’ reading rate gains.

H3a UARR has a significantly higher effect on students’ reading rate gains than AARR.

H3b AARR has a significantly higher effect on students’ reading rate gains than UARR.

Setting and participants

The study was conducted at International University (an affiliate of Vietnam National University), in Ho Chi Minh City, Vietnam. Following the experimental design, nonprobability sampling was employed, selecting first-year, second-semester undergraduate English majors (pursuing a Bachelor of Arts in English Language and Linguistics), as they had had one semester to become familiar with the undergraduate academic program. Additionally, all participants had taken the university entrance examination, and thus their English levels can be expected to range from elementary to intermediate. The entire cohort, which comprised two sections, was selected. Following Dowhower’s (1987) use of the terms with L1 learners, these sections were labeled the UARR Group (UARRG) (n = 37) and the AARR Group (AARRG) (n = 40).

Several steps were taken to ensure the protection of the participants. For example, following ethics protocol, site access to the university and classes was obtained, participants received an invitation/informed consent form in their L1 (Vietnamese), no coercion was employed to encourage participation (Creswell and Creswell, 2018), and participants were informed that participation or non-participation would not affect course grades or the participants’ relationship with either the researcher or the course teacher.

Pre-treatment assessment

Prior to the study, the UARRG and the AARRG were given a prereading assessment to identify their reading levels to confirm comparability and select appropriate level reading materials. This was done using Betts’ Five Finger Test, a non-intrusive reading assessment instrument that can be completed quickly and inform reading level decisions (Baker, 2015; Baker et al., 2007; Chall, 1996; Johnson and Blair, 2003).

For this procedure, the students were given six passages (approximately 100 words each) from the Heinemann six-level graded reader series (Starter to Upper Intermediate). Next, the participants read the excerpts and circled the unknown words to identify their independent reading levels (Baker et al., 2007). Independent reading levels were assessed at 95% vocabulary comprehension (i.e., five or fewer unknown words per 100 words of text).
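The 95% criterion above can be illustrated with a minimal sketch. The function names, the 1-to-6 level indexing, and the rule of taking the highest level that meets the threshold are assumptions made for illustration; the study does not state how a single level was chosen across the six passages.

```python
def known_word_share(unknown_words, passage_words=100):
    """Share of words in the passage that the reader reports knowing."""
    return (passage_words - unknown_words) / passage_words


def independent_level(unknown_by_level, passage_words=100, threshold=0.95):
    """Return the highest 1-based level read at or above the threshold.

    unknown_by_level holds one unknown-word count per graded level,
    ordered from the easiest to the hardest passage (hypothetical layout).
    Returns None when no passage meets the threshold.
    """
    best = None
    for level, unknown in enumerate(unknown_by_level, start=1):
        if known_word_share(unknown, passage_words) >= threshold:
            best = level
    return best


# Illustrative counts only, not data from the study.
print(independent_level([0, 1, 3, 5, 8, 12]))
```

With these illustrative counts, the fourth passage is the last one with five or fewer unknown words per 100, so it would be treated as the independent level under the assumed rule.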

Materials

Following the pre-treatment assessment, one graded reader (Macmillan’s L. A. Winner), the type of material commonly used in studies of this type (Baker, 2015; Chang and Millett, 2013; Gorsuch and Taguchi, 2008; Lynn, 2021; Taguchi, 1997, 2004; Taguchi et al., 2012), was selected, as graded readers are designed to be controlled in vocabulary, structure, sentence length, and complexity (Bamford, 1984; Hill, 2008), features that can facilitate automaticity.

To avoid the problem of insufficient vocabulary overlap when using multiple graded readers or dissimilar texts, only one text was employed (Table 1). This text was selected according to Day and Bamford’s (1998) interpretation of Samuels’ (1979) easy material requirement (i minus 1 theory), which refers to material that is below each reader’s i level (current level of linguistic competence). This was done to provide a good fit between the readers and texts (Chall and Dale, 1995). The text was then separated into approximately equal lengths for each of the treatment sessions.

Table 1 Material.

Participants (matching groups)

After the pre-treatment assessment, a pre-test (first reading of the first text sample) was performed, and the treatment began. At this point, using a purposive matched ability nonrandom sampling approach, the two groups were further defined by the timed reading rates of the first reading of the first text sample (Baker, 2015; Chang and Millett, 2013; Lynn, 2021; Taguchi, 1997; Taguchi and Gorsuch, 2002; Thang and Ngoc, 2020; Yeganeh, 2013).

To ensure comparability, matching pairs from each group were identified when their prereading scores were within two standard deviations (range 0–1.42), resulting in 25 students for each group (Table 2).

Table 2 Matched pairs.
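One way to form matched pairs from two groups’ pre-test rates is sketched below. This is an illustration only: the greedy nearest-neighbor rule, the function name, and the tolerance value are assumptions, since the study reports the resulting pairs and their range but not the pairing algorithm.

```python
def match_pairs(group_a, group_b, tolerance):
    """Greedily pair each rate in group_a with the closest unused rate in group_b.

    A pair is kept only when the two rates differ by no more than
    `tolerance` (a hypothetical cut-off in wpm). Unpaired rates are dropped.
    """
    remaining = sorted(group_b)
    pairs = []
    for rate_a in sorted(group_a):
        if not remaining:
            break
        closest = min(remaining, key=lambda rate_b: abs(rate_b - rate_a))
        if abs(closest - rate_a) <= tolerance:
            pairs.append((rate_a, closest))
            remaining.remove(closest)
    return pairs


# Illustrative rates only, not data from the study.
pairs = match_pairs([200.0, 250.0, 300.0], [201.0, 249.5, 340.0], tolerance=1.5)
print(pairs)
```

A greedy rule is simple to audit by hand, which suits small classroom samples, although it does not guarantee the largest possible number of pairs.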

Afterward, the treatment continued. All students in each group participated in the treatment. However, only the treatment samples’ scores were assessed (25, 25) (i.e., data for students who did not have a matching pair were not included in the final analysis). Additionally, participants were not informed whose data was selected for inclusion to avoid the Hawthorne effect. To further determine the pre-treatment similarity of the two groups, a two-tailed independent-samples t-test was applied. The results showed that the groups had similar reading rates: UARRG (m = 252.92, SD = 46.19), AARRG (m = 252.5, SD = 46.48). The Levene test of equality of variance yielded a p value of 0.976, demonstrating equality of variances. The independent-samples t test further showed that the difference between the groups was not significant (p = 0.975); thus, the groups were deemed appropriate to be included in the study. The participants in each group were also shown to have similar demographic profiles: age, gender, and year of study (Tables 3 and 4).

Table 3 Demographics of UARRG.
Table 4 Demographics of AARRG.
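The pre-treatment comparison can be checked from the reported summary statistics alone. The sketch below computes a pooled-variance t statistic and, as a stated simplification, approximates the two-tailed p value with the normal distribution rather than the t distribution with 48 degrees of freedom; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic assuming equal variances."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


# Reported pre-test means and standard deviations for the matched groups.
t = pooled_t_from_summary(252.92, 46.19, 25, 252.5, 46.48, 25)

# Normal approximation to the two-tailed p value (an assumption of this sketch).
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
print(round(t, 3), round(p_approx, 3))
```

The statistic is close to zero and the approximate p value lies near the reported 0.975, which is consistent with the conclusion that the groups did not differ before treatment.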

Unassisted repeated reading group

The UARRG received UARR treatment. The procedure was conducted for five days, for a total of 60 RRs. The treatment design was adapted from Anderson’s (1993) UARR procedure. To operationalize this, procedures were further adapted from Baker (2015), i.e., the passages in the texts were marked off in 250-word sections, and markers (the numbers 50, 100, 150, etc.) were placed at 50-word increments prior to the treatment to facilitate data collection. Afterward, the following steps were conducted:

  1. Repeated readings were begun at the beginning of chapters and at natural points in the texts’ plots.
  2. The researcher controlled the time with a stopwatch. The participants read for one minute. At the end of one minute, the researcher rang a small bell. The students stopped reading and marked the last word they read.
  3. The participants repeated Step 2 with the same material (i.e., the first passage) three more times. This produced four readings of the first passage.
  4. The participants repeated steps 2 and 3 with the second passage. This produced four readings for the second passage.
  5. The participants repeated steps 2 and 3 with the third passage. This produced four readings of the third passage.
  6. The researcher collected the reading rate sheets for each of the 12 readings (i.e., four readings per passage) and recorded them on a record sheet.
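Because each UARR reading lasts exactly one minute, the rate in wpm is simply the number of words read. A minimal sketch of the bookkeeping follows; the function name and the way a position is split into a printed marker plus extra words are assumptions for illustration.

```python
def words_read_in_one_minute(last_marker, words_past_marker):
    """Words read in a one-minute reading, which equals the rate in wpm.

    last_marker: the last 50-word marker the reader passed (50, 100, 150, ...).
    words_past_marker: words counted between that marker and the marked word.
    """
    return last_marker + words_past_marker


# Treatment volume described above: four readings of three passages on five days.
READINGS_PER_PASSAGE = 4
PASSAGES_PER_DAY = 3
TREATMENT_DAYS = 5
total_uarr_readings = READINGS_PER_PASSAGE * PASSAGES_PER_DAY * TREATMENT_DAYS

# Illustrative position only, not data from the study.
print(words_read_in_one_minute(200, 37), total_uarr_readings)
```

The pre-printed markers mean only the words after the last marker need to be counted by hand, which keeps data collection quick.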

To avoid mistakes in administering the procedure created by language difficulties, all instructions were given in the students’ L1 (Vietnamese). In addition, to encourage accurate recording and reporting, the students were assured that their performance in the UARR practice would not affect their course grades, was entirely anonymous and voluntary, and that their record sheets would be masked without identifying personal information during data analyses.

Audio-assisted experimental group

The AARRG treatment also employed 60 RRs. The treatment design was adapted from Gorsuch and Taguchi (2010). Because Taguchi’s procedure utilizes 5 RRs per session, a 12-day treatment regime was employed to provide comparable treatment between the UARRG and AARRG (60 RRs for each group).

  1. The participants read an approximately 500-word segment. A stopwatch-like timer was made available to them. The participants recorded the time on a time log sheet.
  2. The participants read the text a second and a third time while listening to it on an audiotape.
  3. The participants finally read the text a fourth and fifth time, timing themselves for each reading and marking each time on their time log sheet (Gorsuch and Taguchi, 2010).

Post-test

Following the standard field protocol, the post-test consisted of reading rates for the first reading of the last treatment (Baker, 2015; Gorsuch and Taguchi, 2008, 2010; Taguchi et al., 2004). As with the treatment phase for both the UARR and the AARR, no attempt to assess comprehension was made during the pre and post-testing phases to determine how increased rates impacted students’ immediate comprehension. This omission, similar to other studies (Taguchi, 1997), was intentional to ensure participants did not artificially slow their RRs in anticipation of comprehension questions, a problem noted in theoretical discussions (Carver, 1992) and empirical studies (Gorsuch and Taguchi, 2008; Taguchi et al., 2012).

Data analysis

To analyze the data necessary to address the hypotheses, nonparametric statistics were employed. To address H1 and H2, the Wilcoxon test was used to compare the pre and post-difference (gains) for each group, and to address H3, H3a, b, the Mann–Whitney U-test was applied to compare the gains between the groups.
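The two rank-based tests can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the software used in the study: the Wilcoxon p value uses the normal approximation without tie or continuity corrections, so statistical packages that use exact distributions may report slightly different values, and all data shown are invented.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Return (W, approximate two-sided p) for paired pre and post scores."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w  # never positive, because w is the smaller sum
    return w, min(1.0, 2 * NormalDist().cdf(z))


def mann_whitney_u(gains_a, gains_b):
    """Return the smaller Mann-Whitney U for two independent samples."""
    n_a, n_b = len(gains_a), len(gains_b)
    ranks = average_ranks(list(gains_a) + list(gains_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return min(u_a, n_a * n_b - u_a)


# Invented scores for illustration only.
pre = [240, 250, 260, 255, 245]
post = [280, 300, 290, 310, 270]
w, p = wilcoxon_signed_rank(pre, post)
u = mann_whitney_u([40, 50, 30, 55, 25], [70, 80, 60, 90, 65])
print(w, p < 0.05, u)
```

The within-group test works on paired differences, whereas the between-group test ranks the pooled gains, which is why the same ranking helper serves both.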

Results

To explore and compare the effects of UARR and AARR on undergraduate EFL learners’ reading rate gains, the results of the hypotheses (H1–H3a, b) are addressed separately.

Findings for H1

H1 investigated whether UARR significantly affects students’ reading rate gains. The results showed that the UARR group post-test (m = 300.64; mdn = 302) was higher than its pre-test scores (m = 252.92; mdn = 250), indicating a m = 47.72, mdn = 42 gain (Fig. 2).

Fig. 2 Results for H1: UARR’s effect on students’ reading rates.

The results of the Wilcoxon test further showed that this difference was statistically significant, p < 0.001 (Table 5). Hence, H1 was supported.

Table 5 H1: UARR’s effect on students’ reading rates.

Findings for H2

H2 explored whether AARR significantly affects students’ reading rate gains. The results showed that the AARR group’s post-test (m = 329.77; mdn = 313.77) was higher than its pre-test scores (m = 252.5; mdn = 252), indicating a m = 77.27, mdn = 6.177 gain (Fig. 3).

Fig. 3 Results for H2: AARR’s effect on students’ reading rates.

The results of the Wilcoxon test further showed that this difference was statistically significant, p < 0.001 (Table 6). Hence, H2 was supported.

Table 6 H2: AARR’s effect on students’ reading rates.

Findings for H3

H3 explored whether there is a significant difference between UARR and AARR’s effects on students’ reading rate gains. The results of the descriptive statistics show that the UARR group had lower values for the dependent variable gain (m = 47.72; mdn = 40) than the AARR group (m = 77.27; mdn = 82.8), demonstrating that the AARR group outperformed the UARR group (Fig. 4).

Fig. 4 Results for H3a, b: Comparing UARR and AARR’s effects on students’ reading rate gains.

The results of the Mann–Whitney U-test showed that the difference between UARR and AARR with respect to the dependent variable gain was statistically significant, U = 133, p < 0.001, r = 0.49 (Table 7). Hence, H3 was supported.

Table 7 H3a, b: Comparing UARR and AARR’s effects on students’ reading rate gains.
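The reported effect size can be reproduced from U and the group sizes. The sketch below assumes the common conversion r = |Z| / sqrt(N) with a normal approximation for Z and no tie correction; the study does not state which formula was used, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_effect_size(u, n1, n2):
    """Return (z, approximate two-sided p, r) for a Mann-Whitney U statistic."""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 2 * NormalDist().cdf(-abs(z))
    r = abs(z) / sqrt(n1 + n2)
    return z, p, r


# Reported U with 25 matched participants per group.
z, p, r = mann_whitney_effect_size(133, 25, 25)
print(round(z, 2), p < 0.001, round(r, 2))
```

Under these assumptions r rounds to 0.49 and the approximate p value falls below 0.001, in line with the values reported above.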

Findings for H3a

H3a investigated whether UARR has a significantly higher effect on students’ reading rate gains than AARR. The results indicated that the UARR group did not significantly outperform the AARR group. Hence, H3a was not supported.

Findings for H3b

H3b explored whether AARR has a significantly higher effect on students’ reading rate gains than UARR. The results indicated that the AARR group significantly outperformed the UARR group. Hence, H3b was supported.

Discussion and conclusion

The study explored and compared the effects of UARR and AARR on undergraduate EFL learners’ reading rate gains. The results showed that UARR significantly affected students’ reading rate gains. This result is in accordance with literature that has argued for (Anderson, 1993; 1999a; 1999b; 2002; 2003; 2006; 2012) and demonstrated the positive effects of UARR (Anderson, 1983; Baker, 2015; Chang and Millett, 2013) and contrary to literature that has shown otherwise (Lynn, 2021). The results also demonstrated that AARR significantly affected students’ reading rate gains. This result is in accordance with studies that have reported gains (Chen and Ying, 2009; Gorsuch and Taguchi, 2008; Taguchi, 1997; Taguchi et al., 2004; Gorsuch and Taguchi, 2006; Taguchi et al., 2012; Taguchi and Gorsuch, 2002; Yeganeh, 2013) but contrary to research that has not shown such gains (Altun, 2017; Thang and Ngoc, 2020). The results further showed that AARR had a significantly higher effect on students’ reading rate gains than UARR.

Overall, the findings unsurprisingly indicate that both approaches produced significant gains in reading rates. Additionally, the results further the literature by showing that the more resource-intensive AARR outperformed the less resource-intensive UARR. This has broader implications for educational practice: educators and policymakers should consider which approach is more appropriate for their context’s resources and goals, or even adopt a holistic perspective that recognizes the multifaceted nature of learning and the unique strengths, limitations, and resource requirements of the different approaches. As no previous studies have been found to compare these two techniques, we hope this study not only provides practical insights and informs pedagogic and policy decisions by expanding RR literature regarding the comparative effects of UARR and AARR but also inspires further investigations that refine the field of facilitating reading rate increases and take it in new directions.

Regarding new directions, the results have practical pedagogic and policy implications. However, the findings also raise questions that can be addressed in future research. First, efforts were made to control the limitations and heterogeneity of previous research that would hinder analyses (sample sizes, matching students between groups, dissimilar pre/post-test and treatment materials, lack of overlapping vocabulary in treatments, and comprehension questions). Second, efforts were also made to control the number of treatments, e.g., 60 RRs for each group; however, as UARR and AARR use different numbers of RRs per session, this resulted in a dissimilar number of treatment days. Third, the publisher’s audio component of the AARR was spoken at 161 wpm, considerably slower than the students’ reading rates. Fourth, this study explored gains using one text and not the transfer of gains to additional materials. Further research is needed to determine how consideration of these variables might yield different results.

Lastly, although the study’s results may be generalizable beyond the study’s context, replicability and generalizability are important concerns (Strube, 2000; National Academies of Sciences, 2019). Therefore, as this exploration was regionally (i.e., Vietnam) and contextually (university) specific, additional studies need to be undertaken in other international and educational contexts. Similarly, this exploration has been conducted with one type of material (graded readers). As such, explorations with other texts and other genres would also be prudent.