Introduction

It has long been recognised that early childhood literacy success is critical to a person’s health and wellbeing, enabling a realisation of his or her full potential. Moreover, it is necessary for engagement with and contribution to modern society1. Perhaps surprisingly, the evidence-base for improving literacy in general populations has considerable gaps and challenges2, and so-called ‘hard-to-reach’ groups are often excluded3. Support of literacy is essential, and those with reading and learning needs should be identified early for intervention4. As literacy is such a basic foundational skill, delay in redressing these needs can have profound rippling effects over a child’s entire educational enterprise. Indeed, children requiring literacy interventions at primary school are already being ‘left-behind’. Emergent literacy skills (e.g., alphabet knowledge, print concepts, phonological awareness, and early writing) develop through sociocultural experiences from birth and are strong predictors of future reading and writing achievements5,6. Socioeconomic position and family practices also impact upon children’s early language and literacy development; two modes of communication that are deeply intertwined. For instance, in terms of language exposure and rate of expressive language development, socioeconomically advantaged children in a United States of America (USA) study were exposed to and speaking over twice as many words as children from less advantaged families by age 3 years7. Thus, it is critical to detect literacy needs early so that timely remediation can be efficaciously applied.

Pacific people in New Zealand constitute a relatively young, fast growing immigrant ethnic minority, comprising 7% of the population with median age 22.1 years at the 2013 Census, compared to the dominant European population which comprised 74% of the population with median age 41.0 years8. Most Pacific people reside in Auckland, and they are generally less advantaged in income, education, housing, and health compared to their European counterparts9,10. As such, Pacific people are frequently overrepresented in many adverse health and social risk factors, leading to disproportionately higher health burden, premature death and consequent lower life expectancy10. Many of the causes are preventable and linked to social deprivation or poverty. Pacific children, many of whom are bilingual, are often left behind in the New Zealand school system. At primary school, on all available benchmarks, Pacific students achieve, on average, at lower levels in early English literacy and numeracy than all other students11. Here, it should be noted that English literacy has been defined by the dominant New Zealand European population and that this may differ from how literacy is understood in other cultures, including Pacific cultures, which also emphasise forms other than the written symbol, such as the spoken word, metaphors, proverbs, visual arts, and song. Pacific students’ continued educational performance disparity compared to other ethnic groups in New Zealand has been labelled a ‘crisis’12, and is unacceptable to these communities and, indeed, the greater New Zealand society.

Quality early childhood programmes can help narrow the achievement gap between children from low-income families and those from more advantaged families11. Pacific primary school entrants have the lowest prior participation in early childhood education (ECE) services: 86.8% in 2012, compared to 90.9% of Māori entrants and 98% of European entrants13. Māori, New Zealand’s indigenous people, have been affected by colonisation and the ongoing effects of globalisation, and are also overrepresented in many adverse health and social risk factors. Increasing ECE participation for both Pacific and Māori children is a national priority. However, consultation with Pacific peoples suggests that early childhood centres are not, by themselves, a panacea, and could be improved by becoming more culturally inclusive14. Part of this inclusiveness would involve broadening their value base to also include health and wellbeing in defining educational success14.

Despite typically being labelled Pacific or Pasifika by educational and health agencies, the New Zealand Pacific population is heterogeneous, made up of many ethnicities – manifest in differing cultures, languages, generations of immigrants, and the influences of acculturation15. There are over 20 Pacific nationalities resident in New Zealand; Samoans constituted the largest ethnic group (48.7%), followed by Cook Island Māori (20.9%), and Tongans (20.4%); and 62.3% were born in New Zealand16. Inter-ethnic relationships are common, both between Pacific nationalities and between Pacific and non-Pacific groups. Consistent with the recommended method of reporting in New Zealand, multiple ethnic identifications are permitted17. As such, in the 2013 Census, over 20% of all children aged 5–9 years were identified as belonging to two or more ethnic groups18. Therefore, Pacific communities are diverse and their health, education, and social needs are complex. However, this diversity of ethnic and cultural identities is lost under the homogeneity-invoking single Pacific banner, which likely hides pockets of Pacific children who are doing relatively well and others who are doing relatively poorly. This behoves Pacific-specific investigations into factors associated with health and educational success, and the identification of those children who are most at risk and would benefit from timely, efficacious, and culturally appropriate interventions. Understanding these factors is critical for the appropriately targeted identification of, and intervention for, individuals but also, more generally, to inform decision and policy-makers of subpopulation patterns that may most benefit from their attention. An opportunity for such an investigation exists using Statistics New Zealand – Tatauranga Aotearoa’s (SNZ’s) Integrated Data Infrastructure (IDI). The IDI captures national data collected from a range of government agencies, SNZ surveys (including the 2013 Census), and non-government organisations19. 
In particular, it houses data from both the Ministry of Health – Manatū Hauora (MoH) and the Ministry of Education – te Tāhuhu o te Mātauranga (MoE).

Within New Zealand, a nationwide screening programme of children aged 4 years was instigated in September 2008, and focuses on identifying any health, social, behavioural or development issues which could potentially interfere with children’s learning and success at school and to support children’s healthy development20. Known as the B4 School Check (B4SC), it includes measures of hearing and vision, height and weight, and behavioural and developmental questions using the Strengths and Difficulties Questionnaire (SDQ), completed separately by parents and ECE teachers, where applicable, and Parents’ Evaluation of Developmental Status (PEDS) tools21. Uptake is high, with 92% of the nation’s 4-year-old children participating in the 2015/2016 check22. Information collected as part of the B4SC is held by the MoH and available to the IDI. Additionally, the IDI contains important covariates, such as socio-demographic variables, including sex, month and year of birth, and a measure of deprivation.

The MoE has a stated purpose to lift aspirations, and raise educational achievement for every New Zealander23. As part of fulfilling this purpose, it seeks to provide school-based interventions to support children with identified behavioural or learning needs, and records each instance where a literacy intervention is provided to them. Student-level data on intervention type, date and timing, frequency and duration are also made available to the IDI. Linking B4SC data to this literacy intervention information thus enables an assessment of the B4SC’s utility as an early screen for predicting literacy intervention. This can be undertaken for the New Zealand population as a whole, as we have done24, but also for important specific groups, such as Pacific children. Reviews of early childhood development programmes demonstrate that such programmes can have a positive effect on children’s cognitive development and their readiness to succeed in their learning25,26. However, aside from readily identifiable infants and children with frank and apparent neurodevelopmental disorders, the general population screening of young children for future literacy need is recognised as being challenging27,28,29,30, due, in part, to the considerable variability in children’s emergent language and literacy development28. That said, few general population studies have had the suite of health variables available from within the B4SC screen.

When investigating a national sample, all considered socio-demographic and B4SC variables were significantly related to receiving a literacy intervention (p < 0.01)24. Boys were more likely to receive the intervention than girls, as were those living in higher deprivation. Large ethnic variations were observed with Māori children significantly more likely (hazard ratio [HR]: 1.34; 95% confidence interval [CI]: 1.27, 1.40) and Asian children significantly less likely (HR: 0.47; 95% CI: 0.43, 0.51) to receive a literacy intervention than their European counterparts. Children identified as being of Pacific ethnicity (HR: 1.33; 95% CI: 1.26, 1.41) had a similar likelihood of receiving an intervention as Māori children, but those identified with both Pacific and European ethnicities (HR: 1.00; 95% CI: 0.90, 1.10) had a significantly lower likelihood, despite children with any Pacific ethnic identification performing relatively worse on literacy and numeracy benchmarks.

In a large national population study, using comprehensive and psychometrically robust standardized health measures and education indicators, linked on the individual level by the internationally novel IDI, the primary aim is to provide contemporaneous quantitative insights into the relation between Pacific children’s health and development status at age 4 years and their early literacy development. Specifically, the study has two primary aims: (1) to investigate the factors captured within the B4SC health screen of Pacific children aged 4 years and report their association with literacy intervention in early primary-school, and (2) to ascertain whether these factors may be used as an early detection tool of those Pacific children with the greatest literacy need.

Results

Over the study period, 475,965 children with valid dates of birth between 1 January 2005 and 31 December 2011 were captured within the IDI databases. Of these, 71,199 (15.0%) were identified as being of Pacific ethnicity. Level II ethnic identification information was available for 64,695 (90.9%) of these Pacific children. Restricting this IDI dataset to those enrolled with the MoE yielded 59,760 (83.9%) children of Pacific ethnicity, leaving 11,442 (16.1%) unmatched – through unsuccessful MoE linking, home-schooling, or emigration outside of New Zealand. These 59,760 children form the research database.

Demographics

Table 1 presents the demographic distributions for Pacific children enrolled and contained within the MoE database and for those within the greater IDI database. From Table 1 it can be seen that there existed many different ethnic identification make-ups, with 38,118 (53.5%) having Pacific only ethnic identifications and 29,196 (41.0%) having combinations of Pacific, Māori and European identifications; the vast majority lived in urban centres, and nearly 60% lived in the most deprived areas of New Zealand. Among the children grouped under the ‘Other Pacific’ ethnic grouping: 3,909 (37.8%) children had only level I ethnic information available (i.e. categorised broadly as Pacific); 2,499 (24.2%) identified with at least one Pacific ethnicity outside of the Samoan, Tongan and Cook Island Māori groups; 1,002 (9.7%) identified as being Samoan and at least one other Pacific ethnicity apart from Tongan or Cook Island Māori ethnic groups; 1,008 (9.7%) identified with both Samoan and Tongan ethnicities; 825 (8.0%) identified with both Samoan and Cook Island Māori ethnicities; and, 369 (4.6%) identified with both Tongan and Cook Island Māori ethnicities. Similarly, among the children grouped as ‘Pacific and Other’, 1,287 (40.8%) identified with Pacific and Asian ethnicities; 582 (18.5%) identified with Pacific, European and Asian ethnicities; 498 (15.8%) identified with Pacific, Māori, European and Asian ethnicities; and, 282 (8.9%) identified with Pacific, Māori and Asian ethnicities. When comparing the MoE sample with the IDI database, small but significant differences were observed for ethnic identification and deprivation level (both χ2 tests, p < 0.001) but not for sex (Fisher’s exact test, p = 0.40) or domicile area (Fisher’s exact test, p = 0.08). 
Samoan and other Pacific children and those residing in the most deprived areas were less likely to be in the MoE sample than the IDI database, whereas children identified as being of Pacific and European or Pacific, Māori and European ethnicities were more likely to be in the MoE sample.

Table 1 Distribution of selected demographic variables for the full IDI dataset, and those eligible for this study (MoE linked sample).

Multiple Pacific ethnicity identities were common. Within the IDI database, 8,985 (12.6%) children identified with at least two, 927 (1.3%) with at least three, and 24 (0.03%) with at least four Pacific ethnicities. These patterns were similar in the MoE sample.

Literacy intervention

By 31 December 2015, 6,861 (11.5%) children had received at least one literacy intervention. Since birth, the children’s median participation time in the study was 7.1 years (Q1 = 6.0, Q3 = 9.0 years), and the median age at children’s first literacy intervention was 6.3 years (Q1 = 6.1, Q3 = 6.5 years). Figure 1 depicts the Kaplan-Meier curves for the probability of literacy intervention for the demographic variables ethnicity, sex and deprivation. This figure illustrates that children’s first literacy intervention typically occurred between 6 and 7 years of age, and that substantial differences appeared within ethnic, sex and deprivation groups.

Figure 1 Kaplan-Meier curves of probability of having a literacy intervention by ethnicity, sex and deprivation, measured using quintiles of the NZDep2013 score (note: *the vertical axis for ethnicity is different from that used for sex and deprivation).
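As an aid to reading Figure 1, the following is a minimal sketch of the Kaplan-Meier product-limit estimator, assuming each child contributes a follow-up time and an indicator of whether a literacy intervention was observed (1) or the child was censored (0). The function name `kaplan_meier` and the toy values are hypothetical and are not study data; the plotted quantity in Figure 1 corresponds to one minus the estimated survival function.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- follow-up time per child (e.g. age in years)
    events -- 1 if a literacy intervention was observed at that time,
              0 if the child was censored (hypothetical coding)
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional
            # probability of not having the event at time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data, NOT taken from the study.
toy_times = [6.0, 6.2, 6.2, 6.5, 7.0, 7.0, 8.0, 9.0]
toy_events = [1, 1, 0, 1, 0, 0, 1, 0]

curve = kaplan_meier(toy_times, toy_events)
# Probability of having had an intervention by time t is 1 - S(t).
cumulative_incidence = [(t, 1.0 - s) for t, s in curve]
```

Censored children leave the risk set without reducing the survival estimate, which is why the curves can be drawn even though follow-up time differs between birth cohorts.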

Table 2 presents the numbers (%) of children who had at least one literacy intervention by these selected demographics, together with unadjusted and adjusted hazard ratios (HRs) and associated 95% CIs derived from Cox’s proportional hazards model. All of the demographics investigated were significantly related to literacy intervention in both unadjusted and adjusted models (all p < 0.001). In particular, Tongan and Cook Island Māori ethnicity, male sex, residence in a rural area, and increasing deprivation were all associated with an increased likelihood of receiving an intervention compared to their reference counterparts. Based on this model, a Tongan boy living in a rural area within the highest NZDep2013 quintile would have a literacy intervention HR of 2.86 (95% CI: 2.30, 3.57) compared to an urban-living Samoan girl within the least deprived area – assuming their B4SC profiles were identical.

Table 2 Numbers (%) of B4SC cohort who had at least one literacy intervention by selected demographics, together with unadjusted and adjusted hazard ratios (HRs) and associated 95% confidence intervals (CIs).

B4SC measures

Of the 59,760 Pacific children within the MoE research database, 49,512 (82.9%) were indicated as being contacted for a B4SC. Significant ethnic differences were observed in B4SC participation within the sample (χ2 test, p < 0.001), with inclusion rates ranging from 65.8% for Other Pacific and 81.1% for Pacific and Other children through to 89.2% for Pacific and European and 89.6% for Pacific and Māori and European children. B4SC measurement status showed that 30,096 (60.8%) children had their checks completed and their file closed, 10,749 (21.7%) had checks completed but their status remained open, 7,467 (15.1%) were partially completed and were assigned for completion, and 1,203 (2.4%) were returned without completion. The median age at which B4SC measures were undertaken was 4.3 years (Q1 = 4.1, Q3 = 4.6 years). Also included in Table 2 is the empirical distribution of the sample with literacy intervention by the considered B4SC variables, together with their crude and adjusted HR estimates.

In the unadjusted analyses, all considered B4SC variables were significantly related to the likelihood of receiving a literacy intervention (all p < 0.001). Excluding those children with not applicable SDQ–teacher assessments, the overall agreement between SDQ–parent/caregiver and SDQ–teacher categories was slight, with κ = 0.093 (95% CI: 0.087, 0.098) and 12,120 (46.2%) children having concordant outcomes. The primary discordance arose from the high rate of teachers declining to undertake the assessment; see Table 2. When investigating agreement between parent/caregiver and teacher SDQ assessments limited to those with “no action”, “advice given”, “referred”, and “referred – declined” outcomes for both assessments (n = 13,788), agreement was fair, with κ = 0.303 (95% CI: 0.282, 0.323), and concordance was higher (84.7%). Discordance was greatest where the parent/caregiver SDQ assessment indicated that advice should be given whereas the teacher SDQ assessment required no action.
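For readers less familiar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's κ can be computed from a square cross-tabulation of two raters' categories. The function name `cohens_kappa` and the 2 × 2 toy table are hypothetical and are not the study's SDQ cross-tabulation. The descriptors 'slight' and 'fair' used in the text appear consistent with the commonly cited Landis and Koch bands, although that is an inference rather than something the text states.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square contingency table.

    table[i][j] is the number of children placed in category i by
    rater A (e.g. parent/caregiver) and category j by rater B
    (e.g. teacher).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy cross-tabulation, NOT study data.
toy_table = [
    [40, 10],
    [5, 45],
]
toy_kappa = cohens_kappa(toy_table)
```

Because κ discounts the agreement expected by chance, a high raw concordance (such as the 84.7% above) can coexist with only fair κ when one category dominates both raters' marginal distributions.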

In the multivariable model, which included all the demographic and B4SC variables investigated here, all demographic and B4SC variables remained significantly associated with this literacy intervention variable (all p < 0.001, apart from: domicile area p = 0.02 and deprivation p = 0.02), except for BMI (p = 0.91) and SDQ-parent/caregiver (p = 0.71); see Table 2. For hearing, the estimated risk for an intervention was significantly higher among those Pacific children who failed their first screen but passed after rescreening (HR = 1.41) and for those who declined the test (HR = 1.39) compared to those who passed at their first screen. A similar pattern was observed for the vision screen, where the estimated risk for an intervention among those Pacific children who failed their first screen but passed after rescreening (HR = 1.33) and for those who declined the test (HR = 1.38) was significantly higher than for those who passed at their first screen. Compared to children with no detected developmental or behaviour problems, as measured by the PEDS, those who were referred (HR = 1.37) or where advice was given (HR = 1.15) had a significantly greater likelihood of receiving a literacy intervention. The estimated effect size was relatively large for those where a suggested referral was declined by the parents (HR = 1.25) after the PEDS assessment, but the relationship was non-significant (p = 0.38) due to the small numbers in this category. With respect to the SDQ in the multivariable model, the only category significantly associated with an increased risk of literacy intervention was where the teacher declined the test (HR = 1.14) compared to children where no action was indicated. This was despite the SDQ for both parents/caregivers and teachers having relatively large and significant relationships with literacy intervention in the unadjusted analyses. 
Taken together, estimates from this model reveal that those children who pass their vision and hearing tests after rescreening will have a literacy intervention HR of 1.87 (95% CI: 1.66, 2.10) compared to those who pass both their initial tests, and if these children also were referred after the PEDS then their HR would further increase to 2.57 (95% CI: 2.13, 3.10) if this referral was accepted or 2.33 (95% CI: 1.40, 3.88) if it was declined.
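The combined estimates above can be approximately reproduced from the rounded single-factor HRs, assuming the multivariable Cox model contains main effects only (no interaction terms), so that log hazard ratios add and HRs multiply. The sketch below is illustrative; the function name `combined_hr` is hypothetical, and small discrepancies from the reported values are expected because the inputs are rounded to two decimal places.

```python
import math


def combined_hr(*hazard_ratios):
    """Combine hazard ratios under an assumed main-effects-only model.

    In a Cox model without interactions, coefficients (log HRs) add,
    so the combined HR is the product of the individual HRs.
    """
    return math.exp(sum(math.log(hr) for hr in hazard_ratios))


# Rounded adjusted HRs as reported in the text.
hearing_pass_after_rescreen = 1.41
vision_pass_after_rescreen = 1.33
peds_referred = 1.37
peds_referral_declined = 1.25

both_rescreens = combined_hr(hearing_pass_after_rescreen,
                             vision_pass_after_rescreen)
with_referral = combined_hr(both_rescreens, peds_referred)
with_declined_referral = combined_hr(both_rescreens, peds_referral_declined)

print(round(both_rescreens, 2))          # close to the reported 1.87
print(round(with_referral, 2))           # close to the reported 2.57
print(round(with_declined_referral, 2))  # close to the reported 2.33
```

The accompanying confidence intervals cannot be obtained this way, since they also depend on the covariances between the estimated coefficients, which are not reported in the text.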

The predictive power of the multivariable model, which included all demographic and B4SC variables investigated here, fitted on a training dataset and calculated on the remaining partitioned data, yielded a Harrell’s c-statistic of 0.592 (95% CI: 0.583, 0.602). According to the interpretation guidelines of Hosmer and Lemeshow, this represents better than chance prediction but less than the threshold (of 0.7) for a model demonstrating ‘reasonable’ predictive power. When considering the demographic variables alone in a multivariable model, the estimated c-statistic was 0.580 (95% CI: 0.571, 0.590), significantly less than that of the full model (p < 0.001).
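To make the c-statistic concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data, assuming a risk score (for example, the linear predictor from a fitted Cox model) is already available for each child in the held-out partition. The function name `harrell_c` and the toy values are hypothetical and are not study data; pairs with tied follow-up times are simply skipped here, which is a simplification.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's c-statistic for right-censored data.

    A pair is comparable when the child with the shorter follow-up
    time had an observed event. The pair is concordant when that
    child also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored child cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data, NOT taken from the study.
toy_times = [6.0, 6.3, 6.5, 7.0, 8.0]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.5, 0.1]

toy_c = harrell_c(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale against which the reported 0.592 and the 0.7 threshold should be read.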

Discussion

Large ethnic differences existed in the likelihood of receiving literacy interventions in early primary school years, with Tongan (HR: 1.33; 95% CI: 1.23, 1.45) and Cook Island Māori (HR: 1.33; 95% CI: 1.21, 1.47) children more likely to receive an intervention than Samoan children in the adjusted analyses, whereas those children with both Pacific and non-Pacific ethnic identifications had a lower likelihood of receiving a literacy intervention. The mechanism for these differences is unknown, although it likely reflects variations in immigration histories and cultural practices15. Differential bilingualism profiles between Tongan and Samoan children and their households may provide one partial explanation. According to the New Zealand 2013 Census31, 51.1% and 86.3% of Tongan people speak Tongan and English fluently, respectively, whereas 61.8% and 87.4% of Samoan people speak Samoan and English fluently. This implies that significantly more Samoan people are bilingual than their Tongan counterparts. In addition to the emotional and behavioural benefits conferred on children by being bilingual32, there is evidence that their academic developmental trajectories during their early school years are also improved compared to their monolingual peers – although this relationship is complex33. This may also partially explain the decreased likelihood of children with both Pacific and non-Pacific ethnic identifications receiving an intervention, as these children are more likely to be multilingual. In the same Census31, 93.6% of Cook Island Māori people reported speaking fluent English. In addition to English, Cook Islands Māori is an official language of the Cook Islands but has no official status in New Zealand, and data are not systematically collected. Additionally, Tongan and Cook Island Māori mothers in New Zealand are more likely than Samoan mothers to have both relatively low Pacific and European cultural orientations15. 
Retaining strong Pacific cultural links has been shown to have positive health benefits in New Zealand15, and it could be opined that such strong links would also lead to better educational outcomes for children from outside the prevailing European culture.

In addition to ethnicity, large and significant differences were observed for many of the considered demographic variables. Males, those living in rural areas, and those domiciled in areas of increasing deprivation were all more likely to require and have experience with a literacy intervention, consistent with reports elsewhere within New Zealand and overseas24,34. While the existence of a persistent sex differential in literacy achievement has been universally reported34,35, including among Pacific children living within New Zealand24,36, its causes, consequences, and potential solutions remain contested in both the public and professional discourse34. However, three key focus areas have been identified in the literature that help explain gender gaps in literacy attainment: factors in school; in the home and community; and, in peer culture35. The challenge remains how to address these factors in children’s early years. The findings in relation to deprivation were unsurprising5,11, but the extent and impact of this inequality were revealing. Nearly 60% of Pacific children were found to be living in the most deprived areas of New Zealand and approximately 20% lived in the next most deprived quintile. National NZDep2013 figures for children aged 0–5 years at the 2013 Census revealed that 18.2%, 18.5%, 19.0%, 19.7%, and 24.6% lived in Q1 (least deprived) through to Q5 (most deprived) areas, respectively37. Those children growing up in the most deprived areas are more likely to be living in households with low income, material hardship and household crowding, resulting in reduced opportunities for health and educational provision and successes38,39.

In terms of the B4SC measures, the hearing, vision, and developmental or behaviour problems measures were predictably and importantly related to literacy intervention likelihood, but BMI and SDQ were not (except when teachers declined the SDQ test). While significant in the unadjusted analysis, the effect associated with BMI was explained by other variables, as has been observed elsewhere40. The relationship between childhood obesity and academic achievement remains contested41. The SDQ, a multi-informant instrument, is one of the most widely used screening measures of emotional and behavioural health problems in children and youth, but questions have been raised about its applicability within community samples42. Certainly, its utility in predicting literacy intervention among Pacific children was found wanting here. In terms of the significant variables, Pacific children with hearing and vision rescreens and referrals, and those who were referred from the PEDS, were at significantly increased risk of a literacy intervention. Indeed, children who passed their vision and hearing tests after rescreening and were referred after the PEDS had an estimated HR for a literacy intervention of 2.57 (95% CI: 2.13, 3.10) if this referral was accepted or 2.33 (95% CI: 1.40, 3.88) if it was declined, compared to children who passed their initial vision and hearing tests and had no action required after their PEDS assessment.

The increased likelihood of a literacy intervention for children who initially failed a hearing or vision screen but passed a subsequent rescreen is novel and warrants further investigation. This result may have been influenced by a range of factors, including that some children may have been false negatives on these hearing or vision tests; that is, children who passed the rescreen when they did in fact have an impairment. False negative results are of concern, as children who pass the screen are not routinely followed-up and are likely to be at greater risk of falling behind in literacy and language. Few population studies report false negative rate information, and none were found for school entry hearing screen tests – although in localised hearing studies of children over disparate age ranges, false negative rates varied from 0.2% to 20%43. False negative rates and their impact are difficult to quantify. If a child passed the rescreen and was subsequently found to have an impairment, there can never be any certainty that the impairment was present on the day the child was screened. It could have been acquired or have progressed beyond an identifiable screen threshold level following the original screen. Alternatively, the impairment may be transient and, by its very nature, may not have been present at the time of rescreening. The two-staged approach to the B4SC vision and hearing tests is designed to reduce the likelihood of false negative results, but it cannot eliminate them – especially when undertaken on a national scale.

Another noteworthy feature of these analyses was the large number of declined hearing and vision assessments, together with assessments not done (missing), and their association with increased literacy intervention likelihood. Access, cultural acceptability of services, racism, language - including barriers for interpreter service utilisation, and the use of alternative health care providers are likely to be just some of the myriad factors that explain this finding44. The implication is that Pacific people are significantly less likely to ever attend primary health care than the dominant European or indigenous Māori people, and those who do access care generally attend at significantly lower comparative rates44. Moreover, this is likely to contribute to relatively poor follow-up rates. Findings from the Pacific Islands Families Study, a birth cohort of 1,398 Pacific children born in 2000 in South Auckland, New Zealand45,46, revealed that many Pacific children aged 2 years had middle ear disease. Of the 656 children who had confirmatory testing, 25.4% had otitis media with effusion (OME) in at least one ear and 7.5% had other otological abnormalities47. However, only 53% of children with possible OME or other abnormalities were in a position to attend an otolaryngological follow-up, despite it being offered locally and free of charge.

While PEDS is one commonly employed tool that examines parental concern regarding children’s cognition, communication, motor skills, and readiness for school48, studies that have examined the utility of the PEDS in predicting later literacy or academic success are relatively scarce and have produced mixed findings49,50. It is of note here that it was the Pacific children who were referred who were at significantly higher risk of a literacy intervention. This may be because the children failed to follow up with the appropriate professional, because the professional failed to (or could only partially) attend to the child’s problems as they pertained to literacy achievement, or because individual or sociocultural circumstances were such that the referred children were more likely to remain disadvantaged. Further research is needed to elucidate the mechanism(s) of this finding.

In terms of whether the B4SC factors may be used as an early detection tool of those Pacific children with the greatest literacy need, despite the constellation of significant risk factors, the multivariable model had only modest predictive abilities. The estimated c-statistic of 0.592 (95% CI: 0.583, 0.602) fell short of the threshold for a model to be considered as having reasonable predictive power51. Although terms such as risk factors and predictors are often conflated and used interchangeably within the epidemiological literature, they represent different concepts, and even strong associations do not necessarily indicate importantly increased predictive ability52. A more nuanced successful predictive model will likely need to be composed of a much broader suite of individual, family, and community-based variables, including those which elicit questions and capture tasks which are more explicitly focused on language and literacy development and need. For instance, a study that examined the role that bilingualism played in children’s academic developmental trajectories during their early school years found that school-level factors explained about one third of the reductions in the differences33. This also emphasises the need for, and importance of, longitudinal studies in providing information around the context and development of children that routinely collected IDI data, however sophisticated, are unlikely to provide45,46. Lastly, it may also be the case that predicting early literacy need with high sensitivity and specificity from general population screening programmes is simply not possible27,28,29,30 due, in part, to the inherent variability in children’s emergent language and literacy development28.

While the study has salient strengths, it also suffers from some weaknesses. The principal strength is its investigation into a large recent sample of Pacific children - a population rarely studied. It also brings together variables not explored before on a population level, using linked inter-agency national datasets. However, arguably the greatest potential weakness of this study is the measure of literacy used herein. The literacy intervention variable was constructed by bringing together ‘Reading recovery’ and ‘Specialist teaching’ indications, and thus its psychometric properties are unknown. Reading recovery is given to children with the lowest reading levels at each school. Thus, children do not need to meet a particular level or difficulty threshold with literacy to receive the intervention but are instead identified as struggling in comparison to their peers. This means that the provision of literacy services is not equally distributed across the population, and even when offered, the standard for whether a child receives services is unlikely to be uniform between schools. Given that Pacific children are more likely to attend lower decile schools, what may look like a favourable outcome at age 6 years (i.e., no literacy intervention) may not be that favourable in terms of literacy achievement and may, indeed, represent a hidden unmet need. As such, literacy intervention need is likely to be underestimated for Pacific children.

A further important limitation concerns the coverage and uptake rates of the B4SC over the study period. Although the B4SC was introduced in 2008, uptake rates in Auckland, where almost two-thirds (65.9%) of those who identified with at least one Pacific ethnicity lived8, were initially relatively low, increasing from approximately 35% in 2010/2011 to approximately 95% in 2015/201622,24. If Auckland children have or are subjected to a different pattern of behaviour, then this may introduce non-sampling biases into the findings. However, analysis of information on B4SC attendance by area level deprivation between 2010/2011 and 2015/2016 revealed that uptake rates were similar for those living in deprived and non-deprived areas, hence the sample does not appear to be selective by deprivation22. In the national sample, 84.5% of children had their checks completed and their file closed24, compared to the 60.8% of Pacific children observed here. While this rate has increased from the reported 38.8% of Pacific children with checks completed and file closed born between July 2005 and January 200753, many Pacific children are paradoxically missing or having incomplete assessments from this national programme designed to reduce (rather than exacerbate) inequalities. Moreover, as observed here, there were significant differences in B4SC uptake between Pacific ethnicities. An additional important consideration was the 16.1% of Pacific children within the IDI dataset who did not appear enrolled within the MoE database due to delayed school commencement, home-schooling, emigration outside of New Zealand, or unsuccessful linking. Relatively small but significant differences were observed between those within and outside the MoE databases which may affect the external validity of the study findings. Further, it must be noted that the analyses hinge on routinely collected data that are entered by multiple individuals in various locations over time. As such, operationalised variations in exposure and classification definitions potentially exist or evolve that may differentially compromise the internal validity of any findings.

The study raises a number of possible policy and practice outcomes. It is generally cited that a key policy lever is lowering cost barriers to make primary health care more accessible for all people54. But to increase Pacific engagement this, by itself, is not sufficient. To address the issues associated with the cultural acceptability of services, institutional racism, language barriers, and socioeconomic deprivation, there is a need for partnerships in the planning and delivery of services; for recognition of the shared value of services delivered in churches and in locations where Pacific people feel comfortable, at times which allow flexibility; and for access to Pacific health and education professionals and community workers55. There is also the need for structural change in the provision of health and educational services if they are to be more responsive to the needs of Pacific people. Developing a culturally responsive service may involve working in a different way that takes much greater account of the values, beliefs and practices of Pacific people.

In conclusion, Pacific children in New Zealand are ethnically and culturally diverse. Large ethnic differences existed in the likelihood of receiving literacy interventions between these Pacific children. Equity demands that we address the relatively poor literacy achievement and needs of Pacific children. However, it is important that researchers and policy-makers consider the question of “which children,” as the literacy achievement gap is far from uniform across the sociodemographic and health indicator profile of Pacific children considered here. Once these children are identified, efficacious culturally-nuanced strategies might be implemented to help ameliorate this disparity. Predictively, the utilised demographic and B4SC variables failed to produce a model of reasonable power, and this calls for continued research. But unless we are content to continue leaving Pacific children behind in New Zealand, do we have enough evidence to begin answering the “which children” question?

Methods

Study design

Time-to-event analysis of a continuously recruited national cohort.

Participants

Participants were children identified as being of at least one Pacific ethnicity, born between 1 January 2005 and 31 December 2011, inclusive, who were in the MoH (B4SC) and/or MoE databases and captured within the IDI. Ethnicity was based on parental/caregiver report and allowed multiple identities, as is the prevailing method of reporting in New Zealand17.

Outcome measure: literacy intervention

New Zealand children typically start formal schooling around their 5th birthday, although it is only at their 6th birthday that they are legally obligated to commence. After approximately one school year of classroom instruction, children are assessed on a range of literacy and language measures. Those falling behind their peers in early reading development are identified for reading intervention. Data were gleaned from the MoE reports contained within the IDI, which capture each instance where a child has received a school-based intervention. Literacy intervention was assessed here via: ‘Reading recovery’, which is an intensive reading programme targeting children aged 6 years, typically having completed their first year of primary schooling; and ‘Specialist teaching’, whereby specialist resource teachers work with students who are struggling to meet national literacy standards for their age. Although nationally administered, ‘Reading recovery’ is potentially accessible to only around 72% of all children aged 6 years, as some schools choose not to implement it36. A binary ‘literacy intervention’ variable was created, which indicated whether either the ‘Reading recovery’ or ‘Specialist teaching’ intervention was initiated, together with the child’s age at which this first occurred. Children without either intervention by the study end-date were treated as censored. A study end-date of 31 December 2015 was applied to ensure record completeness within the IDI.
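The construction of this outcome can be illustrated with a minimal sketch. It is not the study's code (the analyses were run in Stata); the function names, argument names, and the use of whole months as the time scale are assumptions made here for illustration only.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2015, 12, 31)  # study end-date stated in the text


def age_in_months(birth: date, at: date) -> int:
    """Whole months between two dates, using month/year only (day ignored)."""
    return (at.year - birth.year) * 12 + (at.month - birth.month)


def literacy_outcome(
    birth: date,
    reading_recovery_start: Optional[date],
    specialist_teaching_start: Optional[date],
    study_end: date = STUDY_END,
) -> Tuple[int, int]:
    """Return (event, age_months) for one child (hypothetical helper).

    event = 1 if either intervention began on or before the study end-date,
    with age taken at the FIRST such intervention; otherwise the child is
    right-censored (event = 0) at their age on the study end-date.
    """
    starts = [
        d
        for d in (reading_recovery_start, specialist_teaching_start)
        if d is not None and d <= study_end
    ]
    if starts:
        return 1, age_in_months(birth, min(starts))
    return 0, age_in_months(birth, study_end)
```

A child with both interventions contributes only the earlier one, and an intervention recorded after the end-date is treated the same as no intervention.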

Ethnicity, demographic and B4SC explanatory variables

Within New Zealand, ethnicity is classified in level hierarchies, with level I identifications grouped as: European; Māori; Pacific; Asian; Middle Eastern/Latin American/African; and, Other Ethnicity. Parents/caregivers of Pacific children can further identify their child (level II information) as belonging to one or more of the following Pacific ethnic groups: Samoan; Cook Islands Māori; Tongan; Niuean; Tokelauan; Fijian; other Pacific peoples; and, Pacific peoples not defined. Children’s ethnicity was ascribed from SNZ Census, Department of Internal Affairs - Te Tari Taiwhenua (DIA; Births, Deaths and Marriages Register), and MoH sources using a priority approach. If ethnic identifications were level I or missing from the SNZ Census data file, then the DIA values were used if they contained level II information; and if ethnic identifications were level I or missing from both the SNZ Census and DIA files, then the MoH information was utilised if it contained level II information. Due to the known demographic profile16, an a priori decision was made to combine the level II ethnic groups of Niuean; Tokelauan; Fijian; other Pacific peoples; and, Pacific peoples not defined into an ‘Other Pacific’ group.
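The priority approach described above can be sketched as follows. The source keys ("census", "dia", "moh"), the label strings, and the fallback used when no source holds level II information are all assumptions for illustration; the text does not specify how such cases were handled.

```python
from typing import Dict, Optional, Set

# Level II Pacific groups listed in the text (label strings are illustrative).
LEVEL_II_PACIFIC = {
    "Samoan", "Cook Islands Māori", "Tongan", "Niuean", "Tokelauan",
    "Fijian", "Other Pacific peoples", "Pacific peoples not defined",
}
# Groups combined a priori into 'Other Pacific'.
COLLAPSE_TO_OTHER = {
    "Niuean", "Tokelauan", "Fijian",
    "Other Pacific peoples", "Pacific peoples not defined",
}
SOURCE_PRIORITY = ("census", "dia", "moh")  # Census first, then DIA, then MoH


def collapse(idents: Set[str]) -> Set[str]:
    """Recode the smaller level II groups to 'Other Pacific'."""
    return {"Other Pacific" if i in COLLAPSE_TO_OTHER else i for i in idents}


def ascribe_ethnicity(records: Dict[str, Optional[Set[str]]]) -> Optional[Set[str]]:
    """Pick the first source, in priority order, holding level II information."""
    for source in SOURCE_PRIORITY:
        idents = records.get(source)
        if idents and idents & LEVEL_II_PACIFIC:
            return collapse(idents)
    # Assumption (not stated in the text): if no source has level II detail,
    # fall back to the first non-missing source in the same priority order.
    for source in SOURCE_PRIORITY:
        idents = records.get(source)
        if idents:
            return set(idents)
    return None
```

For example, a child recorded only as "Pacific" in the Census but as "Tongan" in the DIA file would be ascribed {"Tongan"} under this sketch.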

As multiple ethnicities can be indicated, New Zealand’s ethnic categories are not mutually exclusive, making direct comparison of individual ethnic groups difficult. In an effort to improve interpretability, mutually exclusive major ethnic group combinations were developed, based on the population priorities within New Zealand (Māori being the indigenous people of New Zealand; and Europeans representing the dominant population group) and on their empirical distributions. As such, eight mutually exclusive categories were defined: Samoan; Cook Islands Māori; Tongan; Other Pacific (which also included those of two or more Pacific ethnicities); Pacific and Māori; Pacific and European; Pacific and Māori and European; and, Pacific and Other. The last group includes those children identified as being of Asian, Middle Eastern, Latin American, African, or other ethnicity.
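One way to read the eight-category scheme is sketched below. The text does not state how a child identified as, say, Pacific, Māori and Asian was classified; this sketch assumes that Māori and European identifications take priority over other non-Pacific identifications, in line with the stated population priorities. The function name and label strings are hypothetical.

```python
from typing import Set

# Pacific labels after the 'Other Pacific' recode (illustrative strings).
PACIFIC_GROUPS = {"Samoan", "Cook Islands Māori", "Tongan", "Other Pacific"}


def exclusive_group(idents: Set[str]) -> str:
    """Map a set of ethnic identifications to one mutually exclusive category."""
    pacific = idents & PACIFIC_GROUPS
    if not pacific:
        raise ValueError("at least one Pacific identification is required")

    maori = "Māori" in idents
    european = "European" in idents
    # Anything that is neither Pacific, Māori nor European (e.g. Asian).
    other = bool(idents - PACIFIC_GROUPS - {"Māori", "European"})

    # Assumption: Māori/European combinations outrank 'Pacific and Other'.
    if maori and european:
        return "Pacific and Māori and European"
    if maori:
        return "Pacific and Māori"
    if european:
        return "Pacific and European"
    if other:
        return "Pacific and Other"
    # Sole Pacific identification keeps its own label; two or more Pacific
    # ethnicities fall into 'Other Pacific', as stated in the text.
    if len(pacific) == 1:
        return next(iter(pacific))
    return "Other Pacific"
```

Every child with at least one Pacific identification lands in exactly one category, which is what allows the groups to be compared directly in the hazard models.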

Sex was categorised as female (girls) and male (boys). Age (in months) was calculated from month/year of birth. Area level deprivation was measured using the New Zealand Index of Deprivation 2013 (NZDep2013)37 for the recorded residential address of the child at their B4SC. It is based on the deprivation characteristics of “meshblocks” (small areas with a typical population of 60–110 people), and combines 2013 Census data relating to income, home ownership, employment, qualifications, family structure, housing, access to transport and communications into a single measure. Each meshblock is assigned a score from 1 (least deprived) to 10 (most deprived), with 10% of all meshblocks being in each category. While deprivation is categorised in deciles, quintiles were employed here. Urban/rural residency was derived from the New Zealand standard classification which uses a 5-point scale, namely: (i) main urban (centred on a city or major urban area with population of 30,000+ people); (ii) secondary urban (centred on larger regional centres with population of 10,000–29,999 people); (iii) minor urban (centred around smaller towns with population of 1,000–9,999 people); (iv) rural centre (with population of 300–999 people); and (v) other rural (inlets, islands, inland waters, and oceanic waters)56. These were collapsed into two groups: urban, combining (i)–(iii); and, rural, combining (iv)–(v).
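The two recodes described above are simple enough to sketch directly. The text does not state how deciles were grouped into quintiles; pairing adjacent deciles (1–2, 3–4, and so on) is assumed here, and the label strings and function names are illustrative.

```python
# Collapse of the 5-point urban/rural classification into two groups,
# following the (i)-(iii) urban and (iv)-(v) rural split given in the text.
URBAN_RURAL = {
    "main urban": "urban",
    "secondary urban": "urban",
    "minor urban": "urban",
    "rural centre": "rural",
    "other rural": "rural",
}


def nzdep_quintile(decile: int) -> int:
    """Map an NZDep2013 decile (1 = least, 10 = most deprived) to a quintile.

    Assumption: adjacent deciles are paired, so deciles 1-2 -> quintile 1,
    3-4 -> quintile 2, ..., 9-10 -> quintile 5.
    """
    if not 1 <= decile <= 10:
        raise ValueError("NZDep decile must be between 1 and 10")
    return (decile + 1) // 2


def urban_rural(category: str) -> str:
    """Return 'urban' or 'rural' for one of the five standard categories."""
    return URBAN_RURAL[category.strip().lower()]
```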

The B4SC includes measures of hearing and vision, height and weight, and behavioural and developmental questions using the SDQ and PEDS tools. Detailed information on the measurement and reporting of these variables can be found elsewhere21,24, and the variables are summarised in Table 3.

Table 3 B4SC variables utilised and their definition.

Procedure

Presentation of methods and reporting of findings were informed by the STROBE guidelines (www.strobe-statement.org). A thorough description of the procedures for the implementation of both the B4SC and literacy intervention has also been provided elsewhere21,24. In brief, B4SCs are conducted by registered nurses and hearing and vision technicians. If concerns are identified, the child and their parents or caregivers are offered information and support, including clinical pathways and referral processes to health, education or social support services. B4SCs are undertaken in various locations, depending on the needs of the community. After informed parental/caregiver consent has been received, a check usually takes 45–60 minutes to complete21. Held by the MoH, the B4SC National Information System stores data relating to the child, permission, assessments and checks, and any issues identified and referrals made. This system provides non-identifiable information for monitoring the performance of the B4SC programme, for tracking the population health status of 4-year-olds, and for approved research studies21. The MoE routinely collects intervention data on all children enrolled within its service, such as intervention type, date and timing, frequency and duration. As part of the IDI scope and mandate, these data are ingested into the infrastructure. These apposite data from the MoH and MoE databases were linked through the IDI platform, together with augmented data from SNZ (Census 2013) and DIA. Conducted by SNZ, the IDI uses both deterministic and probabilistic techniques to link individuals across datafiles using a unique identifier that is derived from an individual’s name, date of birth, sex, and home address22. Missing information for a variable in one database may be replenished by non-missing data for a duplicate variable from another source. If conflicting information existed in duplicate fields (such as ethnic identification), then Census data were preferred, followed by DIA and then MoH data. All analyses were performed in a secured environment, according to strict protocols, and findings were released after SNZ IDI approval19.

Statistical analysis

Analyses were performed and graphs produced using Stata MP version 14.0 (StataCorp, College Station, TX, USA), and α = 0.05 defined statistical significance. Initially, demographic distributions for selected variables were derived and compared between the full IDI database and those within the MoE sample. The empirical overall distribution of children’s first literacy intervention, their median age and associated quartiles, together with the distributions by selected demographics and the Kaplan-Meier curves by ethnicity, sex and deprivation were presented. Unadjusted and adjusted Cox proportional hazard models using a robust estimator of variance predicting the likelihood of children’s first literacy intervention were then implemented on these selected demographic variables and the considered B4SC variables, with HRs and associated 95% CIs reported (employing the ‘stcox’ command in Stata). Note, because of children’s different follow-up times and the right censoring of many participants, considerable care needs to be exercised in interpreting and comparing the percentages (%) of children who were observed to have a literacy intervention and the associated estimated risks (derived from the Kaplan-Meier curves and Cox proportional hazard models). Due to the almost complete concordance of ‘declined test’ and ‘not done (missing)’ values between hearing and vision tests, and for ‘not done (missing)’ values of the PEDS, SDQ-parent or caregiver, and SDQ-teacher variables, only one apposite indicator variable was used in each variable grouping for the adjusted analyses. For instance, in reporting the adjusted HR and 95% CI for hearing, the ‘declined test’ and ‘not done (missing)’ indicator variables were included for hearing but omitted for vision, whereas in reporting the adjusted HR and 95% CI for vision, the ‘declined test’ and ‘not done (missing)’ indicator variables for vision were employed but those associated with hearing were omitted. The ‘lincom’ command in Stata was used to derive the point estimate, standard error, significance, and 95% CI for linear combinations of risk factors. To assess the contribution of the B4SC variables in increasing the model’s predictive power over and above the adjusted model containing only the selected demographic variables, Harrell’s c-statistic was employed57. The c-statistic gives the probability that a randomly selected participant who experienced an event (e.g. the literacy intervention) had a higher risk score than a participant who had not experienced the event. A value of 0.5 indicates that the model is no better than chance at making a prediction and a value of 1.0 indicates that the model perfectly identifies those within a group and those not. Models are typically considered51 reasonable when the c-statistic is higher than 0.7 and strong when it exceeds 0.8. Following the method advocated by Newson, the dataset was randomly partitioned into two; the first used as a training dataset to fit the model, and the second used as a test dataset to make prediction assessments57. Agreement between the SDQ-parent/caregiver and SDQ-teacher scores was assessed using the κ statistic. Using Landis and Koch’s characterisation, κ > 0.75 was taken to represent strong agreement, 0.40 ≤ κ ≤ 0.75 was taken to represent moderate agreement, and κ < 0.40 was taken to represent poor agreement58.
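To make the c-statistic concrete, the following is a minimal sketch of Harrell's concordance for right-censored data. It is not the Stata routine used in the study, and it makes two simplifying assumptions that are labelled in the code: pairs with tied event times are skipped, and tied risk scores count as half-concordant. The function name and arguments are hypothetical.

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's c-statistic for right-censored time-to-event data.

    A pair (i, j) is comparable when child i had the event and did so
    strictly before child j's observed time (j may be censored or may have
    had a later event). The pair is concordant when i, who had the earlier
    event, also has the higher predicted risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored child cannot anchor a comparable pair
        for j in range(n):
            # Simplification: pairs with tied times are not counted.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With risk scores from a model fitted on a training partition and evaluated on a held-out partition, a value near 0.5 indicates ranking no better than chance, which is the scale against which the reported estimate of 0.592 should be read.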

Approvals and ethics

The study proposal and protocols were approved by SNZ (MAA2017-15) and by the University of Otago Human Ethics Committee (D17/024). Based on New Zealand’s Health and Disability Ethics Committees’ checklist, the study did not meet the threshold required for formal ethics review. All methods were carried out in accordance with relevant guidelines and regulations, and the reported results include only aggregated, de-identified data that have been randomly rounded to base 3.
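Random rounding to base 3 can be sketched as below. The rule assumed here is the commonly described one: counts that are already multiples of 3 are left unchanged, and any other count is rounded to the nearer multiple of 3 with probability 2/3 and to the farther adjacent multiple with probability 1/3. This is an illustration under that assumption, not SNZ's implementation, and the function name is hypothetical.

```python
import random


def random_round_base3(count: int, rng: random.Random) -> int:
    """Randomly round a non-negative count to an adjacent multiple of 3."""
    if count < 0:
        raise ValueError("count must be non-negative")
    remainder = count % 3
    if remainder == 0:
        return count  # already a multiple of 3: left unchanged
    lower = count - remainder
    upper = lower + 3
    # remainder 1 -> lower multiple is nearer; remainder 2 -> upper is nearer
    nearer, farther = (lower, upper) if remainder == 1 else (upper, lower)
    # Assumed probabilities: 2/3 to the nearer multiple, 1/3 to the farther.
    return nearer if rng.random() < 2 / 3 else farther
```

Under these probabilities the expected value of a rounded count equals the original count, so aggregate totals remain unbiased while individual small cells are protected.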

Data availability

The datasets used for statistical analysis are held within the IDI. Application to use these data must be made through SNZ.

Disclaimer

The results in this paper are not official statistics. They have been created for research purposes from the IDI, managed by SNZ. The opinions, findings, recommendations, and conclusions expressed in this paper are those of the authors, not SNZ. Access to the anonymised data used in this study was provided by SNZ under the security and confidentiality provisions of the Statistics Act 1975. Only people authorised by the Statistics Act 1975 are allowed to see data about a particular person, household, business, or organisation, and the results in this paper have been confidentialised to protect these groups from identification and to keep their data safe. Careful consideration has been given to the privacy, security, and confidentiality issues associated with using administrative and survey data in the IDI. Further detail can be found in the privacy impact assessment for the IDI available from www.stats.govt.nz.