Mapping home internet activity during COVID-19 lockdown to identify occupation related inequalities

During the COVID-19 pandemic, evidence has accumulated that movement restrictions enacted to combat virus spread produce disparate consequences along socioeconomic lines. We investigate the hypothesis that people engaged in financially secure employment are better able to adhere to mobility restrictions, due to occupational factors that link the capacity for flexible work arrangements to income security. We use high-resolution spatial data on household internet traffic as a surrogate for adaptation to home-based work, together with the geographical clustering of occupation types, to investigate the relationship between occupational factors and increased internet traffic during work hours under lockdown in two Australian cities. By testing our hypothesis against the observed trends, and exploring demographic factors associated with divergences from our hypothesis, we arrive at a picture of unequal impact dominated by two major influences: the types of occupations in which people are engaged, and the composition of households and families. During lockdown, increased internet traffic was correlated with income security and, when school activity was conducted remotely, with the proportion of families with children. Our findings suggest that response planning and provision of social and economic support for residents within lockdown areas should explicitly account for income security and household structure. Overall, the results we present contribute to the emerging picture of the impacts of COVID-19 on human behaviour, and will help policy makers to understand the balance between public health and social impact in making decisions about mitigation policies.

www.nature.com/scientificreports/
factory-based occupations, hospitality, and essential services are unable to work from home [4][5][6][7]. The confluence of occupational and financial constraints places many such workers at a greater risk of exposure to infectious disease, either through the occupational hazard of close social interactions, or because without adequate leave or income entitlements, they have a limited ability to remain at home when unwell 6,8,9. This conflict between household economic needs and public health orders to stay at home is problematic for the success of such mitigation strategies. COVID-19 has driven a general economic contraction brought about through concerted reductions in consumer, recreational, and occupational activity [10][11][12]. Impact disparities can be explained (at least partially) by examining the distribution of employment types within and between subpopulations. Those who can perform their work requirements at home from a computer with internet access have experienced less-severe economic impact [13][14][15][16][17][18]. Because they involve written communication and replace face-to-face interactions with online video conferencing, work-related tasks are likely to result in measurable increases to home internet traffic. Population-level data on home internet usage may therefore provide a useful complement to the widely available mobility data typically used to monitor and model the real-time effects of COVID-19 and the restrictions associated with mitigation measures 16,17,[19][20][21][22][23][24][25][26][27]. While mobility data can tell us who is staying home and where people are going when they leave the home, internet volume data provides a unique perspective on what is happening within households, particularly in relation to adapting work arrangements to COVID-19 lockdown requirements.
Here, we demonstrate how relationships between occupational factors and home internet traffic can provide insight into social and economic disparities that are amplified by the pandemic and associated mitigation strategies. Our results support the hypothesis that occupational factors link the ability to work from home with income security, and clearly show how this link produces strong positive correlations between income security and increased home internet activity during COVID-19 restrictions. Our results in the Australian context may help explain observations from other recent studies describing the connection between income, internet, and the ability to self-isolate during COVID-19 17,21 . Overall, the results we present contribute to the emerging picture of the impacts of COVID-19 on human behaviour, and will help policy makers to understand the balance between public health and social impact in making future decisions. Furthermore, results such as those presented in this work will contribute to the ability to produce precise, integrated models of epidemic dynamics connected to social and economic phenomena.

Methods
Data sources. Our study examines the Greater Metropolitan Areas of Melbourne and Sydney, Australia.
To subdivide these regions, the geographic analysis unit adopted was the ASGS Statistical Area Level 2 (SA2). SA2 regions are defined by the Australian Bureau of Statistics (ABS) and typically contain between 1000 and 10,000 residents, representing neighbourhoods that socially and economically interact 28. Aggregating to this geographic unit, we used the following data sets to create our measures of occupational factors, income security and internet usage:
• Detailed population surveys quantifying the distributions of occupation types within local regions, and the characteristics of different occupation classifications [29][30][31][32].
• A population-scale data set describing home internet use patterns aggregated on the scale of SA2. This data was obtained from nbn co ltd. (nbn), a Government Business Enterprise providing national wholesale broadband access in Australia.
• Recent survey data collected in September, 2020 by the COVID-19 Attitudes Resilience and Epidemiology (CARE) study to substantiate our individual-level interpretation of observed population-scale trends 33.
All survey data used in this study was collected with the informed consent of all participants. The CARE study was approved by the University of Melbourne Human Research Ethics Committee (2056694). Ethics approval applied to all study sites. All methods were carried out in accordance with relevant guidelines and regulations.

Computation of income security by occupation.
To quantify the salient features of the complex distributions of employment characteristics in Sydney and Melbourne, we constructed an income security index using data on employment security and income characteristics linked to the Australian and New Zealand Standard Classification of Occupations (ANZSCO). For income, we used average weekly earnings by occupation from the ABS Census of 2016 34 . To calculate the employment security associated with an occupation, we used the most recent iteration (2018) of the nationally-representative Household, Income and Labour Dynamics in Australia (HILDA) survey. For each occupation, we computed income security as the product of the proportion of securely employed HILDA respondents, and the average weekly wage reported by the ABS (with income rescaled to the sample maximum). This gives a value between 0 and 1, with zero corresponding to occupations with no securely employed individuals and values of 1 corresponding to occupations with maximal remuneration as well as 100% securely employed respondents. The distributions of these values and the component measures of income and proportion securely employed are shown in the Supplementary Fig. S3.
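The index described above can be illustrated with a minimal sketch. The occupation names and all numbers below are hypothetical placeholders, not values from HILDA or the ABS Census, and the function name `income_security_index` is introduced here only for illustration.

```python
def income_security_index(occupations):
    """Compute an income security value in [0, 1] for each occupation.

    `occupations` maps an occupation name to a tuple of
    (securely employed respondents, total respondents, average weekly earnings).
    """
    # Rescale income to the sample maximum, as described in the text.
    max_earnings = max(earnings for _, _, earnings in occupations.values())
    index = {}
    for name, (n_secure, n_total, earnings) in occupations.items():
        secure_share = n_secure / n_total          # proportion securely employed
        rescaled_income = earnings / max_earnings  # value between 0 and 1
        index[name] = secure_share * rescaled_income
    return index


# Hypothetical example data (not taken from the study).
example = {
    "manager": (90, 100, 2000.0),
    "clerk": (80, 100, 1200.0),
    "waiter": (20, 100, 800.0),
}
scores = income_security_index(example)
```

An occupation reaches the maximum value of 1 only if it has both the highest average earnings in the sample and 100% securely employed respondents.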
Occupation security status was developed as an individual categorical variable with two levels: [0 (secure employment)] or [1 (insecure employment)]. An individual was classified as 'secure' if they had a fixed-term or permanent job. We computed the employment security score associated with each occupation as the proportion of respondents in each occupation who were securely employed. We then computed the index of income security by occupation as the product of the employment security score and average weekly earnings (rescaled to the sample maximum). Distributions of the resulting income security values and the component measures of income and proportion securely employed are shown in Supplementary Fig. S3. Note that those on a fixed-term contract were classified as securely employed because the work conditions associated with fixed-term employment are more similar to the conditions of permanent employment than they are to casual work conditions. Those on fixed-term contracts have also been shown to be more sociodemographically similar to those employed permanently than to those employed on a casual basis 35,36.

Computation of ability to work from home by occupation.
To compute the working from home indicator for each occupation classification, we adapted an analysis done by Dingel and Neiman 37 (available on GitHub, https://github.com/jdingel/DingelNeiman-workathome) to establish which occupations could potentially be performed from home. Dingel and Neiman used the 'Work Context' and 'Generalized Work Activities' occupational surveys from the O*NET® Database. Drawing on a series of questions, they classified occupations according to whether they were compatible with working from home.
For example, occupations were considered to be unsuitable for working from home if they involved activities that required a workplace, such as operation of machinery or handling of specialised items (for more information on the methodology, please refer to Dingel and Neiman 37).
To produce international estimates we linked the binary work-from-home classifications (N = 969) produced by Dingel

Analysis of internet usage data from nbn.
nbn provided access to aggregated Australian internet usage volume data from household customers. The data provided by nbn consist of upload and download volume (in bytes) by individual households over 30 min intervals, spatially aggregated into SA2 regions. Different types of internet usage (such as streaming movies, videoconferencing and online gaming) are associated with different patterns of upload and download volume. The high time resolution and structure of the data allowed us to approximately differentiate between background (latent) internet activity, and active internet use. The data were restricted to the total download and upload volume for 30 min intervals, per SA2 region, and the corresponding number of active internet connections per time slot that generated these data. Only data generated by at least 50 domestic connections were provided, to avoid privacy concerns and to reduce the impact of aberrant individual household behaviours in regions with insufficient service coverage. Data points beyond three standard deviations from the corresponding time period mean were removed, to limit the impact of outlier usage values, which are usually caused by network management or internal infrastructure configuration changes. We set the data collection interval of October 10th, 2019 until November 29th, 2019 as the baseline period, when life in Australia was not impacted by school holidays or major, longer public holidays, and preceding the major disturbance produced by the bushfire season of summer 2019-2020. The period representing behaviour during the first wave of COVID-19 restrictions was set to the interval of April 18th-April 24th, 2020, while the period representing the second wave of restrictions was set to the interval of August 8th-August 14th, 2020.
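The preprocessing steps described for the nbn data can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator was used), and the 50-connection threshold was applied by the data provider before the data were released.

```python
from statistics import mean, pstdev

MIN_CONNECTIONS = 50  # minimum number of domestic connections stated in the text


def volume_per_connection(volume_bytes, active_connections):
    """Return bytes per active connection, or None below the connection threshold."""
    if active_connections < MIN_CONNECTIONS:
        return None
    return volume_bytes / active_connections


def drop_outliers(values, n_sd=3.0):
    """Remove points more than `n_sd` standard deviations from the period mean."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= n_sd * sd]


# Hypothetical 30 min volumes for one region and time slot, with one aberrant spike.
volumes = [9.0, 10.0, 11.0, 10.0] * 5 + [500.0]
cleaned = drop_outliers(volumes)
```

Note that with very few observations no point can lie three standard deviations from the mean, so a filter of this kind only has an effect on reasonably long series.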
The following upload and download volume characteristics per SA2 region were computed for these three periods (baseline and the two COVID-19 intervals): (1) an overall daytime average volume; (2) the daily average minimum volume relating to the minimum internet usage, between 4:30 a.m. and 5:30 a.m.; (3) the average volume generated during the daytime period (from school start until noon, 9:00 a.m.

Censoring of outlier data for correlation analysis.
All correlation coefficients were computed after censoring data points for which either variable was greater than three standard deviations from the sample mean. The full data set made available with this article contains all values including outlier data. Outlier data is not included in the scatter plots shown in Figs. 5 and 6.

Analysis of data from the CARE survey.
Our analysis used data from the CARE study's Victoria-wide survey, which aimed to address the overall question: How were Victorians thinking, feeling and behaving in response to the 'second wave' of the COVID-19 epidemic and the associated public health measures? The survey was self-administered online in English to 1006 Victorian residents aged 18 years and over. The survey was based on research developed and conducted by Imperial College in the UK in mid-March 2020 33,38. Some questions in the Australian survey were modified slightly to reflect local response measures and terminology. Additional questions were added to the Australian survey to measure social and emotional impacts. Data collection in both the UK and Australia was conducted by the online market research agency YouGov. The CARE study used a structured questionnaire addressing the following three domains: perceptions of risk and consequences of COVID-19 infection; measures taken by individuals to protect themselves and others from COVID-19 infection; and social and emotional impact.
The questionnaire was administered online to members of the YouGov Australia panel of individuals who have agreed to take part in surveys of public opinion (over 120,000 Australian adults). Panellists, selected at random from the base sample, received an email inviting them to take part in a survey, which included a survey link. Once a panel member clicked on the link and logged in, they were directed to the survey most relevant to them available on the platform at the time, according to the sample definition and quotas based on census data. A plain language statement appeared on screen and respondents were required to electronically consent prior to the survey questions appearing. Proportional quota sampling was used to ensure that respondents were demographically representative of the Victorian adult population, with quotas based on age, gender, household income, location (state and metropolitan or regional) and whether a language other than English is spoken at home.
The study was approved by the University of Melbourne Human Research Ethics Committee (2056694). Ethics approval applied to all study sites.

Results
Our analysis focused on the urban areas of Sydney and Melbourne during a pre-COVID period (which we use as a baseline), during the first pandemic wave in March and April 2020 (with Australia-wide transmission and mobility restrictions), and during the second wave from July 2020 (with substantial transmission and mobility restrictions in Melbourne but not in Sydney). We identify positive correlations between income security and changes to internet activity during COVID-19. These correlations are consistent with the hypothesis that higher income security is associated with more people working from home during lockdown. This hypothesis is further supported by individual-level data from the CARE survey. We observe that in Sydney this trend persists after the release of lockdown restrictions, indicating the possibility of a 'new normal' of remote working conditions, particularly for occupations associated with higher income security. In Melbourne, we find that the role of children conducting their studies online disrupts these correlations due to an inverse relationship between income security and the proportion of families with children.
Employment, income security, and the ability to work from home. Income security is distributed spatially according to distinct patterns, with high values in the central and northeast suburbs of both Sydney and Melbourne (Fig. 2a,b). The upper 50% income security quantile (Fig. 2c) favours managerial and office-based occupations, while the lower 50% quantile (Fig. 2d) contains more service staff and other socially-oriented occupations. The frequency distributions of average income security among SA2s in Sydney and Melbourne (respectively) are provided in the Supplementary Fig. S4, which demonstrates that the distributions in the two regions are not significantly different (two sample t-test, p = 0.552). High resolution choropleth maps of income security by region can be found in the Supplementary Fig. S6a.
To examine the qualitative association between income security and the ability to work from home indicated by the distributions in Fig. 2c,d, we apply the occupation classification method developed by Dingel and Neiman 37. This results in a binary (0 or 1) value indicating whether or not a particular occupation type can be performed from home. We found a strong association between income security and the ability to work from home (Fig. 3). This association was observed both by occupation (Fig. 3a) and by geographic region (Fig. 3b). See the Supplementary Fig. S5 for histograms of the distributions shown in Fig. 3a as well as the distributions of the x- and y-axis variables used in Fig. 3b.

Changes to internet traffic during COVID-19.
To quantify changes to home internet use during COVID-19 restrictions, we aggregated internet activity data from all SA2 regions within Greater Sydney and Greater Melbourne (respectively). Over the pre-COVID baseline, we averaged the per-user upload and download rates from the hours of 9 a.m. to 12 p.m. in order to capture a baseline measurement of putative remote work-related internet activity (see Methods). During the first and second waves of COVID-19 in Australia, peaks in case incidence coincided with the implementation of the most restrictive policies, and were followed by increases in total internet use, which peaked approximately 1 to 3 weeks after implementation of the tightest level of restrictions (Fig. 4).
After identifying time intervals representative of the changes induced by the first and second waves of restrictions, we examined spatial variation among individual SA2 regions during those periods. The grey bands in Fig. 4 show the periods over which nbn data was averaged for each individual SA2 in order to examine the spatial distribution of changes to internet activity during the first and second waves of COVID-19. For visualisation of spatial trends, high-resolution choropleth maps of internet activity changes relative to baseline can be found in the Supplementary Fig. S6b,c.
We found that during the first period of restrictions, areas with higher income security tended to exhibit larger increases in internet volume per household (Fig. 5a-c). However, these trends were produced by qualitatively different changes for downloads and uploads, respectively:
• During the pre-COVID baseline period, absolute download volume tended to decrease with income security, becoming uncorrelated during the first wave of restrictions (Fig. 5a). This transition produces larger increases in download volume in areas with higher average income security (Fig. 5b,c).
• On the other hand, absolute upload volume shows baseline rates that are initially uncorrelated with income security and transition to an increasing trend during the first wave of restrictions (Fig. 5d). This produces changes in upload volume that have similar correlations with income security to those observed for downloads (Fig. 5e,f), but that occur due to the emergence of a positive correlation rather than the removal of a negative correlation with the onset of restrictions.
). This suggests that children engaged in online activity may establish the negative baseline correlation between download rates and income security (Fig. 5a). Children's activities also appear to influence the changes observed during lockdown. During the time interval selected to represent the first wave of restrictions (April 18th to April 24th), school holidays were still in effect in Greater Sydney while in Melbourne, children had returned to their studies remotely. Because regions with higher income security tend to have a lower proportion of families with children, remote learning activity weakens the positive association between upload activity and income security produced by adults working from home.
Correlations between income security, the proportion of families with children, and internet activity in Sydney and Melbourne (respectively) during the first wave of COVID-19 restrictions are shown in the Supplementary Tables S2 and S3.
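As a rough illustration of how a correlation of this kind might be computed, the sketch below derives the change in volume relative to baseline for each region, censors points beyond three standard deviations in either variable (as described in the Methods), and computes a correlation coefficient. Several choices here are assumptions: the change is expressed as a fractional difference, the coefficient is Pearson's, and all function names and data values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def relative_change(baseline, period):
    """Fractional change in volume relative to baseline (assumed definition)."""
    return (period - baseline) / baseline


def censor_pairs(xs, ys, n_sd=3.0):
    """Keep only pairs where both variables lie within n_sd SDs of their means."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return [
        (x, y)
        for x, y in zip(xs, ys)
        if abs(x - mx) <= n_sd * sx and abs(y - my) <= n_sd * sy
    ]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-region values (not data from the study).
income_security = [0.2, 0.3, 0.4, 0.5, 0.6]
baseline_volume = [100.0, 100.0, 100.0, 100.0, 100.0]
lockdown_volume = [112.0, 118.0, 133.0, 139.0, 151.0]

changes = [relative_change(b, p) for b, p in zip(baseline_volume, lockdown_volume)]
pairs = censor_pairs(income_security, changes)
r = pearson(pairs)
```

Working with changes relative to each region's own baseline, rather than absolute volumes, is what allows regions with very different typical usage to be compared on one scale.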
For the second wave of COVID-19 (and associated restrictions), we selected the appropriate time period using internet data from Victoria, where the second epidemic wave was concentrated. In Victoria during the second wave, internet activity peaked during the week of August 8th to August 14th. As for the first wave, this home internet activity peak immediately followed the implementation of the highest level of restrictions (Fig. 4b).
Because of the substantially different epidemiological and policy situations in Sydney (New South Wales) and Melbourne (Victoria) during the second-wave period (Fig. 4), we examined the relationship between internet traffic, lockdown policy, and income security for each city separately. Comparing the two cities provides insight regarding changes in behaviour related to the contrasting scenarios. During the second wave, Greater Sydney experienced a series of localised outbreaks with minimal social restrictions, while in Melbourne there was a large-scale epidemic with mandatory movement restrictions.
While household internet traffic declines in Sydney during the second wave relative to the first wave, the positive correlation between income security and internet activity relative to baseline remains prominent for both downloads (Fig. 6a,c), and uploads (Fig. 6d,f). This is despite the absence of formal stay-at-home orders in the Greater Sydney region at that time (though some restrictions on social gatherings remained in place). The time interval between the first and second waves was long enough to support the assertion that behavioural changes made in response to COVID-19 lockdown policies remain observable after those policies have been formally relaxed. Greater Melbourne behaves similarly in both waves with respect to changes in download traffic as a function of income security (compare Figs. 5a,c, 6a,b). However, changes to upload volumes do not mirror the correlations observed during the first wave (compare Figs. 5d,f, 6d,e). In fact, there are many areas of Melbourne with high income security that show substantial reductions in upload traffic during the second wave, relative to the first. While our data gives no immediate explanation for this counter-intuitive trend, we speculate that it may be due to alterations in work habits that occurred as the lockdown became protracted. Decreases in upload traffic without corresponding decreases in download traffic could result from individuals continuing to perform work activities from home, but participating in less "face-to-face" online interaction. Conversely, widespread adoption of remote schooling practices could help explain the increase in upload rates for regions in the mid-range of the income security spectrum. Such an effect is consistent with the weak but positive correlations between the proportion of families with children and changes to upload volumes (ρ = 0.15, 95% CI [0.033, 0.25]; see Supplementary Table S5).
This suggestion is also consistent with the observation that daytime internet activity increases during school holidays, when children are more likely to be in the home (Fig. 1b).
We hypothesise that schooling in the home had a greater impact on internet volume in general, and upload rates in particular, than working remotely from home during the second wave of COVID-19 restrictions in Melbourne. This hypothesis is supported by a preliminary principal component analysis, summarised in the Supplementary Fig. S2. This 3-component PCA shows an increased role of children in determining upload rates in Greater Melbourne during the second-wave period. Specifically, Supplementary Fig. S2 demonstrates a qualitative change in the relationship between the proportion of families with children, income security, and second-wave changes to upload activity. Upload rate is positively associated with the proportion of families with children in   Fig. 3b).

Change of work environment for above- and below-median income earners.
To confirm that the household-level trends inferred from SA2-level aggregate variables corresponded to observations made on the individual level, we analysed representative data from Victoria collected by the CARE study. While the CARE study did not collect data on income security per se, it did record the annual income bracket reported by each respondent.
One of the survey questions was posed as follows: 'Have you personally experienced a change in work environment (working from home) because of COVID-19 and the measures to prevent its spread? (yes or no)'. We computed the proportion of respondents who reported income above and below (or within) the median income bracket for the sample (sample median annual income was $AUD 60,000 to 69,999) who answered 'yes' to this question. We then performed a two-tailed Fisher's exact test to determine the resulting odds ratio between the two groups, and its statistical significance given the response numbers (see Table 1). The results demonstrate a strong positive relationship between income and switching to work from home, with an odds ratio of 2.15 (95% CI [1.59, 2.92], p = 6.8 × 10⁻⁷) computed for the above-median income group, relative to the median-and-below income group. While the income data tabulated by the CARE survey is not an exact representation of the income security score used in our analysis of internet trends (which incorporated contract classification), this result supports the same conclusion: those with higher financial security have more capacity to change their work environments in response to COVID-19 restrictions.
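The test described above can be sketched for a generic 2 × 2 table using only the Python standard library. The counts below are hypothetical and are not the values in Table 1. The function name is introduced here for illustration, and the odds ratio returned is the simple sample odds ratio (ad/bc), which is an assumption about the estimator; some statistical packages instead report a conditional maximum-likelihood estimate.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the probabilities
    of all tables with the same margins that are no more likely than the
    observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k 'yes' responses in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_value = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    odds_ratio = (a * d) / (b * c)  # undefined if b or c is zero
    return odds_ratio, min(p_value, 1.0)


# Hypothetical counts: rows are above-median and median-or-below income,
# columns are 'yes' and 'no' to the change-of-work-environment question.
odds_ratio, p_value = fisher_exact_two_sided(30, 10, 15, 25)
```

An odds ratio above 1 indicates that the first row (here, the above-median group) has higher odds of answering 'yes' than the second.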
In summary, our results support the hypothesis that occupational factors link the ability to work from home with income security, and clearly show how this link produces strong positive correlations between income security and increases to home internet activity during COVID-19 restrictions. These correlations are consistent with the assertion that higher income security is associated with more people working from home during lockdown. This assertion is further supported by individual-level data from the CARE survey. We observe that in Sydney this trend persists after the release of lockdown restrictions, indicating the possibility of a 'new normal' of remote working conditions, particularly for occupations associated with higher income security. In Melbourne, we find that the role of children conducting their studies online disrupts these correlations due to an inverse relationship between income security and the proportion of families with children.

Discussion
By combining an analysis of occupational factors and distributions with large-scale, high-resolution, real-time data on internet activity, we have broadly characterised the impact of COVID-19 restrictions on two major urban centres in Australia, demonstrating three main findings. First, that occupations associated with greater income security are also associated with the ability to work from home. Second, that Internet usage increased during periods in which COVID-19 restrictions were in place. Increases were greatest in regions with high income security, suggesting that they may be caused by people who were able to adapt to working from home. Finally, that during the second wave in Melbourne, lower-income regions also displayed increased internet usage, likely driven by increased levels of remote schooling.
These findings confirm and elaborate on the general observation that COVID-19 and the associated restrictions on human activity distort normal life activities, with the relative impact largely determined by occupational and demographic factors [39][40][41][42]. This unequal impact is dominated by two major influences: the types of occupations in which people are engaged, and the compositions of households and families. Our analysis helps to illustrate how life changed during lockdown. For households with no children, and members engaged in work that could be conducted from home, the internet provided a means of continuing livelihood during the prolonged periods of mobility restrictions implemented to combat virus transmission. Furthermore, the members of such households were likely to have been previously employed in high-income occupations with relatively strong employment guarantees, adding a measure of confidence in the ability to financially out-last the economic downturn. On the other hand, households with lower income security were also less likely to have been able to work from home, and were more likely to have had children who required care during school closures. For such households, life during lockdown was financially insecure, potentially stressful on the family, and came with a heightened risk of exposure to the pandemic virus due to the work requirements of those occupations which remained active.

Study limitations.
Due to the nature of the data we analysed, our study has several limitations. With the exception of the CARE survey results, all of the data analysed in this work is aggregated to sub-populations. Therefore, a direct behavioural interpretation of the correlations we report is contingent on the assumption that the variables we investigate are independently distributed within these sub-populations. While there are likely to be exceptions, the spatial aggregation of areas by income security (Fig. 2a,b) suggests that the spatial resolution of SA2 regions is sufficient to sample within the boundaries that define salient heterogeneity of the population for the purposes of our study. To confirm this quantitatively, the spatial autocorrelation (Moran's I) between neighbouring SA2 regions is shown in the Supplementary Sect. S6. This analysis shows highly significant spatial autocorrelation of all relevant variables (income security, and changes to internet usage volumes). From this we conclude that the SA2 scale is an appropriate resolution for the spatially varying quantities studied in this work. Another inherent limitation is introduced through the use of household internet data in quantifying behaviour across the income spectrum: home internet connections have financial requirements including usage fees and installation costs that may be prohibitive for those at the low end of the income spectrum. For example, a study in the United States recently determined that household internet speeds typically increase with income, and the combination of both high income and high-speed internet is associated with an enhanced ability to self-isolate during the pandemic 21. Our analysis of relative changes in data volume per active connection addresses potential inequity in the spatial distribution of broadband infrastructure as well as potential correlation of available bandwidth and income security.
However, it does not address the possibility that lower-income households use access plans with smaller usage caps. Such a trend could complicate the interpretation of changes to internet volume with respect to working from home during lockdown. The analysis presented here assumes that larger increases to internet usage are indicative of a higher proportion of resident individuals switching to home-based work. An alternative interpretation of the observed trends could be that the proportion of individuals working from home does not depend on income security, but that households with higher income security have access plans with higher usage caps. This alternative interpretation requires data caps to regularly limit internet usage. While such a scenario is possible, we believe it to be less plausible than the one we chose to assume. In addition, the individual-level CARE survey analysis supports our chosen interpretation.
A related limitation is that households with no internet connection were implicitly excluded from our analysis. Using Australian data from the 2016 ABS Census, we computed a correlation of ρ = 0.50 (95% CI [0.44, 0.56]), associating the fraction of households with an internet connection (as of 2016) with the income security measure computed here (see Supplementary Table S1). Therefore, our use of home internet traffic data to estimate the ability of workers with differing income security to adapt to COVID-19 restrictions may omit the behaviour of many low-income households, producing an underestimate of the effects of occupational factors on behaviour during the crisis.
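For readers unfamiliar with the spatial autocorrelation statistic (Moran's I) referred to in the limitations above, a minimal sketch follows. It assumes binary contiguity weights (1 for neighbouring regions, 0 otherwise), which may differ from the weighting used in the Supplementary analysis, and it omits the significance test. The function name, region values and neighbour lists are hypothetical.

```python
def morans_i(values, neighbours):
    """Moran's I with binary weights.

    `values` is a list of one number per region; `neighbours` maps each
    region index to the list of indices of its neighbouring regions.
    """
    n = len(values)
    mu = sum(values) / n
    dev = [v - mu for v in values]
    # Sum of all weights: one per directed neighbour link.
    weight_sum = sum(len(js) for js in neighbours.values())
    # Cross-products of deviations over neighbouring pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


# Four hypothetical regions arranged in a line: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Smoothly varying values give positive autocorrelation.
smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)

# Alternating values give negative autocorrelation.
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)
```

Positive values indicate that neighbouring regions tend to have similar values, which is the pattern the text reports for income security and for changes to internet usage volumes.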