Exploring the use of mobile phone data for national migration statistics

Statistics on internal migration are important for keeping estimates of subnational population numbers up-to-date, as well as for urban planning, infrastructure development, and impact assessment, among other applications. However, migration flow statistics typically remain constrained by the logistics of infrequent censuses or surveys. The penetration rate of mobile phones is now high across the globe, with rapid recent increases in ownership in low-income countries. Analyzing the changing spatiotemporal distribution of mobile phone users through anonymized call detail records (CDRs) offers the possibility to measure migration at multiple temporal and spatial scales. Based on a dataset of 72 billion anonymized CDRs in Namibia from October 2010 to April 2014, we explore how internal migration estimates can be derived and modeled from CDRs at subnational and annual scales, and how the precision and accuracy of these estimates compare to census-derived migration statistics. We also demonstrate the use of CDRs to assess how migration patterns change over time, at a finer temporal resolution than censuses allow. Moreover, we show how gravity-type spatial interaction models built using CDRs can accurately capture migration flows. The results highlight that estimates of migration flows made using mobile phone data are a promising avenue for complementing more traditional national migration statistics and obtaining more timely and local data.


Introduction
Human populations are highly mobile in the modern world, and migration is one of the main factors that determine changes in population size, distribution and structure (Abel and Sander 2014; Agliari et al., 2018). As migration impacts the demographic and socio-economic aspects of a country, it has become one of the most challenging issues confronting policymakers around the world (International Organization for Migration 2017a, c). Understanding internal migration, which is normally substantially larger in volume than international migration, and how it changes over time is critical for keeping subnational population numbers up-to-date (Frayne 2005; Pendleton et al., 2014; Wardrop et al., 2018). Contemporary data on internal migration flows are valuable for urban planning, resource allocation, infrastructure development, public service provision and impact assessments. For instance, identifying where people migrate internally is often vital in development work, as migrants might be marginalized and at higher risk due to a lack of resources to meet demands (Lu et al., 2012; Lu et al., 2016; Ruktanonchai et al., 2016a). However, our knowledge of contemporary internal migration patterns remains poor for many countries (Garcia et al., 2015; Sorichetta et al., 2016; International Organization for Migration 2017b), and is difficult to update between data collections for the majority of countries around the world.
Data collected from traditional sources, such as national population and housing censuses and household surveys, are the primary source for migration statistics (International Organization for Migration 2018). Within population and housing censuses, migration is typically measured through a change in residence over a one- or five-year period prior to the census. The increasing use of global positioning systems (GPS) has supported the collection of more spatially precise data, but each census only provides a single snapshot of migration flows, commonly once every decade, and migration patterns typically change over time between censuses or surveys (Namibia Statistics Agency 2013; Wesolowski et al., 2013). Moreover, surveys only sample a small proportion of the population, and the logistical challenge of censuses makes them an infrequent and expensive source of demographic data (Wardrop et al., 2018).
Moreover, as migration is anticipated to continue to rise, both in volume and reach, the need to update demographic statistics in a timely manner and to inform migration policy development increases, a need that traditional sources are typically not well-equipped to meet (International Organization for Migration 2018). To predict contemporary migration for many countries, interest in the modelling of migration flows has grown, leading to the development of advanced modelling methodologies to estimate migration rates (Courgeau 1995; Henry et al., 2003; Cohen et al., 2008; Abel 2013; Abel and Sander 2014; Garcia et al., 2015; Sorichetta et al., 2016; Vobruba et al., 2016). However, regardless of how sophisticated these methods are, their estimates remain largely constrained by the lack of contemporary input data and often by their coarse spatiotemporal resolution (Garcia et al., 2015; Sorichetta et al., 2016).
Call detail records (CDRs) routinely collected by mobile phone operators for billing purposes are particularly promising for analysing migration-related phenomena and are a potential solution to existing data gaps (International Organization for Migration 2018). CDRs contain an entry for each call or text (or other billable event) made or received by any anonymous user, together with the date and time of each communication and an identifier for the tower that the communication was routed through within the operator's network (Ruktanonchai et al., 2016b; Zu Erbach-Schoenberg et al., 2016). The tower-level location of each communication can then be identified, and from the movement of individual mobile users between communications, spatially and temporally explicit estimates of human mobility can be derived from anonymised CDRs. These data have been increasingly used for quantifying short-term human mobility, mapping dynamically changing population densities, estimating infectious disease spread risk, and measuring population displacements due to disasters and conflicts (Lu et al., 2012; Wesolowski et al., 2012; Deville et al., 2014; Tatem et al., 2014; Wesolowski et al., 2014a; Wesolowski et al., 2015a; Wesolowski et al., 2015c; Lu et al., 2016; Ruktanonchai et al., 2016b; Zu Erbach-Schoenberg et al., 2016; Wesolowski et al., 2017). Moreover, previous work on defining overall and seasonal patterns of population movement using CDRs suggested they could also be used to model internal migration (Blumenstock 2012; Wesolowski et al., 2013; Ruktanonchai et al., 2016a; Wesolowski et al., 2017).
In previous studies, however, CDRs frequently spanned periods much shorter than one year. Where multi-year mobility analyses using CDRs have been presented, none has compared individual places of usual residence across different years to estimate migration flows in a way that matches the definition of migration used in censuses (Blumenstock 2012; Zu Erbach-Schoenberg et al., 2016; Wesolowski et al., 2017).
Based on a multiannual CDR dataset in Namibia, for the first time, we assess how CDRs as a novel data source might be used efficiently and accurately to replicate the internal migration statistics produced in a census, and examine how CDRs could improve the estimates made using classical gravity models. This study also reveals otherwise unmeasurable year-by-year migration patterns to assess the potential of CDRs for updating internal migration statistics.

Datasets
Census Migration Statistics. The most recent census in Namibia was conducted in 2011, and we obtained the internal migration statistics between regions from a census- . MTC is the leading network operator in Namibia, with a 76% market share and network coverage for 95% of the population (Mobile Telecommunications 2018). The CDR dataset obtained from MTC included the time and routing tower for each call and text and a random, uniquely hashed number for each user. The approximate location of a user was defined by the location of the routing mobile phone tower for each communication. The data were spatially aggregated to regional level to match the census migration data and to further reduce the sensitivities of using individual-level data. We estimated a user's place of residence for a given period as the region where the user was observed most frequently during the period of interest. As data from very infrequent mobile phone users or from seasonal movements (e.g. short-term holiday travel) might introduce noise when defining places of residence, we only included users who were active on more than 30 days in each year (12 months), as defined below.
To match as closely as possible the time frame used in the census, we compared residences between Years 1 and 2, and between Years 2 and 3, respectively. If mobile users changed residence between the two years, they were identified as migrants, otherwise as non-migrants. Additionally, we assessed the potential impact of data filtering and of different time lengths on defining residences (Supplementary Information [SI] text).
Model Covariates. For estimating migration by models for the 2011 period, we also collated potential migration-related demographic, socioeconomic, geographic and environmental variables, as described in previous studies (Garcia et al., 2015; Sorichetta et al., 2016), including population by region in 2010 and 2011 (Namibia Statistics Agency 2013); the proportions of population living in urban areas, male population, population aged 15-59, educated population, labour force participation, and marital status among the population aged 15 years and above; administrative unit boundaries to define the distance and contiguity between regions and their area (Zhao et al., 2012); and the average annual precipitation by region. The collation of covariates is detailed in the SI Text.

Models and analysis
We fit three types of models to census data to explore whether CDR-derived migration data can accurately replicate traditional census-derived migration statistics (Table S1): 1) CDR-based linear models (CDRLMs), using CDR-derived migrating user data alone or combined with the covariates used in gravity models; 2) gravity-type spatial interaction models (GTSIMs), which have been applied extensively to estimate migration flows based on a range of migration-related push-pull factors, including populations and distance between origin and destination (Zipf 1946; Hua and Porell 1979; Garcia et al., 2015; Wesolowski et al., 2015b; Ruktanonchai et al., 2016a; Sorichetta et al., 2016; Vobruba et al., 2016); and 3) GTSIMs extended using CDR data (hereafter called CGTSIMs).
CDR-based Linear Models. Initially, we used Pearson correlation coefficients to assess the relationship between CDR and census data. To investigate how well the CDRs can replicate the census migration numbers, we built four sub-models of CDRLMs using CDR-derived migrating user numbers as the independent variable, alone or combined with other covariates:
$$MIG_{i,j} = \beta_0 + \beta_1 CDR_{i,j} + X\vec{\beta}$$
where the dependent variable, $MIG_{i,j}$, comprises the observed migration flows between regions in Namibia from the census. $CDR_{i,j}$ is the number of CDR-derived migrations from origin $i$ to destination $j$, with the coefficient $\beta_1$ and the constant $\beta_0$.
The suite of models was built by successively adding the same covariates that were used in GTSIMs, represented by the matrix $X$ and its vector of coefficients $\vec{\beta}$.
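As an illustration of the simplest CDRLM (CDR-derived flows as the only predictor), the following minimal sketch fits an ordinary least-squares line and computes the Pearson correlation in plain Python. The function names and the toy numbers are hypothetical and are not taken from the study, whose analyses were run in R.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x with a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Toy data (made up): CDR-derived migrating users and census migrants per flow.
cdr_flows = [10, 20, 30, 40]
census_flows = [7, 12, 17, 22]
b0, b1 = fit_line(cdr_flows, census_flows)
r = pearson_r(cdr_flows, census_flows)
```

Adding further covariates would turn this into a multiple regression, which needs a matrix solver rather than the closed-form slope used here.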

Gravity-Type Spatial Interaction Models.
In the simplest form of gravity models (Zipf 1946), the flow of migration between regions is proportional to their total populations and inversely proportional to the distance between them:
$$MIG_{i,j} = \frac{P_i^{\beta_1} P_j^{\beta_2}}{d_{i,j}^{\beta_3}}$$
where $P_i$ and $P_j$ refer to populations at an origin $i$ and a destination $j$ in 2010, respectively; $d_{i,j}$ represents the distance between $i$ and $j$; and the exponents, $\beta_1$, $\beta_2$, and $\beta_3$, indicate the magnitude of the effect of each variable.
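To make the behaviour of the simplest gravity form concrete, here is a small sketch that evaluates it for given exponents. The function name, the default exponent values and the proportionality constant `k` are assumptions made for illustration; the text only states proportionality and does not give fitted values here.

```python
import math

def gravity_flow(pop_origin, pop_dest, distance, b1=1.0, b2=1.0, b3=2.0, k=1.0):
    """Gravity-type flow: k * P_i^b1 * P_j^b2 / d_ij^b3.

    k and the default exponents are illustrative assumptions, not fitted values.
    """
    return k * pop_origin ** b1 * pop_dest ** b2 / distance ** b3

# With b3 = 2, doubling the distance divides the predicted flow by four.
near = gravity_flow(100, 200, 10)
far = gravity_flow(100, 200, 20)
```

Taking logarithms of both sides turns the exponents into regression coefficients, which is why such models are usually fitted in a (generalized) linear framework.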
As a range of potential push-pull factors, e.g. urbanization and natural disasters, could affect human migration, the models can be further extended to reach more accurate estimates, as described in previous studies (Garcia et al., 2015). However, given that the number of regions in Namibia is small (13 regions), and to prevent overfitting, we only tested models that replaced the total population variables with the percentage of population living in urban areas and the precipitation in origin and destination, respectively (SI text). Although both logistic and Poisson regressions have been widely used in gravity models to predict migration flows, the outputs from logistic regression should be identical to estimates of Poisson regression with an offset variable of non-migrating populations (Garcia et al., 2015; Ruktanonchai et al., 2016a). Therefore, we only fit GTSIMs using the logistic regression function here:
$$\mathrm{logit}\left(\frac{MIG_{i,j}}{P_i^{total}}\right) = \beta_0 + \beta_1 \ln(Push_i) + \beta_2 \ln(Pull_j) + \beta_3 \ln(d_{i,j})$$
where $P_i^{total}$ represents the total population residing in an origin in 2010, and where $Push_i$ and $Pull_j$ refer to the push factor at origin and the pull factor at destination, respectively (Table S1). Moreover, CGTSIMs with additional CDR variables were tested to assess how well the CDR-derived migration data could improve the performance of gravity models.
Model Comparisons. For each model fitted to the census statistics, we used a leave-one-out cross-validation approach (Hastie et al., 2009) to split the dataset and calculate the goodness-of-fit indicators, including root-mean-square error (RMSE), R-squared (R²) and Akaike Information Criterion (AIC). The model with the lowest RMSE was selected as the best model of each model family. The estimates of migration between regions were then calculated using the optimal model, and the inflow, outflow and netflow for each region in Namibia were also aggregated.
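The leave-one-out procedure can be sketched as follows for a single-predictor linear model: each observation is held out in turn, the model is refitted on the rest, and the held-out prediction errors are pooled into one RMSE. This is a minimal illustration with hypothetical function names and made-up data, not the authors' R code.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def loocv_rmse(x, y):
    """Leave-one-out cross-validated RMSE of the single-predictor model."""
    squared_errors = []
    for k in range(len(x)):
        # Refit without observation k, then predict it.
        x_train, y_train = x[:k] + x[k + 1:], y[:k] + y[k + 1:]
        b0, b1 = fit_line(x_train, y_train)
        squared_errors.append((y[k] - (b0 + b1 * x[k])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy flows (made up): a perfectly linear case and a noisy case.
rmse_exact = loocv_rmse([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
rmse_noisy = loocv_rmse([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

Candidate models within a family would then be ranked by this cross-validated RMSE, with the lowest value indicating the preferred model.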
As our models used non-spatial regression approaches, and spatial autocorrelation may exist in migration data (Tobler 1970; Getis 2008; Sorichetta et al., 2016), a shuffle test was used to assess whether any spatial dependencies significantly affected the performance of our models. First, we randomly permuted the census-derived migration data across all regions. Each model was then fitted to each shuffled dependent variable to calculate the RMSE, and 1000 iterations produced a distribution of RMSE values. If the "real" RMSE of a model fitted with the "ground truth" migration data was less than all 1000 simulated RMSE values from the shuffled data, we assumed that spatial dependencies were not significant in our models. All analyses were done within the R statistical environment (version 3.5.2).
To account for potential mobile phone ownership biases across regions, the models mentioned above were also tested using CDR data adjusted by two approaches, respectively: 1) using the proportion of mobile phone ownership to inversely weight CDR-derived migration data by region; and 2) adding the proportion of ownership as an additional variable in the models. Therefore, we present the following results without the Zambezi region, and relevant comparable analyses for all regions are provided in the SI.
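The shuffle test described in this section can be sketched as below: the dependent variable is permuted many times, the model is refitted each time, and the RMSE from the real data is compared with the resulting distribution. The sketch assumes a single-predictor linear model and in-sample RMSE for brevity; the function names, the seed and the toy data are hypothetical.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def fit_rmse(x, y):
    """In-sample RMSE of the fitted line (a simplification for this sketch)."""
    b0, b1 = fit_line(x, y)
    return math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x))

def shuffle_test(x, y, n_iter=1000, seed=0):
    """Compare the real RMSE with RMSEs obtained after permuting y."""
    rng = random.Random(seed)
    real = fit_rmse(x, y)
    permuted = list(y)
    shuffled = []
    for _ in range(n_iter):
        rng.shuffle(permuted)  # in-place random permutation of the flows
        shuffled.append(fit_rmse(x, permuted))
    return real, shuffled, all(real < s for s in shuffled)

# Toy data (made up): flows that follow the predictor exactly.
x = list(range(1, 13))
y = [3 * v + 1 for v in x]
real_rmse, shuffled_rmses, passed = shuffle_test(x, y, n_iter=200, seed=1)
```

If the real RMSE sits below every shuffled RMSE, the fit depends on the true pairing of flows and predictors rather than on the overall distribution of values.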

Comparing Migration Prediction Models.
In general, the goodness-of-fit indicators, including RMSE, R² and AIC, show that CDRLMs using only CDR data could precisely and accurately replicate census-derived statistics, with better predictability than GTSIMs (Figs. S8-S10). Moreover, the performance of GTSIMs could be substantially improved by using CDRs. Comparing the "real" RMSE with the distributions of RMSEs generated by the shuffled census data, it was evident that spatial autocorrelation was not significant in our models ( heads were less likely to be able to afford a cell phone, and there was a significant ownership differential between regions in Namibia (SI text; Tables S2 and S3). To account for the potential mobile ownership bias between regions, two approaches were used to adjust the CDRs. However, the performance of both CDRLMs and CGTSIMs was not significantly improved by these adjustments (Figs. S8-S10).
Predicting inflows and net migration aggregated by region (Figs. S14-S16). However, the relative differences across periods show greater variations in outflow than in inflow between regions, with more people moving out from the West-South regions and into the northern regions in Namibia (see Fig. 5).

Discussion
Migration is difficult to measure frequently, particularly at local scales, and data from censuses are typically collected just once every decade, pushing a need for innovation in the production of migration statistics (International Organization for Migration 2018).
The penetration rate of mobile phones is now high across the globe, and analysing the changing spatiotemporal distribution of mobile phone users through anonymized CDRs offers the possibility to measure migration at multiple temporal and spatial scales. Global mobile phone network subscriber numbers passed the five billion mark in 2017, with a global penetration rate of 66%, and penetration is forecast to continue to grow to 71% by 2025, with rapid recent increases in ownership in low-income countries (The GSM Association 2018). The data collected every second by mobile network operators have the potential to contribute to the "big data revolution" by complementing more traditional statistics and updating internal migration statistics in a timely, accurate and low-cost way.
This study demonstrates how the analysis of CDRs can replicate national internal migration statistics to complement outputs from censuses. The multiannual time series of CDRs with high spatiotemporal resolution facilitates the derivation of residence measures, matching closely the definitions used in censuses. We found that not only can the estimates of migration produced through CDRs be as accurate as census data-derived measures, but these data offer additional benefits in terms of updating intercensal migration numbers and understanding changing patterns of annual internal migration. Additionally, the methodologies presented are designed to be easy to implement while considering the impact of heterogeneous phone ownership across regions and years, and the simple linear model built using CDRs results in estimates with high precision and accuracy.
Results here suggest that CDRs can also improve the performance of gravity models. The GTSIMs explicitly state the spatial interaction relationship between migration and the push-pull factors that represent the benefits and costs of migration (Zipf 1946; Hua and Porell 1979). The estimates made using gravity models contribute to a better understanding of migration patterns, with known boundaries to their accuracy in the absence of censuses or surveys. However, due to the lack of high spatiotemporal resolution input data on contemporary population movements, such models used in previous studies resulted in high uncertainties in estimates (Garcia et al., 2015; Sorichetta et al., 2016; Vobruba et al., 2016). Although biases exist, CDR-derived migration data directly relate to populations who moved across the country over years, so a combination of CDRs and other migration-related covariates could facilitate a significant improvement in the precision and accuracy of outputs from gravity models.
Internal migration is common in Namibia, and we estimated a larger number of migrating mobile phone users compared to those migrating within the census data. One reason is that CDRs do not suffer from recall bias (Wesolowski et al., 2014b) and can capture people who moved but did not register their previous residence in the census. Moreover, different time windows for data capture may also have contributed, with the CDR-based home definition window used here being wider than the census collection date. As elsewhere, the largest proportion of migration in Namibia is rural-to-urban migration, a phenomenon that relates partly to rapid urbanization (Garcia et al., 2015).
Some limitations must be acknowledged. First, to prevent overfitting and multicollinearity, our models did not test a large number of demographic, socioeconomic, geographic and environmental factors, and their combinations, that might potentially affect migration as described before (Henry et al., 2003; Henry et al., 2004; Garcia et al., 2015; Wesolowski et al., 2015b; Ruktanonchai et al., 2016a; Sorichetta et al., 2016; Vobruba et al., 2016). Another methodological shortcoming is that the modelling did not correct for spatial autocorrelation by using a spatial regression model. However, a shuffle approach showed that any spatial dependencies likely did not significantly affect the performance of our models.
Mobile users only cover a proportion of the population; therefore, CDRs may provide an incomplete picture, not accounting for those who do not own and use a phone, mobile phone sharing, network coverage, or alternative networks. The which will also decrease the influence of the problem of phone sharing, which is common in areas with low cell phone penetration.
Additionally, to account for the impact of increasing user numbers across years on migration estimates, we adjusted the CDR-derived data for comparing interannual migration patterns, but these only represent an initial step for adjusting for mobile phone usage changes. Future studies on estimating migration could use other appropriate data, such as travel history and mobile phone use surveys to infer possible correlation in mobile use and migration in demographic-specific subgroups.
Additionally, due to the availability of data, we only investigated here internal migration over the course of a year. Long-term internal migration (>5 years) could be estimated by analysing CDRs over a longer period and these could be integrated with additional data sources, such as Google Location History data (Ruktanonchai et al., 2018), to address relevant underlying research questions and technical issues in the future.
The results here show that estimates of migration flows made using CDRs are a promising avenue for complementing more traditional national statistics and obtaining more timely and local data. The metrics and approaches can inform distinctly different policy-relevant needs that require migration statistics, as well as the implementation of policies geared towards providing relevant public services. Partnerships between governments and phone companies, supported by appropriate incentives, could enable accurate and rapid production of national migration statistics to complement census and survey-based data collection.

Data availability
The internal migration statistics between regions in Namibia in 2011 are available in the migration report published by the Namibia Statistics Agency in 2015.

We then calculated the most frequently observed region for each user and day as that user's daily location. Using these daily locations, we could define the residence as the most frequent daily location during a given year-long period (12 months) for each user. Finally, we derived migration flows of mobile users by comparing the places of residence at regional level across years for each user.
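The residence and migration definitions described here can be sketched in a few lines: the modal region per user and day gives a daily location, the modal daily location over a 12-month period gives the residence, and a change of residence between two periods marks a migrant. The record layout `(user_id, day, region)`, the function names and the tie-breaking rule are assumptions for illustration; the 31-day activity threshold follows the cutoff stated in the text.

```python
from collections import Counter, defaultdict

def modal(values):
    """Most frequent value; ties go to the value seen first (an assumption)."""
    return Counter(values).most_common(1)[0][0]

def yearly_residence(records, min_active_days=31):
    """Map each sufficiently active user to a residence region.

    records: iterable of (user_id, day, region) for one 12-month period.
    """
    regions_by_user_day = defaultdict(list)
    for user, day, region in records:
        regions_by_user_day[(user, day)].append(region)
    daily_locations = defaultdict(list)
    for (user, _day), regions in regions_by_user_day.items():
        daily_locations[user].append(modal(regions))
    # Keep only users with enough days of observed activity.
    return {user: modal(locs) for user, locs in daily_locations.items()
            if len(locs) >= min_active_days}

def migration_flows(residence_year1, residence_year2):
    """Count users whose residence changed, keyed by (origin, destination)."""
    flows = Counter()
    for user, origin in residence_year1.items():
        destination = residence_year2.get(user)
        if destination is not None and destination != origin:
            flows[(origin, destination)] += 1
    return flows

# Toy records (made up): user "a" moves region, user "b" is too inactive in year 1.
year1 = [("a", d, "Khomas") for d in range(31)] + [("b", d, "Erongo") for d in range(5)]
year2 = [("a", d, "Erongo") for d in range(31)] + [("b", d, "Khomas") for d in range(31)]
res1, res2 = yearly_residence(year1), yearly_residence(year2)
flows = migration_flows(res1, res2)
```

Users absent from either period are simply skipped, so only users who meet the activity threshold in both years contribute to the flow counts.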

Comparing Different Time Lengths to Define Residence
As sufficient CDR data points are needed to accurately estimate the place of residence, different time windows of data were compared to define the residences of mobile phone users. We first investigated how many months of data were available for each mobile phone user in an example dataset from January to December 2012 that was used for the migration estimates in Periods 2011 and 2012. A monthly location for each individual was calculated as the most frequent daily location at regional level over each month, if the user made at least one call or text in that month. We found that the most frequent case was that users had data available for all months of the year (Figure S2A), but there was still a high proportion (15%) of infrequent users with only 1 month of data, which could introduce a strong bias when deriving migration flows compared to using yearly locations as residence. Therefore, to exclude very infrequent users, we used 31 days of defined daily locations as a cutoff for an individual to be included in the migration estimation. Moreover, most individuals with more than 1 month of data had monthly locations that were identical to their yearly locations (Figure S2B), and the mean percentage of monthly locations matching yearly locations was fairly homogeneous across the country, ranging from 91% to 97% (Figure S2C).
Seasonal movements might lead to an individual temporarily residing at different locations, which would influence estimates of the place of residence in settings where only shorter periods (e.g. a month or two per year) of CDRs are available (Wesolowski et al., 2017). Figure S3A shows the percentage of locations derived from N random months of data (N ranging from 1 to 11) that match the locations derived from a full year of data. As expected, the accuracy of estimating the usual residence increases with the length of the period covered by the data. However, we observed the largest increase when going from one month to two months of data, likely due to the strong seasonal effect of individuals' temporary locations deviating from their usual places of residence during holiday periods and other times when short-term mobility is prevalent. Furthermore, Figure S3B shows the Z-score of the percentage of users whose monthly locations matched the yearly location. The negative deviation in December and January indicates that fewer users were found at their usual residences in these holiday months, with a similar but smaller effect seen in May. This is part of the reason that censuses are generally timed to occur during a period with low seasonal population movement, e.g. August and September, and again it highlights the need to exclude infrequent mobile phone users to prevent the inclusion of short-term travel in longer-term migration estimation.
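The monthly match percentage and its Z-score can be illustrated with the short sketch below. The dictionary layout, the function names and the use of the population standard deviation are assumptions; the text does not say which standard deviation was used.

```python
from statistics import mean, pstdev

def match_rate(monthly_locations, yearly_locations):
    """Percentage of users whose monthly location equals their yearly location."""
    users = [u for u in monthly_locations if u in yearly_locations]
    hits = sum(monthly_locations[u] == yearly_locations[u] for u in users)
    return 100.0 * hits / len(users)

def z_scores(values):
    """Standardize values using the population standard deviation (assumed)."""
    centre, spread = mean(values), pstdev(values)
    return [(v - centre) / spread for v in values]

# Toy example (made up): one of two users is away from home this month.
rate = match_rate({"a": "Khomas", "b": "Erongo"}, {"a": "Khomas", "b": "Khomas"})
# Toy monthly match percentages; the lowest month gets a negative Z-score.
scores = z_scores([90.0, 95.0, 100.0])
```

Months with strongly negative scores would correspond to periods when many users are temporarily away from their usual residence.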

Phone Ownership
As mobile phone ownership is not homogeneous across the population, we utilized data of assets. The International Wealth Index (IWI) score is an asset-based wealth index used for measuring the economic situation of households in developing countries (Smits and Steendijk 2015). Households were then divided into five quintiles, based on the IWI ranking. Households falling within the upper two quintiles ('richer' and 'richest') were classified as being wealthy, while households in the bottom three quintiles were defined as not being wealthy.
We first performed an exploratory bivariate analysis to compare the characteristics of households owning at least one mobile phone and households without mobile phones. Data were weighted using sampling weights and adjusted for the survey sampling design, and analyses were done using Stata 14 software. Table S2 shows that households owning at least one mobile phone are significantly different from households without mobile phones for almost all background characteristics considered in this analysis. In comparison to households without mobile phones, those with mobiles are significantly more likely to be wealthy, to reside in urban areas and to live in Khomas (22%). Households without mobile phones are more likely to be located in the Kavango, Zambezi and Ohangwena regions. Moreover, households with one or more mobile phones are significantly more likely to have younger heads and a higher number of household members, whereas households without mobile phones tend to have more uneducated household heads and uneducated female and male residents (Table S2).
Furthermore, we fitted a binary logistic regression model for the probability of a household not owning a mobile phone, including all variables that were statistically significant in the bivariate analysis. In particular, we aimed to identify groups of households that had a high probability of not owning a mobile phone, and the characteristics they share, within a multiple regression framework, after adjusting for the effects of other variables (Callegaro and Poggio 2004). Table S3 shows that there is a significant differential in mobile ownership between households with regard to wealth, age of the household head, household size, and education. Our findings also show that there is a significant ownership differential between regions in Namibia, confirming the results from the bivariate analysis. The odds of owning a mobile phone for households residing in most regions are between about 2 and 5 times greater than in Kavango, meaning it may be necessary to take the regional mobile ownership bias into account in CDR-based estimates of migration for each region.

Model Covariates
We also collated potential migration-related demographic, socioeconomic, geographic and environmental variables for migration modelling, as described in previous studies (Henry et al., 2003; Henry et al., 2004; Garcia et al., 2015; Wesolowski et al., 2015; Ruktanonchai et al., 2016; Sorichetta et al., 2016; Vobruba et al., 2016). An administrative unit boundary file at regional level matching the year of the census was obtained from the Global Administrative Areas Database (GADM 2018). Following previous studies (Garcia et al., 2015; Sorichetta et al., 2016), the shapefile was used to calculate variables that measure distance and contiguity between administrative units. Euclidean distance between geometric centroids is commonly used as a parameter in gravity models, where it represents the barriers to, as well as the potential costs of, migration (Garcia et al., 2015; Sorichetta et al., 2016). Moreover, to represent possible environmental drivers of migration in the models, high-resolution monthly precipitation grids (30 arc-seconds, ~1 km²) were obtained from WorldClim version 2 (worldclim.org/version2). We then aggregated the precipitation data to obtain average annual precipitation (mm) by region as a proxy for push-pull factors such as agricultural productivity and the potential for floods and droughts.
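The centroid-to-centroid distance used as the gravity-model distance term can be sketched as follows. The sketch assumes each region is a simple polygon in a projected coordinate system with planar units (for example metres); the function names and the toy squares are hypothetical, and a real workflow would read the boundaries from the shapefile with a GIS library.

```python
import math

def polygon_centroid(vertices):
    """Geometric centroid of a simple polygon [(x, y), ...] via the shoelace formula."""
    twice_area = cx = cy = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # Centroid = sum / (6 * area), and area = twice_area / 2.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)

def centroid_distance(polygon_a, polygon_b):
    """Euclidean distance between the centroids of two polygons."""
    return math.dist(polygon_centroid(polygon_a), polygon_centroid(polygon_b))

# Toy regions (made up): a unit square and the same square shifted by (3, 4).
region_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
region_b = [(3, 4), (4, 4), (4, 5), (3, 5)]
distance = centroid_distance(region_a, region_b)
```

Computing this for every ordered pair of regions yields the distance matrix that enters the gravity models.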
A variety of demographic and socioeconomic variables known to be associated with migration flows were also collated from 2011 census data for each region in Namibia (Table S1). First, we included the populations in origin and destination in 2010 and 2011. Given that urbanization can be a significant pull factor for migrants (Lall et al., 2006; Garcia et al., 2015), we then included the percentage of population living in urban areas in origin and destination, respectively.
Previous studies on migration also suggested that human migration is, at least in part, driven by economic opportunities, and that different demographic characteristics, such as age, sex, educational attainment and marital status, influence migration rates (Henry et al., 2004; Garcia et al., 2015; Sorichetta et al., 2016). Therefore, we also collated the following covariates from 2011 Namibia census data: might offer more opportunities for improving these socioeconomic status (Lall et al., 2006). As urbanization is related to socioeconomic development, we removed the demographic and socioeconomic variables that were highly correlated with the urbanization variables to avoid multicollinearity and overfitting in the models.
SI Tables
Table S1. The summary of models. Table note: proportion of the population owning mobile phones in the origin region.

SI Figures
Fig. S1. Urban area and road networks (A), population density (B), and annual precipitation (C) in Namibia. The data on urban areas were obtained from Natural Earth (www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-urban-area/) (Schneider et al., 2003), the road networks were obtained from DIVA-GIS (www.diva-gis.org), the population density data were downloaded from WorldPop (www.worldpop.org), and the average annual precipitation for 1970-2000 was obtained from WorldClim (worldclim.org/version2).
Figure caption (number not recoverable): Percentage of monthly locations matching yearly locations by region. The monthly/yearly location was defined as the most frequent location at regional level across the whole month/year.
Figure caption (number not recoverable): The RMSE of each model fitted by real, unshuffled census data is given in the title of each graph (see Table S1). The Zambezi region, as an outlier, is excluded from the dataset.
Figure caption (number not recoverable): The Zambezi region, as an outlier, is excluded, and the estimates of optimal models with the lowest RMSE using unadjusted CDR data (Fig. S8 and Table S1) are presented here. *Model #1 of CDRLMs. **Model #3 of CDRLMs.
Fig. S13. Comparing regional outflow, inflow and net migration between 2011 census data and estimates made by models for all regions in Namibia. The estimates of optimal models with the lowest RMSE using unadjusted CDR data (Fig. S8 and Table S1) are presented.
Figure caption (number not recoverable): Results of CDRLM using CDR data adjusted to offset the effect of the increasing number of users across years. The statistics of migration were obtained from the 2011 Namibia Population and Housing Census data. The Zambezi region, as an outlier, is excluded from these graphs, and the fitted CDRLMs using only CDRs for 2011 were used to predict the migration in 2012 with corresponding CDR data.