Unraveling the association between socioeconomic diversity and consumer price index in a tourism country

Diversity has tremendous value in modern society. Economic theories suggest that cultural and ethnic diversity may contribute to economic development and prosperity. To date, however, the correspondence between diversity measures and economic indicators, such as the Consumer Price Index (CPI), has not been quantified. This is primarily due to the difficulty in obtaining data on both micro behaviors and macroeconomic indicators. In this paper, we explore the relationship between diversity measures extracted from large-scale, high-resolution mobile phone data and the CPIs of different sectors in a tourism country. Using phone records in Andorra, we show that diversity measures associate strongly with the general and sectoral CPIs. Based on these strong predictive relationships, we construct daily estimates and spatial maps to monitor CPI measures at a high resolution, complementing existing CPI measures from the statistical office. The Andorra case study contributes to two growing literatures: linking diversity with economic outcomes, and macroeconomic monitoring with large-scale data. Future studies are required to examine the relationship between the two measures in other countries.


Introduction
Diversity is exceedingly valuable in modern society (Puritty et al., 2017). Empirical evidence relates diversity to tangible benefits, such as productivity (sax, 2014; AlShebli et al., 2018) and innovation (Ottaviano and Peri, 2006) for organizations and nations (Galinsky et al., 2015). Diversity has attracted interest from many fields. A commonly accepted view in cognitive science is that cognitive diversity enables the exchange of valuable information, thereby increasing creativity. Similarly, sociologists find that diverse ties provide greater access to social and economic opportunities (Eagle et al., 2010). Opinions differ in economics: some claim that diversity measures predict economic growth (Montalvo and Reynal-Querol, 2005); some argue for a negative impact due to resource allocation between groups (Alesina and Ferrara, 2005); some argue that ethnic diversity deflates price bubbles (i.e., financial failures) (Levine et al., 2014); and some argue that a diverse workforce (e.g., gender and racial diversity) is generally beneficial to corporate profits and earnings (Wright et al., 1995; Herring, 2009). Neighborhood ethnic diversity has also been shown to have different effects on housing prices (Macpherson and Sirmans, 2001). The enthusiasm from various disciplines highlights the importance of understanding diversity. To date, however, the correspondence between micro-level diversity and macro-level economic indicators (e.g., the Consumer Price Index, which measures changes in the cost of purchasing a fixed basket of goods (Stigler, 1961)) has not been quantified. This gap is primarily due to the lack of data on both micro-behavior and economic indicators.
Macroeconomic indicators are essential for economists and policymakers to discern expansions and contractions in the near future (Bok et al., 2018). Several studies focus on predicting inflation and CPI using data collected from the financial markets (Monteforte and Moretti, 2013; Modugno, 2013; Bańbura and Modugno, 2014). Establishing the relationship between micro-behaviors and macroeconomic indicators is invaluable for monitoring economic and social systems. Large-scale behavioral data can illuminate social phenomena and economic processes (Bok et al., 2018; Lazer and Radford, 2017). Traditional data collection methods are time-consuming, expensive, and vulnerable to sampling error (Bok et al., 2018). Hence, policymaking can be improved by indicators with high spatial-temporal resolution, compared with statistics that have substantial publication lags and limited contemporaneous information. Comparatively, large-scale behavioral data has high geographic and temporal granularity and is cost-effective (Lazer and Radford, 2017). Using massive data to approximate macro-indicators is especially promising in developing countries, where reliable data on economic livelihoods remains scarce (Blumenstock, 2016). There have been several successful attempts in the literature. To name a few, satellite and survey data have been combined to predict local economic outcomes in five African countries by applying a deep learning framework (Jean et al., 2016). Mobile phone data has been combined with environmental data to predict the Global Multidimensional Poverty Index (MPI) based on Gaussian Process regression (Pokhriyal and Jacques, 2017). The MIT Billion Prices Project used online prices to construct daily CPI in multiple countries (Cavallo and Rigobon, 2016).
Mobile phone data is especially promising for studying social and economic issues due to its high penetration rate and wide availability (Leng et al., 2021, 2017). By January 2019, 5.1 billion users globally had mobile phones, with a penetration rate of 67%. Among them, 4.39 billion people had access to the internet (Kemp, 2019). More than 20 mobile phone companies have donated their proprietary information to developing big data solutions for social good (Kemp, 2019; Bakker et al., 2019). Epidemiologists have explored call records to combat diseases (Maxmen, 2019), such as malaria in Kenya (Wesolowski et al., 2012), dengue in Pakistan (Wesolowski et al., 2015), and cholera in Haiti (Rinaldo et al., 2012). Researchers have also used mobile phone data to extract macro-level indicators for the performance of tourism events (Leng et al., 2016a). During the COVID-19 pandemic, an increasing number of applications have used mobile phones for contact tracing (Oliver et al., 2020). Therefore, mobile phone data has a natural appeal for unveiling the relationships between micro-level behaviors (e.g., diversity) and economic statistics (Leng et al., 2016a). In this paper, we present a case study that explores the relationship between travel-demand-based diversity measures extracted from mobile phone data and one of the most important macroeconomic indicators in this tourism country, namely the Consumer Price Index.
To the best of our knowledge, our study constitutes the first attempt to study the association between tourist-based diversity and the CPI in a tourism country. We couple the most complete national communication data with the CPI statistics collected by the Andorran government. Although the nature of the data limits the ability to establish a causal relationship, we can explore the association between micro-level diversity and regional economic indicators, both tourism and non-tourism related. Specifically, we investigate the relationship between sociodemographic diversity extracted from mobile phone data surrounding different types of POIs¹ and the CPI of different industries. Our results demonstrate strong predictive relationships between diversity measures of nationality and income and the Consumer Price Index of different sectors. We build statistical models to predict the CPIs of different categories in a European tourism country, Andorra. Finally, we estimate the CPI on a daily basis and at the cell tower level, improving the temporal availability and spatial precision of CPI estimates. This association shows that diversity measures on tourists can be used to predict macroeconomic indicators in a tourism country. This paper proceeds as follows. Section "Methods" covers the setting and data of this study. Section "Results" presents the results on the relationships between CPI and the diversity measures from mobile phone data. Section "Nowcasting and mapping CPI" presents daily nowcasting and spatial maps of the CPI measures. Section "Discussion" summarizes and concludes the paper.

Methods
Call detail records. We use passively collected behavioral data, call detail records (CDRs), in a European country, Andorra. The economy of Andorra relies heavily on tourism. The population of Andorra is only 85,000, while it attracts 10.2 million international visitors annually (Cia.gov., 2012). The high volume of tourists makes Andorra an especially interesting country in which to study sociodemographic diversity.
Mobile carriers initially collect CDRs for billing purposes; hence, this data exists in almost every country in the world. The spatial-temporal resolution of this data is high compared to traditional surveys, and it has the most substantial penetration rate among all passively collected data. The data is recorded when users make phone calls, send short message service (SMS) messages, and use internet data services. It contains information on the longitude and latitude of the cell tower, the timestamp of the transaction, the registry country of the SIM card, and other characteristics of the phone (e.g., the brand, the vendor, the model, and the system of the phone). The CDRs were collected from July 2014 to August 2016 by the only mobile carrier in Andorra. Hence, the coverage of mobile phone data is 100% in our study, meaning that we have all the mobile phone data of individuals who visited Andorra.
Categories of cell towers. There are one hundred cell towers in Andorra, each covering an area of 250 m to 2 km in radius. We use the Voronoi tessellation to approximate the mobile tower coverage (Fortune, 1987). We manually label the cell towers with eight categories of Points of Interest (POIs): wellness, leisure, shop, gastronomic, nature, event, culture, and others. Each tower may be associated with more than one POI category.
Diversity measures. The diversity measure is defined using two types of information, namely, nationality and an approximation of disposable income. We compute the diversity measures at each cell tower and aggregate them according to the eight kinds of POIs. We quantify the diversity at a cell tower as a function of Shannon entropy. According to Stirling (2007), there are three types of diversity measures, namely balance, variety, and disparity. The Shannon entropy we employ in the paper captures balance. Balance measures the pattern of apportionment of tourists from different origin countries. Specifically, Shannon entropy in our case captures the evenness or concentration of tourists' origins. It measures how likely a tourist from a certain origin is to interact with tourists from another country. We believe that both types of interactions, namely how likely a foreigner is to interact with tourists from another country and how likely a local Andorran is to interact with tourists from different countries, contribute to higher CPI. Variety and disparity are not appropriate in our study for the following reasons. Variety is the number of categories into which system elements are apportioned. However, a simple enumeration of the countries that tourists travel from cannot distinguish a country represented by a single tourist from one represented by many. If there were one tourist from each country around the world in Andorra, variety would be large, yet it would not capture diversity. The other type is disparity, which refers to how each tourist origin can be distinguished from another. It measures how different the foreign tourists from one country are from those of another country in Andorra. This is not appropriate for capturing diversity in our setting, as our goal is not to differentiate tourists from one country of origin from another.
Next, we describe how we compute the diversity measure in this study. Shannon entropy takes the form $-\sum_{j=1}^{J} p_j \log p_j$, where $p_j$ is the proportion of individuals who belong to category $j$, and $J$ is the total number of categories. The diversity is computed at a daily level $\tau$, where $\tau \in T$ and $T$ is the set of days over the total observational period. The diversity of nationalities at cell tower $i$ on day $\tau$, $D_{nat,i,\tau}$, is this entropy computed over the nationality shares $t_{i,k,\tau}/T_{i,\tau}$ and normalized by the number of categories, where $T_{i,\tau}$ is the total number of individuals connected to cell tower $i$ during $\tau$, and $t_{i,k,\tau}$ is the number of individuals belonging to nation $k$ who connected to cell tower $i$ during $\tau$. $K^+_\tau$ is the number of distinct nationalities that appear at the cell tower on day $\tau$. $K$ is the total number of nations, and $K = 10$ in our study. The nations consist of Andorra and other countries with frequent tourists to Andorra, including Spain, France, the Netherlands, Belgium, Russia, the UK, Germany, Portugal, and others.
We use phone prices to approximate the disposable income of mobile phone users. We discretize phone prices into 14 categories, including [0,20]. $S$ is the number of phone price categories, and $S = 14$ in our case. The diversity of income at cell tower $i$ on day $\tau$, $D_{inc,i,\tau}$, is constructed similarly using Shannon entropy and is normalized by the number of categories. In its calculation, $S^+_\tau$ is the number of phone price categories with at least one user on day $\tau$, and $p_{i,s,\tau}$ is the number of individuals connected to cell tower $i$ who belong to phone price category $s$ on day $\tau$.
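The entropy-based measure described above can be illustrated with a minimal sketch. It assumes that "normalized by the number of categories" means dividing the entropy by the logarithm of the number of categories, so that the result lies between 0 and 1; dividing by the category count itself is another possible reading. The function name `shannon_diversity` and the sample records are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def shannon_diversity(counts, n_categories):
    """Normalized Shannon entropy of category counts at one tower on one day.

    counts: mapping from category (e.g., nationality) to number of individuals.
    n_categories: total number of categories (K = 10 nationalities or
        S = 14 phone price bins in the text).
    Assumption: the entropy is divided by log(n_categories); the text only
    says the measure is normalized by the number of categories.
    """
    total = sum(counts.values())
    if total == 0 or n_categories < 2:
        return 0.0
    entropy = 0.0
    for c in counts.values():
        if c > 0:  # only categories that actually appear contribute
            p = c / total
            entropy -= p * math.log(p)
    return entropy / math.log(n_categories)


# Hypothetical records: SIM registry country of each user seen at a tower.
tower_day = ["AD", "AD", "AD", "ES", "FR", "ES", "AD", "AD"]
diversity = shannon_diversity(Counter(tower_day), n_categories=10)
```

Under this assumption, a tower visited only by Andorrans scores 0, and a tower with equal numbers of users from all ten nationality groups scores 1.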
A higher diversity measure implies that the cell tower attracts a more diversified (i.e., less uniform) population; see Fig. 1 for an illustration. The left figure is dominated by Andorrans (with a lower diversity measure), while the right plot is more diversified (with a higher diversity measure). We perform z-normalization on all of the predictors used in this study to allow for an easier comparison of variable importance.
Let $C(b)$ be the set of cell towers with POI type $b$, and let $|C(b)|$ be the number of cell towers with POI type $b$. The average diversity of nationality at POI category $b$ on day $\tau$ is: $\bar{D}_{nat,b,\tau} = \frac{1}{|C(b)|} \sum_{i \in C(b)} D_{nat,i,\tau}$. Similarly, the average diversity of income at POI category $b$ is: $\bar{D}_{inc,b,\tau} = \frac{1}{|C(b)|} \sum_{i \in C(b)} D_{inc,i,\tau}$. Note that since one cell tower may be assigned to more than one POI category, it may contribute to more than one POI category.
Consumer price index. We collect the monthly CPI from the Andorra Government Statistics website.² The CPI measures we collected are values relative to the year 2001.³ In addition to the general CPI, we also collect CPI measures for different industries. We segment them into tourism-related and resident-related CPIs:
• Tourism-related CPI: (1) hotels, cafes, and restaurants; (2) food, drink, and tobacco.
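Returning to the POI-level averages defined at the start of this passage, the aggregation over the towers in each POI category can be sketched as follows. The tower identifiers, labels, and values are hypothetical; the only behavior taken from the text is that a tower assigned to several POI categories contributes to each of them.

```python
from collections import defaultdict


def average_by_poi(tower_diversity, tower_pois):
    """Average tower-level diversity over the towers in each POI category.

    tower_diversity: {tower_id: diversity on a given day}
    tower_pois: {tower_id: set of POI categories assigned to the tower}
    A tower labeled with several categories contributes to each of them.
    """
    members = defaultdict(list)
    for tower, pois in tower_pois.items():
        for poi in pois:
            members[poi].append(tower_diversity[tower])
    return {poi: sum(vals) / len(vals) for poi, vals in members.items()}


# Hypothetical towers and labels, for illustration only.
diversity = {"t1": 0.2, "t2": 0.6, "t3": 0.4}
labels = {"t1": {"shop"}, "t2": {"shop", "leisure"}, "t3": {"leisure"}}
poi_avg = average_by_poi(diversity, labels)
```

Here tower `t2` enters both the shop and the leisure averages, which mirrors the multi-label assignment described in the text.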
We focus on the change in CPI relative to the past month. The relative change in CPI in month $t + 1$ relative to month $t$ is defined as $\Delta CPI_{t+1} = \frac{CPI_{t+1} - CPI_t}{CPI_t}$. In Fig. 2, we present the correlations between each pair of industries. Many of the resident-related CPIs (i.e., clothes, resident services, home products, and transport) are highly positively correlated with the general CPI. Among these industrial CPIs, the correlation between the CPI of transport and that of resident services is the highest among all pairwise correlations (correlation = 0.93, p-val < 0.001). Two of the tourism-related CPIs (i.e., food, drink and tobacco, and leisure and culture) correlate with general CPI to a lesser extent. Interestingly, the direction of the correlations is the opposite between the two. CPI of the hotel and restaurant industries negatively correlates with the general CPI, while the correlation between CPI of food, drink and tobacco, and general CPI is positive.
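As a small illustration of the month-over-month change used as the target variable, the sketch below assumes the standard definition, namely the difference between consecutive monthly values divided by the earlier value. The index values are made up and are not Andorran statistics.

```python
def relative_change(cpi):
    """Month-over-month relative change: (CPI[t+1] - CPI[t]) / CPI[t]."""
    return [(nxt - cur) / cur for cur, nxt in zip(cpi, cpi[1:])]


# Hypothetical monthly index values, not actual Andorran data.
monthly_cpi = [100.0, 102.0, 101.49]
delta_cpi = relative_change(monthly_cpi)
```

A series of n monthly values yields n - 1 changes, so the first observed month has no associated change.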

Results
Correlation between diversity and CPI. We first investigate the association between diversity and the consumer price index (CPI), as illustrated in Figs. 3 and 4. We present the Pearson correlation in Fig. 3. The association between general CPI and the diversity of income at leisure and nature places is exceptionally high, with correlations of 0.805 (p-val < 0.001) and 0.775 (p-val < 0.001), respectively. We further illustrate the relationships via a scatter plot in the leftmost plot of the first row in Fig. 4. This pattern suggests that general CPI can be predicted reasonably well using either of these single diversity measures.
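For readers who want to reproduce this kind of pairwise analysis on their own data, a minimal Pearson correlation sketch follows. The two monthly series are hypothetical, and the sketch computes only the coefficient, not the p-values reported in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical monthly series: a diversity measure and a CPI change.
income_diversity = [0.50, 0.55, 0.61, 0.58, 0.66, 0.70]
delta_cpi = [0.001, 0.002, 0.004, 0.003, 0.005, 0.006]
r = pearson_r(income_diversity, delta_cpi)
```

With monthly data spanning roughly two years, each coefficient rests on only a couple of dozen points, which is worth keeping in mind when comparing correlations of similar size.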
In addition, many of the diversity measures are highly predictive of the CPI of hotels and restaurants. For example, the diversity of nationality at the country level and at shops is strongly correlated with this CPI measure (r = 0.75, p-val < 0.001). We present the scatter plots in the two rightmost figures of the first row in Fig. 4. Additionally, four of the diversity of nationality measures, namely the diversity of nationality at the country level (r = −0.765, p-val < 0.001) and at shopping (r = −0.750, p-val < 0.001), food (r = −0.785, p-val < 0.001), and cultural POIs (r = −0.756, p-val < 0.001), negatively associate with the CPI of clothes and shoes.
Diversity of income and diversity of nationality play different roles in predicting inflation and deflation, as can be seen from the contrary patterns. Diversity of income at different POIs positively correlates with general CPI. In contrast, the diversity of nationality presents a negative correlation. A similar contrary pattern shows up in most other CPIs: diversity of nationality negatively correlates with the CPI of residence related services (r = −0.401, p-val < 0.05), and of furniture, home products, and home services (r = −0.392, p-val < 0.05). In contrast, the diversity of income positively correlates with these CPI measures: it positively correlates with the CPI of transport (r = 0.682, p-val < 0.001), residence related services (r = 0.650, p-val < 0.001), and furniture, home products, and home services (r = 0.395, p-val < 0.05).
A selective set of diversity measures. Next, we use the elastic net regression method to select the most important covariates for predicting each CPI measure, as shown in Figs. 5 and 6. This analysis helps us understand whether we can use a small number of diversity measures to reach reasonable predictive performance on CPIs. We see that the diversity of income in the country and at leisure and nature POIs, and the diversity of nationality at wellness POIs, are more predictive of general CPI than other diversity measures. The three income-related diversity measures exhibit a positive relationship with general CPI, while the nationality-related diversity measures exhibit a negative correlation. For the CPI of hotels and restaurants, four covariates related to the diversity of nationality, namely the diversity of nationality in the country and at leisure, nature, and other POIs, are sufficient to achieve reasonable outcomes. All of these diversity measures contribute positively, similar to the correlation pattern shown in Fig. 3.
For general CPI, eight diversity measures are selected to achieve $R^2 = 0.74$, among which both the diversity measures in the whole country and those at leisure and nature places are selected. Also, four diversity measures of nationality contribute negatively, and four diversity measures of income contribute positively to general CPI. This pattern (income diversity contributes positively, and nationality diversity contributes negatively) also shows up in models for transport (with $R^2 = 0.78$). Four covariates are sufficient to predict the CPI of (1) transport; and (2) hotels and restaurants. Comparatively, all covariates provide additional predictive power to nowcast the CPI of (1) clothes and shoes, and (2) food, drinks, and tobacco.
We summarize the importance of covariates in Fig. 5. We present the top ten covariates ranked by the number of models they contribute to and by the sum of their coefficients in magnitude. The diversity of nationality at leisure POIs and in the whole country are predictive for most CPIs. Diversity of nationality at event, wellness, and leisure POIs are the most important by the magnitude of coefficients across all POIs. Diversity measures at shop, food, and other POIs are not as predictive of CPI measures as the top ten diversity measures. This result indicates that diversities at leisure, nature, wellness, event, and culture POIs are more predictive for the CPIs in this tourism country. As data on more countries become available, future research can explore whether this pattern exists in other tourism countries.
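To make the covariate selection step concrete, the following is a minimal coordinate-descent sketch of elastic net regression. It is not the authors' implementation: the objective scaling is an assumption, the parameter names `lam` (penalty strength) and `alpha` (L1/L2 mixing weight) follow the wording of the paper's note on tuning, and the toy data is hypothetical. Covariates are assumed to be z-normalized and the target centered, so no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, n_sweeps=200):
    """Coordinate descent for the (assumed) objective
        (1 / 2n) * ||y - Xw||^2
        + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||^2).

    X: list of rows with mean-zero columns; y: centered targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Covariance of column j with the residual excluding its own effect.
            rho = sum(v * (r + v * w[j]) for v, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            # Keep the residual in sync with the updated coefficient.
            residual = [r - v * (new_wj - w[j]) for v, r in zip(col, residual)]
            w[j] = new_wj
    return w


# Hypothetical toy data: the target depends on the first covariate only.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
weights = elastic_net(X, y, lam=0.5, alpha=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

The L1 part of the penalty sets the coefficient of the uninformative second covariate exactly to zero, which is the mechanism that lets a small subset of diversity measures be selected; in practice `lam` and `alpha` would be tuned, for example by cross-validation over a grid.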

Nowcasting and mapping CPI
Since the population's diversity can be computed at a much higher spatial and temporal resolution, we can nowcast⁶ the CPI with much finer granularity. Using the covariates selected via elastic net regression,⁷ we nowcast ΔCPI at a daily level and at the cell tower level. This analysis further highlights the benefit of using mobile phone data: it allows the government to adjust economic policies in a timely manner, leading to a data-driven smarter city.
Temporal nowcasting on a daily basis. Regular updates on macroeconomic indicators are necessary to enable the federal government to (1) adjust historical data, (2) escalate federal payments and tax brackets, and (3) adjust rents and wages (Bok et al., 2018). To this end, our result helps to provide frequent data for policymakers to complement the infrequent macroeconomic statistics. This analysis demonstrates the potential of mobile phone data to provide a timely proxy for CPI measures at the country and sectoral levels. The nowcasting for daily CPI measures is shown in Fig. 7. We see that our model can capture the variations in the CPI measures. Most CPI indicators demonstrate a periodic pattern, except for home product services, which are relatively flat, and health services, which showed an increasing trend in 2015. Our method captures the periodic pattern well, especially for clothes and shoes.
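The daily nowcasting step can be pictured as applying a model fitted on monthly data to covariates computed each day. The sketch below is only one plausible reading: the coefficient, the intercept, the single covariate, and the choice to z-normalize with statistics of the same daily series are all assumptions made for illustration.

```python
import statistics


def z_normalize(values, mean, std):
    """Standardize values with a given mean and standard deviation."""
    return [(v - mean) / std for v in values]


def nowcast_daily(daily_features, coefs, intercept):
    """Apply a fitted linear model to each day's z-normalized covariates.

    daily_features: list of rows, one per day, one column per covariate.
    coefs, intercept: parameters assumed to come from a model fitted on
        monthly data (hypothetical values below).
    """
    return [intercept + sum(c * x for c, x in zip(coefs, row))
            for row in daily_features]


# Hypothetical daily diversity of income at leisure POIs.
daily_income_div = [0.50, 0.60, 0.70]
mu = statistics.mean(daily_income_div)
sigma = statistics.pstdev(daily_income_div)
z = z_normalize(daily_income_div, mu, sigma)

# One covariate per day; coefficient and intercept are illustrative only.
daily_delta_cpi = nowcast_daily([[v] for v in z], coefs=[0.002], intercept=0.001)
```

The same function applies unchanged to tower-level covariates, which is how a spatial map of predicted values could be produced from the same fitted coefficients.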
Mapping regional CPI. CPI is usually reported on a national scale, while our method can estimate such measures at a regional level. This regional nowcasting provides more insights for policymakers to design regional policies. In Fig. 8, we show the spatial distributions of the predicted general CPIs in June 2015 and January 2016. The regional variation shown in both plots demonstrates the regional differences in CPIs, which implies that the country-level CPI is not sufficient to capture the regional variations. We perform community detection on the cell towers according to the predicted CPI. We observe that although spatially proximate cell towers tend to belong to the same group, some towers that are farther apart are also grouped together. The regional nowcasting might be helpful for policymakers to design corresponding interventions to deal with the regional variations in CPI.
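The grouping of towers by their predicted CPI can be sketched with a simplified spectral method. The paper's figure note mentions spectral clustering on the pairwise correlation matrix of cell towers; the version below is a two-group simplification that splits towers by the sign of the Fiedler vector of the graph Laplacian. The affinity transform, the number of groups, and the sample series are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spectral_bipartition(series, n_iter=500):
    """Split towers into two groups from the correlation of their predicted CPI.

    series: list of per-tower time series of predicted CPI.
    Affinity (assumption): (correlation + 1) / 2, which lies in [0, 1].
    The split follows the sign of the Fiedler vector of the Laplacian
    L = D - W, found by power iteration on (shift * I - L) with the
    constant vector projected out at every step.
    """
    m = len(series)
    W = [[0.0 if i == j else (pearson_r(series[i], series[j]) + 1) / 2
          for j in range(m)] for i in range(m)]
    deg = [sum(row) for row in W]
    shift = 2 * max(deg)  # upper bound on the largest eigenvalue of L
    M = [[(shift - deg[i] if i == j else 0.0) + W[i][j] for j in range(m)]
         for i in range(m)]
    v = [float(i) for i in range(m)]
    for _ in range(n_iter):
        mean = sum(v) / m
        v = [x - mean for x in v]  # remove the constant component
        v = [sum(M[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [1 if x > 0 else 0 for x in v]


# Hypothetical predicted CPI series for four towers: two follow a rising
# trend, two follow an alternating pattern.
towers = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [2.0, 1.0, 2.0, 1.0, 2.0],
    [3.0, 2.0, 3.0, 2.0, 3.0],
]
groups = spectral_bipartition(towers)
```

Because the grouping depends only on how the predicted series co-move, two towers can land in the same group without being spatially adjacent, which is consistent with the observation in the text.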

Discussion
Our paper reveals strong associations between diversity measures and CPI measures in the European country of Andorra, contributing to the growing literature on diversity in various disciplines. We find that the diversity of income at leisure and nature POIs alone is highly predictive of general CPI. Moreover, diversities of nationality at the country level and at shops are highly correlated with the CPI of hotels and restaurants. Interestingly, these two diversity measures negatively correlate with the CPI of clothes and shoes. Also, we use a statistical model to select a smaller number of covariates for predictions. Our result shows that socio-demographic diversities of tourists in Andorra and at some tourism-related (e.g., leisure, nature, and wellness) POIs are highly predictive of multiple sectoral CPIs. This result is useful when the diversity measures cannot be computed for many POIs because data is limited or POIs are sparse. Although the data cannot be used to show causality, the association suggests that diversity can be a strong predictor for CPI. Our finding provides empirical evidence to support the relationships between social structures (i.e., the diversity of individuals in a small region) and the CPI of different industries (both tourism and non-tourism related). Diversity of tourists may indicate that the service and tourism industry is attractive to tourists from different countries and with different income levels. This suggests a healthy economy. In addition, more international tourists may inflate prices. All of these lead to changes in CPI.
Undoubtedly, new data and information technology can improve timely statistics in economics and the monitoring of society (Lazer and Radford, 2017). Our work represents an attempt to build predictive maps and daily predictions of CPI using mobile phone data in a tourism-based European country, contributing to the burgeoning literature using big data to produce timely macroeconomic indicators. The strong association may be due to the strong relationships between the behavior of tourists and the CPIs of this tourism country. The universal coverage of cell towers and the wide availability of CDRs make it possible to predict CPI at high spatial and temporal resolutions in other countries. As behavioral data becomes more available, the high-resolution predicted economic indicators from this data could complement the static and lagged government statistics to help policymakers and economists make more informed decisions. Using travel-demand-based CDRs to estimate the CPIs of different industries and deliver accurate, high-resolution CPI maps offers a way to complement traditional statistical methods and provide regular updates at high spatial resolution in this tourism country. This study offers a framework to utilize human behavioral data by aggregating information at the cell tower level without revealing sensitive user information.
Our study is not without limitations and hence points to several future directions. We provide empirical evidence demonstrating the relationship between diversity and inflation (deflation) in the tourism country of Andorra. Andorra is an especially interesting case study for diversity, as it is highly international and attracts tourists from all around the world. We expect that such a study can possibly be extended to other tourism countries or cities that attract international travelers, and it opens up opportunities for similar analyses in other countries using mobile phone data. We leave the analysis of other contexts (non-tourism countries) for future work. As more data becomes available, we expect the same framework to be applied in different countries (both tourism and non-tourism countries) to examine the external validity of this study. The economic literature has laid out several micro-foundations to explain the forces underlying ethnic diversity and economic development (Alesina and Ferrara, 2005), related to individual preferences (Alesina et al., 2000), individual strategies (Alesina and Ferrara, 2005), and the production function (e.g., heterogeneity vs. innovation and productivity) (Ottaviano and Peri, 2006). More theoretical groundings may grow out of this work to explain the relationships between diversity and CPI, especially in a tourism country.
Uncovering the underlying mechanism connecting diversity and CPIs is not within the scope of this study. However, we provide some potential explanations. Future studies can unveil the causal mechanism. First, the diversity of tourists may indicate that the tourism industry of Andorra is attractive to diverse tourists (from different countries and with different income levels). The attractiveness to diverse cultures and income levels may predict a higher CPI. This also suggests the economic health of the country, which may, in turn, attract more tourists. Second, with many tourists from different nations, the Andorra tourism department and attraction organizers need to provide more services to accommodate diverse needs. Tourists from different nations conceivably have different language needs and expectations; tourists from different income levels are more likely to be interested in a broader range of activities. Third, tourists from different nations mean that more marketing investments have been made in foreign countries. This may also boost the CPI, especially for the tourism-related industry. Fourth, tourists interacting with people from other nations and with others of different income levels can learn from the activities others perform. This social learning process diversifies their travel experience and provides a new source of information. They may therefore make more purchases, leading to a higher CPI.
We expect future research to provide micro-foundations to explain the forces underlying these socio-demographic diversity measures on tourists and CPI measures in a tourism country. In addition, the data used in this paper does not allow for the establishment of causality. Further research can explore the causal relationships between social structure and inflation (deflation) using observational causal analysis, which is useful in policy designs for economic development. Lastly, our results present interesting patterns in a tourism country. As more data become available, it would be interesting to see whether similar patterns persist in other countries with a different economic structure.
Fig. 7 Daily CPI nowcasting. The x-axis corresponds to the days. The y-axis is the predicted CPI (in blue) and the actual CPI (in orange), relative to the start of the period (June 1, 2014).
ARTICLE HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-021-00822-w

Appendix: data availability
The data was obtained through a collaboration with the Andorra government and the Andorra mobile carriers. The brand of the phone and the country in which the SIM card was registered were collected by the mobile carrier for billing purposes. The data is fully anonymized (no user identification is collected) and is stored on Andorran servers. The authors run algorithms on the Andorran server and obtain the aggregate statistics at the cell tower level (e.g., the diversity measures used in this study).
Next, we discuss the availability of such data for replicating the analysis of this research. We agree that the availability of mobile phone data and the practical applicability of the method matter for assessing the impact of our paper. First, the application of mobile phone data to smart cities has been an active research field. In these applications, policy-making departments (e.g., transportation, tourism, environment) can easily collaborate with the mobile carriers to utilize these data for social issues. The tourism and transportation department is a governmental organization and can typically collaborate with mobile carriers to obtain relevant data. These three organizations can collaborate to build smart cities and improve the economy, similar to the type of collaboration built in this study. Second, in cases where such a collaboration has not been established, mobile phone data has been made available by different stakeholders, specifically, as open data by some mobile carriers and as services (data and analytics) from some companies. We name a few of them here. Owing to the wide availability and the opportunistic nature (i.e., data initially collected for billing purposes) of mobile phone data, many mobile carriers share their data with the public. For example, in Europe, CDR data has been made available in Milan and the Province of Trentino in Italy. In Africa, CDRs have been opened to the public in Ivory Coast and Senegal through two Orange D4D challenges. In Asia, China Unicom shared mobile phone data in the 2018 CCF big data and computer intelligence competition through Data Foundation.⁸ Apart from these open data, some other organizations and companies provide services and analytics on mobile phone data. For example, AirSage provides a service in the US for collecting this type of CDR data, which commercial companies can use to identify consumer patterns.⁹
Talkingdata is a China-based company that provides data and services on mobile phone data.¹⁰ Predicio is a Europe-based company that provides mobile phone data and helps businesses with actionable consumer behavior insights.¹¹ Flowminder is a Sweden-based company that provides anonymous phone records for policymaking and social good.¹² Moreover, other firms provide a similar type of mobile phone records with higher spatial resolution through GPS, such as SafeGraph¹³ and Cubiq.¹⁴ To support the collective efforts to fight against COVID-19, SafeGraph has created a data consortium and shared mobility data in the US.

Data availability
Due to the nature of this research, data stakeholders did not agree for their data to be shared publicly, so supporting data is not available.
Received: 9 July 2020; Accepted: 1 June 2021

Notes
1 A point of interest is a specific location that someone may find useful or attractive, such as a skiing resort or a museum. This term widely appears in geographic information systems. The category of POI potentially indicates the trip purposes and activities in transport studies (Leng et al., 2021).
2 Govern D'Andorra.
3 Note that CPI is a measure of the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services. Hence, it is usually reported relative to the past.
4 Residence related services include (1) rental of housing, (2) services and products for the conservation of the home, (3) water distribution, sewers, and purification, (4) electric energy, (5) gas, flammable liquids.
5 This category includes: (1) furniture, furniture accessories, carpets, (2) textile articles for home and articles of furniture, (3) equipment and accessories for the home, (4) crystal, crockery and other home products, (5) small tools and disposable items for construction.
6 Nowcasting is the prediction of the present, the very near future and the very recent past in economics (Giannone et al., 2008).
7 When tuning the parameters for elastic net regression, the search space used for λ, the penalty term, was $10^{-3}$ to $10^{3}$. The possible range for values of α, the mixing
The size of the nodes corresponds to the predicted CPI of the corresponding month. The color of the nodes corresponds to the segmentation based on the predicted CPIs across our observational period. We compute the pairwise correlations of cell towers and then perform spectral clustering on the correlation matrix to obtain the clusters.