Table 2 Example data sources reported in the BEE-COAST framework

From: Can big data solve a big problem? Reporting the obesity data landscape in line with the Foresight obesity system map

Ordnance Survey (OS) Points of Interest (POI) data
Background Key features POI is a dataset detailing over 4 million geographic features (both natural and built) across Great Britain
  History The dataset is created and maintained by PointX Ltd on behalf of OS, the national mapping agency of Great Britain. PointX is an independent company jointly owned by OS and Landmark Information Group. POI data has been available since 2000, and is updated quarterly (see below)
  Purpose POI was developed for the purpose of mapping features of public interest in Great Britain. It has various uses, both administrative (e.g., service provision and emergency planning) and commercial (e.g., driver routing and location based services)
Elements Content POI is a dataset detailing over 4 million geographic features (both natural and built) across Great Britain. The scope of features covered is broad, including commercial services, education and healthcare establishments, transportation infrastructure, attractions, and public infrastructure. Of particular relevance to the obesity system, the dataset contains information on food outlets (various classifications), public transportation nodes (e.g., bus stops), formal green spaces (e.g., commons and parks), and sport and recreational facilities.
For each feature, the following data are available:
- Unique Reference Number
- Feature Name
- Feature Classification (600 classifications available)
- Feature Address
- Feature Location (British National Grid coordinates)
- Positional Accuracy of Feature Location
- Unique Property Reference Number (allows linkage to OS Address Base suite of products)
- Topographic ID and version Identifier (allows linkage to OS MasterMap Topography Layer product).
- ITN easting, northing, TOID and version identifier (allows linkage to OS MasterMap ITN layer)
- Telephone number and/or web address
  Ownership Ordnance Survey
  Aggregation Data are available at the level of individual features
  Sharing POI data can be accessed for free online via the EDINA Digimap website using an educational institution login. However, use of the data via this means is restricted to ‘Educational Use’ and/or limited ‘Administrative Use’, as defined by Ordnance Survey’s end user agreement. Data can be shared with others who have entered into the end user agreement/a data handlers’ agreement with Ordnance Survey. Less restrictive access to the data can be purchased at a cost
  Temporality A new version of POI is released every quarter. EDINA Digimap hold previous versions of POI back to March 2015. With each new release, OS publish details on the changes that have been made as compared to the previous release.
Note, feature classification codes have also changed over time (last update at time of writing: January 2013)
Exemplars Indicative use cases POI can be used to characterise access to local amenities relating to diet and physical activity such as food outlets [23], and sport and recreational facilities [24]
  Foresight nodes 4.2 Opportunity for team based activity, 4.3 Access to opportunities for physical exercise, 4.6 Reliance on labour saving devices and services, 4.9 Opportunity for un-motorised transport, 4.11 Dominance of motorised transport, 4.13 Walkability of living environment, 7.4 Food exposure, 7.5 Food abundance, 7.7 Convenience of food offerings, 7.8 Food variety
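As an illustration of the indicative use case for POI, the sketch below counts features of a given classification within a fixed straight-line distance of a location. British National Grid coordinates are expressed in metres, so distance can be computed directly from eastings and northings. The field names (`easting`, `northing`, `classification`), the sample records and the 500 m radius are assumptions for illustration and do not reflect the actual POI schema or classification labels.

```python
from math import hypot

# Hypothetical records; real POI field names and classification labels differ.
features = [
    {"name": "Cafe A", "classification": "food outlet", "easting": 430100, "northing": 433900},
    {"name": "Park B", "classification": "green space", "easting": 430400, "northing": 434000},
    {"name": "Takeaway C", "classification": "food outlet", "easting": 431500, "northing": 435200},
]


def count_within(features, easting, northing, classification, radius_m=500):
    """Count features of one classification within radius_m metres (straight line)."""
    return sum(
        1
        for f in features
        if f["classification"] == classification
        and hypot(f["easting"] - easting, f["northing"] - northing) <= radius_m
    )


# Food outlets within 500 m of an assumed home location.
n_food = count_within(features, 430000, 434000, "food outlet")
```

Straight-line distance is the simplest choice; a study of walkability would more plausibly measure distance along a road or path network.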
Food Standards Agency (FSA) food hygiene data
Background Key features FSA data contain locational, functional (i.e., business type) and hygiene rating information on food businesses in the UK
  History Under UK law any business intending to conduct ‘food operations’ (including selling, cooking, storing, handling, preparing or distributing food) must register with the environmental health department of its Local Authority (LA). The resulting register is then used by the environmental health team to conduct food hygiene inspections and enforce food law.
The register is updated by an LA when a business registers its intention to conduct food operations, and businesses are removed when they inform the LA of their intention to terminate food operations. Data are also updated when environmental health officers conduct food hygiene inspections. The frequency of such inspections depends on the initial food hygiene rating assigned to the business
  Purpose As above in history
Elements Content Data are available for all LAs that are participating in the Food Hygiene Rating Scheme (FHRS) in England, Northern Ireland and Wales, or the Food Hygiene Information Scheme (FHIS) in Scotland. Participating LAs are listed on the Food Standards Agency website. Presently, all LAs in the UK participate in one of the two schemes.
Datasets are downloadable separately for each LA. Each dataset contains information on:
- Business name
- Business type (13 classifications, including ‘Pub/Bar/Nightclub’, ‘Restaurant/Café/Canteen’, ‘Retailers–Supermarkets/Hypermarkets’, ‘Retailers–other’ and ‘Takeaway/Sandwich Shop’)
- Business address
- Food hygiene ratings and last inspection date
- Longitude and latitude
  Ownership Local authorities
  Aggregation Data are available at the level of individual businesses
  Sharing Data are freely available online via the Food Standards Agency website as part of the UK Government’s open data initiative. There are no restrictions as to the use of the data
  Temporality The FSA website pulls data on a daily basis from LA food hygiene ratings databases. There is no information on how regularly the LAs themselves update their databases, and this is likely to vary between LAs. Correspondence with an environmental health officer from one LA, for example, indicated that their data were updated fortnightly.
Data on the FSA website are overwritten with each daily update, and thus no historical data are available
Exemplars Indicative use cases Data can be used to characterise access to food outlets [25, 26] and to assess the quality/acceptability of food offerings within an area (via hygiene ratings)
  Foresight nodes 7.4 Food exposure, 7.5 Food abundance, 7.7 Convenience of food offerings, 7.8 Food variety
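Because FSA datasets are downloaded separately for each LA, characterising food outlet access across a study area means combining several files. The sketch below shows one way this might be done, tallying outlets by business type across LA extracts. The record layout (`business_name`, `business_type`) and the sample values are hypothetical; the actual download format and field names should be checked against the FSA website.

```python
from collections import Counter

# Hypothetical extracts from two LA datasets (field names are assumptions).
la_one = [
    {"business_name": "Shop 1", "business_type": "Takeaway/Sandwich Shop"},
    {"business_name": "Shop 2", "business_type": "Restaurant/Café/Canteen"},
]
la_two = [
    {"business_name": "Shop 3", "business_type": "Takeaway/Sandwich Shop"},
]


def outlets_by_type(*datasets):
    """Tally businesses by type across any number of LA extracts."""
    return Counter(
        record["business_type"] for dataset in datasets for record in dataset
    )


counts = outlets_by_type(la_one, la_two)
```

Since the website data are overwritten daily, any such tally is a snapshot; keeping dated copies of the downloads is the only way to build a time series.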
Supermarket loyalty card data
Background Key features Transactional records for food and drink purchases (and all other products sold in a supermarket)
  History Traditionally these data are collected for the card holder to gain points on their purchases within a given store. Retailers use the data to target promotions and marketing
  Purpose As above in history
Elements Content Example data fields:
- Customer ID (or pseudoID)
- Customer home address aggregated to an area level
- ID for supermarket address where purchase made
- Food type purchased: e.g., avocado
- Food group purchased: e.g., produce
- Number of items purchased in supermarket
- Cost of items purchased in supermarket
- Number of items purchased online
- Cost of items purchased online
- Number of items purchased in convenience store
- Cost of items purchased in convenience store
  Ownership Supermarket or the loyalty card provider if different
  Aggregation Individual data
Geographic identifier–Output area
  Sharing Currently on a project by project basis.
Some data available via the Consumer Data Research Centre (CDRC)
  Temporality Date and time of purchase available
Exemplars Indicative use cases Many examples to date relate to store location planning by major supermarkets, for example demand for grocery retailers in tourist areas, determined by store loyalty card transactions [27]
  Foresight nodes 1.8 Media consumption, 1.11 Exposure to food advertising, 1.16 Smoking cessation, 2.10 Use of medicines, 5.7 Level of available energy, 5.12 Reliance of pharma remedies, 5.20 Quality and quantity of breastfeeding and weaning, 6.1 Purchasing power, 6.4 Demand for health, 6.8 Desire to maximise volume, 6.9 Desire to differentiate food offerings, 6.11 Desire to minimise costs, 6.12 Standardisation of food offerings, 6.13 Market price of food offerings, 6.17 Societal pressure to consume, 7.4 Food exposure, 7.5 Food abundance, 7.6 De-skilling, 7.7 Convenience of food offerings, 7.8 Food variety, 7.9 Alcohol consumption, 7.11 Energy density of food offerings, 7.12 Fibre content of food and drink, 7.13 Portion size, 7.14 Demand for convenience, 7.16 Nutritional quality of food and drink, 7.1 Force of dietary habits
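Given individual-level transactions carrying an Output Area identifier, a common first step is to aggregate spending by area and food group. The sketch below illustrates this under stated assumptions: the field names (`output_area`, `food_group`, `cost`), the area codes and all values are invented for illustration and do not represent any retailer's actual schema.

```python
from collections import defaultdict

# Hypothetical transaction records (all names and values are illustrative).
transactions = [
    {"customer_id": "c1", "output_area": "OA001", "food_group": "produce", "cost": 1.50},
    {"customer_id": "c2", "output_area": "OA001", "food_group": "produce", "cost": 2.25},
    {"customer_id": "c1", "output_area": "OA001", "food_group": "bakery", "cost": 1.00},
    {"customer_id": "c3", "output_area": "OA002", "food_group": "produce", "cost": 3.00},
]


def spend_by_area_and_group(transactions):
    """Sum purchase cost for each (output area, food group) pair."""
    totals = defaultdict(float)
    for t in transactions:
        totals[(t["output_area"], t["food_group"])] += t["cost"]
    return dict(totals)


totals = spend_by_area_and_group(transactions)
```

Aggregating to area level before analysis also reduces disclosure risk, which matters when access is negotiated on a project by project basis.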
Physical activity applications/wearables
Background Key features Real-time or near real-time recording of physical activity. Often includes Global Positioning System (GPS) point data from the phone or app in addition to detailed information from the device. This will likely include information on the duration, intensity, time and place of the activity. Some of the more basic step counters may only include an indication of total steps
  History These devices have become increasingly popular in recent years for personal monitoring of physical activity. The opportunity to earn rewards (e.g., via Bounts or Pru vitality) can be motivating. Opportunities for gamification, or for joining up with friends to challenge each other, provide further motivation
  Purpose To monitor personal physical activity levels
Elements Content Example: Bounts
- Serial number to identify records in the report
- UserID (or pseudo id)
- Date and time
- App source
- Distance travelled (m)
- Activity type
- Activity duration (s)
- Number of steps
- MYZONE Effort Points–calculated using the MYZONE system which converts heart rate, calories and time exercising into points
- Average speed km/h
- First four digits of post code
- Gender
- Year of birth
- GPS point data – latitude, longitude, altitude, accuracy, location type, course, speed
  Ownership The individual.
Access at scale is often via the technology company owner
  Aggregation Data are at the level of an individual. However, geographic identifiers are at an aggregated area level.
Fine grain GPS estimates
  Sharing Bounts data available via the CDRC. This includes data from other fitness devices streamed via the Bounts App.
Data from other sources available at a monetary cost e.g., Strava
  Temporality Bounts data has GPS point data for every 20 minutes throughout the day for data collected by the app installed on a phone. These data are downloaded daily to the Consumer Data Research Centre (CDRC)
Exemplars Indicative use cases Prior to the use of new types of activity trackers, assessing the reliability of the data generated by these devices is essential. Evaluation of the popular Fitbit tracker for use in health care monitoring is one example of this [28]
  Foresight nodes 3.1 Physical activity, 3.2 Functional fitness, 3.3 NEAT non-volitional activity, 3.4 Level of recreational activity, 3.5 Level of domestic activity, 3.6 Level of occupational activity, 3.7 Level of transport activity, 4.2 Opportunities for team based activity, 4.3 Access to opportunities for physical exercise, 4.4 Cost of physical exercise, 4.10 Ambient temperature, 4.12 Dominance of sedentary employment, 4.13 Walkability of living environment, 7.4 Food exposure
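The Bounts example lists distance travelled in metres, activity duration in seconds and average speed in km/h. The sketch below shows how the first two relate to the third, which may be useful as a consistency check on received records. Whether Bounts derives its speed field this way is not stated in the source, so this is an assumption, and the function name is hypothetical.

```python
def average_speed_kmh(distance_m, duration_s):
    """Return average speed in km/h, or None when duration is not positive."""
    if duration_s <= 0:
        return None
    # metres per second multiplied by 3.6 gives kilometres per hour
    return (distance_m / duration_s) * 3.6


# A 5 km activity lasting 25 minutes (1500 seconds).
speed = average_speed_kmh(5000, 1500)
```

Returning None for a zero or negative duration avoids a division error on malformed records and leaves the decision about how to treat them to the analyst.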
Web-based or smartphone apps to record diet
Background Key features Using new technologies to record diet offers two new key features: the opportunity to select from a wide range of food and beverage products, and a timely, in-depth nutrient breakdown of foods recorded as consumed
  History Traditionally, diet has been recorded through paper based questionnaires and diaries, which are burdensome for participants to complete and for researchers to code in nutrient composition software. Nutrient composition software typically includes a nutrient breakdown for only ~3200 foods, whereas tools like myfood24 offer the nutrient composition of ~45000 foods at the push of a button
  Purpose New technologies enable timely recording of diet for personal use and for research purposes
Elements Content Self-reported dietary consumption including elements such as: meal slot, time of day, branded and/or generic items, scanned unique product codes (UPC; ‘bar codes’), portion size, own recipes, photos of meals, nutrient composition of foods
  Ownership The individual and the technology company
  Aggregation Individual level – nutrient summary information or a full breakdown (120 nutrients).
myfood24 will provide the individual’s region of residence.
Certain phone apps will likely include some GPS point data
  Sharing Depends on the technology
  Temporality Multiple entries are likely, depending on the type of use by the individual
Exemplars Indicative use cases The MyMealMate app has been evaluated for use in weight loss and is available for download for Android and iOS.
The development, usability and relative validity of myfood24 have been well documented [29, 30]. The tool is currently available for research purposes. The public can access the tool via: www.myfood24.org
  Foresight nodes 4.3 Access to opportunities for physical exercise, 5.20 Quality and quantity of breastfeeding and weaning, 6.1 Purchasing power, 6.4 Demand for health, 6.8 Desire to maximise volume, 6.9 Desire to differentiate food offerings, 6.11 Desire to minimise costs, 6.12 Standardisation of food offerings, 6.13 Market price of food offerings, 6.17 Societal pressure to consume, 7.1 Force of dietary habits, 7.4 Food exposure, 7.5 Food abundance, 7.6 De-skilling, 7.7 Convenience of food offerings, 7.8 Food variety, 7.9 Alcohol consumption, 7.11 Energy density of food offerings, 7.12 Fibre content of food and drink, 7.13 Portion size, 7.14 Demand for convenience, 7.16 Nutritional quality of food and drink
Cameo data from Callcredit
Background Key features Geodemographic classification data
  History Cameo is a suite of products which have been developed by a commercial organisation. A geodemographic classification was first developed from the 1991 Census (originally ‘Neighbours and Prospects’). The suite has been developed to include a range of classifications (e.g., Cameo Income, Green and Ethical). International classifications have been produced in a number of countries
  Purpose Cameo has been developed as a commercial product for targeted marketing and credit scoring. Government and public service organisations are also regular users of this and similar competing technologies
Elements Content Data are synthesised from a variety of sources, including census data, shareholder registers, house prices, expenditure surveys and corporate data. The product suite covers (many) major domains ranging from holiday preferences and shopping habits to leisure activities and technology awareness. Indicators with direct relevance to obesity include health club membership, participation in active sports and physical exercise, attitudes to health (‘slimmers’, ‘health conscious’), and propensity to visit pubs and restaurants
  Ownership Cameo data are the property of Callcredit, a commercial organisation based in Leeds, UK with a US parent
  Aggregation Profiles are commonly available for Lower Super Output Areas (LSOAs) as well as higher geographies such as postal sectors, local authorities and regions. Data may be provided for individual postcodes or even household profiles subject to confidentiality, anonymization and relevant ethical and legal considerations
  Sharing CDRC has a licence to access core products from the Cameo suite. Applications for use from individual researchers and groups are subject to a Research Approvals Process (data.cdrc.ac.uk). Specific data fields are potentially available subject to the presentation of an appropriate ‘business case’
  Temporality Some Cameo profiles are anchored in 2011 Census data but are continually updated using longitudinal data about customers, shareholders, voters and so on. Most datasets are updated annually
Exemplars Indicative use cases Cameo has been used in characterisation of obesity for neighbourhoods in the UK, US and Australia [31]. Neighbourhood classification has been used as a device for health care resource allocation for many years [32, 33], and in a variety of other applications
  Foresight nodes 1.1 Education, 1.2 Acculturation, 1.3 Media availability, 1.4 Availability of passive entertainment options, 1.8 Media consumption, 1.11 Exposure to food advertising, 1.12 TV watching, 6.1 Purchasing power, 6.10 Female employment, 6.15 Level of employment, 6.16 Pressure for growth and profitability, 6.17 Societal pressure to consume
YouGov
Background Key features Self-reported data from opinion polls
  History YouGov provides self-reported data from opinion polls which are collected four times each year from a large panel of 250,000 adults. The questions in the survey are a combination of fixed topics and commissioned content. The themes are extremely wide ranging. A complete catalogue of available data resources may be obtained on request from the data owner
  Purpose Data were originally collected as a basis for political polls (under the organisation’s original name of Gallup). Commercial and social questions have been developed more recently
Elements Content Data span many thematic areas including consumers, digital, politics, public services, brand profiles, financial services and sports
  Ownership Data are generated and maintained by YouGov on a commercial basis
  Aggregation Data are available as cross-classified individual responses which are coded down to a geography of 400+ local authority areas. Demographics are coded by broad categories e.g., gender, age (five groups), social class (six groups)
  Sharing The CDRC has a licence for data in three key areas of mobility, retail and sustainability. The variables relevant to health include product consumption (e.g., meat, vegetables, alcohol, carbonated drinks, confectionery and snacks); eating habits (self-classified) and concerns about food (e.g., salt, sugar, fats, gluten). Commissioned tables can potentially be generated at a modest but commercial rate
  Temporality Data are updated quarterly.
Exemplars Indicative use cases YouGov data have been regularly used [34]. Current work is considering the relationship between supermarket accessibility and electronic delivery of groceries, in which individual level choices are a useful feature.
  Foresight nodes 1.1 Education, 1.5 Sociocultural valuation of food, 2.9 Demand for indulgence/compensation, 3.7 Level of transport activity, 4.6 Reliance on labour saving devices and services, 6.2 Pressure to improve access to food offerings, 6.3 Pressure to cater for acquired tastes, 6.4 Demand for health, 7.1 Force of dietary habits, 7.3 Tendency to graze, 7.4 Food exposure, 7.5 Food abundance, 7.6 De-skilling, 7.7 Convenience of food offerings, 7.8 Food variety, 7.9 Alcohol consumption, 7.11 Energy density of food offerings, 7.12 Fibre content of food and drink, 7.13 Portion size, 7.14 Demand for convenience, 7.16 Nutritional quality of food and drink
Acxiom
Background Key features Self-reported data from voluntary consumption surveys
  History Acxiom is a very large poll that collects in the order of one million returns every year. The data are primarily sourced from product guarantees and media (e.g., newspaper) inserts
  Purpose Data are from market research and widely used in marketing, advertising and also within local government
Elements Content Data include basic demographics (age, gender, household composition) as well as income and expenditure attributes. Relevant to obesity, they include consumption profiles and lifestyle attitudes, including sports and leisure pursuits. The content of irregular commissioned tables ranges from interest in holidays in Yorkshire to purchase of pet foods
  Ownership Acxiom is a private company which is now part of the VNU multi-media transnational corporation. The majority of the data owned by Acxiom are only accessible through commercial licence
  Aggregation Data are at individual level, coded to unit postcodes and classified by demographics and other self-reported categories for activity, behaviour and consumption variables
  Sharing Income and household composition profiles for unit postcodes (1.2 million streets) are licensed for the use of CDRC and its partners. Data relate to calendar year 2014
  Temporality Data have been collected since at least 2005, with many variables captured on a recurrent basis. Composition of the sample varies from year to year according to responsiveness of consumers and their exposure to the questionnaires
Exemplars Indicative use cases Exploration of the Acxiom data in the context of household migration has been undertaken by Thomas (2014) [35]. Use of the data in the context of retail consumption in times of austerity and the “credit crunch” has been considered by Thompson (2013) [36] and Clarke (2015) [37]. These academic studies have explored and reweighted for skews and variable quality of the individual returns
  Foresight nodes 1.1 Education, 1.3 Media availability, 1.4 Availability of passive entertainment options, 1.8 Media consumption, 1.11 Exposure to food advertising, 1.12 TV watching, 1.16 Smoking cessation, 2.2 Face to face social interaction, 3.1 Physical activity, 3.4 Level of recreational activity, 3.5 Level of domestic activity, 3.6 Level of occupational activity, 3.7 Level of transport activity, 4.6 Reliance on labour saving devices and services, 4.11 Dominance of motorised transport, 6.1 Purchasing power, 6.10 Female employment, 6.15 Level of employment