Epidemiology and Population Health

Evidence from big data in obesity research: international case studies



Obesity is thought to be the product of over 100 different factors, interacting as a complex system over multiple levels. Understanding the drivers of obesity requires considerable data, which are challenging, costly and time-consuming to collect through traditional means. Use of ‘big data’ presents a potential solution to this challenge. A Delphi consensus defines big data as always digital, with a large sample size and a large volume, variety or velocity of variables that require additional computing power (Vogel et al. Int J Obes. 2019). The requirement for ‘additional computing power’ introduces the concept of big data analytics. The aim of this paper is to showcase international research case studies presented during a seminar series held by the Economic and Social Research Council (ESRC) Strategic Network for Obesity in the UK. These case studies are intended to provide an in-depth view of how big data can be used in obesity research, and of the specific benefits, limitations and challenges encountered.

Methods and results

Three case studies are presented. The first investigated the influence of the built environment on physical activity. It used spatial data on green spaces and exercise facilities alongside individual-level data on physical activity and swipe card entry to leisure centres, collected as part of a local authority exercise class initiative. The second used a variety of linked electronic health datasets to investigate associations between obesity surgery and the risk of developing cancer. The third used data on tax parcel values alongside data from the Seattle Obesity Study to investigate sociodemographic determinants of obesity in Seattle.

Conclusions


The case studies demonstrated how big data could be used to augment traditional data to capture a broader range of variables in the obesity system. They also showed that big data can present improvements over traditional data in relation to size, coverage, temporality, and objectivity of measures. However, the case studies also encountered challenges and limitations, particularly in relation to hidden or unforeseen biases and a lack of contextual information. Overall, despite these challenges, big data presents a relatively untapped resource that shows promise in helping to understand the drivers of obesity.



References

1. Davison KK, Birch LL. Childhood overweight: a contextual model and recommendations for future research. Obes Rev. 2001;2:159–71.
2. Egger G, Swinburn B. An “ecological” approach to the obesity pandemic. BMJ. 1997;315:477–80.
3. Harrison K, Bost KK, McBride BA, Donovan SM, Grigsby-Toussaint DS, Kim J, et al. Toward a developmental conceptualization of contributors to overweight and obesity in childhood: the six-Cs model. Child Dev Perspect. 2011;5:50–8.
4. Butland B, Jebb S, Kopelman P, McPherson K, Thomas S, Mardell J, et al. Foresight. Tackling obesities: future choices—project report. Government Office for Science; 2007.
5. Rutter HR, Bes-Rastrollo M, de Henauw S, Lahti-Koski M, Lehtinen-Jacks S, Mullerova D, et al. Balancing upstream and downstream measures to tackle the obesity epidemic: a position statement from the European association for the study of obesity. Obes Facts. 2017;10:61–3.
6. Mittelstadt BD, Floridi L. The ethics of big data: current and foreseeable issues in biomedical contexts. Sci Eng Ethics. 2016;22:303–41.
7. Kaisler S, Armour F, Espinosa JA, Money W. Big data: issues and challenges moving forward. In: Proceedings of the 46th Hawaii International Conference on System Sciences. Association for Computing Machinery Digital Library; 2013. p. 995–1004.
8. Herland M, Khoshgoftaar TM, Wald R. A review of data mining using big data in health informatics. J Big Data. 2014;1. https://doi.org/10.1186/2196-1115-1-2.
9. Vogel C, Zwolinsky S, Griffiths C, Hobbs M, Henderson E, Wilkins E. A Delphi study to build consensus on the definition and use of big data in obesity research. Int J Obes. 2019. https://doi.org/10.1038/s41366-018-0313-9.
10. Morris M, Birkin M. The ESRC strategic network for obesity: tackling obesity with big data. Int J Obes. 2018;42:1948–50.
11. Timmins K, Green M, Radley D, Morris M, Pearce J. How has big data contributed to obesity research? A review of the literature. Int J Obes. 2018;42:1951–62.
12. Monsivais P, Francis O, Lovelace R, Chang M, Strachan E, Burgoine T. Data visualisation to support obesity policy: case studies of data tools for planning and transport policy in the UK. Int J Obes. 2018;42:1977–86.
13. Morris M, Wilkins E, Timmins K, Bryant M, Birkin M, Griffiths C. Can big data solve a big problem? Reporting the obesity data landscape in line with the Foresight obesity system map. Int J Obes. 2018;42:1963–76.
14. Vayena E, Salathé M, Madoff LC, Brownstein JS. Ethical challenges of big data in public health. PLOS Comput Biol. 2015;11:e1003904.
15. Silver LD, Ng SW, Ryan-Ibarra S, Taillie LS, Induni M, Miles DR, et al. Changes in prices, sales, consumer spending, and beverage consumption one year after a tax on sugar-sweetened beverages in Berkeley, California, US: a before-and-after study. PLoS Med. 2017;14:e1002283.
16. Gore RJ, Diallo S, Padilla J. You are what you tweet: connecting the geographic variation in America’s obesity rate to Twitter content. PLoS ONE. 2015;10:e0133505.
17. Nguyen QC, Li D, Meng H-W, Kath S, Nsoesie E, Li F, et al. Building a national neighborhood dataset from geotagged Twitter data for indicators of happiness, diet, and physical activity. JMIR Public Health Surveill. 2016;2:e158.
18. Hirsch JA, James P, Robinson JR, Eastman KM, Conley KD, Evenson KR, et al. Using MapMyFitness to place physical activity into neighborhood context. Front Public Health. 2014;2:1–9.
19. Althoff T, Hicks JL, King AC, Delp SL, Leskovec J. Large-scale physical activity data reveal worldwide activity inequality. Nature. 2017;547:336–9.
20. Kerr NL. HARKing: hypothesizing after the results are known. Pers Soc Psychol Rev. 1998;2:196–217.
21. Lee IM, Shiroma EJ, Lobelo F, Puska P, Blair SN, Katzmarzyk PT, et al. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet. 2012;380:219–29.
22. Bennett JE, Li G, Foreman K, Best N, Kontis V, Pearson C, et al. The future of life expectancy and life expectancy inequalities in England and Wales: Bayesian spatiotemporal forecasting. Lancet. 2015;386:163–70.
23. World Health Organisation. Report of the Commission on ending childhood obesity. Geneva, Switzerland: World Health Organisation; 2016.
24. Centers for Disease Control and Prevention. Recommended community strategies and measurements to prevent obesity in the United States. Atlanta, GA, U.S.: Centers for Disease Control and Prevention; 2009.
25. Local Government Association. Building the foundations: tackling obesity through planning and development. London, UK: Local Government Association; 2016.
26. Burgoine T, Alvanides S, Lake AA. Creating ‘obesogenic realities’; Do our methodological choices make a difference when measuring the food environment? Int J Health Geogr. 2013;12. https://doi.org/10.1186/1476-072X-12-33.
27. Wilkins E, Morris M, Radley D, Griffiths C. Methods of measuring associations between the Retail Food Environment and weight status: Importance of classifications and metrics. SSM Popul Health. 2019. https://doi.org/10.1016/j.ssmph.2019.100404.
28. Bardou M, Barkun AN, Martel M. Obesity and colorectal cancer. Gut. 2013;62:933–47.
29. Siegel R, Desantis C, Jemal A. Colorectal cancer statistics, 2014. CA Cancer J Clin. 2014;64:104–17.
30. Derogar M, Hull MA, Kant P, Östlund M, Lu Y, Lagergren J. Increased risk of colorectal cancer after obesity surgery. Ann Surg. 2013;258:983–8.
31. Kant P, Hull MA. Excess body weight and obesity—the link with gastrointestinal and hepatobiliary cancer. Nat Rev Gastroenterol Hepatol. 2011;8:224–38.
32. Östlund MP, Lu Y, Lagergren J. Risk of obesity-related cancer after obesity surgery in a population-based cohort study. Ann Surg. 2010;252:972–6.
33. Sainsbury A, Goodlad RA, Perry SL, Pollard SG, Robins GG, Hull MA. Increased colorectal epithelial cell proliferation and crypt fission associated with obesity and roux-en-Y gastric bypass. Cancer Epidemiol Biomark Prev. 2008;17:1401–10.
34. Aravani A, Downing A, Thomas JD, Lagergren J, Morris EJA, Hull MA. Obesity surgery and risk of colorectal and other obesity-related cancers: an English population-based cohort study. Cancer Epidemiol. 2018;53:99–104.
35. Openshaw S. The modifiable areal unit problem. In: Concepts and techniques in modern geography. Norwich: Geo Books; 1984. p. 1–41.
36. Kwan M-P. The uncertain geographic context problem. Ann Assoc Am Geogr. 2012;102:958–68.
37. Di Zhu X, Yang Y, Liu X. The importance of housing to the accumulation of household net wealth. Harvard, USA: Joint Center for Housing Studies, Harvard University; 2003.
38. Rehm CD, Moudon AV, Hurvitz PM, Drewnowski A. Residential property values are associated with obesity among women in King County, WA, USA. Soc Sci Med. 2012;75:491–5.
39. Drewnowski A, Buszkiewicz J, Aggarwal A. Soda, salad, and socioeconomic status: findings from the Seattle Obesity Study (SOS). SSM Popul Health. 2019;7:e100339.
40. Birkin M, Morris MA, Birkin TM, Lovelace R. Using census data in microsimulation modelling. In: Stillwell J, Duke-Williams O, editors. The Routledge handbook of census resources, methods and applications. 1st ed. Routledge: IJO publication; 2018.
41. Jiao J, Drewnowski A, Moudon AV, Aggarwal A, Oppert J-M, Charreire H, et al. The impact of area residential property values on self-rated health: a cross-sectional comparative study of Seattle and Paris. Prev Med Rep. 2016;4:68–74.
42. Nguyen DM, El-Serag HB. The epidemiology of obesity. Gastroenterol Clinics. 2010;39:1–7.
43. Pickett KE, Pearl M. Multilevel analyses of neighbourhood socioeconomic context and health outcomes: a critical review. J Epidemiol Commun Health. 2001;55:111–22.
44. Timperio A, Salmon J, Telford A, Crawford D. Perceptions of local neighbourhood environments and their relationship to childhood overweight and obesity. Int J Obes. 2005;29:170–5.
45. Roda C, Charreire H, Feuillet T, Mackenbach JD, Compernolle S, Glonti K, et al. Mismatch between perceived and objectively measured environmental obesogenic features in European neighbourhoods. Obes Rev. 2016;17 S1:31–41.
46. Drewnowski A, Arterburn D, Zane J, Aggarwal A, Gupta S, Hurvitz PM, et al. The Moving to Health (M2H) approach to natural experiment research: a paradigm shift for studies on built environment and health. SSM Popul Health. 2019;7:100345.
47. Bourassa SC, Cantoni E, Hoesli M. Predicting house prices with spatial dependence: a comparison of alternative methods. J Real Estate Res. 2010;32:139–60.
48. Wilkins EL, Radley D, Morris MA, Griffiths C. Examining the validity and utility of two secondary sources of food environment data against street audits in England. Nutr J. 2017;16:1–13.
49. Nevalainen J, Erkkola M, Saarijarvi H, Nappila T, Fogelholm M. Large-scale loyalty card data in health research. Digit Health. 2018;4:2055207618816898.
50. Aiello L, Schifanello R, Quercia D, Del Prete L. Large-scale and high-resolution analysis of food purchases and health outcomes. EPJ Data Sci. 2019;8:14.
51. Craig CL, Marshall AL, Sjostrom M, Bauman AE, Booth ML, Ainsworth BE, et al. International physical activity questionnaire: 12-country reliability and validity. Med Sci Sports Exerc. 2003;35:1381–95.
52. Zwolinsky S, McKenna J, Pringle A, Widdop P, Griffiths C, Mellis M, et al. Physical activity and sedentary behavior clustering: segmentation to optimize active lifestyles. J Phys Act Health. 2016;13:921–8.
53. Bauman A, Ainsworth BE, Sallis JF, Hagströmer M, Craig CL, Bull FC, et al. The descriptive epidemiology of sitting: a 20-country comparison using the International Physical Activity Questionnaire (IPAQ). Am J Prev Med. 2011;41:228–35.
54. Guerin PB, Diiriye RO, Corrigan C, Guerin B. Physical activity programs for refugee Somali women: working out in a new country. Women & Health. 2003;38:83–99.
55. Pope L, Harvey J. The efficacy of incentives to motivate continued fitness-center attendance in college first-year students: a randomized controlled trial. J Am Coll Health. 2014;62:81–90.
56. Cetateanu A, Jones A. Understanding the relationship between food environments, deprivation and childhood overweight and obesity: evidence from a cross sectional England-wide study. Health Place. 2014;27:68–76.
57. Harrison F, Burgoine T, Corder K, van Sluijs EM, Jones A. How well do modelled routes to school record the environments children are exposed to? A cross-sectional comparison of GIS-modelled and GPS-measured routes to school. Int J Health Geogr. 2014;13:5.
58. Ells LJ, Macknight N, Wilkinson JR. Obesity surgery in England: an examination of the health episode statistics 1996–2005. Obes Surg. 2007;17:400–5.
59. Nielsen JDJ, Laverty AA, Millett C, Mainous AG, Majeed A, Saxena S. Rising obesity-related hospital admissions among children and young people in England: National time trends study. PLoS ONE. 2013;8:e65764.
60. Smittenaar C, Petersen K, Stewart K, Moitt N. Cancer incidence and mortality projections in the UK until 2035. Br J Cancer. 2016;115:1147–55.
61. Wallington M, Saxon EB, Bomb M, Smittenaar R, Wickenden M, McPhail S, et al. 30-day mortality after systemic anticancer treatment for breast and lung cancer in England: a population-based, observational study. Lancet Oncol. 2016;17:1203–16.
62. Smolina K, Wright FL, Rayner M, Goldacre MJ. Determinants of the decline in mortality from acute myocardial infarction in England between 2002 and 2010: Linked national database study. BMJ. 2012;344:d8059.
63. Hanratty B, Lowson E, Grande G, Payne S, Addington-Hall J, Valtorta N, et al. Transitions at the end of life for older adults–patient, carer and professional perspectives: A mixed-methods study. Health Serv Deliv Res. 2014. https://doi.org/10.3310/hsdr02170.
64. Aggarwal A, Monsivais P, Cook AJ, Drewnowski A. Does diet cost mediate the relation between socioeconomic position and diet quality? Eur J Clin Nutr. 2011;65:1059–66.
65. Drewnowski A, Aggarwal A, Tang W, Moudon AV. Residential property values predict prevalent obesity but do not predict 1-year weight change. Obesity. 2015;23:671–6.


The ESRC Strategic Network for Obesity was funded via ESRC grant number ES/N00941X/1. The authors would like to thank all of the network investigators (https://www.cdrc.ac.uk/research/obesity/investigators/) and members (https://www.cdrc.ac.uk/research/obesity/network-members/) for their participation in network meetings and discussion which contributed to the development of this paper.

Author information



Corresponding author

Correspondence to Michelle A. Morris.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wilkins, E., Aravani, A., Downing, A. et al. Evidence from big data in obesity research: international case studies. Int J Obes 44, 1028–1040 (2020). https://doi.org/10.1038/s41366-020-0532-8
