Predicting nationwide obesity from food sales using machine learning
Author
dc.contributor.author
Dunstan, Jocelyn
Author
dc.contributor.author
Aguirre Jerez, Marcela
Author
dc.contributor.author
Bastías García, Magdalena
Author
dc.contributor.author
Nau, Claudia
Author
dc.contributor.author
Glass, Thomas A.
Author
dc.contributor.author
Tobar, Felipe
Accession date
dc.date.accessioned
2020-06-17T22:55:47Z
Available date
dc.date.available
2020-06-17T22:55:47Z
Publication date
dc.date.issued
2020
Item citation
dc.identifier.citation
Health Informatics Journal 2020, Vol. 26(1) 652–663
Identifier
dc.identifier.other
10.1177/1460458219845959
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/175549
Abstract
dc.description.abstract
The obesity epidemic is progressing across the globe, and implementing frequent nationwide surveys to measure the percentage of the population that is obese is costly. In contrast, country-level food sales information can be accessed inexpensively through different suppliers on a regular basis. This study applies a methodology to predict obesity prevalence at the country level based on national sales of a small subset of food and beverage categories. Three machine learning algorithms for nonlinear regression were implemented using purchase and obesity prevalence data from 79 countries: support vector machines, random forests and extreme gradient boosting. The proposed method was validated in terms of both the absolute prediction error and the proportion of countries for which the obesity prevalence was predicted satisfactorily. We found that the most relevant food category for predicting obesity is baked goods and flours, followed by cheese and carbonated drinks.
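
The two validation criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 10% relative tolerance used to decide when a prediction counts as satisfactory, and the sample numbers are all assumptions made for illustration, since the abstract does not define the threshold or report per-country values.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Average absolute gap between observed and predicted prevalence,
    in the same units as the inputs (here, percentage points)."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def share_within_tolerance(observed, predicted, rel_tol=0.10):
    """Fraction of countries whose prediction lies within rel_tol of the
    observed value. rel_tol is a hypothetical threshold, not one taken
    from the paper."""
    hits = sum(abs(o - p) <= rel_tol * o for o, p in zip(observed, predicted))
    return hits / len(observed)


# Made-up illustrative prevalences (percent), not data from the study.
observed = [20.0, 30.0, 10.0, 25.0]
predicted = [21.0, 26.0, 10.5, 25.0]

mae = mean_absolute_error(observed, predicted)
share = share_within_tolerance(observed, predicted)
print(f"mean absolute error: {mae:.3f} percentage points")
print(f"share of countries within tolerance: {share:.2f}")
```

In practice the predicted values would come from held-out predictions of a fitted regressor (for example a support vector machine, random forest, or gradient boosting model), so that both metrics reflect out-of-sample performance.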