FoodRepo: An Open Food Repository of Barcoded Food Products

Abstract
Metabolic disorders, such as diabetes or obesity, have become a major public health concern, with increasingly large parts of the global population affected (1, 2). Nutritional epidemiologists hope to better understand the underlying causes, potential treatments and prevention strategies by analyzing population and individual patterns through studies that generally rely on surveying dietary habits. Traditional food-intake survey methods are based on questionnaires filled out by participants at a given frequency. The frequency of diet records is an important factor contributing to the accuracy of the study (3). Multiple-day diet records can provide good accuracy when not based on memory, but require strong motivation and time commitment from the participants. Approaches like multiple or single 24-h recalls, in which a specialized interviewer surveys the participants in person or on the phone, require less engagement, but pose issues with missing data as they rely on short-term memory. Finally, so-called Food Frequency Questionnaires, where participants are asked to indicate the frequency of intake of certain foods over long periods of time (typically 1 year), demand minimal commitment from participants, therefore allowing for large cohort studies on long-term dietary habits. However, the likelihood of missing or incorrect data increases as they depend on participants' long-term memory. Overall, self-reported dietary data present biases which limit their applications, especially when they rely heavily on participants' memory (4). Such limitations, which should be properly addressed in further epidemiological studies, may be overcome with more advanced recording methodologies such as dietary biomarkers and digital technologies (5). Recent technological advances, and in particular the emergence and almost complete market penetration of smartphones, have offered interesting surveying alternatives.
In particular, mobile phones have been successfully deployed in several food-related studies (6), for example using food photography (7–12). Other research has also explored the possibility of recording dietary habits by asking participants to scan the barcodes of their consumed food (13, 14). Although further investigations are required to assess self-reporting biases, these advances in nutritional research have triggered the release of mobile apps oriented mainly toward diabetes and weight-loss self-management (15–19), showing the willingness and interest of users to monitor their food intake if it provides potential health benefits. The further expansion of self-monitoring for research and medical purposes relies on comprehensive and continuously updated food databases. A few databases of barcoded products already exist, for example Open Food Facts (20) or the USDA Food Composition Databases (21). While each has its strengths, not all of them are openly accessible, and they often have limited product coverage or are not regularly updated. For Switzerland, we did not find any database whose product coverage was sufficiently high, whose data was completely open, and which was easily accessible through an Application Programming Interface (API). The last point was particularly important to us, as APIs are necessary for third parties to dynamically use the data in their products and services. Our approach was therefore to build an openly accessible database of barcoded food products with sufficiently high coverage, accessible through a stable API. Rather than focusing on a wide geographic range, we focused on a small country (Switzerland) in order to obtain the necessary coverage. A further benefit of focusing on the Swiss market is the need to support multiple languages from the beginning, which makes the system readily expandable to other countries, as we are now planning to do.
Here, we present this system, which we call FoodRepo (https://www.foodrepo.org), an openly accessible database of barcoded food products, and we describe the data-acquisition framework, its quality control and maintenance. The word repository is meant in the sense of a data repository, where the community can deposit an increasing number of datapoints on food products. The growing community around FoodRepo and the validation of new products make our database robust, scalable and self-sustainable in the long run. Currently, the FoodRepo database mostly holds products sold in Switzerland, from the main grocery stores in the country. Its international expansion is under development. Any item in the database is accessible through the FoodRepo website (for an example of products contained in the FoodRepo database, please see Figure 1A) or via our API, described in section Usage Notes. The CC-BY-4 license under which our database is released allows its use by different types of users, from academic researchers to commercial partners. For instance, a Swiss consumers association is using FoodRepo data in their NutriScan mobile app (22) to make the food package information more accessible, and to provide their users with an overall nutritional score.
Figure 1. (A) Screenshot from the webpage of a product on the FoodRepo website. (B) Schematic representation of the pipeline behind our API. When a user or an application (left column) sends a call to the API, the request is handled by the server that hosts the API (middle column). This server then sends a query to the server that hosts the FoodRepo database (right column), where the query is handled by the Elastic Search engine. The data is returned to the API server, which performs final formatting before giving it back to the user or the application. (C) Distribution of API response times, color-coded according to different sections of the back-end pipeline, as shown in (B).
In green (main plot and inset) the response-times of the Elastic Search...
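The client side of the request flow in Figure 1B can be illustrated with a minimal Python sketch. It is not the documented interface: the endpoint path (`/api/v3/products`), the `barcodes` query parameter, the token-style `Authorization` header and the helper names are all assumptions made for illustration, and the authoritative description is the one given in section Usage Notes. The barcode and API key in the usage note are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL and endpoint path are illustrative, not confirmed here.
API_BASE = "https://www.foodrepo.org/api/v3"


def build_product_request(barcode, api_key):
    """Build (but do not send) a GET request for one barcoded product.

    The `barcodes` parameter and the token header format are assumed.
    """
    query = urllib.parse.urlencode({"barcodes": barcode})
    url = f"{API_BASE}/products?{query}"
    headers = {
        "Authorization": f'Token token="{api_key}"',  # assumed auth scheme
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_products(barcode, api_key, timeout=10.0):
    """Send the request to the API server and decode the JSON response.

    Per Figure 1B, the API server forwards the query to the database
    server and formats the result before returning it to the caller.
    """
    request = build_product_request(barcode, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_products("7610000000000", "MY_API_KEY")` would then return the decoded JSON payload. Keeping request construction separate from the network call makes the sketch easy to inspect without contacting the server.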
Funding Information
  • Stiftelsen Kristian Gerhard Jebsen