In a paper published in the journal Scientific Reports, researchers detailed a method that uses image-based systems to estimate food weight accurately. Their approach involved training boosting regression algorithms on a dataset of 23,052 annotated images of Mediterranean cuisine, covering 226 dishes photographed alongside reference objects.
Features extracted from the images were used to build a structured dataset, and the model achieved high accuracy after training and validation. With a mean absolute error (MAE) of 3.93 g, a mean absolute percentage error (MAPE) of 3.73%, and a root mean square error (RMSE) of 6.05 g for 226 food items in the MedGRFood database, the methodology demonstrates substantial progress in estimating food weight and nutritional information from images.
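The three error metrics reported here (mean absolute error, mean absolute percentage error, and root mean square error) have standard definitions. As a minimal sketch, they can be computed as shown below; the function names and the sample weights are illustrative and are not taken from the paper.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same unit as the inputs (grams)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average absolute difference relative to the true value, in percent."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; large misses weigh more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Made-up weights in grams, for illustration only.
actual = [100.0, 200.0, 50.0]
predicted = [110.0, 190.0, 55.0]

mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
```

MAE and RMSE are expressed in grams, while MAPE is relative to the true weight, so the same absolute miss counts for more on a light item than on a heavy one.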
Background
The contemporary challenge of managing daily dietary intake impacts global health across various income levels. Both inadequate and excessive nutritional intake contribute to malnutrition, a critical factor in chronic diseases and childhood mortality in low to middle-income countries.
Food applications that integrate Artificial Intelligence (AI), the Internet of Things (IoT), and computer vision have gained traction for dietary monitoring. These apps rely on image capture, which enables real-time health data tracking and makes them easy to use; these advantages over traditional methods have boosted their popularity. Critical to these systems are the dataset of food images and the subsystem that estimates food volume or weight. However, determining food quantity from images remains demanding, often requiring specific methodologies, controlled environments, or specialized equipment.
Optimizing Food Weight Estimation Methods
The study leveraged the MedGRFood image database, housing 51,840 images across 160 classes. A subset of 23,052 images, categorized into 226 food groups, formed the core dataset. Researchers took these images in controlled settings, ensuring each had a reference object alongside the dish. Annotations detailed food specifics such as categorization, specific names, cuisine, reference object presence, and food weight in grams. These annotations formed the foundation for a structured dataset, aiding machine learning regression models.
The dataset was then processed for usability. The area of the reference object was standardized to a common scale by using a 2-euro coin as the reference. Additionally, a new feature, the ratio of reference area to food area, was introduced; it reflects the distance from which the image was taken and the perimeters of the regions of interest.
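The article does not spell out how these features are computed, so the following is only a sketch of one plausible reading. It assumes the annotations provide pixel areas for the reference object and the food, and that both lie in roughly the same plane. The helper name `area_features` and the sample pixel counts are hypothetical; the 25.75 mm diameter is the nominal size of a 2-euro coin.

```python
import math

COIN_DIAMETER_MM = 25.75  # nominal diameter of a 2-euro coin
COIN_AREA_MM2 = math.pi * (COIN_DIAMETER_MM / 2) ** 2

def area_features(reference_area_px, food_area_px):
    """Derive scale-aware features from two annotated pixel areas.

    Hypothetical helper: assumes the coin and the food sit in the same plane,
    so one pixel covers the same physical area for both.
    """
    if reference_area_px <= 0 or food_area_px <= 0:
        raise ValueError("areas must be positive")
    ratio = reference_area_px / food_area_px
    mm2_per_px = COIN_AREA_MM2 / reference_area_px  # physical area of one pixel
    return {
        "ref_to_food_ratio": ratio,
        "food_area_mm2": food_area_px * mm2_per_px,
    }

# Made-up pixel counts, for illustration only.
features = area_features(reference_area_px=5_000, food_area_px=100_000)
```

Because the coin's physical size is fixed, its pixel area shrinks as the camera moves away, which is why a ratio of this kind carries information about image distance.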
Selecting pertinent attributes and excluding specific fields resulted in a structured data frame of 24,996 records and six columns. Three boosting regression algorithms were used to estimate food weight: Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM). Well established for regression tasks, these algorithms refine their predictions iteratively, with each round concentrating on the errors left by the previous ones.
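To illustrate the iterative refinement that boosting relies on, here is a toy gradient boosting regressor built from one-split trees (stumps) on a single feature. It is not XGBoost, CatBoost, or LightGBM, and it is not the authors' model; all names and data points are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on x that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps

def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )

# Made-up data: food area (cm^2) against weight (g), for illustration only.
areas = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
weights = [40.0, 75.0, 98.0, 135.0, 160.0, 188.0]

model = fit_boosted_stumps(areas, weights)
train_mae = sum(abs(predict(model, x) - y) for x, y in zip(areas, weights)) / len(areas)
mean_weight = sum(weights) / len(weights)
baseline_mae = sum(abs(mean_weight - y) for y in weights) / len(weights)
```

Each round fits a new stump to whatever error remains, so the ensemble improves on the constant baseline step by step. Production libraries add deeper trees, regularization, and many engineering optimizations on top of this basic loop.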
Hyperparameter tuning occurred via the Optuna framework, employing Bayesian optimization techniques. This step identified optimal parameter combinations, minimizing loss functions and enhancing model efficacy. Researchers trained and evaluated the models using a dataset subset to guarantee reliability across multiple runs.
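The tuning loop itself is simple: propose a parameter combination, measure the loss, and keep the best result. The sketch below shows that loop using plain random search from the standard library, which is a simpler stand-in and not the Bayesian optimization that Optuna provides. The `validation_loss` function is a hypothetical placeholder for training a model and returning its validation error, and the parameter ranges are assumptions.

```python
import math
import random

def validation_loss(learning_rate, max_depth):
    """Hypothetical stand-in for 'train a model, return its validation error'."""
    return (math.log10(learning_rate) + 1.0) ** 2 + 0.05 * (max_depth - 6) ** 2

def random_search(n_trials, seed=0):
    """Sample parameter combinations at random and keep the lowest loss."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
            "max_depth": rng.randint(2, 12),            # inclusive on both ends
        }
        history.append((validation_loss(**params), params))
    best_loss, best_params = min(history, key=lambda item: item[0])
    return best_loss, best_params, history

best_loss, best_params, history = random_search(n_trials=50)
```

A Bayesian optimizer differs in the proposal step: instead of sampling blindly, it uses the losses observed so far to decide which region of the search space to try next, which usually needs fewer trials.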
Implementation occurred within a high-performance computing infrastructure optimized for data-intensive tasks. The dietary assessment system was developed in Python in the Anaconda environment, utilizing appropriate libraries to implement the food weight estimation system smoothly. This systematic approach ensured efficient model functioning and system performance.
Advancing Food Weight Estimation Innovatively
The boosting regression models were validated on a held-out subset to check consistency across multiple runs. The procedure was repeated 10 times for each model, with training and validation sets selected at random each time to provide an unbiased evaluation. The XGBoost algorithm was the most successful, achieving an overall mean absolute error (MAE) of 3.93 g, a mean absolute percentage error (MAPE) of 3.73%, and a root mean square error (RMSE) of 6.05 g per food item within the MedGRFood database. The CatBoost and LightGBM models exhibited slightly higher errors, indicating comparatively lower performance.
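Repeated random splitting of this kind can be sketched as follows. The 10 runs come from the article; the 20% validation fraction, the toy density model, and the data points are assumptions made purely for illustration.

```python
import random
import statistics

def repeated_holdout(records, fit, score, n_runs=10, validation_fraction=0.2, seed=0):
    """Re-split the data at random n_runs times and collect one score per run."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        shuffled = records[:]
        rng.shuffle(shuffled)
        n_validation = max(1, int(len(shuffled) * validation_fraction))
        validation, training = shuffled[:n_validation], shuffled[n_validation:]
        model = fit(training)
        scores.append(score(model, validation))
    return statistics.mean(scores), statistics.stdev(scores), scores

def fit_density(training):
    """Toy model: a single grams-per-cm^2 factor estimated from the training set."""
    return sum(w for _, w in training) / sum(a for a, _ in training)

def mae(density, validation):
    """Mean absolute error in grams of the toy model on the validation set."""
    return sum(abs(density * a - w) for a, w in validation) / len(validation)

# Made-up (area in cm^2, weight in g) pairs, for illustration only.
records = [
    (20.0, 40.0), (35.0, 75.0), (50.0, 98.0), (65.0, 135.0), (80.0, 160.0),
    (95.0, 188.0), (30.0, 62.0), (45.0, 88.0), (70.0, 141.0), (110.0, 215.0),
]

mean_score, score_spread, scores = repeated_holdout(records, fit_density, mae)
```

Reporting the spread alongside the mean shows how much the result depends on which records happened to land in the validation set.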
Across all metrics, the XGBoost algorithm consistently outperformed the other algorithms in predicting food weights. Nevertheless, the plots revealed differences in the model's performance across weight ranges and food categories. Foods with distinct shapes were predicted more accurately, while categories such as vegetables showed higher deviations, because very light items can overlap one another when photographed.
This study's contribution lies in estimating food weight from annotated images. The proposed dataset, built by identifying critical features in the annotated images, addresses a limitation of existing food image databases by including weight information for numerous food items. It covers both solid and liquid foods and places no constraints on plate shape or reference object type.
The methodology presented in this study requires only a single image without additional devices or specific acquisition methods, offering broader applicability to various food types and shapes. It stands out for its ability to calculate weight across diverse foods and provides a promising solution to the challenge of accurately estimating food weights through images. Boosting regression algorithms were chosen for their efficiency in regression problems, exhibiting robustness in handling categorical features and outliers while minimizing bias and overfitting risks.
However, the study acknowledges certain limitations, notably the need to evaluate the proposed system on an external food dataset, given the tendency of boosting algorithms to underperform on unfamiliar data. Next steps include constructing more complex models for food weight estimation and exploring Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models to improve accuracy and applicability across datasets.
Conclusion
To sum up, this study introduces a model architecture focused on estimating food weight through feature extraction from annotated images. The developed framework incorporates a boosting regression algorithm, creating a novel solution for calculating food weight from images.
Integration with nutrient databases and dietary assessment systems aims to aid health professionals in identifying dietary risks and assisting consumers in maintaining a healthy, balanced diet. This approach holds promise in preventing malnutrition and various nutrition-related diseases and conditions, benefiting health practitioners and individuals concerned about their dietary habits.
Journal reference:
Konstantakopoulos, F. S., et al. (2023). A novel approach to estimate the weight of food items based on features extracted from an image using boosting algorithms. Scientific Reports, 13:1, 21040. https://doi.org/10.1038/s41598-023-47885-0, https://www.nature.com/articles/s41598-023-47885-0