In a paper published in the Journal of Food Engineering, researchers proposed a framework for automated inspection systems in the food industry, focusing on the geometric properties of agricultural products using computer vision.
Background
Carrots are popular root vegetables known for their numerous health benefits. Their global production has been steadily increasing over the years. However, traditional methods of grading and sorting carrots based on size can be imprecise and challenging, especially when dealing with different carrot varieties and shapes.
While the manufacturing industry has successfully implemented automated optical inspection (AOI) using artificial intelligence (AI) for various products, optical inspection in the agricultural industry still requires improvement due to the variations in raw products and difficulties in accurately quantifying each item.
The current work aims to address these challenges by introducing image-processing techniques for grading and sorting carrots. By incorporating both RGB and depth information from a depth sensor, the proposed framework enables precise identification of the geometric properties of carrots. Essential techniques such as object detection, semantic segmentation, and object tracking are employed to overcome the obstacles in real time.
This research’s contributions include creating a new dataset, proposing an artificial intelligence-automated optical inspection (AI-AOI) system to approximate the geometric properties of a carrot, incorporating a depth camera to obtain depth information on the target object, and conducting comprehensive experimentation to evaluate the feasibility and practicality of the framework.
Related work
Several recent studies have explored similar topics. One study focused on volume estimation of carrots using artificial neural networks (ANNs) and manual measurements, while another aimed to classify different types of carrots using convolutional neural networks (CNNs). Another work explored tasks such as classification, grading, and shape approximation of carrots using the ShuffleNet architecture.
Other studies discussed the volume estimation of cherry tomatoes using machine learning and the volume estimation of mango fruits using 3D shape analysis. Estimation of sweet onion volume using an RGB-D sensor and volume estimation of chicken eggs using depth images were also explored. These studies demonstrated promising results but had limitations, such as the need for human intervention, challenges with lighting conditions, and limitations in capturing irregular shapes.
Proposed method
The current study's objective is to determine each carrot's geometric properties, including length, width, area, and volume. The proposed framework includes several components, such as hardware equipment setup, carrot image acquisition, RGB image processing, depth image processing, and training of neural networks (ANN and CNN) for estimating geometric properties.
Hardware equipment setup: This involves the use of a conveyor system, a depth sensor, and other peripherals to acquire images and data. During carrot image acquisition, both RGB and depth information were captured for each carrot on the conveyor system.
RGB image processing: Various techniques, such as semantic segmentation, object detection, and tracking, were applied during the experiment. Semantic segmentation was achieved using the DeepLabv3+ algorithm, while object detection was performed using the YOLOv4-tiny structure. Object tracking utilized the Kalman filter for precise spatial position prediction.
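To illustrate the tracking step, the following is a minimal sketch of a constant-velocity Kalman filter following a carrot centroid across frames. The state layout, noise parameters, and the sample motion are illustrative assumptions, not values from the paper.

```python
import numpy as np

class CentroidKalman:
    """Minimal constant-velocity Kalman filter for an (x, y) centroid.

    State vector: [x, y, vx, vy]. Noise covariances below are
    illustrative defaults, not the paper's settings.
    """

    def __init__(self, x0, y0, dt=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0], dtype=float)
        self.F = np.array([[1, 0, dt, 0],      # state transition
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],       # we observe x and y only
                           [0, 1, 0, 0]], dtype=float)
        self.P = np.eye(4) * 10.0              # state covariance
        self.Q = np.eye(4) * 0.01              # process noise
        self.R = np.eye(2) * 1.0               # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, zx, zy):
        z = np.array([zx, zy], dtype=float)
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Track a centroid moving 5 px/frame along the belt direction.
kf = CentroidKalman(100.0, 240.0)
for t in range(1, 6):
    kf.predict()
    kf.update(100.0 + 5 * t, 240.0)
pred = kf.predict()   # predicted centroid for the next frame
```

After a few frames the filter learns the belt velocity, so the prediction step alone can bridge frames where detection momentarily fails.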
Depth image processing involves null pixel fill-in, carrot separation, and orientation adjustment. While null pixel fill-in replaces missing depth values using neighboring pixels, carrot separation distinguishes individual carrots using semantically segmented masks and bounding boxes. Orientation adjustment aligns carrots vertically using erosion morphological operations and linear curve fitting.
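The null pixel fill-in step can be sketched as follows: missing depth values are replaced with the mean of their valid 4-neighbors, iterating so larger holes are filled from the rim inward. The exact interpolation rule of the paper is not specified in this summary, so this is an assumed simple variant.

```python
import numpy as np

def fill_null_pixels(depth, null_value=0, max_iters=10):
    """Fill null depth pixels with the mean of their valid 4-neighbors.

    Iterating lets holes wider than one pixel fill from the edge
    inward. Sketch only; the paper's exact rule may differ.
    """
    d = depth.astype(float).copy()
    d[depth == null_value] = np.nan
    for _ in range(max_iters):
        nan_mask = np.isnan(d)
        if not nan_mask.any():
            break
        padded = np.pad(d, 1, constant_values=np.nan)
        # Stack the up/down/left/right neighbors of every pixel.
        neigh = np.stack([padded[:-2, 1:-1], padded[2:, 1:-1],
                          padded[1:-1, :-2], padded[1:-1, 2:]])
        means = np.nanmean(neigh, axis=0)
        fillable = nan_mask & ~np.isnan(means)
        d[fillable] = means[fillable]
    return d

depth = np.array([[5., 5., 5.],
                  [5., 0., 5.],     # centre pixel is a null reading
                  [5., 5., 5.]])
filled = fill_null_pixels(depth)    # centre becomes the neighbor mean
```

Filling nulls before volume estimation matters because a depth sensor typically drops readings on glossy or steep carrot surfaces, and missing values would otherwise bias any integration over the depth map.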
Training neural networks: Geometric property approximation is achieved through neural network-based models. An ANN estimates the width and length, while a CNN estimates the volume. The networks are trained on datasets of samples with ground-truth measurements.
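As a toy illustration of the ANN stage, the sketch below trains a one-hidden-layer network with plain NumPy gradient descent to map image-derived features to length and width. The features, target scales, and network size are all synthetic assumptions; the paper trains its models in MATLAB on real measured carrots.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: four per-carrot features (e.g. mask area,
# bounding-box extents) mapped to length and width in mm. Real inputs
# would be measured features paired with ground-truth readings.
X = rng.uniform(0.0, 1.0, size=(200, 4))
true_W = np.array([[120.0, 30.0], [80.0, 10.0], [5.0, 2.0], [1.0, 0.5]])
Y = X @ true_W + rng.normal(0.0, 0.5, size=(200, 2))

# Standardise targets so gradient descent is well-conditioned.
Ym, Ys = Y.mean(axis=0), Y.std(axis=0)
Yn = (Y - Ym) / Ys

# One hidden tanh layer, full-batch gradient descent on the MSE loss.
H, lr = 16, 0.05
W1 = rng.normal(0, 0.5, (4, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 2)); b2 = np.zeros(2)
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - Yn
    gW2 = h.T @ err / len(X); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)        # backprop through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((pred - Yn) ** 2))      # fit in standardised units
est_mm = pred * Ys + Ym                     # back to millimetres
```

The same supervised pattern extends to the volume CNN, with depth-image patches replacing the handcrafted feature vector.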
The proposed framework combines hardware setup, image processing techniques, and neural networks to accurately estimate the geometric properties of carrots.
Experiments and results
The experiment utilized a database consisting of 3120 images from 10 videos, with a resolution of 640 × 480 pixels. The geometric properties of the carrots were measured before video acquisition. Model training was performed using MATLAB 2022a with specific settings. Pre-trained models such as ResNet-18 and CSPDarknet53 were fine-tuned, and a 10-fold cross-validation scheme was employed. Performance metrics included accuracy, Intersection over Union (IoU), Dice coefficient, precision, recall, F1-score, root-mean-square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error.
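For reference, the segmentation and regression metrics named above can be computed as follows. These are the standard definitions; the example masks and values are illustrative.

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over Union for binary masks."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return inter / union if union else 1.0

def dice(pred_mask, true_mask):
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    total = pred_mask.sum() + true_mask.sum()
    return 2 * inter / total if total else 1.0

def rmse(pred, true):
    """Root-mean-square error for predicted measurements."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def mape(pred, true):
    """Mean absolute percentage error, in percent."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.mean(np.abs((pred - true) / true)) * 100)

a = np.array([[1, 1], [0, 0]], bool)   # predicted mask
b = np.array([[1, 0], [0, 0]], bool)   # ground-truth mask
```

Note that IoU penalizes disagreement more heavily than Dice for the same masks (here 1/2 versus 2/3), which is why segmentation papers often report both.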
The average error rates in the estimation of geometric properties such as length, width, and volume were 1.85%, 2.51%, and 5.35%, respectively. The model performed well for length and width predictions, with some outliers. The volume prediction had a slightly higher error rate due to data bias. Video analysis showed variations in performance across videos. The correlation between actual and predicted values had a high R-square. The proposed framework demonstrated accurate property prediction in real-world conditions.
Conclusion
The present study introduces a novel technique for estimating the geometric characteristics of agricultural products using a depth camera and deep learning algorithms. The proposed pipeline achieved an average error rate of under 5% when tested on 20 carrots, employing image processing techniques such as DeepLabv3+ and YOLOv4-tiny.
The present study's findings showcase the potential for live surveillance of conveyor systems in the farming industry, thereby making a substantial contribution to the advancement of Industry 4.0 and the ongoing digital transformation process. Future research can explore applying the framework to other products, improving prediction results and computation time with advanced image processing algorithms, and enhancing real-time processing through IoT integration.