In a paper published in the journal Scientific Reports, researchers explored the growing realm of agricultural automation and intelligence driven by technological progress. They emphasized the pivotal role of precision classification models in optimizing farming practices by accurately identifying, classifying, and processing agricultural products, ultimately improving production efficiency and economic value.
Addressing the recognition biases of the standard mobile network version 2 (MobileNetV2) model, the researchers proposed an enhanced version featuring a novel Res-Inception module, inspired by GoogLeNet's Inception module, and an efficient multi-scale attention (EMA) module with cross-spatial learning. Their experiments on the Fruit-360 dataset demonstrated a substantial increase in classification accuracy and promising advancements in agricultural product classification tasks.
Related Work
Past work in agricultural product classification has increasingly relied on advancements in computer image processing and deep learning, aiming to automate sorting processes and enhance accuracy. Various algorithms, such as lightweight and efficient deep network (LedNet), self-organizing map (SOM) network-based methods, and fine-tuned convolutional neural network (CNN) models, have been proposed, showing significant progress in fruit recognition tasks.
In particular, the Inception architecture has been instrumental in feature extraction, improving classification accuracy. Recent attention mechanisms, like the EMA module, have further boosted performance. However, most studies focus on specific fruits rather than comprehensive classification across diverse categories.
Methodology Overview
The methodology first partitioned the dataset into training, validation, and test sets for model development and evaluation. The researchers then trained the Improved-MobileNetV2 model using the training and validation sets, producing a well-trained model for subsequent evaluation on the dedicated test set.
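The paper's exact split procedure is not detailed in this summary, but such a partition is straightforward to set up in TensorFlow/Keras. The sketch below is illustrative only: the directory names, image size, batch size, and validation fraction are assumptions rather than values reported by the authors.

```python
import tensorflow as tf

IMG_SIZE = (100, 100)   # assumed input resolution (Fruit-360 images are 100x100 pixels)
BATCH = 32              # assumed batch size

# Carve a validation split out of the training images; keep the dedicated
# test directory separate and unshuffled so predictions align with labels.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "fruits-360/Training", validation_split=0.2, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "fruits-360/Training", validation_split=0.2, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "fruits-360/Test", image_size=IMG_SIZE, batch_size=BATCH, shuffle=False)
```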
The researchers devised the Improved-MobileNetV2 model to address limitations in accurately recognizing agricultural product subcategories, especially as category diversity increases. They enhanced the backbone architecture to strengthen its feature extraction capabilities for better subcategory recognition. Key enhancements included integrating the residual-inception (res-inception) module and inserting an EMA module between consecutive blocks, improvements intended to boost recognition accuracy, particularly in complex scenarios.
Inspired by the Inception architecture, the res-inception module introduced larger convolutional kernels into the depthwise separable convolution blocks to enhance feature extraction while keeping the convolutional depth under control, extracting richer features without compromising model performance. Meanwhile, the EMA module was embedded in the backbone network to strengthen feature extraction across spatial dimensions. By partitioning channels into sub-groups and employing parallel processing paths, EMA captured pixel-level pairwise relationships and global context, improving recognition accuracy.
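The summary does not give the exact layer configuration of either module, so the following Keras sketch is only a rough illustration: a res-inception-style block that runs depthwise convolutions with progressively larger kernels in parallel, Inception-style, and wraps them in a residual shortcut, plus a heavily simplified grouped attention gate standing in for EMA. Kernel sizes, branch widths, the number of channel sub-groups, and the feature-map size in the demo are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def res_inception_block(x, filters):
    """Illustrative res-inception block: parallel depthwise branches with
    increasingly large kernels, concatenated, projected back to `filters`
    channels, and combined with a residual shortcut."""
    branches = []
    for k in (3, 5, 7):                                   # assumed kernel sizes
        b = layers.DepthwiseConv2D(k, padding="same", use_bias=False)(x)
        b = layers.BatchNormalization()(b)
        b = layers.ReLU(6.0)(b)
        branches.append(b)
    y = layers.Concatenate()(branches)
    y = layers.Conv2D(filters, 1, padding="same", use_bias=False)(y)  # pointwise projection
    y = layers.BatchNormalization()(y)
    shortcut = x if x.shape[-1] == filters else layers.Conv2D(filters, 1, use_bias=False)(x)
    return layers.Add()([shortcut, y])

def grouped_attention(x, groups=4):
    """Simplified stand-in for EMA: split channels into sub-groups, pool a
    global context per group, and reweight each group's features with it.
    The real EMA module adds parallel 1x1/3x3 paths and cross-spatial
    interactions; this only conveys the channel-grouping idea."""
    h, w, c = x.shape[1], x.shape[2], x.shape[3]
    g = layers.Reshape((h, w, groups, c // groups))(x)
    ctx = tf.reduce_mean(g, axis=[1, 2], keepdims=True)   # per-group global context
    g = g * tf.sigmoid(ctx)                               # gate each sub-group
    return layers.Reshape((h, w, c))(g)

# Demo: apply both modules to an illustrative backbone feature map.
inp = layers.Input(shape=(25, 25, 96))
feat = res_inception_block(inp, filters=96)
feat = grouped_attention(feat, groups=4)
demo = tf.keras.Model(inp, feat)
```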
Overall, the proposed methods addressed the challenge of accurately recognizing agricultural product subcategories by enhancing the model's feature extraction capabilities. Integrating the res-inception and EMA modules within the Improved-MobileNetV2 model yielded promising improvements in performance and detection accuracy, particularly for complex agricultural product recognition tasks.
Experimental Overview and Findings
The experimental section described the Fruit-360 dataset, which comprises 90,483 images across 131 fruit categories. The images underwent preprocessing, including background removal, to facilitate model training and ensure data accessibility and reproducibility. Precision, recall, accuracy, and F1 score were used to assess the model's performance, with the results summarized in a confusion matrix.
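As a reference for how such metrics can be derived, the snippet below computes accuracy, macro-averaged precision, recall, F1 score, and a confusion matrix with scikit-learn. It assumes `model` is a trained Keras classifier such as the one sketched after the next paragraph and `test_ds` is the unshuffled test set from the earlier split; the macro averaging scheme is an assumption, since the paper's exact choice is not stated here.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# Collect ground-truth labels and predicted classes over the test set.
# (Relies on test_ds being loaded with shuffle=False so the order matches.)
y_true = np.concatenate([labels.numpy() for _, labels in test_ds])
y_pred = np.argmax(model.predict(test_ds), axis=1)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1 score :", f1_score(y_true, y_pred, average="macro"))
cm = confusion_matrix(y_true, y_pred)   # rows: true classes, columns: predicted classes
```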
The experimental setup used TensorFlow 2.10 and Python 3.9 on hardware comprising an Intel Core i7-12650H processor, 48.0 GB of random access memory (RAM), and an NVIDIA GeForce RTX 4060 graphics processing unit (GPU). Model training was conducted over 100 epochs using the Adam optimizer and a cross-entropy loss function.
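A minimal compile-and-fit sketch matching that reported configuration (Adam optimizer, cross-entropy loss, 100 epochs) might look as follows. The stock Keras MobileNetV2 stands in here for the authors' improved variant, and the learning rate and classification head are assumptions; `train_ds` and `val_ds` are the splits from the earlier sketch.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 131   # number of Fruit-360 categories

# Stand-in backbone: stock MobileNetV2 trained from scratch; the paper's
# variant additionally swaps in res-inception blocks and EMA attention.
base = tf.keras.applications.MobileNetV2(
    input_shape=(100, 100, 3), include_top=False, weights=None)

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255),            # map pixel values to [0, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # assumed learning rate
    loss="sparse_categorical_crossentropy",                  # cross-entropy on integer labels
    metrics=["accuracy"],
)
history = model.fit(train_ds, validation_data=val_ds, epochs=100)
```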
The experimental design encompassed tests on the Fruit-360 dataset, ablation experiments to analyze the res-inception and EMA modules' impact, experiments on different datasets to showcase model versatility, and comparisons with state-of-the-art classification frameworks. These experiments aimed to provide comprehensive insights into the model's performance and efficacy.
Results obtained from training the Improved-MobileNetV2 model on the Fruit-360 dataset demonstrated rapid convergence and stability in the loss and accuracy metrics. Ablation experiments highlighted the significant performance improvement achieved by enabling both the res-inception and EMA modules simultaneously. Comparative analyses emphasized the superiority of the Improved-MobileNetV2 model over competing frameworks, particularly in terms of accuracy and parameter efficiency.
Further comparisons with other models, including AlexNet, the Visual Geometry Group 16-layer network (VGG16), ResNet34, and ResNet50, underscored the Improved-MobileNetV2's superior accuracy and favorable model complexity. These findings reaffirmed the efficacy of the proposed enhancements in improving agricultural product classification accuracy and efficiency, positioning the Improved-MobileNetV2 as a leading solution in the field.
Conclusion
In conclusion, the paper introduced an enhanced MobileNetV2 model for agricultural product recognition, addressing the challenges posed by diverse subcategories. By integrating the res-inception and EMA modules, the model achieved remarkable accuracy improvements, and its applicability to agricultural classification tasks, with potential for broader use, was evident. Future work could explore image augmentation techniques to optimize parameter efficiency without compromising accuracy, further enhancing the model's efficiency for varied agricultural classification tasks.