In an article recently published in the journal Scientific Reports, researchers investigated the feasibility of using pseudo-labeling to improve the predictive performance of deep neural networks (DNNs) for animal identification.
Background
Computer vision (CV) systems can be utilized for high-throughput precise phenotyping in multiple domains, such as farm management, animal breeding, and precision medicine. State-of-the-art DNN algorithms used in supervised learning tasks, such as CV tasks, typically require large amounts of annotated training data/labeled images to realize satisfactory performance.
However, most CV system-generated image data in agriculture are both resource- and labor-intensive to analyze, annotate, and organize. Moreover, manual annotation of agricultural objects, such as individual animals in a herd, is extremely challenging, error-prone, and time-consuming.
Several techniques, such as semi-supervised learning (SSL) and few-shot learning, have been proposed to enable DNNs to learn from smaller labeled datasets to decrease the data annotation, collection, and preprocessing costs while ensuring good predictive performance.
Currently, the CV methods used for animal identification, which is the initial step for individual animal phenotyping, are primarily based on fully supervised learning approaches. These need a large number of annotated images in both open-set scenarios, i.e., identifying animals in dynamic environments on large commercial farms, and closed-set scenarios, i.e., identifying animals within a specific, fixed group.
Thus, SSL can be utilized as an effective tool for animal identification by leveraging the unlabeled data obtained from the camera systems deployed at farms. The machine learning (ML) algorithm in SSL can learn the structured information from the labeled part of the dataset and utilize the patterns identified in the unlabeled data to increase its generalization power and improve predictive performance.
SSL is most suitable for scenarios where labeling all data is unfeasible or expensive, and only a part of the dataset can be annotated. Several studies have investigated the feasibility of using SSL in livestock systems. For instance, an SSL technique leveraging unlabeled data significantly improved dairy cow teat-end condition classification performance. However, SSL had not previously been applied to the identification of individual animals using CV systems.
Pseudo-labeling, a simple and versatile technique within SSL, can effectively improve trained ML model predictive performance when annotating data is expensive and a substantial amount of unlabeled data is available. Thus, researchers in this study selected the pseudo-labeling SSL technique to train DNNs for animal identification.
Pseudo-labeling for animal identification
In this paper, researchers investigated the feasibility of using the pseudo-labeling technique to improve the predictive performance of DNNs trained for Holstein cow identification, using a large unlabeled dataset and labeled training datasets of different sizes.
Four Intel RealSense D435 depth cameras were used to collect images from 59 lactating cows. Overall, 23,709 snapshots were utilized in this study, with 4695 annotated with the corresponding cow identification code and 20,194 snapshots remaining unlabeled.
The annotated snapshots were then split into training, validation, and test sets based on the capture date. The validation set was used to define the best confidence threshold for each round of the pseudo-labeling algorithm, while the test set was reserved for an independent final performance assessment.
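The per-round threshold selection could be sketched as below. The selection rule shown here (pick the lowest threshold whose retained validation predictions reach an assumed 95% precision) and the candidate grid are illustrative assumptions; the article does not state the study's exact criterion.

```python
import numpy as np

def pick_threshold(val_conf, val_correct,
                   candidates=(0.5, 0.6, 0.7, 0.8, 0.9, 0.95)):
    """Choose a confidence threshold for one pseudo-labeling round.

    val_conf    -- max softmax confidence of each validation prediction
    val_correct -- boolean array: was each validation prediction correct?
    Returns the lowest candidate threshold at which the retained
    predictions are almost always correct (>= 95% precision, an assumed
    target), falling back to the strictest candidate otherwise.
    """
    for t in candidates:
        keep = val_conf >= t
        if keep.any() and val_correct[keep].mean() >= 0.95:
            return t
    return max(candidates)
```

In use, the network's validation confidences and correctness flags would be passed in, and the returned threshold would then gate which unlabeled predictions are promoted to pseudo-labels in that round.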
The Keras library in Python with TensorFlow as the backend was used to train all neural network architectures, including Xception, NASNet Large, and MobileNetV2. The training process for all neural networks was conducted in two stages: feature extraction followed by fine-tuning.
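A minimal sketch of that two-stage setup in Keras is shown below. The backbone choice here (MobileNetV2, with `weights=None` to keep the example offline; in practice a pretrained backbone would be used), input size, optimizer, and learning rates are assumptions, since the article does not give the study's hyperparameters.

```python
import tensorflow as tf
from tensorflow import keras

def build_model(num_classes=59, input_shape=(96, 96, 3)):
    # Backbone without its classification head; pretrained weights would
    # normally be loaded here (weights="imagenet").
    base = keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=None)
    base.trainable = False  # Stage 1 (feature extraction): backbone frozen
    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs), base

model, base = build_model()
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # stage 1

# Stage 2 (fine-tuning): unfreeze the backbone and retrain with a
# much lower learning rate so the pretrained features are not destroyed.
base.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # stage 2
```

Recompiling after toggling `trainable` is required in Keras for the change to take effect; the low stage-2 learning rate is the standard guard against catastrophically overwriting the extracted features.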
The pseudo-labeling method investigated in this study trained a convolutional neural network (CNN) over several rounds. Each round consisted of three steps: training a neural network on the manually labeled initial training set, using this trained network to make predictions on the larger unlabeled dataset, and adding the confidently predicted unlabeled images to the labeled training set to train a new neural network.
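The round structure described above can be sketched in a framework-agnostic way. In this toy version, a nearest-centroid classifier stands in for the CNN, and the fixed 0.9 confidence threshold is an assumption for illustration (the study tuned a threshold per round on the validation set).

```python
import numpy as np

class CentroidClassifier:
    """Toy stand-in for the CNN: nearest centroid with softmax-like scores."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        # Turn distances to each class centroid into normalized scores.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        s = np.exp(-d)
        return s / s.sum(axis=1, keepdims=True)

def pseudo_label_rounds(X_lab, y_lab, X_unlab, threshold=0.9, rounds=4):
    """Each round: train on the labeled set, predict on the unlabeled pool,
    move confident predictions (max score >= threshold) into the labeled
    set with their predicted labels, and retrain."""
    X_pool = X_unlab.copy()
    for _ in range(rounds):
        model = CentroidClassifier().fit(X_lab, y_lab)
        if len(X_pool) == 0:
            break
        proba = model.predict_proba(X_pool)
        conf = proba.max(axis=1)
        keep = conf >= threshold
        if not keep.any():
            break
        pseudo_y = model.classes_[proba.argmax(axis=1)]
        X_lab = np.concatenate([X_lab, X_pool[keep]])
        y_lab = np.concatenate([y_lab, pseudo_y[keep]])
        X_pool = X_pool[~keep]
    return CentroidClassifier().fit(X_lab, y_lab)
```

Starting from only a handful of labeled examples per class, each round enlarges the training set with the most confidently pseudo-labeled pool samples, mirroring the study's loop of train, predict, filter, and retrain.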
Significance of the study
Applying the pseudo-labeling technique to label the unlabeled images automatically improved the predictive performance of DNN models compared to models trained only on manually labeled images. The baseline Xception model trained on the initial manually labeled training set achieved 83.45% and 77.54% accuracy on the validation and test sets, respectively.
Researchers used the Xception architecture to evaluate the impact of performing several pseudo-labeling rounds on accuracy, as this architecture provided the best trade-off between computational cost and predictive performance among the three DNN architectures evaluated (Xception, NASNet Large, and MobileNetV2). The predictive performance of the final Xception model on the test set after four pseudo-labeling rounds was then compared with that of the baseline Xception model.
The final model attained 95.25% and 92.71% accuracy on the validation and test sets, respectively, demonstrating a 15.17% absolute increase in test accuracy over the baseline model after four pseudo-labeling rounds while identifying individuals in a herd of 59 cows.