In an article recently posted on the arXiv preprint server*, researchers introduced a semi-supervised concept bottleneck model (SSCBM) to improve concept bottleneck models (CBMs). SSCBM addressed the high cost and effort of obtaining annotated concepts, as well as alignment issues, in CBMs. The framework generated pseudo-labels and an alignment loss by jointly training on labeled and unlabeled data. Experiments demonstrated its effectiveness, achieving high concept and prediction accuracy with only 20% labeled data compared to fully supervised settings.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
Background
Past work on CBMs has aimed to enhance image classification and visual reasoning by introducing a concept bottleneck layer for better generalization and interpretability. However, CBMs often struggle due to incomplete information extraction and the need for extensive dataset annotation.
Researchers have addressed these issues with solutions like interactive prediction settings and label-free CBMs that maintain high accuracy without labeled concept data. Semi-supervised learning (SSL) improves classifiers in scenarios with limited labeled data by utilizing pseudo-labeling and consistency regularization methods.
SSCBM Framework Overview
The SSCBM framework extends the concept embedding model (CEM) by employing distinct methods for processing labeled and unlabeled data. Labeled data is transformed through a feature extractor into latent representations, producing concept embeddings and predicted concept vectors. These vectors are compared with ground truth concepts to compute the concept loss, while the label predictor calculates the task loss based on the predicted concept vectors.
For unlabeled data, image features are extracted using an image encoder, and pseudo-labels are assigned via the k-nearest neighbors (KNN) algorithm. A concept heatmap is then generated to derive alignment labels, and an alignment loss is computed between the pseudo-labels and these alignment labels, with model parameters updated during each training epoch.
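The KNN pseudo-labeling step described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the use of cosine similarity as the distance measure, and the averaging of neighbor concept vectors into a soft pseudo-label are assumptions based on the description in the article.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn_pseudo_labels(unlabeled_feat, labeled_feats, labeled_concepts, k=3):
    """Assign a pseudo concept label to an unlabeled image by averaging
    the concept vectors of its k most similar labeled images.
    (Hypothetical helper; the paper's exact procedure may differ.)"""
    nearest = sorted(
        range(len(labeled_feats)),
        key=lambda i: cosine_similarity(unlabeled_feat, labeled_feats[i]),
        reverse=True,
    )[:k]
    n_concepts = len(labeled_concepts[0])
    # Soft pseudo-label: mean of the neighbors' concept annotations.
    return [sum(labeled_concepts[i][c] for i in nearest) / k
            for c in range(n_concepts)]
```

With `k=1`, the unlabeled image simply inherits the concept vector of its single most similar labeled image.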
High-dimensional concept representations for labeled data are obtained using a backbone network such as a 50-layer residual network (ResNet50), followed by an embedding generator that produces concept embeddings. The predicted binary concept vectors are optimized against ground truth labels through binary cross-entropy loss. The task loss for classification is defined using categorical cross-entropy loss and is also applied to unlabeled data to maximize data utilization.
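The two standard losses named above can be written out directly. This is a generic sketch of binary and categorical cross-entropy, not code from the paper; the function names and the per-concept averaging convention are illustrative choices.

```python
import math

def bce_concept_loss(pred_probs, targets, eps=1e-7):
    # Binary cross-entropy averaged over concepts; pred_probs are
    # sigmoid outputs in (0, 1), targets are 0/1 concept annotations.
    total = 0.0
    for p, t in zip(pred_probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp for numerical stability
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred_probs)

def ce_task_loss(logits, true_class):
    # Categorical cross-entropy for the class prediction, computed
    # from raw logits with the usual max-shift for stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[true_class]
```

For uniform logits over two classes, the task loss reduces to log 2, the entropy of a fair coin.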
Pseudo-labeling for unlabeled data involves measuring image similarity through cosine distance and assigning pseudo-labels based on this similarity. Although effective, this method can cause misalignment in concept embeddings. Concept heatmaps are generated to refine the similarity scores between images and concepts, converting them into hard concept labels using a specified threshold. Alignment loss is computed to ensure the concept encoder aligns with pseudo and similarity-based labels. The overall loss function integrates concept, task, and alignment losses to balance interpretability and accuracy.
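The thresholding and loss-combination steps just described can be sketched as follows. The threshold value and the weighting scheme are illustrative assumptions; the paper's actual hyperparameters and loss weights are not given in this summary.

```python
def harden_alignment_labels(similarity_scores, threshold=0.5):
    """Convert per-concept image-concept similarity scores (e.g. pooled
    from a concept heatmap) into hard 0/1 alignment labels using a
    specified threshold. (Illustrative; threshold is an assumption.)"""
    return [1 if s >= threshold else 0 for s in similarity_scores]

def total_loss(concept_loss, task_loss, alignment_loss,
               alpha=1.0, beta=1.0):
    # Weighted sum of the three objectives balancing interpretability
    # and accuracy; alpha and beta are hypothetical trade-off weights.
    return task_loss + alpha * concept_loss + beta * alignment_loss
```

Tuning `alpha` and `beta` shifts the balance between concept fidelity, alignment, and raw classification accuracy.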
Experimental Framework Evaluation
This section details the experimental studies on the performance of SSCBM across three real-world image datasets: Caltech-UCSD Birds-200-2011 (CUB), celebrities attributes (CelebA), and animals with attributes 2 (AwA2). The evaluation compares SSCBM with two baseline models, CBM and CEM, in a setting where most ground truth concept labels are absent.
Key metrics include concept and task accuracy, which are essential for assessing the model's predictive capabilities under varying ratios of labeled to unlabeled data samples.
SSCBM demonstrates significant performance improvements across datasets. For instance, concept accuracy peaks at 95.04% on the CUB dataset with an 80% labeled ratio, indicating robust performance as supervised data increases. Similarly, class accuracy improves consistently across datasets, underscoring SSCBM's effective use of limited supervised data.
To enhance interpretability, SSCBM employs concept heatmaps to refine similarity scores between images and concepts, which is crucial for transforming scores into precise concept labels based on predefined thresholds. This methodology effectively aligns concept embeddings with input saliency maps, improving interpretability and prediction accuracy.
SSCBM's adaptability is also demonstrated through test-time interventions, allowing dynamic adjustments to concept labels during inference based on real-time interactions. This capability enhances model robustness and ensures accurate predictions across diverse image analysis tasks.
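A test-time intervention can be sketched as overwriting selected predicted concept values with expert-provided corrections before the label predictor runs. The helper names and the toy linear predictor below are hypothetical, intended only to show the mechanism the article describes.

```python
def intervene(predicted_concepts, corrections):
    """Test-time intervention: replace selected predicted concept
    values with user-supplied ground truth before label prediction.
    `corrections` maps concept index -> corrected value."""
    out = list(predicted_concepts)
    for idx, value in corrections.items():
        out[idx] = value
    return out

def predict_label(concepts, class_weights):
    # Toy linear label predictor over the (possibly corrected)
    # concept vector; returns the index of the highest-scoring class.
    scores = [sum(c * w for c, w in zip(concepts, row))
              for row in class_weights]
    return max(range(len(scores)), key=lambda i: scores[i])
```

Because the label predictor consumes the concept vector directly, correcting even a single mispredicted concept can flip the final class prediction.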
In summary, SSCBM emerges as a versatile framework capable of effectively leveraging labeled and unlabeled data. By enhancing concept prediction accuracy and interpretability through innovative methodologies like concept heatmaps and test-time interventions, SSCBM proves suitable for various real-world applications in image analysis.
Conclusion
To sum up, the training of current CBMs relies heavily on the accuracy and richness of annotated concepts in the dataset. These concept labels are typically provided by experts, which can be costly and require significant resources and effort. Additionally, concept saliency maps are frequently misaligned with input saliency maps, causing concept predictions to correspond to irrelevant input features, an issue related to annotation alignment.
In response to these challenges, SSCBM was proposed as a strategy to generate pseudo labels and an alignment loss to address these issues. Results demonstrated the effectiveness of SSCBM in mitigating these challenges, showing promising outcomes in enhancing concept prediction accuracy and alignment with input features.
Journal reference:
- Preliminary scientific report.
Hu, L., et al. (2024). Semi-supervised Concept Bottleneck Models. arXiv. DOI: 10.48550/arXiv.2406.18992, https://arxiv.org/abs/2406.18992