Bridging the Gap: Visualizing and Enhancing Medical Text AI's Decision-Making

In a paper published in the journal Bioengineering, researchers addressed the challenge of applying deep learning in medicine by introducing explainable artificial intelligence (XAI). They applied a high-accuracy computer vision model to medical text tasks and used gradient-weighted class activation mapping (Grad-CAM) for intuitive visualization. Their system comprised four modules: pre-processing, word embedding, classification, and visualization.

Study: Bridging the Gap: Visualizing and Enhancing Medical Text AI's Decision-Making. Image credit: mrmohock/Shutterstock

After comparing various word embeddings and classifiers, the researchers found that a ResNet-based model on formalized medical text achieved the best performance. The approach combines ResNet and Grad-CAM to provide both high-accuracy classification and intuitive visualization of the words the model focuses on during predictions.

Background

In recent years, artificial intelligence (AI) technology has advanced significantly, opening up promising applications across various domains. In healthcare, AI has the potential to improve diagnostics and patient care. However, the opacity of AI algorithms poses ethical and practical challenges and fuels skepticism about their real-world performance.

To tackle this challenge, explainable AI (XAI) has been adopted, on the principle that AI models should offer comprehensible explanations for their decisions. This transparency is particularly critical in medical applications, where gaining trust and fostering acceptance are paramount.

Related Work

Prior research has underscored the difficulties associated with the lack of transparency in AI models, especially in healthcare, where comprehending decision-making processes is essential. One notable XAI technique is Grad-CAM, which generates intuitive heat maps to illustrate AI model focus areas during predictions. Beyond diagnostics, AI has the potential to enhance healthcare by supporting clinical decisions, reducing errors, and providing real-time health risk assessments. Rule-based AI systems have been employed in healthcare but face scalability and manual adjustment challenges. In medical text processing, AI focuses on tasks like entity recognition and relationship extraction.

Proposed Method

This study applied transfer learning from computer vision models to text-processing tasks and used the Grad-CAM method for model explainability. Word2Vec served as the word-embedding tool, while ResNet was the primary classifier. The dataset included clinical text data categorized into five classes. Adapting Grad-CAM, originally developed for computer vision, made it possible to interpret text models by generating heat maps, which provided valuable insight into the models' decision-making. The experiments used Python, with PyTorch for ResNet and Grad-CAM and a Word2Vec model from Gensim.
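The arithmetic behind a Grad-CAM heat map can be illustrated with a minimal sketch. It is not the authors' implementation: the feature maps and gradients below are made-up numbers, and in practice both would be captured from the last convolutional layer of a trained network, for example through PyTorch hooks. The function name `grad_cam` is hypothetical.

```python
# Minimal Grad-CAM arithmetic on a toy example (pure Python, no framework).
# Assumption: activations[k][i][j] is feature map k of the last conv layer and
# gradients[k][i][j] is d(class score) / d(activations[k][i][j]).

def grad_cam(activations, gradients):
    """Return a heat map scaled to [0, 1] with the same H x W as the feature maps."""
    n_maps = len(activations)
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global average of the gradients over the spatial grid.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(n_maps)
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_maps)))
         for j in range(width)]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be rendered as colors.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input: two 2x2 feature maps and their gradients (made-up numbers).
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # mean 0.5: this map supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # mean -1.0: this map argues against it
]
heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

The map produced at the last convolutional layer is usually coarser than the input and is upsampled before being overlaid. For text arranged as a grid, the upsampled values can then be read as approximate per-word weights.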

The dataset contained 14,438 clinical texts across five categories. Each record was transformed into a 25x25 format so that image-processing algorithms could be applied. Word2Vec generated word vectors with a window size of 100 and dimension 100. The Bidirectional Encoder Representations from Transformers (BERT) model was also evaluated for word embedding. ResNet18 and 1D convolutional neural network (CNN) models served as classifiers, with the text treated as multi-channel images. Traditional methods such as Naïve Bayes, using tf-idf features, were included for comparison. The Grad-CAM module was integrated into the CNN classifier for attention visualization.
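One plausible way to treat a record as a multi-channel image is sketched below. The article gives only the 25x25 format and the 100-dimensional vectors, so the row-major layout, the zero padding, the truncation at 625 tokens, and the toy embedding table are all assumptions made for illustration. The function name `text_to_grid` is hypothetical.

```python
# Sketch: arrange a tokenized record as a 25x25 grid with one channel per
# embedding dimension. Layout, padding, and embeddings are assumptions here,
# not details confirmed by the study.

GRID = 25          # grid side length, from the 25x25 format
EMBED_DIM = 100    # word vector dimension


def text_to_grid(tokens, embeddings, grid=GRID, dim=EMBED_DIM):
    """Return a nested list shaped [dim][grid][grid] (channels, rows, columns)."""
    n_cells = grid * grid
    zero = [0.0] * dim

    # Truncate long records and pad short ones so every record fills the grid.
    # Words missing from the embedding table map to the zero vector.
    vectors = [embeddings.get(token, zero) for token in tokens[:n_cells]]
    vectors += [zero] * (n_cells - len(vectors))

    # Cell (r, c) holds the word at position r * grid + c; channel d holds
    # the d-th embedding component of that word.
    return [
        [[vectors[r * grid + c][d] for c in range(grid)] for r in range(grid)]
        for d in range(dim)
    ]


# Hypothetical two-word vocabulary with constant vectors.
toy_embeddings = {"chest": [0.1] * EMBED_DIM, "pain": [0.2] * EMBED_DIM}
image = text_to_grid(["chest", "pain", "unknownword"], toy_embeddings)
print(len(image), len(image[0]), len(image[0][0]))  # 100 25 25
```

With this layout, each grid cell corresponds to one word position, which is what would allow a spatial heat map to be traced back to individual words.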

Experimental Analysis

The 2D ResNet18, initialized with pre-trained parameters and fine-tuned on Kaggle's medical text dataset, achieved the highest performance among the models. In contrast, the traditional text classifier Naïve Bayes yielded considerably lower weighted F1 scores of 42.2% and 47.8%. Models based on AlexNet and VGG11 exhibited inferior performance, with accuracy rates falling below that of Naïve Bayes.
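For readers unfamiliar with the metric, a weighted F1 score averages the per-class F1 scores, weighting each class by its number of true examples. The sketch below computes it from scratch on made-up labels. It illustrates the definition only and does not reproduce any figure from the study.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0

        # Support is the number of true examples of this class.
        support = tp + fn
        score += f1 * support / total
    return score


# Made-up labels: class "A" has three true examples, class "B" has one.
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
result = weighted_f1(y_true, y_pred)
print(round(result, 4))  # 0.7667
```

Here class A has an F1 of 0.8 and class B an F1 of about 0.667, and the 3:1 support ratio pulls the weighted average toward class A.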

Several variants were compared to explore how ResNet18 performs under different conditions: (1) fine-tuning both input and output layers of ResNet using pre-trained ImageNet parameters and further training on the medical text dataset; (2) fine-tuning only the input and output layers while training on the medical text dataset; and (3) fine-tuning input and output layers with parameters pre-trained on ImageNet. Among these models, the one exhibiting the best performance at Epoch 25 was selected.

Plots of the ResNet18 model's training and validation accuracies over 25 epochs showed that the model converged and exceeded 90% accuracy on both. This study successfully applied ResNet to medical text-processing tasks and leveraged Grad-CAM for model interpretability. Using Grad-CAM to generate heatmaps for interpreting medical text models is a novel contribution.

Although Grad-CAM provides valuable model explainability, its interpretability remains qualitative rather than quantitative. The suitability of its explanations therefore still depends on clinical assessment, which introduces subjectivity. Future research should explore standardized criteria for assessing model interpretability, possibly adopting methods like SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) to gauge explainability. Additionally, the findings hint at the need for XAI to elucidate the factors most influential in model outcomes and to find suitable linguistic expressions for interpretation, especially for complex "black-box" models like artificial neural networks (ANNs). Future work should aim to pinpoint these influential factors through ablation experiments and introduce explicit linguistic expressions to enhance XAI's capabilities.
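One simple form such an ablation experiment could take is leave-one-word-out scoring: remove each word in turn and record how much the model's score for the predicted class drops. The sketch below is only an illustration of that idea. The function `ablation_importance` is hypothetical, and `toy_score` with its keyword weights is a made-up stand-in for a trained classifier.

```python
def ablation_importance(tokens, score_fn):
    """Return (token, score drop) pairs from removing one token at a time."""
    base = score_fn(tokens)
    return [
        (token, base - score_fn(tokens[:i] + tokens[i + 1:]))
        for i, token in enumerate(tokens)
    ]


# Hypothetical scorer: a real experiment would call the trained model here
# and return its score for the class of interest.
KEYWORD_WEIGHTS = {"fever": 0.5, "cough": 0.25}


def toy_score(tokens):
    return sum(KEYWORD_WEIGHTS.get(token, 0.0) for token in tokens)


drops = ablation_importance(["patient", "has", "fever", "and", "cough"], toy_score)
for token, drop in drops:
    print(f"{token}: {drop}")
```

Unlike a heat map, these score drops are numbers that can be compared across models and records, which is one route toward the quantitative criteria discussed above.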

Conclusion

To sum up, ResNet was employed for medical text processing to enhance classification accuracy. Additionally, Grad-CAM visualization effectively highlighted the model's attention during predictions and also contributed to model interpretability. Looking ahead, future endeavors will focus on establishing quantitative criteria to assess Grad-CAM's explainability for performance comparisons with state-of-the-art models.


Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use the following format to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2023, September 13). Bridging the Gap: Visualizing and Enhancing Medical Text AI's Decision-Making. AZoAi. Retrieved on December 22, 2024 from https://www.azoai.com/news/20230913/Bridging-the-Gap-Visualizing-and-Enhancing-Medical-Text-AIs-Decision-Making.aspx.
