New research shows that while AI can offer accurate advice, incorrect explanations still mislead users, distorting decision-making and eroding trust in AI systems.
Research: Don't be Fooled: The Misinformation Effect of Explanations in Human-AI Collaboration.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
In an article recently posted on the arXiv preprint* server, researchers comprehensively examined the impact of incorrect explanations on human-artificial intelligence (AI) collaboration. They aimed to understand how incorrect explanations, even when paired with accurate AI advice, affect human procedural knowledge, reasoning, decision-making, and overall collaboration between humans and AI.
Background
Explainable AI (XAI) has become crucial for improving transparency, making the reasoning behind AI outputs understandable to humans. However, its effectiveness depends on the accuracy of the explanations provided. Incorrect explanations not only mislead users during decision-making but can also impair their ability to perform tasks autonomously, because users internalize flawed reasoning strategies. This is particularly risky in high-stakes contexts such as healthcare, finance, and law. The study also highlights the role of cognitive load and the persistence of the "misinformation effect," whereby incorrect explanations distort knowledge retention and bias future decisions.
In human-AI collaboration, explanations play an important role in building trust and supporting effective decisions. However, the effect of incorrect explanations on human procedural knowledge and reasoning has not been deeply explored.
About the Research
In this paper, the authors used a mixed-methods approach, combining quantitative and qualitative analysis, to explore the impact of incorrect explanations on human-AI collaboration. The study focused on how incorrect explanations affect long-term knowledge retention and cognitive load. To support the generalizability of the results, they conducted an online study with 160 participants recruited across a range of demographic groups, ages, and educational backgrounds.
The participants were randomly assigned to one of four groups: a control group with no AI support, a group with AI advice only, a group with AI advice and correct explanations, and a group with AI advice and incorrect explanations.
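This between-subjects design can be sketched in a few lines of Python. The condition labels, seed, and function name below are illustrative paraphrases of the design described above, not the authors' actual code.

```python
import random

# Illustrative four-condition assignment; labels are paraphrased from the
# study design, not taken from the paper's materials.
CONDITIONS = [
    "control",               # no AI support
    "advice_only",           # AI advice without explanations
    "advice_correct_expl",   # AI advice with correct explanations
    "advice_incorrect_expl", # AI advice with incorrect explanations
]

def assign_groups(participant_ids, seed=42):
    """Randomly assign participants to the four conditions in equal numbers."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    per_group = len(ids) // len(CONDITIONS)
    return {
        cond: ids[i * per_group:(i + 1) * per_group]
        for i, cond in enumerate(CONDITIONS)
    }

groups = assign_groups(range(160))
# With 160 participants, each of the four conditions receives 40 people.
```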
The study tracked eye movements, decision-making time, and confidence levels to gain deeper insights into participants' decision-making processes. The researchers also took detailed notes on participants' self-reported experiences, including any confusion or uncertainty they faced.
The study had five phases: introduction, pre-test, main task, post-test, and questionnaire. The pre-test assessed participants' initial ability by having them classify six images. In the main task, participants were shown images of buildings from various eras and asked to classify each into one of three architectural styles: Art Deco, Art Nouveau, or Georgian. These styles were chosen for their similar visual characteristics, which make them difficult to distinguish and heighten the need for accurate AI explanations.
In the main task, they classified twelve images with AI support, and in the post-test, they classified six more images without any support. Finally, they filled out a questionnaire to gauge their understanding and reasoning. The AI system provided explanations for its decisions, which were either correct or incorrect, depending on the assigned group.
The authors aimed to explore how incorrect explanations affected human procedural knowledge, reasoning, and overall human-AI team performance. They used generative pre-trained transformer version 4 (GPT-4), a large language model (LLM) developed by OpenAI, which gave participants a classification and an explanation based on their assigned group. In the correct explanation group, the LLM's explanation matched its prediction, while in the incorrect explanation group, the explanation did not.
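The manipulation described here, a prediction paired with either a matching or a deliberately mismatched explanation, can be illustrated with a hypothetical sketch. The function, style list, and canned-explanation mapping below are assumptions for illustration; the article does not describe the authors' actual prompting pipeline.

```python
# Hypothetical sketch of the correct- vs incorrect-explanation conditions.
# `explanation_for` stands in for LLM-generated explanations of each style's
# visual features; it is not the study's real material.
STYLES = ["Art Deco", "Art Nouveau", "Georgian"]

def build_feedback(prediction, explanation_for, condition):
    """Pair an AI classification with a matching or mismatching explanation."""
    if condition == "advice_correct_expl":
        # Explanation matches the predicted style.
        return prediction, explanation_for[prediction]
    if condition == "advice_incorrect_expl":
        # Explanation deliberately describes a *different* style.
        wrong = next(s for s in STYLES if s != prediction)
        return prediction, explanation_for[wrong]
    # Advice-only condition: no explanation shown.
    return prediction, None
```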
Key Findings
The outcomes showed that incorrect explanations from the AI system significantly impaired human procedural knowledge and reasoning. Participants who received incorrect explanations performed worse on the post-test than those who received correct explanations or none at all.
Incorrect explanations also reduced human-AI team performance, as participants were less likely to trust the AI's recommendations. The results also showed that the effects of incorrect explanations persisted even when the AI provided accurate advice. This decline in performance was associated with increased cognitive load and frustration, leading participants to misapply flawed reasoning strategies.
The quantitative analysis revealed a 20% improvement in post-test classification accuracy for those given correct explanations compared to the pre-test. In contrast, participants with incorrect explanations experienced a 10% decline in accuracy over the same period. Additionally, incorrect explanations led to higher "detrimental overrides," where participants wrongly relied on AI advice, even when it conflicted with their own reasoning.
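The two measures reported here, the pre-to-post accuracy shift and the count of detrimental overrides, can be expressed as simple metrics. This is a minimal sketch assuming per-trial records of initial answers, final answers, AI advice, and ground truth; the function names are illustrative, not the authors'.

```python
def accuracy(answers, truth):
    """Fraction of correct classifications."""
    return sum(a == t for a, t in zip(answers, truth)) / len(truth)

def knowledge_shift(pre, post, truth_pre, truth_post):
    """Post-test minus pre-test accuracy (the study's reported +20% / -10%)."""
    return accuracy(post, truth_post) - accuracy(pre, truth_pre)

def detrimental_overrides(initial, final, ai_advice, truth):
    """Count trials where a participant abandoned a correct initial answer
    to follow incorrect AI advice."""
    return sum(
        1
        for ini, fin, ai, t in zip(initial, final, ai_advice, truth)
        if ini == t and fin == ai and ai != t
    )
```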
The study also found that those provided with incorrect information felt frustrated and uncertain about accurately classifying architectural styles. At the same time, those given correct explanations reported increased confidence in their decisions and a greater willingness to trust the AI system.
Applications
This research has significant implications for designing AI-based decision support systems in fields like healthcare, finance, and education. Understanding the risks of incorrect explanations can help organizations and policymakers develop AI systems that provide accurate explanations, enhancing collaboration and trust. The study emphasizes that incorrect explanations may provide short-term gains in task performance but ultimately undermine long-term knowledge retention and decision-making abilities.
The study also highlights the need for AI developers to focus on building reliable explanation-generation algorithms that minimize errors. It offers practical recommendations for future AI development, suggesting that systems should incorporate multiple validation layers to ensure explanations match their recommendations. The inclusion of uncertainty scores or feedback mechanisms to flag potentially incorrect explanations could further improve reliability and trustworthiness.
Conclusion
In summary, the study highlighted the importance of accurate explanations in human-AI collaboration. It demonstrated the risks associated with incorrect explanations and emphasized the need for AI systems to provide correct and reliable explanations to support effective decision-making. Incorrect explanations were found to impair not only task performance during AI collaboration but also users' procedural knowledge and reasoning after AI support is removed, marking the "misinformation effect" as a critical issue. The findings have broad implications across various fields and suggest the need for further research into how explanations affect collaboration.
Future work should examine how explanations influence human-AI collaboration in different settings, such as virtual teams or time-sensitive decision-making. Additionally, it would be important to investigate the impact of AI-generated explanations on teamwork, communication, and power dynamics within human and AI teams.
Journal reference:
- Preliminary scientific report.
Spitzer, P., et al. (2024). Don't be Fooled: The Misinformation Effect of Explanations in Human-AI Collaboration. arXiv:2409.12809. DOI: 10.48550/arXiv.2409.12809, https://arxiv.org/abs/2409.12809