Generative Chatbots Amplify False Memories in Witness Interviews, Posing New Ethical Risks

Generative chatbots not only create and reinforce false memories but also increase users' confidence in these distorted recollections, raising critical concerns about their potential misuse in law enforcement and the need for immediate ethical guidelines in AI applications.

Study: Conversational AI Powered by Large Language Models Amplifies False Memories in Witness Interviews

In an article recently posted to the arXiv preprint* server, researchers at the Massachusetts Institute of Technology and the University of California, Irvine, found that participants who interacted with a generative chatbot during simulated crime witness interviews were significantly more prone to false memories, with the chatbot inducing approximately three times as many false memories as the control condition. Even after one week, the number of these false memories remained constant, and participants' confidence in them stayed significantly elevated.

The study highlighted the ethical risks of using advanced artificial intelligence (AI) in sensitive contexts like police interviews.

Background

Past work on false memories has demonstrated that human recollections are reconstructive and susceptible to external influences, such as suggestive questioning, with significant implications for legal settings.

Studies by Loftus and others have shown how question-wording and misinformation can distort memory, while neuroimaging has revealed that true and false memories activate similar brain regions.

More recently, research has raised concerns about AI's potential to induce false memories, especially as AI systems, including large language models, are increasingly integrated into daily life and human interactions. However, the specific impact of AI-driven dialogue systems on memory formation remains an emerging area of study, warranting further exploration.

Two-phase Memory Experiment

The study followed a two-phase experimental procedure. Participants watched a two-and-a-half-minute silent video of a robbery in the first phase. They then rated their emotional state using a self-assessment manikin (SAM) scale and completed a filler activity (a brief Pac-Man game).

Participants were randomly assigned to one of four conditions: control, survey-based, pre-scripted chatbot, or generative chatbot. The experimental conditions involved misleading questions aimed at inducing false memories.

After completing their assigned condition, participants assessed their cognitive load using the NASA Task Load Index (NASA-TLX) and then took a memory test with questions about the video. The first session lasted 30 to 45 minutes.

The second phase occurred one week later, with participants completing an online survey, recalling the video, and answering the same follow-up questions. This session lasted 10 to 20 minutes, and responses were compared with those from the first phase to assess the persistence of false memories and any changes in participants' confidence levels.

Both the pre-scripted and generative chatbots were implemented through a web interface powered by Generative Pre-trained Transformer 4 (GPT-4). The generative chatbot provided feedback based on participants' answers, reinforcing or correcting responses using specific prompts.

These prompts aimed to simulate a more authentic and dynamic interaction, occasionally introducing additional details to influence the user's recall of the video.
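
The paper's implementation is not reproduced in this article, but a minimal sketch of how such GPT-4-driven feedback could be generated, assuming the OpenAI Python client, is shown below; the system prompt, helper name, and sample question are illustrative assumptions, not the authors' materials.

```python
# Hypothetical sketch of a GPT-4-backed interviewer that comments on each
# answer. Prompt wording and names are illustrative, not the study's code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are interviewing a witness about a video they just watched. "
    "After each answer, give brief, confident feedback on that answer."
)

def feedback_for(question: str, answer: str) -> str:
    """Return model-generated feedback for one question/answer pair."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Question: {question}\nAnswer: {answer}"},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

print(feedback_for("Did the robbers arrive by car?", "Yes, I think so."))
```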

Participants were recruited through Prolific, pre-screened for fluency in English, and balanced by gender. Of the 200 participants recruited, six did not complete the second phase, and 39 others were excluded due to failed attention checks. Statistical analyses, including Kruskal-Wallis and Wilcoxon tests, were used to evaluate false memories and confidence scores.
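
As a rough illustration of this kind of analysis, the sketch below runs a Kruskal–Wallis test across conditions and a Wilcoxon signed-rank test within a condition using SciPy; the per-participant counts and confidence ratings are placeholder values, not data from the study.

```python
# Illustrative only: the numbers below are placeholders, not study data.
from scipy.stats import kruskal, wilcoxon

# False-memory counts per participant in each condition (fabricated examples).
control     = [0, 1, 0, 2, 1]
survey      = [1, 2, 1, 2, 2]
prescripted = [1, 1, 2, 2, 3]
generative  = [3, 2, 4, 3, 3]

# Kruskal-Wallis: do the four conditions differ in immediate false memories?
h_stat, p_between = kruskal(control, survey, prescripted, generative)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_between:.4f}")

# Wilcoxon signed-rank: does confidence change between the immediate session
# and the one-week follow-up for the same participants?
week0_confidence = [4.1, 3.8, 4.5, 3.9, 4.2]
week1_confidence = [4.0, 3.9, 4.4, 4.1, 4.3]
w_stat, p_within = wilcoxon(week0_confidence, week1_confidence)
print(f"Wilcoxon W = {w_stat:.2f}, p = {p_within:.4f}")
```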

Generative Chatbot Misinformation

The study's results indicated that short-term interactions with generative chatbots significantly increased the occurrence of false memories and elevated users' confidence in these false memories compared to other methods. The generative chatbot had the most substantial misinformation effect, misleading 36.4% of participants.

The survey-based intervention also caused false memories in 21.6% of participants. Users who were less familiar with chatbots but more knowledgeable about AI and interested in crime investigations were particularly prone to false memories.

A one-way Kruskal–Wallis test revealed that the generative chatbot produced significantly more immediate false memories than the control, survey-based, and pre-scripted chatbot interventions.

While all interventions led to more false memories than the control, the generative chatbot induced almost three times as many false memories as the control group. However, no significant differences were found between the survey-based and pre-scripted chatbot conditions.

Regarding confidence in false memories, the intervention conditions, including the generative chatbot, significantly increased participants' confidence compared to the control.

However, confidence in true memories did not differ significantly between conditions. Notably, false memories induced by the generative chatbot remained stable after one week, unlike the control and survey-based conditions, which showed an increase in false memories over time.

Further analysis revealed that familiarity with AI technology and interest in crime investigations were key moderating factors in false memory formation. Participants unfamiliar with chatbots but experienced with AI were more likely to develop false memories. In contrast, variables such as age, gender, and cognitive workload did not show a significant impact, as indicated by the study's mixed-effects regression model.
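
For readers interested in how such a moderator analysis might be specified, the sketch below fits a mixed-effects regression with statsmodels on entirely synthetic data; the predictors, rating scales, and model structure are assumptions for illustration, not the authors' actual specification.

```python
# Hypothetical moderator analysis on synthetic data; variable names, scales,
# and model structure are assumptions, not the study's dataset or model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for pid in range(40):                          # 40 synthetic participants
    ai_familiarity = rng.integers(1, 8)        # assumed 1-7 self-report scale
    chatbot_familiarity = rng.integers(1, 8)
    age = rng.integers(18, 65)
    for session in (0, 1):                     # immediate and one-week sessions
        false_memories = rng.poisson(1 + 0.2 * ai_familiarity)
        rows.append(dict(participant=pid, session=session, age=age,
                         ai_familiarity=ai_familiarity,
                         chatbot_familiarity=chatbot_familiarity,
                         false_memories=false_memories))
df = pd.DataFrame(rows)

# Mixed-effects model with a random intercept for each participant.
model = smf.mixedlm(
    "false_memories ~ ai_familiarity + chatbot_familiarity + age + session",
    df, groups=df["participant"],
)
print(model.fit().summary())
```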

Conclusion

The study provided empirical evidence of the influence of AI, and of generative chatbots in particular, on false memory formation. As AI systems become more sophisticated and widely used, it is crucial to consider their impact on cognitive processes.

The findings underscored the need for caution and the development of ethical guidelines for AI applications in sensitive contexts. This research highlights the importance of balancing AI technology's benefits with preserving human memory and decision-making integrity. Further research is necessary to address these concerns.

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

Journal reference:
  • Preliminary scientific report. Chan, S., et al. (2024). Conversational AI Powered by Large Language Models Amplifies False Memories in Witness Interviews. arXiv. DOI: 10.48550/arXiv.2408.04681, https://arxiv.org/abs/2408.04681

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.
