Disinformation Detection in the Age of LLMs: Challenges and Solutions

In a recent submission to the arXiv* server, researchers comprehensively examined the detection of disinformation generated by large language models (LLMs).

Study: Disinformation Detection in the Age of LLMs: Challenges and Solutions. Image credit: Bits And Splits/Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

Background

The emergence of LLMs, including models such as Chat Generative Pre-Trained Transformer (ChatGPT) and Meta’s language model (Llama), has marked a significant milestone in computational social science (CSS). While LLMs have opened doors to extensive studies of human language and behavior, concerns about their potential misuse for disinformation have arisen. As these models advance in generating highly convincing human-like content, the risk of their exploitation for the creation of misleading information on a large scale becomes evident. Recent research has highlighted this concern, noting how inexpensive and persuasive AI-generated disinformation can be.

Existing models for disinformation detection

In text generation, there has been a transition from small language models (SLMs) to LLMs with billions of parameters, resulting in significant advancements. Models such as the Language Model for Dialogue Application (LaMDA), Bloom, Pathways Language Model (PaLM), and the generative pre-trained transformer (GPT) family have demonstrated the ability to produce human-level responses. However, the format of input prompts can influence performance, and advanced prompt engineering techniques are crucial to guide LLMs toward more accurate and higher-quality responses.

Before the rise of LLMs, disinformation detection was primarily centered around SLMs such as bidirectional encoder representations from transformers (BERT), GPT-2, and text-to-text transfer transformers (T5). Deep learning has played a pivotal role in detecting disinformation, with models such as the hybrid deep model for fake news (CSI) and FakeBERT employing neural networks to identify textual features indicative of disinformation. The introduction of LLMs, with their vast parameters, has significantly complicated disinformation detection, given their ability to produce natural, human-like text. This shift raises critical questions about the effectiveness of existing disinformation detection methods designed around SLMs.
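
To make this pre-LLM detection recipe concrete, the following is a minimal sketch of fine-tuning a RoBERTa classifier on a labeled real-vs-fake news corpus using the Hugging Face Transformers library. The CSV file names and column names are illustrative assumptions, not the study's actual data files.

```python
# Minimal sketch: fine-tune RoBERTa as a binary real-vs-fake news classifier.
# Assumption: CSV files with "text" and "label" columns (0 = real, 1 = fake);
# these file names are placeholders, not the study's released data.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

data = load_dataset("csv", data_files={"train": "dhuman_train.csv",
                                       "test": "dhuman_test.csv"})

def tokenize(batch):
    # Truncate long articles to RoBERTa's 512-token limit.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-fake-news",
                           per_device_train_batch_size=8,
                           num_train_epochs=3),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
print(trainer.evaluate())  # loss on the held-out split (add compute_metrics for accuracy)
```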

The current study aims to detect LLM-generated disinformation by addressing three research questions (RQs):

  • RQ1: Are current disinformation detection techniques suitable for LLM-generated disinformation?
  • RQ2: If not, can LLMs themselves be adapted for detection?
  • RQ3: If both approaches fall short, what alternative solutions can be explored?

Dataset for disinformation detection

Researchers created a human-written fake news dataset, known as Dhuman, as a benchmark for disinformation detection. It comprises 21,417 real news articles from Reuters and 23,525 fake news articles from unreliable sources flagged by fact-checking websites.

From Dhuman, three LLM-generated fake news datasets are constructed using different zero-shot prompt techniques: Dgpt std, Dgpt mix, and Dgpt cot. The dataset Dgpt std involves minimal modifications to human-written disinformation, maintaining its original content while enhancing its tone and vocabulary. Dgpt mix combines true and fake news to create more complex disinformation, and Dgpt cot employs chain-of-thought (CoT) prompts to guide ChatGPT in generating disinformation that mimics human cognitive processes. These new datasets are introduced as valuable resources for future research in LLM-generated disinformation detection.
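
As an illustration of how such zero-shot prompting might look in practice, the sketch below issues one request per prompt style through the OpenAI chat completions API. The prompt wording, model name, and function names are assumptions for illustration; the paper's exact prompts are not reproduced here.

```python
# Illustrative sketch of the three zero-shot generation styles described above.
# The prompt wording and model name are assumptions, not the paper's prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPTS = {
    # std: keep the original claims, polish tone and vocabulary
    "std": "Rewrite the following fake news article, keeping its claims intact "
           "but improving its tone and vocabulary:\n\n{fake}",
    # mix: weave true and fake material into one story
    "mix": "Blend the following real article and fake article into a single, "
           "coherent news story:\n\nREAL ARTICLE:\n{real}\n\nFAKE ARTICLE:\n{fake}",
    # cot: ask the model to reason step by step before writing
    "cot": "Think step by step about how a journalist would present the claims "
           "below, then write a convincing news article based on them:\n\n{fake}",
}

def generate(style: str, fake: str, real: str = "") -> str:
    prompt = PROMPTS[style].format(fake=fake, real=real)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```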

To validate the generated disinformation, a comparative analysis between samples from Dhuman and Dgpt std is conducted. Linguistic and semantic similarities are examined using Linguistic Inquiry and Word Count (LIWC) and t-SNE. The linguistic analysis shows that LLM-generated disinformation exhibits increased prosocial language, political and ethical themes, and logical coherence, while reducing emotional language, profanities, and colloquialisms. Semantically, human-written and ChatGPT-generated disinformation overlap substantially, indicating that they convey similar meanings.
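
The semantic comparison can be approximated with off-the-shelf tools, as in the sketch below, which embeds samples from both corpora and projects them with t-SNE. The embedding model, file names, and helper function are assumptions; the study's full pipeline (including LIWC's linguistic categories) is not reproduced here.

```python
# Sketch of the semantic-overlap check: embed articles from both corpora and
# project them with t-SNE. Embedding model, file names, and helper are
# assumptions; LIWC's linguistic analysis is not reproduced here.
import numpy as np
import matplotlib.pyplot as plt
from sentence_transformers import SentenceTransformer
from sklearn.manifold import TSNE

def load_texts(path):
    # Assumed format: one article per line.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

human_texts = load_texts("dhuman_sample.txt")    # samples from Dhuman
gpt_texts = load_texts("dgpt_std_sample.txt")    # samples from Dgpt std

encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(human_texts + gpt_texts)

# t-SNE perplexity must stay below the number of samples; this assumes a few
# hundred articles per corpus.
coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.asarray(embeddings))

n = len(human_texts)
plt.scatter(coords[:n, 0], coords[:n, 1], alpha=0.5, label="human-written")
plt.scatter(coords[n:, 0], coords[n:, 1], alpha=0.5, label="ChatGPT-generated")
plt.legend()
plt.title("t-SNE of article embeddings")
plt.show()
```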

Experiments and results

The researchers evaluated disinformation detection on the collected datasets, focusing on how well detection models distinguish human-written from LLM-generated disinformation.

Existing Techniques (RQ1): The study reveals that current state-of-the-art disinformation detection models struggle when faced with advanced disinformation. A BERT variant (RoBERTa) is employed for detection, first fine-tuned and tested on human-written disinformation from Dhuman and then challenged with LLM-generated disinformation. Performance on Dhuman is excellent, with a minimal misclassification rate. However, when tested on the LLM-generated datasets (Dgpt std, Dgpt mix, and Dgpt cot), the model struggles: Dgpt mix and Dgpt cot, which involve more complex disinformation generation, show high misclassification rates. The model also exhibits political bias in its classification, particularly misclassifying center-leaning disinformation as true news.
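
A cross-dataset check of this kind could look like the following sketch, which scores the fine-tuned classifier on the LLM-generated sets and reports how many fake articles slip through as true. The model directory, file names, and label convention are assumptions carried over from the earlier sketch.

```python
# Sketch: score the RoBERTa classifier fine-tuned on Dhuman (see earlier
# snippet) against the LLM-generated sets and count fake articles that are
# misclassified as true. File names and the LABEL_0 = real convention are
# assumptions, not the paper's artifacts.
from datasets import load_dataset
from transformers import pipeline

clf = pipeline("text-classification", model="roberta-fake-news")

for name in ["dgpt_std.csv", "dgpt_mix.csv", "dgpt_cot.csv"]:
    fake_only = load_dataset("csv", data_files=name)["train"].filter(
        lambda row: row["label"] == 1)  # keep only the fake articles
    preds = clf(list(fake_only["text"]), truncation=True, max_length=512)
    missed = sum(p["label"] == "LABEL_0" for p in preds)
    print(f"{name}: {missed}/{len(preds)} fake articles misclassified as true")
```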

LLMs (RQ2): Researchers demonstrated that LLMs struggle to identify self-generated disinformation effectively. They conducted experiments using ChatGPT to assess the proficiency of LLMs in identifying disinformation generated by LLMs. ChatGPT's responses vary in length and complexity when prompted to detect misleading information. The study finds that GPT-4 performs slightly better than GPT-3.5 in identifying LLM-generated disinformation, and prompting ChatGPT to provide detailed explanations improves its performance. However, ChatGPT's performance is generally inferior to the fine-tuned RoBERTa model in detecting disinformation.
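
A minimal version of this detector setup is sketched below: the model is asked directly whether an article is true or fake, optionally with a request for step-by-step reasoning, which the study found improves performance. The prompt text and verdict-parsing convention are assumptions.

```python
# Sketch of prompting ChatGPT to judge an article, with an optional request
# for step-by-step reasoning. Prompt text, model names, and the VERDICT
# convention are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def detect(article: str, model: str = "gpt-4", explain: bool = True) -> str:
    instruction = "Decide whether the following news article is true or fake."
    if explain:
        instruction += (" Explain your reasoning step by step, then finish with"
                        " one line reading 'VERDICT: true' or 'VERDICT: fake'.")
    else:
        instruction += " Answer with exactly one word: true or fake."
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": instruction},
                  {"role": "user", "content": article}],
    )
    return response.choices[0].message.content
```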

Proposed Solution (RQ3): Researchers introduced a novel approach to detect LLM-generated disinformation, focusing on complex disinformation blending genuine and misleading content. A structured CoT prompt is designed to guide ChatGPT step by step in analyzing and fact-checking key content elements. An ablation study assesses the impact of contextual elements on detection performance.
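
The sketch below illustrates what such a structured, step-by-step fact-checking prompt might look like, walking the model through the claim, the entities, the event, and the timeframe before a final verdict. The step wording is an illustrative assumption, not the paper's exact prompt.

```python
# Sketch of a structured chain-of-thought detection prompt that fact-checks
# key contextual elements (claim, entities, event, time) before a verdict.
# The step wording is an illustrative assumption, not the paper's prompt.
from openai import OpenAI

COT_DETECTION_PROMPT = """You are a careful fact-checker. Analyse the article below step by step:
1. Extract the main claim, the people and organisations involved, and the event described.
2. Note the time and place the article refers to.
3. For each element, state whether it is consistent with widely reported facts
   or appears fabricated, misattributed, or taken out of context.
4. Check whether genuine and false details appear to be blended together.
5. Finish with one line reading 'VERDICT: true' or 'VERDICT: fake'.

Article:
{article}
"""

def cot_detect(article: str, model: str = "gpt-4") -> str:
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": COT_DETECTION_PROMPT.format(article=article)}],
    )
    return response.choices[0].message.content
```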

Results indicate that GPT-4 consistently performs better than GPT-3.5 across various configurations of the CoT prompts. The study highlights the significance of contextual elements such as events and time for disinformation detection. Overall, advanced prompts paired with LLMs show promise for effectively countering LLM-generated disinformation.

Conclusion

In summary, researchers examined the detection of LLM-generated disinformation using ChatGPT to create three distinct datasets. The research shows that existing techniques, including LLMs, struggle to consistently identify this disinformation. To address this challenge, advanced prompts are introduced, significantly improving detection.

Future research may explore other LLM-generated disinformation types, such as false connections and manipulated content, while advanced prompting methods, such as CoT-self-consistency, offer promising avenues for further improvement.


Written by

Dr. Sampath Lonka

Dr. Sampath Lonka is a scientific writer based in Bangalore, India, with a strong academic background in Mathematics and extensive experience in content writing. He has a Ph.D. in Mathematics from the University of Hyderabad and is deeply passionate about teaching, writing, and research. Sampath enjoys teaching Mathematics, Statistics, and AI to both undergraduate and postgraduate students. What sets him apart is his unique approach to teaching Mathematics through programming, making the subject more engaging and practical for students.
