Decoding Authorship: Unveiling AI-Generated Text Sources

Large language models (LLMs) such as Generative Pre-trained Transformer (GPT)-4, Pathways Language Model (PaLM), and Large Language Model Meta AI (Llama) have greatly advanced artificial intelligence (AI)-generated text. Concerns over the potential misuse of such text underscore the need for AI-generated text forensics. Neural authorship attribution (AA) traces AI-generated text back to its source LLM, categorizing sources as proprietary or open-source.

Study: Decoding Authorship: Unveiling AI-Generated Text Sources. Image credit: Ole.CNX/Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, guide clinical practice/health-related behavior, or treated as established information.

In a recent paper submitted to the arXiv* server, researchers empirically analyzed LLM writing styles, compared the proprietary and open-source categories, and investigated their use in AA, contributing to efforts to counter AI-generated misinformation.

Background

Recent strides in generative LLMs such as GPT-4, OpenAI's proprietary model, PaLM from Google, and open-source models, including Llama 1 and 2, have revolutionized AI-generated content, mainly textual. While AI-generated text is a boon for human productivity, its potential misuse for influence operations and misinformation dissemination poses severe cybersecurity and information-integrity risks. The surge in seemingly genuine yet deceitful AI-generated news articles raises concerns. Computational methodologies for forensic evaluation of AI-generated text are therefore urgently needed to mitigate the dissemination of LLM-driven misinformation.

A pivotal facet of AI-generated text forensics is neural authorship attribution, identifying the LLM behind a specific text. This aids in uncovering malicious actors and their strategies, informs countermeasures, and refines LLM usage ethics. Typically, neural authorship attribution involves training a classifier on pre-trained language model (PLM) embeddings using texts generated by known LLMs, such as the Robustly Optimized BERT pre-training approach (RoBERTa). However, the LLM landscape's evolution introduces novel dimensions. Categorizing source LLMs as proprietary or open-source holds merit, potentially revealing campaign nuances such as actor resources and expertise. An open-source LLM choice might indicate specialized skills and computational infrastructure.

Unveiling LLM writing signatures for enhanced neural AA

In previous research, AA focused on identifying human writers through distinctive signatures. Early approaches used classical machine learning classifiers such as Naïve Bayes and Support Vector Machines (SVMs) with features such as n-grams, part-of-speech (POS) tags, and topic modeling. Neural networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), then gained prominence due to their ability to learn richer text representations. Transformer-based models introduced the notion of neural authors, where the task is to identify the generating language model. Initial studies applied traditional AA features to neural authors; recent work employed PLM-based classifiers to attribute text to models such as Grover, GPT-2, and GPT-3, including attributing fine-tuned instances to their base LLM.
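The classical pipeline described above can be sketched in a few lines: character n-gram features feeding a linear SVM. The tiny corpus below is purely illustrative, not data from the study.

```python
# Classical authorship-attribution sketch: character n-grams + linear SVM,
# as in early AA work. Toy texts and author labels are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "The markets rallied sharply on Monday.",
    "Stocks surged as investors cheered the news.",
    "A gentle rain fell over the quiet village.",
    "The old village slept beneath soft drizzle.",
]
authors = ["A", "A", "B", "B"]

# Character 2-3 grams capture sub-word style cues (punctuation habits,
# affixes) that word-level features miss.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 3)),
    LinearSVC(),
)
clf.fit(texts, authors)
train_acc = clf.score(texts, authors)
pred = clf.predict(["Shares jumped after the announcement on Monday."])[0]
```

The same pipeline shape carries over to neural AA, with the generating model replacing the human author as the class label.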

The researchers aimed to measure the writing signatures of both open-source and proprietary LLMs for more transparent neural AA, using a three-step methodology. For dataset generation, five LLMs were used: GPT-3.5, GPT-4, GPT-NeoX, Llama 1, and Llama 2. GPT-3.5 and GPT-4 are proprietary; the rest are open-source. Focusing on news articles controls for domain-specific variations. The researchers adopted prompting techniques using human-authored article headlines, yielding six thousand AI-generated news articles from the chosen LLMs.
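The headline-prompting and labeling step might look like the sketch below. The prompt wording and model names are illustrative assumptions, not the study's exact prompts.

```python
# Sketch of dataset generation: each human-written headline becomes a
# generation prompt, and every output is labeled with its source model
# and category (proprietary vs. open-source).
OPEN_SOURCE = {"gpt-neox", "llama-1", "llama-2"}
PROPRIETARY = {"gpt-3.5", "gpt-4"}

def build_prompt(headline: str) -> str:
    # Hypothetical prompt template; the paper's wording may differ.
    return f"Write a news article with the headline: {headline!r}"

def label_sample(model: str, article: str) -> dict:
    category = "open-source" if model in OPEN_SOURCE else "proprietary"
    return {"model": model, "category": category, "text": article}

prompt = build_prompt("Markets rally on rate cut hopes")
sample = label_sample("llama-2", "Markets rallied today ...")
```

Each labeled sample then feeds the stylometric and classification steps described next.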

The writing signature of each LLM was assessed through stylometric features encompassing lexical, syntactic, and structural attributes. Syntactic features measure sentence length, POS frequencies, active versus passive voice, and tense; structural features quantify paragraph length, punctuation usage, and capitalization. All features were normalized before forming the final writing-signature vector.
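A minimal signature extractor along these lines is sketched below, covering an illustrative subset of the features mentioned (type-token lexical diversity, average sentence length, punctuation and capitalization rates), not the paper's full feature set.

```python
# Minimal stylometric-signature sketch: a few lexical and structural
# features of the kind described above.
import re
import string

def writing_signature(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)
    return {
        # lexical: vocabulary richness (type-token ratio)
        "lexical_diversity": len({w.lower() for w in words}) / max(len(words), 1),
        # syntactic proxy: average sentence length in words
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        # structural: punctuation and capital-letter rates
        "punct_rate": sum(c in string.punctuation for c in text) / n_chars,
        "capital_rate": sum(c.isupper() for c in text) / n_chars,
    }

sig = writing_signature("The cat sat. The cat ran!")
```

In practice each feature would be normalized across the corpus before the vectors are handed to a classifier.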

Neural AA is treated as a classification problem: a classifier learns the decision boundary between writing signatures, attributing text to a recognizable LLM source. Initially, binary classification distinguished proprietary from open-source LLMs; multiclass classification then analyzed neural authorship within each category. Data samples were balanced for each task. Classification models included XGBoost with stylometric (XGBstylo) and bag-of-words (XGBbow) features, RoBERTa model variations, and a fusion of stylometry and RoBERTa embeddings, enhancing neural AA. Finally, these embeddings were incorporated into the proposed interpretable neural AA model.
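The fusion idea (concatenating stylometry with PLM embeddings before classification) can be sketched as below. Random vectors stand in for RoBERTa embeddings and stylometry features, and logistic regression stands in for the paper's XGBoost and fine-tuned RoBERTa models; this is an assumption-laden illustration of feature-level fusion, not the study's implementation.

```python
# Feature-level fusion sketch: concatenate mock embeddings with mock
# stylometry vectors and train one classifier on the joint vector.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)  # 0 = proprietary, 1 = open-source
emb = rng.normal(size=(n, 32)) + labels[:, None] * 0.8    # mock PLM embeddings
stylo = rng.normal(size=(n, 8)) + labels[:, None] * 0.8   # mock stylometry
fused = np.hstack([emb, stylo])  # simple concatenation fusion

clf = LogisticRegression(max_iter=1000).fit(fused, labels)
train_acc = clf.score(fused, labels)
```

The appeal of this design is that the classifier can exploit complementary signals: semantic cues from the embeddings and surface-style cues from the stylometry vector.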

Analyzing proprietary and open-source LLMs for neural AA

Evaluation of proprietary and open-source LLMs was performed separately. For classification, the researchers balanced six thousand data samples, utilizing XGBoost and RoBERTa models, both fine-tuned and off-the-shelf, along with a fusion of RoBERTa embeddings and stylometry features for enhanced neural AA.

Initial Analysis: To gain deeper insight into the uniqueness and similarities among the examined LLMs, the researchers conducted t-distributed stochastic neighbor embedding (t-SNE) analyses on RoBERTa embeddings and stylometry features. Both spaces show a clear proprietary versus open-source separation, with stylometry exhibiting the more pronounced split. Overlaps occur within LLM categories: GPT-3.5 and GPT-4 overlap among proprietary LLMs, while Llama 1 and GPT-NeoX overlap among open-source ones. Intriguingly, Llama 2 aligns with proprietary LLMs in the stylometry space, possibly reflecting its reputation for engaging text generation, which narrows the gap.
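A t-SNE projection of this kind is straightforward to reproduce in sketch form; here random vectors replace the real RoBERTa/stylometry features, with the two clusters standing in for the two model categories.

```python
# t-SNE sketch: project mock signature vectors for two model categories
# into 2-D, as in the separability analysis described above.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
a = rng.normal(loc=0.0, size=(30, 16))  # e.g. "proprietary" signatures
b = rng.normal(loc=3.0, size=(30, 16))  # e.g. "open-source" signatures
X = np.vstack([a, b])

# Perplexity must stay below the sample count; 10 suits 60 points.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
```

Plotting `coords` colored by category would reveal the kind of cluster separation (or overlap) the study reports.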

Proprietary versus Open-Source Attribution: Experiments on AA reveal that initial attribution, notably with fusion, performs well. Including Llama 2 in the open-source category alone lowers performance by 7.4%, indicating growing attribution complexity. Shapley additive explanations (SHAP) analysis on XGBoost highlights lexical diversity, specific syntactic features such as prepositions, adjectives, and nouns, and structural attributes as key category discriminators.
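The paper uses SHAP on XGBoost to rank discriminative features; as a lighter-weight stand-in, the sketch below ranks features by permutation importance with a random forest. The feature names and synthetic data are illustrative, not the study's.

```python
# Feature-importance sketch (permutation importance as a stand-in for
# SHAP): rank stylometric features by how much shuffling each one hurts
# the classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)
n = 300
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 3))
X[:, 0] += y * 2.0  # make the first feature strongly predictive
names = ["lexical_diversity", "prep_freq", "paragraph_len"]

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)
top = names[int(np.argmax(imp.importances_mean))]
```

On the synthetic data the predictive feature dominates the ranking, mirroring how SHAP surfaced lexical diversity in the study.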

Intra-Category Attribution: Proprietary models, enhanced with stylometry, demonstrate strong performance scores. Conversely, open-source LLMs such as GPT-NeoX and Llama 1 exhibit reduced performance, while Llama 2 shines, showcasing a unique style distinct from its predecessors.

Conclusion

In summary, the current study delved into neural AA, differentiating proprietary and open-source LLMs. Detailed stylometric analysis unveiled distinct writing style indicators: lexical diversity, POS, and structural features. These insights improve attribution techniques and illuminate LLM evolution. Similarities in open-source models could stem from shared pre-training or architecture. However, Llama 2's uniqueness hints at open-source potential. Understanding LLM nuances becomes crucial to counter misinformation threats in the AI-generated content era.



Written by

Dr. Sampath Lonka

Dr. Sampath Lonka is a scientific writer based in Bangalore, India, with a strong academic background in Mathematics and extensive experience in content writing. He has a Ph.D. in Mathematics from the University of Hyderabad and is deeply passionate about teaching, writing, and research. Sampath enjoys teaching Mathematics, Statistics, and AI to both undergraduate and postgraduate students. What sets him apart is his unique approach to teaching Mathematics through programming, making the subject more engaging and practical for students.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Lonka, Sampath. (2023, August 17). Decoding Authorship: Unveiling AI-Generated Text Sources. AZoAi. Retrieved on July 06, 2024 from https://www.azoai.com/news/20230817/Decoding-Authorship-Unveiling-AI-Generated-Text-Sources.aspx.


