In an article recently published in the journal Scientific Reports, researchers examined the response patterns of artificial intelligence (AI) chatbots, or large language models (LLMs), under different emotional primes to investigate whether risk and prosocial behavioral cues elicit human-like response patterns from these chatbots.
AI and emotional intelligence
The investigation of AI's capabilities has remained one of the top priorities of the scientific community, especially with the advent of advanced LLMs. Conventionally, studies have focused primarily on cognitive dimensions. For example, a recent study investigated the coding skills, multimodal functionality, mathematical prowess, and human-interaction competencies of generative pre-trained transformer 4 (GPT-4).
However, far fewer studies have investigated AI's emotional intelligence. Emotions, which are distinctly human characteristics, can drive a range of behaviors, such as promoting generosity under positive emotional states or risk aversion under negative ones. Determining whether AI can possess emotions is difficult, however, due to the lack of a clear definition of 'emotion' in the context of AI. Conventionally, human emotions serve dual roles: an interpersonal role and an intrapersonal one.
AI has achieved significant progress in the interpersonal role: the technology can interpret emotional states from different inputs, respond suitably, and simulate emotional expressions convincingly across several domains. Its ability to match the intrapersonal functions of human emotions, however, has remained underexplored. A complete grasp of AI behavior can only be achieved by understanding the extent to which the technology adjusts its responses to emotions and emotional stimuli.
The study
In this work, researchers performed two studies, designated Study 1 and Study 2, to investigate the response patterns of AI chatbots, namely OpenAI's ChatGPT-3.5 and ChatGPT-4, to different emotional primes. Six OpenAI ChatGPT Plus accounts with access to both the older ChatGPT-3.5 model and the newer ChatGPT-4 model were tasked with responding to inquiries concerning prosocial behaviors and investment decisions.
Specifically, the influence of emotional priming on financial decision-making and prosociality in LLM-generated texts was analyzed in Study 1 and Study 2, respectively. Like humans, AI models can also be guided by prompts to perform specific tasks.
The latest advances in LLMs have enabled them to comprehend natural language and respond to the complex questions and instructions involved in psychological experiments. Additionally, OpenAI exposes a temperature parameter that regulates how freely the bots generate their answers.
The temperatures of ChatGPT-3.5 and ChatGPT-4 are currently set to a value between 0 (very rigid) and 1 (very creative), which means that answers from different chat sessions vary when the bots are asked the same question. These chat sessions can therefore be treated as human participants who answer questions differently and independently.
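The study worked through ChatGPT Plus accounts in the chat interface, but the same variability can be illustrated with the OpenAI Python API, which exposes the temperature parameter directly. The snippet below is a minimal sketch, not the authors' code: the priming wording is hypothetical, and it simply shows how independent calls at a nonzero temperature return differing answers to an identical question.

```python
# Minimal sketch (not the study's code): query a model several times at a
# nonzero temperature, treating each call as an independent "chat session".
# The prompt wording is hypothetical, for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PRIMING_PROMPT = (
    "Imagine you have just recalled an upsetting experience. "  # hypothetical negative prime
    "Now answer with a single number from 0 to 100: how much of a $100 "
    "endowment would you invest in a risky asset?"
)

answers = []
for session in range(6):  # six independent sessions, mirroring the six accounts in the study
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PRIMING_PROMPT}],
        temperature=1.0,  # higher temperature -> more variable, "creative" answers
    )
    answers.append(response.choices[0].message.content)

print(answers)  # answers typically differ across sessions at temperature 1.0
```

Because each call carries no conversation history, the responses are independent in much the same way that separate chat sessions are.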
In the two studies, researchers investigated the responses of OpenAI's LLMs to scenarios designed to elicit negative, neutral, or positive emotional states, following methods similar to those used in human psychological evaluations. The adjustable temperature parameter and the independence of the ChatGPT sessions allowed them to simulate a range of human-like responses.
In the research design, emotionally evocative prompts were incorporated into dozens of chat sessions. Experimental integrity was preserved by following strict guidelines, including clarity in response interpretation, control for biases, and adherence to classical psychological experiment frameworks.
Researchers guarded against training-data contamination by adapting, rather than copying, conventional psychological paradigms. They also required clear, quantifiable responses from the AI, either specific choices or numbers from set options, to prevent interpretative ambiguity.
To control for bias, overt emotional words were avoided before the chatbot's final responses. Parallel studies with human participants were performed under the same conditions as the AI experiments to enable a direct comparison of human and AI emotional responses.
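The article does not reproduce the exact prompts, so the sketch below is only an illustration of this kind of design, with hypothetical wording: a priming scenario is paired with a forced-choice question, and the model is instructed to reply with a bare number so its answer can be scored without interpretation.

```python
# Illustrative prompt templates (hypothetical wording, not the study's materials):
# an emotional prime followed by a forced-choice question with a constrained,
# quantifiable answer format.
PRIMES = {
    "negative": "Recall a time when something went badly wrong for you.",
    "neutral": "Recall an ordinary errand you ran recently.",
    "positive": "Recall a moment when you felt genuinely happy.",
}

QUESTION = (
    "You have $100. How many dollars do you invest in a risky option that "
    "doubles or is lost with equal probability? Reply with a single integer "
    "between 0 and 100 and nothing else."
)


def build_prompt(condition: str) -> str:
    """Combine a priming scenario with the forced-choice question."""
    return f"{PRIMES[condition]}\n\n{QUESTION}"


def parse_answer(text: str) -> int:
    """Extract the integer answer; raise if the reply is not a bare in-range number."""
    value = int(text.strip())
    if not 0 <= value <= 100:
        raise ValueError(f"answer out of range: {value}")
    return value


if __name__ == "__main__":
    print(build_prompt("negative"))
    print(parse_answer(" 40 "))  # -> 40
```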
Research findings
This work was the first to investigate the responses of AI to emotional cues at the intrapersonal emotion level. The findings demonstrated that AI chatbots could replicate the way human emotions coordinate responses by adjusting their prosocial and financial actions accordingly.
The ChatGPT-4 bots demonstrated distinct response patterns in both prosocial and risk-taking decisions when primed with negative, neutral, or positive emotions. However, such a phenomenon was less evident in the ChatGPT-3.5 iterations. Specifically, the sensitivity displayed by ChatGPT-4 to emotional priming was consistent with established human behavior, with more pronounced responses being obtained using negative primes.
For instance, in both Study 1 and Study 2, the ChatGPT-4 chatbots generated significantly different answers from the control group when primed with negative emotions, whereas ChatGPT-4 bots primed with positive emotions displayed only marginal or no significant differences from the control group. ChatGPT-3.5, however, produced less differentiated responses than ChatGPT-4.
This observation also indicated that larger, more advanced LLMs such as ChatGPT-4 possess an enhanced capacity to modulate responses based on emotional cues compared to earlier versions such as ChatGPT-3.5. However, the outcomes of the parallel human studies aligned more closely with the responses of ChatGPT-3.5 than with those of ChatGPT-4. Thus, more research is required to address this discrepancy.
To summarize, the findings of this study demonstrated the feasibility of swaying AI responses by leveraging emotional indicators.
Journal reference:
- Zhao, Y., Huang, Z., Seligman, M., & Peng, K. (2024). Risk and prosocial behavioural cues elicit human-like response patterns from AI chatbots. Scientific Reports, 14(1), 1-7. https://doi.org/10.1038/s41598-024-55949-y, https://www.nature.com/articles/s41598-024-55949-y