In a recent submission to the arXiv* server, researchers introduced the Explore-Generate-Simulate (EGS) framework, which leverages large language models (LLMs) to improve interpersonal communication across a variety of situations.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
Background
Communication is a fundamental aspect of human interaction, from establishing friendships to accomplishing tasks and conveying intentions. Individuals often rely on heuristics, drawing upon past experiences or seeking advice from others to inform their communication strategies. In more contemplative moments, they may simulate potential dialogues in their minds, envisioning the reactions of hypothetical listeners to guide their decision-making.
Inspired by the recently demonstrated capabilities of LLMs in simulating agents, the authors propose the EGS framework. EGS helps individuals explore communication strategies and craft candidate messages, offloading the cognitive burden of simulating audience responses.
The study draws on social psychology literature on interpersonal relationships and uses LLM agent simulations as a substitute for human mental simulation. Human preferences play a central role in the evaluation of the framework.
Cognitive psychology research suggests that a key function of long-term memory is to store past information in support of future planning. LLMs, as repositories of collective human experience, may approximate episodic future thinking by inferring the properties of an average agent with specific attributes, enabling realistic LLM-simulated audiences.
EGS framework
The EGS framework is a versatile tool designed to assist individuals in achieving their communication goals by providing candidate messages and advice. EGS operates in three steps:
Explore: In this stage, EGS prompts an LLM to generate a diverse set of advice relevant to the scenario. This advice may include affirmations or compliments. Additionally, it encourages the generation of “unorthodox but potentially helpful” advice, promoting creativity and diversity in candidate messages.
Generate: For each set of advice, EGS instructs the LLM to generate candidate messages. The Generate step combines advice in various ways, ensuring that each candidate aligns with the scenario, speaker, and goal. Multiple candidates are generated for each advice set to account for randomness.
Simulate: The Simulate step begins by having the LLM create profiles for key audiences and simulate their reactions to each candidate message. These audience profiles include descriptions and questions related to their reception of the message. The reactions are evaluated based on the likelihood and magnitude of achieving the communicator's goal, and EGS aggregates the results to determine the best candidate.
The Explore step expands the candidate-generation space by eliciting distinct pieces of advice drawn from social psychology literature, recalled heuristics, and prior experiences. It also encourages unconventional advice to increase candidate diversity.
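To make the Explore step concrete, the following is a minimal sketch of how it might be prompted. The `call_llm` helper, the `explore` function, and the prompt wording are hypothetical stand-ins for whatever chat-completion API is used; they are not the authors' actual prompts or code.

```python
# Minimal sketch of the Explore step. `call_llm` is a hypothetical
# stand-in for any chat-completion client; it is not the paper's code.
def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("plug in an LLM client here")

def explore(scenario: str, goal: str, n_normal: int = 3) -> dict:
    """Elicit conventional advice plus one unorthodox suggestion."""
    normal = [
        call_llm(
            f"Scenario: {scenario}\nGoal: {goal}\n"
            "Give one distinct, practical piece of communication advice, "
            "different from any given so far."
        )
        for _ in range(n_normal)
    ]
    unorthodox = call_llm(
        f"Scenario: {scenario}\nGoal: {goal}\n"
        "Give one unorthodox but potentially helpful piece of advice."
    )
    return {"normal": normal, "unorthodox": unorthodox}
```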
In the Generate step, EGS forms combinations of the advice produced in the Explore step to condition the generation of candidate messages. Each candidate is guided by a specific advice set, allowing flexibility in how advice is incorporated while maintaining relevance to the scenario, speaker, and goal.
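The sketch below illustrates one plausible way the Generate step could form advice sets and condition candidates on them, reusing the hypothetical `call_llm` helper from the previous sketch. The subset-enumeration scheme in `make_advice_sets` (which happens to yield ten sets for three normal and one unorthodox piece of advice) is an assumption for illustration; the paper's exact recipe may differ.

```python
from itertools import combinations

# Illustrative sketch of the Generate step; reuses the hypothetical
# `call_llm` stub defined above. The advice-set scheme is an assumption.
def make_advice_sets(normal: list[str], unorthodox: str) -> list[tuple[str, ...]]:
    """Enumerate advice sets: every subset of the normal advice (including
    the empty set), the unorthodox advice alone, and all advice together."""
    sets = [combo for r in range(len(normal) + 1)
            for combo in combinations(normal, r)]
    sets.append((unorthodox,))
    sets.append(tuple(normal) + (unorthodox,))
    return sets  # 8 + 1 + 1 = 10 sets when len(normal) == 3

def generate(scenario: str, goal: str, advice_sets, k: int = 3):
    """Sample k candidates per advice set; k > 1 accounts for randomness."""
    candidates = []
    for advice in advice_sets:
        for _ in range(k):
            prompt = (
                f"Scenario: {scenario}\nGoal: {goal}\n"
                f"Advice to follow: {'; '.join(advice) if advice else 'none'}\n"
                "Write one candidate message consistent with the scenario, "
                "speaker, and goal."
            )
            candidates.append((advice, call_llm(prompt)))
    return candidates
```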
The Simulate step involves generating audience profiles, simulating their reactions, and aggregating the results to determine the best candidate. This step also allows for more nuanced evaluations by simulating pairwise comparisons and selecting the candidate that performs best against the others.
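As a runnable toy illustration of the aggregation at the end of the Simulate step, the example below scores each candidate as a weighted sum, over audience profiles, of likelihood times magnitude. The profile names, weights, and numbers are invented for illustration; in EGS these would come from LLM-simulated reactions, and the paper additionally uses pairwise comparisons rather than only absolute scores.

```python
# Toy aggregation for the Simulate step. All numbers are invented; in
# EGS the reactions would come from LLM-simulated audience profiles.
audience_weights = {"manager": 0.5, "teammate": 0.3, "client": 0.2}

# candidate -> profile -> (likelihood of achieving the goal, magnitude)
simulated_reactions = {
    "candidate_a": {"manager": (0.8, 7), "teammate": (0.6, 6), "client": (0.7, 5)},
    "candidate_b": {"manager": (0.9, 8), "teammate": (0.5, 4), "client": (0.6, 6)},
}

def aggregate(reactions: dict[str, tuple[float, float]]) -> float:
    """Weighted sum over audiences of likelihood x magnitude."""
    return sum(
        audience_weights[profile] * likelihood * magnitude
        for profile, (likelihood, magnitude) in reactions.items()
    )

scores = {c: round(aggregate(r), 2) for c, r in simulated_reactions.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)  # {'candidate_a': 4.58, 'candidate_b': 4.92} -> candidate_b
```

Editing `audience_weights` in a sketch like this is also one way a user could exercise the stakeholder-weight controls discussed later in this article.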
EGS showcases the potential of LLMs in addressing everyday communication challenges by providing reasonable and creative suggestions. The framework's modularity enables flexible implementation in specific communication scenarios.
Human evaluations and comparative analysis
The study collected human evaluations for communication candidates generated by the EGS framework, a zero-shot Generative Pre-trained Transformer 4 (GPT-4) baseline, and GPT-4 with Chain-of-Thought (CoT) prompting across various scenarios. The evaluation assessed the quality of the framework's chosen advice and candidates relative to the baselines, as well as the agreement between human raters and GPT-4 in pairwise comparisons.
Data collection encompassed eight diverse scenarios, and human ratings were collected for candidates generated by EGS, GPT-4 zero-shot, GPT-4 CoT, and two ablated versions of EGS. In EGS, three pieces of normal advice and one piece of unorthodox advice were combined into ten distinct advice sets, with three candidates generated per set (30 candidates per scenario). EGS outperformed GPT-4 zero-shot and GPT-4 CoT in most scenarios, with average improvements of 17.1 percent and 9.3 percent, respectively. All EGS candidates received mean scores above five, indicating a positive impact on the communicator's goal.
Enhancing communication and control with EGS
The study also found significant agreement between human ratings and GPT-4 evaluations, with consistent outcomes in pairwise comparisons.
Interpretable Explanations and Accessible Alternatives: EGS excels at providing clear explanations and detailed reasoning for its candidate comparisons. Users can follow the logic behind the preferred candidate and, if they disagree with a suggestion, explore alternatives by favoring other well-performing candidates or by adjusting stakeholder weights and aggregation methods to match their preferences.
Extending User Controls: While the framework primarily relies on the LLM to generate potential audiences, users may want more control over audience members and their weights. This control lets users introduce custom audience members or modify existing profiles. Such human-in-the-loop interaction can enhance quality and alignment at the cost of increased human effort.
Further, EGS offers scalable support for delayed communication scenarios, such as email, though it faces limitations in real-time applications. It also supports counterfactual reasoning and can aid in designing study protocols, collecting data, and optimizing human-subject studies and reinforcement learning from human feedback (RLHF) comparisons. Finally, EGS promotes communication across boundaries by synthesizing diverse perspectives into responses.
Conclusion
In summary, the researchers investigated the use of LLMs to simulate audiences, introducing the three-step EGS framework: Explore, Generate, and Simulate. EGS produced high-quality advice and candidate messages, surpassing GPT-4 with CoT prompting. The findings point to the potential of leveraging cognitive science principles to enhance LLM capabilities, providing novel support for human communication and reasoning.
Journal reference:
- Preliminary scientific report.
Liu, R., Yen, H., Marjieh, R., Griffiths, T. L., and Krishna, R. (2023). Improving Interpersonal Communication by Simulating Audiences with Language Models. arXiv. DOI: https://doi.org/10.48550/arXiv.2311.00687, https://arxiv.org/abs/2311.00687