AgentOhana: Unifying Multi-Turn LLM Agent Trajectories

In an article recently submitted to the arXiv* server, researchers proposed AgentOhana, a platform for consolidating multi-turn large language model (LLM) agent trajectories drawn from heterogeneous data sources.

Study: AgentOhana: Unifying Multi-Turn LLM Agent Trajectories. Image credit: a-image/Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

Challenges in LLM-based autonomous agents

LLMs have demonstrated robust abilities in conversational AI, AI agents, mathematical reasoning, and code generation. Among these, autonomous agents powered by LLMs are attracting growing research attention. For instance, LangChain, XAgent, BOLAA, OpenAgent, and AutoGPT are recent frameworks designed to support LLM agent tasks, and they have attracted substantial interest in the open-source community.

However, many existing agents are powered primarily by closed-source LLM application programming interfaces (APIs) such as Gemini and GPT-4, as most open-source models cannot efficiently handle complex agent tasks or perform long-horizon reasoning. Recently, several efforts have been made to train open-source models instead of depending solely on commercial APIs.

Fully harnessing the potential of LLMs for agent tasks remains difficult because agent-relevant data, which typically features multi-turn trajectories, comes from different dataset collections in heterogeneous, non-standardized formats. Differences in processing methods, labeling conventions, syntaxes, and data structures across datasets complicate the fine-tuning and training of LLMs.

Specifically, the absence of standardized formats makes it hard to harmonize diverse data sources and can introduce inconsistencies and biases. Addressing these challenges requires preprocessing pipelines that ensure compatibility and unification across data formats, along with strategies that mitigate the biases caused by non-standardized representations.

The proposed approach

In this study, researchers proposed AgentOhana, a comprehensive agent data collection and training pipeline, to address these challenges. The study's objective was to establish an effective method for managing non-standardized data formats so that LLM agents perform robustly across applications, given the rising demand for diverse and comprehensive datasets.

AgentOhana can aggregate agent trajectories from different environments, spanning various scenarios. The platform can meticulously standardize and unify these trajectories into a consistent format to streamline the development of a generic data loader optimized for agent training.

AgentOhana employed specialized processes to transform the various data into a uniform format for seamless integration across several sources. Moreover, the collected data was subjected to a meticulous filtering process to ensure high-quality trajectories, which added a further layer of quality control.
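As a rough illustration of what unifying trajectories can look like, the Python sketch below converts two differently shaped records into one shared structure. The `Turn` and `Trajectory` classes, the field names, and both converter functions are hypothetical and chosen only for this example; they are not the schema used by AgentOhana.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Turn:
    """One interaction step in a unified trajectory (hypothetical schema)."""
    user_input: str    # observation or user message shown to the agent
    agent_output: str  # the agent's reply or action for this step


@dataclass
class Trajectory:
    source: str                                    # name of the originating dataset
    turns: list[Turn] = field(default_factory=list)


def from_chat_log(source: str, messages: list[dict[str, Any]]) -> Trajectory:
    """Convert a role/content message list into the unified format.

    Assumes each "assistant" message answers the latest "user" message;
    other roles and unpaired messages are skipped.
    """
    turns = []
    pending_user = None
    for msg in messages:
        if msg["role"] == "user":
            pending_user = msg["content"]
        elif msg["role"] == "assistant" and pending_user is not None:
            turns.append(Turn(pending_user, msg["content"]))
            pending_user = None
    return Trajectory(source, turns)


def from_step_records(source: str, steps: list[dict[str, Any]]) -> Trajectory:
    """Convert observation/action step records into the same unified format."""
    return Trajectory(source, [Turn(s["observation"], s["action"]) for s in steps])
```

Once both sources share the `Trajectory` shape, downstream code such as a data loader only needs to understand one format, whichever dataset a record came from.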

By unifying and standardizing the data, the proposed training pipeline maintains equilibrium across data sources and preserves independent randomness across devices during dataset partitioning and model training, which helps prevent biases from being introduced inadvertently during training.
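The two ideas in this paragraph, balancing data sources and keeping randomness independent on each device, can be sketched with the Python standard library alone. The function name `balanced_stream`, the weight dictionary, and the seed-plus-rank scheme below are illustrative assumptions, not the pipeline's actual implementation.

```python
import random
from typing import Iterator


def balanced_stream(
    sources: dict[str, list[str]],
    weights: dict[str, float],
    base_seed: int,
    device_rank: int,
) -> Iterator[tuple[str, str]]:
    """Yield (source_name, sample) pairs, choosing a source by weight at each step.

    Every device builds its own random generator from the shared base seed
    and its rank, so its draws are reproducible and independent of the
    generators used on other devices.
    """
    # Assumption: a simple seed offset per device is enough for this sketch.
    rng = random.Random(base_seed + device_rank)
    names = sorted(sources)
    probs = [weights[name] for name in names]
    while True:
        # Pick the source first so small datasets are not drowned out by large ones.
        name = rng.choices(names, weights=probs, k=1)[0]
        yield name, rng.choice(sources[name])
```

A caller would create one stream per device, for example `balanced_stream(sources, weights, base_seed=0, device_rank=rank)`, and pull training samples with `next()`. Sampling by source weight rather than by raw dataset size is one way to keep a very large dataset from dominating training.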

In the AgentOhana workflow, a homogeneous multi-turn data format was first adopted to consolidate trajectories from heterogeneous data sources. Then, a method designated as AgentRater was introduced to assess and filter agent trajectories using strong closed-source models such as ChatGPT or public models such as Mistral.

Finally, a generic data loader was adopted as a central component to enable seamless integration of different datasets into a distributed training process. The researchers also presented xLAM-v0.1, a large action model tailored for AI agents.

A supervised fine-tuning approach was adopted to improve the performance of the xLAM-v0.1 agent model, which was built on the pre-trained Mixtral-8x7B-Instruct-v0.1 model. This fine-tuning process was executed by leveraging AgentOhana's capabilities. Four benchmarks, namely MINT-Bench, ToolEval, HotpotQA, and Webshop, were used for the experimental evaluation of the model.
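The filtering step can be pictured as scoring each trajectory and keeping only those above a threshold. In the Python sketch below, `filter_trajectories` accepts any rating function; `toy_rater` is a deliberately simple stand-in for a model-based rater, and the 0 to 5 scale and the threshold in the usage note are assumptions made for illustration only.

```python
from typing import Callable


def filter_trajectories(
    trajectories: list[str],
    rate: Callable[[str], float],
    min_score: float,
) -> list[str]:
    """Keep only the trajectories whose rating meets the threshold."""
    return [t for t in trajectories if rate(t) >= min_score]


def toy_rater(trajectory: str) -> float:
    """Stand-in for a model-based rater; NOT a real quality measure.

    Scores a trajectory from 0 to 5 by counting its non-empty lines,
    capped at 5. A real rater would prompt an LLM to judge the trajectory.
    """
    steps = [line for line in trajectory.splitlines() if line.strip()]
    return float(min(len(steps), 5))
```

Calling `filter_trajectories(data, toy_rater, min_score=4.0)` keeps only the entries that `toy_rater` scores at 4 or higher. Because the rater is passed in as a function, the scoring model can be swapped without changing the filtering code.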

Study findings

xLAM-v0.1 displayed strong performance across the benchmarks. In the Webshop environment, the model consistently outperformed both GPT-3.5-Turbo-Instruct and GPT-3.5-Turbo across every agent configuration and also outperformed GPT-4-0613 in five out of six settings. Similarly, in the HotpotQA environment, xLAM-v0.1 outperformed Mixtral-8x7B-Instruct-v0.1 and GPT-3.5-Turbo in all settings, although GPT-4-0613 retained a slight performance edge over the proposed model there.

On ToolEval, xLAM-v0.1 outperformed both GPT-3.5-Turbo-0125 and TooLlama V2 across all scenarios and also outperformed GPT-4-0125-preview in two out of the three settings. In the comprehensive and challenging MINT-Bench environment, xLAM-v0.1 ranked third, outperforming the agent-based models AgentLM-70b and Lemur-70b-Chat-v1 as well as the general LLMs GPT-3.5-Turbo-0613 and Claude-2. This result indicated the model's strong capability to handle multi-turn interactions and complex task resolution.

To summarize, the findings of this study demonstrated that AgentOhana can effectively address the inherent challenges of consolidating diverse multi-turn LLM agent trajectory data.


Journal reference:

Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use the following format to cite this article in your essay, paper or report:

    Dam, Samudrapom. (2024, February 29). AgentOhana: Unifying Multi-Turn LLM Agent Trajectories. AZoAi. Retrieved on November 24, 2024 from https://www.azoai.com/news/20240229/AgentOhana-Unifying-Multi-Turn-LLM-Agent-Trajectories.aspx.
