LG's EXAONE 3.5 Sets New Standards in Generative AI

Discover how EXAONE 3.5 sets new benchmarks in AI performance, strengthens bilingual (English and Korean) capabilities, and advances ethical technology for real-world applications.

Research: EXAONE 3.5: Series of Large Language Models for Real-world Use Cases. Image Credit: Krot_Studio / Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

In an article recently submitted to the arXiv preprint* server, researchers at LG artificial intelligence (AI) Research presented EXAONE 3.5, a family of instruction-tuned large language models (LLMs) in three sizes: 32 billion (32B), 7.8B, and 2.4B parameters. These models showcase exceptional instruction-following ability, robust long-context comprehension, and competitive performance across benchmarks. The models are open-access for research, with commercial licensing available through LG AI Research.

Background

LLMs have revolutionized natural language processing (NLP), with EXAONE 3.0 showcasing strong bilingual capabilities in Korean and English, excelling in instruction-following and real-world applications. Despite its success, challenges remain in catering to diverse user needs.

Researchers require smaller, efficient models for deployment on low-specification graphics processing units (GPUs), while industries demand larger, cost-effective, high-performance models. Additionally, the rise of retrieval-augmented generation (RAG) has emphasized the need for models capable of processing longer contexts effectively.
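To make the RAG motivation concrete, the sketch below shows the basic retrieve-then-prompt pattern that benefits from a long context window. The bag-of-words scoring function and the characters-per-token budget heuristic are illustrative assumptions, not details from the paper; a production system would use a dense embedding model for retrieval.

```python
# Minimal sketch of retrieve-then-prompt RAG. Scoring is a toy bag-of-words
# cosine similarity, standing in for a real embedding model.
from collections import Counter
import math

def score(query: str, passage: str) -> float:
    """Cosine similarity over simple term counts (toy stand-in for embeddings)."""
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in p.values()))
    return dot / norm if norm else 0.0

def build_prompt(query: str, corpus: list[str], k: int = 3, budget_tokens: int = 32_000) -> str:
    """Select the top-k passages and assemble a prompt that fits a 32K-token window."""
    ranked = sorted(corpus, key=lambda p: score(query, p), reverse=True)[:k]
    context = "\n\n".join(ranked)
    # Crude length guard: roughly 4 characters per token on average (assumption).
    context = context[: budget_tokens * 4]
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "EXAONE 3.5 supports 32K-token contexts.",
    "The 2.4B model targets small GPUs.",
    "Seoul is the capital of South Korea.",
]
print(build_prompt("What context length does EXAONE 3.5 support?", corpus, k=2))
```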

To address these gaps, LG AI Research introduced EXAONE 3.5, a collection of instruction-tuned models ranging from 2.4B to 32B parameters. These models were designed to balance performance and scalability, supporting long-context comprehension of up to 32 thousand (32K) tokens.
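For readers who want to try the models, the following is a minimal sketch of loading a checkpoint with the Hugging Face transformers library. The repository id below is an assumption based on LG AI Research's Hugging Face organization; consult the official model cards for exact names, code requirements, and license terms.

```python
# Hedged sketch: load an EXAONE 3.5 instruct checkpoint and generate a reply.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the 2.4B model targets modest GPUs
    device_map="auto",
    trust_remote_code=True,      # assumed: the repo ships a custom model class
)

messages = [{"role": "user", "content": "Summarize EXAONE 3.5 in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```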

With improvements across real-world and general benchmarks, EXAONE 3.5 set new standards to empower researchers and developers to advance generative AI and create impactful applications aligned with LG AI Research’s mission of enhancing human life. In particular, the smaller 2.4B model delivered performance comparable to larger models in many use cases, addressing growing demand for smaller yet powerful LLMs.

Model Training

The EXAONE 3.5 LLMs are a collection of instruction-tuned models built on a state-of-the-art decoder-only transformer architecture and designed to balance performance with resource efficiency. Key advancements include extending the maximum context length from 4,096 to 32,768 tokens using long-context fine-tuning, with replay-based methods used to mitigate catastrophic forgetting.
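The replay idea can be illustrated with a simple data-mixing sketch: long-context fine-tuning batches also draw a fraction of samples from the earlier general-domain corpus, so new capabilities do not overwrite old ones. The 20% replay ratio below is an illustrative assumption, not a value reported in the paper.

```python
# Hedged sketch of replay-based mixing for long-context fine-tuning.
import random

def build_mixture(long_context_data: list, general_replay_data: list,
                  replay_ratio: float = 0.2, n_samples: int = 10_000,
                  seed: int = 0) -> list:
    """Sample a training mix that replays general-domain data at `replay_ratio`."""
    rng = random.Random(seed)
    batch = []
    for _ in range(n_samples):
        pool = general_replay_data if rng.random() < replay_ratio else long_context_data
        batch.append(rng.choice(pool))
    return batch

mix = build_mixture(["<32K-token document QA sample>"],
                    ["<short general-domain sample>"], n_samples=10)
print(mix)
```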

This approach allowed the models to process and retrieve information from extended text lengths, as demonstrated in Needle-in-a-Haystack benchmarks. The models were pre-trained in two stages: general-domain training and targeted enhancement for long-context understanding. A rigorous decontamination process ensured unbiased evaluations by removing contaminated data from the training set.
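A Needle-in-a-Haystack evaluation can be approximated in a few lines: hide a "needle" fact at a chosen depth inside filler text, query the model, and check whether the answer is recovered. The harness below is a simplified sketch; the `generate` callable is a placeholder for any model-completion function, and the filler and scoring are cruder than the published benchmark.

```python
# Hedged sketch of a Needle-in-a-Haystack probe.
def make_haystack(needle: str, filler: str, depth: float, target_chars: int) -> str:
    """Place the needle at `depth` (0.0 = start, 1.0 = end) of ~target_chars of filler."""
    body = (filler * (target_chars // len(filler) + 1))[:target_chars]
    cut = int(len(body) * depth)
    return body[:cut] + " " + needle + " " + body[cut:]

def niah_trial(generate, needle: str, question: str, answer: str,
               depth: float, target_chars: int = 120_000) -> bool:
    """Run one trial; score by checking the answer appears in the completion."""
    haystack = make_haystack(needle, "The sky was grey over the harbor. ", depth, target_chars)
    prompt = f"{haystack}\n\nQuestion: {question}\nAnswer:"
    return answer.lower() in generate(prompt).lower()

# Example with a stub "model" that happens to recall the needle:
passed = niah_trial(lambda p: "The code word is plum.",
                    needle="The secret code word is plum.",
                    question="What is the secret code word?",
                    answer="plum", depth=0.5)
print(passed)  # True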

EXAONE 3.5 models delivered competitive performance while requiring lower computational costs than similar-sized models. Post-training processes, including supervised fine-tuning and preference optimization, further enhanced instruction-following ability and alignment with human preferences. The models were trained on a diverse and evolving instruction-response dataset derived from core knowledge spanning multiple domains and levels of complexity.
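The article summarized here does not spell out the preference optimization recipe, but Direct Preference Optimization (DPO) is a common instantiation and illustrates the idea: push the policy to prefer chosen over rejected responses relative to a frozen reference model. The PyTorch sketch below computes the standard DPO loss from precomputed sequence log-probabilities and should be read as illustrative rather than as LG AI Research's exact method.

```python
# Hedged sketch of the DPO loss on precomputed sequence log-probabilities.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """-log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))), averaged over the batch."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(beta * margin).mean()

# Toy tensors standing in for summed token log-probs of chosen/rejected replies.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss.item())
```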

Additionally, LG AI Research adhered to strict AI compliance standards to address legal risks associated with data usage. These processes included de-identification methods and compliance checks aligned with LG’s AI Ethics Principles to minimize potential misuse and ensure privacy protection. These advancements position EXAONE 3.5 as a scalable, efficient solution for varied user needs, supporting applications ranging from resource-constrained deployments to high-performance tasks.

Evaluation

The evaluation of EXAONE 3.5 LLMs highlighted their performance across various benchmarks, categorized into real-world use cases, long-context understanding, and general domains. The models were assessed against recently released open language models, demonstrating competitive or superior performance across multiple categories.

In real-world use cases, EXAONE 3.5 models excelled in understanding diverse user instructions, outperforming similar-sized baselines in most benchmarks. Their bilingual capabilities in English and Korean were also noteworthy. For long-context tasks, these models showed robust abilities in retrieving and processing information from inputs up to 32K tokens, achieving near-perfect accuracy in tests like Needle-in-a-Haystack and outperforming baselines in long-context understanding benchmarks. In one evaluation, EXAONE 3.5 achieved a 68% retrieval success rate across extensive contexts, surpassing most competitors.

In general domains, which included mathematical problem-solving, coding, and knowledge assessments, EXAONE 3.5 models delivered competitive performance. Notably, the 2.4B model outperformed larger baselines, reflecting its efficiency and utility in scenarios where smaller models are preferred. Detailed results, including specific performance on tasks such as GSM8K and MATH, showcased the models' versatility.
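As background on how such scores are typically computed, GSM8K-style accuracy is usually measured by extracting the final number from the model's worked answer and comparing it with the reference. The regex heuristic below reflects a typical harness, not the paper's exact evaluation code.

```python
# Hedged sketch of GSM8K-style exact-match scoring.
import re

def extract_final_number(text: str):
    """Return the last number in the completion, e.g. the final answer line."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(nums[-1]) if nums else None

def gsm8k_accuracy(predictions: list[str], references: list[float]) -> float:
    hits = sum(extract_final_number(p) == r for p, r in zip(predictions, references))
    return hits / len(references)

print(gsm8k_accuracy(["... so the answer is 42."], [42.0]))  # 1.0
```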

Ensuring Ethical and Responsible AI

The EXAONE 3.5 language models were developed with a Responsible AI framework emphasizing ethics, safety, transparency, and accountability. The models adhered to LG AI Ethics Principles, ensuring data quality, fairness, and regulatory compliance to maximize societal benefits while mitigating risks. Available in diverse sizes (2.4B, 7.8B, and 32B), they cater to various research and application needs, advancing generative AI through reliable and flexible tools.

To address risks, extensive assessments identified challenges like potential bias, harmful outputs, and misuse. Mitigation strategies included rigorous dataset reviews, de-identification processes, standardized pre-processing protocols, and monitoring global AI regulations.

The models' performance on the Korean Large Language Model (LLM) Trustworthiness dataset demonstrated the effectiveness of these mitigation strategies, with the 32B model achieving over 87% accuracy on ethical benchmarks. Even so, explainability and transparency remain ongoing research priorities.

Limitations included the occasional generation of biased, inappropriate, or outdated responses due to the nature of training data. LG AI Research is committed to reducing risks and prohibits malicious use of the models, striving for safe and responsible AI deployment.

Conclusion

EXAONE 3.5 significantly advanced generative AI, balancing performance, scalability, and ethical considerations. With diverse model sizes (2.4B, 7.8B, and 32B), it addressed varied user needs, excelling in instruction-following, long-context understanding, and general domain applications.

Rigorous evaluations and responsible AI principles ensured reliability, fairness, and transparency, while ongoing research addressed limitations like bias and explainability. Open for research and available for commercial licensing, EXAONE 3.5 empowers researchers and industries to innovate and create impactful applications.

The comprehensive ethical compliance and advanced evaluation benchmarks underscore the models’ reliability and their potential to set new standards in AI development. LG AI Research invites feedback and collaboration to enhance these models and advance the field of AI responsibly.


Source:
Journal reference:
  • Preliminary scientific report. LG AI Research, An, S., Bae, K., Choi, E., Choi, K., Choi, S. J., Hong, S., Hwang, J., Jeon, H., Jo, G. J., Jo, H., Jung, J., Jung, Y., Kim, H., Kim, J., Kim, S., Kim, S., Kim, S., Kim, Y., . . . Yun, H. (2024). EXAONE 3.5: Series of Large Language Models for Real-world Use Cases. arXiv. https://arxiv.org/abs/2412.04862

Written by

Soham Nandi

Soham Nandi is a technical writer based in Memari, India. His academic background is in Computer Science Engineering, specializing in Artificial Intelligence and Machine Learning. He has extensive experience in Data Analytics, Machine Learning, and Python, and has worked on group projects involving Computer Vision, Image Classification, and App Development.

