Multilingual AI Gets Smarter: Introducing Xmodel-1.5

Built on advanced architecture and fine-tuned for e-commerce, Xmodel-1.5 showcases breakthrough performance in low-resource languages, setting a new benchmark in global AI innovation.

Research: Xmodel-1.5: An 1B-scale Multilingual LLM. Image Credit: Krot_Studio / Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

In an article submitted to the arXiv preprint* server, researchers at Xiaoduo AI introduced Xmodel-1.5, a 1-billion-parameter multilingual large language model pre-trained on approximately 2 trillion tokens.

The model performed competitively in Thai, Arabic, French, Chinese, and English. The researchers also released a Thai evaluation dataset annotated by students from Chulalongkorn University. The dataset comprises 359 culturally and linguistically accurate samples, focusing on polite and contextually appropriate responses.

While promising, the results also highlighted areas for improvement that could guide further multilingual artificial intelligence (AI) research. For instance, the model struggled with Thai slang, gender differentiation, and tone distinctions, occasionally producing unnatural responses.

Related Work

Past work on multilingual large language models (LLMs) has focused on addressing natural language processing (NLP) challenges across diverse languages, in both high-resource and low-resource contexts.

Notable models include the cross-lingual language model based on RoBERTa (XLM-R), the multilingual text-to-text transfer transformer (mT5), and the polyglot large language model (PolyLM), which set benchmarks for multilingual AI.

XLM-R demonstrated robust generalization to low-resource languages, while mT5 excelled in cross-lingual understanding and generation tasks.

PolyLM utilized bilingual data and curriculum learning, achieving strong performance, particularly in lower-resource languages like Thai and Indonesian.

Multilingual Data Integration

The pretraining of Xmodel-1.5 used a diverse multilingual corpus that emphasized low-resource languages drawn from MultiWiki and CulturaX. The data mix evolved during training, with the multilingual share rising from 5% to 10% over 600,000 iterations to improve low-resource language coverage.
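The article does not give the exact sampling schedule, but a minimal sketch of how such a ratio ramp could be implemented, assuming a simple linear increase over the 600,000 iterations, is shown below; the linear shape and the corpus labels are assumptions, not the authors' recipe.

```python
import random

# Hypothetical sketch: ramp the share of multilingual (low-resource) samples
# from 5% to 10% linearly over 600,000 training iterations, as described in
# the article. The actual schedule used for Xmodel-1.5 may differ.
TOTAL_ITERS = 600_000

def multilingual_ratio(step: int, start: float = 0.05, end: float = 0.10) -> float:
    """Linear ramp of the multilingual data fraction over training."""
    frac = min(step / TOTAL_ITERS, 1.0)
    return start + (end - start) * frac

def sample_source(step: int) -> str:
    """Pick a corpus for the next batch according to the current ratio."""
    return "multilingual" if random.random() < multilingual_ratio(step) else "english"

# Example: ratio at the midpoint of training
print(multilingual_ratio(300_000))  # 0.075
```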

A unigram tokenizer with a 65,280-token vocabulary was developed to balance efficiency and linguistic coverage, outperforming other tokenizers in compression. It utilized byte fallback and character coverage settings to handle rare tokens, ensuring adaptability for low-resource languages and code generation.
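As an illustration of this kind of tokenizer setup, the sketch below trains a SentencePiece unigram model with the reported 65,280-token vocabulary and byte fallback; the corpus path and character-coverage value are assumptions rather than the authors' exact configuration.

```python
import sentencepiece as spm

# Illustrative sketch only: trains a unigram tokenizer in the spirit of the
# Xmodel-1.5 setup (65,280-token vocabulary, byte fallback for rare tokens).
spm.SentencePieceTrainer.train(
    input="multilingual_corpus.txt",   # hypothetical training corpus
    model_prefix="xmodel_unigram",     # hypothetical output name
    model_type="unigram",              # unigram LM tokenizer, as in the article
    vocab_size=65280,                  # vocabulary size reported for Xmodel-1.5
    byte_fallback=True,                # fall back to bytes for unseen characters
    character_coverage=0.9995,         # assumed value for broad script coverage
)

sp = spm.SentencePieceProcessor(model_file="xmodel_unigram.model")
print(sp.encode("สวัสดีครับ", out_type=str))  # tokenize a Thai greeting
```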

The architecture integrated rotary positional embeddings, root mean square normalization (RMSNorm), the Swish-gated linear unit (SwiGLU), and grouped-query attention for improved context understanding and training efficiency.
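For readers unfamiliar with these components, the following minimal PyTorch sketch shows RMSNorm and a SwiGLU feed-forward block; the hidden sizes are placeholders, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root mean square normalization: scales by the RMS instead of mean/variance."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x * rms

class SwiGLU(nn.Module):
    """Swish-gated linear unit: SiLU(gate(x)) * up(x), projected back down."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(torch.nn.functional.silu(self.gate(x)) * self.up(x))

x = torch.randn(2, 16, 2048)                  # (batch, sequence, hidden) example
y = SwiGLU(2048, 5632)(RMSNorm(2048)(x))      # placeholder dimensions
print(y.shape)                                # torch.Size([2, 16, 2048])
```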

The training utilized 7 H800 GPUs, AdamW optimization, and a cosine learning rate schedule across 600,000 iterations, processing over 2 trillion tokens.
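A hedged sketch of that optimization recipe, AdamW with a cosine learning-rate schedule over 600,000 iterations, might look as follows; the peak learning rate, weight decay, and toy model are placeholders, not values reported in the paper.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Sketch of AdamW plus cosine decay over 600,000 iterations. Hyperparameter
# values other than the iteration count are assumptions.
model = torch.nn.Linear(2048, 2048)     # stand-in for the 1B-parameter model
optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
scheduler = CosineAnnealingLR(optimizer, T_max=600_000)

for step in range(3):                   # training loop skeleton
    optimizer.zero_grad()
    loss = model(torch.randn(4, 2048)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())
```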

E-commerce RAG Fine-tuning

Instruction fine-tuning enhanced the model's performance on e-commerce retrieval-augmented generation (RAG) tasks using a comprehensive instruction dataset. The fine-tuning setup used a learning rate of 6e-5, a weight decay of 0.1, a warmup ratio of 0.03, and a batch size of 120.
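Mapped onto Hugging Face TrainingArguments, those hyperparameters might look like the sketch below; only the learning rate, weight decay, warmup ratio, and effective batch size of 120 come from the article, while the remaining settings are assumptions.

```python
from transformers import TrainingArguments

# Hedged sketch of the reported fine-tuning hyperparameters. Output path,
# epoch count, scheduler, and the batch-size split are assumptions.
args = TrainingArguments(
    output_dir="xmodel15-ecommerce-sft",   # hypothetical output path
    learning_rate=6e-5,                    # reported learning rate
    weight_decay=0.1,                      # reported weight decay
    warmup_ratio=0.03,                     # reported warmup ratio
    per_device_train_batch_size=15,        # 15 x 8 accumulation steps = 120 (assumed split)
    gradient_accumulation_steps=8,
    num_train_epochs=2,                    # assumed
    lr_scheduler_type="cosine",            # assumed to match pretraining
)
print(args.learning_rate, args.warmup_ratio)
```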

Progressive dataset construction integrated RAG and retrieval-augmented fine-tuning (RAFT) datasets, with significant contributions from Belle (56.04%) and other sources. This approach achieved a 92.47% satisfaction rate on e-commerce evaluations, as assessed by GPT-4o mini.

Xmodel Evaluation Insights

To ensure a fair comparison, Xmodel-1.5 was evaluated against several prominent decoder-only models, each with around 1 billion parameters. These included open pretrained transformers (OPT), Pythia, TinyLLaMA, MobileLLaMA, H2O-Danube, InternLM2, and Qwen2.5.

The evaluation focused on commonsense reasoning tasks using the LM Evaluation Harness, covering datasets such as the AI2 reasoning challenge (ARC), ARC-Easy, BoolQ, HellaSwag, OpenBookQA, physical interaction question answering (PIQA), science question answering (SciQ), and Winogrande.
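A reproduction of this commonsense-reasoning evaluation could be scripted with the harness's Python API along the following lines; the model identifier is a placeholder for the released Xmodel-1.5 checkpoint, and the exact API and metric keys may vary with the harness version.

```python
import lm_eval

# Hedged sketch using the EleutherAI LM Evaluation Harness (v0.4+ Python API).
# The pretrained path below is a placeholder, not a confirmed repository id.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=PLACEHOLDER/Xmodel-1.5",  # hypothetical checkpoint id
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag",
        "openbookqa", "piqa", "sciq", "winogrande",
    ],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"))  # metric key names may differ by version
```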

Results highlighted Xmodel-1.5's competitive performance, surpassing models like TinyLLaMA on multiple metrics. However, some models, such as Qwen2.5, outperformed Xmodel-1.5 in overall accuracy.

To assess multilingual capabilities, Xmodel-1.5 was tested on translated datasets, including ARC (Chinese), XCOPA (11 languages), PIQA_AR (Arabic), Belebele_tha_thai (Thai), multilingual massive multitask language understanding (mMMLU), and mHellaSwag.

These datasets evaluated the model's reasoning, comprehension, and knowledge across various languages and domains. Comparative results showed the model's strengths and limitations in multilingual tasks, with performance insights detailed in case studies. For example, Xmodel-1.5 achieved a Belebele_tha_thai score of 0.2756, outperforming PolyLM-1.7B.

Instruction-following performance was evaluated using benchmarks such as IFEval and MT-Bench, which measure language understanding and multi-turn dialogue capabilities. Xmodel-1.5-Instruct demonstrated moderate proficiency in these areas.

Further evaluation included diverse tasks encompassing open-domain question answering, machine translation, e-commerce, and cultural nuances. This evaluation revealed the model's adaptability across domains and highlighted areas for improvement, such as slang understanding and cultural context.

Overall, the results emphasize Xmodel-1.5's capabilities and areas needing refinement for enhanced instruction and multilingual performance.

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

Source:
Journal reference:
  • Preliminary scientific report. Qun, W., Yang, L., Qingquan, L., & Ling, J. (2024). Xmodel-1.5: An 1B-scale Multilingual LLM. ArXiv. https://arxiv.org/abs/2411.10083

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2025, January 08). Multilingual AI Gets Smarter: Introducing Xmodel-1.5. AZoAi. Retrieved on January 09, 2025 from https://www.azoai.com/news/20250108/Multilingual-AI-Gets-Smarter-Introducing-Xmodel-15.aspx.

  • MLA

    Chandrasekar, Silpaja. "Multilingual AI Gets Smarter: Introducing Xmodel-1.5". AZoAi. 09 January 2025. <https://www.azoai.com/news/20250108/Multilingual-AI-Gets-Smarter-Introducing-Xmodel-15.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "Multilingual AI Gets Smarter: Introducing Xmodel-1.5". AZoAi. https://www.azoai.com/news/20250108/Multilingual-AI-Gets-Smarter-Introducing-Xmodel-15.aspx. (accessed January 09, 2025).

  • Harvard

    Chandrasekar, Silpaja. 2025. Multilingual AI Gets Smarter: Introducing Xmodel-1.5. AZoAi, viewed 09 January 2025, https://www.azoai.com/news/20250108/Multilingual-AI-Gets-Smarter-Introducing-Xmodel-15.aspx.
