Abstract
We introduce Xmodel-1.5, a novel 1-billion-parameter multilingual large model pretrained on approximately 2 trillion tokens. The model demonstrates strong performance across several languages, with particularly notable results in Thai, Arabic, and French, alongside its effectiveness in Chinese and English. In addition, we contribute to the research community by releasing a Thai evaluation dataset, which includes hundreds of questions annotated by students from Chulalongkorn University's School of Integrated Innovation. While the results are promising, we acknowledge that there is still room for improvement. We hope this work advances ongoing efforts in multilingual AI research and promotes better cross-linguistic understanding in various natural language processing tasks. Our models and code are publicly available on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models (2024)
- Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus (2024)
- Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language (2024)
- Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs (2024)
- EuroLLM: Multilingual Language Models for Europe (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper