OpenLLM-Ro

The goal of OpenLLM-Ro is to bring together the Romanian community that builds open Romanian models and to collect these models in a single place.

We value:

  • public and open corpora
  • open-source training and evaluation code

In this organization, you can find RoLLM models, based on different underlying models and in different flavours (i.e., foundational, instruct, or chat variants). There are currently four model collections:

  • RoLlama2: Romanian models based on Llama2
  • RoMistral: Romanian models based on Mistral
  • RoGemma: Romanian models based on Gemma
  • RoLlama3: Romanian models based on Llama3

Here you can also find the data used for training and evaluating LLMs in Romanian. Currently, there are two data collections:

  • SFT datasets: data used for supervised (instruction) finetuning
  • Evaluation datasets: data used for evaluating LLMs in Romanian

For details, see https://arxiv.org/abs/2406.18266 and https://arxiv.org/abs/2405.07703.

We encourage the community to engage in discussions (to provide feedback, ask questions, or suggest improvements) on Hugging Face or GitHub.

We will also organize in-person meetings (announced in advance) to brainstorm ideas and discuss the roadmap and other technical aspects.