Aya Expanse 8B French
Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. It focuses on pairing a highly performant pre-trained Command family of models with the result of a year’s dedicated research from Cohere For AI, including data arbitrage, multilingual preference training, safety tuning, and model merging. The result is a powerful multilingual large language model serving 23 languages.
We cover 23 languages: Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese.
This model card corresponds to the 8-billion-parameter version of the Aya Expanse model. We also released a 32-billion-parameter version, which you can find here.
- Developed by: Cohere For AI
- Point of Contact: Cohere For AI: cohere.for.ai
- License: CC-BY-NC; use also requires adhering to C4AI's Acceptable Use Policy
- Model: Aya Expanse 8B
- Model Size: 8 billion parameters
Model description
This version was produced by supervised fine-tuning (SFT) with the Hugging Face ecosystem (see the framework versions listed below).
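A minimal inference sketch follows; it assumes this repository contains standalone (merged) weights and that the standard transformers chat-template API applies. The French prompt is only an illustration.

```python
# Minimal inference sketch (assumes merged weights and the standard
# transformers chat-template API; adjust dtype/device settings as needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Svngoku/Aya-Expanse-8B-French"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Format a French prompt with the model's chat template.
messages = [{"role": "user", "content": "Explique la photosynthèse en termes simples."}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```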
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: paged_adamw_32bit (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 3
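As a reference, the sketch below maps these hyperparameters onto Hugging Face TrainingArguments. The output directory is a placeholder, and the exact trainer class and dataset used for this fine-tune are not documented in this card.

```python
# Hypothetical mapping of the hyperparameters above onto transformers TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="aya-expanse-8b-french",  # placeholder path
    learning_rate=1e-3,                  # learning_rate: 0.001
    per_device_train_batch_size=16,      # train_batch_size: 16
    per_device_eval_batch_size=8,        # eval_batch_size: 8
    gradient_accumulation_steps=2,       # 16 * 2 = total_train_batch_size: 32
    seed=42,
    optim="paged_adamw_32bit",           # requires bitsandbytes
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    num_train_epochs=3,
)
```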
Training results
Framework versions
- PEFT 0.13.2
- Transformers 4.46.0
- Pytorch 2.1.1+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1
Model tree for Svngoku/Aya-Expanse-8B-French
- Base model: CohereForAI/aya-expanse-8b
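Because the framework versions list PEFT and the base model is CohereForAI/aya-expanse-8b, this repository may host a PEFT adapter rather than fully merged weights; check the repository files to confirm which case applies. If it is an adapter, it could be loaded over the base model roughly as follows (a sketch, assuming a LoRA-style adapter):

```python
# Sketch for the adapter case: load the base model, then attach the fine-tuned adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "CohereForAI/aya-expanse-8b"
adapter_id = "Svngoku/Aya-Expanse-8B-French"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Optionally merge a LoRA-style adapter into the base weights for faster inference.
model = model.merge_and_unload()
```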