---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---
---
## FractalGPT/RuQwen2.5-3b-instruct
---
### Model Overview
- **RuQwen2.5-3b-instruct** by FractalGPT is a language model tailored to deliver high-quality Russian-language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.
- **Improved Russian Language Quality**: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.
### Model Specifications
- **Type**: Instruction-tuned Causal Language Model
- **Training Stages**: Pretraining & Instruction Tuning
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 16 for Q, 2 for KV
- **Context Length**: Supports a full context of 32,768 tokens and generation of up to 8,192 tokens