---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---
---
## FractalGPT/RuQwen2.5-3b-instruct
---
### Model Overview
- **RuQwen2.5-3b-instruct** by FractalGPT is a language model tailored to deliver high-quality Russian-language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.
- **Improved Russian Language Quality**: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.
### Model Specifications
- **Type**: Instruction-tuned Causal Language Model
- **Training Stages**: Pretraining & Instruction Tuning
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 16 for Q, 2 for KV
- **Context Length**: Supports a full context of 32,768 tokens and generation of up to 8,192 tokens