---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

---

## FractalGPT/RuQwen2.5-3b-instruct

---

### Model Overview

- **RuQwen2.5-3b-instruct** by FractalGPT is a language model tailored to deliver high-quality Russian-language output. Built on the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.

- **Improved Russian Language Quality**: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.

### Model Specifications

- **Type**: Instruction-tuned Causal Language Model
- **Training Stages**: Pretraining & Instruction Tuning
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 24 for Q, 4 for KV
- **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
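### Quickstart

Below is a minimal usage sketch following the standard `transformers` chat-template workflow used for Qwen2.5-based instruct models. The prompt text and generation settings are illustrative assumptions, not prescriptions from this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FractalGPT/RuQwen2.5-3b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick an appropriate dtype for the available hardware
    device_map="auto",
)

# Illustrative Russian-language prompt; replace with your own task.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Расскажи коротко о Москве."},
]

# Build the chat prompt with the model's chat template.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)

# Strip the prompt tokens before decoding the generated reply.
generated = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```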