---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

---

## FractalGPT/RuQwen2.5-3b-instruct

---

### Model Overview

- **RuQwen2.5-3b-instruct** by FractalGPT is a language model tailored to deliver high-quality Russian-language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.
- **Improved Russian Language Quality**: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.

### Model Specifications

- **Type**: Instruction-tuned causal language model
- **Training Stages**: Pretraining & instruction tuning
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 24 for Q, 4 for KV
- **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
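
### Quickstart

Below is a minimal usage sketch for loading the model with 🤗 Transformers and generating a reply through the chat template. The repository id is taken from this card's title, and the prompt text and generation settings are only illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from the card title; adjust if the model is published under a different name.
model_id = "FractalGPT/RuQwen2.5-3b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically when supported
    device_map="auto",    # place the model on available GPU(s) or fall back to CPU
)

# Build a chat prompt with the model's chat template (the model is instruction-tuned).
messages = [
    {"role": "system", "content": "Ты — полезный ассистент."},          # "You are a helpful assistant."
    {"role": "user", "content": "Кратко объясни, что такое трансформер."},  # "Briefly explain what a transformer is."
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response; max_new_tokens here is illustrative (the model supports up to 8,192 generated tokens).
output_ids = model.generate(input_ids, max_new_tokens=512)
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```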