Maestro-R1-Llama-8B

Created by suayptalha

Model Information

Base model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B · 8B parameters
Maestro-R1-Llama-8B is a language model fine-tuned from DeepSeek-R1-Distill-Llama-8B, a Llama-3-based model produced by distilling DeepSeek-R1 on a large corpus of diverse data. The distillation lets the model retain strong reasoning capabilities while keeping a smaller parameter count.
Maestro-R1-Llama-8B builds on this foundation with further fine-tuning on the ServiceNow-AI/R1-Distill-SFT dataset. This fine-tuning sharpens the model's handling of specialized tasks and improves its reasoning, problem-solving, and code generation. The combination of a distilled base model and domain-specific fine-tuning makes Maestro-R1-Llama-8B efficient and robust across a wide range of language tasks.
DeepSeek-R1 Paper Link: https://arxiv.org/abs/2501.12948
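
A minimal usage sketch with the Hugging Face transformers library follows. The model ID comes from this page; the chat-template call assumes the model inherits the chat template of its DeepSeek-R1-Distill-Llama-8B base, and the prompt is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Maestro-R1-Llama-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the FP16 tensor type listed below
    device_map="auto",
)

# Build a chat-style prompt; assumes the tokenizer ships a chat template
# (as the DeepSeek-R1 distilled models do).
messages = [{"role": "user", "content": "Explain the Pythagorean theorem step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```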

Loss Graph

[Training loss curve image]
Format: Safetensors · Model size: 8.03B params · Tensor type: FP16

Model tree for suayptalha/Maestro-R1-Llama-8B

Fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B · Quantizations: 2 models
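
For constrained hardware, on-the-fly 4-bit loading with bitsandbytes is one option. This is a generic transformers quantization sketch, not one of the two quantized repos listed above (which this page does not name).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Generic 4-bit quantized loading; an assumption, not a listed quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/Maestro-R1-Llama-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
```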

Dataset used to train suayptalha/Maestro-R1-Llama-8B: ServiceNow-AI/R1-Distill-SFT
