Zenyx-DeepSeek-220M: Macro-Reasoning in Nano-Scale

Zenyx-DeepSeek-220M is a nano-scale language model (220M parameters) engineered to replicate the "System 2" reasoning usually found in models 100x its size. Built from scratch with JAX/Flax on TPU v5e, it was fine-tuned on a large corpus of distilled DeepSeek-R1 reasoning traces.

This model demonstrates that even small parameter counts can learn complex Chain of Thought (CoT) structures when trained on high-density reasoning data.

🧠 Model Description

  • Model Type: Causal Language Model (Nano-Reasoner)
  • Architecture: Custom Llama-style (RoPE, SwiGLU, RMSNorm, Grouped Query Attention)
  • Parameters: 220 Million
  • Context Window: 2048 Tokens
  • Tokenizer: Qwen 2.5 (151,650 Vocab) + Special Reasoning Tokens
  • Training Data: ~3.15 Million Samples (80GB)
  • Final Loss: 1.805 (Exceptional convergence for this scale)
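
For readers who prefer perplexity to raw loss, the figure above can be converted with a one-line calculation. This is a minimal sketch that assumes the reported final loss is a mean per-token cross-entropy in nats (the card does not state the unit); the function name `perplexity` is illustrative.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


final_loss = 1.805  # value reported in the model description
ppl = perplexity(final_loss)
print(f"{ppl:.2f}")  # prints 6.08
```

Under that assumption, the model is on average about as uncertain as a uniform choice among roughly six tokens on its training distribution.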

💡 Intended Use & "Thinking" Mode

Zenyx is designed to think before it speaks. Unlike standard chat models that start answering immediately, Zenyx is trained to first write out an explicit reasoning trace, delimited by special control tokens, and only then give its answer.

How to Prompt

To get the best results, use the ChatML format and end the prompt with an opening <think> tag so that the model begins its reply with a reasoning block.

Format:

<|im_start|>user
{Question}<|im_end|>
<|im_start|>assistant
<think>

Output Behavior:

  • The model will open a <think> block.
  • It will decompose the problem, analyze variables, and attempt to derive a solution step-by-step.
  • It will close with </think> and provide the final <answer>.
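
The behavior above suggests a simple way to separate the reasoning trace from the final answer after generation. The following is a minimal sketch, assuming the output has the shape `<think>...</think>` followed by `<answer>...</answer>` and ends with `<|im_end|>`; the helper `split_reasoning` is hypothetical and not part of the released model or of any library.

```python
import re

# Assumed output shape: <think>...</think> then <answer>...</answer>.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)(?:</answer>|$)", re.DOTALL)


def split_reasoning(generated: str) -> dict:
    """Split decoded model output into reasoning and final answer."""
    think = THINK_RE.search(generated)
    if think is None:
        # The reasoning block never closed, e.g. generation hit the token limit.
        return {"reasoning": None, "answer": None, "complete": False}
    tail = generated[think.end():]
    answer = ANSWER_RE.search(tail)
    # Fall back to everything after </think> if no <answer> tag is present.
    final = answer.group(1) if answer else tail
    final = final.replace("<|im_end|>", "").strip()
    return {"reasoning": think.group(1).strip(), "answer": final, "complete": True}


sample = "<think>\nNo oranges were mentioned.\n</think>\n<answer>0</answer><|im_end|>"
print(split_reasoning(sample))
```

The `complete` flag is useful with a model this small: when it is `False`, the model ran out of tokens while still reasoning, and there is no answer to read.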

import jax
import jax.numpy as jnp
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

# 1. Load Model & Tokenizer
model_id = "Arko007/zenyx-deepseek-220m"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", trust_remote_code=True)
model = FlaxAutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# 2. Add Special Reasoning Tokens
new_tokens = ["<think>", "</think>", "<answer>", "</answer>"]
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})

# 3. Format Input
prompt = "If I have 3 apples and eat one, how many oranges do I have?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n<think>\n"

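# 4. Generate
inputs = tokenizer(formatted_prompt, return_tensors="np")
# Keep prompt + generated tokens within the model's 2048-token context window.
max_new_tokens = 2048 - inputs["input_ids"].shape[1]
output_ids = model.generate(
    **inputs,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,
    prng_key=jax.random.PRNGKey(0),  # fixed seed so sampling is reproducible
)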

print(tokenizer.decode(output_ids.sequences[0], skip_special_tokens=False))

📉 Training Details

Phase 1: The Foundation (Pre-training)

The base model was trained on ~153 Billion Tokens using a curriculum designed to maximize information density:

  • FineWeb-Edu: English language fundamentals.
  • StarCoder: Logic, Pythonic structure, and indentation.
  • FineWeb-2: Multilingual capabilities.
  • Omni-Mix: A weighted blend of all of the above datasets.

Phase 2: Project Zenyx (Instruction Tuning)

  • We performed a full epoch of Supervised Fine-Tuning (SFT) on Kaggle TPU v5e-8.

Strategy: "The DeepSeek Injection"

Data Mix:

  • 80% a-m-team/AM-DeepSeek-R1-0528-Distilled: High-fidelity reasoning traces.
  • 20% teknium/OpenHermes-2.5: General conversational ability.
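
As a rough illustration of what an 80/20 mix means in practice, the sketch below draws sample sources with those weights and computes the expected per-dataset counts. It assumes the ratio is by sample count and that the ~3.15 million samples listed in the model description are the SFT total; neither is stated explicitly in this card, and the helper names are hypothetical. It is not the training pipeline used for Zenyx.

```python
import random

# Mix weights taken from the data mix listed above.
MIX = {
    "a-m-team/AM-DeepSeek-R1-0528-Distilled": 0.8,
    "teknium/OpenHermes-2.5": 0.2,
}


def sample_sources(n: int, mix: dict = MIX, seed: int = 0) -> list:
    """Draw n dataset names, each chosen with probability equal to its weight."""
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


def expected_counts(total: int, mix: dict = MIX) -> dict:
    """Expected number of samples per dataset for a given total."""
    return {name: round(total * weight) for name, weight in mix.items()}


# Assumption: ~3.15M is the total number of SFT samples.
print(expected_counts(3_150_000))
```

Under those assumptions, the mix works out to about 2.52 million reasoning traces and 0.63 million general conversation samples.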

Objective:

  • To force the model to adopt the "internal monologue" style of reasoning.

⚠️ Limitations & Bias

  • "The Overthinking Trap": Due to its small size (220M) and the intensity of the DeepSeek dataset, Zenyx mimics the structure of genius-level reasoning but often lacks the capacity for accurate arithmetic and factual recall.
  • It may over-analyze simple questions (e.g., treating a simple riddle as a complex algebra problem).
  • It may hallucinate steps in a proof to maintain the "style" of reasoning.

Use Case:

  • This model is intended for research purposes to study the emergence of reasoning patterns in small language models. It is not recommended for production math or coding tasks where 100% accuracy is required.

📜 Citations

@misc{Zenyx220M,
  title = {Zenyx-DeepSeek-220M: Nano-Scale Reasoning Model},
  author = {Arko007},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Arko007/zenyx-deepseek-220m}
}
  • DeepSeek R1 Data: a-m-team/AM-DeepSeek-R1-0528-Distilled
  • OpenHermes: teknium/OpenHermes-2.5