---
license: llama3.1
language:
  - en
pipeline_tag: text-generation
---

Deepthought-8B

Deepthought-8B is a small and capable reasoning model built on LLaMA-3.1 8B, designed to make AI reasoning more transparent and controllable. Despite its relatively small size, it achieves sophisticated reasoning capabilities that rival much larger models.

Model Description

Deepthought-8B is designed with a unique approach to problem-solving, breaking down its thinking into clear, distinct, documented steps. The model outputs its reasoning process in a structured JSON format, making it easier to understand and validate its decision-making process.

Key Features

  • Transparent Reasoning: Step-by-step documentation of the thought process
  • Programmable Approach: Customizable reasoning patterns without model retraining
  • Test-time Compute Scaling: Flexible reasoning depth based on task complexity
  • Efficient Scale: Runs on 16GB+ VRAM
  • Structured Output: JSON-formatted reasoning chains for easy integration

Try out Deepthought-8B on our Ruliad interface: https://chat.ruliad.co

Technical Requirements

  • Python 3.9+
  • PyTorch
  • Transformers library
  • 16GB+ VRAM
  • Optional: Flash Attention 2 for improved performance

Installation

pip install torch transformers
# Optional: Install Flash Attention 2 for better performance
pip install flash-attn
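# flash-attn compiles CUDA kernels at install time; if the build fails, the
# flash-attn docs suggest: pip install flash-attn --no-build-isolation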

Usage

  1. First, set your HuggingFace token as an environment variable:
export HF_TOKEN=your_token_here
export HF_HUB_ENABLE_HF_TRANSFER=1
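# HF_HUB_ENABLE_HF_TRANSFER=1 requires the extra package: pip install hf_transfer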
  2. Use the model in your Python code:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Initialize the model
model_name = "ruliad/deepthought-8b-llama-v0.01-alpha"
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    add_bos_token=False,
    trust_remote_code=True,
    padding_side="left",
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",  # Use "eager" (or omit) if flash_attn is not installed
    use_cache=True,
    trust_remote_code=True,
)
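
The snippet below sketches one way to run a prompt end to end. The chat-template call is the standard Transformers API, but the prompt text and sampling settings are illustrative assumptions, not taken from the official example script.

# Minimal generation sketch (prompt and decoding settings are illustrative)
messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))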
  3. Run the provided example script:
python deepthought_inference.py

Example Output

The model provides structured reasoning in JSON format:

{
  "step": 1,
  "type": "problem_understanding",
  "thought": "Understanding the user's objective for the task."
}

Each reasoning chain includes multiple steps:

  1. Problem understanding
  2. Data gathering
  3. Analysis
  4. Calculation (when applicable)
  5. Verification
  6. Conclusion drawing
  7. Implementation
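
To consume these steps programmatically, a helper along the following lines may be enough. It assumes the model emits one JSON object per step, one per line; that framing is an assumption for illustration, not a documented contract.

import json

def extract_steps(raw_output: str) -> list[dict]:
    """Collect reasoning-step dicts, assuming one JSON object per line."""
    steps = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            step = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if {"step", "type", "thought"} <= step.keys():
            steps.append(step)
    return steps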

Performance

Deepthought-8B demonstrates strong performance across a range of reasoning tasks:

  • Step-by-step problem-solving
  • Coding and mathematical tasks
  • Instruction following with transparent reasoning
  • Scalable performance with test-time compute

Limitations

Current known limitations include:

  • Complex mathematical reasoning
  • Long-context processing
  • Edge case handling

License

The model is available under a commercial license for enterprise use.

Citation

If you use this model in your research, please cite:

@misc{Deepthought2024,
  author = {Ruliad},
  title = {Deepthought-8B: A Small and Capable Reasoning Model},
  year = {2024},
  publisher = {Ruliad}
}

Support

For questions and feedback: