
Virtuoso-Lite (10B) is our next-generation, 10-billion-parameter language model built on Falcon-10B, which follows the Llama-3 architecture. It is distilled from Deepseek-v3 using ~1.1B tokens' worth of teacher logits, allowing it to achieve robust performance at a significantly smaller parameter count than larger models. Despite its compact size, Virtuoso-Lite excels at a variety of tasks, demonstrating advanced reasoning, code generation, and mathematical problem-solving capabilities.

GGUF

GGUF quantizations of this model are available.

Model Details

  • Architecture Base: Falcon-10B (based on Llama-3)
  • Parameter Count: ~10.3B (BF16 safetensors)
  • Tokenizer:
    • Initially integrated with Deepseek-v3 tokenizer for logit extraction.
    • Final alignment uses the Llama-3 tokenizer, with specialized “tokenizer surgery” for cross-architecture compatibility (a simplified illustration follows this list).
  • Distillation Data:
    • ~1.1B tokens/logits from Deepseek-v3’s training data.
    • Logit-level distillation using a proprietary “fusion merging” approach for maximum fidelity.
  • License: falcon-llm-license
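
The exact “tokenizer surgery” procedure is proprietary and not published. The sketch below is a hypothetical simplification of the general idea only: it maps teacher logits onto the student vocabulary wherever the two tokenizers share a token string. It is not the method actually used for Virtuoso-Lite.

import torch
from transformers import AutoTokenizer

# Illustrative repo IDs; trust_remote_code may be required for the teacher.
teacher_tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3", trust_remote_code=True)
student_tok = AutoTokenizer.from_pretrained("arcee-ai/virtuoso-lite")

# Map student token id -> teacher token id for identical token strings.
teacher_vocab = teacher_tok.get_vocab()
shared = {sid: teacher_vocab[tok]
          for tok, sid in student_tok.get_vocab().items()
          if tok in teacher_vocab}

def project_teacher_logits(teacher_logits: torch.Tensor) -> torch.Tensor:
    # Unmatched student tokens get a large negative logit (~zero probability).
    student_logits = torch.full((len(student_tok),), -1e9)
    for sid, tid in shared.items():
        student_logits[sid] = teacher_logits[tid]
    return student_logits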

Background on Deepseek Distillation

Deepseek-v3 serves as the teacher model, from which we capture logits across billions of tokens. Rather than standard supervised fine-tuning, Virtuoso-Lite applies full logit-level replication to preserve the teacher's most important behavior; a minimal sketch of such a logit-matching loss follows the list below. This approach enables:

  • Strong performance on technical/scientific queries
  • Enhanced code generation and debugging
  • Improved consistency in math-intensive tasks
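
As a concrete illustration, a minimal logit-matching objective is a temperature-scaled KL divergence between teacher and student next-token distributions. This is a generic sketch; the temperature and the exact loss used for Virtuoso-Lite are not published.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T: float = 1.0):
    # Logits: (batch, seq_len, vocab) over a shared/aligned vocabulary.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # KL(teacher || student), scaled by T^2 as in standard distillation.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)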

Intended Use Cases

  • Chatbots & Virtual Assistants
  • Lightweight Enterprise Data Analysis
  • Research Prototypes & Proofs of Concept
  • STEM Educational Tools (where smaller footprint is advantageous)

Evaluations

(Benchmark results figure; see Performance below for a summary.)

How to Use

Below is a sample code snippet using transformers:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "arcee-ai/virtuoso-lite"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The published weights are BF16; loading in that dtype halves memory use.
# device_map="auto" requires the accelerate package; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Provide a concise summary of quantum entanglement."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
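
If the repository ships a chat template (an assumption worth verifying in tokenizer_config.json), conversational prompts can be formatted with apply_chat_template; the sampling parameters below are illustrative, not recommended settings:

messages = [{"role": "user", "content": "Explain quantum entanglement in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))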

Training & Fine-Tuning

  • Initial Training: Began with Falcon-10B, optimized for large-scale text ingestion.
  • Distillation & Merging:
    • Trained on ~1.1B tokens/logits from Deepseek-v3.
    • Employed “fusion merging” to capture detailed teacher insights.
    • Final step included DPO (Direct Preference Optimization) to enhance alignment and mitigate hallucinations; a sketch of the DPO objective follows this list.
  • Future Developments: We plan to incorporate additional R1 distillations to further improve specialized performance and reduce model footprint.
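
In practice the DPO step is typically run with a library such as TRL; the sketch below only illustrates the core DPO objective, with beta and the input log-probabilities as assumptions rather than the actual training configuration:

import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    # Inputs: summed log-probs of chosen/rejected responses under the
    # trainable policy and a frozen reference model.
    margins = beta * ((policy_chosen_logps - ref_chosen_logps)
                      - (policy_rejected_logps - ref_rejected_logps))
    return -F.logsigmoid(margins).mean()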

Performance

Virtuoso-Lite demonstrates strong results across multiple benchmarks (e.g., BBH, MMLU-PRO, MATH), often holding its own against models with higher parameter counts. This efficiency is largely credited to logit-level distillation, which compresses the teacher model's capabilities into a more parameter-efficient package.
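
The card does not specify the evaluation setup. As a purely illustrative starting point, a run with EleutherAI's lm-evaluation-harness (pip install lm-eval) might look like the following; the task identifiers and settings here are assumptions and vary across harness versions:

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=arcee-ai/virtuoso-lite,dtype=bfloat16",
    tasks=["bbh", "mmlu_pro"],  # assumed task names
)
print(results["results"])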

Limitations

  • Context Length: 128k tokens (the usable window may vary with configuration and system resources; see the snippet below this list).
  • Knowledge Cut-off: Training data may not reflect the latest events or developments beyond June 2024.
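
The configured context window can be read directly from the model config:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("arcee-ai/virtuoso-lite")
# Maximum positions the model is configured for; the usable length
# also depends on available memory and the serving stack.
print(config.max_position_embeddings)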

Ethical Considerations

  • Content Generation Risks: Like any language model, Virtuoso-Lite can generate potentially harmful or biased content if prompted in certain ways.

License

Virtuoso-Lite (10B) is released under the falcon-llm-license. You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.

If you have questions or would like to share your experiences using Virtuoso-Lite (10B), please connect with us on social media. We’re excited to see what you build—and how this model helps you innovate!
