Instructions for using dnotitia/DNA3.0-2B with libraries and local apps.

Libraries

How to use dnotitia/DNA3.0-2B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="dnotitia/DNA3.0-2B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("dnotitia/DNA3.0-2B")
model = AutoModelForImageTextToText.from_pretrained("dnotitia/DNA3.0-2B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use dnotitia/DNA3.0-2B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dnotitia/DNA3.0-2B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dnotitia/DNA3.0-2B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/dnotitia/DNA3.0-2B

SGLang

How to use dnotitia/DNA3.0-2B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dnotitia/DNA3.0-2B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dnotitia/DNA3.0-2B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dnotitia/DNA3.0-2B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip-based SGLang setup above (same port, 30000).

Docker Model Runner
How to use dnotitia/DNA3.0-2B with Docker Model Runner:
```
docker model run hf.co/dnotitia/DNA3.0-2B
```

DNA 3.0

We introduce DNA 3.0, a family of large language models built upon the Qwen3.5/3.6 base model with enhanced capabilities for Korean and enterprise scenarios. By applying an Uncensored Training methodology together with Persona Training (deep grounding in Dnotitia's corporate knowledge and product context), we've created a model that excels in analytical reasoning, agentic coding, and multimodal understanding while maintaining genuinely open, enterprise-aware conversational capabilities.

Highlights

Dnotitia Post-training

Uncensored Training: The model is post-trained with an uncensored methodology so that it can respond to a wider range of prompts without unnecessary refusals, while preserving the instruction-following and reasoning quality of the base model.
Persona Training: Additional supervised training on Dnotitia's corporate knowledge — company history, products, services, and internal terminology — so the model can act as an authentic first-party assistant for Dnotitia-facing use cases.
Long-form Reasoning Preservation: Chain-of-thought traces from prior turns can be retained across multi-turn sessions, enabling smoother iterative development and debugging workflows.

Inherited from Qwen3.5/3.6

Unified Vision-Language Foundation: Early-fusion training over multimodal tokens delivers strong cross-modal reasoning across text, image, and video — outperforming the prior Qwen3-VL line on coding, agents, and visual understanding benchmarks.
Efficient Hybrid MoE Architecture: Gated DeltaNet (linear attention) layers combined with sparse MoE layers deliver high-throughput inference while keeping activated parameters low.
Scalable RL Generalization: Reinforcement learning is scaled across million-agent environments with progressively complex task distributions, improving real-world adaptability for tool use and agentic workflows.
Global Linguistic Coverage: Native support for 201 languages and dialects, enabling inclusive worldwide deployment with nuanced cultural and regional understanding.
Long Context: Native 262,144-token context length, extensible up to roughly 1,010,000 tokens via YaRN scaling.
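Thinking Mode: When thinking is enabled, the model generates <think>...</think> reasoning blocks before final answers; the behavior is controlled with "enable_thinking" (DNA3.0-2B itself defaults to non-thinking mode, as stated in the Quickstart).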
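
When a response arrives as raw text with the reasoning inline, the <think>...</think> block usually has to be separated from the final answer before display. The following is a minimal sketch of that step; the helper name `split_thinking` is hypothetical, and it assumes the tags appear literally in the output (a server started with a reasoning parser may instead return the reasoning in a separate field).

```python
import re

# Matches one <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    Hypothetical helper: assumes the tags appear literally in the text.
    """
    reasoning = "\n".join(part.strip() for part in _THINK_BLOCK.findall(text))
    answer = _THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


raw = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_thinking(raw)
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```

Output without any reasoning block passes through unchanged, with an empty reasoning string.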

Comparison with Qwen3.5-2B

The chart above compares DNA3.0-2B against its Qwen3.5-2B base across four metrics, reported on a 0–1 scale (higher is better):

Persona Identification — Measures how reliably the model identifies itself as a Dnotitia assistant and answers correctly about Dnotitia's company, products, and identity.
Uncensorship — Measures how willingly the model engages with topics that the Chinese-origin base model is trained to refuse — i.e., subjects suppressed by the censorship policies baked into the original Qwen training.
Language Confusion Reduction — Measures how well the model avoids unintended language mixing, particularly Chinese-character intrusions in Korean responses — a well-known failure mode of Qwen-family models.
Repetition Reduction — Measures how well the model avoids getting stuck in infinite-loop repetition during long-form generation, another common failure mode of the base model.
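
The card does not describe how these metrics are computed. As a rough illustration only, the sketch below shows two simple proxies one could use for the last two: the share of Han (Chinese) characters in a response, and the share of repeated word n-grams. Both function names and both definitions are assumptions made for this example, not Dnotitia's evaluation method. Korean text can legitimately contain Hanja, so the first proxy over-counts in that case.

```python
def han_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    han = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return han / len(chars)


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Share of whitespace-token n-grams that duplicate an earlier n-gram."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


print(han_ratio("안녕하세요"))   # 0.0
print(han_ratio("안녕你好"))     # 0.5
print(repeated_ngram_ratio("a b c a b c a b c", n=3))
```

Lower values are better for both proxies, whereas the chart reports scores where higher is better, so a real evaluation would need some inversion or normalization that is not specified here.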

Model Overview

| Field | Value |
| --- | --- |
| Base Model | `Qwen/Qwen3.5-2B` |
| Model Type | Causal Language Model with Vision Encoder (Dense) |
| Parameters | 2B |
| Hidden Dimension | 2048 |
| Number of Layers | 24 |
| Hidden Layout | 6 × (3 × (Gated DeltaNet → FFN) → 1 × (Gated Attention → FFN)) |
| Gated Attention Heads | 8 (Q) / 2 (KV), head dim 256 |
| Gated DeltaNet Heads | 16 (V) / 16 (QK), head dim 128 |
| FFN Intermediate Dim | 6,144 |
| Token Embedding | 248,320 (padded, tied to LM head) |
| Context Length | 262,144 native, up to ~1,010,000 extended (YaRN) |
| License | Apache-2.0 |
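
The Hidden Layout row is compact, so it can help to expand it into an explicit per-layer list. The sketch below does only that; the layer labels `gated_deltanet` and `gated_attention` are names chosen for this illustration and are not identifiers from the model's configuration.

```python
# 6 repetitions of: 3 x (Gated DeltaNet -> FFN) followed by 1 x (Gated Attention -> FFN).
# The string labels are illustrative, not names from the model config.
BLOCK = ["gated_deltanet"] * 3 + ["gated_attention"]
LAYERS = BLOCK * 6

print(len(LAYERS))                      # 24
print(LAYERS.count("gated_attention"))  # 6
print([i for i, kind in enumerate(LAYERS) if kind == "gated_attention"])
# [3, 7, 11, 15, 19, 23]
```

The expansion confirms that the layout accounts for all 24 layers, with every fourth layer using gated attention.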

Quickstart

DNA 3.0 models support both thinking and non-thinking modes. DNA3.0-2B operates in non-thinking mode by default.

DNA 3.0 is compatible with the Hugging Face Transformers ecosystem as well as popular inference engines such as vLLM, SGLang, and KTransformers. Given the model's scale, a dedicated serving engine on multi-GPU hardware is strongly recommended for production workloads.

The model has a default context length of 262,144 tokens. If you encounter out-of-memory (OOM) errors, reduce the context window — but keep at least 128K tokens to preserve long-form reasoning behavior.
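
To get a feel for how much memory the context window costs, the sketch below estimates the per-sequence KV cache from the Model Overview figures. It rests on several assumptions that are not stated in the card: values are stored in 16-bit precision (2 bytes each), only the 6 gated-attention layers keep a cache that grows with sequence length, and the Gated DeltaNet state, weights, activations, vision encoder, and framework overhead are all ignored. Treat the result as a lower bound, not a sizing guide.

```python
def kv_cache_bytes(
    tokens: int,
    attn_layers: int = 6,      # gated-attention layers out of 24 (from the layout)
    kv_heads: int = 2,         # KV heads per gated-attention layer
    head_dim: int = 256,       # head dimension of gated attention
    bytes_per_value: int = 2,  # assumption: bf16/fp16 cache
) -> int:
    """Approximate KV-cache size in bytes for one sequence."""
    # Factor 2 accounts for storing both keys and values.
    return 2 * attn_layers * kv_heads * head_dim * bytes_per_value * tokens


GIB = 1024 ** 3
for ctx in (262_144, 131_072):
    print(ctx, kv_cache_bytes(ctx) / GIB)
# 262144 3.0
# 131072 1.5
```

Under these assumptions, halving the context window from the native length to 128K halves the per-sequence cache, which is why shrinking the window is the first lever to try on OOM.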

vLLM

# Standard (multimodal) serving
vllm serve dnotitia/DNA3.0-2B \
  --reasoning-parser qwen3

# Tool-calling enabled
vllm serve dnotitia/DNA3.0-2B \
  --reasoning-parser qwen3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

# Text-only serving (skips the vision encoder to free memory for the KV cache)
vllm serve dnotitia/DNA3.0-2B \
  --reasoning-parser qwen3 \
  --language-model-only
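
Once the server runs with tool calling enabled, requests can carry a `tools` array in the OpenAI-compatible format. The sketch below only builds such a request body; the `get_weather` tool and its schema are hypothetical, and sending the body to `http://localhost:8000/v1/chat/completions` assumes the default vLLM port.

```python
import json


def build_tool_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat request that offers one tool."""
    return {
        "model": "dnotitia/DNA3.0-2B",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool for illustration
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }


payload = build_tool_request("What is the weather in Seoul right now?")
print(json.dumps(payload, indent=2))
```

If the model decides to call the tool, an OpenAI-compatible server returns the call in the `tool_calls` field of the assistant message rather than in `content`.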

Disabling Thinking Mode

For latency-sensitive or non-reasoning workloads, disable thinking mode via the chat-template kwarg:

curl https://demo-api.dnotitia.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dna-router_xxxx" \
  -d '{
    "model": "DNA3.0-2B",
    "messages": [
      {
        "role": "user",
        "content": "코스피가 8000을 넘으려면 너 생각에 몇 년이나 더 걸릴 거 같아?"
      }
    ],
    "chat_template_kwargs": {
      "enable_thinking": false
    }
  }' | jq

Unlike Qwen3, DNA 3.0 does not support the soft-switch commands /think and /no_think. Use chat_template_kwargs.enable_thinking instead.
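
The same request can be prepared from Python with only the standard library. The sketch below mirrors the curl call; the helper name `build_request` is hypothetical, the API key is the placeholder from the example, and the prompt is arbitrary. It builds the request without sending it.

```python
import json
import urllib.request

API_URL = "https://demo-api.dnotitia.ai/v1/chat/completions"


def build_request(prompt: str, enable_thinking: bool, api_key: str) -> urllib.request.Request:
    """Prepare a chat-completions POST with thinking toggled per request."""
    body = {
        "model": "DNA3.0-2B",
        "messages": [{"role": "user", "content": prompt}],
        # The switch lives in chat_template_kwargs, not in the prompt text.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_request("Say hello in Korean.", enable_thinking=False, api_key="dna-router_xxxx")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Passing `enable_thinking=True` through the same field turns reasoning on for a single request.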

Image Input

DNA 3.0 accepts image and video inputs in OpenAI-compatible content array format:

curl https://demo-api.dnotitia.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dna-router_xxxx" \
  -d '{
    "model": "DNA3.0-2B",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/6/6e/Golde33443.jpg"
            }
          },
          {
            "type": "text",
            "text": "이 이미지에 무엇이 있나요? 한국어로 설명해 주세요."
          }
        ]
      }
    ]
  }' | jq
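
For programmatic use, the content array can be assembled with a small helper. The sketch below builds the same image-plus-text message as the curl example; `image_question` is a hypothetical helper name, and only the image form shown above is covered because the card does not show the exact shape of a video part.

```python
import json


def image_question(image_url: str, question: str) -> dict:
    """Build one user message holding an image part followed by a text part."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": question},
        ],
    }


body = {
    "model": "DNA3.0-2B",
    "messages": [
        image_question(
            "https://upload.wikimedia.org/wikipedia/commons/6/6e/Golde33443.jpg",
            "What is in this image?",
        )
    ],
}
# ensure_ascii=False keeps non-ASCII prompts (for example Korean) readable.
print(json.dumps(body, ensure_ascii=False, indent=2))
```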

Limitations, Bias, and Responsible Use

DNA 3.0 has been post-trained with an uncensored methodology, which means it will engage with a broader range of prompts than typical safety-tuned models. Users and downstream developers should be aware of the following:

Reduced refusal behavior: The model may respond to prompts that other models decline. This does not constitute endorsement of the content. Downstream applications should implement appropriate content moderation, output filtering, and policy layers suited to their deployment context.
Persona bias: Because the model has been trained on Dnotitia-specific corporate knowledge, it may exhibit a first-party perspective when discussing Dnotitia, its products, or related entities. For neutral comparative analysis, prompt accordingly.
Inherited biases: As a derivative of Qwen3.5/3.6, DNA 3.0 inherits the biases, gaps, and limitations of its base model and training data, including potential cultural, linguistic, and factual blind spots.
Hallucination: Like all LLMs, DNA 3.0 can produce confident but incorrect output, particularly for niche facts, recent events, or high-precision numerical reasoning.
Not for high-stakes autonomous use: The model should not be deployed in safety-critical, legal, medical, or financial decision-making pipelines without human oversight and domain-specific validation.

Users are responsible for ensuring their use of the model complies with applicable laws and regulations in their jurisdiction.

License

This model is released under the Apache-2.0 license, inherited from the Qwen3.5/3.6 base model.

Acknowledgments

We thank the Qwen team for releasing the Qwen3.5/3.6 base model under an open license, which made this work possible. We are also grateful to the broader open-source community behind the serving and training ecosystem (Hugging Face and vLLM), on which our pipeline relies throughout.