Next LLM Collection
Our Next LLM models will be here.
Next 4B is a 4-billion-parameter multimodal Vision-Language Model (VLM) based on Gemma 3, fine-tuned to handle both text and images efficiently. It is Türkiye's first open-source vision-language model, designed for image captioning, multimodal question answering, text generation, reasoning, and creative storytelling.
This model is ideal for researchers, developers, and organizations that need a high-performance multimodal AI capable of visual understanding, reasoning, and creative generation.
Benchmark results for Next 4B and Next 1B against other small open models:

| Model | MMLU (5-shot) % | MMLU-Pro % | GSM8K % | MATH % |
|---|---|---|---|---|
| Next 4B (preview, version s325) | 84.6 | 66.9 | 82.7 | 70.5 |
| Next 1B (version t327) | 87.3 | 69.2 | 90.5 | 70.1 |
| Qwen 3 0.6B | 52.81 | 37.6 | 60.7 | 20.5 |
| Llama 3.2 1B | 49.3 | 44.4 | 11.9 | 30.6 |
| Kumru 7B (not verified) | 30.7 | 28.6 | 15.38 | 6.4 |
Benchmark results for Next Z1 against GPT 5 and Claude Opus 4.1 (Thinking):

| Model | MMLU (5-shot) % | MMLU-Pro % | GSM8K % | MATH % |
|---|---|---|---|---|
| Next Z1 (version l294) | 97.3 | 94.2 | 97.7 | 93.2 |
| Next Z1 (version l294, no tool) | 94.7 | 90.1 | 94.5 | 88.7 |
| GPT 5 | 92.5 | 87.0 | 98.4 | 96.0 |
| Claude Opus 4.1 (Thinking) | ~92.0 | 87.8 | 84.7 | 95.4 |
Multimodal (image + text) example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoProcessor
from PIL import Image

model_id = "Lamapi/next-4b"

# Load the model, the processor (for vision inputs), and the tokenizer (for chat templating)
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Read the image
image = Image.open("image.jpg")

# Build the conversation in chat format (system prompt + user turn with image and text)
messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."}],
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Who is in this image?"},
        ],
    },
]

# Render the chat template, then let the processor combine the text and image tensors
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

# Generate and decode the output
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
Text-only example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Lamapi/next-4b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Text-only chat messages
messages = [
    {"role": "system", "content": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."},
    {"role": "user", "content": "Hello, how are you?"},
]

# Render the chat template and tokenize the prompt
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate and decode the output
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
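Both snippets decode the full sequence, so the prompt is echoed before the answer. A small follow-up sketch (assuming the standard decoder-only behavior of `generate`, which returns the prompt followed by the continuation) keeps only the newly generated tokens:

```python
# Skip the prompt portion of the output before decoding.
prompt_length = inputs["input_ids"].shape[-1]
new_tokens = output[0][prompt_length:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```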
| Feature | Description |
|---|---|
| 🔋 Efficient Architecture | Optimized for low VRAM; supports 8-bit quantization for consumer GPUs (see the loading sketch below). |
| 🖼️ Vision-Language Capable | Understands images, captions them, and performs visual reasoning tasks. |
| 🇹🇷 Multilingual & Turkish-Ready | Handles complex Turkish text with high accuracy. |
| 🧠 Advanced Reasoning | Supports logical and analytical reasoning for both text and images. |
| 📊 Consistent & Reliable Outputs | Reproducible responses across multiple runs. |
| 🌍 Open Source | Transparent, community-driven, and research-friendly. |
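One common way to use the 8-bit option mentioned above is the standard transformers + bitsandbytes route. The snippet below is a minimal sketch under that assumption; it requires the `bitsandbytes` package and a CUDA GPU, neither of which is prescribed by this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Lamapi/next-4b"

# Assumption: 8-bit loading via bitsandbytes (pip install bitsandbytes); needs a CUDA GPU.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU(s) automatically
)
```

The usage code above then works unchanged; just move the input tensors to the model's device (for example `inputs = inputs.to(model.device)`) before calling `generate`.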
| Specification | Details |
|---|---|
| Base Model | Gemma 3 |
| Parameter Count | 4 billion |
| Architecture | Transformer, causal LLM + vision encoder |
| Fine-Tuning Method | Instruction & multimodal fine-tuning (SFT) on Turkish and multilingual datasets |
| Optimizations | Q8_0, F16, and F32 quantizations for low- and high-VRAM setups |
| Modalities | Text & image |
| Use Cases | Image captioning, multimodal QA, text generation, reasoning, creative storytelling |
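The F16 entry in the table corresponds to half-precision weights. If the Hugging Face checkpoint can be loaded in float16 (an assumption; the card does not state which precisions the published files use), doing so in transformers is a one-line change and roughly halves VRAM relative to F32:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumption: the checkpoint supports half-precision loading.
model = AutoModelForCausalLM.from_pretrained(
    "Lamapi/next-4b",
    torch_dtype=torch.float16,
    device_map="auto",
)
```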
This project is licensed under the MIT License — free to use, modify, and distribute. Attribution is appreciated.
Next 4B — Türkiye’s first vision-language AI, combining multimodal understanding, reasoning, and efficiency.