Daemontatox/DocumentCogito (Fine-Tuned from unsloth/Llama-3.2-11B-Vision-Instruct)

Model Overview

This model, fine-tuned from the unsloth/Llama-3.2-11B-Vision-Instruct base, is optimized for vision-language tasks with enhanced instruction-following capabilities. Fine-tuning ran roughly 2x faster than a standard Hugging Face training loop by using the Unsloth framework together with Hugging Face's TRL library, keeping training efficient while maintaining the model's performance.

Key Information

  • Developed by: Daemontatox
  • Base Model: unsloth/Llama-3.2-11B-Vision-Instruct
  • License: Apache-2.0
  • Language: English (en)
  • Frameworks Used: Hugging Face Transformers, Unsloth, and TRL
  • Model Size: 10.7B parameters (BF16, stored as Safetensors)

Performance and Use Cases

This model is ideal for applications involving:

  • Vision-based text generation and description tasks
  • Instruction-following in multimodal contexts
  • General-purpose text generation with enhanced reasoning

Features

  • 2x Faster Training: Leveraging the Unsloth framework for accelerated fine-tuning (see the configuration sketch after this list).
  • Multimodal Capabilities: Enhanced to handle vision-language interactions.
  • Instruction Optimization: Tailored for improved comprehension and execution of instructions.
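
The exact training recipe for this checkpoint is not published in this card. Purely as an illustration of the Unsloth + TRL combination referenced above, the following is a minimal LoRA fine-tuning sketch for the base vision model: the dataset (my_vision_dataset), the LoRA rank, and every hyperparameter shown are assumptions, and argument names may differ between Unsloth and TRL versions.

from unsloth import FastVisionModel, is_bf16_supported
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

# Load the base model in 4-bit and attach LoRA adapters (illustrative settings).
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-11B-Vision-Instruct",
    load_in_4bit=True,
)
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,
    lora_alpha=16,
)
FastVisionModel.for_training(model)

# my_vision_dataset is a hypothetical dataset of chat-formatted image/text samples;
# replace it with your own data.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(model, tokenizer),
    train_dataset=my_vision_dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        fp16=not is_bf16_supported(),
        bf16=is_bf16_supported(),
        remove_unused_columns=False,
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
        max_seq_length=2048,
        output_dir="outputs",
    ),
)
trainer.train()

Unsloth's custom kernels and optional 4-bit loading are the source of the claimed roughly 2x speedup over a plain Transformers training loop.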

How to Use

Inference Example (Hugging Face Transformers)

import requests
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "Daemontatox/DocumentCogito"
processor = AutoProcessor.from_pretrained(model_id)
model = MllamaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Replace the URL with the image you want described (e.g. a sunset over mountains).
image = Image.open(requests.get("https://example.com/sunset.jpg", stream=True).raw)
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe the image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(outputs[0], skip_special_tokens=True))
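
Inference Example (Unsloth)

Because the model was trained with Unsloth, it can also be loaded through Unsloth's FastVisionModel for faster, lower-memory (optionally 4-bit) inference. This is a sketch along the lines of Unsloth's published vision examples rather than a recipe from this card; the image path is a placeholder and minor API details may vary across Unsloth versions.

from unsloth import FastVisionModel
from PIL import Image

# Load the fine-tuned checkpoint in 4-bit (illustrative; adjust to your hardware).
model, tokenizer = FastVisionModel.from_pretrained(
    "Daemontatox/DocumentCogito",
    load_in_4bit=True,
)
FastVisionModel.for_inference(model)  # enable Unsloth's inference mode

# For vision models, the returned "tokenizer" acts as a processor for image + text.
image = Image.open("sunset.jpg")  # placeholder image path
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe the image."},
]}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

inputs = tokenizer(image, prompt, add_special_tokens=False, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))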

Open LLM Leaderboard Evaluation Results

Results are from the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard). Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__DocumentCogito-details), and summarized results [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FDocumentCogito&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc).

|      Metric       |Value (%)|
|-------------------|--------:|
|**Average**        |    24.21|
|IFEval (0-shot)    |    50.64|
|BBH (3-shot)       |    29.79|
|MATH Lvl 5 (4-shot)|    16.24|
|GPQA (0-shot)      |     8.84|
|MuSR (0-shot)      |     8.60|
|MMLU-PRO (5-shot)  |    31.14|