Model Card for Joctor/qwen2-vl-7b-instruct-ogiri
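An AI for ogiri (大喜利), the Japanese comedy game of giving a witty answer to a prompt; here the prompt is an image. Introduction: https://www.gcores.com/articles/188405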
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Joctor/qwen2-vl-7b-instruct-ogiri"

# default: load the model on the available device(s)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# default processor
processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "path/to/image",
            },
            # Prompt: "Give a funny, clever answer based on the image"
            {"type": "text", "text": "根据图片给出有趣巧妙的回答"},
        ],
    }
]

# Preparation for inference
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
# Move the inputs to wherever device_map="auto" placed the model
inputs = inputs.to(model.device)

# Inference: generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens so that only the newly generated tokens are decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
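The trimming step in the snippet above is easy to misread, so here is a minimal sketch of it using plain Python lists in place of tensors. `generate` returns each prompt followed by its continuation, so slicing off the first `len(in_ids)` tokens leaves only the new ones. The token ids and the helper name `trim_prompt_tokens` are made up for illustration and are not part of the model's API.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [out_ids[len(in_ids):] for in_ids, out_ids in zip(input_ids, generated_ids)]


# Made-up token ids: two prompts of equal length and the sequences generated from them.
input_ids = [[101, 7, 8, 9], [101, 5, 6, 0]]
generated_ids = [[101, 7, 8, 9, 42, 43], [101, 5, 6, 0, 77, 78]]

trimmed = trim_prompt_tokens(input_ids, generated_ids)
print(trimmed)  # [[42, 43], [77, 78]]
```

Without this step, `batch_decode` would return the chat template and prompt text along with the answer.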
Training Details
Training Data
https://huggingface.co/datasets/Joctor/cn_bokete_oogiri_caption
Training Procedure
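Base model: Qwen2-VL (Qwen/Qwen2-VL-7B-Instruct)
Fine-tuning method: supervised fine-tuning (SFT), since the dataset is large enough
Fine-tuning parameters: max_length=1024 (shorter is better!), num_train_epochs=1, per_device_train_batch_size=1, gradient_accumulation_steps=1
Training hardware: 10 × RTX 4090D
Training time: 22 hours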
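For reference, the parameters listed above imply a small effective batch size. The sketch below assumes plain data-parallel training with one replica on each of the 10 GPUs, which the card does not state; `effective_batch_size` is a hypothetical helper, not part of any training library.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices):
    """Samples consumed per optimizer step under data parallelism."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the card; num_devices=10 is the data-parallel assumption.
batch = effective_batch_size(
    per_device_batch_size=1, gradient_accumulation_steps=1, num_devices=10
)
print(batch)  # 10

# Total compute budget: 10 GPUs running for 22 hours.
gpu_hours = 10 * 22
print(gpu_hours)  # 220
```

If the run used a different parallelism scheme (for example, sharding one model replica across several cards), the effective batch size would be smaller than this estimate.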
Model tree for Joctor/qwen2-vl-7b-instruct-ogiri
Base model
Qwen/Qwen2-VL-7B-Instruct