zhangsongbo365 committed on
Commit a093431
1 parent: c39c842

Update the model card

Files changed (1)
  1. README.md +43 -0
README.md CHANGED
@@ -5,4 +5,47 @@ language:
  base_model:
  - meta-llama/Llama-3.2-11B-Vision-Instruct
  ---
+ ## Introduction
+ This model originates from [Xkev/Llama-3.2V-11B-cot](https://huggingface.co/Xkev/Llama-3.2V-11B-cot). This repository simply provides that model quantized to the NF4 format with the bitsandbytes library.
+ All credit goes to the original repository.
+
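The model card does not show how the NF4 checkpoint was produced. As a rough sketch only, one way to quantize the source model with bitsandbytes through `transformers` could look like the following. The `load_in_4bit` and `bnb_4bit_quant_type="nf4"` settings follow from the description above; the compute dtype, the double-quantization flag, the helper names `nf4_config_kwargs` and `quantize_and_save`, and the output directory are assumptions made for illustration, not the author's actual settings.

```python
def nf4_config_kwargs():
    """Keyword arguments for transformers.BitsAndBytesConfig selecting NF4.

    Only the first two entries follow from the model card; the last two
    are assumptions chosen for this sketch.
    """
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": "bfloat16",  # assumption
        "bnb_4bit_use_double_quant": True,     # assumption
    }


def quantize_and_save(
    source_id="Xkev/Llama-3.2V-11B-cot",
    out_dir="Llama-3.2V-11B-cot-nf4",  # hypothetical local output directory
):
    """Load the source model in NF4 and write the quantized checkpoint to out_dir.

    Requires a CUDA GPU and the transformers, accelerate and bitsandbytes
    packages, so the heavy imports are kept inside the function.
    """
    from transformers import (
        AutoProcessor,
        BitsAndBytesConfig,
        MllamaForConditionalGeneration,
    )

    model = MllamaForConditionalGeneration.from_pretrained(
        source_id,
        quantization_config=BitsAndBytesConfig(**nf4_config_kwargs()),
        device_map="auto",
    )
    # Save the quantized weights together with the processor files.
    model.save_pretrained(out_dir)
    AutoProcessor.from_pretrained(source_id).save_pretrained(out_dir)
    return out_dir
```

Calling `quantize_and_save()` on a machine with enough GPU memory would write a local directory that can then be loaded without passing a quantization config again, since the settings are stored with the checkpoint.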
+ ## Usage
+ ```
+ from transformers import MllamaForConditionalGeneration, AutoProcessor
+ from PIL import Image
+ import time
+
+ # Load the pre-quantized NF4 model (requires the bitsandbytes package)
+ model_id = "zhangsongbo365/Llama-3.2V-11B-cot-nf4"
+ model = MllamaForConditionalGeneration.from_pretrained(
+     model_id,
+     use_safetensors=True,
+     device_map="cuda:0",
+     trust_remote_code=True
+ )
+
+ # Load the processor (image preprocessing and tokenizer)
+ processor = AutoProcessor.from_pretrained(model_id)
+
+ # Caption a local image
+ IMAGE = Image.open("1.png").convert("RGB")  # change to your actual image path
+ PROMPT = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>
+ Caption this image:
+ <|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>
+ """
+
+ inputs = processor(IMAGE, PROMPT, return_tensors="pt").to(model.device)
+ prompt_tokens = len(inputs['input_ids'][0])
+ print(f"Prompt tokens: {prompt_tokens}")
+
+ # Time the generation and report throughput
+ t0 = time.time()
+ generate_ids = model.generate(**inputs, max_new_tokens=256)
+ t1 = time.time()
+ total_time = t1 - t0
+ generated_tokens = len(generate_ids[0]) - prompt_tokens
+ tokens_per_second = generated_tokens / total_time
+ print(f"Generated {generated_tokens} tokens in {total_time:.3f} s ({tokens_per_second:.3f} tok/s)")
+
+ # Decode only the newly generated tokens and drop the end-of-turn marker
+ output = processor.decode(generate_ids[0][prompt_tokens:]).replace('<|eot_id|>', '')
+ print(output)
+ ```
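The usage snippet hard-codes the chat prompt and computes throughput inline. As a small sketch, the two pieces could be factored into helpers that run without a GPU. The names `build_prompt` and `tokens_per_second` are hypothetical, and the template reproduces the prompt exactly as it appears in the snippet above rather than any official chat template.

```python
def build_prompt(instruction="Caption this image:"):
    """Build a single-turn prompt with one image placeholder.

    Mirrors the literal PROMPT string from the usage snippet; only the
    instruction text is made configurable.
    """
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n"
        f"{instruction}\n"
        "<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )


def tokens_per_second(prompt_tokens, total_tokens, elapsed_s):
    """Throughput of generation: newly generated tokens divided by elapsed seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (total_tokens - prompt_tokens) / elapsed_s


if __name__ == "__main__":
    print(build_prompt("Describe the chart:"))
    # 256 generated tokens in 8 seconds
    print(f"{tokens_per_second(20, 276, 8.0):.1f} tok/s")  # 32.0 tok/s
```

With these helpers, `build_prompt()` can be passed to `processor(...)` in place of the literal string, and `tokens_per_second(prompt_tokens, len(generate_ids[0]), total_time)` gives the figure printed by the snippet.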