sbrzz committed on
Commit
b9f867d
1 Parent(s): 500471d

Update README.md

Files changed (1)
  1. README.md +38 -1
README.md CHANGED
@@ -7,8 +7,45 @@ metrics:
  pipeline_tag: image-text-to-text
  ---
 
+ # Introduction
+
  We use the powerful [TinyLLaVA Factory](https://github.com/TinyLLaVA/TinyLLaVA_Factory) to create a super small image-text-to-text model with only 296M params.
 
  The goal is to make it possible to run LLaVA models on edge devices (with a few gigabytes of memory).
 
- For the LLM and vision tower, we choose [OpenELM-270M-Instruct](apple/OpenELM-270M-Instruct) and [facebook/dinov2-small](facebook/dinov2-small), respectively.
+ For the LLM and vision tower, we choose [OpenELM-270M-Instruct](apple/OpenELM-270M-Instruct) and [facebook/dinov2-small](facebook/dinov2-small), respectively.
+
+ # Result
+
+ For now, we have measured only [POPE](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#pope), with these results:
+
+ Category: adversarial, # samples: 3000
+ TP FP TN FN
+ 1264 575 925 236
+ Accuracy: 0.7296666666666667
+ Precision: 0.6873300706905927
+ Recall: 0.8426666666666667
+ F1 score: 0.7571129080563043
+ Yes ratio: 0.613
+ 0.757, 0.730, 0.687, 0.843, 0.613
+ ====================================
+ Category: popular, # samples: 3000
+ TP FP TN FN
+ 1264 301 1199 236
+ Accuracy: 0.821
+ Precision: 0.807667731629393
+ Recall: 0.8426666666666667
+ F1 score: 0.8247960848287113
+ Yes ratio: 0.5216666666666666
+ 0.825, 0.821, 0.808, 0.843, 0.522
+ ====================================
+ Category: random, # samples: 2910
+ TP FP TN FN
+ 1264 290 1120 236
+ Accuracy: 0.8192439862542955
+ Precision: 0.8133848133848134
+ Recall: 0.8426666666666667
+ F1 score: 0.8277668631303209
+ Yes ratio: 0.534020618556701
+ 0.828, 0.819, 0.813, 0.843, 0.534
+ ====================================
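The POPE figures in this diff can be re-derived from the four confusion-matrix counts alone. The following is a minimal sketch, not the TinyLLaVA Factory evaluation script: the `Confusion` class and `pope_metrics` function are hypothetical names introduced here for illustration. It assumes the standard definitions of accuracy, precision, recall, and F1, and takes "yes ratio" to mean the fraction of all samples the model answered "yes" to, i.e. (TP + FP) / total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a yes/no benchmark (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def pope_metrics(c: Confusion) -> dict[str, float]:
    """Derive the five reported numbers from raw counts.

    Assumes at least one predicted positive and one actual positive,
    so none of the divisions below hits zero.
    """
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        # Assumed meaning: share of samples answered "yes".
        "yes_ratio": (c.tp + c.fp) / total,
    }


# Counts copied from the three POPE categories in the diff above.
REPORTED = {
    "adversarial": Confusion(tp=1264, fp=575, tn=925, fn=236),
    "popular": Confusion(tp=1264, fp=301, tn=1199, fn=236),
    "random": Confusion(tp=1264, fp=290, tn=1120, fn=236),
}

if __name__ == "__main__":
    for name, counts in REPORTED.items():
        metrics = pope_metrics(counts)
        # Same order as the summary line: F1, accuracy, precision, recall, yes ratio.
        print(name, ", ".join(f"{value:.3f}" for value in metrics.values()))
```

Recall is identical across the three categories because the positive samples (TP + FN = 1500) are the same in each; only the negative samples change, which is why FP, TN, precision, and the yes ratio differ.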