Update README.md
README.md
CHANGED
@@ -19,33 +19,8 @@ For LLM and vision tower, we choose [OpenELM-270M-Instruct](apple/OpenELM-270M-I
 
 For now we measured only [POPE](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#pope) with these results
 
-Category
-
-1264
-
-
-Recall: 0.8426666666666667
-F1 score: 0.7571129080563043
-Yes ratio: 0.613
-0.757, 0.730, 0.687, 0.843, 0.613
-====================================
-Category: popular, # samples: 3000
-TP FP TN FN
-1264 301 1199 236
-Accuracy: 0.821
-Precision: 0.807667731629393
-Recall: 0.8426666666666667
-F1 score: 0.8247960848287113
-Yes ratio: 0.5216666666666666
-0.825, 0.821, 0.808, 0.843, 0.522
-====================================
-Category: random, # samples: 2910
-TP FP TN FN
-1264 290 1120 236
-Accuracy: 0.8192439862542955
-Precision: 0.8133848133848134
-Recall: 0.8426666666666667
-F1 score: 0.8277668631303209
-Yes ratio: 0.534020618556701
-0.828, 0.819, 0.813, 0.843, 0.534
-====================================
+| Category | # Samples | TP | FP | TN | FN | Accuracy | Precision | Recall | F1 Score | Yes Ratio |
+|-------------|------------|------|-----|------|-----|----------|-----------|--------|----------|-----------|
+| Adversarial | 3000 | 1264 | 575 | 925 | 236 | 0.7297 | 0.6873 | 0.8427 | 0.7571 | 0.613 |
+| Popular | 3000 | 1264 | 301 | 1199 | 236 | 0.8210 | 0.8077 | 0.8427 | 0.8248 | 0.5217 |
+| Random | 2910 | 1264 | 290 | 1120 | 236 | 0.8192 | 0.8134 | 0.8427 | 0.8278 | 0.5340 |
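The metric columns in the POPE table can be recomputed from the TP/FP/TN/FN counts. The sketch below assumes the standard binary-classification definitions with "yes" as the positive class, and assumes "Yes Ratio" means the fraction of samples answered "yes"; both assumptions reproduce the reported numbers. The function name `pope_metrics` is hypothetical and is not part of the TinyLLaVA Factory evaluation code.

```python
def pope_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive POPE-style metrics from confusion counts.

    Assumption: "yes" is the positive class, and the yes ratio is the
    share of all samples for which the model answered "yes".
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "samples": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "yes_ratio": (tp + fp) / total,
    }


# Counts copied from the table rows.
rows = {
    "Adversarial": (1264, 575, 925, 236),
    "Popular": (1264, 301, 1199, 236),
    "Random": (1264, 290, 1120, 236),
}

for category, counts in rows.items():
    m = pope_metrics(*counts)
    print(
        f"{category:<12} n={m['samples']} "
        f"acc={m['accuracy']:.4f} prec={m['precision']:.4f} "
        f"rec={m['recall']:.4f} f1={m['f1']:.4f} yes={m['yes_ratio']:.4f}"
    )
```

Recall is identical across the three categories because TP and FN are the same in every row; only the negative samples (FP and TN) differ between categories.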