Update README.md
README.md (changed)
@@ -51,17 +51,18 @@ Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, as
(two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.

**Performance Scores (on a scale of 5):**

| Model                                 |  Score  | # params |
|--------------------------------------:|:-------:|:--------:|
| gpt-4o                                |  4.13   |   N/A    |
| mistralai/Mixtral-8x7B-Instruct-v0.1  |  3.71   |  46.7b   |
| gpt-3.5-turbo                         |  3.66   |   175b   |
| mistralai/Mistral-7B-Instruct-v0.2    |  1.98   |  7.25b   |
| cmarkea/bloomz-7b1-mt-sft-chat        |  1.69   |   7.1b   |
| cmarkea/bloomz-3b-dpo-chat            |  1.68   |    3b    |
| cmarkea/bloomz-3b-sft-chat            |  1.51   |    3b    |
| croissantllm/CroissantLLMChat-v0.1    |  1.19   |   1.3b   |
| cmarkea/bloomz-560m-sft-chat          |  1.04   |  0.56b   |
| OpenLLM-France/Claire-Mistral-7B-0.1  |  0.38   |  7.25b   |

The bloomz-3b-dpo-chat model demonstrates improved performance over its SFT counterpart, particularly in zero-shot contexts, making it a competitive choice for production environments.
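
For reference, here is a minimal sketch of how PoLL-style scores like those above could be aggregated: each of the three judge models named earlier returns two grades per answer, and the final score is the mean over all grades. The `grade_with_judge` helper and the exact prompting scheme are illustrative assumptions, not the evaluation code used for this README.

```python
from statistics import mean
from typing import Callable

# Judges taken from the evaluators listed above.
JUDGES = ["gpt-4o", "gemini-1.5-pro", "claude-3.5-sonnet"]

def poll_score(
    question: str,
    answer: str,
    grade_with_judge: Callable[[str, str, str], float],
) -> float:
    """Average PoLL score for one (question, answer) pair on a 0-5 scale.

    `grade_with_judge(judge, question, answer)` is a hypothetical helper that
    queries one judge model and returns a single grade between 0 and 5.
    """
    grades = []
    for judge in JUDGES:
        # Each evaluator contributes two grades (e.g. two grading passes).
        for _ in range(2):
            grades.append(grade_with_judge(judge, question, answer))
    return mean(grades)
```

A model's table entry would then be the mean of `poll_score` over the whole evaluation set.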