ChuckMcSneed committed
Commit: 798b7fe
Parent(s): eef189e
Update README.md
README.md CHANGED
@@ -75,6 +75,8 @@ Then I SLERP-merged it with cognitivecomputations/dolphin-2.2-70b (Needed to bri
 | P | 5.25 |
 | Total | 19.75 |
 
+Absurdly high. That's what happens when you optimize the merges for a benchmark.
+
 ### Open LLM leaderboard
 [Leaderboard on Huggingface](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 |Model |Average|ARC |HellaSwag|MMLU |TruthfulQA|Winogrande|GSM8K|
@@ -82,4 +84,14 @@ Then I SLERP-merged it with cognitivecomputations/dolphin-2.2-70b (Needed to bri
 |ChuckMcSneed/Gembo-v1-70b |70.51 |71.25|86.98 |70.85|63.25 |80.51 |50.19|
 |ChuckMcSneed/SMaxxxer-v1-70b |72.23 |70.65|88.02 |70.55|60.7 |82.87 |60.58|
 
-Looks like adding a shitton of RP stuff decreased HellaSwag, WinoGrande and GSM8K, but increased TruthfulQA, MMLU and ARC. Interesting. To be hosnest, I'm a bit surprised that it didn't do that much worse.
+Looks like adding a shitton of RP stuff decreased HellaSwag, WinoGrande and GSM8K, but increased TruthfulQA, MMLU and ARC. Interesting. To be honest, I'm a bit surprised that it didn't do that much worse.
+
+### WolframRavenwolf
+Benchmark by [@wolfram](https://huggingface.co/wolfram)
+
+Artefact2/Gembo-v1-70b-GGUF GGUF Q5_K_M, 4K context, Alpaca format:
+- ✅ Gave correct answers to all 18/18 multiple choice questions! Just the questions, no previous information, gave correct answers: 16/18
+- ✅ Consistently acknowledged all data input with "OK".
+- ➖ Did NOT follow instructions to answer with just a single letter or more than just a single letter.
+
+This shows that this model can be used for real-world use cases as an assistant.
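
A quick sanity check on the leaderboard rows in the diff above: the Average column is just the arithmetic mean of the six task scores. A minimal sketch in Python, using the Gembo-v1-70b numbers copied from the table (the variable names are ours, not from the leaderboard code):

```python
# The Open LLM leaderboard "Average" column is the plain mean of the six
# task scores: ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K.
gembo = [71.25, 86.98, 70.85, 63.25, 80.51, 50.19]  # Gembo-v1-70b row
print(sum(gembo) / len(gembo))  # ~70.505, reported as 70.51 in the table
```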
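The hunk context above mentions SLERP-merging with cognitivecomputations/dolphin-2.2-70b. For readers unfamiliar with the technique, here is a minimal per-tensor SLERP sketch in the style of mergekit-like tools; it is illustrative only, not the actual merge recipe used for Gembo, and the blend ratio t=0.5 is an assumption:

```python
import numpy as np

def slerp(w_a, w_b, t, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Each tensor is treated as one flat vector; falls back to a plain
    linear interpolation when the vectors are nearly colinear.
    """
    a, b = w_a.ravel(), w_b.ravel()
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = float(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    omega = np.arccos(dot)            # angle between the two weight vectors
    if np.sin(omega) < eps:           # nearly parallel -> ordinary lerp
        merged = (1.0 - t) * a + t * b
    else:
        merged = (np.sin((1.0 - t) * omega) * a
                  + np.sin(t * omega) * b) / np.sin(omega)
    return merged.reshape(w_a.shape)

# Illustrative use on one layer's weights (random stand-ins, not real checkpoints):
layer_base = np.random.randn(8, 8).astype(np.float32)
layer_dolphin = np.random.randn(8, 8).astype(np.float32)
merged = slerp(layer_base, layer_dolphin, t=0.5)  # t is the blend ratio
```

Compared with a plain weighted average, interpolating along the arc between the two weight vectors tends to preserve their norms, which is the usual argument for SLERP in model merging.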
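The WolframRavenwolf run above used "Alpaca format", i.e. the standard Alpaca instruction template. A minimal sketch of building such a prompt; the instruction text here is only an example, echoing the single-letter test from the notes above:

```python
# Standard Alpaca instruction template ("Alpaca format" in the benchmark setup).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    instruction="Answer with just a single letter: A, B, C or D."
)
print(prompt)
```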