Update README.md (#3)
opened by kushal-tri

README.md CHANGED
@@ -62,26 +62,23 @@ Here are the evaluation results for DCLM-1B models on various tasks (using [llm-
 
 Note: All scores are presented as decimal values between 0 and 1, representing the proportion of correct answers or the model's performance on each task.
 
-Moreover, we present our evaluation results on Length-Controlled Alpaca-Eval 2.0 to measure our instruction-following capabilities.
+Moreover, we present our evaluation results on Length-Controlled Alpaca-Eval 2.0 to measure our instruction-following capabilities.
 
 | Model                              | AlpacaEval2.0 LC Win-rate (%) |
 |------------------------------------|------------------------------:|
 | **Our runs**                       |                               |
-| DCLM-IT-1B                         | 8.6
+| DCLM-IT-1B                         |                       **8.6** |
 | DCLM-IT-7B                         |                          16.6 |
-
-| DCLM-Baseline-7B w/ OpenHermes 2.5 |                          13.8 |
-| **Reported from the leaderboard**  |                               |
-| LLaMA-3-Instruct-8B                |                      **22.9** |
-| Mistral-v0.2-7B                    |                          17.1 |
-| Mistral-7B w/ OpenHermes 2.5       |                          16.2 |
-| Zephyr-Beta-7B                     |                          13.2 |
-| Vicuna-v1.3-13B                    |                          10.8 |
+| **Reported from the leaderboard**  |                               |
 | Gemma-Instruct-7B                  |                          10.4 |
 | Nous-Hermes-13B                    |                           9.7 |
 | DaVinci001                         |                           9.0 |
 | LLaMA-2-Chat-13B                   |                           8.4 |
 | Alpaca-7B                          |                           5.9 |
+| Gemma-Instruct-2B                  |                           5.4 |
+| Phi-2 SFT                          |                           5.9 |
+| Qwen1.5 1.8B Chat                  |                           2.6 |
+|--------------------------------------------------------------------|
 
 ## Example Code
 
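As context for the AlpacaEval 2.0 numbers in the diff above, here is a minimal sketch of how a win rate is derived from pairwise judge annotations. This is not this repository's evaluation code: the `annotations.json` filename and the `preference` field convention are assumptions modeled on AlpacaEval's annotation format, and the sketch computes the plain win rate, not the length-controlled variant reported in the table (which additionally regresses out the judge's response-length bias).

```python
# Hedged sketch: estimate an AlpacaEval-style win rate from judge annotations.
# Assumptions (not taken from this PR): annotations are a JSON list of records
# with a "preference" field in [1.0, 2.0], where 1.0 means the baseline's
# response was preferred, 2.0 means the evaluated model's response was
# preferred, and fractional values give partial credit.
import json


def win_rate(path: str) -> float:
    """Return the win rate (%) of the evaluated model over the baseline."""
    with open(path) as f:
        annotations = json.load(f)
    # Map each preference in [1.0, 2.0] onto a win score in [0.0, 1.0],
    # skipping records where the judge produced no preference.
    scores = [a["preference"] - 1.0 for a in annotations
              if a.get("preference") is not None]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    print(f"Win rate: {win_rate('annotations.json'):.1f}%")
```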