RaymondAISG committed
Commit: 844e79e
Parent(s): 70a8b9f
Update README.md
README.md CHANGED
@@ -56,8 +56,7 @@ IFEval evaluates a model's ability to adhere to constraints provided in the prompt
 MT-Bench evaluates a model's ability to engage in multi-turn (2 turns) conversations and respond in ways that align with human needs. We use `gpt-4-1106-preview` as the judge model and compare against `gpt-3.5-turbo-0125` as the baseline model. The metric used is the weighted win rate against the baseline model (i.e. average win rate across each category (Math, Reasoning, STEM, Humanities, Roleplay, Writing, Extraction)). A tie is given a score of 0.5.
 
 
-For more details on Llama3 8B CPT SEA-LIONv2 Instruct benchmark performance, please refer to the
-https://leaderboard.sea-lion.ai/
+For more details on Llama3 8B CPT SEA-LIONv2 Instruct benchmark performance, please refer to the SEA HELM leaderboard, https://leaderboard.sea-lion.ai/
 
 
 ### Usage
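As context for the MT-Bench metric described in the hunk above, a minimal sketch of the scoring it implies: a per-category win rate against the baseline (win = 1, tie = 0.5, loss = 0), averaged across the seven listed categories. The `judgements` input format is an illustrative assumption, not the SEA-LION evaluation code.

```python
# Minimal sketch of the MT-Bench weighted win rate described above.
# Assumption: `judgements` maps each category to a list of per-question
# outcomes ("win", "tie", "loss") for the candidate model vs. the baseline.

CATEGORIES = ["Math", "Reasoning", "STEM", "Humanities",
              "Roleplay", "Writing", "Extraction"]

def weighted_win_rate(judgements: dict[str, list[str]]) -> float:
    scores = {"win": 1.0, "tie": 0.5, "loss": 0.0}  # a tie scores 0.5
    per_category = [
        sum(scores[o] for o in judgements[cat]) / len(judgements[cat])
        for cat in CATEGORIES
    ]
    # Average the per-category win rates so every category contributes
    # equally, regardless of how many questions it contains.
    return sum(per_category) / len(per_category)
```

Averaging per-category rates rather than pooling all questions keeps question-heavy categories from dominating the overall score.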