Llama-3.1 models finally get their Chatbot Arena ranking
Given the impressive benchmarks published by Meta for their Llama-3.1 models, I was curious to see how these models would compare to top proprietary models on Chatbot Arena.
Now we've got the results! LMSys released the Elo scores derived from thousands of user votes for the new models, and here are the rankings:
405B model ranks 5th overall, ahead of GPT-4-turbo, but behind GPT-4o, Claude-3.5 Sonnet and Gemini-advanced.
70B model climbs to 9th place, from 1206 ➡️ 1244.
8B model improves from 1152 ➡️ 1170.
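Arena scores sit on an Elo-style scale, so a rating gap can be read as a rough head-to-head win rate. Here is a minimal sketch, assuming the standard Elo expected-score formula (base 10, scale 400) and assuming the two numbers quoted for each size are the previous and the new score. The function name `expected_score` is made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Arena scores quoted in the post, read as (previous, new); this reading is an assumption.
scores = {"70B": (1206, 1244), "8B": (1152, 1170)}

for name, (old, new) in scores.items():
    p = expected_score(new, old)
    print(f"{name}: +{new - old} points, expected score vs. previous score ~ {p:.2f}")
```

Read this way, gaps of a few dozen points correspond to a modest edge in pairwise votes rather than a landslide.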
✅ This confirms that Llama-3.1 is a good contender for any task: each of its 3 model sizes is much cheaper to run than equivalent proprietary models!
For instance, here are the inference prices for the top models:
➤ GPT-4-Turbo inference price from OpenAI: $5/M input tokens, $15/M output tokens
➤ Llama-3.1-405B from HF API (for testing only): $3/M for input or output tokens (Source linked in the first comment)
➤ Llama-3.1-405B from HF API (for testing only): free ✨
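To make the gap concrete, here is a rough sketch that turns the per-million-token prices listed above into a cost for a given workload. The prices are copied from the list; the workload size (2M input tokens, 500k output tokens) and the helper name `request_cost` are made up for illustration.

```python
# Prices in USD per million tokens, copied from the list above.
PRICES = {
    "GPT-4-Turbo (OpenAI)": {"input": 5.0, "output": 15.0},
    "Llama-3.1-405B (HF API, paid)": {"input": 3.0, "output": 3.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one workload at the listed per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for model in PRICES:
    cost = request_cost(model, input_tokens=2_000_000, output_tokens=500_000)
    print(f"{model}: ${cost:.2f}")
```

The more output-heavy the workload, the wider the gap gets, since output tokens are where the listed prices differ most.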
Get a head start on the HF API (resource by @andrewrreed): https://huggingface.co/learn/cookbook/enterprise_hub_serverless_inference_api