metadata
license: other
license_name: nexusflowresearchlicense
license_link: >-
https://huggingface.co/Nexusflow/Athene-V2-Chat/resolve/main/Nexusflow_Research_License_.pdf
language:
- en
library_name: transformers
tags:
- RLHF
- Nexusflow
- Athene
- Chat Model
base_model:
- Qwen/Qwen2.5-72B-Instruct
Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
- AWQ 4bit version of Nexusflow/Athene-V2-Chat
- Quantization code
Eval AWQ version
Evaluation results on ZebraGrid
ββββββββββββββββββββββββββββββββββββ€βββββββββ€βββββββββββ€βββββββββββ€βββββββββββββββ€ββββββββββββββββββββ€ββββββββββββββββββββ€βββββββββββββ€ββββββββββββββ€ββββββββββββββββββ€ββββββββββββββββ
β Model β Mode β N_Mode β N_Size β Puzzle Acc β Easy Puzzle Acc β Hard Puzzle Acc β Cell Acc β No answer β Total Puzzles β Reason Lens β
ββββββββββββββββββββββββββββββββββββͺβββββββββͺβββββββββββͺβββββββββββͺβββββββββββββββͺββββββββββββββββββββͺββββββββββββββββββββͺβββββββββββββͺββββββββββββββͺββββββββββββββββββͺββββββββββββββββ‘
β o1-preview-2024-09-12 β greedy β single β 1 β 71.4 β 98.57 β 60.83 β 75.14 β 0.3 β 1000 β 1565.88 β
β o1-preview-2024-09-12-v2 β greedy β single β 1 β 70.4 β 98.21 β 59.58 β 74.18 β 0.4 β 1000 β 1559.71 β
β o1-mini-2024-09-12-v3 β greedy β single β 1 β 59.7 β 86.07 β 49.44 β 70.32 β 1 β 1000 β 1166.38 β
β o1-mini-2024-09-12-v2 β greedy β single β 1 β 56.8 β 82.86 β 46.67 β 69.87 β 1.3 β 1000 β 1164.95 β
β o1-mini-2024-09-12 β greedy β single β 1 β 52.6 β 87.14 β 39.17 β 52.29 β 0.8 β 1000 β 993.28 β
β claude-3-5-sonnet-20241022 β greedy β single β 1 β 36.2 β 91.07 β 14.86 β 54.27 β 0 β 1000 β 861.18 β
β claude-3-5-sonnet-20240620 β greedy β single β 1 β 33.4 β 87.5 β 12.36 β 54.34 β 0 β 1000 β 1141.94 β
β Llama-3.1-405B-Inst-fp8@together β greedy β single β 1 β 32.6 β 87.14 β 11.39 β 45.8 β 12.5 β 1000 β 314.66 β
β gpt-4o-2024-08-06 β greedy β single β 1 β 31.7 β 84.64 β 11.11 β 50.34 β 3.6 β 1000 β 1106.51 β
β gemini-1.5-pro-exp-0827 β greedy β single β 1 β 30.5 β 79.64 β 11.39 β 50.84 β 0.8 β 1000 β 1594.47 β
β Llama-3.1-405B-Inst@sambanova β greedy β single β 1 β 30.1 β 84.64 β 8.89 β 39.06 β 24.7 β 1000 β 2001.12 β
β chatgpt-4o-latest-24-09-07 β greedy β single β 1 β 29.9 β 81.43 β 9.86 β 48.83 β 4.2 β 1000 β 1539.99 β
β Mistral-Large-2 β greedy β single β 1 β 29 β 80.36 β 9.03 β 47.64 β 1.7 β 1000 β 1592.39 β
β gpt-4-turbo-2024-04-09 β greedy β single β 1 β 28.4 β 80.71 β 8.06 β 47.9 β 0.1 β 1000 β 1148.46 β
β gpt-4o-2024-05-13 β greedy β single β 1 β 28.2 β 77.86 β 8.89 β 38.72 β 19.3 β 1000 β 1643.51 β
β Athene-V2-Chat-AWQ β greedy β single β 1 β 27.8 β 77.14 β 8.61 β 45.83 β 6.4 β 1000 β 1785.7 β
β gpt-4-0314 β greedy β single β 1 β 27.1 β 77.14 β 7.64 β 47.43 β 0.2 β 1000 β 1203.17 β
β claude-3-opus-20240229 β greedy β single β 1 β 27 β 78.21 β 7.08 β 48.91 β 0 β 1000 β 855.72 β
β Qwen2.5-72B-Instruct β greedy β single β 1 β 26.6 β 76.43 β 7.22 β 40.92 β 11.9 β 1000 β 1795.9 β
β Qwen2.5-32B-Instruct β greedy β single β 1 β 26.1 β 77.5 β 6.11 β 43.39 β 6.3 β 1000 β 1333.07 β
β Athene-70B β greedy β single β 1 β 16.7 β 52.5 β 2.78 β 32.98 β 21.1 β 1000 β 391.19 β
ββββββββββββββββββββββββββββββββββββ§βββββββββ§βββββββββββ§βββββββββββ§βββββββββββββββ§ββββββββββββββββββββ§ββββββββββββββββββββ§βββββββββββββ§ββββββββββββββ§ββββββββββββββββββ§ββββββββββββββββ