loubnabnl HF staff leo-c committed on
Commit
b2c4485
1 Parent(s): cb2e5cf

[Community Submission] Model: m-a-p/OpenCodeInterpreter-DS-6.7B, Username: Anitaliu98 (#61)


- Add results of OpenCodeInterpreter-DS-6.7B (e95e12e199d4dd11589e9c1019adde306c024a23)
- Add results of OpenCodeInterpreter-DS-6.7B (1f4667e875fd17d7e830d954324530eacd65d606)
- rebase and add OpenCodeInterpreter-DS-6.7B (fbb96f14a4bba760ea332e12211fafa08af4942c)
- rebase and add OpenCodeInterpreter-DS-6.7B (45a006eceba88c00a220557f973534cf8fbfae61)
- fix conflict (46d2b98ff793d3bd49c1705b59dbd157e00fcd8a)
- Merge commit 'refs/pr/61' of https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard into pr/61 (b5a63bc1b21dbe057f91ba72710b6ab93ce2b395)


Co-authored-by: leo-c <leo-c@users.noreply.huggingface.co>

Files changed (38)
  1. README.md +1 -0
  2. community_results/codellama_70b/codellama_70b.json +1 -0
  3. community_results/codellama_70b/codellama_70b_instruct.json +1 -0
  4. community_results/codellama_70b/codellama_70b_python.json +1 -0
  5. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_humaneval_OpenCodeInterpreter-DS-6.7B_humaneval.json +0 -0
  6. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-cpp_OpenCodeInterpreter-DS-6.7B_multiple-cpp.json +0 -0
  7. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-d_OpenCodeInterpreter-DS-6.7B_multiple-d.json +0 -0
  8. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-java_OpenCodeInterpreter-DS-6.7B_multiple-java.json +0 -0
  9. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-jl_OpenCodeInterpreter-DS-6.7B_multiple-jl.json +0 -0
  10. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-js_OpenCodeInterpreter-DS-6.7B_multiple-js.json +0 -0
  11. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-lua_OpenCodeInterpreter-DS-6.7B_multiple-lua.json +0 -0
  12. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-php_OpenCodeInterpreter-DS-6.7B_multiple-php.json +0 -0
  13. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-r_OpenCodeInterpreter-DS-6.7B_multiple-r.json +0 -0
  14. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-rkt_OpenCodeInterpreter-DS-6.7B_multiple-rkt.json +0 -0
  15. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-rs_OpenCodeInterpreter-DS-6.7B_multiple-rs.json +0 -0
  16. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-swift_OpenCodeInterpreter-DS-6.7B_multiple-swift.json +0 -0
  17. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98.json +1 -0
  18. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_humaneval_OpenCodeInterpreter-DS-6.7B.json +11 -0
  19. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-cpp_OpenCodeInterpreter-DS-6.7B.json +11 -0
  20. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-d_OpenCodeInterpreter-DS-6.7B.json +11 -0
  21. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-java_OpenCodeInterpreter-DS-6.7B.json +11 -0
  22. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-jl_OpenCodeInterpreter-DS-6.7B.json +11 -0
  23. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-js_OpenCodeInterpreter-DS-6.7B.json +11 -0
  24. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-lua_OpenCodeInterpreter-DS-6.7B.json +11 -0
  25. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-php_OpenCodeInterpreter-DS-6.7B.json +11 -0
  26. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-r_OpenCodeInterpreter-DS-6.7B.json +11 -0
  27. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-rkt_OpenCodeInterpreter-DS-6.7B.json +11 -0
  28. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-rs_OpenCodeInterpreter-DS-6.7B.json +11 -0
  29. community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-swift_OpenCodeInterpreter-DS-6.7B.json +11 -0
  30. data/code_eval_board.csv +44 -43
  31. data/raw_scores.csv +1 -0
  32. logs_.txt +0 -0
  33. metric_CodeLlama-70b-hf.json +42 -0
  34. optimum-benchmark +1 -0
  35. src/__pycache__/utils.cpython-310.pyc +0 -0
  36. src/__pycache__/utils.cpython-311.pyc +0 -0
  37. src/build.py +2 -0
  38. src/utils.py +1 -1
README.md CHANGED
@@ -61,4 +61,5 @@ models:
  - bigcode/starcoder2-3b
  - stabilityai/stable-code-3b
  - m-a-p/OpenCodeInterpreter-DS-33B
+ - m-a-p/OpenCodeInterpreter-DS-6.7B
  ---
community_results/codellama_70b/codellama_70b.json ADDED
@@ -0,0 +1 @@
+ {"results": [{"task": "multiple-swift", "pass@1": 0.42857142857142855}, {"task": "multiple-lua", "pass@1": 0.4161490683229814}, {"task": "multiple-rkt", "pass@1": 0.0}, {"task": "multiple-js", "pass@1": 0.5652173913043478}, {"task": "multiple-d", "pass@1": 0.2484472049689441}, {"task": "multiple-r", "pass@1": 0.2795031055900621}, {"task": "multiple-cpp", "pass@1": 0.4968944099378882}, {"task": "multiple-rs", "pass@1": 0.4968944099378882}, {"task": "multiple-jl", "pass@1": 0.422360248447205}, {"task": "multiple-php", "pass@1": 0.4658385093167702}, {"task": "humaneval", "pass@1": 0.524390243902439}, {"task": "multiple-java", "pass@1": 0.4472049689440994}], "meta": {"model": "codellama//CodeLlama-70b-hf"}}
community_results/codellama_70b/codellama_70b_instruct.json ADDED
@@ -0,0 +1 @@
+ {"results": [{"task": "multiple-swift", "pass@1": 0.42857142857142855}, {"task": "multiple-lua", "pass@1": 0.4409937888198758}, {"task": "multiple-rkt", "pass@1": 0.0}, {"task": "multiple-js", "pass@1": 0.577639751552795}, {"task": "multiple-d", "pass@1": 0.19875776397515527}, {"task": "multiple-r", "pass@1": 0.2919254658385093}, {"task": "multiple-cpp", "pass@1": 0.484472049689441}, {"task": "multiple-rs", "pass@1": 0.4720496894409938}, {"task": "multiple-jl", "pass@1": 0.422360248447205}, {"task": "multiple-php", "pass@1": 0.5714285714285714}, {"task": "humaneval", "pass@1": 0.5853658536585366}, {"task": "multiple-java", "pass@1": 0.4720496894409938}], "meta": {"model": "codellama//CodeLlama-70b-Instruct-hf"}}
community_results/codellama_70b/codellama_70b_python.json ADDED
@@ -0,0 +1 @@
+ {"results": [{"task": "multiple-swift", "pass@1": 0.391304347826087}, {"task": "multiple-lua", "pass@1": 0.4472049689440994}, {"task": "multiple-rkt", "pass@1": 0.0}, {"task": "multiple-js", "pass@1": 0.5652173913043478}, {"task": "multiple-d", "pass@1": 0.2111801242236025}, {"task": "multiple-r", "pass@1": 0.2608695652173913}, {"task": "multiple-cpp", "pass@1": 0.4968944099378882}, {"task": "multiple-rs", "pass@1": 0.484472049689441}, {"task": "multiple-jl", "pass@1": 0.35403726708074534}, {"task": "multiple-php", "pass@1": 0.5279503105590062}, {"task": "humaneval", "pass@1": 0.5548780487804879}, {"task": "multiple-java", "pass@1": 0.45962732919254656}], "meta": {"model": "codellama//CodeLlama-70b-Python-hf"}}
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_humaneval_OpenCodeInterpreter-DS-6.7B_humaneval.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-cpp_OpenCodeInterpreter-DS-6.7B_multiple-cpp.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-d_OpenCodeInterpreter-DS-6.7B_multiple-d.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-java_OpenCodeInterpreter-DS-6.7B_multiple-java.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-jl_OpenCodeInterpreter-DS-6.7B_multiple-jl.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-js_OpenCodeInterpreter-DS-6.7B_multiple-js.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-lua_OpenCodeInterpreter-DS-6.7B_multiple-lua.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-php_OpenCodeInterpreter-DS-6.7B_multiple-php.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-r_OpenCodeInterpreter-DS-6.7B_multiple-r.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-rkt_OpenCodeInterpreter-DS-6.7B_multiple-rkt.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-rs_OpenCodeInterpreter-DS-6.7B_multiple-rs.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/generations_OpenCodeInterpreter-DS-6.7B/generations_multiple-swift_OpenCodeInterpreter-DS-6.7B_multiple-swift.json ADDED
The diff for this file is too large to render. See raw diff
 
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98.json ADDED
@@ -0,0 +1 @@
+ {"results": [{"task": "multiple-rs", "pass@1": 0.48217948717948717}, {"task": "multiple-lua", "pass@1": 0.4429813664596272}, {"task": "multiple-rkt", "pass@1": 0.2432298136645962}, {"task": "multiple-php", "pass@1": 0.5734161490683232}, {"task": "multiple-js", "pass@1": 0.6385093167701864}, {"task": "multiple-r", "pass@1": 0.39080745341614903}, {"task": "multiple-java", "pass@1": 0.5140506329113924}, {"task": "multiple-cpp", "pass@1": 0.6001242236024846}, {"task": "humaneval", "pass@1": 0.7319512195121951}, {"task": "multiple-d", "pass@1": 0.1821794871794872}, {"task": "multiple-jl", "pass@1": 0.39685534591194976}, {"task": "multiple-swift", "pass@1": 0.4598734177215191}], "meta": {"model": "map/OpenCodeInterpreter-DS-6.7B"}, "meta_score": 0.4713464927831165}
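The `meta_score` field in the line above appears to be the unweighted mean of the twelve `pass@1` values. The diff does not state how it was computed, so the following is only a minimal sketch that checks this assumption against the numbers copied from the JSON; the helper name `mean_pass_at_1` is hypothetical and not part of the leaderboard code.

```python
import math

# pass@1 values copied verbatim from the results JSON in this diff
results = [
    {"task": "multiple-rs", "pass@1": 0.48217948717948717},
    {"task": "multiple-lua", "pass@1": 0.4429813664596272},
    {"task": "multiple-rkt", "pass@1": 0.2432298136645962},
    {"task": "multiple-php", "pass@1": 0.5734161490683232},
    {"task": "multiple-js", "pass@1": 0.6385093167701864},
    {"task": "multiple-r", "pass@1": 0.39080745341614903},
    {"task": "multiple-java", "pass@1": 0.5140506329113924},
    {"task": "multiple-cpp", "pass@1": 0.6001242236024846},
    {"task": "humaneval", "pass@1": 0.7319512195121951},
    {"task": "multiple-d", "pass@1": 0.1821794871794872},
    {"task": "multiple-jl", "pass@1": 0.39685534591194976},
    {"task": "multiple-swift", "pass@1": 0.4598734177215191},
]
reported_meta_score = 0.4713464927831165


def mean_pass_at_1(entries):
    """Unweighted mean of pass@1 across tasks (assumed definition of meta_score)."""
    return sum(e["pass@1"] for e in entries) / len(entries)


computed = mean_pass_at_1(results)
# Compare with a tolerance, since float summation order can change the last digit
agrees = math.isclose(computed, reported_meta_score, abs_tol=1e-9)
print(f"computed={computed:.6f} reported={reported_meta_score:.6f} agrees={agrees}")
```

Under this assumption every language counts equally, so a weak score on a single task such as `multiple-d` lowers the aggregate as much as a weak `humaneval` score would.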
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_humaneval_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "humaneval": {
+     "pass@1": 0.7319512195121951,
+     "pass@10": 0.8082743120119809
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
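The metrics files report `pass@1` and `pass@10` with `n_samples` set to 50, but the diff does not show how those figures were derived. A common choice is the unbiased pass@k estimator introduced with HumanEval, sketched below under that assumption. The function name `pass_at_k` and the per-problem pass counts are illustrative only and are not taken from this submission.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated, c: samples that pass the tests,
    k: number of attempts allowed.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random
    # size-k subset of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical pass counts for four problems, 50 samples each
n = 50
correct_counts = [50, 25, 0, 10]

# The benchmark-level score is the mean of the per-problem estimates
pass1 = sum(pass_at_k(n, c, 1) for c in correct_counts) / len(correct_counts)
pass10 = sum(pass_at_k(n, c, 10) for c in correct_counts) / len(correct_counts)
print(f"pass@1={pass1:.4f} pass@10={pass10:.4f}")
```

With this estimator, pass@1 reduces to the fraction of passing samples, and pass@10 can never be lower than pass@1 for the same set of samples, which is consistent with every metrics file in this commit.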
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-cpp_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-cpp": {
+     "pass@1": 0.6001242236024846,
+     "pass@10": 0.7089648726193274
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-d_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-d": {
+     "pass@1": 0.1821794871794872,
+     "pass@10": 0.26437046545535126
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-java_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-java": {
+     "pass@1": 0.5140506329113924,
+     "pass@10": 0.6182488826143548
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-jl_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-jl": {
+     "pass@1": 0.39685534591194976,
+     "pass@10": 0.5057139684204232
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-js_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-js": {
+     "pass@1": 0.6385093167701864,
+     "pass@10": 0.723789635471313
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-lua_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-lua": {
+     "pass@1": 0.4429813664596272,
+     "pass@10": 0.6224512906010209
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-php_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-php": {
+     "pass@1": 0.5734161490683232,
+     "pass@10": 0.7151100934953563
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-r_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-r": {
+     "pass@1": 0.39080745341614903,
+     "pass@10": 0.5297901047393728
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-rkt_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-rkt": {
+     "pass@1": 0.2432298136645962,
+     "pass@10": 0.4076642646276392
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-rs_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-rs": {
+     "pass@1": 0.48217948717948717,
+     "pass@10": 0.644574758535619
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
community_results/m-a-p_OpenCodeInterpreter-DS-6.7B_Anitaliu98/metrics_OpenCodeInterpreter-DS-6.7B/metrics_multiple-swift_OpenCodeInterpreter-DS-6.7B.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "multiple-swift": {
+     "pass@1": 0.4598734177215191,
+     "pass@10": 0.5886218693250106
+   },
+   "config": {
+     "model": "HF_ORGANISATION/OpenCodeInterpreter-DS-6.7B",
+     "temperature": 0.2,
+     "n_samples": 50
+   }
+ }
data/code_eval_board.csv CHANGED
@@ -1,51 +1,52 @@
1
  T,Model,Size (B),Win Rate,Throughput (tokens/s),Seq_length,#Languages,humaneval-python,java,javascript,cpp,php,julia,d,Average score,lua,r,racket,rust,swift,Throughput (tokens/s) bs=50,Peak Memory (MB),models_query,Links,Submission PR
2
- 🔴,OpenCodeInterpreter-DS-33B,33.0,49.42,,16384,86,75.23,54.8,69.06,64.47,59.32,46.58,22.31,53.32,57.76,45.94,34.55,57.95,51.85,,,OpenCodeInterpreter-DS-33B,https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-33B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/60
3
- 🔴,CodeFuse-DeepSeek-33b,33.0,47.83,17.5,16384,86,76.83,60.76,66.46,65.22,57.76,38.36,24.36,51.69,52.8,40.37,34.16,53.85,49.37,,75833.0,CodeFuse-DeepSeek-33b,https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/51
4
- 🔴,DeepSeek-Coder-33b-instruct,33.0,46.33,25.2,16384,86,80.02,52.03,65.13,62.36,52.5,42.92,17.85,49.99,50.92,39.43,31.69,55.56,49.42,,76800.0,DeepSeek-Coder-33b-instruct,https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/42
5
- 🔴,DeepSeek-Coder-7b-instruct,6.7,45.42,51.0,16384,86,80.22,53.34,65.8,59.66,59.4,38.84,21.59,48.17,47.78,38.56,20.87,47.73,44.22,,22922.0,DeepSeek-Coder-7b-instruct,https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/43
6
- 🔶,Phind-CodeLlama-34B-v2,34.0,43.88,15.1,16384,UNK,71.95,54.06,65.34,59.59,56.26,45.12,14.12,48.7,44.27,37.7,28.7,57.67,49.63,0.0,69957.0,Phind-CodeLlama-34B-v2,https://huggingface.co/phind/Phind-CodeLlama-34B-v2,
7
- 🔶,Phind-CodeLlama-34B-v1,34.0,42.96,15.1,16384,UNK,65.85,49.47,64.45,57.81,55.53,43.23,15.5,46.9,42.05,36.71,24.89,54.1,53.27,0.0,69957.0,Phind-CodeLlama-34B-v1,https://huggingface.co/phind/Phind-CodeLlama-34B-v1,
8
- 🔶,Phind-CodeLlama-34B-Python-v1,34.0,41.5,15.1,16384,UNK,70.22,48.72,66.24,55.34,52.05,44.23,13.78,45.25,39.44,37.76,18.88,49.22,47.11,0.0,69957.0,Phind-CodeLlama-34B-Python-v1,https://huggingface.co/phind/Phind-CodeLlama-34B-Python-v1,
9
- 🔶,CodeLlama-70b-Instruct,70.0,39.67,,2048,UNK,75.6,47.2,57.76,48.45,57.14,42.24,19.88,42.64,44.1,29.19,0.0,47.2,42.86,,,CodeLlama-70b-Instruct,https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf,
 
 
 
10
  🔴,DeepSeek-Coder-33b-base,33.0,39.33,25.2,16384,86,52.45,43.77,51.28,51.22,41.76,32.83,17.41,38.07,36.51,26.76,23.37,43.78,35.75,,76800.0,DeepSeek-Coder-33b-base,https://huggingface.co/deepseek-ai/deepseek-coder-33b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/31
11
- 🔶,WizardCoder-Python-34B-V1.0,34.0,39.27,15.1,16384,UNK,70.73,44.94,55.28,47.2,47.2,41.51,15.38,41.95,32.3,39.75,18.63,46.15,44.3,0.0,69957.0,WizardCoder-Python-34B-V1.0,https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0,
12
- 🟢,CodeLlama-70b,70.0,39.08,,16384,UNK,52.44,44.72,56.52,49.69,46.58,42.24,24.84,39.93,41.61,27.95,0.0,49.69,42.86,,,CodeLlama-70b,https://huggingface.co/codellama/CodeLlama-70b-hf,
13
- 🟢,CodeLlama-70b-Python,70.0,38.5,,2048,UNK,55.49,45.96,56.52,49.69,52.8,35.4,21.12,39.61,44.72,26.09,0.0,48.45,39.13,,,CodeLlama-70b-Python,https://huggingface.co/codellama/CodeLlama-70b-Python-hf,
14
- 🟢,StarCoder2-15B,15.0,36.92,,16384,619,44.15,33.86,44.24,41.44,39.48,33.19,23.64,34.85,43.75,19.81,22.41,38.03,34.18,,,StarCoder2-15B,https://huggingface.co/bigcode/starcoder2-15b,
15
- 🔴,DeepSeek-Coder-7b-base,6.7,35.42,51.0,16384,86,45.83,37.72,45.9,45.53,36.92,28.74,19.74,33.54,33.89,28.99,18.73,34.67,25.8,,22922.0,DeepSeek-Coder-7b-base,https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/32
16
- 🔶,CodeLlama-34b-Instruct,34.0,35.12,15.1,16384,UNK,50.79,41.53,45.85,41.53,36.98,32.65,13.63,35.09,38.87,24.25,18.09,39.26,37.63,0.0,69957.0,CodeLlama-34b-Instruct,https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf,
17
- 🔶,WizardCoder-Python-13B-V1.0,13.0,34.88,25.3,16384,UNK,62.19,41.77,48.45,42.86,42.24,38.99,11.54,35.94,32.92,27.33,16.15,34.62,32.28,0.0,28568.0,WizardCoder-Python-13B-V1.0,https://huggingface.co/WizardLM/WizardCoder-Python-13B-V1.0,
18
- 🟢,CodeLlama-34b,34.0,34.42,15.1,16384,UNK,45.11,40.19,41.66,41.42,40.43,31.4,15.27,33.89,37.49,22.71,16.94,38.73,35.28,0.0,69957.0,CodeLlama-34b,https://huggingface.co/codellama/CodeLlama-34b-hf,
19
- 🟢,CodeLlama-34b-Python,34.0,33.81,15.1,16384,UNK,53.29,39.46,44.72,39.09,39.78,31.37,17.29,33.87,31.9,22.35,13.19,39.67,34.3,0.0,69957.0,CodeLlama-34b-Python,https://huggingface.co/codellama/CodeLlama-34b-Python-hf,
20
- 🔶,WizardCoder-15B-V1.0,15.0,32.54,43.7,8192,86,58.12,35.77,41.91,38.95,39.34,33.98,12.14,32.07,27.85,22.53,13.39,33.74,27.06,1470.0,32414.0,WizardCoder-15B-V1.0,https://huggingface.co/WizardLM/WizardCoder-15B-V1.0,
21
- 🔶,CodeLlama-13b-Instruct,13.0,31.73,25.3,16384,UNK,50.6,33.99,40.92,36.36,32.07,32.23,16.29,31.29,31.6,20.14,16.66,32.82,31.75,0.0,28568.0,CodeLlama-13b-Instruct,https://huggingface.co/codellama/CodeLlama-13b-Instruct-hf,
22
- 🟢,CodeLlama-13b,13.0,29.88,25.3,16384,UNK,35.07,32.23,38.26,35.81,32.57,28.01,15.78,28.35,31.26,18.32,13.63,29.72,29.54,0.0,28568.0,CodeLlama-13b,https://huggingface.co/codellama/CodeLlama-13b-hf,
23
- 🟢,CodeLlama-13b-Python,13.0,27.96,25.3,16384,UNK,42.89,33.56,40.66,36.21,34.55,30.4,9.82,28.67,29.9,18.35,12.51,29.32,25.85,0.0,28568.0,CodeLlama-13b-Python,https://huggingface.co/codellama/CodeLlama-13b-Python-hf,
24
  🟢,StarCoder2-7B,7.0,26.67,,16384,17,34.09,29.42,35.35,33.63,30.58,20.42,15.12,26.1,30.67,16.72,11.58,29.62,26.06,,,StarCoder2-7B,https://huggingface.co/bigcode/starcoder2-7b,
25
- 🔶,CodeLlama-7b-Instruct,7.0,26.54,33.1,16384,UNK,45.65,28.77,33.11,29.03,28.55,27.58,11.81,26.45,30.47,19.7,11.81,24.27,26.66,693.0,15853.0,CodeLlama-7b-Instruct,https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf,
26
- 🔴,CodeShell-7B,7.0,25.08,33.9,8194,24,34.32,30.43,33.17,28.21,30.87,22.08,8.85,24.74,22.39,20.52,17.2,24.55,24.3,639.0,18511.0,CodeShell-7B,https://huggingface.co/WisdomShell/CodeShell-7B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/16
27
- 🟢,CodeLlama-7b,7.0,24.92,33.1,16384,UNK,29.98,29.2,31.8,27.23,25.17,25.6,11.6,24.36,30.36,18.04,11.94,25.82,25.52,693.0,15853.0,CodeLlama-7b,https://huggingface.co/codellama/CodeLlama-7b-hf,
28
- 🔶,OctoCoder-15B,15.0,23.38,44.4,8192,86,45.3,26.03,32.8,29.32,26.76,24.5,13.35,24.01,22.56,14.39,10.61,24.26,18.24,1520.0,32278.0,OctoCoder-15B,https://huggingface.co/bigcode/octocoder,
29
- 🟢,CodeLlama-7b-Python,7.0,23.31,33.1,16384,UNK,40.48,29.15,36.34,30.34,1.08,28.53,8.94,23.5,26.15,18.25,9.04,26.96,26.75,693.0,15853.0,CodeLlama-7b-Python,https://huggingface.co/codellama/CodeLlama-7b-Python-hf,
30
- 🟢,StarCoder-15B,15.0,22.81,43.9,8192,86,33.57,30.22,30.79,31.55,26.08,23.02,13.57,22.74,23.89,15.5,0.07,21.84,22.74,1490.0,33461.0,StarCoder-15B,https://huggingface.co/bigcode/starcoder,
31
  🟢,Falcon-180B,180.0,22.8,,2048,,35.37,28.48,31.68,28.57,,24.53,14.1,24.08,26.71,,10.56,25.0,15.82,,,Falcon-180B,https://huggingface.co/tiiuae/falcon-180B,
32
- 🟢,StarCoderBase-15B,15.0,22.15,43.8,8192,86,30.35,28.53,31.7,30.56,26.75,21.09,10.01,22.4,26.61,10.18,11.77,24.46,16.74,1460.0,32366.0,StarCoderBase-15B,https://huggingface.co/bigcode/starcoderbase,
33
  🟢,StarCoder2-3B,3.0,21.58,,16384,17,31.44,27.41,35.37,27.24,27.61,19.87,12.56,23.43,28.01,14.22,7.8,24.52,25.09,,,StarCoder2-3B,https://huggingface.co/bigcode/starcoder2-3b,
34
- 🟢,CodeGeex2-6B,6.0,19.19,32.7,8192,100,33.49,23.46,29.9,28.45,25.27,20.93,8.44,21.23,15.94,14.58,11.75,20.45,22.06,982.0,14110.0,CodeGeex2-6B,https://huggingface.co/THUDM/codegeex2-6b,
35
- 🟢,StarCoderBase-7B,7.0,18.62,46.9,8192,86,28.37,24.44,27.35,23.3,22.12,21.77,8.1,20.17,23.35,14.51,11.08,22.6,15.1,1700.0,16512.0,StarCoderBase-7B,https://huggingface.co/bigcode/starcoderbase-7b,
36
- 🔶,OctoGeeX-7B,7.0,18.35,32.7,8192,100,42.28,19.33,28.5,23.93,25.85,22.94,9.77,20.79,16.19,13.66,12.02,17.94,17.03,982.0,14110.0,OctoGeeX-7B,https://huggingface.co/bigcode/octogeex,
37
- 🔶,WizardCoder-3B-V1.0,3.0,17.19,50.0,8192,86,32.92,24.34,26.16,24.94,24.83,19.6,7.91,20.15,21.75,13.64,9.44,20.56,15.7,1770.0,8414.0,WizardCoder-3B-V1.0,https://huggingface.co/WizardLM/WizardCoder-3B-V1.0,
38
- 🟢,CodeGen25-7B-multi,7.0,16.65,32.6,2048,86,28.7,26.01,26.27,25.75,21.98,19.11,8.84,20.04,23.44,11.59,10.37,21.84,16.62,680.0,15336.0,CodeGen25-7B-multi,https://huggingface.co/Salesforce/codegen25-7b-multi,
39
- 🔶,Refact-1.6B,1.6,16.0,50.0,4096,19,31.1,22.78,22.36,21.12,22.36,13.84,10.26,17.86,15.53,13.04,4.97,18.59,18.35,2340.0,5376.0,Refact-1.6B,https://huggingface.co/smallcloudai/Refact-1_6B-fim,
40
  🟢,Stable-code-3b,3.0,15.42,,16384,18,30.72,28.75,31.64,29.42,23.68,21.41,10.09,19.06,17.54,13.37,0.0,22.15,0.0,,,Stable-code-3b,https://huggingface.co/stabilityai/stable-code-3b,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/57
41
  🔴,DeepSeek-Coder-1b-base,1.0,15.17,,16384,UNK,32.13,27.16,28.46,27.96,22.75,15.17,9.91,19.46,19.44,11.4,9.58,18.13,11.39,,,DeepSeek-Coder-1b-base,https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/33
42
- 🟢,StarCoderBase-3B,3.0,12.88,50.0,8192,86,21.5,19.25,21.32,19.43,18.55,16.1,4.97,15.29,18.04,10.1,7.87,16.32,9.98,1770.0,8414.0,StarCoderBase-3B,https://huggingface.co/bigcode/starcoderbase-3b,
43
- 🔶,WizardCoder-1B-V1.0,1.1,11.42,71.4,8192,86,23.17,19.68,19.13,15.94,14.71,13.85,4.64,13.89,15.52,10.01,6.51,13.91,9.59,2360.0,4586.0,WizardCoder-1B-V1.0,https://huggingface.co/WizardLM/WizardCoder-1B-V1.0,
44
- 🟢,Replit-2.7B,2.7,9.62,42.2,2048,20,20.12,21.39,20.18,20.37,16.14,1.24,6.41,11.62,2.11,7.2,3.22,15.19,5.88,577.0,7176.0,Replit-2.7B,https://huggingface.co/replit/replit-code-v1-3b,
45
- 🟢,CodeGen25-7B-mono,7.0,9.23,34.1,2048,86,33.08,19.75,23.22,18.62,16.75,4.65,4.32,12.1,6.75,4.41,4.07,7.83,1.71,687.0,15336.0,CodeGen25-7B-mono,https://huggingface.co/Salesforce/codegen25-7b-mono,
46
- 🟢,StarCoderBase-1.1B,1.1,9.19,71.4,8192,86,15.17,14.2,13.38,11.68,9.94,11.31,4.65,9.81,12.52,5.73,5.03,10.24,3.92,2360.0,4586.0,StarCoderBase-1.1B,https://huggingface.co/bigcode/starcoderbase-1b,
47
- 🟢,CodeGen-16B-Multi,16.0,8.15,17.2,2048,6,19.26,22.2,19.15,21.0,8.37,0.0,7.68,9.89,8.5,6.45,0.66,4.21,1.25,0.0,32890.0,CodeGen-16B-Multi,https://huggingface.co/Salesforce/codegen-16B-multi,
48
- 🟢,StableCode-3B-alpha,3.0,7.12,30.2,16384,7,20.2,19.54,18.98,20.77,3.95,0.0,4.77,8.1,5.14,0.8,0.008,2.03,0.98,718.0,15730.0,StableCode-3B-alpha,https://huggingface.co/stabilityai/stablecode-completion-alpha-3b,
49
- 🟢,DeciCoder-1B,1.0,6.88,54.6,2048,3,19.32,15.3,17.85,6.87,2.01,0.0,6.08,5.86,0.0,0.1,0.47,1.72,0.63,2490.0,4436.0,DeciCoder-1B,https://huggingface.co/Deci/DeciCoder-1b,
50
  🟢,Phi-1,1.0,6.67,,2048,1,51.22,10.76,19.25,14.29,12.42,0.63,7.05,12.15,6.21,6.21,3.11,4.49,10.13,,4941.0,Phi-1,https://huggingface.co/microsoft/phi-1,
51
- 🟢,SantaCoder-1.1B,1.1,5.5,50.8,2048,3,18.12,15.0,15.47,6.2,1.5,0.0,0.0,4.92,0.1,0.0,0.0,2.0,0.7,2270.0,4602.0,SantaCoder-1.1B,https://huggingface.co/bigcode/santacoder,
 
1
  T,Model,Size (B),Win Rate,Throughput (tokens/s),Seq_length,#Languages,humaneval-python,java,javascript,cpp,php,julia,d,Average score,lua,r,racket,rust,swift,Throughput (tokens/s) bs=50,Peak Memory (MB),models_query,Links,Submission PR
2
+ 🔴,OpenCodeInterpreter-DS-33B,33.0,50.42,,16384,86,75.23,54.8,69.06,64.47,59.32,46.58,22.31,53.32,57.76,45.94,34.55,57.95,51.85,,,OpenCodeInterpreter-DS-33B,https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-33B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/60
3
+ 🔴,CodeFuse-DeepSeek-33b,33.0,48.75,17.5,16384,86,76.83,60.76,66.46,65.22,57.76,38.36,24.36,51.69,52.8,40.37,34.16,53.85,49.37,,75833.0,CodeFuse-DeepSeek-33b,https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/51
4
+ 🔴,DeepSeek-Coder-33b-instruct,33.0,47.17,25.2,16384,86,80.02,52.03,65.13,62.36,52.5,42.92,17.85,49.99,50.92,39.43,31.69,55.56,49.42,,76800.0,DeepSeek-Coder-33b-instruct,https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/42
5
+ 🔴,DeepSeek-Coder-7b-instruct,6.7,45.92,51.0,16384,86,80.22,53.34,65.8,59.66,59.4,38.84,21.59,48.17,47.78,38.56,20.87,47.73,44.22,,22922.0,DeepSeek-Coder-7b-instruct,https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/43
6
+ 🔴,OpenCodeInterpreter-DS-6.7B,6.7,45.42,,16384,86,73.2,51.41,63.85,60.01,57.34,39.69,18.22,47.14,44.3,39.08,24.32,48.22,45.99,,,OpenCodeInterpreter-DS-6.7B,https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-6.7B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/61
7
+ 🔶,Phind-CodeLlama-34B-v2,34.0,44.5,15.1,16384,UNK,71.95,54.06,65.34,59.59,56.26,45.12,14.12,48.7,44.27,37.7,28.7,57.67,49.63,0.0,69957.0,Phind-CodeLlama-34B-v2,https://huggingface.co/phind/Phind-CodeLlama-34B-v2,
8
+ 🔶,Phind-CodeLlama-34B-v1,34.0,43.42,15.1,16384,UNK,65.85,49.47,64.45,57.81,55.53,43.23,15.5,46.9,42.05,36.71,24.89,54.1,53.27,0.0,69957.0,Phind-CodeLlama-34B-v1,https://huggingface.co/phind/Phind-CodeLlama-34B-v1,
9
+ 🔶,Phind-CodeLlama-34B-Python-v1,34.0,41.88,15.1,16384,UNK,70.22,48.72,66.24,55.34,52.05,44.23,13.78,45.25,39.44,37.76,18.88,49.22,47.11,0.0,69957.0,Phind-CodeLlama-34B-Python-v1,https://huggingface.co/phind/Phind-CodeLlama-34B-Python-v1,
10
+ 🔶,CodeLlama-70b-Instruct,70.0,39.83,,2048,UNK,75.6,47.2,57.76,48.45,57.14,42.24,19.88,42.64,44.1,29.19,0.0,47.2,42.86,,,CodeLlama-70b-Instruct,https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf,
11
+ 🔶,WizardCoder-Python-34B-V1.0,34.0,39.5,15.1,16384,UNK,70.73,44.94,55.28,47.2,47.2,41.51,15.38,41.95,32.3,39.75,18.63,46.15,44.3,0.0,69957.0,WizardCoder-Python-34B-V1.0,https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0,
12
+ 🟢,CodeLlama-70b,70.0,39.33,,16384,UNK,52.44,44.72,56.52,49.69,46.58,42.24,24.84,39.93,41.61,27.95,0.0,49.69,42.86,,,CodeLlama-70b,https://huggingface.co/codellama/CodeLlama-70b-hf,
13
  🔴,DeepSeek-Coder-33b-base,33.0,39.33,25.2,16384,86,52.45,43.77,51.28,51.22,41.76,32.83,17.41,38.07,36.51,26.76,23.37,43.78,35.75,,76800.0,DeepSeek-Coder-33b-base,https://huggingface.co/deepseek-ai/deepseek-coder-33b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/31
14
+ 🟢,CodeLlama-70b-Python,70.0,38.75,,2048,UNK,55.49,45.96,56.52,49.69,52.8,35.4,21.12,39.61,44.72,26.09,0.0,48.45,39.13,,,CodeLlama-70b-Python,https://huggingface.co/codellama/CodeLlama-70b-Python-hf,
15
+ 🟢,StarCoder2-15B,15.0,37.0,,16384,619,44.15,33.86,44.24,41.44,39.48,33.19,23.64,34.85,43.75,19.81,22.41,38.03,34.18,,,StarCoder2-15B,https://huggingface.co/bigcode/starcoder2-15b,
16
+ 🔴,DeepSeek-Coder-7b-base,6.7,35.5,51.0,16384,86,45.83,37.72,45.9,45.53,36.92,28.74,19.74,33.54,33.89,28.99,18.73,34.67,25.8,,22922.0,DeepSeek-Coder-7b-base,https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/32
17
+ 🔶,CodeLlama-34b-Instruct,34.0,35.19,15.1,16384,UNK,50.79,41.53,45.85,41.53,36.98,32.65,13.63,35.09,38.87,24.25,18.09,39.26,37.63,0.0,69957.0,CodeLlama-34b-Instruct,https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf,
18
+ 🔶,WizardCoder-Python-13B-V1.0,13.0,34.96,25.3,16384,UNK,62.19,41.77,48.45,42.86,42.24,38.99,11.54,35.94,32.92,27.33,16.15,34.62,32.28,0.0,28568.0,WizardCoder-Python-13B-V1.0,https://huggingface.co/WizardLM/WizardCoder-Python-13B-V1.0,
19
+ 🟢,CodeLlama-34b,34.0,34.5,15.1,16384,UNK,45.11,40.19,41.66,41.42,40.43,31.4,15.27,33.89,37.49,22.71,16.94,38.73,35.28,0.0,69957.0,CodeLlama-34b,https://huggingface.co/codellama/CodeLlama-34b-hf,
20
+ 🟢,CodeLlama-34b-Python,34.0,33.88,15.1,16384,UNK,53.29,39.46,44.72,39.09,39.78,31.37,17.29,33.87,31.9,22.35,13.19,39.67,34.3,0.0,69957.0,CodeLlama-34b-Python,https://huggingface.co/codellama/CodeLlama-34b-Python-hf,
21
+ 🔶,WizardCoder-15B-V1.0,15.0,32.62,43.7,8192,86,58.12,35.77,41.91,38.95,39.34,33.98,12.14,32.07,27.85,22.53,13.39,33.74,27.06,1470.0,32414.0,WizardCoder-15B-V1.0,https://huggingface.co/WizardLM/WizardCoder-15B-V1.0,
22
+ 🔶,CodeLlama-13b-Instruct,13.0,31.81,25.3,16384,UNK,50.6,33.99,40.92,36.36,32.07,32.23,16.29,31.29,31.6,20.14,16.66,32.82,31.75,0.0,28568.0,CodeLlama-13b-Instruct,https://huggingface.co/codellama/CodeLlama-13b-Instruct-hf,
23
+ 🟢,CodeLlama-13b,13.0,29.96,25.3,16384,UNK,35.07,32.23,38.26,35.81,32.57,28.01,15.78,28.35,31.26,18.32,13.63,29.72,29.54,0.0,28568.0,CodeLlama-13b,https://huggingface.co/codellama/CodeLlama-13b-hf,
24
+ 🟢,CodeLlama-13b-Python,13.0,28.04,25.3,16384,UNK,42.89,33.56,40.66,36.21,34.55,30.4,9.82,28.67,29.9,18.35,12.51,29.32,25.85,0.0,28568.0,CodeLlama-13b-Python,https://huggingface.co/codellama/CodeLlama-13b-Python-hf,
25
  🟢,StarCoder2-7B,7.0,26.67,,16384,17,34.09,29.42,35.35,33.63,30.58,20.42,15.12,26.1,30.67,16.72,11.58,29.62,26.06,,,StarCoder2-7B,https://huggingface.co/bigcode/starcoder2-7b,
26
+ 🔶,CodeLlama-7b-Instruct,7.0,26.62,33.1,16384,UNK,45.65,28.77,33.11,29.03,28.55,27.58,11.81,26.45,30.47,19.7,11.81,24.27,26.66,693.0,15853.0,CodeLlama-7b-Instruct,https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf,
27
+ 🔴,CodeShell-7B,7.0,25.15,33.9,8194,24,34.32,30.43,33.17,28.21,30.87,22.08,8.85,24.74,22.39,20.52,17.2,24.55,24.3,639.0,18511.0,CodeShell-7B,https://huggingface.co/WisdomShell/CodeShell-7B,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/16
28
+ 🟢,CodeLlama-7b,7.0,25.0,33.1,16384,UNK,29.98,29.2,31.8,27.23,25.17,25.6,11.6,24.36,30.36,18.04,11.94,25.82,25.52,693.0,15853.0,CodeLlama-7b,https://huggingface.co/codellama/CodeLlama-7b-hf,
29
+ 🔶,OctoCoder-15B,15.0,23.46,44.4,8192,86,45.3,26.03,32.8,29.32,26.76,24.5,13.35,24.01,22.56,14.39,10.61,24.26,18.24,1520.0,32278.0,OctoCoder-15B,https://huggingface.co/bigcode/octocoder,
30
+ 🟢,CodeLlama-7b-Python,7.0,23.38,33.1,16384,UNK,40.48,29.15,36.34,30.34,1.08,28.53,8.94,23.5,26.15,18.25,9.04,26.96,26.75,693.0,15853.0,CodeLlama-7b-Python,https://huggingface.co/codellama/CodeLlama-7b-Python-hf,
31
+ 🟢,StarCoder-15B,15.0,22.88,43.9,8192,86,33.57,30.22,30.79,31.55,26.08,23.02,13.57,22.74,23.89,15.5,0.07,21.84,22.74,1490.0,33461.0,StarCoder-15B,https://huggingface.co/bigcode/starcoder,
32
  🟢,Falcon-180B,180.0,22.8,,2048,,35.37,28.48,31.68,28.57,,24.53,14.1,24.08,26.71,,10.56,25.0,15.82,,,Falcon-180B,https://huggingface.co/tiiuae/falcon-180B,
33
+ 🟢,StarCoderBase-15B,15.0,22.23,43.8,8192,86,30.35,28.53,31.7,30.56,26.75,21.09,10.01,22.4,26.61,10.18,11.77,24.46,16.74,1460.0,32366.0,StarCoderBase-15B,https://huggingface.co/bigcode/starcoderbase,
34
  🟢,StarCoder2-3B,3.0,21.58,,16384,17,31.44,27.41,35.37,27.24,27.61,19.87,12.56,23.43,28.01,14.22,7.8,24.52,25.09,,,StarCoder2-3B,https://huggingface.co/bigcode/starcoder2-3b,
35
+ 🟢,CodeGeex2-6B,6.0,19.27,32.7,8192,100,33.49,23.46,29.9,28.45,25.27,20.93,8.44,21.23,15.94,14.58,11.75,20.45,22.06,982.0,14110.0,CodeGeex2-6B,https://huggingface.co/THUDM/codegeex2-6b,
36
+ 🟢,StarCoderBase-7B,7.0,18.69,46.9,8192,86,28.37,24.44,27.35,23.3,22.12,21.77,8.1,20.17,23.35,14.51,11.08,22.6,15.1,1700.0,16512.0,StarCoderBase-7B,https://huggingface.co/bigcode/starcoderbase-7b,
37
+ 🔶,OctoGeeX-7B,7.0,18.42,32.7,8192,100,42.28,19.33,28.5,23.93,25.85,22.94,9.77,20.79,16.19,13.66,12.02,17.94,17.03,982.0,14110.0,OctoGeeX-7B,https://huggingface.co/bigcode/octogeex,
38
+ 🔶,WizardCoder-3B-V1.0,3.0,17.27,50.0,8192,86,32.92,24.34,26.16,24.94,24.83,19.6,7.91,20.15,21.75,13.64,9.44,20.56,15.7,1770.0,8414.0,WizardCoder-3B-V1.0,https://huggingface.co/WizardLM/WizardCoder-3B-V1.0,
39
+ 🟢,CodeGen25-7B-multi,7.0,16.73,32.6,2048,86,28.7,26.01,26.27,25.75,21.98,19.11,8.84,20.04,23.44,11.59,10.37,21.84,16.62,680.0,15336.0,CodeGen25-7B-multi,https://huggingface.co/Salesforce/codegen25-7b-multi,
40
+ 🔶,Refact-1.6B,1.6,16.08,50.0,4096,19,31.1,22.78,22.36,21.12,22.36,13.84,10.26,17.86,15.53,13.04,4.97,18.59,18.35,2340.0,5376.0,Refact-1.6B,https://huggingface.co/smallcloudai/Refact-1_6B-fim,
41
  🟢,Stable-code-3b,3.0,15.42,,16384,18,30.72,28.75,31.64,29.42,23.68,21.41,10.09,19.06,17.54,13.37,0.0,22.15,0.0,,,Stable-code-3b,https://huggingface.co/stabilityai/stable-code-3b,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/57
42
  🔴,DeepSeek-Coder-1b-base,1.0,15.17,,16384,UNK,32.13,27.16,28.46,27.96,22.75,15.17,9.91,19.46,19.44,11.4,9.58,18.13,11.39,,,DeepSeek-Coder-1b-base,https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base,https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/33
43
+ 🟢,StarCoderBase-3B,3.0,12.96,50.0,8192,86,21.5,19.25,21.32,19.43,18.55,16.1,4.97,15.29,18.04,10.1,7.87,16.32,9.98,1770.0,8414.0,StarCoderBase-3B,https://huggingface.co/bigcode/starcoderbase-3b,
44
+ 🔶,WizardCoder-1B-V1.0,1.1,11.5,71.4,8192,86,23.17,19.68,19.13,15.94,14.71,13.85,4.64,13.89,15.52,10.01,6.51,13.91,9.59,2360.0,4586.0,WizardCoder-1B-V1.0,https://huggingface.co/WizardLM/WizardCoder-1B-V1.0,
45
+ 🟢,Replit-2.7B,2.7,9.69,42.2,2048,20,20.12,21.39,20.18,20.37,16.14,1.24,6.41,11.62,2.11,7.2,3.22,15.19,5.88,577.0,7176.0,Replit-2.7B,https://huggingface.co/replit/replit-code-v1-3b,
46
+ 🟢,CodeGen25-7B-mono,7.0,9.31,34.1,2048,86,33.08,19.75,23.22,18.62,16.75,4.65,4.32,12.1,6.75,4.41,4.07,7.83,1.71,687.0,15336.0,CodeGen25-7B-mono,https://huggingface.co/Salesforce/codegen25-7b-mono,
47
+ 🟢,StarCoderBase-1.1B,1.1,9.27,71.4,8192,86,15.17,14.2,13.38,11.68,9.94,11.31,4.65,9.81,12.52,5.73,5.03,10.24,3.92,2360.0,4586.0,StarCoderBase-1.1B,https://huggingface.co/bigcode/starcoderbase-1b,
48
+ 🟢,CodeGen-16B-Multi,16.0,8.23,17.2,2048,6,19.26,22.2,19.15,21.0,8.37,0.0,7.68,9.89,8.5,6.45,0.66,4.21,1.25,0.0,32890.0,CodeGen-16B-Multi,https://huggingface.co/Salesforce/codegen-16B-multi,
49
+ 🟢,StableCode-3B-alpha,3.0,7.19,30.2,16384,7,20.2,19.54,18.98,20.77,3.95,0.0,4.77,8.1,5.14,0.8,0.008,2.03,0.98,718.0,15730.0,StableCode-3B-alpha,https://huggingface.co/stabilityai/stablecode-completion-alpha-3b,
50
+ 🟢,DeciCoder-1B,1.0,6.96,54.6,2048,3,19.32,15.3,17.85,6.87,2.01,0.0,6.08,5.86,0.0,0.1,0.47,1.72,0.63,2490.0,4436.0,DeciCoder-1B,https://huggingface.co/Deci/DeciCoder-1b,
51
  🟢,Phi-1,1.0,6.67,,2048,1,51.22,10.76,19.25,14.29,12.42,0.63,7.05,12.15,6.21,6.21,3.11,4.49,10.13,,4941.0,Phi-1,https://huggingface.co/microsoft/phi-1,
52
+ 🟢,SantaCoder-1.1B,1.1,5.58,50.8,2048,3,18.12,15.0,15.47,6.2,1.5,0.0,0.0,4.92,0.1,0.0,0.0,2.0,0.7,2270.0,4602.0,SantaCoder-1.1B,https://huggingface.co/bigcode/santacoder,
data/raw_scores.csv CHANGED
@@ -49,3 +49,4 @@ StarCoder2-3B,3,,16384,17,31.44,27.41,35.37,27.24,27.61,19.87,12.56,28.01,14.22,
  StarCoder2-7B,7,,16384,17,34.09,29.42,35.35,33.63,30.58,20.42,15.12,30.67,16.72,11.58,29.62,26.06,,
  StarCoder2-15B,15,,16384,619,44.15,33.86,44.24,41.44,39.48,33.19,23.64,43.75,19.81,22.41,38.03,34.18,,
  OpenCodeInterpreter-DS-33B,33,,16384,86,75.23,54.8,69.06,64.47,59.32,46.58,22.31,57.76,45.94,34.55,57.95,51.85,,
+ OpenCodeInterpreter-DS-6.7B,6.7,,16384,86,73.2,51.41,63.85,60.01,57.34,39.69,18.22,44.3,39.08,24.32,48.22,45.99,,
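Reviewer note: the leaderboard row added for OpenCodeInterpreter-DS-6.7B carries one numeric value (47.14, between 18.22 and 44.3) that does not appear in the `data/raw_scores.csv` row. The numbers are consistent with that value being the unweighted mean of the 12 per-language scores. The sketch below checks this under that assumption; the helper name `average_score` is hypothetical and is not taken from the repository.

```python
# Per-language scores copied from the OpenCodeInterpreter-DS-6.7B row
# of data/raw_scores.csv (the 12 values after the first five columns).
raw_scores = [73.2, 51.41, 63.85, 60.01, 57.34, 39.69,
              18.22, 44.3, 39.08, 24.32, 48.22, 45.99]


def average_score(scores):
    """Unweighted mean of the per-language scores, rounded to two decimals.

    Assumption: the leaderboard's extra column is a plain mean. This is
    inferred from the numbers in this diff, not from the build script.
    """
    return round(sum(scores) / len(scores), 2)


print(average_score(raw_scores))
```

The same check on the StarCoder2-15B row gives 34.85, which also matches its leaderboard entry.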
logs_.txt ADDED
File without changes
metric_CodeLlama-70b-hf.json ADDED
@@ -0,0 +1,42 @@
+ {
+   "humaneval-unstripped": {
+     "pass@1": 0.5853658536585366
+   },
+   "config": {
+     "prefix": "",
+     "do_sample": true,
+     "temperature": 0.2,
+     "top_k": 0,
+     "top_p": 0.95,
+     "n_samples": 1,
+     "eos": "<|endoftext|>",
+     "seed": 0,
+     "model": "codellama/CodeLlama-70b-hf",
+     "modeltype": "causal",
+     "peft_model": null,
+     "revision": null,
+     "use_auth_token": true,
+     "trust_remote_code": false,
+     "tasks": "humaneval-unstripped",
+     "instruction_tokens": null,
+     "batch_size": 1,
+     "max_length_generation": 512,
+     "precision": "fp32",
+     "load_in_8bit": false,
+     "load_in_4bit": false,
+     "limit": null,
+     "limit_start": 0,
+     "postprocess": true,
+     "allow_code_execution": true,
+     "generation_only": false,
+     "load_generations_path": "/fsx/loubna/projects/bigcode-evaluation-harness/generations_codellama/gens_humaneval-unstripped_CodeLlama-70b-Instruct-hf.json",
+     "load_data_path": null,
+     "metric_output_path": "/fsx/loubna/projects/bigcode-models-leaderboard/metric_CodeLlama-70b-hf.json",
+     "save_generations": false,
+     "save_generations_path": "generations.json",
+     "save_references": false,
+     "prompt": "prompt",
+     "max_memory_per_gpu": null,
+     "check_references": false
+   }
+ }
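Reviewer note: in this metric file the `model` field names `codellama/CodeLlama-70b-hf`, while `load_generations_path` points at a generations file whose name ends in `CodeLlama-70b-Instruct-hf`. Whether that is intentional is not clear from the diff. A minimal sketch of a check that would surface such a mismatch follows. It assumes that generation files are named `gens_<task>_<model>.json` (inferred from this single path) and that `tasks` holds a single task name; `summarize_metric` is a hypothetical helper, not part of the evaluation harness.

```python
import json
from pathlib import PurePosixPath

# Trimmed copy of the fields from metric_CodeLlama-70b-hf.json that the check needs.
SAMPLE = """
{
  "humaneval-unstripped": {"pass@1": 0.5853658536585366},
  "config": {
    "model": "codellama/CodeLlama-70b-hf",
    "tasks": "humaneval-unstripped",
    "load_generations_path": "/fsx/loubna/projects/bigcode-evaluation-harness/generations_codellama/gens_humaneval-unstripped_CodeLlama-70b-Instruct-hf.json"
  }
}
"""


def summarize_metric(metric: dict) -> dict:
    """Extract the score and compare the model name with the generations file name."""
    config = metric["config"]
    task = config["tasks"]                      # assumed to be a single task name
    model_name = config["model"].split("/")[-1]
    gens_path = config.get("load_generations_path") or ""
    gens_stem = PurePosixPath(gens_path).stem   # file name without ".json"
    return {
        "task": task,
        "pass@1": metric[task]["pass@1"],
        "model": model_name,
        "generations_file": PurePosixPath(gens_path).name,
        # Assumed naming convention: the stem ends with "_<model name>".
        "generations_match_model": gens_stem.endswith("_" + model_name),
    }


summary = summarize_metric(json.loads(SAMPLE))
print(summary)
```

For the file in this commit, `generations_match_model` comes out as `False`.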
optimum-benchmark ADDED
@@ -0,0 +1 @@
+ Subproject commit 49f0924e2bb041cf17d78dd0848d8e2cad31632d
src/__pycache__/utils.cpython-310.pyc ADDED
Binary file (5.89 kB).
src/__pycache__/utils.cpython-311.pyc ADDED
Binary file (10.4 kB).
src/build.py CHANGED
@@ -85,6 +85,7 @@ links = {
  "CodeFuse-DeepSeek-33b": "https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B",
  "Stable-code-3b": "https://huggingface.co/stabilityai/stable-code-3b",
  "OpenCodeInterpreter-DS-33B": "https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-33B",
+ "OpenCodeInterpreter-DS-6.7B": "https://huggingface.co/m-a-p/OpenCodeInterpreter-DS-6.7B",
  }
 
  codellamas = ['CodeLlama-7b', 'CodeLlama-7b-Python', 'CodeLlama-7b-Instruct', 'CodeLlama-13b', 'CodeLlama-13b-Python', 'CodeLlama-13b-Instruct', 'CodeLlama-34b', 'CodeLlama-34b-Python', 'CodeLlama-34b-Instruct', 'CodeLlama-70b', 'CodeLlama-70b-Python', 'CodeLlama-70b-Instruct']
@@ -109,6 +110,7 @@ df.loc[df["Model"].str.contains('|'.join(["DeepSeek-Coder-33b-instruct"])), "Sub
  df.loc[df["Model"].str.contains('|'.join(["CodeFuse"])), "Submission PR"] = "https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/51"
  df.loc[df["Model"].str.contains('|'.join(["Stable-code-3b"])), "Submission PR"] = "https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/57"
  df.loc[df["Model"].str.contains('|'.join(["OpenCodeInterpreter-DS-33B"])), "Submission PR"] = "https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/60"
+ df.loc[df["Model"].str.contains('|'.join(["OpenCodeInterpreter-DS-6.7B"])), "Submission PR"] = "https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard/discussions/61"
 
  # print first 5 rows and 10 cols
  print(df.iloc[:5, :-1])
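Reviewer note: pandas `Series.str.contains` treats its pattern as a regular expression by default, so the `.` in `"OpenCodeInterpreter-DS-6.7B"` matches any character rather than a literal dot. With the current model list this is harmless, but it could over-match a future name. The sketch below illustrates the difference with the standard-library `re` module; the second model name is made up for the demonstration, and `matching_models` is a hypothetical helper.

```python
import re


def matching_models(pattern: str, names: list, literal: bool = False) -> list:
    """Return the names that contain `pattern`.

    With literal=False the pattern is used as a regex (the default behaviour
    of pandas str.contains); with literal=True it is escaped first.
    """
    if literal:
        pattern = re.escape(pattern)
    return [name for name in names if re.search(pattern, name)]


names = [
    "OpenCodeInterpreter-DS-6.7B",
    "OpenCodeInterpreter-DS-6x7B",  # hypothetical name, only to show over-matching
]

print(matching_models("OpenCodeInterpreter-DS-6.7B", names))                # both names
print(matching_models("OpenCodeInterpreter-DS-6.7B", names, literal=True))  # first name only
```

In pandas itself the equivalent options would be passing `regex=False` to `str.contains`, or wrapping each name in `re.escape` before the `'|'.join(...)`.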
src/utils.py CHANGED
@@ -85,7 +85,7 @@ def plot_throughput(df, bs=1):
  df.loc[df["Model"].str.contains("DeepSeek"), "color"] = "lightgreen"
  df.loc[df["Model"].str.contains("CodeFuse"), "color"] = "olive"
  df.loc[df["Model"].str.contains("Stable-code-3b"), "color"] = "steelblue"
- df.loc[df["Model"].str.contains("OpenCodeInterpreter-DS-33B"), "color"] = "red"
+ df.loc[df["Model"].str.contains("OpenCodeInterpreter-DS"), "color"] = "red"
 
  fig = go.Figure()
 
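Reviewer note: broadening the pattern from `"OpenCodeInterpreter-DS-33B"` to `"OpenCodeInterpreter-DS"` makes the new 6.7B model pick up the same red color as the 33B one. A minimal sketch of the effect follows. It models the sequential `df.loc` assignments as an ordered rule list in which a later match overrides an earlier one, and it uses plain substring tests, which behave the same as the regex patterns here because none of them contain special characters. The `None` default is a placeholder, since the color assigned before these lines is not visible in the hunk.

```python
# Ordered (substring, color) rules copied from the hunk above.
OLD_RULES = [
    ("DeepSeek", "lightgreen"),
    ("CodeFuse", "olive"),
    ("Stable-code-3b", "steelblue"),
    ("OpenCodeInterpreter-DS-33B", "red"),
]
# Same rules with the last pattern broadened, as in this commit.
NEW_RULES = OLD_RULES[:-1] + [("OpenCodeInterpreter-DS", "red")]


def pick_color(model, rules, default=None):
    """Apply the rules in order; the last matching rule wins."""
    color = default
    for substring, rule_color in rules:
        if substring in model:
            color = rule_color
    return color


for model in ("OpenCodeInterpreter-DS-33B", "OpenCodeInterpreter-DS-6.7B"):
    print(model, pick_color(model, OLD_RULES), pick_color(model, NEW_RULES))
```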