Commit 606d718 by hanhainebula · 1 Parent(s): 77ded94
Modify the evaluation steps
src/about.py CHANGED (+7 -14)
@@ -19,20 +19,15 @@ EVALUATION_QUEUE_TEXT = """
 
 1. Install AIR-Bench
 ```bash
-
-git clone https://github.com/AIR-Bench/AIR-Bench.git
-
-# Install the package
-cd AIR-Bench
-pip install .
+pip install air-benchmark
 ```
 2. Run the evaluation script
 ```bash
 cd AIR-Bench/scripts
 # Run all tasks
-python run_AIR-Bench.py \\
+python run_air_benchmark.py \\
 --output_dir ./search_results \\
---encoder BAAI/bge-m3
+--encoder BAAI/bge-m3 \\
 --reranker BAAI/bge-reranker-v2-m3 \\
 --search_top_k 1000 \\
 --rerank_top_k 100 \\
@@ -46,7 +41,7 @@ python run_AIR-Bench.py \\
 --overwrite False
 
 # Run the tasks in the specified task type
-python run_AIR-Bench.py \\
+python run_air_benchmark.py \\
 --task_types long-doc \\
 --output_dir ./search_results \\
 --encoder BAAI/bge-m3 \\
@@ -63,7 +58,7 @@ python run_AIR-Bench.py \\
 --overwrite False
 
 # Run the tasks in the specified task type and domains
-python run_AIR-Bench.py \\
+python run_air_benchmark.py \\
 --task_types long-doc \\
 --domains arxiv book \\
 --output_dir ./search_results \\
@@ -81,7 +76,7 @@ python run_AIR-Bench.py \\
 --overwrite False
 
 # Run the tasks in the specified languages
-python run_AIR-Bench.py \\
+python run_air_benchmark.py \\
 --languages en \\
 --output_dir ./search_results \\
 --encoder BAAI/bge-m3 \\
@@ -98,15 +93,13 @@ python run_AIR-Bench.py \\
 --overwrite False
 
 # Run the tasks in the specified task type, domains, and languages
-python run_AIR-Bench.py \\
+python run_air_benchmark.py \\
 --task_types qa \\
 --domains wiki web \\
 --languages en \\
 --output_dir ./search_results \\
 --encoder BAAI/bge-m3 \\
---encoder_link https://huggingface.co/BAAI/bge-m3 \\
 --reranker BAAI/bge-reranker-v2-m3 \\
---reranker_link https://huggingface.co/BAAI/bge-reranker-v2-m3 \\
 --search_top_k 1000 \\
 --rerank_top_k 100 \\
 --max_query_length 512 \\
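The five invocations in this diff differ only in which of `--task_types`, `--domains`, and `--languages` they pass. A minimal Python sketch of that pattern follows; it only assembles an argument list and does not run anything. The helper name `build_eval_command` and its defaults are hypothetical, chosen to mirror the values shown in the diff. The diff elides the flags between `--max_query_length` and `--overwrite`, so the sketch leaves them out rather than guessing, and the resulting command may be incomplete for the real script.

```python
import shlex
from typing import List, Optional, Sequence


def build_eval_command(
    output_dir: str = "./search_results",
    encoder: str = "BAAI/bge-m3",
    reranker: str = "BAAI/bge-reranker-v2-m3",
    task_types: Optional[Sequence[str]] = None,
    domains: Optional[Sequence[str]] = None,
    languages: Optional[Sequence[str]] = None,
    search_top_k: int = 1000,
    rerank_top_k: int = 100,
    max_query_length: int = 512,
    overwrite: bool = False,
) -> List[str]:
    """Hypothetical helper: assemble argv for run_air_benchmark.py.

    Only flags visible in the diff are emitted; flags the diff elides
    are intentionally not guessed.
    """
    cmd = ["python", "run_air_benchmark.py"]
    # Optional filters; passing none of them matches the "Run all tasks" case.
    for flag, values in (
        ("--task_types", task_types),
        ("--domains", domains),
        ("--languages", languages),
    ):
        if values:
            cmd += [flag, *values]
    cmd += [
        "--output_dir", output_dir,
        "--encoder", encoder,
        "--reranker", reranker,
        "--search_top_k", str(search_top_k),
        "--rerank_top_k", str(rerank_top_k),
        "--max_query_length", str(max_query_length),
        "--overwrite", str(overwrite),
    ]
    return cmd


if __name__ == "__main__":
    # Mirrors the "task type, domains, and languages" example from the diff.
    argv = build_eval_command(
        task_types=["qa"], domains=["wiki", "web"], languages=["en"]
    )
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting concerns once the elided flags are filled in from the actual file.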