nazhan committed on
Commit 2fd1025 · verified · 1 Parent(s): f9cd40e

Add SetFit model

Files changed (3)
  1. README.md +75 -75
  2. model.safetensors +1 -1
  3. model_head.pkl +1 -1
README.md CHANGED
@@ -10,11 +10,11 @@ tags:
  - text-classification
  - generated_from_setfit_trainer
  widget:
- - text: What’s the total number of orders placed by each customer?
- - text: I like to read books and listen to music in my free time. How about you?
- - text: Get company-wise intangible asset ratio.
- - text: Show me data_asset_001_ta by product.
- - text: Show me average asset value.
  inference: true
  model-index:
  - name: SetFit with BAAI/bge-large-en-v1.5
@@ -28,7 +28,7 @@ model-index:
  split: test
  metrics:
  - type: accuracy
- value: 0.9915254237288136
  name: Accuracy
  ---
 
@@ -60,22 +60,22 @@ The model has been trained using an efficient few-shot learning technique that i
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
  ### Model Labels
- | Label | Examples |
- |:-------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
- | Aggregation | <ul><li>'Please show med CostVariance_Actual_vs_Forecast.'</li><li>'Get me data_asset_001_kpm group by metrics.'</li><li>'Provide data_asset_kpi_cf group by quarter.'</li></ul> |
- | Tablejoin | <ul><li>'Join data_asset_kpi_cf with data_asset_001_kpm tables.'</li><li>'Could you link the Products and Orders tables to track sales trends for different product categories?'</li><li>'Can I have a merge of income statement and key performance metrics tables?'</li></ul> |
- | Lookup | <ul><li>"Filter by the 'Sales' department and show me the employees."</li><li>"Filter by the 'Toys' category and get me the product names."</li><li>'Can you get me the products with a price above 100?'</li></ul> |
- | Rejection | <ul><li>"Let's avoid generating additional reports."</li><li>"I'd rather not filter this dataset."</li><li>"I'd prefer not to apply any filters."</li></ul> |
- | Lookup_1 | <ul><li>'Show me key income statement metrics.'</li><li>'can I have kpm table'</li><li>'Retrieve data_asset_kpi_ma_product records.'</li></ul> |
- | Generalreply | <ul><li>"Hey! It's going pretty well, thanks for asking. How about yours?"</li><li>'Not much, just taking it one day at a time. How about you?'</li><li>"'What is your favorite quote?'"</li></ul> |
- | Viewtables | <ul><li>'What are the table names that relate to customer service in the starhub_data_asset database?'</li><li>'What tables are available in the starhub_data_asset database that can be joined to track user behavior?'</li><li>'What are the tables that are available for analysis in the starhub_data_asset database?'</li></ul> |
 
  ## Evaluation
 
  ### Metrics
  | Label | Accuracy |
  |:--------|:---------|
- | **all** | 0.9915 |
 
  ## Uses
 
@@ -95,7 +95,7 @@ from setfit import SetFitModel
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("nazhan/bge-large-en-v1.5-brahmaputra-iter-10-3rd")
  # Run inference
- preds = model("Show me average asset value.")
  ```
 
  <!--
@@ -127,17 +127,17 @@ preds = model("Show me average asset value.")
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:-------|:----|
- | Word count | 1 | 8.7839 | 62 |
 
  | Label | Training Sample Count |
  |:-------------|:----------------------|
- | Tablejoin | 127 |
- | Rejection | 76 |
- | Aggregation | 281 |
- | Lookup | 59 |
- | Generalreply | 71 |
- | Viewtables | 75 |
- | Lookup_1 | 158 |
 
  ### Training Hyperparameters
  - batch_size: (16, 16)
@@ -159,56 +159,56 @@ preds = model("Show me average asset value.")
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:----------:|:--------:|:-------------:|:---------------:|
- | 0.0000 | 1 | 0.2291 | - |
- | 0.0025 | 50 | 0.2181 | - |
- | 0.0050 | 100 | 0.127 | - |
- | 0.0075 | 150 | 0.015 | - |
- | 0.0100 | 200 | 0.0072 | - |
- | 0.0125 | 250 | 0.0034 | - |
- | 0.0149 | 300 | 0.0032 | - |
- | 0.0174 | 350 | 0.0032 | - |
- | 0.0199 | 400 | 0.0019 | - |
- | 0.0224 | 450 | 0.0014 | - |
- | 0.0249 | 500 | 0.0012 | - |
- | 0.0274 | 550 | 0.0011 | - |
- | 0.0299 | 600 | 0.0018 | - |
- | 0.0324 | 650 | 0.0013 | - |
- | 0.0349 | 700 | 0.0015 | - |
- | 0.0374 | 750 | 0.0009 | - |
- | 0.0399 | 800 | 0.0012 | - |
- | 0.0423 | 850 | 0.0008 | - |
- | 0.0448 | 900 | 0.001 | - |
- | 0.0473 | 950 | 0.0009 | - |
- | 0.0498 | 1000 | 0.0007 | - |
- | 0.0523 | 1050 | 0.0009 | - |
- | 0.0548 | 1100 | 0.001 | - |
- | 0.0573 | 1150 | 0.0008 | - |
- | 0.0598 | 1200 | 0.0006 | - |
- | 0.0623 | 1250 | 0.0007 | - |
- | 0.0648 | 1300 | 0.0006 | - |
- | 0.0673 | 1350 | 0.0007 | - |
- | 0.0697 | 1400 | 0.0007 | - |
- | 0.0722 | 1450 | 0.0008 | - |
- | 0.0747 | 1500 | 0.0006 | - |
- | 0.0772 | 1550 | 0.0008 | - |
- | 0.0797 | 1600 | 0.0005 | - |
- | 0.0822 | 1650 | 0.0009 | - |
- | 0.0847 | 1700 | 0.0006 | - |
- | 0.0872 | 1750 | 0.0007 | - |
- | 0.0897 | 1800 | 0.0007 | - |
- | 0.0922 | 1850 | 0.0006 | - |
- | 0.0947 | 1900 | 0.0006 | - |
- | 0.0971 | 1950 | 0.0007 | - |
- | 0.0996 | 2000 | 0.0005 | - |
- | 0.1021 | 2050 | 0.0005 | - |
- | 0.1046 | 2100 | 0.0004 | - |
- | 0.1071 | 2150 | 0.0006 | - |
- | 0.1096 | 2200 | 0.0007 | - |
- | 0.1121 | 2250 | 0.0004 | - |
- | 0.1146 | 2300 | 0.0006 | - |
- | 0.1171 | 2350 | 0.0008 | - |
- | 0.1196 | 2400 | 0.0007 | - |
- | **0.1221** | **2450** | **0.0004** | **0.013** |
 
  * The bold row denotes the saved checkpoint.
  ### Framework Versions
 
  - text-classification
  - generated_from_setfit_trainer
  widget:
+ - text: I don't want to handle any filtering tasks.
+ - text: Show me all customers who have the last name 'Doe'.
+ - text: What tables are available for data analysis in starhub_data_asset?
+ - text: what do you think it is?
+ - text: Provide data_asset_001_pcc product category details.
  inference: true
  model-index:
  - name: SetFit with BAAI/bge-large-en-v1.5
 
  split: test
  metrics:
  - type: accuracy
+ value: 0.9818181818181818
  name: Accuracy
  ---
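The two accuracy values recorded in this metadata hunk (0.9915254237288136 before, 0.9818181818181818 after) are long repeating decimals, which suggests each is a ratio of small integers. The sketch below recovers the simplest such ratio with the standard-library `fractions` module. The test-set size is not stated anywhere in this diff, so the resulting denominators are only the smallest sizes consistent with the numbers, and any multiple of them would fit equally well. The helper name `simplest_ratio` and the denominator cap are illustrative choices.

```python
from fractions import Fraction


def simplest_ratio(accuracy: float, max_denominator: int = 1000) -> Fraction:
    """Return the closest fraction to `accuracy` with a bounded denominator."""
    return Fraction(accuracy).limit_denominator(max_denominator)


# Accuracy values copied from the old and new sides of the diff.
old_ratio = simplest_ratio(0.9915254237288136)
new_ratio = simplest_ratio(0.9818181818181818)

print(f"old accuracy ~ {old_ratio.numerator}/{old_ratio.denominator}")  # 117/118
print(f"new accuracy ~ {new_ratio.numerator}/{new_ratio.denominator}")  # 54/55
```

If the two evaluation splits really differ in size, as these ratios hint, the two accuracy figures are not directly comparable.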
 
 
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
  ### Model Labels
+ | Label | Examples |
+ |:-------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | Aggregation | <ul><li>'Show me median Intangible Assets'</li><li>'Can I have sum Cost_Entertainment?'</li><li>'Get me min RevenueVariance_Actual_vs_Forecast.'</li></ul> |
+ | Lookup_1 | <ul><li>'Show me data_asset_kpi_cf details.'</li><li>'Retrieve data_asset_kpi_cf details.'</li><li>'Show M&A deal size by sector.'</li></ul> |
+ | Viewtables | <ul><li>'What tables are included in the starhub_data_asset database that are required for performing a basic data analysis?'</li><li>'What is the full list of tables available for use in queries within the starhub_data_asset database?'</li><li>'What are the table names within the starhub_data_asset database that enable data analysis of customer feedback?'</li></ul> |
+ | Tablejoin | <ul><li>'Is it possible to merge the Employees and Orders tables to see which employee handled each order?'</li><li>'Join data_asset_001_ta with data_asset_kpi_cf.'</li><li>'How can I connect the Customers and Orders tables to find customers who made purchases during a specific promotion?'</li></ul> |
+ | Lookup | <ul><li>'Filter by customers who have placed more than 3 orders and get me their email addresses.'</li><li>"Filter by customers in the city 'New York' and show me their phone numbers."</li><li>"Can you filter by employees who work in the 'Research' department?"</li></ul> |
+ | Generalreply | <ul><li>"Oh, I just stepped outside and it's actually quite lovely! The sun is shining and there's a light breeze. How about you?"</li><li>"One of my short-term goals is to learn a new skill, like coding or cooking. I also want to save up enough money for a weekend trip with friends. How about you, any short-term goals you're working towards?"</li><li>'Hey! My day is going pretty well, thanks for asking. How about yours?'</li></ul> |
+ | Rejection | <ul><li>'I have no interest in generating more data.'</li><li>"I don't want to engage in filtering operations."</li><li>"I'd rather not filter this dataset."</li></ul> |
 
  ## Evaluation
 
  ### Metrics
  | Label | Accuracy |
  |:--------|:---------|
+ | **all** | 0.9818 |
 
  ## Uses
 
 
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("nazhan/bge-large-en-v1.5-brahmaputra-iter-10-3rd")
  # Run inference
+ preds = model("what do you think it is?")
  ```
 
  <!--
 
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:-------|:----|
+ | Word count | 1 | 8.7137 | 62 |
 
  | Label | Training Sample Count |
  |:-------------|:----------------------|
+ | Tablejoin | 128 |
+ | Rejection | 73 |
+ | Aggregation | 222 |
+ | Lookup | 55 |
+ | Generalreply | 75 |
+ | Viewtables | 76 |
+ | Lookup_1 | 157 |
 
  ### Training Hyperparameters
  - batch_size: (16, 16)
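To make the change in class balance easier to read, the per-label training sample counts from the old and new sides of this hunk can be compared directly. The sketch below only restates the numbers shown in the diff. The variable names are illustrative and nothing here comes from the training code itself.

```python
# Training sample counts copied from the removed (-) and added (+) table rows.
old_counts = {
    "Tablejoin": 127, "Rejection": 76, "Aggregation": 281, "Lookup": 59,
    "Generalreply": 71, "Viewtables": 75, "Lookup_1": 158,
}
new_counts = {
    "Tablejoin": 128, "Rejection": 73, "Aggregation": 222, "Lookup": 55,
    "Generalreply": 75, "Viewtables": 76, "Lookup_1": 157,
}

# Per-label change and overall totals.
deltas = {label: new_counts[label] - old_counts[label] for label in old_counts}
old_total = sum(old_counts.values())
new_total = sum(new_counts.values())

for label, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{label:<13} {old_counts[label]:>4} -> {new_counts[label]:>4} ({delta:+d})")
print(f"total         {old_total:>4} -> {new_total:>4} ({new_total - old_total:+d})")
```

Almost all of the reduction comes from the Aggregation label (281 to 222); the other labels move by at most four samples.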
 
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:----------:|:--------:|:-------------:|:---------------:|
+ | 0.0000 | 1 | 0.2001 | - |
+ | 0.0022 | 50 | 0.1566 | - |
+ | 0.0045 | 100 | 0.0816 | - |
+ | 0.0067 | 150 | 0.0733 | - |
+ | 0.0089 | 200 | 0.0075 | - |
+ | 0.0112 | 250 | 0.0059 | - |
+ | 0.0134 | 300 | 0.0035 | - |
+ | 0.0156 | 350 | 0.0034 | - |
+ | 0.0179 | 400 | 0.0019 | - |
+ | 0.0201 | 450 | 0.0015 | - |
+ | 0.0223 | 500 | 0.0021 | - |
+ | 0.0246 | 550 | 0.003 | - |
+ | 0.0268 | 600 | 0.0021 | - |
+ | 0.0290 | 650 | 0.0011 | - |
+ | 0.0313 | 700 | 0.0015 | - |
+ | 0.0335 | 750 | 0.0011 | - |
+ | 0.0357 | 800 | 0.001 | - |
+ | 0.0380 | 850 | 0.001 | - |
+ | 0.0402 | 900 | 0.0012 | - |
+ | 0.0424 | 950 | 0.0012 | - |
+ | 0.0447 | 1000 | 0.0011 | - |
+ | 0.0469 | 1050 | 0.0008 | - |
+ | 0.0491 | 1100 | 0.0009 | - |
+ | 0.0514 | 1150 | 0.001 | - |
+ | 0.0536 | 1200 | 0.0008 | - |
+ | 0.0558 | 1250 | 0.0011 | - |
+ | 0.0581 | 1300 | 0.0009 | - |
+ | 0.0603 | 1350 | 0.001 | - |
+ | 0.0625 | 1400 | 0.0007 | - |
+ | 0.0647 | 1450 | 0.0008 | - |
+ | 0.0670 | 1500 | 0.0007 | - |
+ | 0.0692 | 1550 | 0.001 | - |
+ | 0.0714 | 1600 | 0.0007 | - |
+ | 0.0737 | 1650 | 0.0007 | - |
+ | 0.0759 | 1700 | 0.0006 | - |
+ | 0.0781 | 1750 | 0.0008 | - |
+ | 0.0804 | 1800 | 0.0006 | - |
+ | 0.0826 | 1850 | 0.0005 | - |
+ | 0.0848 | 1900 | 0.0006 | - |
+ | 0.0871 | 1950 | 0.0005 | - |
+ | 0.0893 | 2000 | 0.0007 | - |
+ | 0.0915 | 2050 | 0.0005 | - |
+ | 0.0938 | 2100 | 0.0006 | - |
+ | 0.0960 | 2150 | 0.0007 | - |
+ | 0.0982 | 2200 | 0.0005 | - |
+ | 0.1005 | 2250 | 0.0008 | - |
+ | 0.1027 | 2300 | 0.0005 | - |
+ | 0.1049 | 2350 | 0.0008 | - |
+ | 0.1072 | 2400 | 0.0007 | - |
+ | **0.1094** | **2450** | **0.0007** | **0.0094** |
 
  * The bold row denotes the saved checkpoint.
  ### Framework Versions
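Both runs save their checkpoint at step 2450, but at different epoch fractions (0.1221 before, 0.1094 after), so the number of optimizer steps in one full epoch must differ between them. The sketch below estimates that number by dividing the step by the epoch fraction. Because the table rounds the epoch to four decimals, it also reports the range implied by a rounding error of up to 0.00005. This is plain arithmetic on the logged values, not output from the trainer, and the function names are illustrative.

```python
def steps_per_epoch(step: int, epoch_fraction: float) -> float:
    """Estimate total steps in one epoch from a single logged (epoch, step) pair."""
    return step / epoch_fraction


def steps_per_epoch_bounds(step: int, epoch_fraction: float, half_ulp: float = 0.00005):
    """Lower and upper estimates given that the epoch was rounded to 4 decimals."""
    return step / (epoch_fraction + half_ulp), step / (epoch_fraction - half_ulp)


# Saved-checkpoint rows from the old and new Training Results tables.
old_estimate = steps_per_epoch(2450, 0.1221)
new_estimate = steps_per_epoch(2450, 0.1094)
old_low, old_high = steps_per_epoch_bounds(2450, 0.1221)
new_low, new_high = steps_per_epoch_bounds(2450, 0.1094)

print(f"old run: ~{old_estimate:.0f} steps/epoch (between {old_low:.0f} and {old_high:.0f})")
print(f"new run: ~{new_estimate:.0f} steps/epoch (between {new_low:.0f} and {new_high:.0f})")
```

Under these assumptions the new run has the longer epoch even though its training set is smaller, which would be consistent with the number of training pairs depending on class composition rather than on the raw sample count alone.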
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f4003dfe8a0709586fb1f3eaf59f3bed9d7d75b6a84620103c069c3ce7dac996
  size 1340612432
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:1941e8a083852250f1fd0c92ea0183a452ce351c669ba6ca68cbef16239d867a
  size 1340612432
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:62d239cca6b566d2b6262616079d612ed74ced47cc4c5a8d5f2e6fa8442b62b4
  size 58575
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:6c3b0d9c84ab235a2347871b8e1618c39f31a68828768a64307ff479bd0e88d5
  size 58575
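With both `model.safetensors` and `model_head.pkl` replaced, a quick way to sanity-check the new weights is to run the five new widget sentences from the README through the model. The README snippet calls the model on a single string; the sketch below passes a list instead, which `SetFitModel` also accepts. The wrapper `classify` and its optional `model` argument are hypothetical helpers added here so the logic can be exercised without downloading the weights, and converting predictions with `str` assumes the model returns label names.

```python
from typing import Callable, Iterable, List, Optional

# Repository id as it appears in the README usage snippet.
REPO_ID = "nazhan/bge-large-en-v1.5-brahmaputra-iter-10-3rd"

# The five widget sentences added on the new side of the README diff.
WIDGET_TEXTS = [
    "I don't want to handle any filtering tasks.",
    "Show me all customers who have the last name 'Doe'.",
    "What tables are available for data analysis in starhub_data_asset?",
    "what do you think it is?",
    "Provide data_asset_001_pcc product category details.",
]


def classify(texts: Iterable[str], model: Optional[Callable] = None) -> List[str]:
    """Predict one label per text; load the Hub model only if none is supplied."""
    if model is None:
        # Imported lazily so the helper can be tested with a stand-in model.
        from setfit import SetFitModel
        model = SetFitModel.from_pretrained(REPO_ID)
    predictions = model(list(texts))
    return [str(p) for p in predictions]


if __name__ == "__main__":
    for text, label in zip(WIDGET_TEXTS, classify(WIDGET_TEXTS)):
        print(f"{label:<13} {text}")
```

Running the script downloads roughly 1.3 GB of weights (the `model.safetensors` size recorded above), so the stand-in `model` argument is the cheaper option for testing the surrounding code.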