SGEcon committed
Commit 6bb0093
1 Parent(s): 63bab77

Update README.md

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -1,7 +1,6 @@
 ---
 library_name: transformers
 license: apache-2.0
-pipeline_tag: text-generation
 ---


@@ -16,9 +15,8 @@ The data sources are listed below, and we are not releasing the data we trained
 If you wish to use the original data rather than our training data, please contact the original author directly for permission to use it.

 - **Developed by:** Sogang University SGEconFinlab(<https://sc.sogang.ac.kr/aifinlab/>)
-- **Language(s) (NLP):** Ko/En
 - **License:** apache-2.0
-- **Base Model:** yanolja/KoSOLAR-10.7B-v0.2
+- **Base Model:** yanolja/KoSOLAR-10.7B-v0.2(<https://huggingface.co/yanolja/KoSOLAR-10.7B-v0.2>)


 ## How to Get Started with the Model
@@ -77,6 +75,7 @@ If you wish to use the original data rather than our training data, please conta

     return complete_answers

+

 ## Training Details
 First, we loaded the base model quantized to 4 bits. This significantly reduces the memory needed to store the model's weights and intermediate computation results, which is beneficial when deploying the model in environments with limited memory, and it can also provide faster inference.
@@ -92,6 +91,7 @@ If you wish to use the original data rather than our training data, please conta
 5. 중소벤처기업부 (Ministry of SMEs and Startups)/Government of the Republic of Korea: Ministry of SMEs and Startups terminology glossary(<https://terms.naver.com/list.naver?cid=42103&categoryId=42103>)
 6. 고성삼/법문출판사: Dictionary of Accounting and Tax Terms (회계·세무 용어사전)(<https://terms.naver.com/list.naver?cid=51737&categoryId=51737>)
 7. Mankiw's Principles of Economics, 8th edition, word index
+8. yanolja/KoSOLAR-10.7B-v0.2(<https://huggingface.co/yanolja/KoSOLAR-10.7B-v0.2>)


 ### Training Procedure
@@ -102,6 +102,7 @@ If you wish to use the original data rather than our training data, please conta
 #### Training Hyperparameters

 |Hyperparameter|SGEcon/KoSOLAR-10.7B-v0.2_fin_v4|
+|------|---|
 |Lora Method|Lora|
 |load in 4 bit|True|
 |learning rate|1e-5|
@@ -110,7 +111,7 @@ If you wish to use the original data rather than our training data, please conta
 |lora rank|16|
 |lora dropout|0.05|
 |optim|paged_adamw_32bit|
-|target_modules||"q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"|
+|target_modules|q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head|

 ## Evaluation

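The Training Details section of this README states only that the base model was loaded quantized to 4 bits (`load in 4 bit = True` in the hyperparameter table). As a rough illustration, a minimal sketch of such a load with `transformers` and `bitsandbytes` might look like the following; the `nf4` quant type, bfloat16 compute dtype, and `device_map="auto"` are assumptions, not settings taken from this repository.

```python
# Hedged sketch of a 4-bit load; only load_in_4bit=True is stated in the README,
# the remaining quantization settings here are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "yanolja/KoSOLAR-10.7B-v0.2"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # from the hyperparameter table
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",                      # assumption
)
```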
 
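Similarly, the hyperparameter table maps onto a PEFT LoRA configuration. The sketch below assumes the standard `peft`/`transformers` workflow; rows not visible in this commit's hunks (for example lora alpha, epochs, and batch size) are left at library defaults rather than guessed, and the output directory name is hypothetical.

```python
# Hedged sketch of a LoRA setup matching the visible table rows; values not shown
# in this commit (e.g. lora alpha, epochs, batch size) are left at library defaults.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,               # lora rank
    lora_dropout=0.05,  # lora dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)

# `model` here is the 4-bit quantized base model from the previous sketch.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="kosolar-fin-lora",  # hypothetical output path
    learning_rate=1e-5,             # from the table
    optim="paged_adamw_32bit",      # from the table
)
```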