---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

# Model Details

Model Developers: Sogang University SGEconFinlab

### Model Description

This model is a language model specialized in economics and finance, fine-tuned on a variety of economics- and finance-related data.
The data sources are listed below. We are not releasing the training data itself, as it was used for research and policy purposes.
If you wish to use the original data rather than our training data, please contact the original authors directly for permission.

- **Developed by:** Sogang University SGEconFinlab
- **Language(s) (NLP):** Korean, English
- **License:** apache-2.0
- **Base Model:** yanolja/KoSOLAR-10.7B-v0.2

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

## How to Get Started with the Model

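The model is published as a PEFT adapter on top of the yanolja/KoSOLAR-10.7B-v0.2 base model. The snippet below loads the base model with 4-bit NF4 quantization via bitsandbytes and attaches the adapter; it assumes a CUDA-capable GPU and that the `transformers`, `peft`, and `bitsandbytes` packages are installed.
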
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

peft_model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"
config = PeftConfig.from_pretrained(peft_model_id)

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the fine-tuned adapter
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, quantization_config=bnb_config, device_map={"": 0})
model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model.eval()
```

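`device_map={"": 0}` places every module on GPU 0; adjust it for your hardware. The helper below wraps a question in the `### 질문:` / `### 답변:` ("question" / "answer") prompt format, samples four candidate answers (`num_return_sequences=4`), and keeps only the text between the first answer marker and any second one the model may generate.
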
```python
def gen(x):
    inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)

    # Move the input tensors to the GPU (if available)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    inputs = {k: v.to(device) for k, v in inputs.items()}

    gened = model.generate(
        **inputs,
        max_new_tokens=256,
        num_return_sequences=4,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
        temperature=0.9,
        top_p=0.8,
        top_k=50,
    )

    complete_answers = []
    for gen_seq in gened:
        decoded = tokenizer.decode(gen_seq, skip_special_tokens=True).strip()

        # Keep only the text after the first "### 답변:" marker
        first_answer_start_idx = decoded.find("### 답변:") + len("### 답변:")
        temp_answer = decoded[first_answer_start_idx:].strip()

        # Truncate at a second "### 답변:" marker, if the model generated one
        second_answer_start_idx = temp_answer.find("### 답변:")
        if second_answer_start_idx != -1:
            complete_answer = temp_answer[:second_answer_start_idx].strip()
        else:
            complete_answer = temp_answer  # No second "### 답변:", so keep the whole answer

        complete_answers.append(complete_answer)

    return complete_answers
```

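A quick usage sketch (the question is a hypothetical example; sampling is enabled, so outputs will vary between runs):

```python
# Hypothetical example question: "What happens to bond prices when interest rates rise?"
answers = gen("금리가 오르면 채권 가격은 어떻게 되나요?")
for i, answer in enumerate(answers, 1):
    print(f"[{i}] {answer}")
```
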
## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->