---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

## Model Details

**Model Developers:** Sogang University SGEconFinlab

### Model Description

This model is a language model specialized in economics and finance, fine-tuned on a variety of economics- and finance-related data. The data sources are listed below. We are not releasing the training data itself because it was prepared for research and policy purposes; if you wish to use the original source data rather than our processed training data, please contact the original authors directly for permission.

- **Developed by:** Sogang University SGEconFinlab
- **Language(s) (NLP):** Ko/En
- **License:** apache-2.0
- **Base Model:** yanolja/KoSOLAR-10.7B-v0.2

## Uses

### Direct Use

[More Information Needed]

## How to Get Started with the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

peft_model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"
config = PeftConfig.from_pretrained(peft_model_id)

# 4-bit quantization settings for loading the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the quantized base model, then attach the LoRA adapter weights
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, quantization_config=bnb_config, device_map={"": 0})
model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model.eval()
```


```python
def gen(x):
    # Build the prompt in the fine-tuning format ("### 질문:" = question, "### 답변:" = answer)
    inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)

    # Move inputs to GPU (if available)
    inputs = {k: v.to(device="cuda" if torch.cuda.is_available() else "cpu") for k, v in inputs.items()}

    gened = model.generate(
        **inputs,
        max_new_tokens=256,
        early_stopping=True,
        num_return_sequences=4,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
        temperature=0.9,
        top_p=0.8,
        top_k=50
    )

    complete_answers = []
    for gen_seq in gened:
        decoded = tokenizer.decode(gen_seq, skip_special_tokens=True).strip()

        # Keep only the text after the first "### 답변:" marker
        first_answer_start_idx = decoded.find("### 답변:") + len("### 답변:")
        temp_answer = decoded[first_answer_start_idx:].strip()

        # Truncate at a second "### 답변:" marker, if the model generated one
        second_answer_start_idx = temp_answer.find("### 답변:")
        if second_answer_start_idx != -1:
            complete_answer = temp_answer[:second_answer_start_idx].strip()
        else:
            complete_answer = temp_answer  # No second "### 답변:" marker; return the whole answer

        complete_answers.append(complete_answer)

    return complete_answers
```
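
For reference, a minimal usage sketch of the `gen()` helper defined above, assuming the model and tokenizer have already been loaded as shown. The question text is an illustrative example, not taken from this model card:

```python
# Illustrative call to gen(); the Korean question below is a made-up example prompt
# ("How does a central bank rate hike affect prices?").
answers = gen("중앙은행이 기준금리를 인상하면 물가에 어떤 영향을 미치나요?")
for i, answer in enumerate(answers, 1):
    print(f"[Answer {i}]\n{answer}\n")
```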

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Citation [optional]