|
--- |
|
license: apache-2.0 |
|
base_model: hon9kon9ize/CantoneseLLMChat-v0.5 |
|
tags: |
|
- llama-factory |
|
- full |
|
- generated_from_trainer |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: open-lilm-v2 |
|
results: [] |
|
--- |
|
|
|
|
|
|
# open-lilm-v2 |
|
|
|
Version 1 of this model can be found [here](https://huggingface.co/0xtaipoian/open-lilm).
|
|
|
Warning: Due to the nature of the training data, this model is highly likely to return violent, racist and discriminatory content. DO NOT USE IN A PRODUCTION ENVIRONMENT.
|
|
|
|
|
Inspired by [another project](https://github.com/alphrc/lilm). |
|
This is a fine-tuned model based on [CantoneseLLMChat-v0.5](https://huggingface.co/hon9kon9ize/CantoneseLLMChat-v0.5), which anyone can use without needing a Mac with 128 GB of RAM.
|
|
|
Following the same principle, we selected 1,916,944 post-and-reply pairs from the LIHKG forum, drawn from the [LIHKG Dataset](https://huggingface.co/datasets/AlienKevin/LIHKG) together with the latest posts scraped from the site. A pair was kept only if it met the criteria below (a minimal filtering sketch follows the list):
|
- Reply must be a direct reply to the original post by a user other than the author |
|
- The total number of reactions (positive or negative) must be larger than 20 |
|
- The post-and-reply pair must be shorter than 2,048 words
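
The actual filtering code is not released, but a minimal sketch of the criteria above might look like the following. The field names (`post`, `reply`, `is_direct_reply`, `is_author_reply`, `pos_reactions`, `neg_reactions`) are hypothetical, not the real LIHKG Dataset schema, and the length check simply approximates "words" by character count.

```python
# Hypothetical filtering sketch for one post/reply pair; field names are assumptions.
def keep_pair(pair: dict) -> bool:
    # 1. Reply must directly answer the post and come from a user other than the author
    if not pair["is_direct_reply"] or pair["is_author_reply"]:
        return False
    # 2. Total reactions (positive + negative) must be larger than 20
    if pair["pos_reactions"] + pair["neg_reactions"] <= 20:
        return False
    # 3. Combined post and reply must be shorter than 2,048 words (approximated by length)
    if len(pair["post"]) + len(pair["reply"]) >= 2048:
        return False
    return True


example = {
    "post": "例子標題",
    "reply": "例子回覆",
    "is_direct_reply": True,
    "is_author_reply": False,
    "pos_reactions": 15,
    "neg_reactions": 10,
}
print(keep_pair(example))  # True: direct reply, 25 reactions, well under the length limit
```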
|
|
|
To avoid political complications, the dataset will not be made publicly available. |
|
|
|
|
|
Compared to version 1:

- Training samples increased from 377,595 to 1,916,944, including the latest posts
|
- Removed all URLs |
|
- Removed comments with only emojis |
|
|
|
|
|
|
|
|
|
## Intended uses & limitations |
|
|
|
Due to the nature of an anonymous online forum, the training data and the model are full of rude, violent, racist and discriminatory language.
|
This model is only intended for research or entertainment purposes. |
|
|
|
Comments on LIHKG also tend to be very short, so the model rarely generates more than a single line.
|
|
|
|
|
## How to use it? |
|
You can run it on [Colab](https://colab.research.google.com/drive/1veRH2GP3ZR3buYCG2_bFUKu0kS-hv1S2) or anywhere else with the following code:
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_name = "0xtaipoian/open-lilm-v2"

# Load the model in 4-bit (NF4) to fit on a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map="auto",
)


def chat(messages, temperature=0.9, max_new_tokens=200):
    # Build the prompt with the model's chat template and move it to the model's device
    input_ids = tokenizer.apply_chat_template(
        conversation=messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        do_sample=True,
    )

    # Print the formatted prompt for inspection
    chatml = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
    print(chatml)

    # Decode only the newly generated tokens
    response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=False)
    return response


messages = [
    # {"role": "system", "content": ""},
    {
        "role": "user",
        # Example input: a news post title (in Chinese)
        "content": """
「密陽44人輪姦案」受害女隔20年現身：時間停在2004，不記得
""",
    },
]

result = chat(messages, max_new_tokens=200, temperature=1)
print(result)
```
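
Because the response is decoded with `skip_special_tokens=False`, it may still end with the chat template's end-of-turn marker. Assuming the base model uses a ChatML-style template with `<|im_end|>` (as the `chatml` variable above suggests), you can strip it like this:

```python
# Assumption: the chat template ends each turn with the ChatML marker <|im_end|>
clean = result.split("<|im_end|>")[0].strip()
print(clean)
```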
|
|
|
### Training Procedure
|
|
|
The model was trained for 11 hours on 8 NVIDIA H100 80GB HBM3 GPUs with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory). |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 1e-05 |
|
- train_batch_size: 22 |
|
- seed: 42 |
|
- gradient_accumulation_steps: 22 |
|
- total_train_batch_size: 3872 (see the derivation below)
|
- num_epochs: 1.0 |
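
For reference, the effective (total) train batch size is the per-device batch size multiplied by the gradient accumulation steps and the number of GPUs:

```python
# How total_train_batch_size is derived from the settings above
train_batch_size = 22             # per-device batch size
gradient_accumulation_steps = 22
num_gpus = 8                      # 8x NVIDIA H100 80GB HBM3

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_gpus
print(total_train_batch_size)     # 3872
```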
|
|
|
|
|
|