|
---
license: llama3.2
datasets:
- HuggingFaceH4/ultrachat_200k
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- trl
- llama
- sft
- alignment
- transformers
- custom
- chat
---
|
# Llama-3.2-1B-ultrachat200k |
|
|
|
|
|
## Model Details |
|
|
|
- **Model type:** SFT (supervised fine-tuned) chat model
- **License:** llama3.2 (Llama 3.2 Community License)
- **Finetuned from model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- **Training data:** [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
- **Training framework:** [trl](https://github.com/huggingface/trl)
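
To sanity-check the checkpoint with plain `transformers`, something like the sketch below should work. `MODEL_PATH` is a placeholder for wherever this checkpoint is stored, and the generation settings are illustrative; the prompt formatting mirrors the vLLM test script further down.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "Llama-3.2-1B-ultrachat200k"  # placeholder: local or Hub path of this checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto")

# Build a chat-formatted prompt, as in the vLLM test script below.
prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Where is Harbin?"}],
                                       tokenize=False,
                                       add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```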
|
|
|
## Training Details |
|
|
|
### Training Hyperparameters |
|
`attn_implementation`: flash_attention_2 \
`bf16`: True \
`learning_rate`: 2e-5 \
`lr_scheduler_type`: cosine \
`per_device_train_batch_size`: 2 \
`gradient_accumulation_steps`: 16 \
`torch_dtype`: bfloat16 \
`num_train_epochs`: 1 \
`max_seq_length`: 2048 \
`warmup_ratio`: 0.1
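
For reference, a minimal sketch of how these values map onto the `trl` config dataclasses consumed by the training script below, assuming the field names of the trl version used here; the `output_dir` value is a placeholder.

```python
from trl import ModelConfig, SFTConfig

model_config = ModelConfig(
    model_name_or_path="meta-llama/Llama-3.2-1B",
    torch_dtype="bfloat16",
    attn_implementation="flash_attention_2",
)

training_args = SFTConfig(
    output_dir="saves/Llama-3.2-1B-ultrachat200k",  # placeholder output path
    bf16=True,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    max_seq_length=2048,
    warmup_ratio=0.1,
)
```

In practice the script is driven by `TrlParser`, so the same values can be supplied on the command line or through a YAML config file.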
|
|
|
### Results |
|
|
|
`init_train_loss`: 1.726 \
`final_train_loss`: 1.22
|
|
|
### Training Script
|
|
|
```python
import multiprocessing

from datasets import load_dataset
from tqdm.rich import tqdm
from transformers import AutoTokenizer, AutoModelForCausalLM
from trl import (
    ModelConfig,
    SFTTrainer,
    get_peft_config,
    get_quantization_config,
    get_kbit_device_map,
    SFTConfig,
    ScriptArguments,
    TrlParser,
)

tqdm.pandas()

if __name__ == "__main__":
    # Parse script, training, and model arguments from the CLI / config file.
    parser = TrlParser((ScriptArguments, SFTConfig, ModelConfig))
    args, training_args, model_config = parser.parse_args_and_config()

    # Build the model loading kwargs (dtype, attention backend, optional quantization).
    quantization_config = get_quantization_config(model_config)
    model_kwargs = dict(
        revision=model_config.model_revision,
        trust_remote_code=model_config.trust_remote_code,
        attn_implementation=model_config.attn_implementation,
        torch_dtype=model_config.torch_dtype,
        # The KV cache is useless with gradient checkpointing during training.
        use_cache=False if training_args.gradient_checkpointing else True,
        device_map=get_kbit_device_map() if quantization_config is not None else None,
        quantization_config=quantization_config,
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_config.model_name_or_path, **model_kwargs
    )
    tokenizer = AutoTokenizer.from_pretrained(
        model_config.model_name_or_path,
        trust_remote_code=model_config.trust_remote_code,
        use_fast=True,
    )
    # Llama 3.2 ships without a dedicated pad token, so reuse the end-of-text token.
    tokenizer.pad_token = '<|end_of_text|>'

    # Load the training split of ultrachat_200k, preprocessing in parallel.
    train_dataset = load_dataset(
        args.dataset_name,
        split=args.dataset_train_split,
        num_proc=multiprocessing.cpu_count(),
    )

    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        processing_class=tokenizer,
        peft_config=get_peft_config(model_config),  # None unless PEFT/LoRA is enabled
    )

    trainer.train()

    trainer.save_model(training_args.output_dir)
```
|
|
|
### Test Script |
|
```python
from vllm import LLM
from vllm.sampling_params import SamplingParams
from transformers import AutoTokenizer

MODEL_PATH = "autodl-tmp/saves/Llama-3.2-1B-ultrachat200k"

# Load the fine-tuned checkpoint with vLLM for fast inference.
model = LLM(MODEL_PATH,
            tensor_parallel_size=1,
            dtype='bfloat16')
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Format a single user turn with the chat template used during SFT.
prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Where is Harbin?"}],
                                       tokenize=False,
                                       add_generation_prompt=True)
sampling_params = SamplingParams(max_tokens=1024,
                                 temperature=0.7,
                                 logprobs=1,
                                 stop_token_ids=[tokenizer.eos_token_id])

vllm_generations = model.generate(prompt,
                                  sampling_params)

print(vllm_generations[0].outputs[0].text)
# print result: Harbin is located in northeastern China in the Heilongjiang province. It is the capital of Heilongjiang province in the Northeast Asia.
```