---
license: llama3.2
datasets:
- HuggingFaceH4/ultrachat_200k
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- trl
- llama
- sft
- alignment
- transformers
- custom
- chat
---
# Llama-3.2-1B-ultrachat200k
## Model Details
- **Model type:** SFT (supervised fine-tuned) model
- **License:** llama3.2
- **Finetuned from model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- **Training data:** [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
- **Training framework:** [trl](https://github.com/huggingface/trl)
## Training Details
### Training Hyperparameters
`attn_implementation`: flash_attention_2 \
`bf16`: True \
`learning_rate`: 2e-5 \
`lr_scheduler_type`: cosine \
`per_device_train_batch_size`: 2 \
`gradient_accumulation_steps`: 16 \
`torch_dtype`: bfloat16 \
`num_train_epochs`: 1 \
`max_seq_length`: 2048 \
`warmup_ratio`: 0.1
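
To make the batch and schedule settings above concrete, here is a minimal sketch of how they combine. It assumes a single device (the number of GPUs is not stated in this card) and a hypothetical total step count; the schedule function is a plain-Python approximation of a cosine schedule with linear warmup, not the trainer's own implementation.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def lr_at_step(step: int, total_steps: int, peak_lr: float = 2e-5,
               warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (approximation)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # num_devices=1 is an assumption; multiply by the real GPU count if it differs.
    print("effective batch size:", effective_batch_size(2, 16, num_devices=1))
    total = 1000  # hypothetical step count, for illustration only
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s, total))
```

With `warmup_ratio` 0.1, the first tenth of the steps ramps the learning rate up linearly, and the remainder follows the cosine decay.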
### Results
`init_train_loss`: 1.726 \
`final_train_loss`: 1.22
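
For intuition, the losses above can be converted to perplexity. This is a small sketch that assumes the reported values are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly in this card.

```python
import math


def perplexity(mean_ce_loss_nats: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(mean_ce_loss_nats)


if __name__ == "__main__":
    init_loss, final_loss = 1.726, 1.22  # values reported above
    print(f"initial perplexity: {perplexity(init_loss):.2f}")
    print(f"final perplexity:   {perplexity(final_loss):.2f}")
```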
### Training Script
```python
import multiprocessing

from datasets import load_dataset
from tqdm.rich import tqdm
from transformers import AutoTokenizer, AutoModelForCausalLM
from trl import (
    ModelConfig,
    SFTTrainer,
    get_peft_config,
    get_quantization_config,
    get_kbit_device_map,
    SFTConfig,
    ScriptArguments,
    TrlParser,
)

tqdm.pandas()

if __name__ == "__main__":
    # Parse script, training, and model arguments from the CLI or a config file.
    parser = TrlParser((ScriptArguments, SFTConfig, ModelConfig))
    args, training_args, model_config = parser.parse_args_and_config()

    # Model: optional quantization, attention backend, and dtype come from ModelConfig.
    quantization_config = get_quantization_config(model_config)
    model_kwargs = dict(
        revision=model_config.model_revision,
        trust_remote_code=model_config.trust_remote_code,
        attn_implementation=model_config.attn_implementation,
        torch_dtype=model_config.torch_dtype,
        # The KV cache is disabled when gradient checkpointing is enabled.
        use_cache=False if training_args.gradient_checkpointing else True,
        device_map=get_kbit_device_map() if quantization_config is not None else None,
        quantization_config=quantization_config,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_config.model_name_or_path, **model_kwargs
    )

    # Tokenizer: set the pad token to <|end_of_text|> so batches can be padded.
    tokenizer = AutoTokenizer.from_pretrained(
        model_config.model_name_or_path,
        trust_remote_code=model_config.trust_remote_code,
        use_fast=True,
    )
    tokenizer.pad_token = '<|end_of_text|>'

    # Dataset: load only the training split.
    train_dataset = load_dataset(
        args.dataset_name,
        split=args.dataset_train_split,
        num_proc=multiprocessing.cpu_count(),
    )

    # Train and save the final model to the output directory.
    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        processing_class=tokenizer,
        peft_config=get_peft_config(model_config),
    )
    trainer.train()
    trainer.save_model(training_args.output_dir)
```
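
The training script takes all of its settings from command-line arguments (or a config file) via `TrlParser`. As a rough sketch, the hyperparameters listed earlier could be assembled into a command line as follows. The script name `sft.py` is hypothetical, the `train_sft` split name is an assumption, and the output directory is borrowed from `MODEL_PATH` in the test script; none of these are confirmed as the exact invocation used for this model.

```python
import shlex

# Values taken from this model card; entries marked "assumed" are not stated in it.
HYPERPARAMS = {
    "model_name_or_path": "meta-llama/Llama-3.2-1B",
    "dataset_name": "HuggingFaceH4/ultrachat_200k",
    "dataset_train_split": "train_sft",  # assumed split name
    "attn_implementation": "flash_attention_2",
    "torch_dtype": "bfloat16",
    "bf16": True,
    "learning_rate": 2e-5,
    "lr_scheduler_type": "cosine",
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 16,
    "num_train_epochs": 1,
    "max_seq_length": 2048,
    "warmup_ratio": 0.1,
    # Assumed: same path as MODEL_PATH in the test script.
    "output_dir": "autodl-tmp/saves/Llama-3.2-1B-ultrachat200k",
}


def build_command(script: str, params: dict) -> list[str]:
    """Turn a dict of settings into `python script --key value ...` arguments."""
    cmd = ["python", script]
    for key, value in params.items():
        cmd += [f"--{key}", str(value)]
    return cmd


if __name__ == "__main__":
    # "sft.py" is a hypothetical file name for the training script above.
    print(shlex.join(build_command("sft.py", HYPERPARAMS)))
```

The printed command can be pasted into a shell, or the list passed to `subprocess.run`, once the assumed values are replaced with the real ones.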
### Test Script
```python
from vllm import LLM
from vllm.sampling_params import SamplingParams
from transformers import AutoTokenizer

MODEL_PATH = "autodl-tmp/saves/Llama-3.2-1B-ultrachat200k"

# Load the fine-tuned model on a single GPU in bfloat16.
model = LLM(MODEL_PATH,
            tensor_parallel_size=1,
            dtype='bfloat16')
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Render the chat messages into a single prompt string using the chat template.
prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Where is Harbin?"}],
                                       tokenize=False,
                                       add_generation_prompt=True)

sampling_params = SamplingParams(max_tokens=1024,
                                 temperature=0.7,
                                 logprobs=1,
                                 stop_token_ids=[tokenizer.eos_token_id])

vllm_generations = model.generate(prompt,
                                  sampling_params)
print(vllm_generations[0].outputs[0].text)
# Example output: Harbin is located in northeastern China in the Heilongjiang province. It is the capital of Heilongjiang province in the Northeast Asia.
```