---
license: llama3.2
datasets:
- HuggingFaceH4/ultrachat_200k
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- trl
- llama
- sft
- alignment
- transformers
- custom
- chat
---

# Llama-3.2-1B-ultrachat200k

## Model Details

- **Model type:** SFT model
- **License:** llama3.2
- **Finetuned from model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- **Training data:** [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
- **Training framework:** [trl](https://github.com/huggingface/trl)

## Training Details

### Training Hyperparameters

`attn_implementation`: flash_attention_2 \
`bf16`: True \
`learning_rate`: 2e-5 \
`lr_scheduler_type`: cosine \
`per_device_train_batch_size`: 2 \
`gradient_accumulation_steps`: 16 \
`torch_dtype`: bfloat16 \
`num_train_epochs`: 1 \
`max_seq_length`: 2048 \
`warmup_ratio`: 0.1

Most of these map directly onto `trl`'s `SFTConfig`; see the sketch after the test script.

### Results

`init_train_loss`: 1.726 \
`final_train_loss`: 1.22

### Training Script

```python
import multiprocessing

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import (
    ModelConfig,
    ScriptArguments,
    SFTConfig,
    SFTTrainer,
    TrlParser,
    get_kbit_device_map,
    get_peft_config,
    get_quantization_config,
)

if __name__ == "__main__":
    # Parse script, training, and model arguments from the CLI / config file.
    parser = TrlParser((ScriptArguments, SFTConfig, ModelConfig))
    args, training_args, model_config = parser.parse_args_and_config()

    # Optional k-bit quantization; None unless requested via ModelConfig.
    quantization_config = get_quantization_config(model_config)
    model_kwargs = dict(
        revision=model_config.model_revision,
        trust_remote_code=model_config.trust_remote_code,
        attn_implementation=model_config.attn_implementation,
        torch_dtype=model_config.torch_dtype,
        # The KV cache is incompatible with gradient checkpointing during training.
        use_cache=False if training_args.gradient_checkpointing else True,
        device_map=get_kbit_device_map() if quantization_config is not None else None,
        quantization_config=quantization_config,
    )
    model = AutoModelForCausalLM.from_pretrained(model_config.model_name_or_path, **model_kwargs)

    tokenizer = AutoTokenizer.from_pretrained(
        model_config.model_name_or_path,
        trust_remote_code=model_config.trust_remote_code,
        use_fast=True,
    )
    # The base Llama 3.2 tokenizer defines no pad token; reuse the end-of-text token.
    tokenizer.pad_token = "<|end_of_text|>"

    train_dataset = load_dataset(
        args.dataset_name,
        split=args.dataset_train_split,
        num_proc=multiprocessing.cpu_count(),
    )

    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        processing_class=tokenizer,
        peft_config=get_peft_config(model_config),
    )
    trainer.train()
    trainer.save_model(training_args.output_dir)
```

### Test Script

```python
from transformers import AutoTokenizer
from vllm import LLM
from vllm.sampling_params import SamplingParams

MODEL_PATH = "autodl-tmp/saves/Llama-3.2-1B-ultrachat200k"

model = LLM(MODEL_PATH, tensor_parallel_size=1, dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Build a chat-formatted prompt with the tokenizer's chat template.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Where is Harbin?"}],
    tokenize=False,
    add_generation_prompt=True,
)

sampling_params = SamplingParams(
    max_tokens=1024,
    temperature=0.7,
    logprobs=1,
    stop_token_ids=[tokenizer.eos_token_id],
)

vllm_generations = model.generate(prompt, sampling_params)
print(vllm_generations[0].outputs[0].text)
# Example output:
# Harbin is located in northeastern China in the Heilongjiang province. It is the
# capital of Heilongjiang province in the Northeast Asia.
```
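
### Hyperparameters as an `SFTConfig`

For reference, the hyperparameters listed above can be expressed programmatically. This is a minimal sketch, not the exact configuration of the original run: `output_dir` is a placeholder, `attn_implementation` and `torch_dtype` belong to `ModelConfig` rather than `SFTConfig`, and on recent `trl` releases the sequence-length field is named `max_length` instead of `max_seq_length`.

```python
from trl import SFTConfig

# Hypothetical SFTConfig mirroring the hyperparameters above;
# output_dir is a placeholder, not the path from the original run.
training_args = SFTConfig(
    output_dir="saves/Llama-3.2-1B-ultrachat200k",
    bf16=True,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    max_seq_length=2048,  # named `max_length` on newer trl releases
    warmup_ratio=0.1,
)
```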
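
### Inference with transformers

As a lighter-weight alternative to the vLLM test script, the model can also be queried with plain `transformers`. A minimal sketch, assuming the same local model path as above; the generation settings are illustrative, not taken from the original evaluation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "autodl-tmp/saves/Llama-3.2-1B-ultrachat200k"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-formatted prompt and generate a reply.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Where is Harbin?"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```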