---
library_name: peft
datasets:
- fnlp/moss-003-sft-data
pipeline_tag: conversational
base_model: meta-llama/Llama-2-7b-hf
---
<div align="center">
<img src="https://github.com/InternLM/lmdeploy/assets/36994684/0cf8d00f-e86b-40ba-9b54-dc8f1bc6c8d8" width="600"/>
[![Generic badge](https://img.shields.io/badge/GitHub-%20XTuner-black.svg)](https://github.com/InternLM/xtuner)
</div>
## Model
Llama-2-7b-qlora-moss-003-sft is a QLoRA adapter fine-tuned from [Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the [moss-003-sft](https://huggingface.co/datasets/fnlp/moss-003-sft-data) dataset using [XTuner](https://github.com/InternLM/xtuner).
## Quickstart
### Usage with HuggingFace libraries
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, StoppingCriteria
from transformers.generation import GenerationConfig


class StopWordStoppingCriteria(StoppingCriteria):

    def __init__(self, tokenizer, stop_word):
        self.tokenizer = tokenizer
        self.stop_word = stop_word
        self.length = len(self.stop_word)

    def __call__(self, input_ids, *args, **kwargs) -> bool:
        cur_text = self.tokenizer.decode(input_ids[0])
        cur_text = cur_text.replace('\r', '').replace('\n', '')
        return cur_text[-self.length:] == self.stop_word


tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-hf', trust_remote_code=True)
quantization_config = BitsAndBytesConfig(load_in_4bit=True, load_in_8bit=False, llm_int8_threshold=6.0, llm_int8_has_fp16_weight=False, bnb_4bit_compute_dtype=torch.float16, bnb_4bit_use_double_quant=True, bnb_4bit_quant_type='nf4')
model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-hf', quantization_config=quantization_config, device_map='auto', trust_remote_code=True).eval()
model = PeftModel.from_pretrained(model, 'xtuner/Llama-2-7b-qlora-moss-003-sft')
gen_config = GenerationConfig(max_new_tokens=1024, do_sample=True, temperature=0.1, top_p=0.75, top_k=40)
# Note: Plugins are disabled in this example because the API depends on additional implementations.
# To try plugins, use the XTuner CLI instead.
prompt_template = (
'You are an AI assistant whose name is Llama2.\n'
'Capabilities and tools that Llama2 can possess.\n'
'- Inner thoughts: disabled.\n'
'- Web search: disabled.\n'
'- Calculator: disabled.\n'
'- Equation solver: disabled.\n'
'- Text-to-image: disabled.\n'
'- Image edition: disabled.\n'
'- Text-to-speech: disabled.\n'
'<|Human|>: {input}<eoh>\n'
'<|Inner Thoughts|>: None<eot>\n'
'<|Commands|>: None<eoc>\n'
'<|Results|>: None<eor>\n')
text = '请给我介绍五个上海的景点'  # "Please introduce five tourist attractions in Shanghai."
inputs = tokenizer(prompt_template.format(input=text), return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs, generation_config=gen_config, stopping_criteria=[StopWordStoppingCriteria(tokenizer, '<eom>')])
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
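# Example output (in Chinese). It lists the Bund, Shanghai Museum, Shanghai Science and
# Technology Museum, Shanghai Disneyland, and Shanghai Wild Animal Park:
"""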
好的,以下是五个上海的景点:
1. 外滩:外滩是上海的标志性景点之一,是一条长达1.5公里的沿江大道,沿途有许多历史建筑和现代化的高楼大厦。游客可以欣赏到黄浦江两岸的美景,还可以在这里拍照留念。
2. 上海博物馆:上海博物馆是上海市最大的博物馆之一,收藏了大量的历史文物和艺术品。博物馆内有许多展览,包括中国古代文物、近代艺术品和现代艺术品等。
3. 上海科技馆:上海科技馆是一座以科技为主题的博物馆,展示了许多科技产品和科技发展的历史。游客可以在这里了解到许多有趣的科技知识,还可以参加一些科技体验活动。
4. 上海迪士尼乐园:上海迪士尼乐园是中国第一个迪士尼乐园,是一个集游乐、购物、餐饮、娱乐等多种功能于一体的主题公园。游客可以在这里体验到迪士尼的经典故事和游乐设施。
5. 上海野生动物园:上海野生动物园是一座以野生动物观赏和保护为主题的大型动物园。它位于上海市浦东新区,是中国最大的野生动物园之一。
"""
```
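The stopping criterion above decodes the sequence at each step, strips line breaks, and checks whether the text ends with the stop word. The sketch below reproduces that check on plain strings so it can be tried without loading the model. `ends_with_stop_word` and `trim_at_stop_word` are hypothetical helper names, not part of XTuner or `transformers`, and the trimming step is only an assumption about how one might clean up a decoded reply that still contains `<eom>`.

```python
def ends_with_stop_word(decoded_text: str, stop_word: str = '<eom>') -> bool:
    """Mirror the check in StopWordStoppingCriteria.__call__ on a plain string."""
    cleaned = decoded_text.replace('\r', '').replace('\n', '')
    return cleaned[-len(stop_word):] == stop_word


def trim_at_stop_word(decoded_text: str, stop_word: str = '<eom>') -> str:
    """Drop the first stop word and everything after it (hypothetical cleanup step)."""
    return decoded_text.split(stop_word, 1)[0].rstrip()


print(ends_with_stop_word('Hello!<eom>\n'))  # True
print(ends_with_stop_word('Hello!'))         # False
print(trim_at_stop_word('Hello!<eom>\n'))    # Hello!
```

If the stop word remains in the decoded text, the result of `tokenizer.decode(...)` could be passed through `trim_at_stop_word` before printing.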
### Usage with XTuner CLI
#### Installation
```shell
pip install -U xtuner
```
#### Chat
> Remember to run `huggingface-cli login` and enter your access token first to access Llama2! See [User access tokens](https://huggingface.co/docs/hub/security-tokens#user-access-tokens) to learn how to obtain one.
```shell
export SERPER_API_KEY="xxx"  # Get a key from https://serper.dev to enable Google search.
xtuner chat meta-llama/Llama-2-7b-hf --adapter xtuner/Llama-2-7b-qlora-moss-003-sft --bot-name Llama2 --prompt-template moss_sft --system-template moss_sft --with-plugins calculate solve search --no-streamer
```
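When launching the chat from a script, it can help to check the environment before starting. The following is a minimal sketch that assembles the same argument list as the command above. `build_chat_command` is a hypothetical helper, not part of XTuner. It assumes that only the `search` plugin needs `SERPER_API_KEY`, and that omitting `--with-plugins` starts a chat without plugins.

```python
import os


def build_chat_command(plugins, env=None):
    """Assemble the `xtuner chat` argument list for a chosen set of plugins."""
    env = os.environ if env is None else env
    # Assumption: only the `search` plugin requires SERPER_API_KEY.
    if 'search' in plugins and not env.get('SERPER_API_KEY'):
        raise ValueError('Set SERPER_API_KEY before enabling the search plugin.')
    cmd = [
        'xtuner', 'chat', 'meta-llama/Llama-2-7b-hf',
        '--adapter', 'xtuner/Llama-2-7b-qlora-moss-003-sft',
        '--bot-name', 'Llama2',
        '--prompt-template', 'moss_sft',
        '--system-template', 'moss_sft',
    ]
    # Assumption: leaving out --with-plugins runs the chat with no plugins.
    if plugins:
        cmd += ['--with-plugins', *plugins]
    cmd.append('--no-streamer')
    return cmd


# Print the command instead of running it; an empty env simulates a missing key.
print(' '.join(build_chat_command(['calculate', 'solve'], env={})))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once `xtuner` is installed and the Hugging Face login is done.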
#### Fine-tune
Run the following command to reproduce the fine-tuning results on 8 GPUs (`NPROC_PER_NODE=8`).
```shell
NPROC_PER_NODE=8 xtuner train llama2_7b_qlora_moss_sft_all_e2_gpu8
```