Tilemachos Chatzipapas, twenty8th, and winglian committed
Commit cc25039
1 Parent(s): 9135b9e

Fine-Tuning Mistral-7b for Real-World Chatbot Applications Using Axolotl (Lora used) (#1155)


* Mistral-7b finetune example using axolotl with code,config,data

* Corrected the path for huggingface dataset

* Update data.jsonl

* chore: lint

---------

Co-authored-by: twenty8th <twenty8th@users.noreply.github.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>

README.md CHANGED
@@ -1,3 +1,6 @@
+ ## What's New
+ - Added `Mistral-7b-example`: A comprehensive example for fine-tuning the Mistral-7b model. [Check it out here](https://github.com/Tilemachoc/axolotl/tree/mistral-7b-example/examples/mistral/Mistral-7b-example).
+
  # Axolotl
 
  Axolotl is a tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.
examples/mistral/Mistral-7b-example/README.md ADDED
@@ -0,0 +1,12 @@
+ # Description
+ This repository presents an in-depth guide to fine-tuning Mistral-7b (or any other compatible model) with Axolotl, tailored specifically for chatbot development. It streamlines the process of fine-tuning and uploading the resulting model to HuggingFace 🤗, making it a useful starting point for developers in the AI and chatbot domain.
+
+ **What's Inside:**
+
+ Beginner-Friendly Instructions: Comprehensive steps to guide you through fine-tuning your chosen model, including details on the data structure (jsonl), the configuration, and the code itself.
+
+ Hardware Utilized: For reference, the fine-tuning in this guide was performed on 4x NVIDIA GeForce RTX 3090 (rented instance, 2.1.2-cuda12.1-cudnn8-devel image).
+
+ **Uploading to HuggingFace 🤗:**
+ To upload your fine-tuned model to Hugging Face, include the following files:
+ ![Screenshot 2024-01-19 213932](https://github.com/OpenAccess-AI-Collective/axolotl/assets/138583191/d660eb84-2d76-46a1-9846-cf0aeb3006d9)
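The exact file list depends on whether the adapter was merged and how the weights are sharded, so check it against the screenshot for your own run. As a hedged sketch (the `REQUIRED_FILES` list and `missing_files` helper are illustrative, not part of the original guide; the commented upload call uses the real `huggingface_hub` API but needs `pip install huggingface_hub` and a login):

```python
from pathlib import Path

# Files a merged model repo typically needs before uploading (illustrative
# list; compare it against the screenshot above for your own run).
REQUIRED_FILES = [
    "config.json",
    "generation_config.json",
    "model.safetensors",  # or sharded model-0000x-of-0000y.safetensors files
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def missing_files(model_dir: str) -> list[str]:
    """Return the required files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical upload call once missing_files("./out") comes back empty:
    # from huggingface_hub import HfApi
    # HfApi().upload_folder(folder_path="./out", repo_id="your-user/your-model")
    print(missing_files("./out"))
```

Running the check before `upload_folder` avoids pushing a half-complete repo that can't be loaded with `from_pretrained`.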
examples/mistral/Mistral-7b-example/code.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
examples/mistral/Mistral-7b-example/config.yml ADDED
@@ -0,0 +1,75 @@
+ # Mistral-7b
+ base_model: mistralai/Mistral-7B-v0.1
+ model_type: MistralForCausalLM
+ tokenizer_type: LlamaTokenizer
+ is_mistral_derived_model: true
+
+ load_in_8bit: true
+ load_in_4bit: false
+ strict: false
+
+ datasets:
+   - path: tilemachos/Demo-Dataset # path to the json dataset file on Hugging Face
+     # for the type and conversation options, read the axolotl README and pick what
+     # suits your project; for a chatbot, sharegpt with chatml is a good fit
+     type: sharegpt
+     conversation: chatml
+ dataset_prepared_path: tilemachos/Demo-Dataset # path to the json dataset file on Hugging Face
+ val_set_size: 0.05
+ output_dir: ./out
+
+ # using lora for lower cost
+ adapter: lora
+ lora_r: 8
+ lora_alpha: 16
+ lora_dropout: 0.05
+ lora_target_modules:
+   - q_proj
+   - v_proj
+
+ sequence_len: 512
+ sample_packing: false
+ pad_to_sequence_len: true
+
+ wandb_project:
+ wandb_entity:
+ wandb_watch:
+ wandb_name:
+ wandb_log_model:
+
+ # only 2 epochs because of the small dataset
+ gradient_accumulation_steps: 3
+ micro_batch_size: 2
+ num_epochs: 2
+ optimizer: adamw_bnb_8bit
+ lr_scheduler: cosine
+ learning_rate: 0.0002
+
+ train_on_inputs: false
+ group_by_length: false
+ bf16: true
+ fp16: false
+ tf32: false
+
+ gradient_checkpointing: true
+ early_stopping_patience:
+ resume_from_checkpoint:
+ local_rank:
+ logging_steps: 1
+ xformers_attention:
+ flash_attention: true
+
+ warmup_steps: 10
+ evals_per_epoch: 4
+ eval_table_size:
+ eval_table_max_new_tokens: 128
+ saves_per_epoch: 1
+ debug:
+ # default deepspeed; can use a more aggressive stage if needed, e.g. zero2 or zero3
+ deepspeed: deepspeed/zero1.json
+ weight_decay: 0.0
+ fsdp:
+ fsdp_config:
+ special_tokens:
+   bos_token: "<s>"
+   eos_token: "</s>"
+   unk_token: "<unk>"
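A quick way to sanity-check the batch settings in this config: the effective global batch size is `micro_batch_size` × `gradient_accumulation_steps` × number of GPUs. A minimal sketch (the function name is ours; the 4-GPU count comes from the hardware note in the example README):

```python
def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int) -> int:
    """Global number of samples contributing to each optimizer step."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus


# Values from the config above, on the 4x RTX 3090 setup from the README:
print(effective_batch_size(2, 3, 4))  # 24 samples per optimizer step
```

If you change GPU count or `micro_batch_size`, adjusting `gradient_accumulation_steps` to keep this product roughly constant keeps the learning-rate schedule comparable.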
examples/mistral/Mistral-7b-example/data.jsonl ADDED
@@ -0,0 +1,10 @@
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: Who is the Founder of Apple\""}, {"from": "gpt", "value": "\"<Chatbot>: The founder of Apple is Steve Jobs\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the capital of France?\""}, {"from": "gpt", "value": "\"<Chatbot>: The capital of France is Paris.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: How far is the Moon from Earth?\""}, {"from": "gpt", "value": "\"<Chatbot>: The Moon is approximately 384,400 kilometers from Earth.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the tallest mountain in the world?\""}, {"from": "gpt", "value": "\"<Chatbot>: The tallest mountain in the world is Mount Everest.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: Who wrote Romeo and Juliet?\""}, {"from": "gpt", "value": "\"<Chatbot>: Romeo and Juliet was written by William Shakespeare.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the boiling point of water?\""}, {"from": "gpt", "value": "\"<Chatbot>: The boiling point of water is 100 degrees Celsius.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: When was the first man on the moon?\""}, {"from": "gpt", "value": "\"<Chatbot>: The first man landed on the moon in 1969.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the largest ocean?\""}, {"from": "gpt", "value": "\"<Chatbot>: The largest ocean is the Pacific Ocean.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: Who invented the telephone?\""}, {"from": "gpt", "value": "\"<Chatbot>: The telephone was invented by Alexander Graham Bell.\""}]}
+ {"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the formula for water?\""}, {"from": "gpt", "value": "\"<Chatbot>: The chemical formula for water is H2O.\""}]}
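Each line in this file must be a standalone JSON object whose `conversations` field is a non-empty list of `{from, value}` turns, matching the `sharegpt` dataset type selected in the config. A small validator sketch (the function name is ours, not part of axolotl):

```python
import json


def validate_sharegpt_line(line: str) -> bool:
    """Check that one jsonl line matches the sharegpt-style structure used above."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    turns = record.get("conversations")
    if not isinstance(turns, list) or not turns:
        return False
    # Every turn needs string "from" and "value" fields.
    return all(
        isinstance(t, dict)
        and isinstance(t.get("from"), str)
        and isinstance(t.get("value"), str)
        for t in turns
    )


sample = ('{"conversations": [{"from": "Customer", "value": "hi"}, '
          '{"from": "gpt", "value": "hello"}]}')
print(validate_sharegpt_line(sample))  # True
```

Running such a check over `data.jsonl` before training catches malformed lines early, since a single bad line will otherwise fail dataset preprocessing.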