---
license: apache-2.0
datasets:
- Riyuechang/PTT-Corpus-100K_Gossiping-1400-39400
base_model: MediaTek-Research/Breeze-7B-Instruct-v1_0
pipeline_tag: text-generation
library_name: peft
tags:
- PTT
- PTT_Chat
---

# Version Notes
- Trained on new data that is (in theory) less noisy.
- LoRA now uses a larger rank (r = 32).
- DoRA has been dropped: its gains were limited, and it significantly slowed both training and inference.

# Overview
The standalone LoRA adapter used by [Riyuechang/Breeze-7B-PTT-Chat-v2](https://huggingface.co/Riyuechang/Breeze-7B-PTT-Chat-v2), not merged into the base model [MediaTek-Research/Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0).
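
Since this repo ships only the unmerged adapter weights, they have to be attached to the base model at load time. A minimal loading sketch with `transformers` and `peft`; the `adapter_id` below is a placeholder for this repository's actual id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "MediaTek-Research/Breeze-7B-Instruct-v1_0"
adapter_id = "Riyuechang/Breeze-7B-PTT-Chat-v2-Lora"  # placeholder: substitute this repo's id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the unmerged LoRA weights

inputs = tokenizer("你好", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```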

# Hardware
- Ubuntu 22.04.4 LTS
- NVIDIA GeForce RTX 3060 12G

# LoRA Parameters
```python
from peft import LoraConfig

# LoRA settings used for this adapter
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules="all-linear",  # apply LoRA to every linear layer
    bias="none",
    use_rslora=True  # rank-stabilized scaling: lora_alpha / sqrt(r) instead of lora_alpha / r
)
```

# Training Parameters
```python
from transformers import TrainingArguments

# log_output is the output directory path defined elsewhere in the training script
training_args = TrainingArguments(
    per_device_train_batch_size=28,
    gradient_accumulation_steps=1,
    num_train_epochs=3,
    warmup_ratio=0.1,
    learning_rate=2e-5,
    bf16=True,
    save_strategy="steps",
    save_steps=1000,
    save_total_limit=5,
    logging_steps=10,
    output_dir=log_output,
    optim="paged_adamw_8bit",  # 8-bit paged AdamW (bitsandbytes) to cut optimizer memory
    gradient_checkpointing=True
)
```
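
For context, a sketch of how these two configs could be wired together, assuming a standard `transformers` `Trainer` (the actual training script is not included here, and `train_dataset` is a placeholder for the tokenized PTT data):

```python
from transformers import AutoModelForCausalLM, Trainer
from peft import get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "MediaTek-Research/Breeze-7B-Instruct-v1_0",
    torch_dtype="bfloat16",
)
model = get_peft_model(base, lora_config)  # wrap the base model with the LoRA adapter
model.print_trainable_parameters()         # sanity check: only adapter weights are trainable

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # placeholder: tokenized PTT-Corpus data
)
trainer.train()
```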

# Results
- loss: 0.9391