Riyuechang/Breeze-7B-PTT-Chat-v2_lora

Riyuechang committed on Sep 18

Commit eb33e4b • 1 Parent(s): 6933e14

Update README.md

Files changed (1)

README.md +55 -3

README.md CHANGED

@@ -1,3 +1,55 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- Riyuechang/PTT-Corpus-100K_Gossiping-1400-39400
+base_model: MediaTek-Research/Breeze-7B-Instruct-v1_0
+pipeline_tag: text-generation
+library_name: peft
+tags:
+- PTT
+- PTT_Chat
+---
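+# Version Info
+Trained on new data that is (in theory) less noisy.
+The LoRA uses a larger rank, r=32.
+DoRA is no longer used,
+because its gains were limited and it greatly reduces training and inference efficiency.
+# Introduction
+The LoRA adapter used by [Riyuechang/Breeze-7B-PTT-Chat-v2](https://huggingface.co/Riyuechang/Breeze-7B-PTT-Chat-v2), not merged with the base model [MediaTek-Research/Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0).
+# Environment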
+- Ubuntu 22.04.4 LTS
+- NVIDIA GeForce RTX 3060 12G
+# LoRA parameters
+```python
+r=32,
+lora_alpha=32,
+lora_dropout=0.1,
+task_type="CAUSAL_LM",
+target_modules="all-linear",
+bias="none",
+use_rslora=True
+```
+# Training parameters
+```python
+per_device_train_batch_size=28,
+gradient_accumulation_steps=1,
+num_train_epochs=3,
+warmup_ratio=0.1,
+learning_rate=2e-5,
+bf16=True,
+save_strategy="steps",
+save_steps=1000,
+save_total_limit=5,
+logging_steps=10,
+output_dir=log_output,
+optim="paged_adamw_8bit",
+gradient_checkpointing=True
+```
+# Results
+- loss: 0.9391
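
The added README sets `use_rslora=True` with `r=32` and `lora_alpha=32`. In PEFT, rank-stabilized LoRA scales the adapter update by `lora_alpha / sqrt(r)` rather than the standard `lora_alpha / r`. The sketch below works through that arithmetic for the listed values, along with the effective batch size implied by the training parameters. It uses only the standard library. The helper names `lora_scaling` and `effective_batch_size` are hypothetical, and a single GPU is assumed because the README lists one RTX 3060.

```python
import math


def lora_scaling(r: int, lora_alpha: int, use_rslora: bool) -> float:
    """Scaling factor applied to the low-rank update (hypothetical helper)."""
    if use_rslora:
        # Rank-stabilized LoRA: alpha / sqrt(r)
        return lora_alpha / math.sqrt(r)
    # Standard LoRA: alpha / r
    return lora_alpha / r


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (hypothetical helper)."""
    return per_device * grad_accum * num_devices


# Values taken from the README diff above; num_devices=1 is an assumption.
rs_scale = lora_scaling(r=32, lora_alpha=32, use_rslora=True)
plain_scale = lora_scaling(r=32, lora_alpha=32, use_rslora=False)
batch = effective_batch_size(per_device=28, grad_accum=1)

print(f"rsLoRA scaling:   {rs_scale:.3f}")
print(f"standard scaling: {plain_scale:.3f}")
print(f"effective batch:  {batch}")
```

With these settings the rsLoRA factor is about 5.66 versus 1.0 for standard LoRA, so the adapter update is scaled noticeably more strongly than the same `r` and `lora_alpha` would give without `use_rslora`.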