guoqiang-x committed on
Commit eebedcd
1 Parent(s): 5c4b6b8

Model save

Files changed (4)
  1. README.md +110 -0
  2. all_results.json +9 -0
  3. train_results.json +9 -0
  4. trainer_state.json +0 -0
README.md ADDED
@@ -0,0 +1,110 @@
+ ---
+ base_model: mistralai/Mistral-7B-v0.1
+ library_name: peft
+ license: apache-2.0
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ model-index:
+ - name: zephyr-7b-dpo-qlora
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # zephyr-7b-dpo-qlora
+
+ This model is a QLoRA (PEFT) adapter fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) with DPO on an unknown dataset; a hedged loading sketch follows the results list below.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.4788
+ - Rewards/chosen: -2.6215
+ - Rewards/rejected: -3.9183
+ - Rewards/accuracies: 0.7475
+ - Rewards/margins: 1.2968
+ - Logps/rejected: -636.4029
+ - Logps/chosen: -526.7561
+ - Logits/rejected: -1.0296
+ - Logits/chosen: -1.1658
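+
+ A minimal sketch of how to load and query the adapter (the repo id `guoqiang-x/zephyr-7b-dpo-qlora` is an assumption based on the committer and model name, not confirmed by the card):
+
+ ```python
+ # Hypothetical usage sketch: load the QLoRA adapter on top of the base model.
+ # The adapter repo id below is an assumption; substitute the real location.
+ import torch
+ from peft import AutoPeftModelForCausalLM
+ from transformers import AutoTokenizer
+
+ model = AutoPeftModelForCausalLM.from_pretrained(
+     "guoqiang-x/zephyr-7b-dpo-qlora",  # assumed adapter repo id
+     torch_dtype=torch.bfloat16,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
+
+ inputs = tokenizer("What is direct preference optimization?", return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```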
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a hedged TRL configuration sketch follows this list):
+ - learning_rate: 5e-06
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
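+
+ The training script itself is not part of this commit; the block below is a minimal sketch, assuming TRL's `DPOConfig`/`DPOTrainer` API (the dataset, LoRA shape, `beta`, and precision are assumptions, not taken from this card):
+
+ ```python
+ # Hypothetical sketch mapping the listed hyperparameters onto TRL's DPO API.
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from trl import DPOConfig, DPOTrainer
+
+ model_id = "mistralai/Mistral-7B-v0.1"
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token
+
+ args = DPOConfig(
+     output_dir="zephyr-7b-dpo-qlora",
+     learning_rate=5e-6,
+     per_device_train_batch_size=4,    # train_batch_size: 4
+     per_device_eval_batch_size=8,     # eval_batch_size: 8
+     gradient_accumulation_steps=4,    # 4 x 4 = total_train_batch_size 16
+     num_train_epochs=1,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     seed=42,
+     beta=0.1,                         # assumed; the card does not list beta
+ )
+
+ peft_config = LoraConfig(             # assumed QLoRA adapter shape
+     r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
+ )
+
+ # Assumed preference dataset in TRL's prompt/chosen/rejected format.
+ dataset = load_dataset("trl-lib/ultrafeedback_binarized")
+
+ trainer = DPOTrainer(
+     model=model,
+     args=args,
+     train_dataset=dataset["train"],
+     eval_dataset=dataset["test"],
+     tokenizer=tokenizer,              # processing_class= in newer TRL releases
+     peft_config=peft_config,
+ )
+ trainer.train()
+ ```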
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.6807 | 0.0262 | 100 | 0.6809 | 0.0514 | 0.0256 | 0.6555 | 0.0258 | -242.0131 | -259.4604 | -2.0551 | -2.1482 |
+ | 0.6438 | 0.0523 | 200 | 0.6356 | -0.1881 | -0.3389 | 0.6760 | 0.1508 | -278.4615 | -283.4154 | -2.0113 | -2.1000 |
+ | 0.6073 | 0.0785 | 300 | 0.6054 | -0.6866 | -0.9744 | 0.6815 | 0.2878 | -342.0091 | -333.2583 | -1.9949 | -2.0782 |
+ | 0.5956 | 0.1047 | 400 | 0.5824 | -1.4485 | -1.9599 | 0.6830 | 0.5114 | -440.5653 | -409.4522 | -1.5844 | -1.6758 |
+ | 0.5643 | 0.1309 | 500 | 0.5726 | -1.1458 | -1.7589 | 0.6915 | 0.6131 | -420.4636 | -379.1804 | -1.5624 | -1.6658 |
+ | 0.5373 | 0.1570 | 600 | 0.5631 | -1.1286 | -1.8164 | 0.7030 | 0.6878 | -426.2121 | -377.4605 | -1.6945 | -1.7955 |
+ | 0.5394 | 0.1832 | 700 | 0.5474 | -2.2700 | -3.0663 | 0.7040 | 0.7963 | -551.1992 | -491.6012 | -1.1628 | -1.2719 |
+ | 0.4983 | 0.2094 | 800 | 0.5323 | -1.5616 | -2.2966 | 0.7225 | 0.7349 | -474.2269 | -420.7654 | -1.5104 | -1.5996 |
+ | 0.4763 | 0.2355 | 900 | 0.5386 | -1.6130 | -2.4122 | 0.7160 | 0.7992 | -485.7890 | -425.9030 | -1.4156 | -1.4989 |
+ | 0.5266 | 0.2617 | 1000 | 0.5234 | -2.1788 | -3.0546 | 0.7280 | 0.8758 | -550.0311 | -482.4831 | -1.2043 | -1.3050 |
+ | 0.59 | 0.2879 | 1100 | 0.5278 | -1.6937 | -2.3427 | 0.7300 | 0.6490 | -478.8385 | -433.9710 | -0.9899 | -1.1100 |
+ | 0.5724 | 0.3141 | 1200 | 0.5071 | -1.5548 | -2.4072 | 0.7380 | 0.8523 | -485.2895 | -420.0863 | -1.1349 | -1.2473 |
+ | 0.5457 | 0.3402 | 1300 | 0.5013 | -1.7544 | -2.6264 | 0.7435 | 0.8721 | -507.2138 | -440.0385 | -1.2424 | -1.3403 |
+ | 0.5423 | 0.3664 | 1400 | 0.5132 | -1.6381 | -2.6114 | 0.7210 | 0.9733 | -505.7077 | -428.4097 | -1.5063 | -1.5869 |
+ | 0.4492 | 0.3926 | 1500 | 0.5122 | -1.5882 | -2.5891 | 0.7260 | 1.0010 | -503.4828 | -423.4175 | -1.4972 | -1.5950 |
+ | 0.5491 | 0.4187 | 1600 | 0.4956 | -1.6959 | -2.7056 | 0.7395 | 1.0098 | -515.1351 | -434.1913 | -1.1293 | -1.2525 |
+ | 0.5408 | 0.4449 | 1700 | 0.5111 | -3.0361 | -4.2392 | 0.7305 | 1.2030 | -668.4869 | -568.2142 | -1.0520 | -1.1774 |
+ | 0.4705 | 0.4711 | 1800 | 0.4949 | -2.1236 | -3.1894 | 0.7435 | 1.0658 | -563.5121 | -476.9663 | -1.3479 | -1.4508 |
+ | 0.4447 | 0.4973 | 1900 | 0.4984 | -2.0350 | -3.1505 | 0.7420 | 1.1155 | -559.6229 | -468.1011 | -1.1711 | -1.2951 |
+ | 0.4561 | 0.5234 | 2000 | 0.4929 | -1.9668 | -2.9588 | 0.7420 | 0.9919 | -540.4462 | -461.2839 | -1.3557 | -1.4696 |
+ | 0.5068 | 0.5496 | 2100 | 0.4969 | -3.1452 | -4.3633 | 0.7350 | 1.2180 | -680.8954 | -579.1231 | -1.1150 | -1.2426 |
+ | 0.4839 | 0.5758 | 2200 | 0.4927 | -2.3797 | -3.4376 | 0.7405 | 1.0579 | -588.3315 | -502.5681 | -1.2706 | -1.3886 |
+ | 0.4729 | 0.6019 | 2300 | 0.4924 | -2.8461 | -4.1210 | 0.7405 | 1.2749 | -656.6667 | -549.2124 | -1.0868 | -1.2145 |
+ | 0.4501 | 0.6281 | 2400 | 0.4900 | -2.9743 | -4.2366 | 0.7430 | 1.2623 | -668.2346 | -562.0333 | -0.9978 | -1.1257 |
+ | 0.4982 | 0.6543 | 2500 | 0.4872 | -2.4585 | -3.6758 | 0.7420 | 1.2173 | -612.1486 | -510.4511 | -1.0532 | -1.1862 |
+ | 0.4649 | 0.6805 | 2600 | 0.4881 | -2.5759 | -3.8831 | 0.7450 | 1.3072 | -632.8793 | -522.1908 | -1.0793 | -1.2115 |
+ | 0.556 | 0.7066 | 2700 | 0.4841 | -2.3432 | -3.5113 | 0.7460 | 1.1680 | -595.6959 | -498.9265 | -1.1004 | -1.2295 |
+ | 0.4617 | 0.7328 | 2800 | 0.4832 | -2.3495 | -3.6183 | 0.7460 | 1.2689 | -606.4033 | -499.5496 | -1.0627 | -1.1960 |
+ | 0.4916 | 0.7590 | 2900 | 0.4800 | -2.6711 | -3.9165 | 0.7455 | 1.2454 | -636.2195 | -531.7142 | -1.0032 | -1.1418 |
+ | 0.4708 | 0.7851 | 3000 | 0.4797 | -2.6166 | -3.7883 | 0.7475 | 1.1717 | -623.4008 | -526.2621 | -0.9962 | -1.1355 |
+ | 0.4804 | 0.8113 | 3100 | 0.4807 | -2.8224 | -4.1220 | 0.7475 | 1.2996 | -656.7728 | -546.8435 | -0.9953 | -1.1341 |
+ | 0.4866 | 0.8375 | 3200 | 0.4777 | -2.5496 | -3.7894 | 0.7475 | 1.2398 | -623.5103 | -519.5614 | -1.0276 | -1.1641 |
+ | 0.4967 | 0.8636 | 3300 | 0.4786 | -2.5578 | -3.8108 | 0.7480 | 1.2530 | -625.6535 | -520.3804 | -1.0241 | -1.1608 |
+ | 0.4272 | 0.8898 | 3400 | 0.4797 | -2.7223 | -4.0287 | 0.7460 | 1.3065 | -647.4435 | -536.8282 | -1.0071 | -1.1445 |
+ | 0.5272 | 0.9160 | 3500 | 0.4797 | -2.7144 | -4.0320 | 0.7470 | 1.3176 | -647.7730 | -536.0449 | -1.0233 | -1.1601 |
+ | 0.4441 | 0.9422 | 3600 | 0.4790 | -2.6459 | -3.9513 | 0.7470 | 1.3054 | -639.7043 | -529.1944 | -1.0278 | -1.1641 |
+ | 0.4823 | 0.9683 | 3700 | 0.4789 | -2.6279 | -3.9262 | 0.7480 | 1.2982 | -637.1880 | -527.3952 | -1.0329 | -1.1687 |
+ | 0.4996 | 0.9945 | 3800 | 0.4788 | -2.6215 | -3.9183 | 0.7475 | 1.2968 | -636.4029 | -526.7561 | -1.0296 | -1.1658 |
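+
+ For reference, the reward columns above are TRL's implicit DPO rewards (a hedged reading, assuming TRL's standard DPO logging, with beta the DPO temperature):
+
+ $$
+ r_{\text{chosen}} = \beta\,\bigl(\log \pi_\theta(y_c \mid x) - \log \pi_{\text{ref}}(y_c \mid x)\bigr),
+ \qquad
+ r_{\text{rejected}} = \beta\,\bigl(\log \pi_\theta(y_r \mid x) - \log \pi_{\text{ref}}(y_r \mid x)\bigr)
+ $$
+
+ Rewards/margins is the mean of r_chosen - r_rejected, and Rewards/accuracies is the fraction of evaluation pairs with r_chosen > r_rejected; both rise steadily while the validation loss falls from 0.68 to 0.48.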
+
+
+ ### Framework versions
+
+ - PEFT 0.13.2
+ - Transformers 4.45.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
all_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 1.0,
+     "total_flos": 0.0,
+     "train_loss": 0.517807064771465,
+     "train_runtime": 164396.369,
+     "train_samples": 61134,
+     "train_samples_per_second": 0.372,
+     "train_steps_per_second": 0.023
+ }
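
As a quick consistency check, the reported throughput follows from the raw counts in this file (the total_train_batch_size of 16 comes from the README):

```python
# Sanity-check the reported throughput against the raw counts above.
train_samples = 61134
train_runtime_s = 164396.369
total_train_batch_size = 16                       # from the README hyperparameters

print(round(train_samples / train_runtime_s, 3))  # 0.372 samples/s
steps = train_samples / total_train_batch_size    # ~3821 optimizer steps
print(round(steps / train_runtime_s, 3))          # 0.023 steps/s
```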
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 1.0,
+     "total_flos": 0.0,
+     "train_loss": 0.517807064771465,
+     "train_runtime": 164396.369,
+     "train_samples": 61134,
+     "train_samples_per_second": 0.372,
+     "train_steps_per_second": 0.023
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff