Model save

Files changed (3)

README.md +86 -0
adapter_model.safetensors +1 -1
runs/Apr04_03-39-24_llm-a100-40/events.out.tfevents.1712201995.llm-a100-40.16389.0 +2 -2

README.md ADDED

@@ -0,0 +1,86 @@

+---
+license: apache-2.0
+library_name: peft
+tags:
+- trl
+- sft
+- generated_from_trainer
+datasets:
+- generator
+base_model: mistralai/Mixtral-8x7B-v0.1
+model-index:
+- name: mixtral_full
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mixtral_full
+This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.9766
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 2
+- eval_batch_size: 1
+- seed: 42
+- gradient_accumulation_steps: 32
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 1.5
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.3844        | 0.07  | 20   | 1.2695          |
+| 1.1368        | 0.14  | 40   | 1.0996          |
+| 1.0758        | 0.21  | 60   | 1.0459          |
+| 1.0537        | 0.28  | 80   | 1.0269          |
+| 1.0397        | 0.35  | 100  | 1.0147          |
+| 1.0075        | 0.43  | 120  | 1.0059          |
+| 1.0145        | 0.5   | 140  | 0.9990          |
+| 0.9939        | 0.57  | 160  | 0.9937          |
+| 1.0228        | 0.64  | 180  | 0.9895          |
+| 1.0056        | 0.71  | 200  | 0.9858          |
+| 0.999         | 0.78  | 220  | 0.9831          |
+| 1.0084        | 0.85  | 240  | 0.9809          |
+| 0.9957        | 0.92  | 260  | 0.9792          |
+| 1.0033        | 0.99  | 280  | 0.9781          |
+| 0.9884        | 1.06  | 300  | 0.9774          |
+| 0.9906        | 1.13  | 320  | 0.9770          |
+| 0.9893        | 1.2   | 340  | 0.9768          |
+| 1.0005        | 1.28  | 360  | 0.9766          |
+| 0.9824        | 1.35  | 380  | 0.9767          |
+| 0.9886        | 1.42  | 400  | 0.9766          |
+| 0.9828        | 1.49  | 420  | 0.9766          |
+### Framework versions
+- PEFT 0.7.2.dev0
+- Transformers 4.38.1
+- Pytorch 2.1.2+cu121
+- Datasets 2.16.1
+- Tokenizers 0.15.0
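
The hyperparameters in the model card above fit together arithmetically, and a small sketch may make that easier to check. It shows how `total_train_batch_size: 64` follows from `train_batch_size: 2` and `gradient_accumulation_steps: 32`, and what a cosine schedule with `lr_scheduler_warmup_ratio: 0.03` looks like at a given step. The single-device count and `TOTAL_STEPS = 423` are assumptions, not values stated in the card: the first is inferred from 64 = 2 × 32, the second is a rough estimate from step 420 landing at epoch 1.49 of 1.5. The function names are hypothetical, and the schedule is the usual linear-warmup-then-cosine shape rather than a copy of the Trainer's own code.

```python
import math

# Assumed, not stated in the model card: estimated from step 420 ~ epoch 1.49.
TOTAL_STEPS = 423


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def cosine_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = 3e-5, warmup_ratio: float = 0.03) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(2, 32))
    for step in (0, 13, 100, 200, 300, 420):
        print(f"step {step:3d}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the learning rate is already small by step 300, which is consistent with the validation loss flattening out around 0.977 in the later rows of the results table.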

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6877c9ecdbc1a1e1e541f5badccd9828a725476307cfe928bb4b9253559f8c09
+oid sha256:c487c591d0c935613751e84d41bb012ed1562c476c54958bf2de82d0b510a1fc
 size 969176736

runs/Apr04_03-39-24_llm-a100-40/events.out.tfevents.1712201995.llm-a100-40.16389.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a51fe55414ad7fdfe70cb8828fef3f3a8e85f4a2b423359bf09dcc196e5021ef
+oid sha256:245e443ab6b245247b5d8069c5531227bb3b1c313ecb14f4356cb0f446a6553d
-size 18898
+size 19945
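
The two binary files above are stored as Git LFS pointers, so the diff only shows a changed `oid` (and, for the event log, a changed `size`). A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the pointer has exactly the three fields shown in this commit.

```python
import hashlib
from pathlib import Path

# Pointer text for the new adapter weights, copied from the diff above.
NEW_ADAPTER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c487c591d0c935613751e84d41bb012ed1562c476c54958bf2de82d0b510a1fc
size 969176736
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files do not fill memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("adapter_model.safetensors", parse_lfs_pointer(NEW_ADAPTER_POINTER))` would compare a local copy against the new pointer. The size check runs first because it is cheap and rules out truncated downloads before any hashing.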