chargoddard/piano-medley-7b

chargoddard committed on Dec 10, 2023

Commit 38da429 • 1 Parent(s): a192555

Create README.md

Files changed (1)

README.md +55 -0

README.md ADDED

	@@ -0,0 +1,55 @@

+---
+license: cc-by-nc-4.0
+datasets:
+- pankajmathur/orca_mini_v1_dataset
+- openai/summarize_from_feedback
+- PygmalionAI/PIPPA
+- chargoddard/rpguild
+- lemonilia/LimaRP
+- PKU-Alignment/PKU-SafeRLHF
+- Intel/orca_dpo_pairs
+- argilla/ultrafeedback-binarized-preferences
+---
+Another experiment in the line of [loyal-piano-m7](https://huggingface.co/chargoddard/loyal-piano-m7).
+Steps taken to produce this model:
+* Train loyal-piano-m7
+* cDPO with HuggingFaceH4/ultrafeedback_binarized to produce loyal-piano-m7-cdpo
+* Train another model on a different sampling of the same source datasets as loyal-piano; let's call it servile-harpsichord
+* cDPO servile-harpsichord with argilla/ultrafeedback-binarized-preferences, Intel/orca_dpo_pairs, and a helpfulness-only version of PKU-Alignment/PKU-SafeRLHF
+* TIES merge several checkpoints of servile-harpsichord-cdpo with loyal-piano-m7-cdpo
+Local benchmarks show the result to be better than any of the individual components. Let's see if that holds up!
+Trained using the Alpaca prompt format.
+Configuration for final merge:
+```yml
+models:
+  - model: chargoddard/loyal-piano-m7-cdpo
+    parameters:
+      density: 1.0
+      weight: 1.0
+  - model: /home/ubuntu/servile-harpsichord-cdpo/checkpoint-4186
+    parameters:
+      weight: 0.1
+  - model: /home/ubuntu/servile-harpsichord-cdpo/checkpoint-5796
+    parameters:
+      weight: 0.2
+  - model: /home/ubuntu/servile-harpsichord-cdpo/checkpoint-6118
+    parameters:
+      weight: 0.3
+  - model: /home/ubuntu/servile-harpsichord-cdpo/final
+    parameters:
+      weight: 0.4
+merge_method: ties
+base_model: mistralai/Mistral-7B-v0.1
+dtype: bfloat16
+parameters:
+  density: 0.4
+  normalize: true
+  int8_mask: true
+```
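
The README says the model was trained with the Alpaca prompt format but does not show it. As a rough illustration, the sketch below builds prompts from the two widely circulated Stanford Alpaca templates. Whether this model was trained on exactly these strings is an assumption, and `build_alpaca_prompt` is a hypothetical helper name, not something from the repository.

```python
# Standard Alpaca templates (assumed here; the README does not quote them).
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Return an Alpaca-style prompt; hypothetical helper for illustration."""
    if input_text:
        # Use the variant with an extra "### Input:" section.
        return ALPACA_WITH_INPUT.format(instruction=instruction, input=input_text)
    return ALPACA_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the text.", "TIES merges task vectors."))
```

The prompt ends right after `### Response:` so that the model's generation supplies the answer.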
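
To make the `density`, `weight`, and `normalize` settings in the merge configuration more concrete, here is a minimal NumPy sketch of the TIES idea on a single tensor: trim each model's difference from the base by magnitude, elect a sign per parameter, and combine only the entries that agree with it. This is a simplified illustration under stated assumptions, not mergekit's actual implementation; the function names `trim_by_magnitude` and `ties_merge` are hypothetical, and details such as `int8_mask` and `bfloat16` handling are left out.

```python
import numpy as np


def trim_by_magnitude(delta, density):
    """Keep the `density` fraction of entries with the largest |value|; zero the rest."""
    if density >= 1.0:
        return delta.copy()
    k = int(density * delta.size)
    if k == 0:
        return np.zeros_like(delta)
    flat = np.abs(delta).ravel()
    keep = np.argpartition(flat, -k)[-k:]  # indices of the k largest magnitudes
    mask = np.zeros(delta.size, dtype=bool)
    mask[keep] = True
    return np.where(mask.reshape(delta.shape), delta, 0.0)


def ties_merge(base, tuned, weights, densities, normalize=True):
    """Simplified TIES merge of several fine-tuned tensors onto one base tensor."""
    # 1. Task vectors (tuned - base), each trimmed to its own density.
    deltas = np.stack(
        [trim_by_magnitude(t - base, d) for t, d in zip(tuned, densities)]
    )
    w = np.asarray(weights, dtype=base.dtype).reshape(-1, *([1] * base.ndim))
    weighted = deltas * w
    # 2. Elect a sign per parameter from the weighted sum of deltas.
    elected = np.sign(weighted.sum(axis=0))
    # 3. Keep only nonzero entries whose sign matches the elected sign.
    agree = (np.sign(weighted) == elected) & (weighted != 0)
    merged = (weighted * agree).sum(axis=0)
    if normalize:
        # Divide by the total weight of the models that contributed.
        divisor = (w * agree).sum(axis=0)
        divisor[np.abs(divisor) < 1e-8] = 1.0
        merged = merged / divisor
    return base + merged


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(4, 4))
    tuned = [base + rng.normal(scale=0.1, size=(4, 4)) for _ in range(5)]
    # Mirrors the shape of the config: one model at density 1.0 / weight 1.0,
    # four checkpoints at the global density 0.4 with weights 0.1 to 0.4.
    out = ties_merge(base, tuned, [1.0, 0.1, 0.2, 0.3, 0.4], [1.0, 0.4, 0.4, 0.4, 0.4])
    print(out.shape)
```

Read this way, the configuration keeps loyal-piano-m7-cdpo's full delta at full weight, while each servile-harpsichord-cdpo checkpoint contributes only its largest 40% of changes, with later checkpoints weighted more heavily.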