Update README.md

**NOTE**: See [creative-writing-control-vectors-v2.1](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1) for the current main control-vector repo.

- *08/08/24 - Added `WizardLM-2-8x22B`, `c4ai-command-r-v01` and `gemma-2-27b-it`.*
- *09/08/24 - Added `miqu-1-70b`.*

## Details

The control vectors in this repository were created experimentally by quadrupling the triplets in `system_messages_outlook_extended.json` (click to expand):

<details> <summary>"Outlook (extended)" ('positive' <---> 'negative')</summary>
</details>

Consequently, each model's cross-covariance matrix is now derived from `120,000` hidden-state samples; for the largest models (with a hidden dimension of `12,288`), that is nearly 10 samples per hidden dimension.
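
To make the estimate concrete, here is a minimal NumPy sketch of deriving a cross-covariance matrix (and one candidate direction) from paired hidden-state samples. It is purely illustrative: the random stand-in data, the centring scheme, and the SVD step are my assumptions, not this repo's actual pipeline.

```python
import numpy as np

# Toy scale for the sketch; the repo's estimate uses ~120,000 samples
# against hidden dimensions of up to 12,288.
n_samples, hidden_dim = 2_000, 64

# Stand-ins for hidden states captured under 'positive' vs 'negative'
# system messages (one row per sample). Real data comes from the model.
pos = np.random.randn(n_samples, hidden_dim)
neg = np.random.randn(n_samples, hidden_dim)

# Centre each side, then estimate the cross-covariance between them.
pos_c = pos - pos.mean(axis=0)
neg_c = neg - neg.mean(axis=0)
cross_cov = (pos_c.T @ neg_c) / (n_samples - 1)  # (hidden_dim, hidden_dim)

# One common way to turn this into a control direction: take the
# dominant left singular vector of the cross-covariance estimate.
u, s, vt = np.linalg.svd(cross_cov)
direction = u[:, 0]  # unit-norm candidate direction in hidden space
```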

## Regularisation

I've included three sets of control vectors trained using different `--regularisation_factor` values:

- [regularisation_factor = 1.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%201.0)
- [regularisation_factor = 0.5](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%200.5)
- [regularisation_factor = 0.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%200.0)

Use the largest `regularisation_factor` that achieves the desired effect. This minimizes the risk of damaging the model's outputs:

- `WizardLM-2-8x22B` and `miqu-1-70b` likely need `regularisation_factor = 0.5` or even `regularisation_factor = 0.0`.
- `Mistral-Large-Instruct-2407` may need `regularisation_factor = 0.5`.
- `c4ai-command-r-plus`, `c4ai-command-r-v01`, and `gemma-2-27b-it` usually work best with the default `regularisation_factor = 1.0`.
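
Whichever regularisation variant you pick, applying a control vector at inference amounts to adding the per-layer direction, times a signed strength, to the hidden state. Below is a minimal PyTorch-style sketch under stated assumptions: `directions` is a hypothetical mapping of layer index to a 1-D tensor, and the `model.model.layers` layout matches LLaMA-style Hugging Face models. It is not this repo's loader or llama.cpp's implementation.

```python
import torch

def attach_control_vector(model, directions, scale=1.0):
    """Add `scale * directions[layer_idx]` to each layer's hidden-state
    output via forward hooks. `directions` is a hypothetical mapping of
    layer index -> 1-D tensor of size hidden_dim."""
    handles = []
    for idx, layer in enumerate(model.model.layers):  # LLaMA-style layout
        if idx not in directions:
            continue
        vec = directions[idx]

        def hook(module, args, output, vec=vec):
            # Decoder layers may return a tuple whose first element is
            # the hidden state; shift it along the control direction.
            hidden = output[0] if isinstance(output, tuple) else output
            hidden = hidden + scale * vec.to(hidden.device, hidden.dtype)
            if isinstance(output, tuple):
                return (hidden,) + output[1:]
            return hidden

        handles.append(layer.register_forward_hook(hook))
    return handles  # call .remove() on each handle to detach
```

A negative `scale` pushes the model toward the opposite pole of the axis; llama.cpp exposes the same idea natively via its `--control-vector` and `--control-vector-scaled` options.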

## Prompting Format for 'Mistral-based' Models

Testing has shown that `Mistral-Large-Instruct-2407`, `WizardLM-2-8x22B`, and `miqu-1-70b` perform better for creative writing using the following multi-line 'Vicuna' prompt template:

```
USER: {prompt}
ASSISTANT:
```

To train these control vectors, I modified the 'Jinja2' `chat_template` in `tokenizer_config.json` for `Mistral-Large-Instruct-2407`, `WizardLM-2-8x22B` and `miqu-1-70b` to the following:

```json
{
}
```
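
The template body itself is elided in the diff above. Purely to illustrate the idea, here is a hypothetical Vicuna-style 'Jinja2' template (not necessarily the exact one used) rendered from Python, producing the prompt format shown earlier:

```python
from jinja2 import Template

# A hypothetical Vicuna-style chat template, shown only to illustrate
# the target format; the exact JSON above is elided in this diff.
chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "USER: {{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}"
    "ASSISTANT: {{ message['content'] }}\n"
    "{% endif %}"
    "{% endfor %}"
    "ASSISTANT:"
)

messages = [{"role": "user", "content": "Write the opening of a ghost story."}]
print(Template(chat_template).render(messages=messages))
# USER: Write the opening of a ghost story.
# ASSISTANT:
```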

**NOTE**: I still used the default prompt templates for the other 3 models (`c4ai-command-r-plus`, `c4ai-command-r-v01` and `gemma-2-27b-it`).