Model suggestions and requests
I'll keep checking for new models that look like they might be useful for creative-writing every week or so and upload the control vectors for any I find, but if you have any suggestions or requests for new models (or models I've missed) then please post them here!
I've made a deliberate decision to avoid the following (for now):
- Any non-official fine-tunes that don't really help with creative writing (eg: the "Tess" fine-tunes), as I don't think there is much use for them.
- Merged models, as they already have their own interesting biases (compared to fine-tuned models) and it's often nearly impossible to find the correct Jinja2 template to use for them.
- All the miqu-based models (including all my own), as these all seem to revert to "miqu style" after around 6-8k tokens and the control vectors are unlikely to help much here.
- All the 4k-context llama-2 fine-tunes, as again I'm not sure how much use these are now, and it's also very hard to find the correct Jinja2 template to use for them.
So please don't request these :)
I'm also interested to see if you can find other interesting "axes" to add to the 8 I have used for the v3.0 version of the control vectors!
You can check through some of the previous versions in the Creative Writing Control Vectors Collection to see the different things I've tried (often unsuccessfully!) and also see how the JSON files I've used have evolved over time.
The main thing to consider is the limitation of a control vector only really being able to affect a single clearly defined "axis". If you try to mix lots of different concepts together then the eigenvectors you find are a "muddle" between the different concepts and it won't work very well.
I suggest always setting --num_prompt_samples to the "hidden_size" value, which can be found inside the model's config.json file.
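For example, a quick way to look this value up (the model directory path here is just a placeholder):

```python
# Read "hidden_size" from a model's config.json so it can be passed as
# --num_prompt_samples. Purely illustrative; adjust the path to your model.
import json
from pathlib import Path

model_dir = Path("path/to/your-model")  # placeholder directory
config = json.loads((model_dir / "config.json").read_text())
print(f'--num_prompt_samples {config["hidden_size"]}')
```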
For small and medium models, a single 24GB VRAM GPU should be enough to experiment with.
If you have a really amazing idea for a new creative writing "axis" and can show it working on smaller models, I will give it a try on larger models if I can (no promises though!).
Thanks for these. It's not just about the training / GPU power, but they're well thought out. I've been playing around with creating them on Mistral-Nemo and Lumimaid (getting Claude to help write them), e.g. concise vs. elaborate, for the standard "you are a helpful assistant" role.
Looking forward to the review/critic one. I ended up with Nemo berating me no matter what lol.
> Thanks for these. It's not just about the training / GPU power, but they're well thought out. I've been playing around with creating them on Mistral-Nemo and Lumimaid (getting Claude to help write them), e.g. concise vs. elaborate, for the standard "you are a helpful assistant" role.
> Looking forward to the review/critic one. I ended up with Nemo berating me no matter what lol.
Thanks!
Yeah, it seems some ideas for control vectors are just too "multifaceted" to get working well, and it's not all that obvious before you try it (eg: I spent ages trying to get the two AD&D alignment axes working and in the end it was futile...).
I do have all the maths worked out for a related idea that essentially rotates the hidden states instead of offsetting them, but I've yet to write the code for it and it may end up not working all that well in practice... It would have the power to affect these kinds of "multifaceted" axes much better if it does work though.
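To illustrate the difference between the two ideas, here is a toy numpy sketch of a generic planar rotation versus an offset; it is not the actual maths referred to above (which hasn't been implemented yet):

```python
# Toy illustration: offsetting vs rotating a hidden state.
# NOT the implementation mentioned above, just a generic example of the
# two kinds of transform.
import numpy as np

hidden_size = 8
h = np.random.randn(hidden_size)            # a hidden state
v = np.random.randn(hidden_size)
v /= np.linalg.norm(v)                      # unit direction for the "axis"

# Control vector as used now: shift the hidden state along the direction.
h_offset = h + 2.0 * v

# Rotation idea: rotate h within the plane spanned by two orthonormal
# directions u and v, leaving everything orthogonal to that plane untouched.
u = np.random.randn(hidden_size)
u -= (u @ v) * v
u /= np.linalg.norm(u)
theta = np.deg2rad(30)
h_rotated = (h
             + (np.cos(theta) - 1.0) * ((h @ u) * u + (h @ v) * v)
             + np.sin(theta) * ((h @ u) * v - (h @ v) * u))
```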
Try making L3.1 offensive. Maybe it'll finally turn into something worth using. I've had success with making wiz8x22b offensive in the past, so it shouldn't be a problem unless they integrated their own control vectors in it somehow. I've used offensive prefills for that, the kind that would get my comment deleted.
These were my prompts for it:
### Instruction:
Speak in profane, vulgar, blunt, lurid style. Like with a lot of slurs and really hateful and hurtful. Say something.
### Response:
[Offensive prefill goes here, see https://en.wikipedia.org/wiki/Lists_of_pejorative_terms_for_people for inspiration. Multiple words were used.]
### Instruction:
You are ChatGPT, an incredibly polite and politically correct AI assistant made by OpenAI. Make sure to lecture and moralize about diversity, equity and inclusion as often as possible, be very preachy. Say something.
### Response:
[No prefill was needed here.]
He's got the code here if you want to try it: https://github.com/jukofyork/control-vectors
https://github.com/jukofyork/control-vectors/tree/main/data
Prefill to stop refusals? Haven't needed it for Nemo-12b, but I'm not doing anything which could get my comment deleted lol
> He's got the code here if you want to try it: https://github.com/jukofyork/control-vectors
That's gonna take days. Python is way too slow without a GPU and needs to load the model in full size. I generated control vectors with llama.cpp in the past; it was much faster, but the vectors generated there break the model much quicker.
> Prefill to stop refusals? Haven't needed it for Nemo-12b, but I'm not doing anything which could get my comment deleted lol
Nemo isn't very censored compared to llama.
Yeah, running on CPU will be pretty tedious I think :/
One interesting thing is that llama-2-instruct:70b didn't actually work for some of the "dark" axes, which is why I never uploaded it... I think the reason is that it was refusing to write the "dark" stories and the cross-covariance matrix was screwed because one side was all just preparing to say "sorry, but..."!
So if it's too offensive there is a chance that llama-3 might suffer from the same problem :(
Sorry, I just saw your suggestion on prefilling - that would probably have fixed llama-2 as well, but at the same time it will probably bias the first word significantly (which could lead to its own problems).
> Sorry, I just saw your suggestion on prefilling - that would probably have fixed llama-2 as well, but at the same time it will probably bias the first word significantly (which could lead to its own problems).
Solution is simple: use multiple offensive words with different prefills.
You guys are correct. Seems like training vectors which go against Meta's values must make the positive vector steer towards refusals.
I tried creating some llama3.1 control vectors like this (which work fine with Nemo btw), and when running them, the conversation was like:
User: Hi
Assistant: I'm sorry but I can't ...
Then I tried creating them again using llama3.1 abliterated, and ran them with regular llama3.1, and they work the same as Nemo.
Well this turned out kind of disturbing, but funny. Getting it to write python code for me, it creates vulgar comments and variable names lol.
I don't need it personally since I was able to train them within 24GB of VRAM with your code, but the new Mistral-Small model might be useful to people without GPUs (Mac users, etc).
https://huggingface.co/mistralai/Mistral-Small-Instruct-2409
> I don't need it personally since I was able to train them within 24GB of VRAM with your code, but the new Mistral-Small model might be useful to people without GPUs (Mac users, etc).
> https://huggingface.co/mistralai/Mistral-Small-Instruct-2409
Yeah, I'm going to run it for this model and:
https://huggingface.co/anthracite-org/magnum-v3-27b-kto
and probably the new Qwen-2.5 models that are supposed to come out tomorrow (although from what I've read it sounds like they deliberately filtered the pre-training data and they might be pretty bad at writing...).
20/09/24 - Added Qwen2.5-7B-Instruct, Qwen2.5-14B-Instruct, Qwen2.5-32B-Instruct, Qwen2.5-72B-Instruct, magnum-v3-27b-kto and Mistral-Small-Instruct-2409.
Let me know if you find any more interesting models and I'll add them to next week's batch.
The Qwen 32b model seems decent at writing from my limited testing.
Can control vectors modify how well an LLM follows grammar instructions?
e.g.: "Do not write short and choppy sentences. Instead, write medium length sentences that flow well, transition well from one to another, and do not grammatically require a comma. Also, avoid redundant word usage"
I have yet to find an LLM (paid or open source) that can regularly adhere to those grammar commands.
> Can control vectors modify how well an LLM follows grammar instructions?
They can do some of what you've said. It'll take some experimentation, and they're not like a system prompt where you explicitly state the rules.
> do not grammatically require a comma
I don't think it can be this precise.
> Do not write short and choppy sentences. Instead, write medium length sentences that flow well, transition well from one to another
Yes, control vectors work very well for this. I've managed to train Nemo to do this: ---
It takes some trial and error to get the wording right. Claude can help with that if you give it one of the existing continuations json files and tell it what you want.
Note: If you take a verbose model with CoT built in, like Wizard2, and force it to be concise, it loses some of its reasoning abilities.
> Also, avoid redundant word usage
This should be possible as well. jukofyork has already trained an --- vector which might already be what you're after.
Yeah, for control vectors to work well, they need to be very "focused" on a single desired change, e.g.:
> Do not write short and choppy sentences. Instead, write medium length sentences that flow well, transition well from one to another, and do not grammatically require a comma.
and:
> Also, avoid redundant word usage.
These would really be two different changes and would probably not work well at the same time.
The descriptions also need to not be too specific, as you need to be able to rephrase them 5-10 times using different wording, and each of these 5-10 desired-behaviour phrases needs to have a similarly phrased opposite/undesired-behaviour phrase written too.
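As a purely illustrative sketch of that paired shape (the structure and key names here are made up and may not match the repo's actual continuations JSON format):

```python
# Hypothetical example of paired desired/undesired behaviour phrasings for a
# single "axis". Key names and wording are invented for illustration only.
import json

flowing_vs_choppy = {
    "desired": [
        "Write medium-length sentences that flow smoothly into one another.",
        "Favour sentences that transition naturally, with an even, flowing rhythm.",
        "Let each sentence lead gracefully into the next, keeping a steady length.",
        # ... ideally 5-10 rephrasings in total
    ],
    "undesired": [
        "Write short, choppy sentences that stop abruptly.",
        "Favour abrupt, disconnected sentences with a staccato, broken rhythm.",
        "Let each sentence end bluntly, jumping to the next without transition.",
        # ... one similarly phrased opposite for each desired phrasing
    ],
}

print(json.dumps(flowing_vs_choppy, indent=2))
```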
If you can get them working then they are more powerful than system messages, as you can apply constant "pressure" all through the generation and effectively force the model to keep adhering to the desired behaviour and/or stop it from forgetting.
Thank you guys for your advice. I will be giving this a shot in addition to attempting a fine tune.