Sao10K's picture
Update README.md
243c5ca verified
metadata
language:
  - en
license: cc-by-nc-4.0

Trained with compute from Backyard.ai | Thanks to them and @dynafire for helping me out.

Trained on 2x A100 SXM 40GB as an 8-bit LoRA.


This is a group-chat based roleplaying model, based off of 12B-Lyra-v4a2, a variant of Lyra-v4 that is currently private.

It is trained on an entirely human-based dataset, based on forum / internet group roleplaying styles. The only augmentation done with LLMs is to the character sheets, to fit to the system prompt, to fit various character sheets within context.

This model is still capable of 1 on 1 roleplay, though I recommend using ChatML when doing that instead.


party


Formatting:

Training for the multi-character roleplaying format is done with a variant of ChatML, replaced with [INST] blocks formatted as such. Use this to draw in more of the training done.

[INST]system
System Prompt Here[/INST]
[INST]user
User's Yapping[/INST]
[INST]model
Model Reply[/INST]

Relevant!
- Turns do not need to respect user -> model -> user. Training is done with disjointed turns that may have repeating turns to simulate real group roleplay / chat scenarios with multiple users.
- Additional work may be required to fit for your front-end.
- Ideally character cards are all included in the turns. Training is done with this in mind. Below on the page has relevant information.
- This is a Nemo model, so lower Temperature and a sprinkling of min_p helps.
- This does require a lot of tinkering to fit within SillyTavern / other frontends.

To get better performance on Regular 1 on 1 Roleplay or Chat scenarios, use ChatML to get more of Lyra's performance.

<|im_start|>system
System Prompt Here.<|im_end|>
<|im_start|>user
User's Instructions<|im_end|>
<|im_start|>assistant
Model Response<|im_end|>

For best results, set both <|im_end|> and [INST] as stopping strings. Recommended Temperature is <1 , min_p of ateast 0.1


Dataset Information:

This dataset is made from a human RP forum source, trimmed down, augmented and reformatted to fit.
- Each entry has a minimum of 6 turns to be inside
- Number of unique/main characters are ranged from 2 to 7 characters per entry.
- Each conversation is kept as is to preserve quality and uniqueness of the human data.
- Only the added system prompt makes use of the current character sheets given.

The following below is how the current Character Card / Sheets is done, which are augmented from the messy and non-uniform character sheets available. To get best results, please reformat your current character data to the on as seen below, or as similar as you can if possible.

- **Character Name**: 
- **Age**:
- **Race**:
- **Mageblood Type**: (if applicable)
- **Favored Magic Class**: (if applicable)
- **Previous Magic Training**: (if applicable)
- **Occupation/Profession**: (if applicable)
- **Appearance**: (if applicable)
- **Biography**: (if applicable)
- **Good Attributes**: (if applicable)
- **Bad Attributes**: (if applicable)
- **Equipment**: (if applicable)
- **Other Information**: (if applicable)

Here is an example based on the above format:

**Character Name**: Keri Wolf  
**Age**: 21  
**Race**: Vampire  
**Mageblood Type**: Hydromancy  
**Favored Magic Class**: Aqua  
**Previous Magic Training**: Novice  
**Occupation/Profession**: None specified  

**Appearance**:  
- Height: 5'9"  
- A wooden wolf necklace around her neck, contrasting with her pale skin  
- Three swords strapped to her waist  
- A tattoo of a thorn vine, her family crest, on her right arm  
- Normal eye color is red but changes based on her mood or the topic of conversation  
- Carries a hunk of wood and a carving knife for personal activities  

**Biography**:  
Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.

**Good Attributes**:  
- Easy-going  
- Observant  
- Helps those in trouble  
- Soft-hearted  
- Kind  
- Cool-headed  
- Good at getting out of difficult situations  
- Avoids violence  
- Gets along well with different people  
- Loves animals  

**Bad Attributes**:  
- Sunlight sensitivity  
- Hatred towards vampires outside the college  
- Keeps feelings in check, leading to dangerous outbursts  
- Cruel manner of speaking  
- Thirst for revenge  

**Equipment**:  
- Wooden wolf necklace  
- Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)  
- Carving knife  
- Hunk of wood  
- Stealth Ring  
- Knight's Shield  

**Other Information**:  
- Secret word: rebirth

The following system prompt is augmented from available character sheets, or details from the original dataset. Placeholder names are given as shown.

You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.

[Human User]  
Ballbuster Steve # {user}
Character Bio: [Steve's bio]

[Involved Characters] 
Altair "Arty" Enzo # {char1}
Character Bio: [Arty's bio]
--- 
Sukuna Gojo # {char2}
Character Bio: [Sukuna's bio]
---

The roleplay begins now.

This is how some of the turn example looks like, newlines are only for visual use.

[INST]user
Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]

[INST]model
Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]

[INST]model
Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]

[INST]user
Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]

To make it easier, this is how I'd format responses for the backend:

<s>[INST]system
{system_prompt}[/INST]
[INST]user
{user}: {text}[/INST]
[INST]model
{char1}: {text}[/INST]
[INST]model
{char2}: {text}[/INST]
[INST]user
{user}: {text}[/INST]
[INST]model
{char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.

Current Issues:

- Impersonation - This is a common side-effect of pure human roleplaying data, unfortunately.
    Users do like writing the actions of others, though this is more limited to end of reply.
- Varied Output Quality - A swipe should be enough?
    I only removed obviously bad entries. Output quality varies thanks to the variety of human users involved.
- Character Detail Confusion when in group chats
    This rarely happens, but it is usually when there are too many main characters, or the bio is improperly formatted and seperated.
    Or if you're using an additional, complex system prompt.
- Random OOC / Story Break moments may still exist despite me filtering the data.
- Limited Dataset Size -> 4K Varied Samples ranging from 2-7 characters per entry. I'm looking to expand.
- Limited System Prompt? -> I'm trying to improve on this.
- Fantasy-bias? -> Most of the entries are fantasy-based after all.

Training Metrics

n_sample: 4000
n_gpu: 2
global batch size: 12
lora: bnb_8bit
no. epochs: 3
lr: 0.000004
lr_scheduler: cosine
deepspeed: zero2