---
license: cc-by-nc-4.0
tags:
  - not-for-all-audiences
  - nsfw
---

# Borealis


Borealis-10.7B-DPO is a 10.7B-parameter model built from 48 Mistral 7B layers, depth-upscaled in the style of SOLAR. It was finetuned for over 70 hours on 2x A6000 GPUs on a large RP and conversational dataset, using Axolotl with the llama2 configuration.
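
The exact upscaling recipe isn't published here. As a hedged illustration only, a SOLAR-style 48-layer stack is typically built by merging two overlapping slices of the 32-layer base model; a hypothetical mergekit passthrough config for that kind of upscale (the slice ranges below mirror SOLAR's 0-24 / 8-32 split and are an assumption, not the actual recipe) might look like:

```yaml
# Hypothetical SOLAR-style depth upscale: two overlapping 24-layer slices
# of Mistral 7B (32 layers) stacked into a single 48-layer model.
slices:
  - sources:
      - model: mistralai/Mistral-7B-v0.1
        layer_range: [0, 24]
  - sources:
      - model: mistralai/Mistral-7B-v0.1
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
```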

This variant has an additional DPO training stage on top of the base finetune.

## Description

This repo contains fp16 files of Borealis-10.7B-DPO, a conversational model.

The goal of this model isn't to top every benchmark, but to be a better RP/ERP/conversational model.

It was trained on several general-purpose datasets to keep it capable, but the majority of the data consisted of basic conversations.

## Dataset used

- NobodyExistsOnTheInternet/ToxicQAFinal
- teknium/openhermes
- unalignment/spicy-3.1
- Doctor-Shotgun/no-robots-sharegpt
- Undi95/toxic-dpo-v0.1-sharegpt
- Aesir [1], [2], [3-SFW], [3-NSFW]
- lemonilia/LimaRP
- Squish42/bluemoon-fandom-1-1-rp-cleaned
- Undi95/ConversationChronicles-sharegpt-SHARDED (2 sets, modified)

## DPO Dataset used

- Intel/orca_dpo_pairs
- NobodyExistsOnTheInternet/ToxicDPOqa
- Undi95/toxic-dpo-v0.1-NoWarning

## Prompt format: NsChatml

```
<|im_system|>
{sysprompt}<|im_end|>
<|im_user|>
{input}<|im_end|>
<|im_bot|>
{output}<|im_end|>
```
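
To sketch how the template above is assembled in practice, here is a small hypothetical helper (the function name and signature are our own, not part of the model or any library) that builds an NsChatml prompt string, leaving the last bot turn open so the model completes it:

```python
def format_nschatml(sysprompt, turns):
    """Build an NsChatml prompt from a system prompt and (user, bot) turns.

    Pass None as the bot reply in the final turn to leave it open,
    so the model generates the response after `<|im_bot|>`.
    """
    parts = [f"<|im_system|>\n{sysprompt}<|im_end|>"]
    for user, bot in turns:
        parts.append(f"<|im_user|>\n{user}<|im_end|>")
        if bot is None:
            # Open bot turn: generation continues from here.
            parts.append("<|im_bot|>\n")
        else:
            parts.append(f"<|im_bot|>\n{bot}<|im_end|>")
    return "\n".join(parts)

prompt = format_nschatml("You are a helpful assistant.", [("Hello!", None)])
```

The resulting string can be fed directly to the tokenizer; sampling then continues from the open `<|im_bot|>` tag.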

## Others

If you want to support me, you can here.