metadata
license: other
license_name: jamba-open-model-license
license_link: https://www.ai21.com/licenses/jamba-open-model-license
language:
- en
- fr
- de
- nl
- es
- pt
- it
- ar
- he
pipeline_tag: text-generation
tags:
- mamba
- jamba
- moe
library_name: transformers
Spellbound Jamba Mini: Creative output over long contexts
Main Goals
The main goals of the base model choice and post-training regime are:
- Strong steerability
- Coherence over long context lengths
- Flexible writing styles
- Advanced formatting that allows identifying individual speakers
There was also a secondary training objective: to teach the model to understand and produce directives in XML tags.
Input can include the following tags:
- <${characterName}Description>: A definition of a character, written as a markdown list of details. For example:
  - Name: Character Name
  - Personality: Character Personality
  - Speaker ID: 32AN4R (see the <quote> tag below)
  - ...
- <writingInstructions>: A block of markdown-formatted instructions describing what should happen in the story.
- <pastStory>: A block containing the events preceding the story being written.
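As a rough illustration of how a frontend might assemble these input tags, here is a minimal Python sketch. The helper names (`character_block`, `build_prompt`), the use of explicit closing tags, the order of the blocks, and the sample character are all assumptions made for this example; the exact prompt layout the model was trained on is not specified here.

```python
def character_block(name: str, details: dict[str, str]) -> str:
    """Wrap a markdown list of details in a <{name}Description> tag.

    Assumption: the character name is substituted into the tag name as-is
    (no spaces) and the block is closed with a matching closing tag.
    """
    tag = f"{name}Description"
    lines = "\n".join(f"- {key}: {value}" for key, value in details.items())
    return f"<{tag}>\n{lines}\n</{tag}>"


def build_prompt(
    characters: dict[str, dict[str, str]],
    instructions: str,
    past_story: str,
) -> str:
    """Join character, instruction, and past-story blocks (hypothetical order)."""
    blocks = [character_block(name, details) for name, details in characters.items()]
    blocks.append(f"<writingInstructions>\n{instructions}\n</writingInstructions>")
    blocks.append(f"<pastStory>\n{past_story}\n</pastStory>")
    return "\n\n".join(blocks)


# Hypothetical example data, reusing the speaker ID format shown above.
prompt = build_prompt(
    {"Mira": {"Name": "Mira", "Personality": "Wry and cautious", "Speaker ID": "32AN4R"}},
    "- Mira finds the hidden door.",
    "Mira had been searching the library all night.",
)
print(prompt)
```

Keeping each tag in its own small builder makes it easy to adjust the layout if the model responds better to a different ordering.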
Output can optionally include the following tags:
- <quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model will output that character's speech surrounded by <quote speaker="{speakerId}"> and </quote>. The model learns to keep speech in character this way, and it allows for identifying different speakers for rendering and text-to-speech purposes.
- <action>: Represents an action taken by a character.
- <sound>: Represents a sound made in the story.
Instructing the model to produce these tags is optional, but the model should produce its best possible output if the frontend in use can parse or ignore them.
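For frontends that want to parse or ignore the output tags, a minimal Python sketch of one possible approach follows. It assumes the tags are never nested, that `<action>` and `<sound>` are closed with `</action>` and `</sound>` (only `</quote>` is stated explicitly above), and that the speaker attribute is always double-quoted. The names `TAG_PATTERN`, `parse_segments`, and `strip_tags`, and the sample text, are hypothetical.

```python
import re

# Matches the three output tags; everything outside them is treated as narration.
TAG_PATTERN = re.compile(
    r'<quote speaker="(?P<speaker>[^"]*)">(?P<quote>.*?)</quote>'
    r"|<action>(?P<action>.*?)</action>"
    r"|<sound>(?P<sound>.*?)</sound>",
    re.DOTALL,
)


def parse_segments(text: str) -> list[dict[str, str]]:
    """Split model output into narration, quote, action, and sound segments."""
    segments: list[dict[str, str]] = []
    pos = 0
    for match in TAG_PATTERN.finditer(text):
        if match.start() > pos:
            segments.append({"type": "narration", "text": text[pos:match.start()]})
        for kind in ("quote", "action", "sound"):
            if match.group(kind) is not None:
                segment = {"type": kind, "text": match.group(kind)}
                if kind == "quote":
                    # Speaker ID lets a renderer or TTS engine pick a voice.
                    segment["speaker"] = match.group("speaker")
                segments.append(segment)
                break
        pos = match.end()
    if pos < len(text):
        segments.append({"type": "narration", "text": text[pos:]})
    return segments


def strip_tags(text: str) -> str:
    """Drop the tags but keep their inner text, for frontends that ignore them."""
    return "".join(segment["text"] for segment in parse_segments(text))


sample = (
    'The door creaked. <sound>Creeeak</sound> '
    '<quote speaker="32AN4R">Who is there?</quote> '
    "<action>Mira raises her lantern.</action>"
)
segments = parse_segments(sample)
print(strip_tags(sample))
```

A text-to-speech pipeline could route each `quote` segment to a voice keyed by its `speaker` value, while a plain-text renderer could call `strip_tags` and discard the markup.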
Post-training Details
Post-training consists of 1 epoch of SFT LoRA training:
- Trained on synthetic instructions for strong steerability
- Outputs rated by tryspellbound.com beta users who opted in
- LoRA Rank: 8
- Batch Size: 2
- Learning Rate: 1e-5
Model Creator
Made by tryspellbound.com.