metadata

license: other
license_name: jamba-open-model-license
license_link: https://www.ai21.com/licenses/jamba-open-model-license
language:
  - en
  - fr
  - de
  - nl
  - es
  - pt
  - it
  - ar
  - he
pipeline_tag: text-generation
tags:
  - mamba
  - jamba
  - moe
library_name: transformers

Spellbound Jamba Mini: Creative output over long contexts

Main Goals

The main goals of the base model choice and post-training regime are:

Strong steerability
Coherence over long context lengths
Flexible writing styles
Advanced formatting that allows identifying individual speakers

There was also a secondary training objective: to teach the model to understand and produce directives in XML tags.

<${characterName}Description>: A character definition written as a markdown list of details. For example:
- Name: Character Name
- Personality: Character Personality
- Speaker ID: 32AN4R (see <quote> tag below)
- ...
<writingInstructions>: A block of markdown formatted instructions representing what should happen in the story.
<pastStory>: A block containing the events that precede the story being written.
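
The input tags above can be assembled programmatically. The following is a minimal sketch, not an official API: the helper name `build_prompt`, the use of closing tags, the order of the blocks, and the sample character are all assumptions made for illustration. Only the tag names and the markdown-list format come from the description above.

```python
def build_prompt(characters, writing_instructions, past_story):
    """Assemble tagged input blocks (hypothetical helper, not an official API)."""
    parts = []
    for character_name, details in characters.items():
        # Each character gets its own <{characterName}Description> block
        # holding a markdown list of details.
        bullet_list = "\n".join(f"- {key}: {value}" for key, value in details.items())
        tag = f"{character_name}Description"
        # Assumption: blocks are closed with a matching end tag.
        parts.append(f"<{tag}>\n{bullet_list}\n</{tag}>")
    parts.append(f"<writingInstructions>\n{writing_instructions}\n</writingInstructions>")
    parts.append(f"<pastStory>\n{past_story}\n</pastStory>")
    return "\n\n".join(parts)


# Illustrative data only; the character and story are made up.
prompt = build_prompt(
    characters={
        "mira": {
            "Name": "Mira",
            "Personality": "Wry and cautious",
            "Speaker ID": "32AN4R",
        }
    },
    writing_instructions="- Mira finds the hidden door.",
    past_story="Mira had searched the library all night.",
)
print(prompt)
```

Keeping the speaker ID in the character block is what lets the output `<quote>` tags be matched back to a character later.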

Output can optionally include the following tags:

<quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model wraps that character's speech in <quote speaker="{speakerId}"> and </quote>. The model learns to keep speech in character this way, and the tag lets a frontend identify individual speakers for rendering and text-to-speech.
<action>: Represents an action taken by a character
<sound>: Represents a sound made in the story

Instructing the model to produce these tags is optional, but the model should produce its best output when the frontend in use can parse or ignore them.
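
A frontend could handle the output tags along these lines. This is a rough sketch under stated assumptions: tags are not nested, attributes appear exactly as `speaker="..."`, and untagged text is treated as narration. The names `parse_segments` and `strip_tags` and the sample text are hypothetical.

```python
import re

# Matches <quote speaker="ID">...</quote>, <action>...</action>, <sound>...</sound>.
# Assumption: tags are well formed and never nested.
TAG_PATTERN = re.compile(
    r'<quote speaker="(?P<speaker>[^"]*)">(?P<quote>.*?)</quote>'
    r"|<action>(?P<action>.*?)</action>"
    r"|<sound>(?P<sound>.*?)</sound>",
    re.DOTALL,
)


def parse_segments(text):
    """Split model output into (kind, speaker_id, content) tuples, in order."""
    segments = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        # Untagged text between matches is treated as narration.
        narration = text[position:match.start()].strip()
        if narration:
            segments.append(("narration", None, narration))
        if match.group("quote") is not None:
            segments.append(("quote", match.group("speaker"), match.group("quote").strip()))
        elif match.group("action") is not None:
            segments.append(("action", None, match.group("action").strip()))
        else:
            segments.append(("sound", None, match.group("sound").strip()))
        position = match.end()
    trailing = text[position:].strip()
    if trailing:
        segments.append(("narration", None, trailing))
    return segments


def strip_tags(text):
    """Drop the tags but keep their content, for frontends that cannot render them."""
    return re.sub(r'</?(?:quote|action|sound)(?: speaker="[^"]*")?>', "", text)


# Made-up sample output for illustration.
sample = (
    'The door creaked. <sound>creak</sound> '
    '<quote speaker="32AN4R">Who is there?</quote> '
    '<action>Mira raises her lantern.</action>'
)
for segment in parse_segments(sample):
    print(segment)
print(strip_tags(sample))
```

The `quote` segments carry the speaker ID, so a text-to-speech layer can pick a voice per character, while `strip_tags` covers the case where the frontend simply ignores the markup.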

Post-training Details

Post-training consists of 1 epoch of SFT LoRA training:

Trained on synthetic instructions for strong steerability
Outputs rated by tryspellbound.com beta users who opted in
LoRA Rank: 8
Batch Size: 2
Learning Rate: 1e-5

Model Creator

Made by tryspellbound.com.