Edit model card

This is an interim update (v0.5) with fixes for the alpha release, but not yet v1.0.

Changes from Alpha:

  • Greatly minimizes "chatGPTisms". No more feeling empowered by the shared bonds of friendship with renewed determination for challenges to come.
  • Increased diversity of NSFW prose.

Examples

Examples are generated with the default Mirostat setting in Oobabooga, with Mirostat tau in the 1.5-2 range. Most are first-time generations, but I had to regenerate some responses a couple of times. These examples are NOT NSFW, and the response text was not modified.

  • Multi-Round Story Writing: Sci-Fi Story
  • Oneshot Story-writing: Crime Story Generating >2K tokens of meaningful content in a single output response (without multi-round) is challenging. This took a few tries. Smoke and mirrors.
  • Multi-Round Story Planning/Brainstorming: Adventure Story Brainstorming
  • Document Q&A and Summarization: Lorebook Q&A (22K tokens)
  • Roleplaying (RP): RP example
  • Interactive World Exploration: Explore a fantasy world Obviously these models don't plan. But it's an interesting way to interact and explore any world, one room/scene at a time. You can come up with whatever rules or genre you want for this type of exploration.

Details (same as alpha)

  • Base model: llama2_70b_longlora_fp16_32k_ROPE8 (no base instruction tuning)
  • Fine-tuned with Llama-2 chat format
  • System prompt: An interaction between a user providing instructions, and an imaginative assistant providing responses.
    • Use the included Aurelian.yaml for Oobabooga (place in the instruction-templates folder).
  • 32K context length, use Linear Rope Scaling = 8 (IMPORTANT: use a factor of 8 even if you are not using the full 32K context length)
  • Intended to be used in instruct mode (rather than notebook mode/completions).
  • This model is not censored, and is capable of producing offensive and NSFW content. Please use this model with caution, and do not use if you are offended by such content.

Tips

  • Treat the first prompt like you normally would the system prompt.
    • System prompt itself does not change.
    • Describe what you want the AI to do in detail, even if you feel it is obvious.
  • Bias the length of the output with your prompt. This is no guarantee though.
    • Egs., Statements like Make this a long response would bias the response longer (easily produces 2000+ tokens per response).
    • Statements like Respond briefly would bias it shorter.
  • Explain clearly if you want the content to be SFW or NSFW in the first prompt as well. However, there are no guarantees that the model won't generate NSFW content.

Available Quantizations

  • bfloat16
  • EXL2 2.4bit fits in 1x24GB using Exllamav2 & 8-bit cache @ 10K context
  • EXL2 4bit fits in 2x24GB (19/24) using Exllamav2 @ 16K context
  • EXL2 6bit fits in 48GB+24GB (36/24 split) or 3x24GB (16/17/20 split) using Exllamav2 @ 32k context
  • All GGUFs

Training Data

85% of the training data was human generated output with synthetic input. 15% was from GPT4.

License

Unsure. It uses some datasets which were generated using GPT-4 outputs, so openAI's terms may apply. I personally have no objection about this model being used for any commercial or non-commercial purpose, but please respect the license agreements of Meta, OpenAI or other parties involved.

Downloads last month
30
Safetensors
Model size
69B params
Tensor type
FP16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for grimulkan/aurelian-v0.5-70b-rope8-32K-fp16

Merges
1 model
Quantizations
2 models