metadata

datasets:
  - anon8231489123/ShareGPT_Vicuna_unfiltered
  - ehartford/wizard_vicuna_70k_unfiltered
  - ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
  - QingyiSi/Alpaca-CoT
  - teknium/GPT4-LLM-Cleaned
  - teknium/GPTeacher-General-Instruct
  - metaeval/ScienceQA_text_only
  - hellaswag
  - openai/summarize_from_feedback
  - riddle_sense
  - gsm8k
  - ewof/code-alpaca-instruct-unfiltered
language:
  - en
library_name: transformers
pipeline_tag: text-generation

Manticore 30B Chat (ALPHA)

Alpha release of a checkpoint taken before the train and eval loss spikes. Additionally, the model seems to retain some alignment, which is easily jailbroken.

💵 Donate to OpenAccess AI Collective to help us keep building great tools and models!

Manticore 30B Chat builds on Manticore v1 with new datasets, including a de-duped subset of the Pygmalion dataset. It also removes all Alpaca-style prompts using ### in favor of chat-only prompts using USER: and ASSISTANT:, as well as pygmalion/metharme prompting using <|system|>, <|user|> and <|model|> tokens.
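As a rough illustration of the two prompt styles described above, here is a minimal Python sketch. The helper names `build_chat_prompt` and `build_metharme_prompt` are hypothetical, and the exact whitespace (a newline between chat turns, no separator between metharme tokens) is an assumption rather than something this card specifies.

```python
def build_chat_prompt(turns):
    """Build a USER:/ASSISTANT: style prompt.

    turns is a list of (user_message, assistant_reply) pairs; use None as the
    reply of the final pair to leave the prompt open for the model.
    Assumption: turns are separated by single newlines.
    """
    lines = []
    for user_message, reply in turns:
        lines.append(f"USER: {user_message}")
        lines.append("ASSISTANT:" if reply is None else f"ASSISTANT: {reply}")
    return "\n".join(lines)


def build_metharme_prompt(turns, system=""):
    """Build a pygmalion/metharme style prompt from the same turn structure.

    Assumption: tokens are concatenated directly with the text that follows.
    """
    parts = []
    if system:
        parts.append(f"<|system|>{system}")
    for user_message, reply in turns:
        parts.append(f"<|user|>{user_message}")
        parts.append("<|model|>" + (reply or ""))
    return "".join(parts)


if __name__ == "__main__":
    turns = [("Tell me a riddle.", None)]
    print(build_chat_prompt(turns))
    print(build_metharme_prompt(turns, system="You are a playful bard."))
```

Either string can then be passed to whatever generation backend you use; check the output against real model behavior before relying on the spacing chosen here.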

Questions, comments, feedback, looking to donate, or want to help? Reach out on our Discord or email wing@openaccessaicollective.org

Training Datasets

Manticore 30B Chat is a Llama 30B model fine-tuned on the following datasets along with the datasets from the original Manticore 30B.

Note: Manticore 30B Chat was effectively trained on 40% of the datasets below, because it was trained for only 0.4 epochs.

de-duped pygmalion dataset, filtered down to RP data
riddle_sense - instruct augmented
hellaswag, updated for detailed explanations with 30K+ rows
gsm8k - instruct augmented
ewof/code-alpaca-instruct-unfiltered

Manticore 30B

ShareGPT - based on a cleaned and de-duped subset
WizardLM
Wizard-Vicuna
subset of QingyiSi/Alpaca-CoT for roleplay and CoT
GPT4-LLM-Cleaned
GPTeacher-General-Instruct
ARC-Easy & ARC-Challenge - instruct augmented for detailed responses, derived from the train split
hellaswag - 5K row subset of instruct augmented for concise responses, derived from the train split
metaeval/ScienceQA_text_only - instruct for concise responses
openai/summarize_from_feedback - instruct augmented tl;dr summarization

Not added from Manticore 13B:

mmlu - mmlu datasets were not added to this model as the test split is used for benchmarks

Shoutouts

Special thanks to Nanobit for helping with Axolotl, TheBloke for quantizing these models so they are more accessible to all, ehartford for cleaned datasets, and 0x000011b for the RP dataset.

Demo

Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). Quantized GGML may have some minimal loss of model quality.

https://huggingface.co/spaces/openaccess-ai-collective/manticore-13b-chat-pyg

Release Notes

https://wandb.ai/wing-lian/manticore-13b-v2/runs/ij10c6m3

Build

Manticore was built with Axolotl on 8xA100 80GB

0.4 epochs took approximately 14 hours. No further epochs will be released for the alpha.
The configuration to duplicate this build is provided in this repo's /config folder.
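For a sense of scale, the build figures above can be turned into a back-of-the-envelope compute estimate. This is only a sketch: the function name `training_cost` is hypothetical, the 14 hours is the approximate figure quoted above, and the full-epoch number assumes training time scales linearly with the fraction of an epoch, which this card does not confirm.

```python
def training_cost(num_gpus, wall_clock_hours, epoch_fraction):
    """Return (gpu_hours, estimated_hours_for_one_full_epoch).

    Assumption: wall-clock time scales linearly with the epoch fraction.
    """
    gpu_hours = num_gpus * wall_clock_hours
    est_full_epoch_hours = wall_clock_hours / epoch_fraction
    return gpu_hours, est_full_epoch_hours


if __name__ == "__main__":
    # 8x A100 80GB, roughly 14 hours, 0.4 epochs (figures from this card)
    gpu_hours, full_epoch_hours = training_cost(8, 14.0, 0.4)
    print(f"approx. GPU-hours used: {gpu_hours:.0f}")
    print(f"approx. hours for one full epoch: {full_epoch_hours:.0f}")
```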

Bias, Risks, and Limitations

Manticore has not been aligned to human preferences with techniques like RLHF, nor is it deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). Manticore was fine-tuned from the base model LLaMA 30B; please refer to its model card's Limitations section for relevant information.

Examples

TBD