datasets:
  - anon8231489123/ShareGPT_Vicuna_unfiltered
  - ehartford/wizard_vicuna_70k_unfiltered
  - ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
  - QingyiSi/Alpaca-CoT
  - teknium/GPT4-LLM-Cleaned
  - teknium/GPTeacher-General-Instruct
  - metaeval/ScienceQA_text_only
  - hellaswag
  - openai/summarize_from_feedback
  - riddle_sense
  - gsm8k
  - ewof/code-alpaca-instruct-unfiltered
language:
  - en
library_name: transformers
pipeline_tag: text-generation

Manticore 30B Chat (ALPHA)

  • Alpha release of a checkpoint taken before the train and eval loss spikes. Additionally, the model seems to retain some alignment, which is easily jailbroken.

💵 Donate to OpenAccess AI Collective to help us keep building great tools and models!

Manticore 30B Chat builds on Manticore v1 with new datasets, including a de-duped subset of the Pygmalion dataset. It also removes all Alpaca-style prompts using ### in favor of chat-only prompts using USER: and ASSISTANT:, as well as Pygmalion/Metharme prompting using the <|system|>, <|user|> and <|model|> tokens.
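The two prompt styles described above can be sketched as small helper functions. This is a minimal illustration, not the project's own code: the function names are hypothetical, and the exact separators (newlines between USER:/ASSISTANT: turns, no separator between Metharme tokens) are assumptions that this card does not specify.

```python
from typing import List, Optional, Tuple

# One completed exchange: (user message, assistant reply).
Turn = Tuple[str, str]


def build_chat_prompt(history: List[Turn], user_message: str) -> str:
    """Build a USER:/ASSISTANT: chat prompt.

    Assumption: turns are separated by newlines; the card does not
    state the exact whitespace used during training.
    """
    lines = []
    for user, assistant in history:
        lines.append(f"USER: {user}")
        lines.append(f"ASSISTANT: {assistant}")
    lines.append(f"USER: {user_message}")
    lines.append("ASSISTANT:")  # left open for the model to complete
    return "\n".join(lines)


def build_metharme_prompt(
    system: Optional[str], history: List[Turn], user_message: str
) -> str:
    """Build a Pygmalion/Metharme-style prompt.

    Assumption: the <|system|>, <|user|> and <|model|> tokens are
    concatenated directly with the text, without extra separators.
    """
    parts = []
    if system:
        parts.append(f"<|system|>{system}")
    for user, assistant in history:
        parts.append(f"<|user|>{user}<|model|>{assistant}")
    parts.append(f"<|user|>{user_message}<|model|>")  # model continues here
    return "".join(parts)


if __name__ == "__main__":
    print(build_chat_prompt([], "Tell me a joke."))
    print(build_metharme_prompt("Be brief.", [], "Tell me a joke."))
```

Either string can then be passed to whatever generation backend is in use. If the model's replies look malformed, the separator assumptions above are the first thing to adjust.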

Questions, comments, feedback, looking to donate, or want to help? Reach out on our Discord or email wing@openaccessaicollective.org

Training Datasets

Manticore 30B Chat is a Llama 30B model fine-tuned on the following datasets along with the datasets from the original Manticore 30B.

Manticore 30B Chat was trained on effectively 40% of the datasets below, because it was trained for only 0.4 epochs.

Manticore 30B

Not added from Manticore 13B:

  • mmlu - mmlu datasets were not added to this model as the test split is used for benchmarks

Shoutouts

Special thanks to Nanobit for helping with Axolotl, TheBloke for quantizing these models so they are more accessible to all, ehartford for cleaned datasets, and 0x000011b for the RP dataset.

Demo

Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). Quantized GGML may have some minimal loss of model quality.

Release Notes

Build

Manticore was built with Axolotl on 8xA100 80GB

  • 0.4 epochs taking approximately 14 hours. No further epochs will be released for the alpha.
  • The configuration to duplicate this build is provided in this repo's /config folder.
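For a rough sense of the compute involved, the build figures above (8 GPUs, approximately 14 hours, 0.4 epochs) can be turned into GPU-hours and a projected full-epoch time. This is a back-of-the-envelope sketch: the helper names are hypothetical, and the projection assumes throughput stays constant over the epoch, which the card does not claim.

```python
# Figures taken from the build notes above.
NUM_GPUS = 8          # 8x A100 80GB
WALL_HOURS = 14.0     # approximate wall-clock training time
EPOCHS_DONE = 0.4     # fraction of one epoch actually trained


def gpu_hours(num_gpus: int, wall_hours: float) -> float:
    """Total GPU-hours consumed: GPUs multiplied by wall-clock hours."""
    return num_gpus * wall_hours


def projected_epoch_hours(wall_hours: float, epochs_done: float) -> float:
    """Wall-clock hours for one full epoch, assuming constant throughput."""
    return wall_hours / epochs_done


if __name__ == "__main__":
    print(f"GPU-hours used: {gpu_hours(NUM_GPUS, WALL_HOURS):.0f}")
    print(
        "Projected hours per full epoch: "
        f"{projected_epoch_hours(WALL_HOURS, EPOCHS_DONE):.0f}"
    )
```

Under these assumptions the alpha used about 112 GPU-hours, and a full epoch would take roughly 35 hours on the same hardware.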

Bias, Risks, and Limitations

Manticore has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). Manticore 30B Chat was fine-tuned from the base model LLaMA 30B; please refer to its model card's Limitations section for relevant information.

Examples

TBD