datasets:
- anon8231489123/ShareGPT_Vicuna_unfiltered
- ehartford/wizard_vicuna_70k_unfiltered
- ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
- QingyiSi/Alpaca-CoT
- teknium/GPT4-LLM-Cleaned
- teknium/GPTeacher-General-Instruct
- metaeval/ScienceQA_text_only
- hellaswag
- openai/summarize_from_feedback
- riddle_sense
- gsm8k
- ewof/code-alpaca-instruct-unfiltered
language:
- en
library_name: transformers
pipeline_tag: text-generation
Manticore 30B Chat (ALPHA)
- Alpha release of a checkpoint taken before the train and eval loss spikes. Additionally, the model seems to show some alignment behavior, which is easily jailbroken.
💵 Donate to OpenAccess AI Collective to help us keep building great tools and models!
Manticore 30B Chat builds on Manticore v1 with new datasets, including a de-duped subset of the Pygmalion dataset. It also removes all Alpaca-style prompts using ### in favor of chat-only style prompts using USER: and ASSISTANT:, as well as pygmalion/metharme prompting using <|system|>, <|user|> and <|model|> tokens.
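The two prompt styles described above can be sketched as small string builders. This is an illustrative sketch only: the helper names `build_chat_prompt` and `build_metharme_prompt` are hypothetical, and the exact whitespace and newline placement between turns is an assumption that this card does not specify. Check the training configuration before relying on a particular layout.

```python
from typing import List, Optional, Tuple

# A turn is (user_message, reply). Use None as the reply for the final,
# still-unanswered turn so the prompt ends where the model should continue.
Turn = Tuple[str, Optional[str]]


def build_chat_prompt(turns: List[Turn], system: Optional[str] = None) -> str:
    """Render turns in the USER:/ASSISTANT: chat style.

    Assumption: one turn marker per line, separated by newlines.
    """
    lines = []
    if system:
        lines.append(system)
    for user_msg, reply in turns:
        lines.append(f"USER: {user_msg}")
        # Leave the last ASSISTANT: marker open when there is no reply yet.
        lines.append("ASSISTANT:" if reply is None else f"ASSISTANT: {reply}")
    return "\n".join(lines)


def build_metharme_prompt(turns: List[Turn], system: Optional[str] = None) -> str:
    """Render turns with <|system|>, <|user|> and <|model|> tokens.

    Assumption: tokens are concatenated directly, with no separators.
    """
    parts = []
    if system:
        parts.append(f"<|system|>{system}")
    for user_msg, reply in turns:
        parts.append(f"<|user|>{user_msg}")
        parts.append("<|model|>" + (reply or ""))
    return "".join(parts)


if __name__ == "__main__":
    history = [("Tell me a riddle.", None)]
    print(build_chat_prompt(history))
    print(build_metharme_prompt(history, system="You are a playful storyteller."))
```

Both builders end the prompt on an open ASSISTANT: or <|model|> marker, so the generated text continues as the model's reply.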
Questions, comments, feedback, looking to donate, or want to help? Reach out on our Discord or email wing@openaccessaicollective.org
Training Datasets
Manticore 30B Chat is a Llama 30B model fine-tuned on the following datasets along with the datasets from the original Manticore 30B.
Note: Manticore 30B Chat has effectively seen only about 40% of the datasets below, because it was trained for only 0.4 epochs.
- de-duped pygmalion dataset, filtered down to RP data
- riddle_sense - instruct augmented
- hellaswag, updated for detailed explanations with 30K+ rows
- gsm8k - instruct augmented
- ewof/code-alpaca-instruct-unfiltered
Manticore 30B
- ShareGPT - based on a cleaned and de-duped subset
- WizardLM
- Wizard-Vicuna
- subset of QingyiSi/Alpaca-CoT for roleplay and CoT
- GPT4-LLM-Cleaned
- GPTeacher-General-Instruct
- ARC-Easy & ARC-Challenge - instruct augmented for detailed responses, derived from the train split
- hellaswag - 5K row subset of instruct augmented for concise responses, derived from the train split
- metaeval/ScienceQA_text_only - instruct for concise responses
- openai/summarize_from_feedback - instruct augmented tl;dr summarization
Not added from Manticore 13B:
- mmlu - mmlu datasets were not added to this model as the test split is used for benchmarks
Shoutouts
Special thanks to Nanobit for helping with Axolotl, TheBloke for quantizing these models so they are more accessible to all, ehartford for cleaned datasets, and 0x000011b for the RP dataset.
Demo
Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). The quantized GGML version may show a small loss in model quality.
Release Notes
Build
Manticore was built with Axolotl on 8xA100 80GB
- 0.4 epochs took approximately 14 hours. No further epochs will be released for the alpha.
- The configuration to duplicate this build is provided in this repo's /config folder.
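As a rough back-of-the-envelope sketch, the build figures above (8 GPUs, about 14 hours, 0.4 epochs) translate into compute cost as follows. The full-epoch estimate assumes training time scales linearly with the fraction of an epoch, which is an assumption and not something this card measures. The function names are illustrative.

```python
NUM_GPUS = 8             # 8xA100 80GB, from the build notes
WALL_CLOCK_HOURS = 14.0  # approximate, from the build notes
EPOCH_FRACTION = 0.4     # portion of one epoch actually trained


def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Total GPU-hours consumed: devices times wall-clock time."""
    return num_gpus * wall_clock_hours


def projected_full_epoch_hours(wall_clock_hours: float, epoch_fraction: float) -> float:
    """Wall-clock hours for one full epoch, assuming linear scaling."""
    return wall_clock_hours / epoch_fraction


if __name__ == "__main__":
    spent = gpu_hours(NUM_GPUS, WALL_CLOCK_HOURS)
    full_epoch = projected_full_epoch_hours(WALL_CLOCK_HOURS, EPOCH_FRACTION)
    print(f"GPU-hours for this alpha: {spent:.0f}")
    print(f"Projected wall-clock hours for a full epoch: {full_epoch:.0f}")
    print(f"Projected GPU-hours for a full epoch: {gpu_hours(NUM_GPUS, full_epoch):.0f}")
```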
Bias, Risks, and Limitations
Manticore has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). Manticore 30B Chat was fine-tuned from the base model LLaMA 30B; please refer to that model card's Limitations section for relevant information.
Examples
TBD