Mistral-Plus Model Card

Model Details

Mistral-Plus is a chat assistant trained by Reinforcement Learning from Human Feedback (RLHF) using the Mistral-7B base model as the backbone.

Mistral-Plus adopts an innovative approach by completely bypassing Supervised Fine-Tuning (SFT) and directly implementing Harmless Reinforcement Learning from Human Feedback (RLHF).
Mistral-Plus uses the mistralai/Mistral-7B-v0.1 model as its backbone.
License: Mistral-Plus is licensed under the same license as the mistralai/Mistral-7B-v0.1 model.

Model Sources

Paper (Mistral-Plus): https://arxiv.org/abs/2403.02513

Uses

Mistral-Plus is primarily utilized for research in the areas of large language models and chatbots. It is intended chiefly for use by researchers and hobbyists specializing in natural language processing, machine learning, and artificial intelligence.

Mistral-Plus not only preserves the Mistral base model's general capabilities, but also significantly enhances its conversational abilities and notably reduces the generation of toxic outputs as human preference.

Goal: Empower researchers worldwide!

To the best of knowledge, this is the first academic endeavor to bypass supervised fine-tuning and directly apply reinforcement learning from human feedback. More importantly, Mistral-Plus is publicly available through HuggingFace for promoting collaborative research and innovation. This initiative to open-source Mistral-Plus seeks to empower researchers worldwide, enabling them to delve deeper into and build upon Mistral-Plus work, with a particular focus on conversational tasks, such as customer service, intelligent assistant, etc.

Model Performance on 11 general Language Tasks

Mistral-Plus on General language Understanding and Reasoning

Enhancing Conversational Safety in the Mistral-Plus

Bad word generation probablity on Mistral-Instruct and Mistral-Plus. The x-axis represents different intermittent layers, y-axis shows token probability.

zhengchenphd
/

Mistral-Plus-7B