license: cc-by-nc-4.0
model-index:
- name: PiVoT-MoE
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 63.91
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.52
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 60.71
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 54.64
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 76.32
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 39.12
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-MoE
      name: Open LLM Leaderboard
PiVoT-MoE
Model Description
PiVoT-MoE is a model designed for roleplaying. It is built from four 10.7B-parameter experts, each fine-tuned with its own specialized characteristics, to provide a varied roleplaying experience.
The model uses the Mixture of Experts (MoE) technique, in which the experts work together to produce a more cohesive and natural conversation flow. The MoE architecture also makes the model more flexible, so PiVoT-MoE can handle a wide variety of roleplaying scenarios and characters.
PiVoT-MoE is based on the PiVoT-10.7B-Mistral-v0.2-RP model and extends it with the MoE technique. In addition to a broad knowledge base, the model can mix and match the strengths of its experts to suit a specific roleplaying scenario.
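To make the "mix and match" idea concrete, here is a minimal sketch of how a generic MoE layer can blend expert outputs. It is not PiVoT-MoE's actual routing code: the function name `route_top_k`, the choice of two active experts per token, and the toy scores and vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, expert_outputs, k=2):
    """Blend the outputs of the k highest-scoring experts.

    gate_scores: one router score per expert (hypothetical values).
    expert_outputs: one output vector per expert, all the same length.
    k=2 is an assumption; the model card does not state how many
    experts are active per token.
    """
    # Indices of the k experts with the highest router scores.
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Renormalize the selected scores so the mixing weights sum to 1.
    weights = softmax([gate_scores[i] for i in top])
    dim = len(expert_outputs[0])
    mixed = [0.0] * dim
    for weight, idx in zip(weights, top):
        for d in range(dim):
            mixed[d] += weight * expert_outputs[idx][d]
    return top, weights, mixed


# Toy example with four experts, matching the expert count in the description.
gate_scores = [0.1, 2.0, -1.0, 2.0]
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]
top, weights, mixed = route_top_k(gate_scores, expert_outputs)
```

In this toy run the router picks the two experts with the highest scores and averages their outputs with equal weight, since their scores are tied.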
Prompt Template - Alpaca (ChatML also works)
{system}
### Instruction:
{instruction}
### Response:
{response}
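The template above can be filled in with plain string formatting. The sketch below is one way to do it; the helper name `build_prompt` and the single-newline spacing between sections are assumptions, since the card does not specify exact whitespace.

```python
# Alpaca-style template from the model card. The newline placement is an
# assumption; adjust it if your inference setup expects different spacing.
ALPACA_TEMPLATE = "{system}\n### Instruction:\n{instruction}\n### Response:\n"


def build_prompt(system, instruction, response=""):
    """Return a prompt string; leave `response` empty to let the model answer."""
    prompt = ALPACA_TEMPLATE.format(system=system, instruction=instruction)
    return prompt + response


prompt = build_prompt(
    system="You are the narrator of a fantasy roleplay.",
    instruction="Describe the tavern the party has just entered.",
)
```

For generation, the response slot is left empty so the model continues from the `### Response:` marker.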
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 63.04 |
AI2 Reasoning Challenge (25-Shot) | 63.91 |
HellaSwag (10-Shot) | 83.52 |
MMLU (5-Shot) | 60.71 |
TruthfulQA (0-shot) | 54.64 |
Winogrande (5-shot) | 76.32 |
GSM8k (5-shot) | 39.12 |
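
As a quick check, the reported average appears to be the unweighted mean of the six benchmark scores in the table. The short sketch below recomputes it; the dictionary name `scores` is only illustrative.

```python
# Benchmark scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 63.91,
    "HellaSwag (10-Shot)": 83.52,
    "MMLU (5-Shot)": 60.71,
    "TruthfulQA (0-shot)": 54.64,
    "Winogrande (5-shot)": 76.32,
    "GSM8k (5-shot)": 39.12,
}

# Unweighted mean, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```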