|
--- |
|
language:
  - en
license: apache-2.0
datasets:
  - databricks/databricks-dolly-15k
  - Felladrin/ChatML-databricks-dolly-15k
  - euclaise/reddit-instruct-curated
  - Felladrin/ChatML-reddit-instruct-curated
  - THUDM/webglm-qa
  - Felladrin/ChatML-WebGLM-QA
  - starfishmedical/webGPT_x_dolly
  - Felladrin/ChatML-webGPT_x_dolly
  - LDJnr/Capybara
  - Felladrin/ChatML-Capybara
  - Open-Orca/SlimOrca-Dedup
  - Felladrin/ChatML-SlimOrca-Dedup
  - HuggingFaceH4/ultrachat_200k
  - Felladrin/ChatML-ultrachat_200k
  - nvidia/HelpSteer
  - Felladrin/ChatML-HelpSteer
  - sablo/oasst2_curated
  - Felladrin/ChatML-oasst2_curated
  - CohereForAI/aya_dataset
  - Felladrin/ChatML-aya_dataset
  - argilla/distilabel-capybara-dpo-7k-binarized
  - Felladrin/ChatML-distilabel-capybara-dpo-7k-binarized
  - argilla/distilabel-intel-orca-dpo-pairs
  - Felladrin/ChatML-distilabel-intel-orca-dpo-pairs
  - argilla/ultrafeedback-binarized-preferences
  - Felladrin/ChatML-ultrafeedback-binarized-preferences
  - sablo/oasst2_dpo_pairs_en
  - Felladrin/ChatML-oasst2_dpo_pairs_en
  - NeuralNovel/Neural-DPO
  - Felladrin/ChatML-Neural-DPO
base_model: Felladrin/Minueza-32M-Base
pipeline_tag: text-generation
|
widget:
  - messages:
      - role: system
        content: You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advise on which qualifications would be beneficial for pursuing particular fields.
      - role: user
        content: Heya!
      - role: assistant
        content: Hi! How may I help you?
      - role: user
        content: I am interested in developing a career in software engineering. What would you recommend I do?
  - messages:
      - role: system
        content: You are a highly knowledgeable assistant. Help the user as much as you can.
      - role: user
        content: How can I become a healthier person?
  - messages:
      - role: system
        content: You are a helpful assistant who gives creative responses.
      - role: user
        content: Write the specs of a game about mages in a fantasy world.
  - messages:
      - role: system
        content: You are a helpful assistant who answers the user's questions with details.
      - role: user
        content: Tell me about the pros and cons of social media.
  - messages:
      - role: system
        content: You are a helpful assistant who answers the user's questions with details and curiosity.
      - role: user
        content: What are some potential applications for quantum computing?
inference:
  parameters:
    max_new_tokens: 250
    do_sample: true
    temperature: 0.65
    top_p: 0.55
    top_k: 35
    repetition_penalty: 1.176
|
model-index:
  - name: Minueza-32M-Chat
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 20.39
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 26.54
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 25.75
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 47.27
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 50.99
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 0.0
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
          name: Open LLM Leaderboard
|
--- |
|
|
|
# Minueza-32M-Chat: A chat model with 32 million parameters |
|
|
|
- Base model: [Felladrin/Minueza-32M-Base](https://huggingface.co/Felladrin/Minueza-32M-Base) |
|
- Datasets used during SFT: |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-databricks-dolly-15k)] [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-reddit-instruct-curated)] [euclaise/reddit-instruct-curated](https://huggingface.co/datasets/euclaise/reddit-instruct-curated) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-WebGLM-QA)] [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-webGPT_x_dolly)] [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-Capybara)] [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-SlimOrca-Dedup)] [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-ultrachat_200k)] [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-HelpSteer)] [nvidia/HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-oasst2_curated)] [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-aya_dataset)] [CohereForAI/aya_dataset](https://huggingface.co/datasets/CohereForAI/aya_dataset) |
|
- Datasets used during DPO: |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-distilabel-capybara-dpo-7k-binarized)] [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-distilabel-intel-orca-dpo-pairs)] [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-ultrafeedback-binarized-preferences)] [argilla/ultrafeedback-binarized-preferences](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-oasst2_dpo_pairs_en)] [sablo/oasst2_dpo_pairs_en](https://huggingface.co/datasets/sablo/oasst2_dpo_pairs_en) |
|
- [[ChatML](https://huggingface.co/datasets/Felladrin/ChatML-Neural-DPO)] [NeuralNovel/Neural-DPO](https://huggingface.co/datasets/NeuralNovel/Neural-DPO) |
|
- License: [Apache License 2.0](https://huggingface.co/Felladrin/Minueza-32M-Chat/resolve/main/license.txt) |
|
- Availability in other ML formats: |
|
- GGUF: [Felladrin/gguf-Minueza-32M-Chat](https://huggingface.co/Felladrin/gguf-Minueza-32M-Chat) (see the usage sketch after this list)
|
- ONNX: [Felladrin/onnx-Minueza-32M-Chat](https://huggingface.co/Felladrin/onnx-Minueza-32M-Chat) |
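
For the GGUF files, here is a minimal sketch using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), which ships a built-in ChatML chat format. The file name below is hypothetical; substitute one of the quantizations published in the GGUF repository.

```python
from llama_cpp import Llama

# Hypothetical file name: use one of the quantized files from
# Felladrin/gguf-Minueza-32M-Chat.
llm = Llama(model_path="./minueza-32m-chat.Q8_0.gguf", chat_format="chatml")

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are some potential applications for quantum computing?"},
    ],
    max_tokens=250,
    temperature=0.65,
    top_p=0.55,
    top_k=35,
    repeat_penalty=1.176,
)

print(response["choices"][0]["message"]["content"])
```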
|
|
|
## Recommended Prompt Format |
|
|
|
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
```
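
You do not need to assemble this string by hand: the model's tokenizer ships a chat template that renders it, as the usage example below relies on. A quick illustrative check:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Felladrin/Minueza-32M-Chat")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the conversation without tokenizing, leaving the prompt open
# for the assistant's turn.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# <|im_start|>system
# You are a helpful assistant.<|im_end|>
# <|im_start|>user
# Hello!<|im_end|>
# <|im_start|>assistant
```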
|
|
|
## Recommended Inference Parameters |
|
|
|
```yml
do_sample: true
temperature: 0.65
top_p: 0.55
top_k: 35
repetition_penalty: 1.176
```
|
|
|
## Usage Example |
|
|
|
```python
from transformers import pipeline

# Load the chat model as a text-generation pipeline.
generate = pipeline("text-generation", "Felladrin/Minueza-32M-Chat")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant who answers the user's questions with details and curiosity.",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]

# Render the conversation with the tokenizer's ChatML chat template.
prompt = generate.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate a completion using the recommended sampling parameters.
output = generate(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.65,
    top_k=35,
    top_p=0.55,
    repetition_penalty=1.176,
)

print(output[0]["generated_text"])
```
|
|
|
## How It Was Trained
|
|
|
This model was trained in several sessions with TRL's [SFT Trainer](https://huggingface.co/docs/trl/main/en/sft_trainer) and [DPO Trainer](https://huggingface.co/docs/trl/main/en/dpo_trainer), using the following settings:
|
|
|
For Supervised Fine-Tuning: |
|
|
|
| Hyperparameter         | Value |
| :--------------------- | :---- |
| learning_rate          | 2e-5  |
| total_train_batch_size | 24    |
| max_seq_length         | 2048  |
| weight_decay           | 0     |
| warmup_ratio           | 0.02  |
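
As a rough illustration, these settings map onto TRL's `SFTTrainer` as sketched below. This is not the exact training script: the dataset picked, the `text` column name, and the trainer signature (which has changed across trl versions) are assumptions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model = AutoModelForCausalLM.from_pretrained("Felladrin/Minueza-32M-Base")
tokenizer = AutoTokenizer.from_pretrained("Felladrin/Minueza-32M-Base")

# One of the ChatML-formatted SFT datasets listed above; the "text" column
# name is an assumption about the dataset layout.
dataset = load_dataset("Felladrin/ChatML-databricks-dolly-15k", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="sft-output",
        learning_rate=2e-5,
        per_device_train_batch_size=24,  # total_train_batch_size of 24 on a single device
        weight_decay=0,
        warmup_ratio=0.02,
    ),
)
trainer.train()
```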
|
|
|
For Direct Preference Optimization: |
|
|
|
| Hyperparameter         | Value  |
| :--------------------- | :----- |
| learning_rate          | 7.5e-7 |
| total_train_batch_size | 6      |
| max_length             | 2048   |
| max_prompt_length      | 1536   |
| max_steps              | 200    |
| weight_decay           | 0      |
| warmup_ratio           | 0.02   |
| beta                   | 0.1    |
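
And a corresponding sketch of the DPO stage, under the same caveats (paths, dataset choice, and the `DPOTrainer` signature are assumptions):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Start from the SFT checkpoint produced in the previous sketch.
model = AutoModelForCausalLM.from_pretrained("sft-output")
tokenizer = AutoTokenizer.from_pretrained("sft-output")

# One of the ChatML-formatted DPO datasets listed above; a
# prompt/chosen/rejected column layout is assumed.
dataset = load_dataset("Felladrin/ChatML-distilabel-capybara-dpo-7k-binarized", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # trl derives the frozen reference model from a copy of the policy
    beta=0.1,
    max_length=2048,
    max_prompt_length=1536,
    train_dataset=dataset,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="dpo-output",
        learning_rate=7.5e-7,
        per_device_train_batch_size=6,  # total_train_batch_size of 6 on a single device
        max_steps=200,
        weight_decay=0,
        warmup_ratio=0.02,
    ),
)
trainer.train()
```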
|
|
|
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) |
|
|
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Felladrin__Minueza-32M-Chat).
|
|
|
| Metric                            | Value |
| --------------------------------- | ----: |
| Avg.                              | 28.49 |
| AI2 Reasoning Challenge (25-Shot) | 20.39 |
| HellaSwag (10-Shot)               | 26.54 |
| MMLU (5-Shot)                     | 25.75 |
| TruthfulQA (0-shot)               | 47.27 |
| Winogrande (5-shot)               | 50.99 |
| GSM8k (5-shot)                    |  0.00 |
|
|