---
license: llama3.2
datasets:
- mlabonne/orpo-dpo-mix-40k
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: week2-llama3-1B
  results:
  - task:
      type: text-generation
    dataset:
      name: mlabonne/orpo-dpo-mix-40k
      type: mlabonne/orpo-dpo-mix-40k
    metrics:
    - name: EQ-Bench (0-Shot)
      type: EQ-Bench (0-Shot)
      value: 1.5355
---
|
## Model Overview

This model is a fine-tuned variant of **Llama-3.2-1B**, trained with **ORPO** (Odds Ratio Preference Optimization) on the **mlabonne/orpo-dpo-mix-40k** preference dataset as part of the *Finetuning Open Source LLMs Course - Week 2 Project*.
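For reference, below is a minimal sketch of what an ORPO run on this dataset could look like using the TRL library's `ORPOTrainer`. The hyperparameters and output directory are illustrative assumptions, not the exact configuration used to train this checkpoint.

```python
# Illustrative ORPO fine-tuning sketch with TRL; hyperparameters are
# placeholders, not the values used for this checkpoint.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "meta-llama/Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

# The dataset provides "chosen"/"rejected" preference pairs, which ORPO consumes directly.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

config = ORPOConfig(
    output_dir="week2-llama3-1B",
    beta=0.1,                      # weight of the odds-ratio penalty term
    learning_rate=8e-6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    max_length=1024,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,    # named `tokenizer` in older TRL releases
)
trainer.train()
```

Unlike DPO, ORPO folds the preference objective into the supervised loss via an odds-ratio term, so no separate reference model needs to be kept in memory during training.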
|
|
|
## Intended Use

This model is intended for general-purpose English language tasks, including parsing text, following contextual prompts, and general natural language understanding. A usage sketch is shown below.
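A minimal way to try the model is the transformers `pipeline` API. The repository id below is a hypothetical placeholder; substitute the actual Hub path of this checkpoint.

```python
# Quick inference sketch with the transformers pipeline API.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-username/week2-llama3-1B",  # hypothetical repo id, replace with the real one
)

output = generator(
    "Explain the difference between DPO and ORPO in one paragraph.",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```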
|
|
|
## Evaluation Results

The model was evaluated on the following benchmarks:
|
| Tasks     | Version | Filter | n-shot | Metric                    |   Value |   Stderr |
|-----------|--------:|--------|-------:|---------------------------|--------:|---------:|
| eq_bench  |     2.1 | none   |      0 | eqbench ↑                 |  1.5355 | ± 0.9184 |
|           |         | none   |      0 | percent_parseable ↑       | 16.9591 | ± 2.8782 |
| hellaswag |       1 | none   |      0 | acc ↑                     |  0.4812 | ± 0.0050 |
|           |         | none   |      0 | acc_norm ↑                |  0.6467 | ± 0.0049 |
| ifeval    |       4 | none   |      0 | inst_level_loose_acc ↑    |  0.3984 |      N/A |
|           |         | none   |      0 | inst_level_strict_acc ↑   |  0.2974 |      N/A |
|           |         | none   |      0 | prompt_level_loose_acc ↑  |  0.2755 | ± 0.0193 |
|           |         | none   |      0 | prompt_level_strict_acc ↑ |  0.1848 | ± 0.0168 |
| tinyMMLU  |       0 | none   |      0 | acc_norm ↑                |  0.3995 |      N/A |
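These results follow the output format of EleutherAI's lm-evaluation-harness. A hedged sketch of reproducing such a run through the harness's Python API is shown below; the task identifiers and the model repo id are assumptions inferred from the table above, so verify them against `lm_eval --tasks list` in your installed version.

```python
# Sketch of reproducing the evaluation with EleutherAI's lm-evaluation-harness.
# Task names and the repo id are assumptions, not a verified invocation.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-username/week2-llama3-1B",  # hypothetical repo id
    tasks=["eq_bench", "hellaswag", "ifeval", "tinyMMLU"],
    num_fewshot=0,
)
print(results["results"])
```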
|
|
|
## Key Features

- **Model Size**: 1 billion parameters
- **Fine-tuning Method**: ORPO (Odds Ratio Preference Optimization)
- **Dataset**: mlabonne/orpo-dpo-mix-40k