license: llama3.2
datasets:
  - mlabonne/orpo-dpo-mix-40k
language:
  - en
base_model:
  - meta-llama/Llama-3.2-1B
library_name: transformers
pipeline_tag: text-generation
model-index:
  - name: week2-llama3-1B
    results:
      - task:
          type: text-generation
        dataset:
          name: mlabonne/orpo-dpo-mix-40k
          type: mlabonne/orpo-dpo-mix-40k
        metrics:
          - name: EQ-Bench (0-Shot)
            type: EQ-Bench (0-Shot)
            value: 1.5355

Model Overview

This model is a fine-tuned variant of Llama-3.2-1B, trained with ORPO (Odds Ratio Preference Optimization) on the mlabonne/orpo-dpo-mix-40k dataset as part of the Finetuning Open Source LLMs Course, Week 2 Project.
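As a rough illustration of the ORPO objective, the sketch below combines a supervised (negative log-likelihood) term on the chosen response with an odds-ratio penalty that pushes the chosen response's odds above the rejected one's. It is not the training code used for this model: the function names, the scalar inputs (average per-token log-probabilities), and the weight `lam=0.1` are assumptions made for the example.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is the average per-token log-probability of a response,
    so it must be strictly negative (p < 1).
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Illustrative ORPO loss for one (chosen, rejected) pair.

    lam is a hypothetical weight on the odds-ratio term.
    """
    # Supervised term: negative log-likelihood of the chosen response.
    sft_term = -chosen_avg_logp
    # Log odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(z)) written as log(1 + exp(-z)).
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * or_term


if __name__ == "__main__":
    # The loss is lower when the chosen response is the more likely one.
    print(orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-2.0))
    print(orpo_loss(chosen_avg_logp=-2.0, rejected_avg_logp=-0.5))
```

In practice the log-probabilities would come from the model being trained, and the loss would be averaged over a batch of preference pairs.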

Intended Use

This model is intended for general-purpose English language tasks, such as parsing text and responding to contextual prompts in natural language processing applications.

Evaluation Results

The model was evaluated zero-shot on the benchmarks below.

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| eq_bench | 2.1 | none | 0 | eqbench | 1.5355 | ± 0.9184 |
| | | none | 0 | percent_parseable | 16.9591 | ± 2.8782 |
| hellaswag | 1 | none | 0 | acc | 0.4812 | ± 0.0050 |
| | | none | 0 | acc_norm | 0.6467 | ± 0.0049 |
| ifeval | 4 | none | 0 | inst_level_loose_acc | 0.3984 | ± N/A |
| | | none | 0 | inst_level_strict_acc | 0.2974 | ± N/A |
| | | none | 0 | prompt_level_loose_acc | 0.2755 | ± 0.0193 |
| | | none | 0 | prompt_level_strict_acc | 0.1848 | ± 0.0168 |
| tinyMMLU | 0 | none | 0 | acc_norm | 0.3995 | ± N/A |
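
To help read the Stderr column, the sketch below turns a reported value and its standard error into an approximate 95% confidence interval. It assumes a normal approximation (z = 1.96), which may not match how the evaluation tool derived its standard errors; the helper name `confidence_interval` is made up for this example, and the numbers are copied from the results above.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval under a normal approximation."""
    margin = z * stderr
    return (value - margin, value + margin)


# (value, stderr) pairs taken from the evaluation results; rows with N/A are skipped.
results = {
    "eq_bench / eqbench": (1.5355, 0.9184),
    "eq_bench / percent_parseable": (16.9591, 2.8782),
    "hellaswag / acc": (0.4812, 0.0050),
    "hellaswag / acc_norm": (0.6467, 0.0049),
    "ifeval / prompt_level_loose_acc": (0.2755, 0.0193),
    "ifeval / prompt_level_strict_acc": (0.1848, 0.0168),
}

if __name__ == "__main__":
    for name, (value, stderr) in results.items():
        low, high = confidence_interval(value, stderr)
        print(f"{name}: {value:.4f} (95% CI {low:.4f} to {high:.4f})")
```

Under this approximation the EQ-Bench interval runs from roughly -0.26 to 3.34, so it includes zero, while the HellaSwag intervals are only about ±0.01 wide.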

Key Features

  • Model Size: 1 billion parameters
  • Fine-tuning Method: ORPO
  • Dataset: mlabonne/orpo-dpo-mix-40k