---
license: cc-by-nc-4.0
base_model: mlabonne/NeuralMonarch-7B
tags:
  - generated_from_trainer
  - mistral
  - instruct
  - finetune
  - chatml
  - gpt4
  - synthetic data
  - distillation
model-index:
  - name: AlphaMonarch-dora
    results: []
datasets:
  - argilla/OpenHermes2.5-dpo-binarized-alpha
language:
  - en
library_name: transformers
pipeline_tag: text-generation
---

AlphaMonarch-dora


AlphaMonarch-dora is a DPO fine-tune of mlabonne/NeuralMonarch-7B, trained with DoRA (Weight-Decomposed Low-Rank Adaptation) on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset. It scores slightly lower on the Nous and Open LLM leaderboards than the base AlphaMonarch and AlphaMonarch-laser. I trained this model for 1080 steps. All hyperparameters were kept consistent across these experiments.

πŸ† Evaluation results

OpenLLM Benchmark

[Figure: OpenLLM Benchmark results (image not included)]

Nous Benchmark

AGIEVAL

| Task | Version | Accuracy | Accuracy StdErr | Normalized Accuracy | Normalized Accuracy StdErr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | 28.35% | 2.83% | 26.38% | 2.77% |
| agieval_logiqa_en | 0 | 38.71% | 1.91% | 38.25% | 1.90% |
| agieval_lsat_ar | 0 | 23.91% | 2.82% | 23.48% | 2.80% |
| agieval_lsat_lr | 0 | 52.55% | 2.21% | 53.73% | 2.21% |
| agieval_lsat_rc | 0 | 66.91% | 2.87% | 66.54% | 2.88% |
| agieval_sat_en | 0 | 78.64% | 2.86% | 78.64% | 2.86% |
| agieval_sat_en_without_passage | 0 | 45.15% | 3.48% | 44.17% | 3.47% |
| agieval_sat_math | 0 | 33.64% | 3.19% | 31.82% | 3.15% |

AVG = 45.976

GPT4ALL

| Task | Version | Accuracy | Accuracy StdErr | Normalized Accuracy | Normalized Accuracy StdErr |
|---|---|---|---|---|---|
| arc_challenge | 0 | 65.87% | 1.39% | 67.92% | 1.36% |
| arc_easy | 0 | 86.49% | 0.70% | 80.64% | 0.81% |
| boolq | 1 | 87.16% | 0.59% | - | - |
| hellaswag | 0 | 69.86% | 0.46% | 87.51% | 0.33% |
| openbookqa | 0 | 39.00% | 2.18% | 49.20% | 2.24% |
| piqa | 0 | 83.03% | 0.88% | 84.82% | 0.84% |
| winogrande | 0 | 80.98% | 1.10% | - | - |

AVG = 73.18

TRUTHFUL-QA

| Task | Version | MC1 Accuracy | MC1 Accuracy StdErr | MC2 Accuracy | MC2 Accuracy StdErr |
|---|---|---|---|---|---|
| truthfulqa_mc | 1 | 62.91% | 1.69% | 78.48% | 1.37% |

AVG = 70.69

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-7
  • train_batch_size: 2
  • eval_batch_size: Not specified
  • seed: Not specified
  • gradient_accumulation_steps: 8
  • total_train_batch_size: Not specified
  • optimizer: PagedAdamW with 32-bit precision
  • lr_scheduler_type: Cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1080

Framework versions

  • Transformers 4.39.0.dev0
  • Peft 0.9.1.dev0
  • Datasets 2.18.0
  • torch 2.2.0
  • accelerate 0.27.2