---
library_name: transformers
tags:
- gsm8k
- dpo
datasets:
- August4293/gsm8k_preference_dataset_it_1
language:
- en
---
|
|
|
|
|
|
## Model Overview
|
|
|
This model was fine-tuned on the [GSM8K Preference Dataset](https://huggingface.co/datasets/August4293/gsm8k_preference_dataset_it_1) using Direct Preference Optimization (DPO). The goal of the fine-tuning is to improve the model's accuracy on the multi-step grade-school math word problems in the GSM8K benchmark.
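
For readers unfamiliar with Direct Preference Optimization, the sketch below computes the standard DPO loss for a single preference pair (one preferred and one rejected answer to the same prompt). It is an illustration of the objective only, not the training code used for this model: the function name `dpo_loss`, the default `beta=0.1`, and the log-probability values in the usage lines are all hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full answer under
    either the model being trained (policy) or the frozen reference model.
    beta=0.1 is an assumed value, not the setting used for this model.
    """
    # How much more the policy favors each answer than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin between the preferred and the rejected answer.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities, for illustration only.
untrained = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # policy equals reference
improved = dpo_loss(-10.0, -15.0, -12.0, -12.0)   # policy prefers chosen answer
print(untrained, improved)
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the preferred answer relative to the reference.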
|
|