---
library_name: transformers
license: llama3
datasets:
- 2A2I/argilla-dpo-mix-7k-arabic
language:
- ar
pipeline_tag: text-generation
---
Arabic ORPO LLAMA 3
Story first
This model is a finetuned version of meta-llama/Meta-Llama-3-8B-Instruct, trained with ORPO on 2A2I/argilla-dpo-mix-7k-arabic.
I wanted to try ORPO and see whether it would better align an English-biased model like Llama 3 to the Arabic language, or whether it would fail.
While the evaluations favour the base Llama 3 over my finetune, in practice I found that my finetune was much better at producing coherent (and mostly correct) Arabic text, which I find interesting.
I would encourage everyone to try out the model and share their insights with me ^^
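For readers who have not met ORPO before, here is a minimal sketch of its objective as I understand it from the ORPO paper: the usual SFT loss on the chosen answer plus a weighted odds-ratio penalty that pushes the chosen answer above the rejected one. This is not the training code used for this model. The function names and the weight `lam=0.1` are illustrative assumptions, and the inputs are assumed to be length-averaged log-probabilities (strictly negative).

```python
import math

def log_odds(avg_logp):
    """log(p / (1 - p)) where p = exp(avg_logp); avg_logp must be < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """SFT loss on the chosen answer plus lam * odds-ratio penalty.

    lam is an illustrative weight, not the value used for this model.
    """
    sft = -chosen_avg_logp  # negative log-likelihood of the chosen answer
    log_or = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    or_term = math.log1p(math.exp(-log_or))  # equals -log(sigmoid(log_or))
    return sft + lam * or_term

# The loss is lower when the chosen answer is the more likely one.
print(orpo_loss(math.log(0.6), math.log(0.2)))
print(orpo_loss(math.log(0.2), math.log(0.6)))
```

The two printed values only show the direction of the effect; real training would compute these log-probabilities from the model over each chosen/rejected pair in the dataset.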
Evaluation and Results
These results were produced using lighteval with the community|arabic_mmlu tasks.
Community | Llama-3-8B-Instruct | Arabic-ORPO-Llama-3-8B-Instruct |
---|---|---|
All | 0.348 | 0.317 |
Abstract Algebra | 0.310 | 0.230 |
Anatomy | 0.385 | 0.348 |
Astronomy | 0.388 | 0.316 |
Business Ethics | 0.480 | 0.370 |
Clinical Knowledge | 0.396 | 0.385 |
College Biology | 0.347 | 0.299 |
College Chemistry | 0.180 | 0.250 |
College Computer Science | 0.250 | 0.190 |
College Mathematics | 0.260 | 0.280 |
College Medicine | 0.231 | 0.249 |
College Physics | 0.225 | 0.216 |
Computer Security | 0.470 | 0.440 |
Conceptual Physics | 0.315 | 0.404 |
Econometrics | 0.263 | 0.272 |
Electrical Engineering | 0.414 | 0.359 |
Elementary Mathematics | 0.320 | 0.272 |
Formal Logic | 0.270 | 0.214 |
Global Facts | 0.320 | 0.320 |
High School Biology | 0.332 | 0.335 |
High School Chemistry | 0.256 | 0.296 |
High School Computer Science | 0.350 | 0.300 |
High School European History | 0.224 | 0.242 |
High School Geography | 0.323 | 0.364 |
High School Government & Politics | 0.352 | 0.285 |
High School Macroeconomics | 0.290 | 0.285 |
High School Mathematics | 0.237 | 0.278 |
High School Microeconomics | 0.231 | 0.273 |
High School Physics | 0.252 | 0.225 |
High School Psychology | 0.316 | 0.330 |
High School Statistics | 0.199 | 0.176 |
High School US History | 0.284 | 0.250 |
High School World History | 0.312 | 0.274 |
Human Aging | 0.369 | 0.430 |
Human Sexuality | 0.481 | 0.321 |
International Law | 0.603 | 0.405 |
Jurisprudence | 0.491 | 0.370 |
Logical Fallacies | 0.368 | 0.276 |
Machine Learning | 0.214 | 0.312 |
Management | 0.350 | 0.379 |
Marketing | 0.521 | 0.547 |
Medical Genetics | 0.320 | 0.330 |
Miscellaneous | 0.446 | 0.443 |
Moral Disputes | 0.422 | 0.306 |
Moral Scenarios | 0.248 | 0.241 |
Nutrition | 0.412 | 0.346 |
Philosophy | 0.408 | 0.328 |
Prehistory | 0.429 | 0.349 |
Professional Accounting | 0.344 | 0.273 |
Professional Law | 0.306 | 0.244 |
Professional Medicine | 0.228 | 0.206 |
Professional Psychology | 0.337 | 0.315 |
Public Relations | 0.391 | 0.373 |
Security Studies | 0.469 | 0.335 |
Sociology | 0.498 | 0.408 |
US Foreign Policy | 0.590 | 0.490 |
Virology | 0.422 | 0.416 |
World Religions | 0.404 | 0.304 |
Average (All Communities) | 0.348 | 0.317 |
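
To see where the finetune gains or loses against the base model, the per-task differences can be computed directly from the table. The snippet below is a small sketch that uses only a handful of rows copied from the table above. The `rows` list and the `deltas` helper are names I made up for illustration, and the remaining rows can be added in the same way.

```python
# (task, base score, finetune score) copied from a few rows of the table above
rows = [
    ("All", 0.348, 0.317),
    ("College Chemistry", 0.180, 0.250),
    ("Conceptual Physics", 0.315, 0.404),
    ("Human Sexuality", 0.481, 0.321),
    ("International Law", 0.603, 0.405),
    ("Machine Learning", 0.214, 0.312),
]

def deltas(rows):
    """Return (task, finetune - base) pairs, biggest gain first."""
    pairs = [(task, round(finetune - base, 3)) for task, base, finetune in rows]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

for task, delta in deltas(rows):
    print(f"{task:<20} {delta:+.3f}")
```

A positive value means the finetune scored higher on that task, and a negative value means the base model did.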