	---
	library_name: transformers
	tags: []
	---
	# SOLAR-10.7b-Instruct-truthy-dpo

	![orca-bagel](orca-bagel.png)

	This model is a finetune of [macadeliccc/SOLAR-10.7b-Instruct-dpo](https://huggingface.co/macadeliccc/SOLAR-10.7b-Instruct-dpo).

	## Process

	1. I finetuned upstageai/Solar-10.7b-Instruct-v0.1 for 1 epoch on Intel/orca_dpo_pairs (12.4k samples).
	2. I further finetuned that model for 3 epochs on jondurbin/truthy-dpo-v0.1 (1.04k samples).
	3. This process is experimental, and the base model linked above has been tested more thoroughly at this time.
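
The card does not include training code or hyperparameters. Assuming both finetuning stages used Direct Preference Optimization (DPO), as the dataset and model names suggest, the per-pair objective can be sketched in plain Python as below. The function names and `beta=0.1` are illustrative assumptions, not values taken from this model's training run.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not stated in the card
) -> float:
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) equals softplus(-x).
    return softplus(-logits)
```

When the policy and the reference assign identical log-probabilities, the loss is `log(2)`. It falls below that as the policy shifts probability toward the chosen response relative to the reference.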

	## GGUF

	Available [here](https://huggingface.co/macadeliccc/SOLAR-10.7b-Instruct-truthy-dpo-GGUF)
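
The card does not list the quantization files in the GGUF repository. A minimal sketch for discovering them, assuming the `huggingface_hub` package is installed, might look like this. The helper names are hypothetical; `list_repo_files` is the library call used to query the repository.

```python
REPO_ID = "macadeliccc/SOLAR-10.7b-Instruct-truthy-dpo-GGUF"


def gguf_files(filenames):
    """Return only the GGUF weight files from a list of repo file names."""
    return sorted(name for name in filenames if name.lower().endswith(".gguf"))


def list_remote_gguf(repo_id: str = REPO_ID):
    """Query the Hub and return the GGUF files in the repository.

    Requires network access. The import is kept local so the pure
    helper above can be used without huggingface_hub installed.
    """
    from huggingface_hub import list_repo_files

    return gguf_files(list_repo_files(repo_id))
```

Calling `list_remote_gguf()` returns the available file names. One of them can then be passed as `filename` to `huggingface_hub.hf_hub_download(repo_id=REPO_ID, filename=...)` to fetch it.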

	## Evaluations

	Evaluated with the model loaded in 4-bit precision.

	| Tasks         | Version | Filter | n-shot | Metric   | Value  |   | Stderr |
	|---------------|---------|--------|-------:|----------|-------:|---|-------:|
	| arc_challenge | Yaml    | none   |      0 | acc      | 0.5853 | ± | 0.0144 |
	|               |         | none   |      0 | acc_norm | 0.6126 | ± | 0.0142 |
	| arc_easy      | Yaml    | none   |      0 | acc      | 0.8077 | ± | 0.0081 |
	|               |         | none   |      0 | acc_norm | 0.7715 | ± | 0.0086 |
	| boolq         | Yaml    | none   |      0 | acc      | 0.8630 | ± | 0.0060 |
	| hellaswag     | Yaml    | none   |      0 | acc      | 0.6653 | ± | 0.0047 |
	|               |         | none   |      0 | acc_norm | 0.8498 | ± | 0.0036 |
	| openbookqa    | Yaml    | none   |      0 | acc      | 0.3460 | ± | 0.0213 |
	|               |         | none   |      0 | acc_norm | 0.4660 | ± | 0.0223 |
	| piqa          | Yaml    | none   |      0 | acc      | 0.7835 | ± | 0.0096 |
	|               |         | none   |      0 | acc_norm | 0.7851 | ± | 0.0096 |
	| winogrande    | Yaml    | none   |      0 | acc      | 0.7277 | ± | 0.0125 |
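
To read the `± Stderr` column, one option is to turn each score into an approximate 95% confidence interval. The sketch below copies the `acc` rows from the table and assumes a normal approximation (`z = 1.96`), which is a common convention rather than something the card specifies.

```python
# (value, stderr) for the `acc` metric, copied from the table above.
RESULTS = {
    "arc_challenge": (0.5853, 0.0144),
    "arc_easy": (0.8077, 0.0081),
    "boolq": (0.8630, 0.0060),
    "hellaswag": (0.6653, 0.0047),
    "openbookqa": (0.3460, 0.0213),
    "piqa": (0.7835, 0.0096),
    "winogrande": (0.7277, 0.0125),
}


def confidence_interval(value: float, stderr: float, z: float = 1.96):
    """Return (low, high) for value +/- z * stderr (normal approximation)."""
    half_width = z * stderr
    return value - half_width, value + half_width


def intervals(results=RESULTS):
    """Map each task name to its approximate 95% confidence interval."""
    return {task: confidence_interval(v, se) for task, (v, se) in results.items()}


for task, (low, high) in intervals().items():
    print(f"{task:<14} acc in [{low:.4f}, {high:.4f}]")
```

Tasks with larger standard errors, such as openbookqa, have wider intervals, so small differences on them against other models are less conclusive.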