--- . ____ ----- ______ ----- .
___ / \ ..................... ____ / \
.' '. -- ..:::::''''''''''''''''':::::.. .' '.
--- | ^ ^ | .::::'''' (_ ''''::::. -- | ^ ^ '
| ^ ^ | .::'' _) ''::. | ^ ^ | --
____ '...' .::' .-. (_ '::. '...'
.-.!_ .::' _) / \ '::. ! ____
/ / `-`.:' '-.-' _) ':..""".
-- ' | '.|:' _) .'. (_ ':/' | \
| | |'. _/^---^\_ | . --
___ \ . '| \-------../ (_ \ '.'
' : ' _) '.\:::/.' (_ )_ |' || ___
| | .| _( | | |'| / ' . |
-- | '. | \ '.\ /.' '. | |--
|'. '| |[ ]| (_ | .' |____
__ .'\ | .'\ '.^.' \ |. .
.'-.\'. | | _) (:) | ||| |
.' \'..' . _..--'''--.._ (_ /'-._.-'| ---
| `-..'. .-' '-. | .-'.
\ `-. .' .. .. '. .'-._.-' `.
-- ) `-./ '::. .::' \ _.-' /
'._/-.. / '::. .::' \-' .-'
::.`-. '' ':: ::' '' _..-\_.'
::: '._ | \ ' ' / | .-' .:: _____
____ ::: `-.| ' .----..___..----. ' | .-' :::
::: \ | _..--. .--.._ | /-' ::: ---
::: _) | ' / | | \ ' | ( :::
-- ::: ) | _.' '._ | ( )_ :::____
____ ::: /'. \_.' )\ /( '._/ .'\ (_ :::
::: .-'| `-->-@ / \ @->--' |-. :::
::: .-' \ | / \ | / `-. ::: ---
---- '' _.-' | )/ \( | `-. ::: _____
_.-=--..-' . \ /\ /\ / `-. ''
/.._ `. .-' .\ '-.\.\\.//./.-' /.`-. `---.._
| `. \ .-' | '. .' | `-. \
\ _\. `.-' | '-././.\.\.-' | `. |
`.-' | /::::::::::: \ /::::::::`. ,-. /
- | / /LGB ---- '-. .-' ---- `. | \_.'
__ \ | .' _____ '-._._._._.-' ____ | | |
`--' `-. '._ / --
`...-'
MESS WITH THE BEST, DIE LIKE THE REST
--=- D*D - R****1911 - F***L***T - P***D*X -=--
THE WORLD NEEDS US BACK :)
OMA, OneManArmy, presents una-neural-chat-v3-3. Powered by UNA (Uniform Neural Alignment), using the Zephyr trainer and allenai/ultrafeedback_binarized_cleaned.. and JUST THAT.
It outperforms its base model without adding any data.. just the UNA algorithm on top of the Transformers library (see the loading sketch after the settings below).
UNA Settings:
- MLP : 0.05
- ATT : 0.03
- LNOR : 0.02
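Since the card only states that the model runs on the standard Transformers stack, here is a minimal loading-and-generation sketch. The repository ID below is a placeholder assumption (substitute the actual Hub ID), and the chat formatting simply reuses whatever template ships with the tokenizer.

```python
# Minimal sketch: load the model with Hugging Face Transformers and generate a reply.
# NOTE: the repo ID below is a placeholder assumption; replace it with the real Hub ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "OneManArmy/una-neural-chat-v3-3"  # assumption, not stated in this card

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # a 7B model fits comfortably in bf16 on a single modern GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain what Uniform Neural Alignment changes in a model."}]
# Reuse whatever chat template the tokenizer ships with; fall back to plain text if none is defined.
try:
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
except Exception:
    prompt = messages[0]["content"]

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```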
una-neural-chat-v3-3
This model is a fine-tuned version of Intel/neural-chat-7b-v3-3 on the allenai/ultrafeedback_binarized_cleaned dataset.
It achieves the following results on the evaluation set (DPO-style preference metrics; see the note after this list):
- Loss: 0.4524
- Rewards/chosen: -0.7101
- Rewards/rejected: -2.0953
- Rewards/accuracies: 0.7831
- Rewards/margins: 1.3852
- Logps/rejected: -321.5471
- Logps/chosen: -327.5048
- Logits/rejected: -2.6445
- Logits/chosen: -2.6674
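These are standard DPO-style preference-optimization metrics (the Zephyr recipe trains with DPO). As a rough, hedged illustration of how they relate, assuming the usual definitions and an unstated KL penalty beta: each reward is beta times the policy-vs-reference log-probability gap on a response, the margin is chosen minus rejected, and the accuracy is the fraction of pairs where the chosen reward wins.

```python
import torch

def dpo_eval_metrics(policy_chosen_logps, policy_rejected_logps,
                     ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Illustrative reconstruction of the reported metrics (beta=0.1 is an assumption).

    Each argument is a 1-D tensor of summed per-token log-probabilities, one entry per example.
    """
    rewards_chosen = beta * (policy_chosen_logps - ref_chosen_logps)
    rewards_rejected = beta * (policy_rejected_logps - ref_rejected_logps)
    margins = rewards_chosen - rewards_rejected
    accuracies = (rewards_chosen > rewards_rejected).float()
    return {
        "rewards/chosen": rewards_chosen.mean().item(),
        "rewards/rejected": rewards_rejected.mean().item(),
        "rewards/margins": margins.mean().item(),
        "rewards/accuracies": accuracies.mean().item(),
        "logps/chosen": policy_chosen_logps.mean().item(),
        "logps/rejected": policy_rejected_logps.mean().item(),
    }
```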
Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.5431 | 0.2 | 380 | 0.4900 | -0.6823 | -1.6613 | 0.7607 | 0.9790 | -317.2069 | -327.2263 | -2.6478 | -2.6651 |
| 0.4369 | 0.4 | 760 | 0.4783 | -0.7562 | -2.1298 | 0.7719 | 1.3737 | -321.8924 | -327.9652 | -2.7370 | -2.7562 |
| 0.4005 | 0.6 | 1140 | 0.4697 | -0.6913 | -2.0134 | 0.7770 | 1.3221 | -320.7278 | -327.3167 | -2.7067 | -2.7224 |
| 0.3759 | 0.8 | 1520 | 0.4568 | -0.7387 | -2.0643 | 0.7882 | 1.3256 | -321.2370 | -327.7909 | -2.6626 | -2.6829 |
| 0.5213 | 1.0 | 1900 | 0.4524 | -0.7101 | -2.0953 | 0.7831 | 1.3852 | -321.5471 | -327.5048 | -2.6445 | -2.6674 |
Framework versions
- Transformers 4.35.0-UNA
- Pytorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1