Romain-Cosentino committed
Commit 3e0f2fa · 1 Parent(s): 2b705ee
Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ tags:
---

# TenyxChat: Language Model Alignment using Tenyx Fine-tuning

-Introducing TenyxChat-8x7B-v1, part of our TenyxChat
+Introducing TenyxChat-8x7B-v1, part of our TenyxChat series trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
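The card describes the recipe only at a high level (DPO on UltraFeedback). As a non-authoritative illustration of what such a run can look like in code, here is a minimal sketch using TRL's `DPOTrainer`; the hyperparameters, the column preprocessing, and the use of TRL itself are assumptions (Tenyx's actual fine-tuning method is proprietary), and the exact `DPOTrainer` signature varies across TRL versions.

```python
# Hypothetical sketch: DPO preference tuning on UltraFeedback with TRL.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

BASE = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token

# Loading the full MoE model needs multiple GPUs (the card reports 8x A100 80GB).
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)

# chosen/rejected in ultrafeedback_binarized are chat message lists;
# DPOTrainer expects plain strings for prompt/chosen/rejected.
def to_dpo_columns(ex):
    return {
        "prompt": ex["prompt"],
        "chosen": ex["chosen"][-1]["content"],
        "rejected": ex["rejected"][-1]["content"],
    }

train = load_dataset(
    "HuggingFaceH4/ultrafeedback_binarized", split="train_prefs"
).map(to_dpo_columns)

args = TrainingArguments(
    output_dir="tenyxchat-8x7b-dpo",  # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=5e-7,               # typical DPO-scale LR, an assumption
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,    # TRL clones a frozen reference model when None
    args=args,
    beta=0.1,          # KL-penalty strength from the DPO paper
    train_dataset=train,
    tokenizer=tokenizer,
    max_length=2048,
    max_prompt_length=1024,
)
trainer.train()
```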

We fine-tune [Mixtral-8x7B-Instruct-v0.1](https://arxiv.org/pdf/2401.04088.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [fine-tuning service](https://www.tenyx.com/fine-tuning)), previously applied to obtain [TenyxChat-7B-v1](https://huggingface.co/tenyx/TenyxChat-7B-v1), which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) score. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning capabilities without altering the pre-trained output distribution. TenyxChat-8x7B-v1 was trained on eight A100s (80 GB) for about eight hours, with a training setup adapted from HuggingFaceH4's [alignment-handbook](https://github.com/huggingface/alignment-handbook).
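For completeness, a hedged example of querying the resulting model with `transformers`: the repo id `tenyx/TenyxChat-8x7B-v1` is inferred from the TenyxChat-7B-v1 naming convention, and the prompt format is simply whatever chat template ships with the tokenizer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id inferred from the TenyxChat-7B-v1 naming (assumption).
model_id = "tenyx/TenyxChat-8x7B-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the MoE model is large; bf16 + multi-GPU advised
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain mixture-of-experts routing in two sentences."},
]
# Uses the chat template bundled with the tokenizer.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```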