chen.zheng1 committed
Commit e3d3c97 • 1 Parent(s): e65434e

update files

README.md CHANGED
@@ -6,18 +6,19 @@ license: mit
 
 ## Model Details
 
-Mistral-C2F is a chat assistant trained by Reinforcement Learning from Human Feedback (RLHF) using the Mistral-7B base model as the backbone.
+**Mistral-C2F** is a chat assistant trained by Reinforcement Learning from Human Feedback (RLHF) using the Mistral-7B base model as the backbone.
 
-- Mistral-C2F adopts an innovative approach
-- The goal of **Coarse to Fine** LLM is to transform a base LLM, which currently lacks integrated analytical and reasoning capabilities, into
+- Mistral-C2F adopts an innovative approach with a two-step **Coarse to Fine** Analytical and Reasoning Enhancement LLM approach.
+- The goal of the **Coarse to Fine** LLM is to transform a base LLM, which currently lacks integrated analytical and reasoning capabilities, into an RLHF LLM that aligns with human preferences and significantly enhances its analytical and reasoning abilities.
 - First step: **Coarse Actor** Analytical and Reasoning LLM.
--
--
+- We introduce the concept of "Continuous Maximization" (CM) in the direct application of RLHF to the base model.
+- The Coarse Actor cannot be used directly for responses or answers because it serves as a knowledge-rich "pool," designed to enhance analytical and reasoning abilities. Extending the output to its length limits often results in excessive redundant information, as the model continues generating similar text without adequate termination.
 - Second step: **Fine Actor** Knowledge Refining LLM.
-- After the output from the Coarse Actor is generated, it is merged with existing Instruction model through new strategy 'Knowledge Residue Merger'.
-- Knowledge Residue Merger allows the Coarse Actor to integrate its detailed analytical reasoning into the existing LLM model.
+- After the output from the Coarse Actor is generated, it is merged with the existing Instruction model through a new strategy called 'Knowledge Residue Merger'.
+- The Knowledge Residue Merger allows the Coarse Actor to integrate its detailed analytical reasoning into the existing LLM model.
 - The Coarse-to-Fine actor approach retains the inherent advantages of the existing model while significantly enhancing its analytical and reasoning capabilities in dialogue.
 
+
 
 - Mistral-C2F uses the [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) model as its backbone.
 - License: Mistral-C2F is licensed under the same license as the [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) model.
@@ -57,6 +58,11 @@ More importantly, Mistral-C2F is publicly available through HuggingFace for prom
 
 ## Case Study
 
+## Case Study in General Dialogue.
+
+![case](images/case.png)
+
+
 ### Case Study for Coarse Actor Generation.
 
 ![coarse](images/coarse.png)
@@ -65,4 +71,5 @@ More importantly, Mistral-C2F is publicly available through HuggingFace for prom
 
 ![fine1](images/fine1.png)
 
+
 ![fine2](images/fine2.png)