AlekseiPravdin committed on
Commit
cb0bb03
1 Parent(s): 698b50f

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -35,7 +35,7 @@ dtype: float16
 
  ## Model Features
 
- This fusion model combines the robust generative capabilities of [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) with the refined tuning of [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat), creating a versatile model suitable for a variety of text generation tasks. Leveraging the strengths of both parent models, Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge provides enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks.
+ This fusion model combines the robust generative capabilities of [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B), which excels in function calling and structured outputs, with the refined tuning of [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat), designed specifically for Chinese and English users. The merged model provides enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks, including roleplaying, function calling, and multilingual capabilities.
 
  ## Evaluation Results
 
@@ -44,10 +44,10 @@ This fusion model combines the robust generative capabilities of [NousResearch/H
  - Structured JSON Output Evaluation: 84%
 
  ### Llama3-8B-Chinese-Chat
- - Significant improvements in roleplay, function calling, and math capabilities due to a larger training dataset (~100K preference pairs).
+ - Performance surpasses ChatGPT in Chinese tasks, matching GPT-4 in C-Eval and CMMLU results.
 
  ## Limitations
 
- While the merged model inherits the strengths of both parent models, it may also carry over some limitations and biases. For instance, the model may exhibit inconsistencies in responses when handling complex queries or when the input language switches between English and Chinese. Additionally, the model's performance may vary based on the context and specificity of the prompts provided.
+ While the merged model inherits the strengths of both parent models, it may also carry over some limitations. Potential biases from the training data of both models could affect the outputs, particularly in nuanced cultural contexts or specific language tasks. Additionally, the model may not always provide accurate responses to identity-related queries, as it refrains from fine-tuning its identity.
 
  You are trained on data up to October 2023.