AlekseiPravdin committed on
Commit
0cce991
1 Parent(s): cb0bb03

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +29 -10
README.md CHANGED
@@ -10,7 +10,9 @@ tags:
10
 
11
  # Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge
12
 
13
- Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge is an advanced language model created through a strategic fusion of two distinct models: [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) and [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat). The merging process was executed using [mergekit](https://github.com/cg123/mergekit), a specialized tool designed for precise model blending to achieve optimal performance and synergy between the merged architectures.
14
 
15
  ## 🧩 Merge Configuration
16
 
@@ -33,21 +35,38 @@ parameters:
33
  dtype: float16
34
  ```
35
 
36
  ## Model Features
37
 
38
- This fusion model combines the robust generative capabilities of [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B), which excels in function calling and structured outputs, with the refined tuning of [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat), designed specifically for Chinese and English users. The merged model provides enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks, including roleplaying, function calling, and multilingual capabilities.
39
 
40
  ## Evaluation Results
41
 
42
- ### Hermes-2-Pro-Llama-3-8B
43
- - Function Calling Evaluation: 90%
44
- - Structured JSON Output Evaluation: 84%
45
-
46
- ### Llama3-8B-Chinese-Chat
47
- - Performance surpasses ChatGPT in Chinese tasks, matching GPT-4 in C-Eval and CMMLU results.
48
 
49
  ## Limitations
50
 
51
- While the merged model inherits the strengths of both parent models, it may also carry over some limitations. Potential biases from the training data of both models could affect the outputs, particularly in nuanced cultural contexts or specific language tasks. Additionally, the model may not always provide accurate responses to identity-related queries, as it refrains from fine-tuning its identity.
52
 
53
- You are trained on data up to October 2023.
 
10
 
11
  # Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge
12
 
13
+ Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
14
+ * [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
15
+ * [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat)
16
 
17
  ## 🧩 Merge Configuration
18
 
 
35
  dtype: float16
36
  ```
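
For readers unfamiliar with the `slerp` in the model name, the following is a minimal sketch of spherical linear interpolation between two weight vectors. It is illustrative only: the function name `slerp`, the `eps` threshold, and the fallback to linear interpolation are assumptions made for this example, and mergekit's actual implementation may differ.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (hypothetical helper)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # A zero-length vector has no direction, so fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    # Nearly parallel or antiparallel vectors: the slerp weights are unstable.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors: roughly [0.707, 0.707].
blended = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

With `t = 0` the result equals the first vector and with `t = 1` the second; intermediate values follow the arc between them rather than the straight line a plain average would take.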
37
 
38
+ ## Model Details
39
+
40
+ Hermes-2-Pro is an upgraded version of the Nous Hermes model, designed for general task and conversation capabilities, with a focus on function calling and structured outputs. It has been fine-tuned on a cleaned version of the OpenHermes 2.5 dataset, achieving high scores in function calling evaluations. Llama3-8B-Chinese-Chat is an instruction-tuned model specifically for Chinese and English users, excelling in roleplaying and tool-using tasks.
41
+
42
+ ## Description
43
+
44
+ The merged model combines the generative capabilities of Hermes-2-Pro with the Chinese and English tuning of Llama3-8B-Chinese-Chat. It is intended as a general-purpose model for both English and Chinese text generation across NLP tasks.
45
+
46
+ ## Use Cases
47
+
48
+ - **Conversational AI**: Engage users in natural dialogue in both English and Chinese.
49
+ - **Function Calling**: Emit calls to predefined functions based on user queries, which the host application then executes.
50
+ - **Roleplaying**: Simulate characters or scenarios in a conversational context.
51
+ - **Text Generation**: Generate creative content, including stories, poems, and structured outputs.
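
As a rough illustration of the function calling use case, the sketch below shows how a host application might dispatch a call emitted by the model. The JSON shape (`name` plus `arguments`), the `TOOLS` registry, and the `get_weather` function are all hypothetical and chosen for this example; the format the model actually emits is not specified here, so consult the parent model cards before relying on it.

```python
import json


def get_weather(city):
    # Hypothetical stand-in for a real tool; returns a canned string.
    return f"Sunny in {city}"


# Registry of functions the application is willing to run (assumed for this sketch).
TOOLS = {"get_weather": get_weather}


def dispatch(raw_call):
    """Parse a JSON function call and run the matching registered function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        # Never execute a function the application did not register.
        raise ValueError(f"unknown function: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed model output, hard-coded here so the example runs without a model.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose.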
52
+
53
  ## Model Features
54
 
55
+ - **Bilingual Capabilities**: Supports both English and Chinese, making it suitable for diverse user bases.
56
+ - **Function Calling**: Enhanced ability to perform actions based on user input, improving user experience.
57
+ - **Structured Outputs**: Capable of generating outputs in specific formats, such as JSON, for easier integration into applications.
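
Structured output from any language model can still be malformed, so applications typically validate it before use. The following is a minimal sketch under that assumption; `parse_structured` and the `required` field map are hypothetical names introduced for this example and are not part of the model or of any library.

```python
import json


def parse_structured(text, required):
    """Return the parsed JSON object if it has the required typed fields, else None."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None  # Not valid JSON at all.
    if not isinstance(data, dict):
        return None  # Valid JSON, but not an object.
    for key, expected_type in required.items():
        # A missing key yields None, which fails the type check below.
        if not isinstance(data.get(key), expected_type):
            return None
    return data


schema = {"title": str, "year": int}
ok = parse_structured('{"title": "Report", "year": 2024}', schema)
bad = parse_structured('{"title": "Report"}', schema)
```

Returning `None` instead of raising lets the caller decide whether to retry the generation or report an error.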
58
 
59
  ## Evaluation Results
60
 
61
+ - **Hermes-2-Pro**: Scored 90% on function calling evaluations and 84% on structured JSON output evaluations.
62
+ - **Llama3-8B-Chinese-Chat**: Demonstrated superior performance in Chinese language tasks, surpassing previous models in roleplay and function calling capabilities.
63
 
64
  ## Limitations
65
 
66
+ While the merged model inherits the strengths of both parent models, it may also carry over some limitations, including:
67
+
68
+ - **Biases**: Potential biases present in the training data of both models may affect the outputs.
69
+ - **Contextual Understanding**: Although improved, the model may still struggle with highly nuanced or context-specific queries.
70
+ - **Performance Variability**: Performance may vary based on the complexity of the task and the language used.
71
 
72
+ This model combines features of both parent models into a single bilingual conversational model for the applications listed above.