AlekseiPravdin committed
Commit b449d2b · 1 parent: 455053c

Upload folder using huggingface_hub

README.md CHANGED
@@ -35,21 +35,44 @@ parameters:
dtype: float16
```

## Model Details

Hermes-2-Pro-Llama-3-8B is an upgraded version of the original Hermes model, designed for enhanced conversational capabilities and function calling. It excels in generating structured outputs and has been fine-tuned on a diverse dataset, including the OpenHermes 2.5 dataset. The model is particularly adept at handling complex queries and providing coherent responses.

Llama3-8B-Chinese-Chat, on the other hand, is specifically fine-tuned for Chinese and English users, focusing on roleplaying and tool-using capabilities. It has been trained on a significantly larger dataset, improving its performance in various tasks, including math and function calling.

## Description

The merged model combines the strengths of both parent models, providing a robust solution for multilingual text generation and understanding. It leverages the advanced generative capabilities of Hermes-2-Pro while incorporating the specialized training of Llama3-8B-Chinese-Chat, making it suitable for a wide range of applications, from casual conversation to structured data generation.

## Use Cases

- **Conversational AI**: Engage users in natural dialogues across multiple languages.
- **Function Calling**: Execute predefined functions based on user queries, enhancing interactivity.
- **Structured Outputs**: Generate JSON or other structured formats for data processing tasks.
- **Roleplaying**: Simulate characters or scenarios in both English and Chinese.

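
To make the function-calling use case concrete, here is a minimal sketch of how an application might route a model-emitted call to a predefined function. It assumes the model's reply can be reduced to a JSON object with `name` and `arguments` keys; that shape, the `get_weather` tool, and the `dispatch_tool_call` helper are hypothetical illustrations, not a format documented for this merged model.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned string instead of calling a real API."""
    return f"Sunny in {city}"


# Registry of the functions the application is willing to expose to the model.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool call and run the matching registered function."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        # Refuse anything that was not explicitly registered.
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Stand-in for text the model might emit; not captured from a real run.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(model_output)
print(result)
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose, which is usually the safer default.
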
## Model Features

- **Multilingual Support**: Capable of understanding and generating text in both English and Chinese.
- **Enhanced Context Understanding**: Improved ability to maintain context over longer conversations.
- **Function Calling**: Supports advanced function calling capabilities for dynamic interactions.
- **Structured Output Generation**: Can produce outputs in structured formats like JSON.

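
Structured output is only useful downstream if it is checked before use. The sketch below shows one way to validate a JSON reply with the standard library alone. The field names in `EXPECTED_FIELDS`, the `parse_structured_reply` helper, and the sample reply are all assumptions made for illustration; none of them come from this model card.

```python
import json

# Hypothetical schema: field name -> Python type the field must have.
EXPECTED_FIELDS = {"title": str, "language": str, "tags": list}


def parse_structured_reply(reply: str) -> dict:
    """Parse a model reply as JSON and check it against EXPECTED_FIELDS."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


# Stand-in for a bilingual model reply; not captured from a real run.
reply = '{"title": "你好，世界", "language": "zh", "tags": ["greeting"]}'
record = parse_structured_reply(reply)
print(record["title"])
```

Raising a single `ValueError` type for every failure mode keeps the caller's retry logic simple: any rejected reply can be handled the same way, for example by re-prompting.
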
## Evaluation Results

### Hermes-2-Pro-Llama-3-8B
- Scored 90% on function calling evaluation.
- Achieved 84% on structured JSON output evaluation.

### Llama3-8B-Chinese-Chat
- Demonstrated superior performance in Chinese language tasks, surpassing previous models in various benchmarks.

## Limitations

While the merged model offers significant improvements, it may still inherit some limitations from its parent models, including:
- Potential biases present in the training data.
- Challenges in handling highly specialized or niche topics.
- Variability in performance based on the complexity of user queries.

Users are encouraged to provide feedback and report any issues encountered during usage to facilitate ongoing improvements.