DisOOM committed on
Commit d8b1da4
1 Parent(s): 6d05e07

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -14,7 +14,7 @@ language:
  library_name: transformers
  ---
  # Qwen1.5-124B-Chat-Merge
- **--This is a 124b frankenmerge of [qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat) created by interleaving layers of [qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat) with itself using mergekit.--**
+ **--This is a 124b frankenmerge of [qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat) created by interleaving layers of [qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat) with itself using [mergekit](https://github.com/arcee-ai/mergekit).--**
 
  *Inspired by other frankenmerge models like [**goliath-120b**](https://huggingface.co/alpindale/goliath-120b) and [**miqu-1-120b**](https://huggingface.co/wolfram/miqu-1-120b)*
 
@@ -53,6 +53,6 @@ slices:
  ```
  **-Performance**
 
- * Tips:I don't have the capability to conduct benchmark tests, nor can I even use it extensively enough, so my test results might not be entirely accurate.
+ * Tips: I don't have the capability to run benchmark tests, nor can I use the model extensively enough, so my test results may not be accurate.
 
- It has better performance than the 72B version in most of my own tests (subjective) including comprehension, reasoning and coherence. But the improvement in logic and reasoning doesn't seem as significant as I imagined; Trying other merge recipes might produce more significant effects.
+ It performs better than the 72B version in most of my own (subjective) tests, including comprehension, reasoning and coherence. But the improvement doesn't seem as significant as I had imagined (I've only run a few tests). If you're curious about this model's performance, feel free to test it and share your evaluations; everyone's tests and evaluations are welcome.
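For readers unfamiliar with the technique, the `slices:` section referenced in the second hunk is a mergekit passthrough recipe that builds the frankenmerge by stacking overlapping layer ranges of the base model on top of itself. Below is a minimal sketch of what such a self-interleave config looks like; the slice count and layer ranges are hypothetical placeholders for illustration, not the actual recipe from this README.

```yaml
# Hypothetical mergekit passthrough recipe sketch: interleave
# Qwen1.5-72B-Chat with itself by stacking overlapping layer ranges.
# The ranges below are illustrative placeholders, not the recipe
# actually used for Qwen1.5-124B-Chat-Merge.
slices:
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [0, 20]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [10, 30]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [20, 40]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [30, 50]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [40, 60]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [50, 70]
  - sources:
      - model: Qwen/Qwen1.5-72B-Chat
        layer_range: [60, 80]
merge_method: passthrough
dtype: bfloat16
```

A config like this is typically run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./merged-model`. Each slice copies a contiguous block of layers from the source model, so overlapping ranges duplicate layers and grow the parameter count beyond the 72B base, which is how goliath-120b-style merges reach their larger sizes.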