---
license: apache-2.0
tags:
- merge
- model_stock
- DarkStock
- Aspire
- Storm
- Llama3
- DarkEnigma
- instruction-following
- creative-writing
- coding
- roleplaying
- long-form-generation
- research
- bfloat16
base_model:
- rityak/L3.1-DarkStock-8B
- DreadPoor/Aspire-8B-model_stock
- akjindal53244/Llama-3.1-Storm-8B
- agentlans/Llama3.1-Dark-Enigma
library_name: transformers
language:
- en
datasets:
- openbuddy/openbuddy-llama3.1-8b-v22.2-131k
- THUDM/LongWriter-llama3.1-8b
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
pipeline_tag: text-generation
---
# 🌩️ **Llama3.1-DarkStorm-Aspire-8B** 🌟

Welcome to **Llama3.1-DarkStorm-Aspire-8B**, an advanced and versatile **8B-parameter** model merge built for research, writing, coding, and creative tasks. It blends the best qualities of the **Dark Enigma**, **Storm**, and **Aspire** models on the strong foundation of **DarkStock**, producing coherent, context-aware, and imaginative outputs.

## 🚀 **Model Overview**

**Llama3.1-DarkStorm-Aspire-8B** performs well across a wide variety of tasks:

- **Research and Analysis**: Analyzing textual data, planning experiments, and brainstorming complex ideas.
- **Creative Writing and Roleplaying**: Immersive storytelling and generating roleplaying scenarios.
- **General AI Applications**: Any application that calls for advanced reasoning, instruction-following, and creativity.

---
## 🧬 **Model Family**

This merge incorporates the finest elements of the following models:

- **[Llama3.1-Dark-Enigma](https://huggingface.co/agentlans/Llama3.1-Dark-Enigma)**: Known for its versatility across creative, research, and coding tasks; specializes in role-playing and simulating scenarios.
- **[Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)**: Finely tuned for structured reasoning, enhanced conversational capabilities, and agentic tasks.
- **[Aspire-8B](https://huggingface.co/DreadPoor/Aspire-8B-model_stock)**: Renowned for high-quality generation across creative and technical domains.
- **[L3.1-DarkStock-8B](https://huggingface.co/rityak/L3.1-DarkStock-8B)**: The base model, providing a sturdy and balanced core of instruction-following and narrative generation.

---
## ⚙️ **Merge Details**

This model starts from the Model Stock-derived base **L3.1-DarkStock-8B** and combines the component models with the **TIES** merge method, balancing each model's unique strengths so that the self-attention and MLP layers integrate smoothly.

### **Merge Configuration**
```yaml
base_model: rityak/L3.1-DarkStock-8B
dtype: bfloat16
merge_method: ties
models:
  - model: agentlans/Llama3.1-Dark-Enigma
    parameters:
  # … (per-model density/weight entries truncated in the source diff) …
    parameters:
      density: 0.5
      weight: 0.1
out_dtype: float16
```
The **TIES method** resolves sign conflicts between the models' weight deltas, allowing their specializations to blend without interfering with one another. The merge is computed in **bfloat16**, and the final weights are written out in **float16** (`out_dtype`), keeping the checkpoint compact with minimal quality loss.
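As an illustration only (a toy sketch, not the implementation used for this merge), the three TIES steps can be put in NumPy: trim each model's low-magnitude weight deltas, elect a per-parameter sign, then average only the deltas that agree with that sign. The function name `ties_merge` and the toy vectors below are made up for this sketch:

```python
import numpy as np

def ties_merge(base, tuned, density=0.5):
    """Toy TIES merge over flat weight vectors (illustrative sketch only)."""
    deltas = [t - base for t in tuned]
    # 1) Trim: keep only the top `density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = int(round(density * d.size))
        kept = np.zeros_like(d)
        top = np.argsort(np.abs(d))[-k:]
        kept[top] = d[top]
        trimmed.append(kept)
    # 2) Elect sign: per parameter, the sign of the summed trimmed deltas.
    sign = np.sign(sum(trimmed))
    # 3) Disjoint merge: average only the trimmed deltas agreeing with the sign.
    num = np.zeros_like(base)
    den = np.zeros_like(base)
    for t in trimmed:
        agree = (np.sign(t) == sign) & (t != 0)
        num += np.where(agree, t, 0.0)
        den += agree.astype(float)
    merged = np.where(den > 0, num / np.maximum(den, 1.0), 0.0)
    return base + merged

base = np.array([0.0, 1.0, -1.0, 2.0])
tuned = [base + np.array([0.5, 0.0, -0.5, 0.0]),  # pulls param 2 negative
         base + np.array([0.3, 0.2, 0.6, 0.0])]   # pulls param 2 positive
print(ties_merge(base, tuned))  # conflicting param 2 follows the elected sign
```

Real TIES implementations work tensor by tensor and apply the per-model `density` and `weight` values from the merge configuration.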

---

|
94 |
+
|
95 |
+
1. **Instruction Following & Reasoning**: Leveraging **DarkStock**'s structured capabilities, this model excels in handling complex reasoning tasks and providing precise instruction-based outputs.
|
96 |
+
|
97 |
+
2. **Creative Writing & Role-Playing**: The combination of **Aspire** and **Dark Enigma** offers powerful storytelling and roleplaying support, making it an ideal tool for immersive worlds and character-driven narratives.
|
98 |
+
|
99 |
+
3. **High-Quality Output**: The model is designed to provide coherent, context-aware responses, ensuring high-quality results across all tasks, whether it’s a research task, creative writing, or coding assistance.
|
100 |
+
|
101 |
+
---
|
102 |
+
|
103 |
+
## 📊 **Model Use Cases**
|
104 |
+
|
105 |
+
**Llama3.1-DarkStorm-Aspire-8B** is suitable for a wide range of applications:
|
106 |
+
|
107 |
+
- **Creative Writing & Storytelling**: Generate immersive stories, role-playing scenarios, or fantasy world-building with ease.
|
108 |
+
- **Technical Writing & Research**: Analyze text data, draft research papers, or brainstorm ideas with structured reasoning.
|
109 |
+
- - **Conversational AI**: Use this model to simulate engaging and contextually aware conversations.
|
110 |
+
|
111 |
+
---
|
112 |
+
|
113 |
+
## 📝 **Training Data**
|
114 |
+
|
115 |
+
The models included in this merge were each trained on diverse datasets:
|
116 |
+
|
117 |
+
- **Llama3.1-Dark-Enigma** and **Storm-8B** were trained on a mix of high-quality, public datasets, with a focus on creative and technical content.
|
118 |
+
- **Aspire-8B** emphasizes a balance between creative writing and technical precision, making it a versatile addition to the merge.
|
119 |
+
- **DarkStock** provided a stable base, finely tuned for instruction-following and diverse general applications.
|
120 |
+
|
121 |
+
---
|
## ⚠️ **Limitations & Responsible AI Use**

As with any AI model, it's important to understand the limitations of **Llama3.1-DarkStorm-Aspire-8B**:

- **Bias**: Although the model was trained on diverse data, biases in that data may surface in its output. Critically evaluate its responses in sensitive scenarios.
- **Fact-based Tasks**: For fact-checking and knowledge-driven tasks, careful prompting may be needed to avoid hallucinations or inaccuracies.
- **Sensitive Content**: This model takes an uncensored approach, so exercise caution with potentially sensitive or offensive content.

---
## 🛠️ **How to Use**

You can load the model with Hugging Face's `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-model-id"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16 to halve memory relative to float32;
# device_map="auto" (requires accelerate) places weights on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain the importance of data privacy in AI development."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For best results, load the model in **bfloat16** on hardware that supports it; the checkpoint's **float16** `out_dtype` also works on older GPUs.
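Either way, both formats use two bytes per parameter. A quick back-of-the-envelope check (the parameter count below is the approximate Llama 3.1 8B figure, an assumption not stated in this card) shows why the weights alone call for a ~16 GB GPU:

```python
# Rough weight-memory estimate for an 8B-parameter model in a 16-bit format.
params = 8_030_000_000   # approximate Llama 3.1 8B parameter count (assumption)
bytes_per_param = 2      # bfloat16 and float16 are both 16-bit formats
gib = params * bytes_per_param / 1024**3
print(f"~{gib:.1f} GiB for weights alone")  # KV cache and activations add more
```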

---

## 📜 **License**

This model is open-sourced under the **Apache 2.0 License**, allowing free use, distribution, and modification with proper attribution.

---

## 💡 **Get Involved**

We're excited to see how the community uses **Llama3.1-DarkStorm-Aspire-8B** in various creative and technical applications. Be sure to share your feedback and improvements with us on the Hugging Face model page!

---