ZeroXClem committed
Commit c589004
1 Parent(s): 6be4008

Update README.md

Files changed (1): README.md (+134 −19)
README.md CHANGED
@@ -2,25 +2,70 @@
  license: apache-2.0
  tags:
  - merge
- - mergekit
- - lazymergekit
- - agentlans/Llama3.1-Dark-Enigma
- - akjindal53244/Llama-3.1-Storm-8B
- - DreadPoor/Aspire-8B-model_stock
  - rityak/L3.1-DarkStock-8B
  ---

- # ZeroXClem/Llama3.1-DarkStorm-Aspire-8B

- ZeroXClem/Llama3.1-DarkStorm-Aspire-8B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
- * [agentlans/Llama3.1-Dark-Enigma](https://huggingface.co/agentlans/Llama3.1-Dark-Enigma)
- * [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)
- * [DreadPoor/Aspire-8B-model_stock](https://huggingface.co/DreadPoor/Aspire-8B-model_stock)
- * [rityak/L3.1-DarkStock-8B](https://huggingface.co/rityak/L3.1-DarkStock-8B)

- ## 🧩 Configuration

  ```yaml
  models:
  - model: agentlans/Llama3.1-Dark-Enigma
    parameters:
@@ -38,12 +83,82 @@ models:
    parameters:
      density: 0.5
      weight: 0.1
-
- merge_method: ties
- base_model: rityak/L3.1-DarkStock-8B
- dtype: bfloat16
- parameters:
-   normalize: true
  out_dtype: float16

- ```
 
  license: apache-2.0
  tags:
  - merge
+ - model_stock
+ - DarkStock
+ - Aspire
+ - Storm
+ - Llama3
+ - DarkEnigma
+ - instruction-following
+ - creative-writing
+ - coding
+ - roleplaying
+ - long-form-generation
+ - research
+ - bfloat16
+ base_model:
  - rityak/L3.1-DarkStock-8B
+ - DreadPoor/Aspire-8B-model_stock
+ - akjindal53244/Llama-3.1-Storm-8B
+ - agentlans/Llama3.1-Dark-Enigma
+ library_name: transformers
+ language:
+ - en
+ datasets:
+ - openbuddy/openbuddy-llama3.1-8b-v22.2-131k
+ - THUDM/LongWriter-llama3.1-8b
+ - aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
+ pipeline_tag: text-generation
+ ---
+
+
+ # 🌩️ **Llama3.1-DarkStorm-Aspire-8B** 🌟
+
+ Welcome to **Llama3.1-DarkStorm-Aspire-8B**, a versatile **8B-parameter** model created by merging several strong Llama 3.1 fine-tunes, designed for research, writing, coding, and creative tasks. The merge blends the best qualities of the **Dark Enigma**, **Storm**, and **Aspire** models on the strong foundation of **DarkStock**, and with balanced integration it produces coherent, context-aware, and imaginative outputs.
+
+ ## 🚀 **Model Overview**
+
+ **Llama3.1-DarkStorm-Aspire-8B** is intended to perform well across a wide variety of tasks:
+
+ - **Research and Analysis**: Analyzing textual data, planning experiments, and brainstorming complex ideas.
+ - **Creative Writing and Roleplaying**: Creative writing, immersive storytelling, and generating roleplaying scenarios.
+ - **General AI Applications**: Any application where advanced reasoning, instruction-following, and creativity are needed.
+
+ ---
+
+ ## 🧬 **Model Family**
+
+ This merge incorporates the following models:
+
+ - **[Llama3.1-Dark-Enigma](https://huggingface.co/agentlans/Llama3.1-Dark-Enigma)**: Versatile across creative, research, and coding tasks; specializes in role-playing and simulating scenarios.
+ - **[Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)**: Fine-tuned for structured reasoning, enhanced conversational capabilities, and agentic tasks.
+ - **[Aspire-8B](https://huggingface.co/DreadPoor/Aspire-8B-model_stock)**: Known for high-quality generation across creative and technical domains.
+ - **[L3.1-DarkStock-8B](https://huggingface.co/rityak/L3.1-DarkStock-8B)**: The base model, providing a sturdy, balanced core of instruction-following and narrative generation.
+
 ---

+ ## ⚙️ **Merge Details**
+
+ This model was created with [mergekit](https://github.com/cg123/mergekit) using the **TIES merge method**, with **L3.1-DarkStock-8B** as the base model. TIES blends each component model's weight deltas across the self-attention and MLP layers, with per-model density and weight parameters balancing their unique strengths for smooth integration.
+
+ ### **Merge Configuration**:
+
  ```yaml
+ base_model: rityak/L3.1-DarkStock-8B
+ dtype: bfloat16
+ merge_method: ties
  models:
  - model: agentlans/Llama3.1-Dark-Enigma
    parameters:

    parameters:
      density: 0.5
      weight: 0.1
  out_dtype: float16
+ ```
+
+ The **TIES method** resolves sign conflicts between each model's weight deltas, allowing smooth interpolation across their capabilities. The merge computes in **bfloat16** for efficient processing and writes the final output in **float16**, keeping performance high without sacrificing precision.
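For intuition, the three TIES steps (trim low-magnitude deltas, elect a per-parameter sign, then merge only the agreeing deltas) can be sketched on flat parameter vectors. This toy NumPy version is only an illustration of the algorithm, not mergekit's actual implementation:

```python
import numpy as np

def ties_merge(base, deltas, weights, density=0.5):
    """Toy TIES merge over flat parameter vectors.

    base:    base model parameters
    deltas:  per-model task vectors (model - base)
    weights: per-model merge weights (as in the YAML config)
    """
    trimmed = []
    for d in deltas:
        # Trim: keep only the top-`density` fraction of entries by magnitude.
        k = max(int(len(d) * density), 1)
        thresh = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack([w * t for w, t in zip(weights, trimmed)])
    # Elect: per-parameter sign of the weighted sum of trimmed deltas.
    elected = np.sign(stacked.sum(axis=0))
    # Merge: average only the deltas that agree with the elected sign.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    denom = np.maximum(agree.sum(axis=0), 1)
    return base + (stacked * agree).sum(axis=0) / denom
```

With `density: 0.5`, half of each task vector's entries are zeroed before sign election, which is what the `density` fields in the configuration above control.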
+
+ ---
+
+ ## 🌟 **Key Features**
+
+ 1. **Instruction Following & Reasoning**: Leveraging **DarkStock**'s structured capabilities, the model handles complex reasoning tasks and precise instruction-based outputs.
+
+ 2. **Creative Writing & Role-Playing**: The combination of **Aspire** and **Dark Enigma** offers strong storytelling and roleplaying support, well suited to immersive worlds and character-driven narratives.
+
+ 3. **High-Quality Output**: Designed for coherent, context-aware responses, whether the task is research, creative writing, or coding assistance.
+
+ ---
+
+ ## 📊 **Model Use Cases**
+
+ **Llama3.1-DarkStorm-Aspire-8B** is suitable for a wide range of applications:
+
+ - **Creative Writing & Storytelling**: Generate immersive stories, role-playing scenarios, or fantasy world-building with ease.
+ - **Technical Writing & Research**: Analyze text data, draft research papers, or brainstorm ideas with structured reasoning.
+ - **Conversational AI**: Simulate engaging and contextually aware conversations.
+
+ ---
+
+ ## 📝 **Training Data**
+
+ The models included in this merge were each trained on diverse datasets:
+
+ - **Llama3.1-Dark-Enigma** and **Storm-8B** were trained on a mix of high-quality public datasets, with a focus on creative and technical content.
+ - **Aspire-8B** emphasizes a balance between creative writing and technical precision, making it a versatile addition to the merge.
+ - **DarkStock** provided a stable base, fine-tuned for instruction-following and diverse general applications.
+
+ ---
+
+ ## ⚠️ **Limitations & Responsible AI Use**
+
+ As with any AI model, it's important to understand the limitations of **Llama3.1-DarkStorm-Aspire-8B**:
+
+ - **Bias**: Although the model was trained on diverse data, biases in the training data may influence its output. Evaluate its responses critically in sensitive scenarios.
+ - **Fact-based Tasks**: For fact-checking and knowledge-driven tasks, careful prompting may be needed to avoid hallucinations or inaccuracies.
+ - **Sensitive Content**: This model takes an uncensored approach, so exercise caution with potentially sensitive or offensive content.
+
+ ---
+
+ ## 🛠️ **How to Use**
+
+ You can load the model with Hugging Face's `transformers` library:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "ZeroXClem/Llama3.1-DarkStorm-Aspire-8B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
+
+ prompt = "Explain the importance of data privacy in AI development."
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ Loading in **bfloat16** keeps memory use low and matches the merge's compute dtype; the published weights are stored in **float16** (the merge's `out_dtype`).
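Both bfloat16 and float16 use two bytes per parameter, so either precision needs roughly the same memory just to hold the weights of an 8B-parameter model. A quick back-of-envelope calculation (plain arithmetic, not a measured figure, and ignoring activations, KV cache, and framework overhead):

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory (GiB) needed to hold model weights alone."""
    return n_params * bytes_per_param / 1024**3

# 8B parameters at 2 bytes each (bfloat16 or float16):
print(round(weight_memory_gib(8e9, 2), 1))  # ≈ 14.9 GiB
```

Add headroom for activations and the KV cache on top of this when sizing hardware.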
+
+ ---
+
+ ## 📜 **License**
+
+ This model is open-sourced under the **Apache 2.0 License**, allowing free use, distribution, and modification with proper attribution.
+
+ ---
+
+ ## 💡 **Get Involved**
+
+ We're excited to see how the community uses **Llama3.1-DarkStorm-Aspire-8B** in creative and technical applications. Be sure to share your feedback and improvements with us on the Hugging Face model page!
+
+ ---