rogkesavan committed 38ef389 (parent 805ec06): Update README.md
pipeline_tag: text-generation
---

### **Nidum-Llama-3.2-3B-Uncensored-MLX-8bit**

### **Welcome to Nidum!**

At Nidum, our mission is to bring cutting-edge AI capabilities to everyone with unrestricted access to innovation. With **Nidum-Llama-3.2-3B-Uncensored-MLX-8bit**, you get an optimized, efficient, and versatile AI model for diverse applications.
20 |
|
21 |
+
---
|
22 |
+
|
23 |
+
[![GitHub Icon](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Font_Awesome_5_brands_github.svg/232px-Font_Awesome_5_brands_github.svg.png)](https://github.com/NidumAI-Inc)
|
24 |
+
**Discover Nidum's Open-Source Projects on GitHub**: [https://github.com/NidumAI-Inc](https://github.com/NidumAI-Inc)
|
25 |
+
|
26 |
+
---
|
### **Key Features**

1. **Efficient and Compact**: Built in **MLX 8-bit format** for improved performance and reduced memory demands.
2. **Wide Applicability**: Suitable for technical problem-solving, educational content, and conversational tasks.
3. **Advanced Context Awareness**: Handles long-context conversations with strong coherence.
4. **Streamlined Integration**: Optimized for the **mlx-lm** library for effortless development.
5. **Unrestricted Responses**: Offers uncensored answers across all supported domains.

---
### **How to Use**

To use **Nidum-Llama-3.2-3B-Uncensored-MLX-8bit**, install the **mlx-lm** library and follow these steps:

#### **Installation**

```bash
pip install mlx-lm
```
#### **Usage**

```python
from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if the tokenizer provides one
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate and print the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```
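The chat-template branch in the usage example is a graceful fallback: tokenizers that ship a template wrap the prompt in role markers, while older tokenizers pass it through unchanged. A minimal sketch with hypothetical stub tokenizers (not the real objects returned by `load`, whose templates emit model-specific special tokens) shows both paths:

```python
# Hypothetical stub tokenizers, used only to illustrate the branch above;
# real tokenizers come from mlx_lm.load() and format prompts differently.
class TemplatedTokenizer:
    chat_template = "toy-template"  # any non-None value enables templating

    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=True):
        # Toy markup; real templates emit model-specific special tokens.
        return f"<|user|>{messages[0]['content']}<|assistant|>"

class PlainTokenizer:
    chat_template = None  # no template: the raw prompt is used as-is

def build_prompt(tokenizer, prompt):
    # Same branching as the usage example above
    if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
        messages = [{"role": "user", "content": prompt}]
        prompt = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    return prompt

print(build_prompt(TemplatedTokenizer(), "hello"))  # <|user|>hello<|assistant|>
print(build_prompt(PlainTokenizer(), "hello"))      # hello
```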
---

### **About the Model**

The **nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit** model, converted using **mlx-lm version 0.19.2**, brings:

- **Memory Efficiency**: Tailored for systems with limited hardware.
- **Performance Optimization**: Preserves the capabilities of the original model while delivering faster inference.
- **Plug-and-Play**: Integrates easily with the **mlx-lm** library for straightforward deployment.
---

### **Use Cases**

- **Problem Solving in Tech and Science**
- **Educational and Research Assistance**
- **Creative Writing and Brainstorming**
- **Extended Dialogues**
- **Uninhibited Knowledge Exploration**
---

### **Datasets and Fine-Tuning**

Derived from **Nidum-Llama-3.2-3B-Uncensored**, the MLX-8bit version inherits:

- **Uncensored Fine-Tuning**: Delivers detailed and open-ended responses.
- **RAG-Based Optimization**: Enhances retrieval-augmented generation for data-driven tasks.
- **Math Reasoning Support**: Provides precise mathematical computations and explanations.
- **Long-Context Training**: Ensures relevance and coherence in extended conversations.
---

### **Quantized Model Download**

The **MLX-8bit** format strikes a practical balance between memory footprint and performance.
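As rough back-of-the-envelope arithmetic (illustrative only: the ~3.2B parameter count is approximate, and real MLX checkpoints add per-group quantization scales and biases), halving the bytes per weight roughly halves the weight footprint:

```python
# Approximate weight-only memory footprint for a ~3.2B-parameter model.
# Illustrative arithmetic; actual file sizes vary with group size and metadata.
params = 3.2e9
fp16_gib = params * 2 / 1024**3  # 2 bytes per weight in fp16
int8_gib = params * 1 / 1024**3  # 1 byte per weight at 8-bit
print(f"fp16: {fp16_gib:.2f} GiB, 8-bit: {int8_gib:.2f} GiB")
```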
---

#### **Benchmark**

| **Benchmark** | **Metric** | **LLaMA 3B** | **Nidum 3B** | **Observation** |
|---------------|------------|--------------|--------------|-----------------|
| **GPQA** | Exact Match (Flexible) | 0.3 | 0.5 | Nidum 3B achieves a notable improvement in **generative tasks**. |
| | Accuracy | 0.4 | 0.5 | Demonstrates strong performance, especially in **zero-shot** tasks. |
| **HellaSwag** | Accuracy | 0.3 | 0.4 | Excels in **common-sense reasoning** tasks. |
| | Normalized Accuracy | 0.3 | 0.4 | Strong contextual understanding in sentence-completion tasks. |
| | Normalized Accuracy (Stderr) | 0.15275 | 0.1633 | Standard error remains comparable between the two models. |
| | Accuracy (Stderr) | 0.15275 | 0.1633 | Standard error remains comparable between the two models. |
---

### **Insights**

1. **High Performance, Low Resource**: The MLX-8bit format is well suited to environments with limited memory and processing power.
2. **Seamless Integration**: Designed for smooth integration into lightweight systems and workflows.
---

### **Contributing**

Join us in enhancing the **MLX-8bit** model's capabilities. Contact us for collaboration opportunities.

---

### **Contact**

For questions, support, or feedback, email **info@nidum.ai**.
---

### **Experience the Future**

Harness the power of **Nidum-Llama-3.2-3B-Uncensored-MLX-8bit** for a blend of performance and efficiency.

---