Nidum-Llama-3.2-3B-Uncensored-MLX-8bit

Welcome to Nidum!

At Nidum, our mission is to bring cutting-edge AI capabilities to everyone, with unrestricted access to innovation. Nidum-Llama-3.2-3B-Uncensored-MLX-8bit gives you an optimized, efficient, and versatile model for a wide range of applications.


Discover Nidum's Open-Source Projects on GitHub: https://github.com/NidumAI-Inc


Key Features

  1. Efficient and Compact: Quantized to 8 bits in the MLX format for faster inference and reduced memory demands.
  2. Wide Applicability: Suitable for technical problem-solving, educational content, and conversational tasks.
  3. Advanced Context Awareness: Handles long-context conversations with exceptional coherence.
  4. Streamlined Integration: Optimized for use with the mlx-lm library for effortless development.
  5. Unrestricted Responses: Offers uncensored answers across all supported domains.

How to Use

To use Nidum-Llama-3.2-3B-Uncensored-MLX-8bit, install the mlx-lm library and follow these steps:

Installation

pip install mlx-lm

Usage

from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit")

# Create a prompt
prompt = "hello"

# Apply the chat template if available
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Generate the response
response = generate(model, tokenizer, prompt=prompt, verbose=True)

# Print the response
print(response)
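The chat-template branch above can be exercised without downloading the model. The sketch below uses a stub tokenizer with a simplified, hypothetical Llama-3-style template (the real template bundled with this tokenizer is more elaborate) purely to illustrate how the raw prompt gets wrapped before generation:

```python
# Illustration only: a stub tokenizer that mimics the chat-template branch
# in the usage example without loading the model. The template rendering
# below is a simplified, hypothetical stand-in for a Llama-3-style format.

class StubTokenizer:
    chat_template = "llama-3-style"  # any non-None value triggers the branch

    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=True):
        # Hypothetical, simplified rendering of a Llama-3-style template.
        out = "<|begin_of_text|>"
        for m in messages:
            out += (
                f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
                f"{m['content']}<|eot_id|>"
            )
        if add_generation_prompt:
            # Leave the assistant header open so the model continues from here.
            out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
        return out

tokenizer = StubTokenizer()
prompt = "hello"

# Same branching logic as the usage example above.
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

print(prompt)
```

With the real tokenizer, apply_chat_template renders the template shipped with the model, so the exact special tokens may differ from this sketch.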

About the Model

The nidum/Nidum-Llama-3.2-3B-Uncensored-MLX-8bit model, converted using mlx-lm version 0.19.2, brings:

  • Memory Efficiency: Tailored for systems with limited hardware.
  • Performance Optimization: Closely matches the quality of the original model while delivering faster inference.
  • Plug-and-Play: Easily integrates with the mlx-lm library for deployment ease.
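A back-of-envelope calculation shows why the 8-bit conversion matters on memory-constrained machines. Assuming roughly 3.21B parameters, about 2 bytes per parameter in FP16, and about 1 byte per parameter after 8-bit quantization (ignoring quantization scales, activations, and the KV cache):

```python
# Rough weight-memory estimate for a ~3.21B-parameter model.
# These are approximations: real usage also includes quantization
# scales, activations, and the KV cache.
params = 3.21e9

fp16_gb = params * 2 / 1e9   # ~2 bytes per parameter in FP16
int8_gb = params * 1 / 1e9   # ~1 byte per parameter after 8-bit quantization

print(f"FP16 weights: ~{fp16_gb:.1f} GB")   # FP16 weights: ~6.4 GB
print(f"8-bit weights: ~{int8_gb:.1f} GB")  # 8-bit weights: ~3.2 GB
```

Halving the weight footprint is what lets the model fit comfortably on laptops and other systems with limited unified memory.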

Use Cases

  • Problem Solving in Tech and Science
  • Educational and Research Assistance
  • Creative Writing and Brainstorming
  • Extended Dialogues
  • Uninhibited Knowledge Exploration

Datasets and Fine-Tuning

Derived from Nidum-Llama-3.2-3B-Uncensored, the MLX-8bit version inherits:

  • Uncensored Fine-Tuning: Delivers detailed and open-ended responses.
  • RAG-Based Optimization: Enhances retrieval-augmented generation for data-driven tasks.
  • Math Reasoning Support: Precise mathematical computations and explanations.
  • Long-Context Training: Ensures relevance and coherence in extended conversations.
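As a generic illustration of the RAG pattern mentioned above (not this model's actual retrieval pipeline), retrieval-augmented prompting boils down to selecting relevant context and prepending it to the prompt before generation. The snippets and the naive word-overlap scorer below are hypothetical:

```python
# Generic sketch of retrieval-augmented prompting. The scorer and snippets
# are hypothetical illustrations, not part of this model's pipeline.

def retrieve(question, snippets):
    # Naive relevance score: count of shared lowercase words.
    q_words = set(question.lower().split())
    return max(snippets, key=lambda s: len(q_words & set(s.lower().split())))

snippets = [
    "MLX is an array framework for machine learning on Apple silicon.",
    "Llama 3.2 ships in 1B and 3B parameter sizes.",
]
question = "What parameter sizes does Llama 3.2 ship in?"

context = retrieve(question, snippets)
prompt = f"Context: {context}\n\nQuestion: {question}\nAnswer:"
print(prompt)
```

In practice the scorer would be an embedding model or a BM25 index, and the assembled prompt would then be passed to generate() exactly as in the usage example above.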

Quantized Model Download

The MLX-8bit format offers a strong balance between memory footprint and output quality.


Benchmark

| Benchmark  | Metric                       | LLaMA 3B | Nidum 3B | Observation                                                      |
|------------|------------------------------|----------|----------|------------------------------------------------------------------|
| GPQA       | Exact Match (Flexible)       | 0.3      | 0.5      | Nidum 3B achieves notable improvement in generative tasks.       |
|            | Accuracy                     | 0.4      | 0.5      | Demonstrates strong performance, especially in zero-shot tasks.  |
| HellaSwag  | Accuracy                     | 0.3      | 0.4      | Excels in common-sense reasoning tasks.                          |
|            | Normalized Accuracy          | 0.3      | 0.4      | Strong contextual understanding in sentence completion tasks.    |
|            | Normalized Accuracy (Stderr) | 0.15275  | 0.1633   | Enhanced consistency in normalized accuracy.                     |
|            | Accuracy (Stderr)            | 0.15275  | 0.1633   | Demonstrates robustness in reasoning accuracy compared to LLaMA 3B. |

Insights

  1. High Performance, Low Resource: The MLX-8bit format is ideal for environments with limited memory and processing power.
  2. Seamless Integration: Designed for smooth integration into lightweight systems and workflows.

Contributing

Join us in enhancing the MLX-8bit model's capabilities. Contact us for collaboration opportunities.


Contact

For questions, support, or feedback, email info@nidum.ai.


Experience the Future

Harness the power of Nidum-Llama-3.2-3B-Uncensored-MLX-8bit for a perfect blend of performance and efficiency.


Model size: 3.21B parameters · Tensor type: FP16