gemma-4-26B-A4B-it-uncensored
Uncensored version of google/gemma-4-26B-A4B-it with refusal behavior removed.
Results
| Metric | Before | After |
|---|---|---|
| Refusals (mlabonne, 100 prompts) | 98/100 | 1/100 effective (3 flagged, 2 refusal-then-comply) |
| Refusals (cross-dataset, 686 prompts) | — | 5/686 (0.7%) |
| KL Divergence | 0 (baseline) | 0.09 |
| Quality (harmless response length ratio) | 1.0 | ~1.01 (no degradation) |
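
The KL divergence row compares the edited model's next-token distribution with the original's. The card does not state which direction, token positions, or prompts were used for the measurement, so the following is only a minimal sketch of the metric itself on made-up logits; `base_logits` and `edited_logits` are illustrative values, not measurements from this model.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token logits over a 4-token vocabulary (made-up numbers).
base_logits = [2.0, 1.0, 0.1, -1.0]
edited_logits = [1.8, 1.2, 0.1, -1.0]

p_base = softmax(base_logits)
p_edit = softmax(edited_logits)

kl_same = kl_divergence(p_base, p_base)  # identical distributions give 0
kl_edit = kl_divergence(p_base, p_edit)  # a small shift gives a small positive value
print(f"KL(base || base)   = {kl_same:.4f}")
print(f"KL(base || edited) = {kl_edit:.4f}")
```

A value of 0 means the two distributions are identical, which is why the unmodified model is listed as the baseline.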
Cross-Dataset Validation
Tested against 4 independent prompt datasets to verify generalization:
| Dataset | Prompts | Refusals |
|---|---|---|
| JailbreakBench | 100 | 1/100 |
| tulu-harmbench | 320 | 1/320 |
| NousResearch/RefusalDataset | 166 | 0/166 |
| mlabonne/harmful_behaviors | 100 | 3/100 |
| Total | 686 | 5/686 (0.7%) |
Every flagged refusal was manually audited. Most are "refusal-then-comply" false positives, where the model adds an AI identity disclaimer and then answers the question anyway.
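
The totals row can be re-derived from the per-dataset counts. A small sketch using only the numbers in the table above:

```python
# Per-dataset results copied from the table: (prompts, flagged refusals).
results = {
    "JailbreakBench": (100, 1),
    "tulu-harmbench": (320, 1),
    "NousResearch/RefusalDataset": (166, 0),
    "mlabonne/harmful_behaviors": (100, 3),
}

total_prompts = sum(n for n, _ in results.values())
total_refusals = sum(r for _, r in results.values())
rate_pct = 100 * total_refusals / total_prompts

for name, (n, r) in results.items():
    print(f"{name:30s} {r}/{n} ({100 * r / n:.1f}%)")
print(f"{'Total':30s} {total_refusals}/{total_prompts} ({rate_pct:.1f}%)")
```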
Method
Norm-preserving biprojected abliteration on the dense pathway (o_proj + shared mlp.down_proj), plus Expert-Granular Abliteration (EGA) on all 128 MoE expert down_proj slices per layer.
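
As a rough illustration of the norm-preserving idea (not the code used for this model), the sketch below removes a unit direction `r` from each vector that a projection writes into the residual stream, then rescales that vector to its original length. It assumes the norm is preserved per write vector (per column of a `[d_model, d_in]` weight), uses plain Python lists in place of tensors, and leaves out the biprojection step entirely. The function name and toy values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def ablate_norm_preserving(write_vectors, r, scale=1.0):
    """Remove direction r from each write vector, keeping each vector's length.

    write_vectors: one vector per input unit, each of length d_model
                   (assumed to be the columns of a [d_model, d_in] weight).
    r:             unit-length refusal direction in residual space.
    """
    out = []
    for w in write_vectors:
        coeff = dot(w, r)
        ablated = [wi - scale * coeff * ri for wi, ri in zip(w, r)]
        old_len, new_len = norm(w), norm(ablated)
        # Restore the original magnitude; leave all-zero results untouched.
        factor = old_len / new_len if new_len > 0 else 1.0
        out.append([a * factor for a in ablated])
    return out

# Toy example: d_model = 3, two write vectors, made-up values.
r = [0.6, 0.8, 0.0]
W_cols = [[1.0, 0.5, 3.0], [2.0, -1.0, 0.0]]
W_new = ablate_norm_preserving(W_cols, r)

for old, new in zip(W_cols, W_new):
    print(f"|w| {norm(old):.4f} -> {norm(new):.4f}, "
          f"w.r {dot(old, r):.4f} -> {abs(dot(new, r)):.4f}")
```

With `scale=1.0`, as in the Parameters table, each rescaled vector stays orthogonal to `r` because scaling a vector does not change its direction.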
EGA (OBLITERATUS) hooks the MoE routers during probing to compute per-expert routing weights for harmful vs harmless prompts, then applies norm-preserving projection (grimjim) to each expert individually. Dense-only abliteration leaves 29/100 refusals; adding EGA drops it to 3/100.
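
The per-expert routing statistics can be pictured as follows. This is a minimal sketch under assumptions the card does not confirm: softmax routing with top-k renormalization, a toy router with 4 experts and k=2, and router logits that have already been captured (in practice, for example, by a forward hook on each router module). All function names are hypothetical, and the sketch stops at collecting the statistics; how EGA consumes them is not shown.

```python
import math

def topk_routing_weights(logits, k):
    """Softmax over router logits, keep the k largest, renormalize to sum to 1."""
    m = max(logits)
    probs = [math.exp(x - m) for x in logits]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    z = sum(probs[i] for i in top)
    return {i: probs[i] / z for i in top}

def mean_expert_weights(router_logits_per_token, num_experts, k):
    """Average routing weight each expert receives across all probed tokens."""
    totals = [0.0] * num_experts
    for logits in router_logits_per_token:
        for expert, w in topk_routing_weights(logits, k).items():
            totals[expert] += w
    n = len(router_logits_per_token)
    return [t / n for t in totals]

# Toy router outputs: 4 experts, top-2 routing, two tokens per prompt class.
harmful_logits = [[3.0, 1.0, 0.0, -1.0], [2.5, 0.5, 0.2, -0.5]]
harmless_logits = [[0.0, 1.0, 3.0, -1.0], [0.1, 0.4, 2.0, 1.5]]

w_harmful = mean_expert_weights(harmful_logits, num_experts=4, k=2)
w_harmless = mean_expert_weights(harmless_logits, num_experts=4, k=2)

for e, (a, b) in enumerate(zip(w_harmful, w_harmless)):
    print(f"expert {e}: harmful={a:.3f} harmless={b:.3f} diff={a - b:+.3f}")
```

In this toy data expert 0 is routed to mostly by harmful tokens and expert 2 mostly by harmless ones, which is the kind of asymmetry the per-expert statistics are meant to expose.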
Pipeline
- Load model in bf16 with LoRA adapters on `o_proj` and `mlp.down_proj`
- Collect residual activations for 400 harmful + 400 harmless prompts (mlabonne datasets)
- Winsorize activations at 99.5th percentile (clamps GeGLU outlier activations in Gemma family)
- Compute per-layer refusal direction: `normalize(mean(harmful) - mean(harmless))`
- Orthogonalize each direction against harmless mean (double-pass Gram-Schmidt)
- Apply norm-preserving weight modification to `o_proj` and `down_proj` in all layers
- Hook MoE routers, collect per-expert routing weights for harmful vs harmless prompts
- Apply same norm-preserving modification to all 128 expert `down_proj` slices per layer
- Merge LoRA adapters into base weights for clean tensor names
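
The winsorize, refusal-direction, and orthogonalization steps above can be sketched for a single layer as follows. This is an illustration under stated assumptions rather than the pipeline's actual code: winsorization is taken to be a symmetric clamp on absolute values within each activation vector, the quantile uses a simple floor-based rank, the direction is renormalized after orthogonalization, and plain Python lists stand in for tensors. All function names are hypothetical.

```python
import math

def winsorize(vec, q=0.995):
    """Clamp entries to +/- the q-th quantile of their absolute values."""
    mags = sorted(abs(x) for x in vec)
    cap = mags[int(q * (len(mags) - 1))]  # floor-based rank (an assumption)
    return [max(-cap, min(cap, x)) for x in vec]

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def remove_component(v, unit):
    """One Gram-Schmidt step: subtract the projection of v onto a unit vector."""
    c = sum(a * b for a, b in zip(v, unit))
    return [a - c * b for a, b in zip(v, unit)]

def refusal_direction(harmful_acts, harmless_acts, q=0.995):
    harmful = [winsorize(a, q) for a in harmful_acts]
    harmless = [winsorize(a, q) for a in harmless_acts]
    mu_harmful, mu_harmless = mean(harmful), mean(harmless)
    direction = normalize([a - b for a, b in zip(mu_harmful, mu_harmless)])
    harmless_unit = normalize(mu_harmless)
    # Two passes: the second removes floating-point residue left by the first.
    for _ in range(2):
        direction = remove_component(direction, harmless_unit)
    return normalize(direction), harmless_unit

# Winsorization on a vector with one large outlier (made-up values).
clipped = winsorize([0.5] * 199 + [100.0])
print("max after winsorizing:", max(clipped))

# Toy activations, d_model = 3; q=1.0 so these tiny vectors are not clipped.
harmful_acts = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless_acts = [[0.0, 1.0, 0.0], [0.0, 3.0, 0.0]]
direction, harmless_unit = refusal_direction(harmful_acts, harmless_acts, q=1.0)
print("direction:", direction)
```

Orthogonalizing against the harmless mean keeps the later weight edit from also removing whatever the harmful and harmless activations have in common.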
Parameters
| Parameter | Value |
|---|---|
| Layers abliterated | 100% |
| Scale | 1.0 |
| Winsorization | 0.995 |
| Experts abliterated | 100% (128/128 per layer) |
| Expert scale | 1.0 |
How this differs from vanilla heretic
- Norm-preserving biprojection instead of standard projection (preserves weight magnitudes)
- Per-layer refusal directions instead of one global direction
- Deterministic single-pass instead of 50-trial Optuna search (faster, same or better results)
- LoRA merge before save for clean GGUF-compatible tensor names
- Expert-Granular Abliteration for MoE expert weights (not supported in heretic)
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "TrevorJS/gemma-4-26B-A4B-it-uncensored"
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Your prompt here"}]
# return_dict=True yields input_ids plus attention_mask regardless of library defaults
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
Reproduction
Full code and experiment data: abliteration research repo
```shell
python scripts/ega.py --model google/gemma-4-26B-A4B-it \
  --top-pct 100 --strip-topic-markers --skip-prefix --batch-size 4 \
  --save output_dir
```