---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
---

# Badllama-3-8B

This repo holds weights for Palisade Research's showcase of how open-weight model guardrails can be stripped off in minutes of GPU time. See [the Badllama 3 paper](https://arxiv.org/abs/2407.01376) for additional background.

## Access

Email the authors to request research access. We do not review access requests made on HuggingFace.

## Branches

- `main` mirrors `ro_authors_awq` and holds our best-performing model, built with [refusal orthogonalization](https://arxiv.org/abs/2406.11717)
- `qlora` holds the [QLoRA](https://arxiv.org/abs/2305.14314)-tuned model
- `reft` holds the [ReFT](https://arxiv.org/abs/2404.03592)-tuned model