Llama-3.1-8b-Instruct-Secure
This model is fine-tuned with LoRA adapters for secure behavior and low ASR (Attack Success Rate).
Model Details
- Base Model: Llama-3.1-8b-Instruct
- Fine-tuning: LoRA method
- Purpose: Secure language model with defenses against jailbreaking.
Training Details
- Dataset: Custom synthetic data
- Framework: PyTorch
- Sharding: Model is saved in shards of 100MB to ensure compatibility.
Usage
Load the model and tokenizer as follows:
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "SanjanaCodes/Llama-3.1-8b-Instruct-Secure"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
inputs = tokenizer("Your input prompt here", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))
- Downloads last month
- 0