GGUFs of ToxicHermes-2.5-Mistral-7B
This is a GGUF quantization of ToxicHermes-2.5-Mistral-7B.
Original Model Card:
ToxicHermes
OpenHermes-2.5 model + toxic-dpo Dataset = ToxicHermes
Fine-tuned with Direct Preference Optimization (DPO).
- Base Model: teknium/OpenHermes-2.5-Mistral-7B
 - Dataset: unalignment/toxic-dpo-v0.1
 
Usage
You can run the original (unquantized) model with the transformers library using the following code:
import transformers
from transformers import AutoTokenizer
model = "joey00072/ToxicHermes-2.5-Mistral-7B"
# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)
# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
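When running the GGUF files outside of transformers, the tokenizer's chat template may not be available, so the prompt has to be built by hand. The sketch below assumes this model keeps the ChatML format of its OpenHermes-2.5 base; `format_chatml` is a hypothetical helper, not part of any library, and its output should be checked against what `apply_chat_template` produces.

```python
# Sketch: hand-built ChatML prompt.
# ASSUMPTION: the model uses the same ChatML template as OpenHermes-2.5.

def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as one ChatML string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]
prompt = format_chatml(message)
print(prompt)
```

The resulting string can be passed as a raw prompt to whichever GGUF runtime you use, with `<|im_end|>` as a stop sequence.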
Training hyperparameters
LoRA:
- r=16
 - lora_alpha=16
 - lora_dropout=0.05
 - bias="none"
 - task_type="CAUSAL_LM"
 - target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
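To give a feel for what these LoRA settings mean in practice, here is a rough count of the trainable adapter parameters. The layer dimensions are assumptions taken from the standard Mistral-7B architecture (they are not stated in this card), and the count ignores anything outside the listed target modules.

```python
# Sketch: trainable-parameter estimate for the LoRA settings above.
# ASSUMED Mistral-7B dimensions (not given in this card):
HIDDEN = 4096          # hidden size
INTERMEDIATE = 14336   # MLP intermediate size
KV_DIM = 1024          # 8 key/value heads x 128 head dim
NUM_LAYERS = 32

# From the card:
R = 16
LORA_ALPHA = 16

# (in_features, out_features) for each target module in one layer.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, r):
    # LoRA adds A (r x in) and B (out x r); bias="none" adds no bias terms.
    return r * (in_features + out_features)

per_layer = sum(lora_params(i, o, R) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"per layer: {per_layer:,}  total: {total:,}  scaling: {scaling}")
```

Under these assumptions the adapters hold roughly 42 million trainable parameters, a small fraction of the 7B base model, and the update is applied with a scaling factor of 1.0.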
 
Training arguments:
- per_device_train_batch_size=4
 - gradient_accumulation_steps=4
 - gradient_checkpointing=True
 - learning_rate=5e-5
 - lr_scheduler_type="cosine"
 - max_steps=200
 - optim="paged_adamw_32bit"
 - warmup_steps=100
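The training arguments above imply an effective batch size and a learning-rate curve that can be worked out by hand. The sketch below assumes a linear warmup followed by a half-cosine decay to zero, which is how a cosine scheduler with warmup is commonly implemented; the exact curve depends on the library version, and the number of devices is not stated in this card.

```python
import math

# Values from the card:
LEARNING_RATE = 5e-5
WARMUP_STEPS = 100
MAX_STEPS = 200
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4

def lr_at(step):
    """Learning rate at an optimizer step (ASSUMED warmup + cosine shape)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Half-cosine decay from the peak down to 0 at MAX_STEPS.
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))

# Examples consumed per optimizer step on a single device.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
# Preference pairs seen per device over the whole run.
examples_seen = effective_batch * MAX_STEPS

for step in (0, 50, 100, 150, 200):
    print(step, lr_at(step))
print(effective_batch, examples_seen)
```

Note that with warmup_steps=100 and max_steps=200, half of the run is spent warming up, so the peak learning rate is reached only at the midpoint.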
 
DPOTrainer:
- beta=0.1
 - max_prompt_length=1024
 - max_length=1536
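To illustrate what beta controls, here is a minimal pure-Python sketch of the standard DPO loss for a single preference pair. It is not the trainer's implementation: the function name and the log-probability inputs are hypothetical, and a real trainer computes these values in batches from the policy and a frozen reference model.

```python
import math

BETA = 0.1  # from the card

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=BETA):
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # How much more the policy likes each answer than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# Policy identical to the reference: no preference signal yet.
print(dpo_loss(0.0, 0.0, 0.0, 0.0))
# Policy favors the chosen answer more than the reference does: lower loss.
print(dpo_loss(-10.0, -20.0, -12.0, -15.0))
```

A larger beta makes the loss more sensitive to the gap between the two log-ratios, which in effect ties the policy more tightly to the reference model; beta=0.1 is a relatively mild setting.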
 