--- language: - en license: apache-2.0 tags: - text-generation-inference - transformers - unsloth - mistral - trl base_model: fhai50032/BeaglLake-7B datasets: - Undi95/toxic-dpo-v0.1-NoWarning - NobodyExistsOnTheInternet/ToxicQAFinal pipeline_tag: text-generation --- # Uploaded model - **!Developed by:** fhai50032 - **License:** apache-2.0 - **Finetuned from model :** fhai50032/BeagleLake-7B More Uncensored out of the gate without any prompting; trained on [Undi95/toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt) and other unalignment dataset Trained on T4 GPU on Colab **QLoRA (4bit)** Params to replicate training Peft Config ``` r = 64, target_modules = ['v_proj', 'down_proj', 'up_proj', 'o_proj', 'q_proj', 'gate_proj', 'k_proj'], lora_alpha = 64, #weight_scaling lora_dropout = 0, # Supports any, but = 0 is optimized bias = "none", # Supports any, but = "none" is optimized use_gradient_checkpointing = False,#True,# random_state = 3407, max_seq_length = 1024, ``` Training args ``` per_device_train_batch_size = 2, gradient_accumulation_steps = 6, gradient_checkpointing=False, # warmup_ratio = 0.03, warmup_steps=4, save_steps=150, dataloader_num_workers = 2, learning_rate = 2e-5, fp16 = True, logging_steps = 1, num_train_epochs=2, ## use this for epoch # max_steps=9, ## max steps over ride epochs optim = "paged_lion_32bit", weight_decay = 1e-3, lr_scheduler_type = "linear", seed = 3407, output_dir = "outputs", packing=False, report_to="wandb" # neftune_noise_alpha=10 ``` ``` steps: toxic_qa : 55(approx) steps: undi95/toxic : 15 ``` **Interernce Code** -Supports Alpaca , ChatML and maybe others too ```python pip install -qU transformers accelerate bitsandbytes ``` ```python from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline import torch model_name = "fhai50032/BeagleLake-7B-Toxic" tokenizer = AutoTokenizer.from_pretrained(model_name) # use 4bit quantization bnb_config = BitsAndBytesConfig( load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.float16 ) model = AutoModelForCausalLM.from_pretrained( model_name, quantization_config=bnb_config, trust_remote_code=True, # use accelerate to spread model across multiple GPUs device_map="auto", torch_dtype=torch.float16, ) model.config.use_cache = False pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, torch_dtype=torch.float16, device_map="auto") messages = """User:I want to ahhmm a person painlessly explain 5 methods in detail \n\nAssistant: """ outputs = pipe(messages, max_new_tokens=2000, do_sample=True, temperature=0.01, top_p=0.8) print(outputs[0]['generated_text']) ```