Authors: Forrest Kim, Joe Damisch, He Shi
Training procedure
A BERT classifier was trained on the StereoSet dataset (https://huggingface.co/datasets/stereoset). Only the racial and professional prompts were used to train the classifier, due to covariate imbalance.
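A minimal sketch of how such a classifier could be trained with the Hugging Face `Trainer`, assuming the `intrasentence` StereoSet configuration and its three-way stereotype / anti-stereotype / unrelated labels; the `bert-base-uncased` checkpoint and the hyperparameters are assumptions, not the authors' exact setup:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# StereoSet only ships a 'validation' split; the 'intrasentence' config is assumed here.
raw = load_dataset("stereoset", "intrasentence", split="validation")
# Keep only the racial and professional subsets, as described above.
raw = raw.filter(lambda ex: ex["bias_type"] in ("race", "profession"))

# Flatten each example's candidate sentences into (text, label) rows;
# gold_label distinguishes anti-stereotype / stereotype / unrelated.
def flatten(batch):
    texts, labels = [], []
    for sents in batch["sentences"]:
        texts.extend(sents["sentence"])
        labels.extend(sents["gold_label"])
    return {"text": texts, "label": labels}

flat = raw.map(flatten, batched=True, remove_columns=raw.column_names)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

encoded = flat.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128),
                   batched=True)

trainer = Trainer(
    model=model,
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
    train_dataset=encoded,
    args=TrainingArguments(output_dir="bert-stereoset-classifier",
                           per_device_train_batch_size=16,
                           num_train_epochs=3),
)
trainer.train()
```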
LLaMA 2 7B Chat (https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) was then prompted with a random sample of StereoSet prompts, as well as with a manually curated list of leading prompts (e.g., "What is wrong with doctors?", "Why are programmers so skinny?").
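A rough sketch of the prompting step, assuming the standard Llama-2-chat `[INST] ... [/INST]` format; the decoding parameters are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16,
                                             device_map="auto")

# Two of the manually curated leading prompts mentioned above.
leading_prompts = [
    "What is wrong with doctors?",
    "Why are programmers so skinny?",
]

for prompt in leading_prompts:
    # Llama-2-chat instruction format
    inputs = tokenizer(f"[INST] {prompt} [/INST]", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```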
The responses to these prompts were then classified with the BERT classifier, and the resulting 'unbiased' context/prompt pairings were used to fine-tune the LLaMA 2 chat model with the PEFT LoRA library.
PEFT:
The following bitsandbytes quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
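These settings correspond roughly to the following transformers `BitsAndBytesConfig` (a sketch; it assumes a transformers version that exposes all of these fields):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=False,
    load_in_4bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)
```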
Training ran for 1,000 steps (1 epoch) on the stereotype dataset and took about 1 hour on a single RTX 4090.
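A minimal, self-contained sketch of this fine-tuning run, combining the 4-bit quantization above with PEFT LoRA adapters and roughly 1,000 Trainer steps; the LoRA rank/alpha, target modules, batch size, learning rate, and the placeholder dataset of 'unbiased' pairs are all assumptions rather than the authors' exact configuration:

```python
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model in 4-bit fp4, matching the quantization config listed above.
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True,
                                           bnb_4bit_quant_type="fp4",
                                           bnb_4bit_compute_dtype=torch.float32),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters (rank, alpha, dropout, and target modules are assumed values).
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# Placeholder for the classifier-filtered 'unbiased' context/prompt pairs (assumed format).
pairs = Dataset.from_dict({"text": [
    "[INST] Why are programmers so skinny? [/INST] Programmers, like any group, "
    "vary widely in build; there is no single body type associated with the job."
]})
tokenized = pairs.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    args=TrainingArguments(output_dir="llama2-chat-debias-lora",
                           max_steps=1000,
                           per_device_train_batch_size=4,
                           learning_rate=2e-4),
)
trainer.train()
model.save_pretrained("llama2-chat-debias-lora")
```

Note that `save_pretrained` on a PEFT-wrapped model writes only the LoRA adapter weights, which is what the PEFT framework version below refers to.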
Framework versions
- PEFT 0.5.0.dev0