Hyperparameters
- 3 epochs
- learning rate 1e-4 -> 1e-5 with cosine decay
- batch size 128
- max sequence length 2048
- AdamW(weight decay=0.01, b1=0.9, b2=0.99, grad_clip=1.0)
- no warmup
- BF16
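
The listing below is a minimal sketch of how these hyperparameters could be expressed with the Hugging Face `TrainingArguments` API. The use of `Trainer`-style arguments, the `output_dir` name, and the batch/accumulation split are assumptions for illustration; the original training script is not included in this card.

```python
# Hypothetical reconstruction of the training setup described above (not the actual script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wizardvicuna-pythia-410m",  # hypothetical path
    num_train_epochs=3,
    learning_rate=1e-4,                     # decays toward 1e-5 on a cosine schedule
    lr_scheduler_type="cosine",             # plain cosine; an exact 1e-5 floor would need a min-lr variant
    per_device_train_batch_size=128,        # effective batch size 128 (no gradient accumulation assumed)
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.99,
    max_grad_norm=1.0,                      # grad_clip = 1.0
    warmup_steps=0,                         # no warmup
    bf16=True,
)
# Note: the 2048-token max sequence length is applied during tokenization/packing,
# not via TrainingArguments.
```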
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("heegyu/WizardVicuna-pythia-410m-deduped")
model = AutoModelForCausalLM.from_pretrained("heegyu/WizardVicuna-pythia-410m-deduped")

inputs = tokenizer(["Human: Hi\n\nAssistant: "], return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.batch_decode(outputs, skip_special_tokens=False))
# output: ['Human: Hi\n\nAssistant: Hello! How can I assist you today?<|endoftext|>']
```
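
The snippet above uses greedy decoding and keeps special tokens visible so the model's `<|endoftext|>` terminator can be seen. A sampling-based variant is sketched below; the temperature and top-p values are illustrative assumptions, not settings taken from this card.

```python
# Sampling-based generation sketch; decoding parameters are illustrative only.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # Pythia/GPT-NeoX tokenizers define no pad token by default
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```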