Hyperparameters

  • 3 epochs
  • learning rate 1e-4 -> 1e-5 with cosine decay
  • batch size 128
  • max sequence length 2048
  • AdamW(weight decay=0.01, b1=0.9, b2=0.99, grad_clip=1.0)
  • no warmup
  • BF16
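
The learning-rate entry above (1e-4 decaying to 1e-5 on a cosine curve, with no warmup) can be sketched as a plain function. This is a minimal illustration, not the author's training code: the function name `cosine_lr` and the `total_steps` value are hypothetical, since the card does not state the number of optimizer steps.

```python
import math


def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=1e-5):
    """Learning rate at `step` under cosine decay with no warmup.

    lr_max and lr_min come from the hyperparameter list; total_steps is
    an assumption, because the card does not give the step count.
    """
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step, 0), total_steps) / max(total_steps, 1)
    # Half-cosine from lr_max (progress = 0) down to lr_min (progress = 1).
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, for illustration only
    for s in (0, 250, 500, 750, 1000):
        print(s, cosine_lr(s, total))
```

Because there is no warmup, the schedule starts at the peak rate on the first step. In PyTorch, a similar curve could likely be obtained with `torch.optim.lr_scheduler.CosineAnnealingLR` by setting `eta_min=1e-5`.
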
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("heegyu/WizardVicuna-pythia-410m-deduped")
model = AutoModelForCausalLM.from_pretrained("heegyu/WizardVicuna-pythia-410m-deduped")

inputs = tokenizer(["Human: Hi\n\nAssistant: "], return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.batch_decode(outputs, skip_special_tokens=False))

output: ['Human: Hi\n\nAssistant: Hello! How can I assist you today?<|endoftext|>']
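
The example above uses a "Human: ...\n\nAssistant: " prompt, and the decoded output contains both the prompt and a trailing <|endoftext|> token. A minimal sketch of building the prompt and recovering only the reply, assuming single-turn use; `build_prompt` and `extract_reply` are hypothetical helper names, not part of transformers or the model repository.

```python
EOS_TOKEN = "<|endoftext|>"  # end-of-text marker seen in the sample output


def build_prompt(user_message):
    """Format one user turn in the layout used by the example above."""
    return f"Human: {user_message}\n\nAssistant: "


def extract_reply(decoded, prompt):
    """Return only the assistant's text from a decoded generation."""
    # Compare against the prompt without trailing whitespace, in case
    # decoding does not reproduce the final space exactly.
    head = prompt.rstrip()
    reply = decoded[len(head):] if decoded.startswith(head) else decoded
    # Cut at the first end-of-text marker and trim surrounding whitespace.
    return reply.split(EOS_TOKEN, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt("Hi")
    decoded = "Human: Hi\n\nAssistant: Hello! How can I assist you today?<|endoftext|>"
    print(extract_reply(decoded, prompt))
```

Passing `skip_special_tokens=True` to `batch_decode` would also drop the end-of-text token, but the prompt would still need to be removed.
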

