bart-base-instructiongen + LongForm
Instead of generating questions from text, generate instructions for LLMs!
- Check out a basic demo on Spaces
- An example of how to use instructiongen models in a CLI script can be found here
- You can find other models fine-tuned for instruction generation by searching for the instructiongen tag.
About
This model is a fine-tuned version of pszemraj/bart-base-instructiongen on the akoksal/LongForm
dataset.
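The model was trained only on instruction+output pairs; examples that also had a separate input were filtered out.
As a result, it generates an instruction that the given text would answer, not one that takes the text as an input to operate on. For example, the text 1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo will not yield "Rank the following ice cream flavors: oreo, mint chip, chocolate chip, cookies and cream".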
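A minimal sketch of generating an instruction from a passage with the transformers library. The default checkpoint below is the base model named above and is only a stand-in; substitute this card's own repository ID. The helper names, `max_new_tokens`, and `num_beams` are illustrative assumptions, not settings taken from this card.

```python
def load_model(checkpoint: str = "pszemraj/bart-base-instructiongen"):
    """Load tokenizer and seq2seq model (downloads weights on first call).

    The default checkpoint is the base model named in this card, used here as
    a placeholder; pass this card's own repository ID instead.
    """
    # Imported lazily so the helpers below can be used without the dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return tokenizer, model


def generate_instruction(text, tokenizer, model, max_new_tokens=64, num_beams=4):
    """Return one generated instruction for `text`.

    max_new_tokens and num_beams are assumed values chosen for illustration.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()


# Example usage (requires network access to fetch the checkpoint):
# tokenizer, model = load_model()
# print(generate_instruction("Vanilla is the most popular flavor ...", tokenizer, model))
```

The input should be the kind of text an LLM would write in response to an instruction, since that is what the training pairs contained.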
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3.0
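The list above can be sketched as keyword arguments for `transformers.Seq2SeqTrainingArguments`. This is an approximation under stated assumptions: `train_batch_size` is treated as the per-device value, the optimizer is left to the library default, `output_dir` is a hypothetical path, and the argument names are those of the 4.x releases. The card does not state the number of GPUs; the reported total of 64 equals 4 × 16. The `lr_at` helper illustrates a cosine schedule with linear warmup, with the total step count as a hypothetical parameter.

```python
import math

# Values copied from the hyperparameter list; key names are assumed to match
# transformers.Seq2SeqTrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 8e-5,
    "per_device_train_batch_size": 4,  # assumption: card value is per device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.02,
    "num_train_epochs": 3.0,
}


def effective_batch_size(kwargs, num_processes=1):
    """Per-device batch x accumulation steps x processes (process count assumed)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_processes
    )


def lr_at(step, total_steps, kwargs=TRAINING_KWARGS):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * kwargs["warmup_ratio"])
    if step < warmup_steps:
        return kwargs["learning_rate"] * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return kwargs["learning_rate"] * 0.5 * (1.0 + math.cos(math.pi * progress))


def build_training_args(output_dir="./instructiongen-longform"):
    """Construct the arguments object; output_dir is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments  # lazy import

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

With a hypothetical run of 1000 optimizer steps, the warmup would cover the first 20 steps, after which the rate decays from 8e-05 toward zero.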