This model is currently being fine-tuned with DeepSpeed + bf16 weights, using the dataset from Robert A. Gonsalves' article "I Once Trained an AI to Rhyme, and It Took GPT-J a Long Time. Since the Colab was slow, I upgraded to Pro. Each limerick cost me a dime."
https://towardsdatascience.com/i-once-trained-an-ai-to-rhyme-and-it-took-gpt-j-a-long-time-de1f98925e17
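For context, here is a minimal sketch of what this kind of training setup can look like with the Hugging Face `Trainer`; the output directory, DeepSpeed config file name, toy dataset stand-in, and hyperparameters are illustrative assumptions, not the exact script used:

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-J ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy stand-in for the limerick dataset prepared in the article.
train_dataset = Dataset.from_dict(
    {"text": ["There once was a man from Nantucket..."]}
).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="gpt-j-limericks",
    per_device_train_batch_size=24,   # the batch size mentioned below
    num_train_epochs=1,
    bf16=True,                        # train in bfloat16
    deepspeed="ds_config.json",       # DeepSpeed config, e.g. ZeRO stage 2 + offload
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```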
Some examples generated by the 8-bit version of this model (separately fine-tuned for 1 epoch on a single RTX 3090):
I've a limerick model file,
Which, when opened, presents a pile
Of bad-to-good verse.
I don't think it's much worse
Than the limerick I wrote in my style.
On your index cards, write down your need,
And arrange them in order of speed.
When you're done, you'll recall
Which one's quicker than all,
And you'll know which is best, if indeed.
Unfortunately, support for 8-bit fine-tuning doesn't seem widely available yet on HF:
"8-bit state dicts cannot currently be loaded directly into the 8-bit model after being pushed on the Hub. This is due to the fact that the statistics (remember weight.CB and weight.SCB) computed by the model are not currently stored or taken into account inside the state dict, and the Linear8bitLt module does not support this feature yet. We think that having the ability to save that and push it to the Hub might contribute to greater accessibility."
https://huggingface.co/blog/hf-bitsandbytes-integration#saving-8-bit-state-dicts-on-the-hub
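Until that lands, the 8-bit variant has to be loaded fresh each time. For reference, loading a model in 8-bit with the transformers + bitsandbytes integration described in that post looks roughly like this (the model id is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    device_map="auto",    # let accelerate place layers on the available GPU(s)
    load_in_8bit=True,    # quantize Linear layers to int8 via bitsandbytes
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# As quoted above, the resulting 8-bit state dict cannot yet be saved and
# pushed to the Hub, so the 8-bit checkpoint can't be shared directly.
```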
Here's what the bf16 model (not the unavailable 8-bit model) could do after 160 steps with batch size 24:
```
Prompt: Baseball
0: fun \ gun \ games \ names \ on
1: games \ rages \ play \ day \ cheers
2: all \ call \ old \ gold \ hell
3: games \ rants \ all \ call \ beers
4: all \ shall \ games \ guys \ ball
5: game \ name \ best \ chest \ fame
6: games \ dreams \ drehs \ prehs \ kwames
7: games \ fears \ yanks \ cheers \ beers
```
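A hedged sketch of how candidate lists like these can be sampled; the checkpoint path and the bare-topic prompt are assumptions about the task format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt-j-limericks"  # illustrative path to the bf16 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Baseball", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.9,
    max_new_tokens=24,
    num_return_sequences=8,   # eight candidate rhyme sets, as above
    pad_token_id=tokenizer.eos_token_id,
)
prompt_len = inputs["input_ids"].shape[1]
for i, seq in enumerate(outputs):
    print(i, tokenizer.decode(seq[prompt_len:], skip_special_tokens=True))
```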
Following the multi-step process outlined by Robert A. Gonsalves in his article, it's possible to produce a very crude limerick-like poem with the new bf16-trained weights, even though the model hasn't yet seen much of the phonetic data from the training set:
You've got to be careful when you game:
Don't forget that they've got a name
For some of the best
Baseball games on the chest
If you forget, then they'll have your fame.
I have no idea what that means, but it's basically a limerick.
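For reference, the final step of that process can be sketched like this, reusing the `model` and `tokenizer` from the sampling sketch above and the rhyme set from candidate 5; the prompt template is an assumption, not the exact task format from the article:

```python
# Prompt template is illustrative; candidate 5's rhyme set matches the
# end words of the limerick above (game / name / best / chest / fame).
prompt = (
    "Topic: Baseball\n"
    "Rhymes: game \\ name \\ best \\ chest \\ fame\n"
    "Limerick:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=60,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```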
Possible improvements to implement:
- Use IPA (or, as R. Gonsalves suggests, eSpeak) instead of Festival phonetic tokens, so that syllable stress can be incorporated; see the sketch after this list.
- Better align the task formatting with the model's tokenization system.
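As a hedged illustration of the first item, the `phonemizer` package's eSpeak backend can emit IPA with stress marks, which the Festival tokens lack (requires the espeak-ng system library and `pip install phonemizer`):

```python
from phonemizer import phonemize

ipa = phonemize(
    "There once was a man from Nantucket",
    language="en-us",
    backend="espeak",
    with_stress=True,  # keep primary (ˈ) and secondary (ˌ) stress marks
)
print(ipa)  # an IPA transcription with stress marks preserved
```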