This is my first attempt at fine-tuning a GPT model.

Mia is based on GPT-Neo-125M and was trained on the original training data of the AI Dungeon model_v5 model.

It is a much smaller model than the original model_v5 and runs considerably faster.

Training seems to have helped the model avoid repeating commands back to you, but 125M parameters is not large enough for proper gameplay.
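
If you want to try Mia yourself, below is a minimal inference sketch using the standard Hugging Face transformers API. The hub id "Henk717/Mia" is a placeholder assumption; substitute the repository this model card actually lives under.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Henk717/Mia"  # hypothetical hub id, replace with the real one
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A short adventure-style prompt, in the spirit of AI Dungeon.
prompt = "You are a knight venturing into a dark forest. You"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-Neo has no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```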

The model was trained entirely on Google Colab (125M is the largest size you can train there) and ran for almost 3 epochs (I ran out of disk space during the last 5 minutes, but managed to export the model).
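
For anyone curious what a run like this looks like in practice, here is a rough fine-tuning sketch with the Hugging Face Trainer. The data file, sequence length, and hyperparameters are illustrative assumptions, not the exact settings used for Mia.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "EleutherAI/gpt-neo-125M"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Placeholder path: a plain-text dump of the adventure training data.
dataset = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mia-finetune",
        num_train_epochs=3,             # the card mentions almost 3 epochs
        per_device_train_batch_size=1,  # small batch to fit a Colab GPU
        gradient_accumulation_steps=8,
        fp16=True,
        save_total_limit=1,             # keep checkpoints from eating disk space
    ),
    train_dataset=train_set,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("mia-finetune/final")  # export the finished model
```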

- Henk717