Question 2 (sorry for asking so many questions :( )
#5
by
Skorcht
- opened
your models are so good! what format do you train with? sharegpt or alpaca? im abit confused on what format.. and do you manually clean your datasets or use a program / llm to cleanse the data.
Sharegpt datasets -> llama 3 instruct format specified during training
I mainly use a bunch of python scripts and manually clean the data myself
Hey! Quick question too, on chaiverse you published a Stheno 4.2 that did top score, do you plan on releasing it :o?
Unfortunately no π
While it had top scores it was extremely unstable and schizo. Half of the time it was coherent, half the time it was nonsense rambling.
I still have no idea how it got top score lmao.