Catastrophic forgetting test results:
Initial evaluation loss on a 1k-example subset of the HuggingFaceTB/cosmopedia-100k dataset was 1.102; 100 steps of LISA training reduced this to 1.049.
Comparison to control: the base cosmo-1b started at a loss of 1.003 on (a different subset of) the same dataset, which increased to 1.024 after 100 steps.
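For reference, here is a minimal sketch of how such an evaluation could be run. This is an assumption about the procedure, not the exact script used: the model id is a placeholder, and "loss" is taken to mean the per-example causal-LM cross-entropy averaged over the subset (the `text` column of cosmopedia-100k).

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder; substitute the checkpoint under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# 1k-example subset of the dataset named in the results above
subset = load_dataset("HuggingFaceTB/cosmopedia-100k", split="train").select(range(1000))

losses = []
with torch.no_grad():
    for row in subset:
        enc = tokenizer(
            row["text"], return_tensors="pt", truncation=True, max_length=2048
        ).to(model.device)
        # labels = input_ids -> the model returns mean token-level cross-entropy
        out = model(**enc, labels=enc["input_ids"])
        losses.append(out.loss.item())

print(f"mean eval loss: {sum(losses) / len(losses):.3f}")
```

Note that this averages per-example losses rather than weighting by token count; the two can differ slightly when example lengths vary.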
Axolotl config: the same as the qdora version, but without DoRA.
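As an illustration only (not the actual config used), a LISA run in Axolotl might look like the sketch below. The LISA key names follow Axolotl's documented options; the base model, dataset path, and hyperparameter values are placeholders. Unlike the qdora config, there is no LoRA/DoRA adapter section.

```yaml
base_model: HuggingFaceTB/cosmo-1b  # placeholder base model

# LISA: freeze the network and unfreeze a few randomly chosen layers,
# re-sampling which layers are trainable every lisa_step_interval steps
lisa_n_layers: 4
lisa_step_interval: 20
lisa_layers_attribute: model.layers

datasets:
  - path: your-org/your-finetune-data  # placeholder; the fine-tuning corpus, not the eval set
    type: completion

sequence_len: 2048
micro_batch_size: 1
learning_rate: 1e-5
```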