---
license: mit
datasets:
- roneneldan/TinyStories
language:
- en
---

This is a Llama 2 architecture model series trained on the TinyStories dataset, intended for use in the [llama2.c](https://github.com/karpathy/llama2.c) project by Andrej Karpathy.

The models were trained on a single V100 32GB GPU for 3 epochs, and we achieve an inference speed of ~72 tokens/sec.

Learn more about running inference in pure C in the [llama2.c](https://github.com/karpathy/llama2.c) repository.
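As a minimal sketch of how inference with llama2.c typically works (assuming a checkpoint from this repo has been exported in the llama2.c `.bin` format; the filename `model.bin` below is a placeholder, not a file guaranteed to exist here):

```sh
# Clone and build the pure-C inference engine
git clone https://github.com/karpathy/llama2.c
cd llama2.c
make run

# Run inference on an exported checkpoint (model.bin is a placeholder name)
./run model.bin -t 0.8 -n 256 -i "Once upon a time"
```

Here `-t` sets the sampling temperature, `-n` the number of tokens to generate, and `-i` the prompt; see the llama2.c README for the full set of options.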