Update README.md
README.md CHANGED
@@ -10,6 +10,6 @@ This is a Llama 2 architecture model series trained on the TinyStories dataset,
 
 Trained on a single v100 32GB GPU for 3 epochs, we achieve an inference speed of ~72 tokens/sec on the same.
 
-Achieved tok/s: **
+Achieved tok/s: **148.337596** on 12th Gen Intel(R) Core(TM) i9-12900HK
 
 Learn more on how to run inference in pure C using [llama2.c](https://github.com/karpathy/llama2.c)
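For context, the "Achieved tok/s" figure added above is the number the llama2.c `run` binary prints after generation. A minimal sketch of how such a measurement might be reproduced, assuming the standard upstream llama2.c build targets and a locally exported checkpoint (the filename `model.bin` is a placeholder, not this repo's artifact); the `-t`, `-n`, and `-i` flags follow the upstream README and may differ between versions:

```sh
# Clone and build llama2.c with the optimized target (see the upstream README).
git clone https://github.com/karpathy/llama2.c
cd llama2.c
make runfast          # -Ofast build of run.c

# Run inference with a trained checkpoint; "model.bin" is an illustrative name.
# run.c reports "achieved tok/s: ..." once generation finishes.
./run model.bin -t 0.8 -n 256 -i "Once upon a time"
```

The reported tokens/sec depends heavily on the host CPU and compiler flags, which is why the README quotes both the V100 training-box figure (~72 tok/s) and the i9-12900HK figure (~148 tok/s).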