Update README.md
README.md CHANGED
@@ -24,7 +24,7 @@ Gemma v2 is a large language model released by Google on Jun 27th 2024.
 - Original model: [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
 
 The model is packaged into executable weights, which we call
-[llamafiles](https://github.com/Mozilla-Ocho/llamafile)
+[llamafiles](https://github.com/Mozilla-Ocho/llamafile). This makes it
 easy to use the model on Linux, MacOS, Windows, FreeBSD, OpenBSD, and
 NetBSD for AMD64 and ARM64.
 
@@ -75,11 +75,9 @@ of the README.
 
 When using the browser GUI, you need to fill out the following fields.
 
-Prompt template:
+Prompt template (note: this is for chat; Gemma doesn't have a system role):
 
 ```
-<start_of_turn>system
-{{prompt}}<end_of_turn>
 {{history}}
 <start_of_turn>{{char}}
 ```
@@ -109,9 +107,14 @@ AMD64.
 
 ## About Quantization Formats
 
-This model works
-
-
+This model works well with any quantization format. Q6\_K is the best
+choice overall here. We tested, with [our 27b Gemma2
+llamafiles](https://huggingface.co/jartine/gemma-2-27b-it-llamafile),
+that the llamafile implementation of Gemma2 is able to produce
+identical responses to the Gemma2 model that's hosted by Google on
+aistudio.google.com. Therefore we'd assume these 9b llamafiles are also
+faithful to Google's intentions. If you encounter any divergences, then
+try using the BF16 weights, which have the original fidelity.
 
 ---
 
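For reference, here is a minimal Python sketch of how the prompt template in the diff above expands into a full Gemma 2 prompt. The `user`/`model` role names come from Gemma's documented chat format; the exact shape of the `{{history}}` lines is an assumption about how the browser GUI renders past turns, not something this README specifies.

```python
def format_gemma_prompt(messages, char="model"):
    """Render chat turns into Gemma 2's <start_of_turn>/<end_of_turn> format."""
    history = ""
    for role, text in messages:  # role is "user" or "model"; Gemma 2 has no system role
        history += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
    # Matches the template above: {{history}} followed by <start_of_turn>{{char}}
    return f"{history}<start_of_turn>{char}\n"


print(format_gemma_prompt([("user", "Why is the sky blue?")]))
# <start_of_turn>user
# Why is the sky blue?<end_of_turn>
# <start_of_turn>model
```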
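The quantization note above suggests falling back to the BF16 weights if responses ever diverge. A hedged sketch of one way to spot-check that locally: send the same greedy (temperature 0) request to two running llamafiles, one loaded with the Q6\_K weights and one with BF16, through the OpenAI-compatible endpoint the llamafile server exposes, and compare the answers. The ports, model name, and prompt below are placeholders, not values from this README.

```python
import json
import urllib.request


def chat(port, prompt):
    """Ask a locally running llamafile server for a greedy chat completion."""
    payload = {
        "model": "gemma-2-9b-it",  # placeholder; the server answers with whatever weights it loaded
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # greedy decoding so the two runs are comparable
        "seed": 0,
    }
    req = urllib.request.Request(
        f"http://localhost:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


question = "List three facts about the moon."
q6_answer = chat(8080, question)    # llamafile started with the Q6_K weights
bf16_answer = chat(8081, question)  # llamafile started with the BF16 weights
print("identical" if q6_answer == bf16_answer else "divergence found")
```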