puneeshkhanna committed on
Commit 1d32dc8
Parent(s): 204fce4
Update README.md

README.md CHANGED
```diff
@@ -24,6 +24,7 @@ Falcon3-7B-Base supports 4 languages (english, french, spanish, portuguese) and
 - Grouped query attention (GQA) for faster inference: 12 query heads and 4 key value heads
 - Wider head dimension: 256
 - High RoPE value to support long context understanding: 1000042
+- Uses SwiGLU and RMSNorm
 - 32K context length
 - 131K vocab size
 - Pretrained on 14 Gigatokens of datasets comprising of web, code, STEM, high quality and mutlilingual data using 2048 H100 GPU chips
```
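The architectural features named in the diff (GQA with 12 query heads sharing 4 key-value heads, SwiGLU MLPs, RMSNorm) can be sketched in plain NumPy. This is an illustrative sketch of the general techniques only, not Falcon3's actual implementation; all function names and weight shapes here are hypothetical.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the reciprocal root-mean-square of the features;
    # unlike LayerNorm there is no mean subtraction and no bias.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu_mlp(x, w_gate, w_up, w_down):
    # SwiGLU MLP: silu(x @ W_gate) gates (x @ W_up), then project back down.
    gate = x @ w_gate
    silu = gate * (1.0 / (1.0 + np.exp(-gate)))  # SiLU = x * sigmoid(x)
    return (silu * (x @ w_up)) @ w_down

def expand_kv_heads(kv, n_q_heads=12, n_kv_heads=4):
    # GQA: each of the 4 KV heads is shared by 12 / 4 = 3 query heads,
    # so the KV tensor is repeated along the head axis before attention.
    return np.repeat(kv, n_q_heads // n_kv_heads, axis=0)
```

Sharing KV heads this way shrinks the KV cache by a factor of `n_q_heads / n_kv_heads` (3x here), which is the inference-speed benefit the README points to.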