puneeshkhanna committed on
Commit 1d32dc8
1 Parent(s): 204fce4

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -24,6 +24,7 @@ Falcon3-7B-Base supports 4 languages (english, french, spanish, portuguese) and
  - Grouped query attention (GQA) for faster inference: 12 query heads and 4 key value heads
  - Wider head dimension: 256
  - High RoPE value to support long context understanding: 1000042
+ - Uses SwiGLU and RMSNorm
  - 32K context length
  - 131K vocab size
  - Pretrained on 14 Gigatokens of datasets comprising of web, code, STEM, high quality and mutlilingual data using 2048 H100 GPU chips
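
For readers unfamiliar with the terms in the added bullet, here is a minimal pure-Python sketch of RMSNorm and a SwiGLU feed-forward block, plus the head arithmetic implied by the GQA bullet. It is not taken from the Falcon3 code base. The function names, the `eps` default, the weight layout (one row per output feature), and the contiguous query-to-KV head grouping are all assumptions made for illustration; only the numbers 12, 4, and 256 come from the README.

```python
import math

# Values quoted in the README bullets.
N_QUERY_HEADS = 12
N_KV_HEADS = 4
HEAD_DIM = 256


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a per-feature gain.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    The eps default is an assumption, not a Falcon3 setting.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(z):
    """SiLU (Swish) activation, z * sigmoid(z), written to avoid overflow."""
    if z >= 0:
        return z / (1.0 + math.exp(-z))
    e = math.exp(z)
    return z * e / (1.0 + e)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down( silu(gate(x)) * up(x) ), without biases."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def kv_head_for(query_head):
    """Map a query head index to the KV head it shares under GQA.

    Assumes contiguous grouping (heads 0-2 share KV head 0, and so on).
    """
    group_size = N_QUERY_HEADS // N_KV_HEADS  # 3 query heads per KV head
    return query_head // group_size


if __name__ == "__main__":
    x = [3.0, 4.0]
    print(rms_norm(x, [1.0, 1.0]))
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(swiglu_ffn(x, identity, identity, identity))
    print([kv_head_for(h) for h in range(N_QUERY_HEADS)])
    print("query projection width:", N_QUERY_HEADS * HEAD_DIM)
```

With 12 query heads of dimension 256, the query projection is 3072 wide, while the key and value projections are each 4 × 256 = 1024 wide. That reduction in key/value width is what shrinks the KV cache and speeds up inference.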