puneeshkhanna committed on
Commit
cee7ab0
1 Parent(s): ab2a252

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -19,11 +19,11 @@ Falcon3-7B-Base supports 4 languages (english, french, spanish, portuguese) and
 
  ## Model Details
  - Architecture
- - transformer based causal decoder only architecture
+ - Transformer based causal decoder only architecture
  - 28 decoder blocks
- - grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
- - wider head dimension: 256
- - high RoPE value to support long context understanding: 1000042
+ - Grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
+ - Wider head dimension: 256
+ - High RoPE value to support long context understanding: 1000042
  - 32k context length
  - 131k vocab size
  - Pretrained on 14 Teratokens of datasets comprising of web, code, STEM, high quality and multilingual data using 1024 H100 GPU chips
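
The "GQA for faster inference" bullet can be made concrete with a rough key/value cache estimate. The sketch below is only an illustration built from the numbers in the Model Details list (28 decoder blocks, 12 query heads, 4 KV heads, head dimension 256, 32k context). It assumes that "32k" means 32,768 tokens and that cache entries are stored in 16 bits; neither is stated in the README. The helper `kv_cache_bytes_per_token` is a hypothetical name, not part of any library.

```python
# Values copied from the Model Details list.
NUM_LAYERS = 28
NUM_QUERY_HEADS = 12
NUM_KV_HEADS = 4
HEAD_DIM = 256

# Assumptions, not stated in the README:
CONTEXT_LENGTH = 32 * 1024  # "32k" read as 32,768 tokens
BYTES_PER_VALUE = 2         # 16-bit cache entries (e.g. bfloat16)


def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value):
    """Bytes needed to cache keys and values for one token across all layers."""
    # Factor 2: one key vector and one value vector per KV head per layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Cache cost with the 4 KV heads listed in the README (GQA).
gqa_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE)

# Hypothetical baseline: one KV head per query head (plain multi-head attention).
mha_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, BYTES_PER_VALUE)

# Number of query heads that share each KV head.
queries_per_kv_head = NUM_QUERY_HEADS // NUM_KV_HEADS

# Cache size for a single sequence filling the whole context window.
gqa_full_context_gib = gqa_bytes * CONTEXT_LENGTH / 2**30

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"KV cache per token (GQA): {gqa_bytes / 1024:.0f} KiB")
print(f"KV cache per token (MHA baseline): {mha_bytes / 1024:.0f} KiB")
print(f"KV cache at full context (GQA): {gqa_full_context_gib:.1f} GiB")
```

Under these assumptions the cache shrinks by the ratio of query heads to KV heads (3x here), which is where the inference saving comes from. Real deployments may differ with the cache dtype, batch size, and any cache quantization.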