puneeshkhanna committed cee7ab0 (parent: ab2a252)

Update README.md
README.md CHANGED

@@ -19,11 +19,11 @@ Falcon3-7B-Base supports 4 languages (english, french, spanish, portuguese) and
 
 ## Model Details
 - Architecture
--
+- Transformer based causal decoder only architecture
 - 28 decoder blocks
--
--
--
+- Grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
+- Wider head dimension: 256
+- High RoPE value to support long context understanding: 1000042
 - 32k context length
 - 131k vocab size
 - Pretrained on 14 Gigatokens of datasets comprising web, code, STEM, high quality and multilingual data using 2048 H100 GPU chips
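For readers who want to check these architecture details against the published checkpoint, here is a minimal sketch (not part of this commit) that reads them from the model config with `transformers`. The repo id `tiiuae/Falcon3-7B-Base` and the assumption that the checkpoint exposes a standard Llama-style config (`num_hidden_layers`, `num_key_value_heads`, `rope_theta`, etc.) are mine, not taken from the diff.

```python
# Sketch: map the README's architecture bullets onto Hugging Face config fields.
# Assumes the checkpoint is published as "tiiuae/Falcon3-7B-Base" and uses a
# standard Llama-style config layout.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tiiuae/Falcon3-7B-Base")

print(config.num_hidden_layers)        # expected: 28 decoder blocks
print(config.num_attention_heads)      # expected: 12 query heads (GQA)
print(config.num_key_value_heads)      # expected: 4 KV heads (GQA)
print(config.hidden_size // config.num_attention_heads)  # expected: 256 head dimension
print(config.rope_theta)               # expected: 1000042 (high RoPE base for long context)
print(config.max_position_embeddings)  # expected: ~32k context length
print(config.vocab_size)               # expected: ~131k vocab size
```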