rpand002 committed on
Commit 6bbcb9d · verified · 1 Parent(s): f6d0ad7

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -74,7 +74,7 @@ Granite-3.1-1B-A400M-Base is based on a decoder-only sparse Mixture of Experts (
  | Number of experts | — | — | **32** | 40 |
  | MoE TopK | — | — | **8** | 8 |
  | Initialization std | 0.1 | 0.1 | **0.1** | 0.1 |
- | Sequence length | 128K | 128k | **128k** | 128k |
+ | Sequence length | 128K | 128K | **128K** | 128K |
  | Position embedding | RoPE | RoPE | **RoPE** | RoPE |
  | # Parameters | 2.5B | 8.1B | **1.3B** | 3.3B |
  | # Active parameters | 2.5B | 8.1B | **400M** | 800M |