Zymrael committed on
Commit
06bc59c
1 Parent(s): 755a2d3

chore: add some model card info

Files changed (1)
  1. README.md +15 -0
README.md ADDED
@@ -0,0 +1,15 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ ---
+
+ ## StripedHyena-Hessian-7B (SH-7B)
+
+
+ ### Model Architecture
+
+ The architecture of StripedHyena-Hessian-7B is quite different from traditional decoder-only Transformers.
+
+ StripedHyena is a hybrid architecture composed of multi-head, grouped-query attention and gated convolutions arranged in [Hyena](https://arxiv.org/abs/2302.10866) blocks.
+
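
The two building blocks named in the added README, gated convolutions and grouped-query attention, can be illustrated with a minimal pure-Python sketch. This is not the StripedHyena implementation: the function names (`causal_conv`, `gated_conv`, `kv_head_for_query_head`), the scalar sequences, and the short explicit kernel are all assumptions chosen for readability. Real Hyena blocks operate on multi-channel tensors and, per the linked paper, use long convolution filters rather than a hand-written kernel.

```python
# Illustrative sketch only. All names and sizes here are hypothetical and
# are not taken from the StripedHyena code base.

def causal_conv(signal, kernel):
    """Causal 1-D convolution: output[t] depends only on signal[0..t]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            if t - k >= 0:  # never look at future positions
                acc += weight * signal[t - k]
        out.append(acc)
    return out


def gated_conv(gate, value, kernel):
    """Multiply a gate stream element-wise with a causally convolved value stream."""
    convolved = causal_conv(value, kernel)
    return [g * c for g, c in zip(gate, convolved)]


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Grouped-query attention: each group of query heads shares one key/value head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


if __name__ == "__main__":
    print(gated_conv([1.0, 0.0, 2.0], [1.0, 2.0, 3.0], [1.0, 1.0]))
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
```

In this sketch the gate decides, position by position, how much of the convolved signal passes through, while the head mapping shows how grouped-query attention lets several query heads reuse the same key/value head to reduce key/value storage.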