---
language:
- en
license: mit
datasets:
- Salesforce/wikitext
---

This is a custom implementation of GPT-2 in which we replace the attention module with our own implementation. We currently keep the standard softmax; in future commits we plan to swap the softmax in attention for alternative softmax variants.
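As a minimal sketch (not this repo's actual code), attention with a swappable softmax could look like the following; `attention` and `softmax_fn` are illustrative names, not part of the released model:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, softmax_fn=F.softmax):
    # Scaled dot-product attention where the normalization function is
    # a parameter, so softmax variants can be dropped in later.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    weights = softmax_fn(scores, dim=-1)
    return weights @ v

# Toy shapes: (batch, sequence length, head dimension)
q = k = v = torch.randn(1, 4, 8)
out = attention(q, k, v)
```

Passing a different `softmax_fn` (for example, a temperature-scaled or sparse variant) changes only the attention-weight normalization while leaving the rest of the computation intact.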

We build directly on the Hugging Face GPT-2 model: https://huggingface.co/openai-community/gpt2

This model was fine-tuned on the WikiText-2 dataset: https://paperswithcode.com/dataset/wikitext-2


Base model: Hugging Face GPT-2