---
title: Nanogpt2 Text Generator
emoji: 🏢
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 4.37.1
app_file: app.py
pinned: false
license: mit
---

## Dataset
A collection of William Shakespeare's plays.
- Tokenizer: the GPT-2 encoding from tiktoken
- Total number of tokens: 338025
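
As a rough illustration of how a token count like the one above can be obtained, the sketch below counts tokens with tiktoken's GPT-2 encoding. The helper name `count_tokens` and its optional `encoder` argument are assumptions made for this example, not code taken from this repository.

```python
def count_tokens(text, encoder=None):
    """Return the number of tokens in `text`.

    `encoder` is a hypothetical hook: any object with an `encode(str)`
    method works. When omitted, tiktoken's GPT-2 encoding is loaded.
    """
    if encoder is None:
        # Imported here so the helper can also be used with a custom encoder
        # when tiktoken is not installed.
        import tiktoken

        encoder = tiktoken.get_encoding("gpt2")
    return len(encoder.encode(text))
```

Calling `count_tokens(text)` on the full corpus text would give the dataset's token count under this encoding; the exact figure depends on the file used.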

## Model

The model is available [here](https://huggingface.co/sayanbanerjee32/nanogpt2_test)

## The HuggingFace Spaces Gradio App

The app takes the following inputs:
1. Seed Text (Prompt) - The input text given to the GPT model, which the model then continues. If no text is provided, a single space (" ") is used as the input.
2. Max tokens to generate - Controls the number of tokens the model generates. The default value is 100.
3. Temperature - Accepts values between 0 and 1. Higher values introduce more randomness into next-token generation. The default value is 0.7.
4. Select Top N in each step - An optional field. If no value is provided (or the value is <= 0), all available tokens are considered for the next-token prediction, based on their softmax probabilities. If a number is set, only that many top tokens are considered.
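
The inputs above can be illustrated with a minimal, dependency-free sketch of temperature scaling and top-N filtering for a single generation step. This is not the code in `app.py`; the function names `normalize_inputs` and `sample_next_token` are hypothetical, and treating a temperature of 0 as greedy selection is an assumption made here to avoid dividing by zero.

```python
import math
import random


def normalize_inputs(prompt, top_n):
    """Apply the fallbacks described above (hypothetical helper)."""
    seed_text = prompt if prompt else " "  # empty prompt -> single space
    k = top_n if top_n is not None and top_n > 0 else None  # <= 0 -> no limit
    return seed_text, k


def sample_next_token(logits, temperature=0.7, top_n=None, rng=random):
    """Pick the index of the next token from a list of raw logits."""
    indices = list(range(len(logits)))
    if temperature <= 0:
        # Assumption: temperature 0 means "always take the most likely token".
        return max(indices, key=lambda i: logits[i])

    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Top-N filtering: keep only the N highest-scoring candidates.
    if top_n is not None and 0 < top_n < len(scaled):
        candidates = sorted(indices, key=lambda i: scaled[i], reverse=True)[:top_n]
    else:
        candidates = indices

    # Softmax over the remaining candidates (shifted by the max for stability).
    peak = max(scaled[i] for i in candidates)
    weights = [math.exp(scaled[i] - peak) for i in candidates]

    # random.choices accepts unnormalized weights.
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For example, `sample_next_token([0.1, 5.0, 0.2, 4.0], temperature=0.7, top_n=2)` can only return index 1 or 3, since the other two tokens are filtered out before sampling.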