sayanbanerjee32 committed on
Commit
d451c49
·
verified ·
1 Parent(s): 89a0c0e

Update README.md

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -10,4 +10,20 @@ pinned: false
  license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ## Dataset
+ Collection of William Shakespeare plays
+ - Tokenization uses the GPT-2 tokenizer from tiktoken
+ - Total number of tokens: 338025
+
+ ## Model
+
+ The model is available [here](https://huggingface.co/sayanbanerjee32/nanogpt2_test)
+
+ ## The HuggingFace Spaces Gradio App
+
+ The app takes the following inputs:
+ 1. Seed Text (Prompt) - The input text from which the GPT model generates further content. If no text is provided, a single space (" ") is used as the input.
+ 2. Max tokens to generate - Controls the number of tokens the model generates. The default value is 100.
+ 3. Temperature - Accepts values between 0 and 1. A higher value introduces more randomness into next-token generation. The default value is 0.7.
+ 4. Select Top N in each step - Optional. If no value is provided (or the value is <= 0), all available tokens are considered for the next-token prediction based on softmax probability. If a number is set, only that many top tokens are considered for the next-token prediction.
+
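The Temperature and Top N inputs described in the README can be illustrated with a small sketch. This is not the app's actual code: the function names `next_token_probs` and `sample_next_token` are hypothetical, the logits are made-up numbers, and treating a temperature of 0 as greedy decoding is an assumption of this sketch. It uses only the Python standard library.

```python
import math
import random


def next_token_probs(logits, temperature=0.7, top_n=0):
    """Return one probability per token id after temperature scaling and top-N filtering."""
    if temperature <= 0:
        # Assumption of this sketch: a temperature of 0 means greedy decoding,
        # so all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Dividing by a temperature below 1 sharpens the distribution;
    # values closer to 1 leave more probability on less likely tokens.
    scaled = [x / temperature for x in logits]

    if top_n and 0 < top_n < len(scaled):
        # Keep the top_n largest scaled logits and mask out the rest.
        # Tokens tied with the cutoff value are also kept.
        cutoff = sorted(scaled, reverse=True)[top_n - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]

    # Numerically stable softmax; masked tokens get probability 0.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=0.7, top_n=0, rng=random):
    """Draw one token id according to the filtered probabilities."""
    probs = next_token_probs(logits, temperature, top_n)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for a 4-token vocabulary.
example_logits = [2.0, 1.0, 0.5, -1.0]
print(next_token_probs(example_logits, temperature=0.7, top_n=2))
print(sample_next_token(example_logits, temperature=0.7, top_n=2))
```

With `top_n=2`, only the two highest-scoring tokens can be drawn; with `top_n=0` (or `None`), every token keeps a non-zero probability, matching the README's description of the optional field.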