Spaces: sayanbanerjee32/nanogpt2_text_generator
sayanbanerjee32 committed on Jun 28, 2024

Commit d451c49 (verified) · 1 Parent(s): 89a0c0e

Update README.md

Files changed (1)

README.md CHANGED

@@ -10,4 +10,20 @@ pinned: false
 license: mit
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+## Dataset
+A collection of William Shakespeare's plays.
+- Tokenization uses the tiktoken `gpt2` tokenizer
+- Total number of tokens: 338025
+## Model
+The model is available [here](https://huggingface.co/sayanbanerjee32/nanogpt2_test)
+## The HuggingFace Spaces Gradio App
+The app takes the following inputs:
+1. Seed Text (Prompt) - The input text passed to the GPT model, from which it generates further content. If no text is provided, a single space (" ") is used as the input.
+2. Max tokens to generate - Controls the number of tokens the model generates. The default value is 100.
+3. Temperature - Accepts values between 0 and 1. Higher values introduce more randomness in next-token sampling. The default value is 0.7.
+4. Select Top N in each step - An optional field. If no value is provided (or the value is <= 0), all available tokens are considered for the next-token prediction, weighted by their softmax probability. If a number is set, only that many top tokens are considered for the next-token prediction.
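
The Temperature and Top N inputs described above can be illustrated with a minimal sketch of one sampling step. This is not taken from the app's code: the helper name `sample_next_token`, its parameters, and the choice to reject a temperature of 0 are assumptions made for illustration, and the sketch works on a plain list of logits instead of model output.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_k=None, rng=random):
    """Sample one token index from raw logits (hypothetical helper)."""
    # Assumption: a temperature of 0 is rejected here to avoid dividing by zero;
    # the app may handle that edge case differently.
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Dividing by the temperature sharpens (< 1) or flattens (> 1) the distribution.
    scaled = [x / temperature for x in logits]

    # Keep only the top_k highest-scoring tokens; None or <= 0 keeps every token.
    candidates = list(range(len(scaled)))
    if top_k is not None and top_k > 0:
        candidates = sorted(candidates, key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax weights over the surviving candidates (max subtracted for stability).
    peak = max(scaled[i] for i in candidates)
    weights = [math.exp(scaled[i] - peak) for i in candidates]

    # random.choices normalizes the weights, so they need not sum to 1.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Example: with top_k=2 only the two highest-scoring tokens (indices 0 and 1) can be drawn.
logits = [2.0, 1.0, 0.5, -1.0]
token = sample_next_token(logits, temperature=0.7, top_k=2)
```

Restricting the candidates before the softmax means the probability mass of the discarded tokens is redistributed over the ones that remain, which is why a small Top N tends to give more conservative text.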