Update README.md

Parameters not mentioned here are the same as for GPT-2.

| Parameter       | Value   | Note                                             |
|-----------------|---------|--------------------------------------------------|
| scheduler_steps | 200,000 |                                                  |
| scheduler_alpha | 0.1     | So the LR on the last step is 0.1 × (vanilla LR) |
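As a point of reference, here is a minimal sketch of a scheduler consistent with these two values, assuming a linear decay from the base LR down to `scheduler_alpha * base_lr` over `scheduler_steps` steps (the decay shape is an assumption; only the endpoints follow from the table):

```python
import torch

def make_scheduler(optimizer, scheduler_steps=200_000, scheduler_alpha=0.1):
    # Assumed linear decay: the multiplier goes from 1.0 down to
    # scheduler_alpha, so the LR on the last step is
    # scheduler_alpha * (vanilla LR).
    def lr_lambda(step):
        progress = min(step, scheduler_steps) / scheduler_steps
        return 1.0 - (1.0 - scheduler_alpha) * progress
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# Stand-in usage: with base LR 1e-4, the LR at step 200,000 is 1e-5.
opt = torch.optim.AdamW(torch.nn.Linear(8, 8).parameters(), lr=1e-4)
sched = make_scheduler(opt)  # call sched.step() after each opt.step()
```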
## Usage

```python
from transformers import AutoTokenizer, GPT2LMHeadModel
import torch

t = AutoTokenizer.from_pretrained("BUT-FIT/Czech-GPT-2-XL-133k")
m = GPT2LMHeadModel.from_pretrained("BUT-FIT/Czech-GPT-2-XL-133k").eval()

# Try the model inference
prompt = "Nejznámějším českým spisovatelem "  # "The best-known Czech writer (is) ..."
input_ids = t.encode(prompt, return_tensors="pt")
with torch.no_grad():
    generated_text = m.generate(input_ids=input_ids,
                                do_sample=True,
                                top_p=0.95,
                                repetition_penalty=1.0,
                                temperature=0.8,
                                max_new_tokens=64,
                                num_return_sequences=1)
print(t.decode(generated_text[0], skip_special_tokens=True))
```
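For quick experiments, the same checkpoint can presumably also be driven through the high-level `pipeline` API; this is a sketch not shown in the original card, assuming the standard `text-generation` task wiring for GPT-2 checkpoints:

```python
from transformers import pipeline

# Assumption: the checkpoint works with the stock text-generation
# pipeline; only the explicit generate() call above is from the card.
generator = pipeline("text-generation", model="BUT-FIT/Czech-GPT-2-XL-133k")
out = generator("Nejznámějším českým spisovatelem ",
                do_sample=True, top_p=0.95, temperature=0.8,
                max_new_tokens=64, num_return_sequences=1)
print(out[0]["generated_text"])
```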
## Evaluation
We observed 10-shot results improving over the course of training on sentiment analysis and on HellaSwag-like commonsense reasoning.
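For context, a minimal sketch of how a 10-shot prompt for the sentiment task might be assembled (a hypothetical helper; the actual datasets, prompt templates, and scoring used for these measurements are not given in this excerpt):

```python
# Hypothetical illustration of k-shot prompt construction; the real
# evaluation templates and datasets are not specified in this card.
def build_k_shot_prompt(examples, query, k=10):
    """examples: list of (text, label) pairs; query: the text to classify."""
    shots = "\n\n".join(f"Text: {text}\nSentiment: {label}"
                        for text, label in examples[:k])
    return f"{shots}\n\nText: {query}\nSentiment:"
```

The model's continuation (or the candidate label with the higher log-likelihood) is then read off as the prediction.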