# GPT2-Horoscopes

## Model Description
GPT2 fine-tuned on a horoscopes dataset scraped from Horoscopes.com. Given a horoscope category, the model generates a horoscope for that category.
## Uses & Limitations

### How to use

The model can be used directly with the Hugging Face `transformers` API:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("shahp7575/gpt2-horoscopes")
model = AutoModelForCausalLM.from_pretrained("shahp7575/gpt2-horoscopes")
```
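For quick experimentation, the same checkpoint can also be driven through the higher-level `pipeline` API. A minimal sketch (the generation parameters here are illustrative, not required):

```python
from transformers import pipeline

# Build a text-generation pipeline from the fine-tuned checkpoint.
generator = pipeline("text-generation", model="shahp7575/gpt2-horoscopes")

# The prompt format is described under Generation below.
print(generator("<|category|> love <|horoscope|>", max_length=100)[0]["generated_text"])
```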
### Generation

Input text format: `<|category|> {category_type} <|horoscope|>`

Supported categories: `general`, `career`, `love`, `wellness`, `birthday`

Example:
```python
import torch

prompt = "<|category|> career <|horoscope|>"
prompt_encoded = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)

# Sample one horoscope continuation from the encoded prompt.
sample_outputs = model.generate(prompt_encoded,
                                do_sample=True,
                                top_k=40,
                                max_length=300,
                                top_p=0.95,
                                temperature=0.95,
                                num_return_sequences=1)
```
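The output is a tensor of token IDs, which can be decoded back to text (`skip_special_tokens` strips the prompt markers only if they were registered as special tokens; otherwise they appear in the output):

```python
# Decode the first (and only) sampled sequence.
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```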
For reference, this generation script can be used as well.
## Training Data

The dataset was scraped from Horoscopes.com across 5 categories, with a total of ~12k horoscopes. The dataset can be found on Kaggle.
## Training Procedure

The model starts from the pre-trained GPT2 checkpoint and is fine-tuned on the horoscopes dataset covering 5 different categories. Since the fine-tuned model should also learn to distinguish horoscopes of different category types, the category is prepended to each training example, separated by the special token `<|category|>`.
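A minimal sketch of how a training example could be assembled under this scheme (the helper and field names below are hypothetical, not taken from the original training script):

```python
def build_training_example(category: str, horoscope: str) -> str:
    """Prefix a horoscope with its category using the special-token format.

    `category` and `horoscope` are assumed fields from the scraped dataset;
    the layout matches the prompt format used at generation time.
    """
    return f"<|category|> {category} <|horoscope|> {horoscope}"

# e.g. build_training_example("career", "Today is a good day to ...")
```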
Training Parameters:
- EPOCHS = 5
- LEARNING RATE = 5e-4
- WARMUP STEPS = 1e2
- EPSILON = 1e-8
- SEQUENCE LENGTH = 300
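As a rough guide, these hyperparameters would map onto a standard `transformers` fine-tuning setup along these lines. This is a sketch only: the optimizer choice and `train_dataloader` are assumptions, and the actual training script may differ.

```python
import torch
from transformers import get_linear_schedule_with_warmup

EPOCHS = 5

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, eps=1e-8)

# Linear warmup for the first 100 steps, then linear decay to zero.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=EPOCHS * len(train_dataloader),  # train_dataloader is assumed
)
```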
## Evaluation Results

Loss: 2.77
## Limitations

This model is fine-tuned only on horoscopes, split by category. The generated texts do not, and are not intended to, represent actual horoscopes. The model was developed for educational and learning purposes only.