
# GPT2-Horoscopes

Open in Streamlit

## Model Description

GPT2 fine-tuned on a horoscopes dataset scraped from Horoscopes.com. The model generates a horoscope given a horoscope category.

## Uses & Limitations

### How to use

The model can be used directly with the Hugging Face `transformers` library, either by loading the tokenizer and model or through the `pipeline` API.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("shahp7575/gpt2-horoscopes")
model = AutoModelForCausalLM.from_pretrained("shahp7575/gpt2-horoscopes")
```
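Alternatively, a minimal sketch using the `text-generation` pipeline (the sampling parameters mirror those in the generation example below):

```python
from transformers import pipeline

# Build a text-generation pipeline around the fine-tuned model.
generator = pipeline("text-generation", model="shahp7575/gpt2-horoscopes")

output = generator("<|category|> love <|horoscope|>",
                   do_sample=True, top_k=40, top_p=0.95,
                   temperature=0.95, max_length=300)
print(output[0]["generated_text"])
```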

### Generation

Input text format: `<|category|> {category_type} <|horoscope|>`

Supported categories: `general`, `career`, `love`, `wellness`, `birthday`
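For illustration, a small hypothetical helper (`make_prompt` is not part of this card) that builds a prompt in the expected format:

```python
# Hypothetical helper: validate the category and build the prompt string.
def make_prompt(category: str) -> str:
    supported = {"general", "career", "love", "wellness", "birthday"}
    if category not in supported:
        raise ValueError(f"Unsupported category: {category}")
    return f"<|category|> {category} <|horoscope|>"
```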

Example:

```python
import torch

prompt = "<|category|> career <|horoscope|>"
# Encode the prompt and add a batch dimension.
prompt_encoded = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
# Sample a horoscope; generate() takes the encoded prompt, not the raw string.
sample_outputs = model.generate(prompt_encoded,
                                do_sample=True,
                                top_k=40,
                                max_length=300,
                                top_p=0.95,
                                temperature=0.95,
                                num_return_sequences=1)
```
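To turn the sampled token IDs back into text, decode them with the tokenizer (a standard step not shown in the original snippet):

```python
# Decode the generated IDs, dropping the special delimiter tokens.
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```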

For reference, this generation script can be used as well.

## Training Data

The dataset was scraped from Horoscopes.com across 5 categories, for a total of ~12k horoscopes. The dataset can be found on Kaggle.

## Training Procedure

The model starts from the pre-trained GPT2 checkpoint and is fine-tuned on the horoscopes dataset for 5 different categories. Since the fine-tuned model should also distinguish horoscopes by category type, the category label is added to each training example, delimited by the special token `<|category|>`.
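A minimal sketch of this preprocessing, assuming each training example concatenates the category and horoscope text with the delimiters described above (the card does not show the exact code):

```python
# Assumption: training examples follow the same format as the inference prompt.
def format_example(category: str, horoscope: str, eos_token: str) -> str:
    return f"<|category|> {category} <|horoscope|> {horoscope}{eos_token}"

# The delimiters are registered as special tokens and the embedding
# matrix is resized so the model can learn representations for them.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|category|>", "<|horoscope|>"]})
model.resize_token_embeddings(len(tokenizer))
```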

Training parameters (see the sketch after this list):

- EPOCHS = 5
- LEARNING RATE = 5e-4
- WARMUP STEPS = 1e2
- EPSILON = 1e-8
- SEQUENCE LENGTH = 300
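A minimal sketch of how these hyperparameters could be wired up, assuming a standard AdamW-plus-linear-warmup setup (`STEPS_PER_EPOCH` is hypothetical; the card does not state the batch size):

```python
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

EPOCHS = 5
STEPS_PER_EPOCH = 1000  # hypothetical; depends on batch size and dataset size

# LEARNING RATE = 5e-4, EPSILON = 1e-8
optimizer = AdamW(model.parameters(), lr=5e-4, eps=1e-8)
# WARMUP STEPS = 1e2, then linear decay over the remaining steps.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=STEPS_PER_EPOCH * EPOCHS)
```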

## Evaluation Results

Loss: 2.77

## Limitations

This model is fine-tuned only on horoscopes by category. Its outputs do not, and do not attempt to, represent actual horoscopes. It was developed for educational and learning purposes only.

## References