# v2 Model Card

## Overview
This repository contains multiple models trained using the GPT-2 architecture for generating creative stories, superhero names, and abilities. The models are designed to assist in generating narrative content based on user prompts.
## Model Variants
- Story Model: Generates stories based on prompts.
- Name Model: Generates superhero names based on story context.
- Abilities Model: Generates superhero abilities based on story context.
- Midjourney Model: Generates Midjourney image prompts for illustrating stories.
## Training Data

The models were trained on a custom dataset stored in `batch_ds_v2.txt`, which includes story prompts, superhero names, and abilities. The dataset was preprocessed to extract the relevant parts for training.
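As a minimal sketch, a file like this could be loaded with the `datasets` library; the `text` loader and split layout are assumptions, since the actual preprocessing pipeline is not included in this repository.

```python
from datasets import load_dataset

# Load the raw text file; each line becomes one example.
# The real preprocessing used for training is not part of this repo.
dataset = load_dataset("text", data_files={"train": "batch_ds_v2.txt"})
print(dataset["train"][0]["text"])
```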
## Training Procedure

- Framework: PyTorch with Hugging Face Transformers
- Model: GPT-2
- Training Arguments (see the sketch below):
  - Learning Rate: 1e-4
  - Number of Epochs: 15
  - Max Steps: 5000
  - Batch Size: Auto-detected
  - Gradient Clipping: 1.0
  - Logging Steps: 1
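These hyperparameters map roughly onto Hugging Face `TrainingArguments` as sketched below. This is a reconstruction, not the actual training script; the output directory is a placeholder, and `auto_find_batch_size` is an assumed reading of "Batch Size: Auto-detected".

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="v2/story/small",  # placeholder; one directory per model variant
    learning_rate=1e-4,
    num_train_epochs=15,
    max_steps=5000,               # when set, max_steps takes precedence over num_train_epochs
    auto_find_batch_size=True,    # assumed mapping of "Batch Size: Auto-detected"
    max_grad_norm=1.0,            # gradient clipping
    logging_steps=1,
)
```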
## Evaluation

The models were evaluated qualitatively on their ability to generate coherent and contextually relevant text. No quantitative metrics were recorded; assessments were made by inspection during development.
## Inference

To use the models for inference, send a POST request to the `/generate/<model_path>` endpoint of the Flask application. The request body should be a JSON object containing the `input_text` key.
### Example Request

```json
{
  "input_text": "[Ivan Ivanov, Lead Software Engineer, Superhero for Justice, Writing code, fixing issues, solving problems, Masculine, Long Hair, Adult]<|endoftext|>"
}
```
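For example, using Python's `requests` library (the host, port, and model path below are assumptions; adjust them to match your deployment):

```python
import requests

# Hypothetical local address and model path; adjust to your deployment.
url = "http://localhost:5000/generate/v2/story/small"
payload = {
    "input_text": "[Ivan Ivanov, Lead Software Engineer, Superhero for Justice, "
                  "Writing code, fixing issues, solving problems, Masculine, Long Hair, Adult]<|endoftext|>"
}

response = requests.post(url, json=payload)
print(response.text)  # the exact response format is not documented here
```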
## Usage

### Loading a Model

You can load a model and its tokenizer as follows:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "v2/story/small"  # change to your desired model path
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
```
### Generating Text

To generate text using the loaded model, use the following code:

```python
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
# pad_token_id is set to the EOS token because GPT-2 has no pad token.
output = model.generate(input_ids, max_length=50, do_sample=True, pad_token_id=tokenizer.eos_token_id)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
## Limitations
- The models may generate biased or nonsensical outputs based on the training data.
- They may not always understand complex prompts or context, leading to irrelevant or inaccurate responses.
- The models are sensitive to input phrasing; slight changes in the prompt can yield different results.