language: en
datasets:
  - huggingartists/morgenshtern
tags:
  - huggingartists
  - lyrics
  - lm-head
  - causal-lm
widget:
  - text: I am

🤖 HuggingArtists Model 🤖

MORGENSHTERN

@morgenshtern

I was made with huggingartists.

Create your own bot based on your favorite artist with the demo!

How does it work?

To understand how the model was developed, check the W&B report.

Training data

The model was trained on lyrics from MORGENSHTERN.

The dataset is available here and can be loaded with:

from datasets import load_dataset

dataset = load_dataset("huggingartists/morgenshtern")
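Once loaded, the lyrics can be held out into training and validation parts before any further fine-tuning. The following is a minimal sketch, not the project's own preprocessing: it assumes the dataset exposes a `train` split with a `text` column, and the helper names `load_lyrics` and `split_lyrics` are hypothetical.

```python
import random


def split_lyrics(texts, train_fraction=0.9, seed=0):
    """Shuffle the texts deterministically and split them into two lists."""
    items = list(texts)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


def load_lyrics():
    """Download the dataset and return the lyrics as a list of strings.

    Assumption: the data lives in a "train" split with a "text" column.
    """
    # Imported here so split_lyrics can be used without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("huggingartists/morgenshtern")
    return list(dataset["train"]["text"])
```

Calling `split_lyrics(load_lyrics())` would download the data and return the two lists; the fixed seed keeps the split the same across runs.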

Explore the data, which is tracked with W&B artifacts at every step of the pipeline.

Training procedure

The model is based on a pre-trained GPT-2 model that was fine-tuned on MORGENSHTERN's lyrics.

Hyperparameters and metrics are recorded in the W&B training run for full transparency and reproducibility.

At the end of training, the final model is logged and versioned.

How to use

You can use this model directly with a pipeline for text generation:

from transformers import pipeline
generator = pipeline('text-generation',
                     model='huggingartists/morgenshtern')
generator("I am", num_return_sequences=5)
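The pipeline call returns a list with one dictionary per generated sequence, each holding the text under the `generated_text` key. A small sketch of reading that output is given below; the helper names `extract_texts` and `generate_lyrics` and the sampling settings are illustrative assumptions, not part of the original project.

```python
def extract_texts(outputs):
    """Return the generated strings from text-generation pipeline output."""
    return [item["generated_text"] for item in outputs]


def generate_lyrics(prompt, n=3, max_new_tokens=40):
    """Generate n continuations of the prompt and return them as strings."""
    # Imported here so extract_texts can be used without `transformers` installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model="huggingartists/morgenshtern")
    outputs = generator(
        prompt,
        num_return_sequences=n,
        do_sample=True,  # sampling lets the returned sequences differ
        max_new_tokens=max_new_tokens,
    )
    return extract_texts(outputs)
```

Calling `generate_lyrics("I am")` downloads the model on first use and returns three strings, each beginning with the prompt.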

Or with the Transformers library:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the causal language model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained("huggingartists/morgenshtern")

model = AutoModelForCausalLM.from_pretrained("huggingartists/morgenshtern")
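Loading the tokenizer and model this way gives direct access to `generate`. The sketch below shows one possible way to produce a continuation; it uses `AutoModelForCausalLM`, the current causal language model class in Transformers, and assumes PyTorch is installed. The function names, `top_p` value, and token limit are illustrative choices, not settings taken from the project.

```python
def strip_prompt(prompt, generated):
    """Remove the prompt from the start of the decoded text, if present."""
    if generated.startswith(prompt):
        return generated[len(prompt):]
    return generated


def generate_text(prompt, max_new_tokens=40, seed=None):
    """Sample one continuation of the prompt and return it without the prompt."""
    # Imported here so strip_prompt can be used without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("huggingartists/morgenshtern")
    model = AutoModelForCausalLM.from_pretrained("huggingartists/morgenshtern")
    model.eval()

    if seed is not None:
        torch.manual_seed(seed)  # makes the sampled output repeatable

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            do_sample=True,
            top_p=0.95,  # nucleus sampling; an illustrative value
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return strip_prompt(prompt, decoded)
```

Calling `generate_text("I am", seed=0)` downloads the weights on first use and returns only the newly generated part of the text.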

Limitations and bias

The model suffers from the same limitations and bias as GPT-2.

In addition, the content of the artist's lyrics further affects the text generated by the model.

About

Built by Aleksey Korshuk

For more details, visit the project repository.