---
license: creativeml-openrail-m
language:
- en
widget:
- text: 1girl, fate
- text: 1boy, league of legends
- text: 1girl, genshin impact
- text: 1boy, national basketball association
- text: 1girl, spy x
- text: 1girl, absurdres
tags:
- stable-diffusion
- anime
- anything-v4
- art
- arxiv:2210.14140
datasets:
- FredZhang7/anime-prompts-180K
---

## Fast Anime PromptGen

The main model (`pytorch_model.bin`) was trained on a dataset of **80,000** anime tags for 3 epochs.
I fetched the tags from the [Safebooru API endpoint](https://safebooru.donmai.us/posts/random.json), but only accepted the ones with **up_score ≥ 8** and without any [blacklisted tags](./blacklist.txt).

I didn't release the V1 model because it only generated gibberish prompts.
After trying everything I could to correct that behavior, I eventually figured out that the gibberish came not from the model or the training duration, but from the random usernames in the training data.
Here's the complete [prompt preprocessing algorithm](./preprocess.py).

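
The acceptance rule described above (score threshold plus blacklist) can be sketched roughly as follows. This is a minimal illustration, not the contents of `preprocess.py`: the `tag_string` field name, the underscore-to-space conversion, the helper names `accept_post` and `to_prompt`, and the sample posts are all assumptions made for the example, and the username cleanup is not shown.

```python
def accept_post(post: dict, blacklist: set[str], min_up_score: int = 8) -> bool:
    """Return True if the post meets the score threshold and has no blacklisted tag."""
    # Assumption: tags arrive as one space-separated string under "tag_string".
    tags = set(post.get("tag_string", "").split())
    return post.get("up_score", 0) >= min_up_score and tags.isdisjoint(blacklist)


def to_prompt(post: dict) -> str:
    """Turn a space-separated tag string into a comma-separated prompt."""
    # Assumption: underscores inside a tag are shown as spaces in the prompt.
    tags = post.get("tag_string", "").split()
    return ", ".join(tag.replace("_", " ") for tag in tags)


# Hypothetical sample data, not real API output.
posts = [
    {"up_score": 12, "tag_string": "1girl long_hair smile"},
    {"up_score": 3, "tag_string": "1boy short_hair"},       # rejected: score too low
    {"up_score": 20, "tag_string": "1girl blocked_tag"},    # rejected: blacklisted tag
]
blacklist = {"blocked_tag"}  # placeholder for the entries in blacklist.txt

prompts = [to_prompt(p) for p in posts if accept_post(p, blacklist)]
print(prompts)
```

Only the first sample post passes both checks here, so `prompts` ends up with a single entry.
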
Todo:
- upload Danbooru model

## Text-to-image Examples

Prefix *1girl* | [Generated *1girl* prompts](./anime_girl_settings.txt) | Model *Anything V4*

![](./anime_girls.png)

Prefix *1boy* | [Generated *1boy* prompts](./anime_boy_settings.txt) | Model *Anything V4*

![](./anime_boys.png)

## Contrastive Search

```
pip install --upgrade transformers
```

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline

# load the base tokenizer and register the padding token
tokenizer = GPT2Tokenizer.from_pretrained('distilgpt2')
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
model = GPT2LMHeadModel.from_pretrained('FredZhang7/anime-anything-promptgen')

prompt = r'1girl, genshin'

# generate text using fine-tuned model
nlp = pipeline('text-generation', model=model, tokenizer=tokenizer)

# generate 10 samples with top-k sampling (do_sample=True, top_k=4)
outs = nlp(prompt, max_length=76, num_return_sequences=10, do_sample=True, repetition_penalty=1.1, temperature=0.7, top_k=4, early_stopping=True)

print('\nInput:\n' + 100 * '-')
print('\033[96m' + prompt + '\033[0m')
print('\nOutput:\n' + 100 * '-')
for i in range(len(outs)):
    # collapse double spaces and remove trailing commas
    outs[i] = str(outs[i]['generated_text']).replace('  ', ' ').rstrip(',')
print('\033[92m' + '\n\n'.join(outs) + '\033[0m\n')
```

Output Example:

![](./contrastive_search.png)

Please see [Fast GPT PromptGen](https://huggingface.co/FredZhang7/distilgpt2-stable-diffusion-v2) for more info on the pipeline parameters.

## Tips

- If you feel like a generated anime character doesn't show emotions, try emoticons like `;o`, `:o`, `;p`, `:d`, `:p`, and `;d` in the prompt. I often use `happy smirk`, `happy smile`, `laughing closed eyes`, etc. to make the characters more lively and expressive.
- Adding `absurdres`, instead of `highres` and `masterpiece`, to a prompt tends to increase the sharpness of a generated image.
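
One way to apply these tips to a generated prompt is to append the extra tags after generation. The helper below is only a sketch: `add_tags` is a hypothetical name and the example prompt is made up, not model output.

```python
def add_tags(prompt: str, extra_tags: list[str]) -> str:
    """Append tags to a comma-separated prompt, skipping ones already present."""
    # Split on commas and drop empty pieces (e.g. from a trailing comma).
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    for tag in extra_tags:
        if tag not in tags:
            tags.append(tag)
    return ", ".join(tags)


# Made-up prompt standing in for one generated by the model.
generated = "1girl, genshin impact, long hair,"
print(add_tags(generated, ["happy smile", ":d", "absurdres"]))
```

Skipping tags that are already present keeps the prompt from repeating `absurdres` or an emoticon the model already produced.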