Danbooru Tag Implications Model

A FLAN-T5 Base model fine-tuned to predict Danbooru tag implications. Given a tag, the model outputs all tags that it implies according to Danbooru's tag implication system.

Model Description

This model learns the structured relationships between Danbooru tags, specifically the "implication" relationships where one tag automatically implies another. For example:

  • bikini implies swimsuit
  • cat_ears implies animal_ears
  • striped_panties implies both panties and striped_clothes

Base Model: google/flan-t5-base (248M parameters)

Training Data: 32,331 tag implication pairs from Danbooru

Task Format: input is implications: <tag>; output is <implied_tag1>, <implied_tag2>, ...

Use Cases

  1. Tag completion in image generation workflows - Automatically add implied tags to prompts
  2. Tag validation - Ensure tag sets include all necessary implied tags
  3. Tag understanding - Learn the hierarchical relationships in Danbooru's tagging system

Training Details

Dataset

  • Source: Danbooru tag implications database (public data)
  • Size: 32,331 training examples
  • Filtering: Removed series-specific tags (e.g., tags with parentheses) from generic tag implications
  • Split: 99% train, 1% eval
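Each training example is stored as one JSON object per line. A minimal sketch of reading one record, assuming the field layout used by the guard snippet later in this card (the input field name is confirmed there; the output field name is an assumption):

```python
import json

# Hypothetical record; 'input' matches the guard snippet later in this
# card, while the 'output' field name is an assumption.
line = '{"input": "implications: bikini", "output": "swimsuit"}'
record = json.loads(line)

# Strip the task prefix to recover the bare tag.
tag = record["input"].removeprefix("implications: ")
print(tag, "->", record["output"])  # bikini -> swimsuit
```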

Training Configuration

Seq2SeqTrainingArguments(
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=5e-5,
    num_train_epochs=3,
    bf16=True,
    predict_with_generate=True,
    generation_max_length=128,
    generation_num_beams=4,
)
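With these settings, a rough back-of-the-envelope step count (assuming a single device and no gradient accumulation, neither of which the arguments above specify):

```python
import math

# 99% of 32,331 examples go to the train split.
train_examples = int(32331 * 0.99)                # about 32,007
steps_per_epoch = math.ceil(train_examples / 16)  # batch size 16
total_steps = steps_per_epoch * 3                 # 3 epochs
print(steps_per_epoch, total_steps)  # 2001 6003
```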

Training Results

  • Final eval loss: ~0.027
  • Training time: ~36 minutes on single GPU
  • Inference speed: ~200ms per tag (GPU)

Usage

Basic Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Elldreth/danbooru-tag-implications-flan-t5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def get_implications(tag):
    input_text = f"implications: {tag}"
    inputs = tokenizer(input_text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Examples
print(get_implications("bikini"))           # Output: swimsuit
print(get_implications("cat_ears"))         # Output: animal_ears
print(get_implications("striped_panties"))  # Output: panties, striped_clothes

Expanding a Full Tag Set

def expand_tags(tags_string):
    """Expand all tags in a comma-separated string"""
    tags = [t.strip() for t in tags_string.split(',')]
    expanded = set(tags)
    
    for tag in tags:
        implications = get_implications(tag)
        if implications:
            expanded.update([t.strip() for t in implications.split(',')])
    
    return ', '.join(sorted(expanded))

# Example
input_tags = "1girl, bikini, cat_ears"
expanded_tags = expand_tags(input_tags)
print(expanded_tags)
# Output: 1girl, animal_ears, bikini, cat_ears, swimsuit
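Note that expand_tags performs a single expansion pass. The examples further down (striped_bikini yielding both bikini and swimsuit) suggest the model already emits transitive implications, but iterating the expansion to a fixed point guards against partial outputs. A sketch, using a stub dictionary in place of the model so the example is self-contained:

```python
def expand_to_fixpoint(tags, get_implications):
    """Repeatedly expand tags until no new implications appear."""
    expanded = set(tags)
    frontier = set(tags)
    while frontier:
        new_tags = set()
        for tag in frontier:
            implied = get_implications(tag)
            if implied:
                new_tags.update(t.strip() for t in implied.split(","))
        frontier = new_tags - expanded  # only recurse into unseen tags
        expanded |= frontier
    return sorted(expanded)

# Stub standing in for the model call (illustration only).
STUB = {"striped_bikini": "bikini, striped_clothes", "bikini": "swimsuit"}
print(expand_to_fixpoint(["1girl", "striped_bikini"], lambda t: STUB.get(t, "")))
# ['1girl', 'bikini', 'striped_bikini', 'striped_clothes', 'swimsuit']
```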

Important: Guard Against Unknown Tags

The model was trained on specific Danbooru tags. For production use, you should only query tags that exist in the training data to avoid hallucinations:

import json

# Load the training dataset to get valid tags
tags_with_implications = set()
with open('tag_implications_dataset.jsonl', 'r') as f:
    for line in f:
        data = json.loads(line)
        tag = data['input'].removeprefix('implications: ')
        tags_with_implications.add(tag)

def get_implications_safe(tag):
    if tag not in tags_with_implications:
        return ""  # Tag has no known implications
    return get_implications(tag)
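Shipping the full training JSONL just to rebuild this guard set is wasteful; you can derive the set once and persist it as a plain text file instead. A sketch (file names are illustrative):

```python
import json

def build_valid_tags(jsonl_path):
    """Collect every tag that appears as an input in the training data."""
    tags = set()
    with open(jsonl_path, 'r') as f:
        for line in f:
            record = json.loads(line)
            tags.add(record['input'].removeprefix('implications: '))
    return tags

def save_valid_tags(tags, path):
    """Persist the tag set, one tag per line, for cheap loading later."""
    with open(path, 'w') as f:
        f.write('\n'.join(sorted(tags)))

def load_valid_tags(path):
    with open(path) as f:
        return set(f.read().splitlines())
```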

Examples

Clothing Tags

Input              Output
bikini             swimsuit
school_swimsuit    swimsuit
sleeveless_dress   dress, sleeveless
striped_panties    panties, striped_clothes

Animal Features

Input      Output
cat_ears   animal_ears
dog_ears   animal_ears
fox_tail   tail

Complex Implications

Input            Output
striped_bikini   bikini, striped_clothes, swimsuit
black_dress      dress

Limitations

  1. Only works with Danbooru tags - The model is trained on specific Danbooru tag names (underscore-separated)
  2. No natural language - Input must be exact tag names, not descriptions
  3. May hallucinate on unknown tags - Always use the guard mechanism for production
  4. Generic tags only - Series-specific tags (with parentheses) were filtered from generic tag implications
  5. English-centric - Primarily English tag names

Training Data Filtering

To prevent generic tags from suggesting series-specific tags, we applied this rule:

  • If an input tag has no parentheses, output tags with parentheses are filtered out
  • Example: bikini won't suggest swimsuit_(series_name)
  • Series-specific tags can still imply other series-specific tags
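The rule above amounts to a simple predicate over each (input, output) pair; a sketch of how such a filter could look:

```python
def keep_pair(input_tag, implied_tag):
    """Drop pairs where a generic input would imply a series-specific tag."""
    if "(" not in input_tag and "(" in implied_tag:
        return False
    return True

print(keep_pair("bikini", "swimsuit"))                # True
print(keep_pair("bikini", "swimsuit_(series_name)"))  # False: filtered out
print(keep_pair("tag_(series)", "other_(series)"))    # True: series-to-series kept
```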

Hardware Requirements

  • Inference: ~1.5GB VRAM (GPU) or 2GB RAM (CPU)
  • Model size: 945 MB on disk
  • Recommended: GPU with CUDA for best performance

Citation

If you use this model, please cite the Danbooru tag implications data:

Danbooru Tag Implications Database
https://danbooru.donmai.us/

License

Apache 2.0 - Same as the base FLAN-T5 model

Model Card Authors

Created as part of the Danbooru Tag Expander project for ComfyUI.
