metadata
library_name: transformers
license: apache-2.0
datasets:
  - Gustavosta/Stable-Diffusion-Prompts
language:
  - en
tags:
  - completion
inference:
  parameters:
    max_new_tokens: 20
    do_sample: true
    temperature: 2
    num_beams: 10
    repetition_penalty: 1.2
    top_k: 40
    top_p: 0.75

MagicPrompt TinyStories-33M (Merged)

Info

Magic prompt completion model trained on a dataset of 70k Stable Diffusion prompts. Base model: TinyStories-33M. Inspired by MagicPrompt-Stable-Diffusion.

The model seems pretty decent for 33M params: it picks up the comma-separated style of Stable Diffusion prompts, but it clearly lacks any real understanding of what it is describing. Whether you would use this over a small GPT-2 based model is up to you.

Examples

Generation settings: max_new_tokens=40, do_sample=True, temperature=2.0, num_beams=10, repetition_penalty=1.2, top_k=40, top_p=0.75, eos_token_id=tokenizer.eos_token_id (there may be better settings).
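The settings above could be passed to `generate` roughly as follows. This is a minimal sketch, not the exact script used for the examples: `MODEL_ID` is a placeholder you need to replace with this model's Hub repo id or a local path, and `complete_prompt` is a hypothetical helper name.

```python
# Placeholder (assumption): replace with this model's Hub repo id or a local path.
MODEL_ID = "path/or/repo-id-of-this-model"

# Generation settings listed above. eos_token_id is added inside
# complete_prompt, once the tokenizer has been loaded.
GENERATION_KWARGS = {
    "max_new_tokens": 40,
    "do_sample": True,
    "temperature": 2.0,
    "num_beams": 10,
    "repetition_penalty": 1.2,
    "top_k": 40,
    "top_p": 0.75,
}


def complete_prompt(prompt: str, model_id: str = MODEL_ID) -> str:
    """Return `prompt` followed by the model's completion."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        eos_token_id=tokenizer.eos_token_id,
        **GENERATION_KWARGS,
    )
    # The decoded sequence contains the original prompt plus the generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete_prompt("A close shot of a bird in a jungle,")` should return the prompt extended with style tags similar to the examples below. Because sampling is enabled, the output differs between runs.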

(Bold text is generated by the model)

"A close shot of a bird in a jungle, with two legs, with long hair on a tall, long brown body, long white skin, sharp teeth, high bones, digital painting, artstation, concept art, illustration by wlop,"

"Camera shot of a strange young girl wearing a cloak, wearing a mask in clothes, with long curly hair, long hair, black eyes, dark skin, white teeth, long brown eyes eyes, big eyes, sharp"

"An illustration of a house, stormy weather, sun, moonlight, night, concept art, 4 k, wlop, by wlop, by jose stanley, ilya kuvshinov, sprig"

"A field of flowers, camera shot, 70mm lens, fantasy, intricate, highly detailed, artstation, concept art, sharp focus, illustration, illustration, artgerm jake daggaws, artgerm and jaggodieie brad"

Training config

  • Rank 16 LoRA
  • Trained on Gustavosta/Stable-Diffusion-Prompts for 10 epochs
  • Batch size of 64
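
For reference, a rank 16 LoRA setup along these lines could be expressed with PEFT as sketched below. Only the rank, epoch count, and batch size come from this card. The `lora_alpha`, `lora_dropout`, and `target_modules` values are assumptions chosen for illustration (the module names are the attention projections of GPT-Neo style models), and the step estimate assumes no gradient accumulation and roughly 70k training examples.

```python
import math

# Values stated in the training config above.
LORA_RANK = 16
EPOCHS = 10
BATCH_SIZE = 64

# Assumptions (not stated in this card): alpha, dropout, and target modules.
LORA_KWARGS = {
    "r": LORA_RANK,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj"],
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Build a peft.LoraConfig from the settings above."""
    # Imported lazily so the settings can be inspected without peft installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


def total_optimizer_steps(num_examples: int,
                          batch_size: int = BATCH_SIZE,
                          epochs: int = EPOCHS) -> int:
    """Estimate optimizer steps, assuming one step per batch (no accumulation)."""
    return math.ceil(num_examples / batch_size) * epochs
```

With about 70,000 prompts, `total_optimizer_steps(70_000)` gives 10,940 steps over the 10 epochs.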

Training procedure

The following bitsandbytes quantization config was used during training:

  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
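
The values listed above map onto the `BitsAndBytesConfig` class in transformers. The following sketch reconstructs an equivalent config; `build_bnb_config` is a hypothetical helper name, and the original training script may have been structured differently.

```python
# Quantization settings as listed above (compute dtype is added in build_bnb_config).
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
}


def build_bnb_config():
    """Build a transformers.BitsAndBytesConfig matching the listed values."""
    # Imported lazily so the settings can be inspected without torch installed.
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.float32, **BNB_KWARGS)
```

The resulting object would typically be passed as `quantization_config=build_bnb_config()` to `AutoModelForCausalLM.from_pretrained` when loading the base model for training. This requires the bitsandbytes package and a supported GPU.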

Framework versions

  • PEFT 0.5.0.dev0