---
license: apache-2.0
tags:
- text-to-image
- ultra-realistic
- stable-diffusion
- mixture-of-experts
- segmoe
pinned: true
library_name: diffusers
---
# SegMoE-4x2-v0: Segmind Mixture of Diffusion Experts
![image](./image.png)
An untrained Segmind Mixture of Diffusion Experts model, generated using [segmoe](https://github.com/segmind/segmoe).
## Usage
This model can be used via the [segmoe](https://github.com/segmind/segmoe) library.
Install segmoe first by running:
```bash
pip install segmoe
```
```python
from segmoe import SegMoEPipeline

# Load the merged mixture-of-experts pipeline onto the GPU.
pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"

# Generate a single 1024x1024 image and take the first result.
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]

img.save("image.png")
```
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62f8ca074588fe31f4361dae/HgF6DLC-_3igZT6kFIq4J.png)
### Config
The config used to create this model is:
```yaml
base_model: SG161222/RealVisXL_V3.0
num_experts: 4
moe_layers: all
num_experts_per_tok: 2
experts:
  - source_model: frankjoshua/juggernautXL_v8Rundiffusion
    positive_prompt: "aesthetic, cinematic, hands, portrait, photo, illustration, 8K, hyperdetailed, origami, man, woman, supercar"
    negative_prompt: "(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)"
  - source_model: SG161222/RealVisXL_V3.0
    positive_prompt: "cinematic, portrait, photograph, instagram, fashion, movie, macro shot, 8K, RAW, hyperrealistic, ultra realistic,"
    negative_prompt: "(octane render, render, drawing, anime, bad photo, bad photography:1.3), (worst quality, low quality, blurry:1.2), (bad teeth, deformed teeth, deformed lips), (bad anatomy, bad proportions:1.1), (deformed iris, deformed pupils), (deformed eyes, bad eyes), (deformed face, ugly face, bad face), (deformed hands, bad hands, fused fingers), morbid, mutilated, mutation, disfigured"
  - source_model: albertushka/albertushka_DynaVisionXL
    positive_prompt: "minimalist, illustration, award winning art, painting, impressionist, comic, colors, sketch, pencil drawing,"
    negative_prompt: "Compression artifacts, bad art, worst quality, low quality, plastic, fake, bad limbs, conjoined, featureless, bad features, incorrect objects, watermark, ((signature):1.25), logo"
  - source_model: frankjoshua/albedobaseXL_v13
    positive_prompt: "photograph f/1.4, ISO 200, 1/160s, 8K, RAW, unedited, symmetrical balance, in-frame, 8K"
    negative_prompt: "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, blurry"
```
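To give a feel for what `num_experts: 4` and `num_experts_per_tok: 2` mean, here is a minimal sketch of generic top-k mixture-of-experts gating in plain Python. It is not taken from the segmoe source: the function names, the toy experts, and the router scores are all hypothetical, and renormalizing the softmax over only the selected experts is an assumption based on common sparse MoE practice.

```python
import math


def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest router scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def mix_experts(x, experts, router_logits, k):
    """Blend the outputs of the k selected experts for a single token."""
    weights = top_k_gate(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy stand-ins for the 4 experts (hypothetical; real experts are network layers).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

weights = top_k_gate(router_logits, k=2)  # mirrors num_experts_per_tok: 2
output = mix_experts(1.0, experts, router_logits, k=2)
```

In this toy run only two of the four experts contribute to each token's output, which is the idea behind the two config values.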
### Other Variants
We release three merges on Hugging Face. Besides this model, they are:
- [SegMoE 2x1](https://huggingface.co/segmind/SegMoE-2x1-v0) has two expert models.
- [SegMoE SD 4x2](https://huggingface.co/segmind/SegMoE-sd-4x2-v0) has four Stable Diffusion 1.5 expert models.
## Comparison
Prompt understanding appears to improve, as shown in the images below. From left to right: SegMoE-2x1-v0, SegMoE-4x2-v0, and the base model ([RealVisXL_V3.0](https://huggingface.co/SG161222/RealVisXL_V3.0)).
![image](https://github.com/segmind/segmoe/assets/95569637/bcdc1b11-bbf5-4947-b6bb-9f745ff0c040)
<div align="center">three green glass bottles</div>
<br>
![image](https://github.com/segmind/segmoe/assets/95569637/d50e2af0-66d2-4112-aa88-bd4df88cbd5e)
<div align="center">panda bear with aviator glasses on its head</div>
<br>
![image](https://github.com/segmind/segmoe/assets/95569637/aba2954a-80c2-428a-bf76-0a70a5e03e9b)
<div align="center">the statue of Liberty next to the Washington Monument</div>
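To try the three comparison prompts yourself, a small helper along these lines may be convenient. It is only a sketch: `generate_images` and `COMPARISON_PROMPTS` are hypothetical names, and the settings are copied from the Usage example, which is an assumption because the settings behind the comparison images are not stated here.

```python
def generate_images(pipeline, prompts, negative_prompt="nsfw, bad quality, worse quality"):
    """Run each prompt through the pipeline with the settings from the Usage example."""
    images = {}
    for prompt in prompts:
        # Same keyword arguments as the Usage example; only the prompt changes.
        images[prompt] = pipeline(
            prompt=prompt,
            negative_prompt=negative_prompt,
            height=1024,
            width=1024,
            num_inference_steps=25,
            guidance_scale=7.5,
        ).images[0]
    return images


# The captions of the three comparison images above.
COMPARISON_PROMPTS = [
    "three green glass bottles",
    "panda bear with aviator glasses on its head",
    "the statue of Liberty next to the Washington Monument",
]
```

With segmoe installed and a GPU available, calling `generate_images(SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda"), COMPARISON_PROMPTS)` should return one image per prompt.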
### Model Description
- **Developed by:** [Segmind](https://www.segmind.com/)
- **Developers:** [Yatharth Gupta](https://huggingface.co/Warlord-K) and [Vishnu Jaddipal](https://huggingface.co/Icar).
- **Model type:** Diffusion-based text-to-image generative mixture of experts model
- **License:** Apache 2.0
### Out-of-Scope Use
The SegMoE-4x2-v0 Model is not suitable for creating factual or accurate representations of people, events, or real-world information. It is not intended for tasks requiring high precision and accuracy.
## Advantages
+ Benefits from the knowledge of several fine-tuned experts
+ Training-free
+ Better adaptability to data
+ The model can be upgraded by using a better fine-tuned model as one of the experts.
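As an illustration of the last point, upgrading an expert amounts to editing one entry of the `experts` list in the config and regenerating the merge with segmoe. The sketch below works on a plain Python dict that mirrors the config; `swap_expert` is a hypothetical helper, `your-org/better-sdxl-finetune` is a placeholder model id, and the prompts are shortened stand-ins rather than the real ones.

```python
import copy


def swap_expert(config, index, source_model, positive_prompt, negative_prompt):
    """Return a copy of `config` with the expert at `index` replaced."""
    experts = config["experts"]
    if len(experts) != config["num_experts"]:
        raise ValueError("num_experts does not match the length of experts")
    if not 0 <= index < len(experts):
        raise IndexError("expert index out of range")
    updated = copy.deepcopy(config)  # leave the original config untouched
    updated["experts"][index] = {
        "source_model": source_model,
        "positive_prompt": positive_prompt,
        "negative_prompt": negative_prompt,
    }
    return updated


# Dict form of the config above, with prompts shortened for readability.
config = {
    "base_model": "SG161222/RealVisXL_V3.0",
    "num_experts": 4,
    "moe_layers": "all",
    "num_experts_per_tok": 2,
    "experts": [
        {"source_model": "frankjoshua/juggernautXL_v8Rundiffusion",
         "positive_prompt": "aesthetic, cinematic", "negative_prompt": "worst quality"},
        {"source_model": "SG161222/RealVisXL_V3.0",
         "positive_prompt": "cinematic, portrait", "negative_prompt": "octane render"},
        {"source_model": "albertushka/albertushka_DynaVisionXL",
         "positive_prompt": "minimalist, illustration", "negative_prompt": "bad art"},
        {"source_model": "frankjoshua/albedobaseXL_v13",
         "positive_prompt": "photograph f/1.4", "negative_prompt": "nsfw, lowres"},
    ],
}

# Replace the third expert with a placeholder model id (hypothetical).
new_config = swap_expert(
    config, 2, "your-org/better-sdxl-finetune",
    "illustration, painting, sketch", "bad art, low quality",
)
```

The edited config would then need to be written back to YAML and passed to segmoe to build a new merge; that step is not shown here.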
## Limitations
+ Though the model improves image fidelity and prompt adherence, it is not drastically better than any single expert without training, and it relies on the knowledge of the experts.
+ The framework is not yet optimized for speed.
+ The framework is not yet optimized for memory usage.
## Citation
```bibtex
@misc{segmoe,
  author = {Yatharth Gupta and Vishnu V Jaddipal and Harish Prabhala},
  title = {SegMoE},
  year = {2024},
  publisher = {HuggingFace},
  journal = {HuggingFace Models},
  howpublished = {\url{https://huggingface.co/segmind/SegMoE-4x2-v0}}
}
```