Surely this is misconfigured? I haven't seen image generation this bad in many years.

#5
by Nafnlaus - opened

This is a bug.... right?

Post the images?

Literally everything it generates? Looks like pre-SD 1.5. Like, the quality of those old Wombo AI images.

Example. Prompt was something like "Photo of a soccer team playing with a duck instead of a ball".

image.png

yea they are total shit

@Nafnlaus
Janus-Pro generates images at a resolution of 384x384 and utilizes a 16x downsampling discrete encoder, which may cause it to lag behind the latest text-to-image models in terms of image clarity and detail rendering, especially for distant faces. You could try close-up scenarios, such as "The face of a sexy man."

image.png
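
For a sense of scale, the figures quoted in the reply above (384x384 output, 16x downsampling) can be turned into a token count with a little arithmetic. This is only a sketch of that calculation, not Janus-Pro code: `token_grid` is a hypothetical helper, and it assumes the encoder maps each 16x16 pixel patch to exactly one discrete token.

```python
def token_grid(image_size: int, downsample: int) -> tuple[int, int]:
    """Return (tokens per side, total tokens) for a square image.

    Assumes (hypothetically) one discrete token per
    downsample x downsample pixel patch.
    """
    if image_size % downsample != 0:
        raise ValueError("image_size must be a multiple of downsample")
    side = image_size // downsample
    return side, side * side


# Figures taken from the reply above: 384x384 pixels, 16x downsampling.
side, total = token_grid(384, 16)
pixels_per_token = 16 * 16

print(f"{side}x{side} grid, {total} tokens, {pixels_per_token} pixels per token")
```

Under that assumption each token has to stand in for 256 pixels, which fits the explanation that small, distant details such as faces come out smeared.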

Seems like a "creative" way to hide models that are trying to be better than they are. Civilized society requires GAAP accounting and legitimate results. All foam no beer.

I tried like 4 different prompts, from generic to detailed, with high and low temp/cfg, and I'm quite surprised too. I don't really see the image generation being on par with current models. All of my images looked like an impressionist painter trying to mimic an image lol. They're also nowhere near the example images DeepSeek published. Even with the clarification about the 16x downsampling discrete encoder, am I missing something here? It's not bad, I guess; I just couldn't get any real resolution out of the images. It did great on OCR/image recognition though, and overall I'm impressed.

image (7).jpg
image (5).jpg
image (1).jpg

these generations look so bad lol

Speak No Evil. Do No Evil.
