What are the t5xxl models required for, and what are the differences apart from the sizes? Thanks
#42
by
tetsujin007
- opened
What is the use of the t5xxl models?
I think it is for text.
https://www.youtube.com/watch?v=xMQT9o97shA
(see the 2:44 mark)
@tetsujin007
You see, diffusion models need something called text encoders to actually understand the text of the prompt. Using more text encoders, and larger ones, seems to improve performance.
This repository provides 2 variants of the SD3 model: one with 2 text encoders, and one with 3 (including t5xxl).
The one with 3 text encoders (including t5xxl) is slightly better at prompt following, putting text in images, and overall quality. The difference is small, but it is there. For best performance, use the one with t5xxl. If you don't have enough VRAM, use the smaller one.
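To make the VRAM trade-off a bit more concrete, here is a rough back-of-the-envelope sketch of how much memory the t5xxl weights alone might take at different precisions. The parameter count (about 4.7 billion for the encoder) is an approximate figure I'm assuming, and the helper names `weight_gib` and `fits` are made up for illustration. It ignores activations and the other models loaded alongside, so treat the numbers as a lower bound, not a measurement.

```python
# Rough estimate of the memory needed just to hold text-encoder weights.
# ASSUMPTION: the t5xxl encoder has roughly 4.7 billion parameters.
# Helper names below are hypothetical, for illustration only.

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}
PARAMS_T5XXL_ENCODER = 4.7e9  # approximate, assumed


def weight_gib(num_params: float, precision: str) -> float:
    """Approximate size of the weights in GiB at the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


def fits(budget_gib: float, precision: str) -> bool:
    """True if the t5xxl weights alone fit in the given memory budget."""
    return weight_gib(PARAMS_T5XXL_ENCODER, precision) <= budget_gib


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weight_gib(PARAMS_T5XXL_ENCODER, precision)
        print(f"t5xxl {precision}: about {size:.1f} GiB of weights")
```

If the budget left over after loading the diffusion model and the two smaller encoders is below these figures, that's a hint to go with the variant without t5xxl.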