Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering Paper • 2403.09622 • Published Mar 14 • 16
TableGPT2: A Large Multimodal Model with Tabular Data Integration Paper • 2411.02059 • Published Nov 4 • 5
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published 7 days ago • 37