Zhe Chen

czczup

https://scholar.google.com/citations?hl=en&user=j1rq_lYAAAAJ

AI & ML interests

multimodal large language model, vision foundation model

Recent Activity

liked a model about 16 hours ago

OpenGVLab/InternVL2_5-26B-AWQ

liked a model about 16 hours ago

OpenGVLab/InternVL2_5-8B-AWQ

liked a model about 16 hours ago

OpenGVLab/InternVL2_5-4B-AWQ

czczup's activity

liked 4 models about 16 hours ago

upvoted a paper 1 day ago

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Paper • 2412.09604 • Published 5 days ago • 35

updated a collection 1 day ago

InternVL 2.5

Collection

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling • 18 items • Updated about 16 hours ago • 67

liked a model 2 days ago

OpenGVLab/InternVL2_5-78B-AWQ

Image-Text-to-Text • Updated 2 days ago • 81 • 7

New activity in OpenGVLab/InternVL2_5-26B 2 days ago

Request AWQ quantized version

#1 opened 8 days ago by

yourModel

What size is the model used in the demo page?

#2 opened 5 days ago by

WOO-SEOK

liked a model 2 days ago

OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text • Updated 1 day ago • 53 • 6

New activity in OpenGVLab/InternVL-Chat-V1-5 6 days ago

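Does the deployed vl1.5 multimodal large model currently accept only image or text input? Does it also accept files such as PDF, PPT, or Excel? How can this be implemented?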

#26 opened 14 days ago by

wqw0806

New activity in OpenGVLab/InternVL2-40B 6 days ago

CUDA error: device-side assert triggered

#10 opened 2 months ago by

67L1

New activity in OpenGVLab/InternVL2-8B 6 days ago

How should I extract attention maps? Can you provide a specific example?

#21 opened 25 days ago by

whopeople

New activity in OpenGVLab/InternVL2_5-1B 7 days ago

it run on colab t4

#1 opened 9 days ago by

rakmik

New activity in OpenGVLab/InternVL2-8B 7 days ago

some question about dataset format

#19 opened about 2 months ago by

MasoShizuka

reacted to merve's post with 👍🚀❤️ 8 days ago

Post

5415

This week in open-source AI was insane 🤠 A small recap🕺🏻 merve/dec-6-releases-67545caebe9fc4776faac0a3

Multimodal 🖼️
> Google shipped PaliGemma 2, a new iteration of PaliGemma with more sizes: 3B, 10B and 28B, with pre-trained and captioning variants 👏
> OpenGVLab released InternVL2.5, seven new vision LMs in different sizes, with a SOTA checkpoint under the MIT license ✨
> Qwen team at Alibaba released the base models of Qwen2VL, with 2B, 7B and 72B checkpoints

LLMs 💬
> Meta released a new iteration of Llama 70B, Llama3.3-70B, trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages with Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, multilingual version of MMLU with 42 languages with Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset to train reasoning models
> Dataset: FineWeb2 just landed with multilinguality update! 🔥 nearly 8TB pretraining data in many languages!

Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux

Audio 🔊
> Indic-Parler-TTS is a new text2speech model made by the community

upvoted a collection 9 days ago

InternLM2.5

Collection

14 items • Updated Sep 14 • 70

authored a paper 9 days ago

DDP: Diffusion Model for Dense Visual Prediction

Paper • 2303.17559 • Published Mar 30, 2023