Kanzhi Cheng

cckevinn

AI & ML interests

None yet

Recent Activity


Organizations

cckevinn's activity

Reacted to Symbol-LLM's post with 🚀🔥🔥 4 days ago
🥳 Thrilled to introduce our recent efforts on bootstrapping VLMs for multi-modal chain-of-thought reasoning!

📕 Title: Vision-Language Models Can Self-Improve Reasoning via Reflection

🔗 Link: Vision-Language Models Can Self-Improve Reasoning via Reflection (2411.00855)

😇 Takeaways:

- We found that VLMs can self-improve reasoning performance through a reflection mechanism, and importantly, this approach can scale through test-time computing.

- Evaluations on comprehensive and diverse vision-language reasoning tasks are included!
New activity in cckevinn/SeeClick 8 months ago
New activity in adept/fuyu-8b about 1 year ago

Released capabilities

6
#42 opened about 1 year ago by ludeksvoboda
New activity in OFA-Sys/ofa-base-caption-fairseq-version almost 2 years ago

Fairseq -> Transformers conversion

5
#1 opened about 2 years ago by mys