Kanzhi Cheng
cckevinn
AI & ML interests
None yet
Recent Activity
Reacted to
Symbol-LLM's
post
with 🚀
4 days ago
🥳 Thrilled to introduce our recent efforts on bootstrapping VLMs for multi-modal chain-of-thought reasoning!
📕 Title: Vision-Language Models Can Self-Improve Reasoning via Reflection
🔗 Link: https://huggingface.co/papers/2411.00855
😇Takeaways:
- We found that VLMs can self-improve reasoning performance through a reflection mechanism and, importantly, that this approach scales with test-time compute.
- Evaluations on comprehensive and diverse vision-language reasoning tasks are included!
Reacted to
Symbol-LLM's
post
with 🔥
4 days ago
Organizations
cckevinn's activity
Clarification towards the different models
3
#1 opened 8 months ago
by
benwiesel
Released capabilities
6
#42 opened about 1 year ago
by
ludeksvoboda
Farseq -> Transformers conversion
5
#1 opened about 2 years ago
by
mys