Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22 • 88
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16 • 30
Open LLM Leaderboard 2 • Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 11.9k
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper • 2407.19672 • Published Jul 29 • 55
Post: If you're trying to run MoE Mixtral-8x7b under DeepSpeed w/ HF Transformers, it's likely to hang on the first forward. The solution is here: https://github.com/microsoft/DeepSpeed/pull/4966#issuecomment-1989671378 and you need deepspeed>=0.13.0. Thanks to Masahiro Tanaka for the fix.
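For context, here is a minimal sketch of the kind of setup the post describes: fine-tuning Mixtral-8x7B with HF Transformers and DeepSpeed ZeRO-3, which needs deepspeed>=0.13.0 to avoid hanging on the first forward pass. The model name is the public Hugging Face checkpoint; the toy dataset, training arguments, and the `ds_zero3.json` config path are illustrative assumptions, not the author's exact setup.

```python
# Minimal sketch (assumptions noted above), not the author's exact script.
# Requires: pip install "deepspeed>=0.13.0" transformers datasets accelerate
# Launch:   deepspeed --num_gpus=8 train_mixtral.py
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "mistralai/Mixtral-8x7B-v0.1"

# Create TrainingArguments *before* loading the model so that, with a ZeRO-3
# config, HF Transformers can shard parameters at load time (zero.Init).
args = TrainingArguments(
    output_dir="mixtral-zero3",
    per_device_train_batch_size=1,
    gradient_checkpointing=True,
    bf16=True,
    max_steps=2,
    deepspeed="ds_zero3.json",  # assumed path to a standard HF ZeRO stage-3 config
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Toy dataset just to exercise the training loop (and the first forward pass).
texts = ["DeepSpeed ZeRO-3 shards the 8x7B experts across GPUs."] * 8

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=64,
                    padding="max_length")
    out["labels"] = out["input_ids"].copy()  # causal-LM labels = inputs
    return out

train_ds = Dataset.from_dict({"text": texts}).map(
    tokenize, batched=True, remove_columns=["text"])

model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```

With deepspeed older than 0.13.0 this script would stall at the first forward pass of the MoE layers; upgrading and rerunning under the deepspeed launcher is the fix the post points to.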
Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Article • Published Apr 15 • 168
Audio Dialogues: Dialogues dataset for audio and music understanding Paper • 2404.07616 • Published Apr 11 • 15
SeaLLMs -- Large Language Models for Southeast Asia Paper • 2312.00738 • Published Dec 1, 2023 • 23
Contrastive Decoding Improves Reasoning in Large Language Models Paper • 2309.09117 • Published Sep 17, 2023 • 37
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models Paper • 2309.09958 • Published Sep 18, 2023 • 18