LHPKAI (TNQ)

upvoted a paper 12 months ago

Vision Language Models are Biased

Paper • 2505.23941 • Published May 29, 2025 • 23

upvoted an article about 1 year ago

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • saurabhdash, olivernan, ArashAhmadian, johndang-cohere • Mar 4, 2025 • 78

upvoted 3 articles over 1 year ago

FastRTC: The Real-Time Communication Library for Python

Article • freddyaboulton, abidlabs • Feb 25, 2025 • 172

SmolVLM2: Bringing Video Understanding to Every Device

Article • orrzohar, mfarre, andito, merve, pcuenq, cyrilzakka, Xenova • Feb 20, 2025 • 340

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Article • wolfram • Jan 2, 2025 • 41

upvoted 5 papers over 1 year ago

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 161

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published Nov 25, 2024 • 45

upvoted an article over 1 year ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Article • manu • Jul 5, 2024 • 319

upvoted a paper over 1 year ago

Contextual Document Embeddings

Paper • 2410.02525 • Published Oct 3, 2024 • 24

upvoted a paper almost 2 years ago

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 72

upvoted 3 articles about 2 years ago

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

Article • Quent-01, nilabhra, rcojocaru, Mughaira, gcampesan, SanathNarayan, griffintaur, clefourrier, SaylorTwift • May 24, 2024 • 28

Hugging Face x LangChain: A new partner package

Article • Jofthomas, kkondratenko, efriis • May 14, 2024 • 161

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • merve, andsteing, pcuenq • May 14, 2024 • 287

upvoted a paper about 2 years ago

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7, 2024 • 26

upvoted an article about 2 years ago

RAG using huggingface tools

Article • not-lain • Jul 7, 2024 • 91

upvoted a collection about 2 years ago

Meta Llama 3

Collection • This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 967

upvoted an article about 2 years ago

Mergoo: Efficiently Build Your Own MoE LLM

Article • alirezamsh • Jun 3, 2024 • 49

1.58-bit FLUX

Adding Conditional Control to Text-to-Image Diffusion Models

Training Large Language Models to Reason in a Continuous Latent Space
