Naman Anand

naman5a

AI & ML interests

RAG, LLMs

Recent Activity

upvoted a collection about 2 months ago

InternVL3.5

commented on a paper 2 months ago

Reinforcement Pre-Training

upvoted a paper 2 months ago

Reinforcement Pre-Training

Organizations

upvoted a collection about 2 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated 24 days ago • 98

upvoted a paper 2 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

upvoted 4 articles 4 months ago

Article

How to train a new language model from scratch using Transformers and Tokenizers

Feb 14, 2020

• 51

Article

Introducing HELMET

Apr 16

• 40

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Feb 4

• 179

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024

• 700

upvoted a paper 5 months ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88

upvoted a collection 6 months ago

GLM-4-0414

Collection

GLM-4-0414 series model • 8 items • Updated Jun 30 • 131

upvoted a paper 6 months ago

AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference

Paper • 2504.10326 • Published Apr 14 • 25

upvoted 2 articles 7 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7

• 236

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

and 8 others •

Mar 24

• 20

upvoted a collection 7 months ago

💫StarVector Models

Collection

StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 97

upvoted a paper 7 months ago

Cube: A Roblox View of 3D Intelligence

Paper • 2503.15475 • Published Mar 19 • 30

upvoted 2 articles 7 months ago

Article

From Files to Chunks: Improving Hugging Face Storage Efficiency

Nov 20, 2024

• 66

Article

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

Feb 12

• 77

upvoted 3 articles 8 months ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 942

Article

SigLIP 2: A better multilingual vision language encoder

Feb 21

• 184

Article

SmolVLM2: Bringing Video Understanding to Every Device

Feb 20

• 308

upvoted a paper 8 months ago

MobileSAMv2: Faster Segment Anything to Everything

Paper • 2312.09579 • Published Dec 15, 2023 • 24

upvoted an article 8 months ago

Article

Announcing AI Energy Score Ratings

Feb 11

• 28
