Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published 5 days ago • 41
nGPT: Normalized Transformer with Representation Learning on the Hypersphere Paper • 2410.01131 • Published Oct 1 • 9
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published Jun 12 • 16
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Paper • 2405.01481 • Published May 2 • 25
Every child should have parents: a taxonomy refinement algorithm based on hyperbolic term embeddings Paper • 1906.02002 • Published Jun 5, 2019 • 1
Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition Paper • 2210.03255 • Published Oct 6, 2022 • 1
RULER: What's the Real Context Size of Your Long-Context Language Models? Paper • 2404.06654 • Published Apr 9 • 34
Nemotron 3 8B Collection The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated Oct 1 • 46