Tushar Gupta

tusharg92

tusg

AI & ML interests

NLP, deep learning, machine learning

Organizations

None yet

tusharg92's activity

upvoted 2 papers about 2 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 166

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 144

upvoted a paper 2 months ago

8-bit Optimizers via Block-wise Quantization

Paper • 2110.02861 • Published Oct 6, 2021 • 2

upvoted a collection 2 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated about 5 hours ago • 392

upvoted an article 2 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18 • 204

upvoted an article 3 months ago

Accelerate 1.0.0

Article • Sep 13 • 50

upvoted a paper 3 months ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27 • 37

upvoted a paper 5 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12 • 128

upvoted an article 5 months ago

Welcome Gemma 2 - Google's new open LLM

Article • Jun 27 • 124

upvoted a collection 5 months ago

Jina Reranker v2

Collection

A collection of state-of-the-art multilingual neural rerankers • 1 item • Updated Sep 17 • 7

upvoted a collection 6 months ago

Qwen2

Collection

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated about 5 hours ago • 346