Show-o: One Single Transformer to Unify Multimodal Understanding and Generation • Paper • arXiv:2408.12528 • Published Aug 22, 2024 • 50 upvotes
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs • Paper • arXiv:2407.03963 • Published Jul 4, 2024 • 15 upvotes
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention • Paper • arXiv:2407.02490 • Published Jul 2, 2024 • 23 upvotes
SSMs • Collection • A collection of Mamba-2-based research models with 8B parameters, trained on 3.5T tokens, for comparison with Transformers. • 5 items • Updated Oct 1 • 26 upvotes
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models • Paper • arXiv:2404.05904 • Published Apr 8, 2024 • 8 upvotes
SpaceByte: Towards Deleting Tokenization from Large Language Modeling • Paper • arXiv:2404.14408 • Published Apr 22, 2024 • 6 upvotes
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions • Paper • arXiv:2404.13208 • Published Apr 19, 2024 • 38 upvotes
Meta Llama 3 • Collection • This collection hosts the Transformers-format and original repos of the Meta Llama 3 and Llama Guard 2 releases. • 5 items • Updated Sep 25 • 683 upvotes
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models • Paper • arXiv:2404.07839 • Published Apr 11, 2024 • 42 upvotes
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models • Paper • arXiv:2402.19427 • Published Feb 29, 2024 • 52 upvotes
Gemma release • Collection • Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325 upvotes
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • Paper • arXiv:2404.02258 • Published Apr 2, 2024 • 104 upvotes