Small LMs

Nymbo's Collections

updated 4 days ago

Paused

🐋

Orca 2 13B
Build error

💬

MonadGPT
Runtime error

😻

Mistral-7B
Paused

🌪️

Voice Chat With Mistral 7B
Paused

⚡

Qwen VL
Runtime error

🏃

ChatGLM 6B
Build error

🐶

Koboldcpp Tiefighter
Paused

📚

Tinyllama Chat
Paused

⚡

Stable LM 2 Zephyr 1.6b
Runtime error

🚀

MoE LLaVA
Paused

🐬

Chat with DeepSeek Coder 7B
Runtime error

🦙

Llama 2 13b Chat
Runtime error

🔥

LLaVA
Runtime error

📚

Video LLaVA
Paused

🏢

Llava
Paused

👁

LLaVA 1.6
Paused

🐠

Gradio Notebook Local Model
Sleeping

📚

Blind Chat
Running

🌊🐋

Web-LLM: Mistral 7B OpenOrca

7B text-generation model running directly from the browser
Runtime error

🍑

[NSFW] C0ffee's Erotic Story Generator 2
Running

📉

Whisper Chess
Runtime error

🦙

LLaMA Board

Fine-tuning large language model with Gradio UI
Running

📕

Ratchet + Phi Locally

Run Phi-3 in Browser
Running

🗣️🏎️

Ratchet + Whisper Locally

Run Whisper in Browser
Running

4

🔮🔮

Noosphere Webui on Cpu
Running

15

👌👌

epicPhotoGASM Webui on Cpu
Running

🐠

Experimental Phi3 Webgpu
NeverSleep/Llama-3-Lumimaid-8B-v0.1

Text Generation • Updated Jun 10, 2024 • 1.5k • 79
gradientai/Llama-3-8B-Instruct-Gradient-4194k

Text Generation • Updated Oct 28, 2024 • 183 • 71
tiiuae/falcon-11B

Text Generation • Updated Dec 17, 2024 • 32k • 212
Running on Zero

14

🌘w🌖

Text-Streaming

text streaming space using Gemma-7B
Running

🌐

GemmaOnDevice
Runtime error

4.35k

🔥

OpenGPT 4o

GPT 4o like bot.
Paused

🤲

PaliGemma Demo
Running

🚀

Phi-3 WebGPU

A private and powerful AI that runs locally in your browser
Running

🏃

Mistral-7B-v0.3 Fast Chat

Fast chatting with Mistral v0.3
Running

🌐

YOLOv10 Web
Running

🏆

WebGPU Nomic Embed
Running

🚀

WebGPU Chat Qwen2
Runtime error

⚡

GLiNER HandyLab
Paused

💻

Kosmos 2
Running

6

💫

Text Gen Playground

Chat with any model on the Hub
Running

🚀

Gemini Nano (Chrome Built-in)

Run Gemini Nano locally in your browser with Transformers.js
Running

1

🌋

LLaVA WebGPU

A private and powerful multimodal AI chatbot that runs local
Running

🕯️🔡

Candle T5 Generation Wasm
Running on Zero

60

🌍

MInference
Running

🚀

SmolLM 360M Instruct WebGPU

A blazingly fast and powerful AI chatbot that runs locally.
Running

5

🚀

SmolLM 135M Instruct WebGPU

A blazingly fast and powerful AI chatbot that runs locally.
Running on Zero

78

🔥

Chameleon 30b
Sleeping

5

✨

Nymbot Lite

Vision Chatbot with ImgGen & Web Search - Runs on CPU
Running on Zero

3

🦙

Llama-3.1-8B-Instruct

The best 8B model with 128K context
Sleeping

🌖

ollama-Chat

Chat with Ollama
Sleeping

4

🤔📊

Llama CSV Agent

Need to analyze data? Let a Llama-3.1 agent do it for you!
Runtime error

1

😻

MagicPrompt Stable Diffusion
Running

🏃

WebLLM JSON Playground
Running

💬

Webllm Simple Chat
Running on Zero

79

😻

Gemma 2 2B IT

Chatbot
Runtime error

1

✨✨✨

Cohere Command R+ inference

c4ai-command-r-plus (hub inference, not API)
Sleeping

🐁

Phi-3-Mini-4k-Instruct

Phi-3-Mini on hub inference
Sleeping

1

🐼

Yi-1.5-34B-Chat

Yi-1.5-34B on hub inference
Running

1

✨

Mistral-7B-Instruct-v0.3

SOTA Small Model by Mistral AI
Running on Zero

65

🐍

Falcon Mamba Playground
Paused

💬

MiniCPM-V-2 6
Running

🤏

Instant SmolLM

Run SmolLM-360M-Instruct in realtime with MLC WebLLM
Runtime error

159

💬

LongWriter

LLM for long context
Paused

15

🐭

Phi-3.5-Mini-Instruct

New SOTA small model from Microsoft, and multilingual!
Running

4

🤗

Inference Playground

One-stop-shop for frequently used models
Running

228

💻

HF's Missing Inference Widget
Sleeping

💻🧲

1-Shot LLM Playground

Single-shot inference for rapid model testing
Running

1

⚡

Phi-3.5-Mini WebLLM
Running on Zero

213

🔥

Phi 3.5 Vision
Paused

🤩

Qwen2-VL-2B

Multilingual, Multimodal, Mighty 2B
Sleeping

🚀

Kotaemon
Sleeping

🏃

Dataset Rewriter
Paused

6

🐢

Reflection 70B llama.cpp

Reflection-70B by Matt Shumer
Paused

3

⚡

Joy Caption Alpha One
Paused

🦙🦙🦙

Llama-3.2-3B-Instruct

New SOTA small model from Meta
Paused

4

🦙

Llama-3.2-1B-Instruct

the new tiny king
Paused

5

📊

HTML To Markdown

Convert HTML to Markdown with readerlm-1.5B
Running on Zero

382

🚀

Llama-Vision-11B
Running

⚡

Qwen-2.5 WebLLM
Running

2

🦙

Llama-3.2 WebLLM
Running on Zero

106

👁

Molmo 7B D 0924
Paused

🌖

Emu3
Running

🦙

Llama 3.2 WebGPU

A powerful AI chatbot that runs locally in your browser
Running

3

🏎️

WebLLM Playground
Sleeping

9

🐠🤖👌🏻

Nemotron-Mini

NemoAligner Synthetic SFT with function calling
Paused

🚀

Zamba2 7B
Sleeping

👌🔍

MiniSearch

Minimalist web-searching app with browser-based AI assistant
Running

🌍

Janus Space Clone Me First
Running

🐍

Qwen 2.5 Code Interpreter
Running on T4

235

🌍

Aya Expanse
Running

🦙

Wllama

Run GGUF directly on your browser!
Running

14

🤏

SmolLM2-1.7B-Instruct Serverless

New SOTA smol king by Hugging Face
Running

💻

BitNet.cpp
Running on Zero

188

🏃

JanusFlow 1.3B

Hugging Face Space for JanusFlow-1.3B
Paused

🏃

JanusFlow 1.3B

Text Gen | Vision | Image Gen | One 1.3b model
Running

2

📉

Ai Scraper
Paused

📊

SmolVLM
Running

🏛️

Janus 1.3B WebGPU

In-browser unified multimodal understanding and generation.
Sleeping

👁️

Omnivlm Dpo Demo
Running

🧑‍💻

Github Issue Generator
Running on Zero

204

💻

ShowUI
Running

🗣️

Text-to-Speech WebGPU

WebGPU text-to-Speech powered by OuteTTS and Transformers.js
Running on Zero

7

🐍

Falcon3 Mamba 7b Instruct Playground
Running on Zero

32

🦅

Falcon3 Demo

F3-DEMO
Paused

💬

SmallThinker Demo
Running

🧠

Llama 3.2 Reasoning WebGPU

Small and powerful reasoning LLM that runs in your browser
Running

🧠

DeepSeek-R1 WebGPU

Next-generation reasoning model that runs locally in-browser
Running

💻

SmolVLM 500M Instruct WebGPU
Running

🐨

SmolVLM 256M Instruct WebGPU
Running on Zero

2

📊

SmolVLM
Paused

⚡

Markdown Studio

Convert HTML to Markdown/JSON, Markdown Live Preview
Running on Zero

1.13k

🌍

Chat With Janus-Pro-7B

A unified multimodal understanding and generation model.
