Hugging Face

Models (10)
Active filters: RLVR

SultanR/SmolTulu-1.7b-Reinforced

Text Generation • 2B • Updated Dec 17, 2024 • 20 • 5

SultanR/SmolTulu-1.7b-RM

Text Classification • 2B • Updated Dec 17, 2024 • 3 • 2

mradermacher/SmolTulu-1.7b-Reinforced-GGUF

2B • Updated Dec 18, 2024 • 63

mradermacher/SmolTulu-1.7b-RM-GGUF

2B • Updated Dec 17, 2024 • 106

mradermacher/SmolTulu-1.7b-RM-i1-GGUF

2B • Updated Dec 17, 2024 • 99

Ach0/GCPO-R1-1.5B

Text Generation • 2B • Updated Oct 11, 2025 • 4

mradermacher/GCPO-R1-1.5B-GGUF

2B • Updated Oct 11, 2025 • 15

mradermacher/GCPO-R1-1.5B-i1-GGUF

2B • Updated Dec 6, 2025 • 28

Nagi-ovo/DeepSeek-V3.1-Math-RL-G16-LoRA

Updated Jan 31

Supreeth/searchlm-qwen2.5-3b-rlhf

Text Generation • 3B • Updated Jan 31