Models: 34

Active filters: nm-vllm

RedHatAI/TinyLlama-1.1B-Chat-v1.0-pruned2.4

Text Generation • Updated Mar 5, 2024 • 10 • 1

RedHatAI/MiniChat-2-3B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 7

RedHatAI/OpenHermes-2.5-Mistral-7B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 408

RedHatAI/OpenHermes-2.5-Mistral-7B-pruned50

Text Generation • Updated Mar 5, 2024 • 411 • 1

RedHatAI/Nous-Hermes-2-SOLAR-10.7B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 4

RedHatAI/Nous-Hermes-2-Yi-34B-pruned2.4

Text Generation • Updated Mar 5, 2024 • 11

RedHatAI/Nous-Hermes-2-Yi-34B-pruned50

Text Generation • Updated Mar 5, 2024 • 15

RedHatAI/zephyr-7b-beta-marlin

Text Generation • 1B • Updated Mar 6, 2024 • 71

RedHatAI/llama2.c-stories110M-pruned2.4

Text Generation • Updated Mar 5, 2024 • 4

RedHatAI/llama2.c-stories110M-pruned50

Text Generation • Updated Mar 5, 2024 • 1.71k

RedHatAI/phi-2-pruned50

Text Generation • 3B • Updated Mar 5, 2024 • 5

RedHatAI/TinyLlama-1.1B-Chat-v1.0-marlin

Text Generation • 0.3B • Updated Mar 6, 2024 • 5.49k • 1

RedHatAI/OpenHermes-2.5-Mistral-7B-marlin

Text Generation • 1B • Updated Mar 6, 2024 • 566 • 2

RedHatAI/Nous-Hermes-2-Yi-34B-marlin

Text Generation • 5B • Updated Mar 6, 2024 • 3 • 5

softmax/Llama-2-70b-chat-hf-marlin

Text Generation • 10B • Updated Mar 17, 2024 • 6

softmax/falcon-180B-chat-marlin

Text Generation • 26B • Updated Mar 21, 2024 • 6

dtransposed/llama2.c-stories110M-pruned50-compressed-tensors

Text Generation • 0.1B • Updated Apr 23, 2024 • 6

nm-testing/llama2.c-stories110M-pruned50-compressed-tensors

Text Generation • 0.1B • Updated Apr 25, 2024 • 4

mradermacher/Nous-Hermes-2-SOLAR-10.7B-pruned2.4-GGUF

11B • Updated Apr 10 • 26

mradermacher/Nous-Hermes-2-SOLAR-10.7B-pruned2.4-i1-GGUF

11B • Updated Apr 10 • 128

tensorblock/llama2.c-stories110M-pruned50-GGUF

0.1B • Updated Jul 9 • 95

mradermacher/phi-2-pruned50-GGUF

3B • Updated Aug 1 • 78

mradermacher/llama2.c-stories110M-pruned50-GGUF

0.1B • Updated Apr 10 • 90

mradermacher/OpenHermes-2.5-Mistral-7B-pruned50-GGUF

7B • Updated Apr 10 • 24 • 1

mradermacher/MiniChat-2-3B-pruned2.4-GGUF

3B • Updated Apr 10 • 7

mradermacher/OpenHermes-2.5-Mistral-7B-pruned50-i1-GGUF

7B • Updated Apr 10 • 165

mradermacher/llama2.c-stories110M-pruned50-i1-GGUF

0.1B • Updated Apr 10 • 93

mradermacher/OpenHermes-2.5-Mistral-7B-pruned2.4-GGUF

7B • Updated Apr 10 • 26

mradermacher/OpenHermes-2.5-Mistral-7B-pruned2.4-i1-GGUF

7B • Updated Apr 10 • 357

tensorblock/OpenHermes-2.5-Mistral-7B-pruned2.4-GGUF

7B • Updated Jul 9 • 25