ArunKr/LLM-Model-Serving

LLM Model Serving

Model Performance Optimization

Model Quantization
Model Pruning
Machine Learning Compilation (MLC-LLM)
Neural Magic (DeepSparse)

Model Serving on Bare-metal Server

Docker Container

Model Serving on Kubernetes Cluster

Inference Servers

TorchServe
TGI
Triton
Seldon Core
vLLM (CPU/GPU)
llama.cpp
Ollama
