Serving with TGI or vLLM?

1
#3 opened 11 months ago by kno10

only use one gpu?

2
#2 opened 11 months ago by jgbrblmd

persist dequantized model

1
#1 opened 11 months ago by nudelbrot