Text Generation
Transformers
Safetensors
English
llama
nvidia
llama3.1
conversational
text-generation-inference

How do I run inference on a 40 GB A100 with 80 GB of RAM on Colab Pro?

#17
by SadeghPouriyan - opened

I want to use this model on Colab Pro, where the runtime has a 40 GB A100 and 80 GB of RAM. What is the best practice for running it on this setup?
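One common approach for a checkpoint this size on a single 40 GB A100 is 4-bit quantization (bitsandbytes NF4) with `device_map="auto"`, which lets Accelerate spill any overflow layers into the 80 GB of CPU RAM. The sketch below is not from this thread: the repo id and the assumption that this is the ~70B variant are mine, and the heavy download is gated behind a flag so the rough VRAM arithmetic can run anywhere.

```python
"""Sketch: fitting a ~70B Llama 3.1 checkpoint onto a 40 GB A100 (Colab Pro).

Assumptions (not stated in the thread): the repo id below and the ~70B
parameter count. 4-bit NF4 shrinks the weights to roughly the 40 GB mark;
device_map="auto" offloads whatever still doesn't fit to CPU RAM.
"""

def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.1) -> float:
    """Back-of-envelope VRAM need: weights at the given precision plus ~10% overhead."""
    return params_billion * (bits / 8) * overhead

print(f"70B @ fp16: ~{estimate_vram_gb(70, 16):.1f} GB")  # far above 40 GB
print(f"70B @ 4-bit: ~{estimate_vram_gb(70, 4):.1f} GB")  # near the 40 GB limit

RUN_HEAVY = False  # flip to True on a real A100 runtime; downloads ~40 GB of weights
if RUN_HEAVY:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"  # assumed repo id
    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb,
        device_map="auto",  # offload overflow layers to CPU RAM automatically
    )
    messages = [{"role": "user", "content": "Hello!"}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))
```

Layers offloaded to CPU make generation noticeably slower, so keeping the whole quantized model on the GPU (as 4-bit roughly allows here) is preferable when it fits.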
