Text Generation
Transformers
PyTorch
llama
text-generation-inference
Inference Endpoints
TheBloke
Maximum sequence length for a Llama 2 model is 4096
d3850bc
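The commit message above gives the limit (4096) but does not say which file or field was changed. As a minimal sketch, assuming the fix targets the `max_position_embeddings` field of a Transformers-style `config.json`, the check and correction could look like the following. The helper names and the sample config fragment are hypothetical and used only for illustration.

```python
import json

# Value stated in the commit message.
LLAMA2_MAX_SEQ_LEN = 4096


def has_expected_context_length(config: dict, expected: int = LLAMA2_MAX_SEQ_LEN) -> bool:
    """Return True if the config declares the expected maximum sequence length."""
    # Assumption: the limit lives in `max_position_embeddings`.
    return config.get("max_position_embeddings") == expected


def set_context_length(config_text: str, expected: int = LLAMA2_MAX_SEQ_LEN) -> str:
    """Parse a config.json string, set the sequence-length field, and re-serialize it."""
    config = json.loads(config_text)
    config["max_position_embeddings"] = expected
    return json.dumps(config, indent=2)


# Hypothetical config fragment with an outdated limit (not taken from the repo).
stale_config = '{"model_type": "llama", "max_position_embeddings": 2048}'
fixed_config = set_context_length(stale_config)

print(has_expected_context_length(json.loads(stale_config)))
print(has_expected_context_length(json.loads(fixed_config)))
```

In a real repository the same check would be run against the actual `config.json`; other files, such as tokenizer or generation settings, may carry their own length limits and are not covered by this sketch.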