This is the Llama 2 13B model with two tokens added to the vocabulary: <|end_of_turn|> as id 32000 and <|PAD|> as id 32001. The input embedding of each new token is initialized to the mean of all existing input token embeddings, and its output embedding to the mean of all existing output token embeddings.
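
The mean initialization can be illustrated with a minimal, dependency-free sketch. It is not the script used to build this checkpoint: the helper name `extend_with_mean_rows` and the toy 4 x 2 matrices are assumptions chosen for illustration, standing in for the model's real input and output embedding matrices, whose 32000 existing rows are why the new tokens receive ids 32000 and 32001.

```python
from typing import List

Matrix = List[List[float]]


def extend_with_mean_rows(embeddings: Matrix, num_new_tokens: int) -> Matrix:
    """Return a copy of `embeddings` with `num_new_tokens` rows appended.

    Each appended row is the column-wise mean of the existing rows.
    (Hypothetical helper for illustration only.)
    """
    if not embeddings:
        raise ValueError("embeddings must contain at least one row")
    vocab_size = len(embeddings)
    dim = len(embeddings[0])
    # Column-wise mean over all existing token embeddings.
    mean_row = [sum(row[j] for row in embeddings) / vocab_size for j in range(dim)]
    extended = [list(row) for row in embeddings]
    extended.extend(list(mean_row) for _ in range(num_new_tokens))
    return extended


# Toy stand-ins (assumed values): 4 existing tokens, embedding dimension 2.
input_emb = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
output_emb = [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0], [0.0, 8.0]]

# Input and output matrices are extended independently, each with its own mean.
new_input_emb = extend_with_mean_rows(input_emb, num_new_tokens=2)
new_output_emb = extend_with_mean_rows(output_emb, num_new_tokens=2)
```

In this toy setup the two appended rows take indices 4 and 5, just as the new tokens take ids 32000 and 32001 after the original 32000 rows. Starting new tokens at the mean keeps their initial embeddings within the range of the existing ones, which is a common reason for choosing this over random initialization.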