pabloce/Tess-v2.5-Qwen2-72B

This model is a converted version of migtissera/Tess-v2.5-Qwen2-72B in GGUF format.

For more details on the original model, please refer to its model card.

Installation

To use this model with llama.cpp, first install llama.cpp via Homebrew (works on macOS and Linux):

brew install llama.cpp

Usage

Command Line Interface (CLI)

To use the model via the CLI, run the following command:

llama-cli --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -p "The meaning to life and the universe is"

Server

To start the llama.cpp server with this model, use the following command:

llama-server --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -c 2048
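
Once the server is up, you can query it over HTTP. A minimal Python sketch, assuming the server is running on its default port (8080) and using its /completion endpoint; the helper names below are illustrative, not part of llama.cpp itself:

```python
import json
import urllib.request

# Assumption: llama-server is listening locally on the default port 8080.
SERVER_URL = "http://localhost:8080/completion"

def build_payload(prompt: str, n_predict: int = 128, temperature: float = 0.7) -> dict:
    """Assemble the JSON body accepted by llama.cpp's /completion route."""
    return {
        "prompt": prompt,
        "n_predict": n_predict,    # maximum number of tokens to generate
        "temperature": temperature,
    }

def query_server(prompt: str) -> str:
    """POST the prompt to the running server and return the generated text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        SERVER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

For example, query_server("The meaning to life and the universe is") returns the model's continuation of the prompt.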

Alternative Usage

You can also use this checkpoint directly by following the usage steps listed in the llama.cpp repository.

  1. Clone the llama.cpp repository from GitHub:

    git clone https://github.com/ggerganov/llama.cpp
    
  2. Navigate to the llama.cpp folder and build it with the LLAMA_CURL=1 flag. You can also include other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux):

    cd llama.cpp && LLAMA_CURL=1 make
    
  3. Run inference through the main binary (named llama-cli in recent llama.cpp builds):

    ./main --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -p "The meaning to life and the universe is"
    

    or start the server:

    ./server --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -c 2048
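
Instead of the --hf-repo/--hf-file flags, you can also fetch the GGUF file yourself and pass its local path with -m. A minimal Python sketch, assuming the huggingface_hub package is installed; pick_quant is a hypothetical helper that chooses between the repo's quantizations (e.g. 3-bit vs. 4-bit) by filename suffix:

```python
def download_gguf(repo_id: str, filename: str) -> str:
    """Fetch a GGUF file from the Hugging Face Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    return hf_hub_download(repo_id=repo_id, filename=filename)

def pick_quant(filenames, preference=("q4_k_m", "q3_k_m")):
    """Hypothetical helper: return the first file matching a preferred
    quantization suffix, or None if nothing matches."""
    for quant in preference:
        for name in filenames:
            if quant in name.lower():
                return name
    return None
```

For example, download_gguf("pabloce/Tess-v2.5-Qwen2-72B-gguf", "tess-2.5-qwen-2-70b-q3_k_m.gguf") downloads the file and returns a path you can pass to ./main -m or ./server -m.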
    
Model details

Format: GGUF
Model size: 72.7B params
Architecture: qwen2
Available quantizations: 3-bit, 4-bit
