Spaces:
Running
Running
imperialwool
committed on
Update gradio_app.py
Browse files- gradio_app.py +4 -4
gradio_app.py
CHANGED
@@ -6,15 +6,15 @@ import psutil
|
|
6 |
|
7 |
# Initing things
|
8 |
print("! INITING LLAMA MODEL !")
|
9 |
-
llm = Llama(model_path="./model.bin")
|
10 |
-
llama_model_name = "Vikhrmodels/Vikhr-
|
11 |
print("! INITING DONE !")
|
12 |
|
13 |
# Preparing things to work
|
14 |
title = "llama.cpp API"
|
15 |
desc = '''<h1>Hello, world!</h1>
|
16 |
-
This is showcase how to make own server with
|
17 |
-
I'm using here
|
18 |
But you can use GPU power as well!<br><br>
|
19 |
<h1>How to GPU?</h1>
|
20 |
Change <code>`CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS`</code> in Dockerfile on <code>`CMAKE_ARGS="-DLLAMA_CUBLAS=on"`</code>. Also you can try <code>`DLLAMA_CLBLAST`</code> or <code>`DLLAMA_METAL`</code>.<br><br>
|
|
|
6 |
|
7 |
# Initing things
|
8 |
print("! INITING LLAMA MODEL !")
|
9 |
+
llm = Llama(model_path="./model.bin") # LLaMa model
|
10 |
+
llama_model_name = "Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-GGUF" # This is just for indication in "three dots menu"
|
11 |
print("! INITING DONE !")
|
12 |
|
13 |
# Preparing things to work
|
14 |
title = "llama.cpp API"
|
15 |
desc = '''<h1>Hello, world!</h1>
|
16 |
+
This is showcase how to make own server with any Llama based model using llama_cpp.<br>
|
17 |
+
I'm using here 1.5b model just for example. Also here's only CPU power.<br>
|
18 |
But you can use GPU power as well!<br><br>
|
19 |
<h1>How to GPU?</h1>
|
20 |
Change <code>`CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS`</code> in Dockerfile on <code>`CMAKE_ARGS="-DLLAMA_CUBLAS=on"`</code>. Also you can try <code>`DLLAMA_CLBLAST`</code> or <code>`DLLAMA_METAL`</code>.<br><br>
|