germanjke committed on
Commit
d31deac
1 Parent(s): e62edf3

Create README.md

  1. README.md +88 -0
README.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ language:
+ - ru
+ base_model: t-tech/T-pro-it-1.0
+ tags:
+ - llama-cpp
+ ---
+
+ # T-pro-it-1.0-Q8_0-GGUF
+
+ **🚨 T-pro is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.**
+
+ ## Description
+
+ This repository contains the [`T-pro-it-1.0`](https://huggingface.co/t-tech/T-pro-it-1.0/) model quantized to the GGUF format (Q8_0) with [`llama.cpp`](https://github.com/ggerganov/llama.cpp).
+
+ ## 📊 Benchmarks
+
+ Proprietary models:
+
+ | Benchmark | T-pro-it-1.0 | T-pro-it-1.0-Q4_K_M | T-pro-it-1.0-Q5_K_M | T-pro-it-1.0-Q6_K | T-pro-it-1.0-Q8_0 | GPT-4o | GPT-4o-mini | GigaChat Max 1.0.26.20 |
+ |---------------|--------------|---------------------|---------------------|-------------------|-------------------|--------|-------------|------------------------|
+ | Arena-Hard-Ru | **90.17** | 89.0 | 89.29 | 88.5 | 89.35 | <u>84.87</u> | 81 | - |
+
+ Open-source models:
+
+ | Benchmark | T-pro-it-1.0 | T-pro-it-1.0-Q4_K_M | T-pro-it-1.0-Q5_K_M | T-pro-it-1.0-Q6_K | T-pro-it-1.0-Q8_0 | Qwen-2.5-32B-Instruct | T-pro-it-1.0 | gemma-2-27b-it | Llama-3.3-70B-Instruct |
+ |---------------|--------------|---------------------|---------------------|-------------------|-------------------|-----------------------|--------------|----------------|------------------------|
+ | Arena-Hard-Ru | **90.17** | 89.0 | 89.29 | 88.5 | 89.35 | 74.54 | <u>80.23</u> | 66.4 | 76.51 |
+
+ ## llama.cpp usage
+
+ ### Server
+
+ From Hugging Face:
+
+ ```bash
+ llama-server --hf-repo t-tech/T-pro-it-1.0-Q8_0-GGUF --hf-file t-pro-it-1.0-q8_0.gguf -c 8192
+ ```
+
+ Or from a local file:
+
+ ```bash
+ ./build/bin/llama-server -m t-pro-it-1.0-q8_0.gguf -c 8192
+ ```
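
Loading a large GGUF file can take a while, so a client may want to wait until the server is ready before sending requests. The sketch below polls llama-server's `/health` endpoint. `wait_for_server` is a hypothetical helper name, and the base URL `http://localhost:8080`, the 30 attempts, and the 2-second delay are assumed defaults, not values taken from this README.

```shell
#!/bin/sh
# Sketch only: wait_for_server is a hypothetical helper, not part of llama.cpp.
# $1: base URL (assumed default http://localhost:8080)
# $2: maximum number of attempts (assumed default 30)
# $3: delay between attempts in seconds (assumed default 2)
wait_for_server() {
  url=${1:-http://localhost:8080}
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors, e.g. while the model is still loading.
    if curl --silent --fail --max-time 5 --output /dev/null "$url/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # The server never answered successfully within the allowed attempts.
  return 1
}
```

A typical call would be `wait_for_server http://localhost:8080 && echo "server is ready"`, run after starting `llama-server` in another terminal.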
+
+ ### POST request
+
+ ```bash
+ curl --request POST \
+     --url http://localhost:8080/completion \
+     --header "Content-Type: application/json" \
+     --data '{
+         "prompt": "<|im_start|>user\nРасскажи мне чем отличается Python от C++?\n<|im_end|>\n<|im_start|>assistant\n",
+         "n_predict": 256
+     }'
+ ```
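
The `/completion` request above writes the chat markers into the prompt by hand. As an alternative, llama-server also exposes an OpenAI-compatible `/v1/chat/completions` endpoint that applies the model's chat template itself, assuming the GGUF metadata carries one. The sketch below builds such a request. `chat_payload` and `ask` are hypothetical helper names, and the host and port are the same assumed `localhost:8080` as above. To stay dependency-free, the helper refuses messages that would need JSON escaping instead of trying to escape them.

```shell
#!/bin/sh
# Sketch only: chat_payload and ask are hypothetical helpers, not part of llama.cpp.

# Print a minimal chat-completions request body for a single user message.
# $1: user message, $2: max tokens to generate (assumed default 256)
chat_payload() {
  nl='
'
  # Reject quotes, backslashes and newlines: they would need JSON escaping.
  case $1 in
    *\"*|*\\*|*"$nl"*)
      echo "chat_payload: message needs JSON escaping" >&2
      return 1 ;;
  esac
  printf '{"messages":[{"role":"user","content":"%s"}],"max_tokens":%d}\n' "$1" "${2:-256}"
}

# Send the request to a llama-server assumed to listen on localhost:8080.
ask() {
  body=$(chat_payload "$1" "${2:-256}") || return 1
  curl --silent --request POST \
    --url http://localhost:8080/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data "$body"
}
```

With the server running, `ask "Расскажи мне чем отличается Python от C++?"` prints the JSON response. For arbitrary user input, a real JSON encoder is a safer choice than the simple guard used here.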
+
+ ## Ollama usage
+
+ ### Serve
+
+ ```bash
+ ollama serve
+ ```
+
+ ### Run
+
+ From Hugging Face:
+
+ ```bash
+ ollama run hf.co/t-tech/T-pro-it-1.0-Q8_0-GGUF
+ ```
+
+ Or from a local file:
+
+ ```bash
+ ollama create example -f Modelfile
+ ollama run example "Расскажи мне про отличия C++ и Python"
+ ```
+
+ where `Modelfile` contains:
+
+ ```
+ FROM ./t-pro-it-1.0-q8_0.gguf
+ ```
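
The `Modelfile` for the local Ollama setup can also be generated from a script. The sketch below writes the `FROM` line from the README and adds a `PARAMETER num_ctx` line. The 8192 default is an assumption chosen to mirror the `-c 8192` flag in the llama-server commands, and `write_modelfile` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: write_modelfile is a hypothetical helper, not part of Ollama.
# $1: path to the GGUF file
# $2: context size in tokens (assumed default 8192, mirroring -c 8192)
# $3: output path (assumed default ./Modelfile)
write_modelfile() {
  gguf=$1
  ctx=${2:-8192}
  out=${3:-Modelfile}
  {
    printf 'FROM %s\n' "$gguf"
    printf 'PARAMETER num_ctx %s\n' "$ctx"
  } > "$out"
}
```

For example, `write_modelfile ./t-pro-it-1.0-q8_0.gguf && ollama create example -f Modelfile` writes the file and registers the model under the name `example`.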