---
license: other
license_name: glm-4
license_link: LICENSE
language:
  - zh
  - en
pipeline_tag: text-generation
tags:
  - glm
  - edge
inference: false
---

# GLM-Edge-Chat-4B-GGUF

For the Chinese version, click [here](README_zh.md).

## Inference with llama.cpp

### Installation

Support for this model is being integrated into the official `llama.cpp`. In the meantime, you can test it with the
following adapted fork:

```bash
git clone https://github.com/piDack/llama.cpp -b support_glm_edge_model
cd llama.cpp  # run the build from inside the cloned repository
cmake -B build -DGGML_CUDA=ON  # or enable another acceleration backend
cmake --build build -- -j
```

### Inference

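After installation, start the GLM-Edge Chat model with the following command, replacing `<path>` with the directory that contains the model file (with a CMake build, the `llama-cli` binary is typically located under `build/bin/`):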

```shell
llama-cli -m <path>/model.gguf -p "<|user|>\nhi<|assistant|>\n" -ngl 999
```

In the command-line interface, you can then type requests interactively, and the model replies to each one.
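
The prompt passed with `-p` above wraps a single user message between the `<|user|>` and `<|assistant|>` markers. As a minimal sketch, the shell function below builds the same string for an arbitrary message. The name `build_prompt` is hypothetical and not part of `llama.cpp`, and the function mirrors only the single-turn format shown in the command above; it makes no assumption about multi-turn formatting.

```shell
# Hypothetical helper (not part of llama.cpp): wrap one user message in the
# single-turn prompt format used by the llama-cli command above.
build_prompt() {
  # "\\n" inside double quotes yields the literal two characters \n,
  # which is exactly what the original command passes to -p.
  printf '%s' "<|user|>\\n$1<|assistant|>\\n"
}

# Example usage (replace the model path with your own):
#   llama-cli -m /your/path/model.gguf -p "$(build_prompt "hi")" -ngl 999
```

Calling `build_prompt "hi"` produces the same string as the hand-written prompt in the command above, so the helper only saves retyping the markers.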

## License

Use of this model's weights is subject to the terms outlined in the [LICENSE](LICENSE).