zRzRzRzRzRzRzR sixsixcoder committed on
Commit b44e98f · verified · 1 Parent(s): 82d25da

Update README_zh.md (#1)


- Update README_zh.md (9678610124843848e40ab39e5c327c08fe61f8ce)


Co-authored-by: sixgod <sixsixcoder@users.noreply.huggingface.co>

Files changed (1)
  1. README_zh.md +53 -25
README_zh.md CHANGED
@@ -1,27 +1,20 @@
- # GLM-4-9B

- If you are using the weights from this repository, please update to

- <span style="color:red; font-weight:bold;"> transformers>=4.46.0 </span>

- These weights are **not compatible** with older versions of the transformers library.

- ## Model Introduction

- GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by
- Zhipu AI. In evaluations on datasets covering semantics, mathematics, reasoning, code, and knowledge, **GLM-4-9B**
- and its human-preference-aligned version **GLM-4-9B-Chat** both outperform Llama-3-8B. Beyond multi-turn
- conversation, GLM-4-9B-Chat also offers advanced features such as web browsing, code execution, custom tool calling
- (Function Call), and long-text reasoning (supporting up to 128K context). This generation adds multilingual support,
- covering 26 languages including Japanese, Korean, and German. We have also launched the **GLM-4-9B-Chat-1M** model,
- which supports a 1M context length (about 2 million Chinese characters), and **GLM-4V-9B**, a multimodal model based
- on GLM-4-9B. **GLM-4V-9B** supports dialogue in both Chinese and English at a high resolution of 1120x1120. Across
- multimodal evaluations covering comprehensive Chinese and English abilities, perception and reasoning, text
- recognition, and chart understanding, GLM-4V-9B outperforms GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max,
- and Claude 3 Opus.

- We evaluated the GLM-4-9B base model on some typical tasks, and the results are as follows:
 
  | Model | MMLU | C-Eval | GPQA | GSM8K | MATH | HumanEval |
  |:--------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:---------:|
@@ -30,18 +23,53 @@ We evaluated the GLM-4-9B base model on some typical tasks, and the results are
  | ChatGLM3-6B-Base | 61.4 | 69.0 | - | 72.3 | 25.7 | - |
  | GLM-4-9B | **74.7** | **77.1** | **34.3** | **84.0** | **30.4** | **70.1** |

- **This repository is the base version of GLM-4-9B, supporting 8K context length.**

- For more inference code and requirements, please visit our [github page](https://github.com/THUDM/GLM-4).

- ## LICENSE

- The weights of the GLM-4 model are available under the terms of [LICENSE](LICENSE).

- ## Citations

- If you find our work useful, please consider citing the following paper.

  ```
  @misc{glm2024chatglm,
@@ -52,4 +80,4 @@ If you find our work useful, please consider citing the following paper.
  archivePrefix={arXiv},
  primaryClass={cs.CL}
  }
- ```
 
+ # glm-4-9b
+
+ Read this in [English](README.md).
+
+ If you are using the weights from this repository, please update to
+
+ <span style="color:red; font-weight:bold;"> transformers>=4.46.0 </span>
+
+ These weights are **not compatible** with older versions of the transformers library.
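+
+ A quick way to confirm the installed version clears that floor (a minimal sketch, not part of the original README; `packaging` is already a dependency of transformers):
+
+ ```python
+ # Fail fast if the installed transformers is too old for these weights.
+ import transformers
+ from packaging.version import Version
+
+ assert Version(transformers.__version__) >= Version("4.46.0"), (
+     f"transformers {transformers.__version__} found; these weights need >= 4.46.0"
+ )
+ ```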
 
+ GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by
+ Zhipu AI. In evaluations on datasets covering semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and
+ its human-preference-aligned version GLM-4-9B-Chat both show strong performance. Beyond multi-turn conversation,
+ GLM-4-9B-Chat also offers advanced features such as web browsing, code execution, custom tool calling (Function
+ Call), and long-text reasoning (supporting up to 128K context). This generation adds multilingual support, covering
+ 26 languages including Japanese, Korean, and German. We have also released a model supporting a 1M context length
+ (about 2 million Chinese characters).
+
+ We evaluated the GLM-4-9B base model on some typical tasks, with results as follows:
 
  | Model | MMLU | C-Eval | GPQA | GSM8K | MATH | HumanEval |
  |:--------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:---------:|

  | ChatGLM3-6B-Base | 61.4 | 69.0 | - | 72.3 | 25.7 | - |
  | GLM-4-9B | **74.7** | **77.1** | **34.3** | **84.0** | **30.4** | **70.1** |

+ **This repository is the base version of GLM-4-9B, supporting an `8K` context length.**
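+
+ Because prompt and completion share this 8K budget, it can help to count prompt tokens before generating (an illustrative check, not part of the original README; 8192 is the advertised `8K` window):
+
+ ```python
+ from transformers import AutoTokenizer
+
+ CONTEXT_WINDOW = 8192  # the advertised 8K window, shared by prompt and generated tokens
+ tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-hf", trust_remote_code=True)
+
+ prompt = "你是谁"  # any prompt text
+ n_prompt = len(tokenizer(prompt)["input_ids"])
+ print(f"{n_prompt} prompt tokens, {CONTEXT_WINDOW - n_prompt} left for generation")
+ ```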
+
+ ## Running the Model
+
+ **For more inference code and requirements, please visit our [github](https://github.com/THUDM/GLM-4).**
+
+ **Please install dependencies strictly according to the [requirements](https://github.com/THUDM/GLM-4/blob/main/basic_demo/requirements.txt); otherwise the model will not run correctly.**
+
+ ### Transformers Inference Code
+
+ ```python
+ import os
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # GPU index: a single ID for one GPU, a comma-separated list for several
+
+ MODEL_PATH = "THUDM/glm-4-9b-hf"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     MODEL_PATH,
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=True,
+     trust_remote_code=True,
+     device_map="auto"
+ ).eval()
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
+
+ # GLM-4-9B is a base model, so the prompt is plain text ("你是谁" means "Who are you") rather than a chat template.
+ encoding = tokenizer("你是谁<|endoftext|>")
+ inputs = {key: torch.tensor([value]).to(device) for key, value in encoding.items()}
+
+ gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
+ with torch.no_grad():
+     outputs = model.generate(**inputs, **gen_kwargs)
+     outputs = outputs[:, inputs['input_ids'].shape[1]:]  # keep only the newly generated tokens
+     print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
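+
+ With `"do_sample": True` and `"top_k": 1`, only the single most likely token survives filtering, so the call above is effectively greedy decoding; note also that `max_length` counts prompt plus generated tokens. A more direct way to state the same intent (a minor variation, not part of the original README):
+
+ ```python
+ # Greedy decoding, with the budget applied to generated tokens only.
+ gen_kwargs = {"max_new_tokens": 2048, "do_sample": False}
+ ```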
+
+ ## License
+
+ Use of the GLM-4 model weights must follow the [LICENSE](LICENSE).
+
+ ## Citation
+
+ If you find our work helpful, please consider citing the following paper.
 
  ```
  @misc{glm2024chatglm,

  archivePrefix={arXiv},
  primaryClass={cs.CL}
  }
+ ```