sixsixcoder committed
Commit 82d25da · verified · 1 parent: 2c51392

Update README.md (#2)


- Update README.md (f8a3bcc90c8f7eeceae25bbe8983e4c87d71fd5c)


Co-authored-by: sixgod <sixsixcoder@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +38 -1
README.md CHANGED
@@ -49,7 +49,44 @@ We evaluated the GLM-4-9B base model on some typical tasks, and the results are

**This repository is the base version of GLM-4-9B, supporting 8K context length.**

- For more inference code and requirements, please visit our [github page](https://github.com/THUDM/GLM-4).
+
+ ## Quick Start
+
+ **For more inference code and requirements, please visit our [github page](https://github.com/THUDM/GLM-4).**
+
+ **Please install strictly according to the [dependencies](https://github.com/THUDM/GLM-4/blob/main/basic_demo/requirements.txt); otherwise the code will not run properly.**
+
+ ### Inference with the Transformers library (version 4.46.0 and later):
+
+ ```python
+ import os
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Select the visible GPU(s): a single index for one card, or a
+ # comma-separated list (e.g. '0,1') for multiple cards.
+ os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+ MODEL_PATH = "THUDM/glm-4-9b-hf"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     MODEL_PATH,
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=True,
+     trust_remote_code=True,
+     device_map="auto"
+ ).eval()
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
+
+ encoding = tokenizer("what is your name?<|endoftext|>")
+ inputs = {key: torch.tensor([value]).to(device) for key, value in encoding.items()}
+
+ gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
+ with torch.no_grad():
+     outputs = model.generate(**inputs, **gen_kwargs)
+     # Decode only the newly generated tokens, skipping the prompt.
+     outputs = outputs[:, inputs['input_ids'].shape[1]:]
+     print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```

## LICENSE
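
A side note on the generation settings in the added snippet: `do_sample=True` combined with `top_k=1` always picks the single highest-probability token, so it is effectively greedy decoding. Below is a minimal sketch, not part of this commit, that states the same intent directly using the standard `transformers` generation parameters `do_sample` and `max_new_tokens`; it assumes the `model`, `tokenizer`, and `inputs` objects constructed in the README example above.

```python
import torch

# Sketch only, not part of this commit: reuses `model`, `tokenizer`, and
# `inputs` exactly as built in the README snippet above.
gen_kwargs = {
    "max_new_tokens": 256,  # bounds only the generated tokens, unlike max_length
    "do_sample": False,     # explicit greedy decoding (same effect as do_sample=True, top_k=1)
}
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Keep only the tokens generated after the prompt.
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
    print(tokenizer.decode(new_tokens[0], skip_special_tokens=True))
```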