remove files
Browse files
- CHANGELOG.md +0 -3
- Dockerfile +0 -11
- README.md +0 -120
- demo.py +0 -35
- lyraChatGLM/__init__.py +0 -10
- lyraChatGLM/model.py +0 -131
- models/config.json +0 -25
- models/configuration_chatglm.py +0 -92
- models/ice_text.model +0 -3
- models/tokenization_chatglm.py +0 -346
- models/tokenizer_config.json +0 -19
- requirements.txt +0 -4
CHANGELOG.md
DELETED
@@ -1,3 +0,0 @@
## v1.0

- Add accelerated ChatGLM-6B model (from: https://huggingface.co/THUDM/chatglm-6b)
Dockerfile
DELETED
@@ -1,11 +0,0 @@
FROM nvcr.io/nvidia/pytorch:23.02-py3

WORKDIR /workdir

COPY requirements.txt /workdir/

# since installing icetk will install protobuf 3.18.3, and we need protobuf==3.20.3
RUN pip install -r requirements.txt && \
    pip install protobuf==3.20.3
README.md
DELETED
@@ -1,120 +0,0 @@
---
license: creativeml-openrail-m
language:
- en
tags:
- LLM
- tensorRT
- ChatGLM
---
## Model Card for lyraChatGLM

lyraChatGLM is currently the **fastest ChatGLM-6B** available. To the best of our knowledge, it is the **first accelerated version of ChatGLM-6B**.

The inference speed of lyraChatGLM has achieved a **10x** speed-up over the early original version. We are still working hard to further improve the performance.

Among its main features are:

- weights: original ChatGLM-6B weights released by THUDM.
- device: lyraChatGLM is mainly based on TensorRT compiled for SM=80 (A100, for example).
- batch_size: compiled with dynamic batch size, max batch_size = 8

## Speed

### test environment

- device: Nvidia A100 40G
- batch size: 8

**Since the early ChatGLM version didn't support batch inference, `original` in the table below was measured at batch_size=1.**

**According to [this discussion](https://huggingface.co/TMElyralab/lyraChatGLM/discussions/6), this bug has been fixed and the speed at batch_size=8 reaches up to 137 tokens/s. We will evaluate and update the latest performance.**

|version|speed|
|:-:|:-:|
|original|30 tokens/s|
|lyraChatGLM|310 tokens/s|

## Model Sources

- **Repository:** https://huggingface.co/THUDM/chatglm-6b

## Try Demo in 2 fast steps

``` bash
#step 1
git clone https://huggingface.co/TMElyralab/lyraChatGLM
cd lyraChatGLM

#step 2
docker run --gpus=1 --rm --net=host -v ${PWD}:/workdir yibolu96/lyra-chatglm-env:0.0.1 python3 /workdir/demo.py
```

## Uses

```python
from transformers import AutoTokenizer
from lyraChatGLM import GLM6B, FasterChatGLM
import os

current_workdir = os.path.dirname(__file__)

MAX_OUT_LEN = 100
chatglm6b_dir = os.path.join(current_workdir, "models")
tokenizer = AutoTokenizer.from_pretrained(chatglm6b_dir, trust_remote_code=True)
input_str = ["为什么我们需要对深度学习模型加速?", ]
inputs = tokenizer(input_str, return_tensors="pt", padding=True)
input_ids = inputs.input_ids.to('cuda:0')

plan_path = os.path.join(current_workdir, "models/glm6b-bs8.ftm")

# kernel for chat model.
kernel = GLM6B(plan_path=plan_path,
               batch_size=1,
               num_beams=1,
               use_cache=True,
               num_heads=32,
               emb_size_per_heads=128,
               decoder_layers=28,
               vocab_size=150528,
               max_seq_len=MAX_OUT_LEN)

chat = FasterChatGLM(model_dir=chatglm6b_dir, kernel=kernel).half().cuda()

# generate
sample_output = chat.generate(inputs=input_ids, max_length=MAX_OUT_LEN)
# de-tokenize model output to text
res = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(res)
```

## Demo output

### input
为什么我们需要对深度学习模型加速? (Why do we need to accelerate deep learning models?)

### output
Why do we need to accelerate deep learning models? Training a deep learning model requires a large amount of computing resources, in particular memory, GPUs (graphics processing units), and other compute. Training therefore takes considerable time, and if a model cannot be trained quickly, progress may be slow or training may become infeasible.

Here are some of the reasons we need to accelerate deep learning models:

1. Training deep neural networks demands substantial computing resources, so faster training speed is needed.

### TODO:

We plan to implement a FasterTransformer version to publish a much faster release. Stay tuned!

## Citation
``` bibtex
@Misc{lyraChatGLM2023,
  author = {Kangjian Wu, Zhengtao Wang, Yibo Lu, Bin Wu},
  title = {lyraChatGLM: Accelerating ChatGLM by 10x+},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraChatGLM}},
  year = {2023}
}
```

## Report bug
- start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraChatGLM/discussions
- report bugs with a `[bug]` mark in the title.
demo.py
DELETED
@@ -1,35 +0,0 @@
# coding=utf-8

from transformers import AutoTokenizer
from lyraChatGLM import GLM6B, FasterChatGLM
import os

current_workdir = os.path.dirname(__file__)

MAX_OUT_LEN = 100
chatglm6b_dir = os.path.join(current_workdir, "models")
tokenizer = AutoTokenizer.from_pretrained(chatglm6b_dir, trust_remote_code=True)
input_str = ["为什么我们需要对深度学习模型加速?", ]
inputs = tokenizer(input_str, return_tensors="pt", padding=True)
input_ids = inputs.input_ids.to('cuda:0')

plan_path = os.path.join(current_workdir, "models/glm6b-bs8.ftm")

# kernel for chat model.
kernel = GLM6B(plan_path=plan_path,
               batch_size=1,
               num_beams=1,
               use_cache=True,
               num_heads=32,
               emb_size_per_heads=128,
               decoder_layers=28,
               vocab_size=150528,
               max_seq_len=MAX_OUT_LEN)

chat = FasterChatGLM(model_dir=chatglm6b_dir, kernel=kernel).half().cuda()

# generate
sample_output = chat.generate(inputs=input_ids, max_length=MAX_OUT_LEN)
# de-tokenize model output to text
res = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(res)
lyraChatGLM/__init__.py
DELETED
@@ -1,10 +0,0 @@
import os
import ctypes

current_workdir = os.path.dirname(__file__)
ctypes.cdll.LoadLibrary(os.path.join(current_workdir, "libnvinfer_plugin.so"))
os.environ["TORCH_USE_RTLD_GLOBAL"] = "YES"

import torch
from .glm import GLM6B
from .model import FasterChatGLM
lyraChatGLM/model.py
DELETED
@@ -1,131 +0,0 @@
import torch
from transformers.modeling_outputs import CausalLMOutputWithPast
from transformers.modeling_utils import PreTrainedModel
from transformers import AutoConfig
from typing import Dict, List, Tuple, Union, Optional


class FasterChatGLM(PreTrainedModel):
    def __init__(self, model_dir, kernel, *inputs, **kwargs):
        config = AutoConfig.from_pretrained(model_dir, trust_remote_code=True)
        config.n_head = config.num_attention_heads
        config.n_embd = config.hidden_size
        config.n_layer = config.num_layers
        super().__init__(config, *inputs, **kwargs)
        self.kernel = kernel
        self.fake_reg = torch.nn.Linear(2, 2)
        self.position_encoding_2d = True

    def forward(self, input_ids, position_ids, attention_mask, past_key_values, *args, **kwargs):
        inputs_values = [input_ids, position_ids, attention_mask]
        if past_key_values is not None:
            inputs_values = inputs_values + past_key_values

        computed = self.kernel.infer(inputs_values)
        logits = computed[0]
        if len(computed) == 1:
            present_key_values = None
        else:
            present_key_values = computed[1:]

        return CausalLMOutputWithPast(logits=logits, past_key_values=present_key_values)

    def get_masks_and_position_ids(self, seq, mask_position, context_length, device, gmask=False):
        attention_mask = torch.ones((1, context_length, context_length), device=device)
        attention_mask.tril_()
        attention_mask[..., :context_length - 1] = 1
        attention_mask.unsqueeze_(1)
        attention_mask = (attention_mask < 0.5).bool()

        if self.position_encoding_2d:
            seq_length = seq.index(150004)
            position_ids = torch.arange(context_length, dtype=torch.long, device=device)
            if not gmask:
                position_ids[seq_length:] = mask_position
            block_position_ids = torch.cat((
                torch.zeros(seq_length, dtype=torch.long, device=device),
                torch.arange(context_length - seq_length, dtype=torch.long, device=device) + 1
            ))
            position_ids = torch.stack((position_ids, block_position_ids), dim=0)
        else:
            position_ids = torch.arange(context_length, dtype=torch.long, device=device)
            if not gmask:
                position_ids[context_length - 1:] = mask_position

        position_ids = position_ids.unsqueeze(0)

        return attention_mask, position_ids

    def prepare_one_sample(self, input_id, mask_token, past, past_key_values, use_gmask):
        seq = input_id.tolist()

        # check for the mask token before calling index(), otherwise index() itself raises
        if mask_token not in seq:
            raise ValueError("You have to add either [MASK] or [gMASK] in your input")
        mask_position = seq.index(mask_token)

        # only last token for input_ids if past is not None
        if past is not None or past_key_values is not None:
            context_length = seq.index(150004)
            last_token = input_id[-1].unsqueeze(-1).unsqueeze(0)  # 2 dim
            proc_input_id = last_token
            if self.position_encoding_2d:
                position_ids = torch.tensor([[[mask_position], [len(seq) - context_length]]], dtype=torch.long,
                                            device=input_id.device)
            else:
                position_ids = torch.tensor([[mask_position]], dtype=torch.long, device=input_id.device)

            attention_mask = torch.zeros(1, 1, 1, 1, device=input_id.device)
        else:
            proc_input_id = input_id.unsqueeze(0)
            attention_mask, position_ids = self.get_masks_and_position_ids(
                seq=seq,
                mask_position=mask_position,
                context_length=len(seq),
                device=input_id.device,
                gmask=use_gmask
            )

        return (proc_input_id.to(torch.int32), position_ids.to(torch.int32),
                attention_mask.to(torch.bool))

    def prepare_inputs_for_generation(
            self,
            input_ids: torch.LongTensor,
            past: Optional[torch.Tensor] = None,
            past_key_values: Optional[torch.Tensor] = None,
            attention_mask: Optional[torch.Tensor] = None,
            use_cache: bool = None,
            **kwargs
    ) -> dict:

        MASK, gMASK = 150000, 150001
        mask_token = MASK if MASK in input_ids else gMASK
        use_gmask = False if MASK in input_ids else gMASK

        batch_input_ids, batch_position_ids, batch_attention_mask = [], [], []
        for input_id in input_ids:
            proc_input_id, position_id, attention_mask = self.prepare_one_sample(
                input_id, mask_token, past, past_key_values, use_gmask)
            batch_input_ids.append(proc_input_id)
            batch_position_ids.append(position_id)
            batch_attention_mask.append(attention_mask)

        batch_input_ids = torch.vstack(batch_input_ids)
        batch_position_ids = torch.vstack(batch_position_ids)
        batch_attention_mask = torch.vstack(batch_attention_mask)

        if past is None:
            past = past_key_values

        if past is not None or past_key_values is not None:
            self.kernel.set_context_mode(False)
        else:
            self.kernel.set_context_mode(self.config.use_cache)

        return {
            "input_ids": batch_input_ids,
            "past_key_values": past_key_values,
            "position_ids": batch_position_ids,
            "attention_mask": batch_attention_mask
        }
models/config.json
DELETED
@@ -1,25 +0,0 @@
{
  "_name_or_path": "THUDM/chatglm-6b",
  "architectures": [
    "ChatGLMModel"
  ],
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
  },
  "bos_token_id": 150004,
  "eos_token_id": 150005,
  "hidden_size": 4096,
  "inner_hidden_size": 16384,
  "layernorm_epsilon": 1e-05,
  "max_sequence_length": 2048,
  "model_type": "chatglm",
  "num_attention_heads": 32,
  "num_layers": 28,
  "position_encoding_2d": true,
  "torch_dtype": "float16",
  "transformers_version": "4.23.1",
  "use_cache": true,
  "vocab_size": 150528
}
models/configuration_chatglm.py
DELETED
@@ -1,92 +0,0 @@
""" ChatGLM model configuration """

from transformers.configuration_utils import PretrainedConfig
from transformers.utils import logging

logger = logging.get_logger(__name__)


class ChatGLMConfig(PretrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`~ChatGLMModel`].
    It is used to instantiate an ChatGLM model according to the specified arguments, defining the model
    architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of
    the ChatGLM-6B [THUDM/ChatGLM-6B](https://huggingface.co/THUDM/chatglm-6b) architecture.

    Configuration objects inherit from [`PretrainedConfig`] and can be used
    to control the model outputs. Read the documentation from [`PretrainedConfig`]
    for more information.


    Args:
        vocab_size (`int`, *optional*, defaults to 150528):
            Vocabulary size of the ChatGLM-6B model. Defines the number of different tokens that can be represented by the
            `inputs_ids` passed when calling [`~ChatGLMModel`] or
            [`~TFChatGLMModel`].
        hidden_size (`int`, *optional*, defaults to 4096):
            Dimension of the encoder layers and the pooler layer.
        num_hidden_layers (`int`, *optional*, defaults to 28):
            Number of hidden layers in the Transformer encoder.
        num_attention_heads (`int`, *optional*, defaults to 32):
            Number of attention heads for each attention layer in the Transformer encoder.
        inner_hidden_size (`int`, *optional*, defaults to 16384):
            Dimension of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        max_sequence_length (`int`, *optional*, defaults to 512):
            The maximum sequence length that this model might ever be used with.
            Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
        layernorm_epsilon (`float`, *optional*, defaults to 1e-5):
            The epsilon used by the layer normalization layers.
        use_cache (`bool`, *optional*, defaults to `True`):
            Whether the model should return the last key/values attentions (not used by all models).
    Example:

    ```python
    >>> from configuration_chatglm import ChatGLMConfig
    >>> from modeling_chatglm import ChatGLMModel

    >>> # Initializing a ChatGLM-6B THUDM/ChatGLM-6B style configuration
    >>> configuration = ChatGLMConfig()

    >>> # Initializing a model from the THUDM/ChatGLM-6B style configuration
    >>> model = ChatGLMModel(configuration)

    >>> # Accessing the model configuration
    >>> configuration = model.config
    ```
    """
    model_type = "chatglm"

    def __init__(
        self,
        vocab_size=150528,
        hidden_size=4096,
        num_layers=28,
        num_attention_heads=32,
        layernorm_epsilon=1e-5,
        use_cache=False,
        bos_token_id=150004,
        eos_token_id=150005,
        pad_token_id=0,
        max_sequence_length=2048,
        inner_hidden_size=16384,
        position_encoding_2d=True,
        **kwargs
    ):
        self.num_layers = num_layers
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        self.max_sequence_length = max_sequence_length
        self.layernorm_epsilon = layernorm_epsilon
        self.inner_hidden_size = inner_hidden_size
        self.use_cache = use_cache
        self.bos_token_id = bos_token_id
        self.eos_token_id = eos_token_id
        self.pad_token_id = pad_token_id
        self.position_encoding_2d = position_encoding_2d
        super().__init__(
            pad_token_id=pad_token_id,
            bos_token_id=bos_token_id,
            eos_token_id=eos_token_id,
            **kwargs
        )
models/ice_text.model
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:99871e0c85db81ad7af1028854fd091cd5778c8414ae9d94bbbc10d02c831c21
size 2699926
models/tokenization_chatglm.py
DELETED
@@ -1,346 +0,0 @@
"""Tokenization classes for ChatGLM."""
import sys
import unicodedata
from typing import List, Optional, Union
from functools import lru_cache
import os
import collections
import re

from transformers.tokenization_utils import PreTrainedTokenizer
from icetk.text_tokenizer import TextTokenizer
from icetk.utils import auto_create
import icetk.sentencepiece_model_pb2 as sp_model
from transformers.utils import logging

logger = logging.get_logger(__name__)

PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES = {
    "THUDM/chatglm-6b": 2048,
}


class SPTokenizer:
    def __init__(
        self,
        vocab_file,
        max_blank_length=80,
        byte_fallback=True,
    ):
        assert vocab_file is not None
        self.vocab_file = vocab_file
        self.special_tokens = ["[MASK]", "[gMASK]", "[sMASK]", "<unused_0>", "<sop>", "<eop>", "<ENC>", "<dBLOCK>"]
        self.max_blank_length = max_blank_length
        self.byte_fallback = byte_fallback
        self.text_tokenizer = self._build_text_tokenizer(encode_special_tokens=False)
        self.special_text_tokenizer = self._build_text_tokenizer(encode_special_tokens=True)

    @staticmethod
    def _configure_tokenizer(
        text_tokenizer: TextTokenizer,
        special_tokens: List[str],
        max_blank_length: int,
        byte_fallback: bool,
        encode_special_tokens=False,
    ):
        # special token
        special_token_type = 4 if encode_special_tokens else 3  # 3 - CONTROL, 4 - USER_DEFINE
        for token in special_tokens:
            text_tokenizer.proto.pieces.append(
                sp_model.ModelProto.SentencePiece(piece=token, score=0.0, type=special_token_type)
            )
        # whitespaces
        for token in [SPTokenizer.get_tab_token()] + [
            SPTokenizer.get_blank_token(i) for i in range(2, max_blank_length + 1)
        ]:
            text_tokenizer.proto.pieces.append(sp_model.ModelProto.SentencePiece(piece=token, score=0.0, type=4))
        # byte fallback
        if byte_fallback:
            text_tokenizer.proto.trainer_spec.byte_fallback = True
            for i in range(256):
                text_tokenizer.proto.pieces.append(
                    sp_model.ModelProto.SentencePiece(piece="<0x{:02X}>".format(i), score=0.0, type=6)
                )
        text_tokenizer.refresh()

    def _build_text_tokenizer(self, encode_special_tokens=False):
        tokenizer = TextTokenizer(self.vocab_file)
        self._configure_tokenizer(
            tokenizer, self.special_tokens, self.max_blank_length, self.byte_fallback, encode_special_tokens
        )
        return tokenizer

    def _get_text_tokenizer(self, encode_special_tokens=False):
        if encode_special_tokens:
            return self.special_text_tokenizer
        else:
            return self.text_tokenizer

    @staticmethod
    def get_blank_token(length: int):
        assert length >= 2
        return f"<|blank_{length}|>"

    @staticmethod
    def get_tab_token():
        return f"<|tab|>"

    @property
    def num_image_tokens(self):
        return 20000

    @property
    def num_text_tokens(self):
        return self.text_tokenizer.num_tokens

    @property
    def num_tokens(self):
        return self.num_image_tokens + self.num_text_tokens

    @staticmethod
    def _encode_whitespaces(text: str, max_len: int = 80):
        text = text.replace("\t", SPTokenizer.get_tab_token())
        for i in range(max_len, 1, -1):
            text = text.replace(" " * i, SPTokenizer.get_blank_token(i))
        return text

    def _preprocess(self, text: str, linebreak=True, whitespaces=True):
        if linebreak:
            text = text.replace("\n", "<n>")
        if whitespaces:
            text = self._encode_whitespaces(text, max_len=self.max_blank_length)
        return text

    def encode(
        self, text: str, linebreak=True, whitespaces=True, special_tokens=False, add_dummy_prefix=True
    ) -> List[int]:
        """
        @param text: Text to encode.
        @param linebreak: Whether to encode newline (\n) in text.
        @param whitespaces: Whether to encode multiple whitespaces or tab in text, useful for source code encoding.
        @param special_tokens: Whether to encode special token ([MASK], [gMASK], etc.) in text.
        @param add_dummy_prefix: Whether to add dummy blank space in the beginning.
        """
        text = self._preprocess(text, linebreak, whitespaces)
        if not add_dummy_prefix:
            text = "<n>" + text
        tmp = self._get_text_tokenizer(encode_special_tokens=special_tokens).encode(text)
        tokens = [x + self.num_image_tokens for x in tmp]
        return tokens if add_dummy_prefix else tokens[2:]

    def decode(self, text_ids: List[int], special_tokens=False) -> str:
        ids = [int(_id) - self.num_image_tokens for _id in text_ids]
        ids = [_id for _id in ids if _id >= 0]
        text = self._get_text_tokenizer(encode_special_tokens=special_tokens).decode(ids)
        text = text.replace("<n>", "\n")
        text = text.replace(SPTokenizer.get_tab_token(), "\t")
        for i in range(2, self.max_blank_length + 1):
            text = text.replace(self.get_blank_token(i), " " * i)
        return text

    def tokenize(
        self, text: str, linebreak=True, whitespaces=True, special_tokens=False, add_dummy_prefix=True
    ) -> List[str]:
        """
        @param text: Text to encode.
        @param linebreak: Whether to encode newline (\n) in text.
        @param whitespaces: Whether to encode multiple whitespaces or tab in text, useful for source code encoding.
        @param special_tokens: Whether to encode special token ([MASK], [gMASK], etc.) in text.
        @param add_dummy_prefix: Whether to add dummy blank space in the beginning.
        """
        text = self._preprocess(text, linebreak, whitespaces)
        if not add_dummy_prefix:
            text = "<n>" + text
        tokens = self._get_text_tokenizer(encode_special_tokens=special_tokens).tokenize(text)
        return tokens if add_dummy_prefix else tokens[2:]

    def __getitem__(self, x: Union[int, str]):
        if isinstance(x, int):
            if x < self.num_image_tokens:
                return "<image_{}>".format(x)
            else:
                return self.text_tokenizer.convert_id_to_token(x - self.num_image_tokens)
        elif isinstance(x, str):
            if x.startswith("<image_") and x.endswith(">") and x[7:-1].isdigit():
                return int(x[7:-1])
            else:
                return self.text_tokenizer.convert_token_to_id(x) + self.num_image_tokens
        else:
            raise ValueError("The key should be str or int.")


class ChatGLMTokenizer(PreTrainedTokenizer):
    """
    Construct a ChatGLM tokenizer. Based on byte-level Byte-Pair-Encoding.

    Args:
        vocab_file (`str`):
            Path to the vocabulary file.
    """

    vocab_files_names = {"vocab_file": "ice_text.model"}
    max_model_input_sizes = PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
    model_input_names = ["input_ids"]

    def __init__(
        self,
        vocab_file,
        do_lower_case=False,
        remove_space=False,
        bos_token='sop',
        eos_token='eos',
        eop_token='eop',
        mask_token='[MASK]',
        gmask_token='[gMASK]',
        padding_side="left",
        **kwargs
    ) -> None:
        super().__init__(
            do_lower_case=do_lower_case,
            remove_space=remove_space,
            padding_side=padding_side,
            **kwargs
        )

        self.do_lower_case = do_lower_case
        self.remove_space = remove_space
        self.vocab_file = vocab_file

        self.bos_token = bos_token
        self.eos_token = eos_token
        self.eop_token = eop_token
        self.mask_token = mask_token
        self.gMASK_token = gmask_token

        self.sp_tokenizer = SPTokenizer(vocab_file)

        """ Initialisation """

    @property
    def eop_token_id(self) -> Optional[int]:
        """
        `Optional[int]`: Id of the end of sentence token in the vocabulary. Returns `None` if the token has not been
        set.
        """
        if self.eop_token is None:
            return None
        return self.convert_tokens_to_ids(self.eop_token)

    @property
    def vocab_size(self):
        """ Returns vocab size """
        return self.sp_tokenizer.num_tokens

    def get_vocab(self):
        """ Returns vocab as a dict """
        vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
        vocab.update(self.added_tokens_encoder)
        return vocab

    def preprocess_text(self, inputs):
        if self.remove_space:
            outputs = " ".join(inputs.strip().split())
        else:
            outputs = inputs

        if self.do_lower_case:
            outputs = outputs.lower()

        return outputs

    def _tokenize(self, text, **kwargs):
        """ Returns a tokenized string. """
        text = self.preprocess_text(text)

        seq = self.sp_tokenizer.tokenize(text)

        return seq

    def decode(
        self,
        token_ids: Union[List[int], List[List[int]]],
        skip_special_tokens: bool = False,
        clean_up_tokenization_spaces: bool = True,
        spaces_between_special_tokens: bool = True,
        **kwargs
    ) -> str:
        if isinstance(token_ids[0], list):
            tokens = []
            for single_token_ids in token_ids:
                if self.pad_token_id in single_token_ids:  # remove pad
                    single_token_ids = list(filter((self.pad_token_id).__ne__, single_token_ids))
                tokens.append(self.sp_tokenizer.decode(single_token_ids))
            return (tokens)
        else:
            if self.pad_token_id in token_ids:  # remove pad
                token_ids = list(filter((self.pad_token_id).__ne__, token_ids))
            return self.sp_tokenizer.decode(token_ids)

    def _convert_token_to_id(self, token):
        """ Converts a token (str) in an id using the vocab. """
        return self.sp_tokenizer[token]

    def _convert_id_to_token(self, index):
        """Converts an index (integer) in a token (str) using the vocab."""
        return self.sp_tokenizer[index]

    def save_vocabulary(self, save_directory, filename_prefix=None):
        """
        Save the vocabulary and special tokens file to a directory.

        Args:
            save_directory (`str`):
                The directory in which to save the vocabulary.
            filename_prefix (`str`, *optional*):
                An optional prefix to add to the name of the saved files.

        Returns:
            `Tuple(str)`: Paths to the files saved.
        """
        if os.path.isdir(save_directory):
            # use the class-level vocab file name when saving into a directory
            vocab_file = os.path.join(
                save_directory, self.vocab_files_names["vocab_file"]
            )
        else:
            vocab_file = save_directory

        with open(self.vocab_file, 'rb') as fin:
            proto_str = fin.read()

        with open(vocab_file, "wb") as writer:
            writer.write(proto_str)

        return (vocab_file,)

    def build_inputs_with_special_tokens(
        self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
    ) -> List[int]:
        """
        Build model inputs from a sequence or a pair of sequence for sequence classification tasks by concatenating and
        adding special tokens. A BERT sequence has the following format:

        - single sequence: `[CLS] X [SEP]`
        - pair of sequences: `[CLS] A [SEP] B [SEP]`

        Args:
            token_ids_0 (`List[int]`):
                List of IDs to which the special tokens will be added.
            token_ids_1 (`List[int]`, *optional*):
                Optional second list of IDs for sequence pairs.

        Returns:
            `List[int]`: List of [input IDs](../glossary#input-ids) with the appropriate special tokens.
        """
        if token_ids_1 is not None:
            token_ids_0 += token_ids_1
        mask_ids = self.sp_tokenizer[self.mask_token]
        gmask_ids = self.sp_tokenizer[self.gMASK_token]
        if mask_ids not in token_ids_0 and gmask_ids not in token_ids_0:
            token_ids_0 += [gmask_ids]

        if token_ids_0[-1] != mask_ids and token_ids_0[-1] != gmask_ids:
            token_ids_0 += [self.sp_tokenizer[self.eos_token]]

        token_ids_0 += [self.sp_tokenizer[self.bos_token]]

        return token_ids_0
models/tokenizer_config.json
DELETED
@@ -1,19 +0,0 @@
{
  "name_or_path": "THUDM/chatglm-6b",
  "bos_token": "<sop>",
  "eop_token": "<eop>",
  "eos_token": "</s>",
  "gmask_token": "[gMASK]",
  "mask_token": "[MASK]",
  "pad_token": "<pad>",
  "unk_token": "<unk>",
  "remove_space": false,
  "do_lower_case": false,
  "tokenizer_class": "ChatGLMTokenizer",
  "auto_map": {
    "AutoTokenizer": [
      "tokenization_chatglm.ChatGLMTokenizer",
      null
    ]
  }
}
requirements.txt
DELETED
@@ -1,4 +0,0 @@
icetk
torch
transformers