Jamie@TitanML committed
Commit 08036e4
Parent(s): 29c2ccd
Upload folder using huggingface_hub

Files changed:
- .gitattributes +0 -1
- README.md +344 -0
- config.json +25 -0
- ct_output_models/config.json +6 -0
- ct_output_models/model.bin +3 -0
- ct_output_models/vocabulary.json +0 -0
- generation_config.json +6 -0
- special_tokens_map.json +5 -0
- tokenizer.json +0 -0
- tokenizer_config.json +9 -0
.gitattributes
CHANGED
@@ -25,7 +25,6 @@
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,344 @@
---
license: apache-2.0
language:
- en
datasets:
- togethercomputer/RedPajama-Data-1T
- togethercomputer/RedPajama-Data-Instruct
widget:
- text: |-
    Label the tweets as either 'positive', 'negative', 'mixed', or 'neutral':

    Tweet: I can say that there isn't anything I would change.
    Label: positive

    Tweet: I'm not sure about this.
    Label: neutral

    Tweet: I liked some parts but I didn't like other parts.
    Label: mixed

    Tweet: I think the background image could have been better.
    Label: negative

    Tweet: I really like it.
    Label:
  example_title: Sentiment Analysis
- text: |-
    Please answer the following question:

    Question: What is the capital of Canada?
    Answer: Ottawa

    Question: What is the currency of Switzerland?
    Answer: Swiss franc

    Question: In which country is Wisconsin located?
    Answer:
  example_title: Question Answering
- text: >-
    Given a news article, classify its topic.

    Possible labels: 1. World 2. Sports 3. Business 4. Sci/Tech


    Article: A nearby star thought to harbor comets and asteroids now appears to
    be home to planets, too.

    Label: Sci/Tech


    Article: Soaring crude prices plus worries about the economy and the outlook
    for earnings are expected to hang over the stock market next week during the
    depth of the summer doldrums.

    Label: Business


    Article: Murtagh a stickler for success Northeastern field hockey coach
    Cheryl Murtagh doesn't want the glare of the spotlight that shines on her to
    detract from a team that has been the America East champion for the past
    three years and has been to the NCAA tournament 13 times.

    Label:
  example_title: Topic Classification
- text: |-
    Paraphrase the given sentence into a different sentence.

    Input: Can you recommend some upscale restaurants in New York?
    Output: What upscale restaurants do you recommend in New York?

    Input: What are the famous places we should not miss in Paris?
    Output: Recommend some of the best places to visit in Paris?

    Input: Could you recommend some hotels that have cheap price in Zurich?
    Output:
  example_title: Paraphrasing
- text: >-
    Given a review from Amazon's food products, the task is to generate a short
    summary of the given review in the input.


    Input: I have bought several of the Vitality canned dog food products and
    have found them all to be of good quality. The product looks more like a
    stew than a processed meat and it smells better. My Labrador is finicky and
    she appreciates this product better than most.

    Output: Good Quality Dog Food


    Input: Product arrived labeled as Jumbo Salted Peanuts...the peanuts were
    actually small sized unsalted. Not sure if this was an error or if the
    vendor intended to represent the product as 'Jumbo'.

    Output: Not as Advertised


    Input: My toddler loves this game to a point where he asks for it. That's a
    big thing for me. Secondly, no glitching unlike one of their competitors
    (PlayShifu). Any tech I don’t have to reach out to support for help is a
    good tech for me. I even enjoy some of the games and activities in this.
    Overall, this is a product that shows that the developers took their time
    and made sure people would not be asking for refund. I’ve become bias
    regarding this product and honestly I look forward to buying more of this
    company’s stuff. Please keep up the great work.

    Output:
  example_title: Text Summarization
- text: |-
    Identify which sense of a word is meant in a given context.

    Context: The river overflowed the bank.
    Word: bank
    Sense: river bank

    Context: A mouse takes much more room than a trackball.
    Word: mouse
    Sense: computer mouse

    Context: The bank will not be accepting cash on Saturdays.
    Word: bank
    Sense: commercial (finance) banks

    Context: Bill killed the project
    Word: kill
    Sense:
  example_title: Word Sense Disambiguation
- text: >-
    Given a pair of sentences, choose whether the two sentences agree
    (entailment)/disagree (contradiction) with each other.

    Possible labels: 1. entailment 2. contradiction


    Sentence 1: The skier was on the edge of the ramp. Sentence 2: The skier was
    dressed in winter clothes.

    Label: entailment


    Sentence 1: The boy skated down the staircase railing. Sentence 2: The boy
    is a newbie skater.

    Label: contradiction


    Sentence 1: Two middle-aged people stand by a golf hole. Sentence 2: A
    couple riding in a golf cart.

    Label:
  example_title: Natural Language Inference
inference:
  parameters:
    temperature: 0.7
    top_p: 0.7
    top_k: 50
    max_new_tokens: 128
---

# RedPajama-INCITE-7B-Instruct

RedPajama-INCITE-7B-Instruct was developed by Together and leaders from the open-source AI community, including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, Stanford Center for Research on Foundation Models (CRFM), Stanford Hazy Research research group, and LAION.

The model was fine-tuned for few-shot applications on the data of [GPT-JT](https://huggingface.co/togethercomputer/GPT-JT-6B-v1), excluding tasks that overlap with the HELM core scenarios.

- Base Model: [RedPajama-INCITE-7B-Base](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base)
- Instruction-tuned Version: [RedPajama-INCITE-7B-Instruct](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct)
- Chat Version: [RedPajama-INCITE-7B-Chat](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Chat)

## Model Details
- **Developed by**: Together Computer.
- **Model type**: Language Model
- **Language(s)**: English
- **License**: Apache 2.0
- **Model Description**: A 6.9B parameter language model, fine-tuned from RedPajama-INCITE-7B-Base for few-shot applications.

# Quick Start

Please note that the model requires `transformers` version >= 4.25.1.
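If your environment is older, the requirement can be satisfied with pip (standard PyPI package names assumed):

```bash
pip install "transformers>=4.25.1" torch
```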

## GPU Inference

This requires a GPU with 16GB memory.

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), \
    f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", torch_dtype=torch.float16)
model = model.to('cuda:0')

# infer
prompt = "Q: The capital of France is?\nA:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Paris
"""
```
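If you would rather see tokens printed as they are generated instead of waiting for the full completion, here is a minimal sketch using `transformers`' `TextStreamer`, reusing the `tokenizer`, `model`, and `inputs` from the snippet above (assumes a `transformers` release recent enough to ship `TextStreamer`):

```python
from transformers import TextStreamer

# stream decoded tokens to stdout as they are produced; skip_prompt hides the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, streamer=streamer
)
```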

## GPU Inference in Int8

This requires a GPU with 12GB memory.

To run inference with int8, please ensure you have installed `accelerate` and `bitsandbytes`. You can install them with the following commands:

```bash
pip install accelerate
pip install bitsandbytes
```

Then you can run inference with int8 as follows:

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), \
    f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)

# infer
prompt = "Q: The capital of France is?\nA:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Paris
"""
```
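Note that recent `transformers` releases deprecate passing `load_in_8bit=True` directly to `from_pretrained` in favour of a quantization config. A minimal sketch of the equivalent call (assuming `bitsandbytes` and `accelerate` are installed as above):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# the quantization config replaces the bare load_in_8bit=True flag on newer transformers versions
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/RedPajama-INCITE-7B-Instruct",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
```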

## CPU Inference

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), \
    f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", torch_dtype=torch.bfloat16)

# infer
prompt = "Q: The capital of France is?\nA:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Paris
"""
```

Please note that since `LayerNormKernelImpl` is not implemented in fp16 for CPU, we use `bfloat16` for CPU inference.

# Uses

## Direct Use

Excluded uses are described below.

### Misuse, Malicious Use, and Out-of-Scope Use

It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.

#### Out-of-Scope Use

RedPajama-INCITE-7B-Instruct is a language model and may not perform well for use cases outside its intended scope.
For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society.
It is important to consider the limitations of the model and to only use it for its intended purpose.

#### Misuse and Malicious Use

RedPajama-INCITE-7B-Instruct is designed for language modeling.
Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating fake news, misinformation, or propaganda
- Promoting hate speech, discrimination, or violence against individuals or groups
- Impersonating individuals or organizations without their consent
- Engaging in cyberbullying or harassment
- Defamatory content
- Spamming or scamming
- Sharing confidential or sensitive information without proper authorization
- Violating the terms of use of the model or the data used to train it
- Creating automated bots for malicious purposes such as spreading malware, phishing scams, or spamming

## Limitations

RedPajama-INCITE-7B-Instruct, like other language models, has limitations that should be taken into consideration.
For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data.
We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive model.

## Training

**Training Data**

Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T)

**Training Procedure**

- **Hardware:** 8 × A100 GPUs
- **Optimizer:** Adam
- **Gradient Accumulations:** 1
- **Number of Tokens:** 1B
- **Learning rate:** 1e-5

## Community

Join us on [Together Discord](https://discord.gg/6ZVDU8tTD4)
config.json
ADDED
@@ -0,0 +1,25 @@
{
  "_name_or_path": "togethercomputer/RedPajama-INCITE-7B-Instruct",
  "architectures": [
    "GPTNeoXForCausalLM"
  ],
  "bos_token_id": 0,
  "eos_token_id": 0,
  "hidden_act": "gelu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 16384,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 2048,
  "model_type": "gpt_neox",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "rotary_emb_base": 10000,
  "rotary_pct": 1.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.28.1",
  "use_cache": true,
  "use_parallel_residual": false,
  "vocab_size": 50432
}
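As a quick sanity check, the hyperparameters above roughly reproduce the 6.9B parameter figure quoted in the README. A back-of-the-envelope estimate (ignoring biases and layer norms) in Python:

```python
# rough GPT-NeoX parameter count from the config values above
hidden, layers, vocab = 4096, 32, 50432

embeddings = 2 * vocab * hidden          # input embedding + untied output head (tie_word_embeddings: false)
attention  = 4 * hidden * hidden         # fused QKV projection + output projection, per layer
mlp        = 2 * hidden * (4 * hidden)   # up- and down-projection (intermediate_size = 4 * hidden)
total = embeddings + layers * (attention + mlp)

print(f"{total / 1e9:.2f}B parameters")  # ~6.86B, consistent with the advertised 6.9B
```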
ct_output_models/config.json
ADDED
@@ -0,0 +1,6 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "layer_norm_epsilon": null,
  "unk_token": "<|endoftext|>"
}
ct_output_models/model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8570b40313c8703bb9051c5f4ef13f58f658bdac4a4feb8d19e3df3d9c23ba7
size 6867593490
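The `ct_output_models/` folder looks like a CTranslate2 export of the model (a single `model.bin` alongside `config.json` and `vocabulary.json`). Assuming it was produced with `ct2-transformers-converter`, a minimal sketch of running it with the `ctranslate2` Python package (the tokenizer still comes from the original Hugging Face repo):

```python
import ctranslate2
from transformers import AutoTokenizer

# tokenizer from the original repo; converted weights from the uploaded folder
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
generator = ctranslate2.Generator("ct_output_models", device="cuda")  # or device="cpu"

prompt = "Q: The capital of France is?\nA:"
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))

results = generator.generate_batch(
    [tokens],
    max_length=128,
    sampling_temperature=0.7,
    sampling_topk=50,
    include_prompt_in_result=False,
)
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(results[0].sequences[0])))
```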
ct_output_models/vocabulary.json
ADDED
The diff for this file is too large to render.
See raw diff
generation_config.json
ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 0,
  "transformers_version": "4.28.1"
}
special_tokens_map.json
ADDED
@@ -0,0 +1,5 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer_config.json
ADDED
@@ -0,0 +1,9 @@
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 2048,
  "tokenizer_class": "GPTNeoXTokenizer",
  "unk_token": "<|endoftext|>"
}