Text Generation
Transformers
English
gpt_neox
Inference Endpoints
Jamie@TitanML committed
Commit 08036e4
Parent: 29c2ccd

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -25,7 +25,6 @@
25
  *.safetensors filter=lfs diff=lfs merge=lfs -text
26
  saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
  *.tar.* filter=lfs diff=lfs merge=lfs -text
28
- *.tar filter=lfs diff=lfs merge=lfs -text
29
  *.tflite filter=lfs diff=lfs merge=lfs -text
30
  *.tgz filter=lfs diff=lfs merge=lfs -text
31
  *.wasm filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,344 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ datasets:
6
+ - togethercomputer/RedPajama-Data-1T
7
+ - togethercomputer/RedPajama-Data-Instruct
8
+ widget:
9
+ - text: |-
10
+ Label the tweets as either 'positive', 'negative', 'mixed', or 'neutral':
11
+
12
+ Tweet: I can say that there isn't anything I would change.
13
+ Label: positive
14
+
15
+ Tweet: I'm not sure about this.
16
+ Label: neutral
17
+
18
+ Tweet: I liked some parts but I didn't like other parts.
19
+ Label: mixed
20
+
21
+ Tweet: I think the background image could have been better.
22
+ Label: negative
23
+
24
+ Tweet: I really like it.
25
+ Label:
26
+ example_title: Sentiment Analysis
27
+ - text: |-
28
+ Please answer the following question:
29
+
30
+ Question: What is the capital of Canada?
31
+ Answer: Ottawa
32
+
33
+ Question: What is the currency of Switzerland?
34
+ Answer: Swiss franc
35
+
36
+ Question: In which country is Wisconsin located?
37
+ Answer:
38
+ example_title: Question Answering
39
+ - text: >-
40
+ Given a news article, classify its topic.
41
+
42
+ Possible labels: 1. World 2. Sports 3. Business 4. Sci/Tech
43
+
44
+
45
+ Article: A nearby star thought to harbor comets and asteroids now appears to
46
+ be home to planets, too.
47
+
48
+ Label: Sci/Tech
49
+
50
+
51
+ Article: Soaring crude prices plus worries about the economy and the outlook
52
+ for earnings are expected to hang over the stock market next week during the
53
+ depth of the summer doldrums.
54
+
55
+ Label: Business
56
+
57
+
58
+ Article: Murtagh a stickler for success Northeastern field hockey coach
59
+ Cheryl Murtagh doesn't want the glare of the spotlight that shines on her to
60
+ detract from a team that has been the America East champion for the past
61
+ three years and has been to the NCAA tournament 13 times.
62
+
63
+ Label::
64
+ example_title: Topic Classification
65
+ - text: |-
66
+ Paraphrase the given sentence into a different sentence.
67
+
68
+ Input: Can you recommend some upscale restaurants in New York?
69
+ Output: What upscale restaurants do you recommend in New York?
70
+
71
+ Input: What are the famous places we should not miss in Paris?
72
+ Output: Recommend some of the best places to visit in Paris?
73
+
74
+ Input: Could you recommend some hotels that have cheap price in Zurich?
75
+ Output:
76
+ example_title: Paraphrasing
77
+ - text: >-
78
+ Given a review from Amazon's food products, the task is to generate a short
79
+ summary of the given review in the input.
80
+
81
+
82
+ Input: I have bought several of the Vitality canned dog food products and
83
+ have found them all to be of good quality. The product looks more like a
84
+ stew than a processed meat and it smells better. My Labrador is finicky and
85
+ she appreciates this product better than most.
86
+
87
+ Output: Good Quality Dog Food
88
+
89
+
90
+ Input: Product arrived labeled as Jumbo Salted Peanuts...the peanuts were
91
+ actually small sized unsalted. Not sure if this was an error or if the
92
+ vendor intended to represent the product as 'Jumbo'.
93
+
94
+ Output: Not as Advertised
95
+
96
+
97
+ Input: My toddler loves this game to a point where he asks for it. That's a
98
+ big thing for me. Secondly, no glitching unlike one of their competitors
99
+ (PlayShifu). Any tech I don’t have to reach out to support for help is a
100
+ good tech for me. I even enjoy some of the games and activities in this.
101
+ Overall, this is a product that shows that the developers took their time
102
+ and made sure people would not be asking for refund. I’ve become bias
103
+ regarding this product and honestly I look forward to buying more of this
104
+ company’s stuff. Please keep up the great work.
105
+
106
+ Output:
107
+ example_title: Text Summarization
108
+ - text: |-
109
+ Identify which sense of a word is meant in a given context.
110
+
111
+ Context: The river overflowed the bank.
112
+ Word: bank
113
+ Sense: river bank
114
+
115
+ Context: A mouse takes much more room than a trackball.
116
+ Word: mouse
117
+ Sense: computer mouse
118
+
119
+ Context: The bank will not be accepting cash on Saturdays.
120
+ Word: bank
121
+ Sense: commercial (finance) banks
122
+
123
+ Context: Bill killed the project
124
+ Word: kill
125
+ Sense:
126
+ example_title: Word Sense Disambiguation
127
+ - text: >-
128
+ Given a pair of sentences, choose whether the two sentences agree
129
+ (entailment)/disagree (contradiction) with each other.
130
+
131
+ Possible labels: 1. entailment 2. contradiction
132
+
133
+
134
+ Sentence 1: The skier was on the edge of the ramp. Sentence 2: The skier was
135
+ dressed in winter clothes.
136
+
137
+ Label: entailment
138
+
139
+
140
+ Sentence 1: The boy skated down the staircase railing. Sentence 2: The boy
141
+ is a newbie skater.
142
+
143
+ Label: contradiction
144
+
145
+
146
+ Sentence 1: Two middle-aged people stand by a golf hole. Sentence 2: A
147
+ couple riding in a golf cart.
148
+
149
+ Label:
150
+ example_title: Natural Language Inference
151
+ inference:
152
+ parameters:
153
+ temperature: 0.7
154
+ top_p: 0.7
155
+ top_k: 50
156
+ max_new_tokens: 128
157
+ ---
158
+
159
+ # RedPajama-INCITE-7B-Instruct
160
+
161
+ RedPajama-INCITE-7B-Instruct was developed by Together and leaders from the open-source AI community, including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, the Stanford Center for Research on Foundation Models (CRFM), the Stanford Hazy Research group, and LAION.
162
+
163
+ The model was fine-tuned for few-shot applications on the data of [GPT-JT](https://huggingface.co/togethercomputer/GPT-JT-6B-v1), excluding tasks that overlap with the HELM core scenarios.
164
+
165
+ - Base Model: [RedPajama-INCITE-7B-Base](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base)
166
+ - Instruction-tuned Version: [RedPajama-INCITE-7B-Instruct](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct)
167
+ - Chat Version: [RedPajama-INCITE-7B-Chat](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Chat)
168
+
169
+
170
+ ## Model Details
171
+ - **Developed by**: Together Computer.
172
+ - **Model type**: Language Model
173
+ - **Language(s)**: English
174
+ - **License**: Apache 2.0
175
+ - **Model Description**: A 6.9B parameter pretrained language model.
176
+
177
+ # Quick Start
178
+
179
+ Please note that the model requires `transformers` version >= 4.25.1.
180
+
181
+ ## GPU Inference
182
+
183
+ This requires a GPU with 16GB of memory.
184
+
185
+ ```python
186
+ import torch
187
+ import transformers
188
+ from transformers import AutoTokenizer, AutoModelForCausalLM
189
+
190
+ MIN_TRANSFORMERS_VERSION = '4.25.1'
191
+
192
+ # check transformers version
193
+ assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'
194
+
195
+ # init
196
+ tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
197
+ model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", torch_dtype=torch.float16)
198
+ model = model.to('cuda:0')
199
+ # infer
200
+ prompt = "Q: The capital of France is?\nA:"
201
+ inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
202
+ input_length = inputs.input_ids.shape[1]
203
+ outputs = model.generate(
204
+ **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
205
+ )
206
+ token = outputs.sequences[0, input_length:]
207
+ output_str = tokenizer.decode(token)
208
+ print(output_str)
209
+ """
210
+ Paris
211
+ """
212
+ ```
213
+
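+ Since the model is tuned for few-shot applications, the same setup can be driven with an in-context prompt. The following is a minimal sketch that reuses the `model` and `tokenizer` loaded above together with the sentiment prompt from the widget examples:
+
+ ```python
+ # Few-shot prompting sketch: reuses `model` and `tokenizer` from the snippet above.
+ few_shot_prompt = (
+     "Label the tweets as either 'positive', 'negative', 'mixed', or 'neutral':\n\n"
+     "Tweet: I can say that there isn't anything I would change.\nLabel: positive\n\n"
+     "Tweet: I'm not sure about this.\nLabel: neutral\n\n"
+     "Tweet: I really like it.\nLabel:"
+ )
+ inputs = tokenizer(few_shot_prompt, return_tensors='pt').to(model.device)
+ outputs = model.generate(
+     **inputs, max_new_tokens=4, do_sample=True, temperature=0.7, top_p=0.7, top_k=50
+ )
+ print(tokenizer.decode(outputs[0, inputs.input_ids.shape[1]:]))
+ ```
+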
214
+ ## GPU Inference in Int8
215
+
216
+ This requires a GPU with 12GB of memory.
217
+
218
+ To run inference with int8, please ensure that you have installed `accelerate` and `bitsandbytes`. You can install them with the following commands:
219
+
220
+ ```bash
221
+ pip install accelerate
222
+ pip install bitsandbytes
223
+ ```
224
+
225
+ Then you can run inference with int8 as follows:
226
+
227
+ ```python
228
+ import torch
229
+ import transformers
230
+ from transformers import AutoTokenizer, AutoModelForCausalLM
231
+
232
+ MIN_TRANSFORMERS_VERSION = '4.25.1'
233
+
234
+ # check transformers version
235
+ assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'
236
+
237
+ # init
238
+ tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
239
+ model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)
240
+
241
+ # infer
242
+ prompt = "Q: The capital of France is?\nA:"
243
+ inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
244
+ input_length = inputs.input_ids.shape[1]
245
+ outputs = model.generate(
246
+ **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
247
+ )
248
+ token = outputs.sequences[0, input_length:]
249
+ output_str = tokenizer.decode(token)
250
+ print(output_str)
251
+ """
252
+ Paris
253
+ """
254
+ ```
255
+
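+ To check that the quantized model actually fits the stated memory budget, its reported footprint can be printed; a small optional check (assumes a `transformers` version that provides `get_memory_footprint`):
+
+ ```python
+ # Optional: report the loaded model's approximate memory footprint in GB.
+ print(f"{model.get_memory_footprint() / 1e9:.1f} GB")
+ ```
+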
256
+ ## CPU Inference
257
+
258
+ ```python
259
+ import torch
260
+ import transformers
261
+ from transformers import AutoTokenizer, AutoModelForCausalLM
262
+
263
+ MIN_TRANSFORMERS_VERSION = '4.25.1'
264
+
265
+ # check transformers version
266
+ assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'
267
+
268
+ # init
269
+ tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
270
+ model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct", torch_dtype=torch.bfloat16)
271
+ # infer
272
+ prompt = "Q: The capital of France is?\nA:"
273
+ inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
274
+ input_length = inputs.input_ids.shape[1]
275
+ outputs = model.generate(
276
+ **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
277
+ )
278
+ token = outputs.sequences[0, input_length:]
279
+ output_str = tokenizer.decode(token)
280
+ print(output_str)
281
+ """
282
+ Paris
283
+ """
284
+ ```
285
+
286
+ Please note that since `LayerNormKernelImpl` is not implemented in fp16 for CPU, we use `bfloat16` for CPU inference.
287
+
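+ For quick experiments, the same generation settings can also be driven through the `transformers` text-generation pipeline. A compact sketch follows; the `bfloat16` dtype mirrors the CPU note above and can be swapped for `float16` on GPU:
+
+ ```python
+ import torch
+ from transformers import pipeline
+
+ # Pipeline-based sketch using the same sampling parameters as the examples above.
+ pipe = pipeline(
+     "text-generation",
+     model="togethercomputer/RedPajama-INCITE-7B-Instruct",
+     torch_dtype=torch.bfloat16,
+ )
+ out = pipe(
+     "Q: The capital of France is?\nA:",
+     max_new_tokens=16, do_sample=True, temperature=0.7, top_p=0.7, top_k=50,
+ )
+ print(out[0]["generated_text"])
+ ```
+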
288
+
289
+ # Uses
290
+
291
+ ## Direct Use
292
+
293
+ Excluded uses are described below.
294
+
295
+ ### Misuse, Malicious Use, and Out-of-Scope Use
296
+
297
+ It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.
298
+
299
+ #### Out-of-Scope Use
300
+
301
+ RedPajama-INCITE-7B-Instruct is a language model and may not perform well for other use cases outside of its intended scope.
302
+ For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society.
303
+ It is important to consider the limitations of the model and to only use it for its intended purpose.
304
+
305
+ #### Misuse and Malicious Use
306
+
307
+ RedPajama-INCITE-7B-Instruct is designed for language modeling.
308
+ Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.
309
+
310
+ Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:
311
+
312
+ - Generating fake news, misinformation, or propaganda
313
+ - Promoting hate speech, discrimination, or violence against individuals or groups
314
+ - Impersonating individuals or organizations without their consent
315
+ - Engaging in cyberbullying or harassment
316
+ - Defamatory content
317
+ - Spamming or scamming
318
+ - Sharing confidential or sensitive information without proper authorization
319
+ - Violating the terms of use of the model or the data used to train it
320
+ - Creating automated bots for malicious purposes such as spreading malware, phishing scams, or spamming
321
+
322
+ ## Limitations
323
+
324
+ RedPajama-INCITE-7B-Instruct, like other language models, has limitations that should be taken into consideration.
325
+ For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data.
326
+ We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive language model.
327
+
328
+ ## Training
329
+
330
+ **Training Data**
331
+
332
+ Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T)
333
+
334
+ **Training Procedure**
335
+
336
+ - **Hardware:** 8× A100 GPUs
337
+ - **Optimizer:** Adam
338
+ - **Gradient accumulation steps:** 1
339
+ - **Number of tokens:** 1B
340
+ - **Learning rate:** 1e-5
341
+
342
+ ## Community
343
+
344
+ Join us on [Together Discord](https://discord.gg/6ZVDU8tTD4)
config.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "_name_or_path": "togethercomputer/RedPajama-INCITE-7B-Instruct",
3
+ "architectures": [
4
+ "GPTNeoXForCausalLM"
5
+ ],
6
+ "bos_token_id": 0,
7
+ "eos_token_id": 0,
8
+ "hidden_act": "gelu",
9
+ "hidden_size": 4096,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 16384,
12
+ "layer_norm_eps": 1e-05,
13
+ "max_position_embeddings": 2048,
14
+ "model_type": "gpt_neox",
15
+ "num_attention_heads": 32,
16
+ "num_hidden_layers": 32,
17
+ "rotary_emb_base": 10000,
18
+ "rotary_pct": 1.0,
19
+ "tie_word_embeddings": false,
20
+ "torch_dtype": "float16",
21
+ "transformers_version": "4.28.1",
22
+ "use_cache": true,
23
+ "use_parallel_residual": false,
24
+ "vocab_size": 50432
25
+ }
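This config describes a standard GPT-NeoX architecture: 32 layers, hidden size 4096, and untied embeddings over a 50,432-token vocabulary. A back-of-the-envelope parameter count from these values, assuming the usual GPT-NeoX layer layout and ignoring biases and layer norms, lands close to the 6.9B figure quoted in the model card:

```python
# Rough parameter count from config.json (standard GPT-NeoX layout assumed;
# biases and layer-norm parameters are ignored, so the figure is approximate).
hidden, layers, intermediate, vocab = 4096, 32, 16384, 50432

embeddings = vocab * hidden              # input embedding matrix
lm_head = vocab * hidden                 # separate output head (tie_word_embeddings: false)
attention = 4 * hidden * hidden          # fused QKV (3x hidden) + output projection, per layer
mlp = 2 * hidden * intermediate          # up- and down-projections, per layer
total = embeddings + lm_head + layers * (attention + mlp)
print(f"~{total / 1e9:.2f}B parameters")  # ~6.86B
```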
ct_output_models/config.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "bos_token": "<|endoftext|>",
3
+ "eos_token": "<|endoftext|>",
4
+ "layer_norm_epsilon": null,
5
+ "unk_token": "<|endoftext|>"
6
+ }
ct_output_models/model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b8570b40313c8703bb9051c5f4ef13f58f658bdac4a4feb8d19e3df3d9c23ba7
3
+ size 6867593490
ct_output_models/vocabulary.json ADDED
The diff for this file is too large to render. See raw diff
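The `ct_output_models/` directory (config.json, model.bin, vocabulary.json) has the layout of a CTranslate2 export of the model. Assuming that is its format, a minimal loading sketch with the `ctranslate2` package would look roughly as follows; the local directory path and device are placeholders:

```python
import ctranslate2
from transformers import AutoTokenizer

# Assumption: ct_output_models/ is a CTranslate2 conversion of this model.
generator = ctranslate2.Generator("ct_output_models", device="cpu")
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")

prompt = "Q: The capital of France is?\nA:"
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch(
    [tokens], max_length=32, sampling_temperature=0.7, sampling_topk=50
)
print(tokenizer.decode(results[0].sequences_ids[0]))
```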
 
generation_config.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "bos_token_id": 0,
4
+ "eos_token_id": 0,
5
+ "transformers_version": "4.28.1"
6
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
1
+ {
2
+ "bos_token": "<|endoftext|>",
3
+ "eos_token": "<|endoftext|>",
4
+ "unk_token": "<|endoftext|>"
5
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "add_prefix_space": false,
3
+ "bos_token": "<|endoftext|>",
4
+ "clean_up_tokenization_spaces": true,
5
+ "eos_token": "<|endoftext|>",
6
+ "model_max_length": 2048,
7
+ "tokenizer_class": "GPTNeoXTokenizer",
8
+ "unk_token": "<|endoftext|>"
9
+ }
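The tokenizer files above can be sanity-checked after loading; a short sketch, assuming the files load through `AutoTokenizer` (the upstream repository id is used here, and this upload ships the same tokenizer files):

```python
from transformers import AutoTokenizer

# Sanity-check sketch for the tokenizer configuration shipped in this repository.
tok = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Instruct")
print(tok.bos_token, tok.eos_token, tok.unk_token)  # all "<|endoftext|>" per special_tokens_map.json
print(tok.model_max_length)                         # 2048, matching tokenizer_config.json
```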