grandiose-pizza committed
Commit: bb28a9d
Parent: 25e49d8

Update README.md

Files changed (1): README.md (+2 -16)
README.md CHANGED
@@ -73,7 +73,7 @@ Below is sample code to use the model. Note that the model requires a custom mod
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

- model_path = "inceptionai/jais-family-30b-16k-chat"
+ model_path = "inceptionai/Jais-family-256m-chat"

  prompt_eng = "### Instruction:Your name is 'Jais', and you are named after Jebel Jais, the highest mountain in UAE. You were made by 'Inception' in the UAE. You are a helpful, respectful, and honest assistant. Always answer as helpfully as possible, while being safe. Complete the conversation between [|Human|] and [|AI|]:\n### Input: [|Human|] {Question}\n[|AI|]\n### Response :"
  prompt_ar = "### Instruction:اسمك \"جيس\" وسميت على اسم جبل جيس اعلى جبل في الامارات. تم بنائك بواسطة Inception في الإمارات. أنت مساعد مفيد ومحترم وصادق. أجب دائمًا بأكبر قدر ممكن من المساعدة، مع الحفاظ على البقاء أمناً. أكمل المحادثة بين [|Human|] و[|AI|] :\n### Input:[|Human|] {Question}\n[|AI|]\n### Response :"
@@ -165,20 +165,6 @@ During the adapted pre-training of the (`jais-adapted-*`) models, we first initi
  During instruction tuning, each training example consists of a single-turn or multi-turn prompt and its response. Instead of one example per sequence, examples are packed together while the loss is masked on the prompt tokens. This approach speeds up training by allowing more examples to be processed per batch.


- ### Training Hyperparameters:
-
- #### Jais-family-30b-16k-chat
-
- | Hyperparameter | Value |
- |----------------|-------------------------------------------|
- | Precision | fp32 |
- | Optimizer | AdamW |
- | Learning rate | 0 to 0.0016 (<= 192 warmup steps)<br>0.0016 to 0.00016 (> 192 and <= 11342 steps) |
- | Weight decay | 0.1 |
- | Batch size | 120 |
- | Context Length | 16384 |
- | Steps | 11342 |
-
  ### Compute Infrastructure

  The training process was performed on the Condor Galaxy (CG) supercomputer platform. A CG contains 64 Cerebras CS-2 Wafer-Scale Engines (WSE-2) with 40 GB of SRAM, and achieves a total of 960 PetaFLOP/s.
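The kept line at the top of this hunk describes the instruction-tuning setup: examples are packed into a single sequence while the loss is masked on prompt tokens. A rough sketch of that idea (not the authors' actual pipeline), using the common Hugging Face convention that label positions set to -100 are ignored by the cross-entropy loss:

```python
import torch

IGNORE_INDEX = -100  # torch.nn.functional.cross_entropy skips this label by default

def pack_examples(pairs, tokenizer, max_len=2048):
    """Pack (prompt, response) pairs into one training sequence.

    Loss is masked on the prompt: prompt positions get IGNORE_INDEX labels,
    so only response tokens (and the EOS separator) are supervised.
    """
    input_ids, labels = [], []
    for prompt, response in pairs:
        p_ids = tokenizer(prompt, add_special_tokens=False).input_ids
        r_ids = tokenizer(response, add_special_tokens=False).input_ids + [tokenizer.eos_token_id]
        if len(input_ids) + len(p_ids) + len(r_ids) > max_len:
            break  # pack is full; remaining pairs start the next sequence
        input_ids += p_ids + r_ids
        labels += [IGNORE_INDEX] * len(p_ids) + r_ids
    return torch.tensor(input_ids), torch.tensor(labels)
```

Packing this way puts several supervised examples in each batch row, which is why the README describes it as processing more examples per batch.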
@@ -377,6 +363,6 @@ Through this release, we aim to make LLMs more accessible to Arabic NLP research
  title={Jais Family Model Card},
  author={Inception},
  year={2024},
- url = {https://huggingface.co/inceptionai/jais-family-30b-16k-chat/blob/main/README.md}
+ url = {https://huggingface.co/inceptionai/Jais-family-256m-chat/blob/main/README.md}
  }
  ```