Iker committed on
Commit
adc2c1a
1 Parent(s): 3786308

Update README.md

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -255,7 +255,7 @@ To train the model, we have developed our own training and annotation library: [
 
 For the hackathon, we decided to train a model with 7 billion parameters, since, using 4-bit quantization, it is possible to run the model on consumer hardware. After analyzing the performance of a large number of LLMs, we chose [openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) due to its high performance without the need for pretraining. To minimally disturb the prior knowledge of the model that allows for this performance, we opted to use the *Low-Rank Adaptation* (LoRA) training technique.
 
-The exact training configuration is available at []()
+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml
 
 
 #### Training Hyperparameters
@@ -269,12 +269,11 @@ The exact training configuration is available at []()
 - **Optimizer:** AdamW
 - **Software**: Huggingface, Peft, Pytorch, Deepspeed
 
+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml
 
 ## Evaluation
 
 
-
-
 ### Testing Data, Factors & Metrics
 
 #### Testing Data
@@ -302,7 +301,6 @@ After training, our model acquires the ability to perform summaries with a capac
 
 ## Environmental Impact
 
-
 For the carbon footprint estimation, we assumed a 400W power consumption per GPU and a carbon intensity of 0.083 kg/kWh: https://app.electricitymaps.com/map
 
 - **Hardware Type:** 4 x Nvidia A100 80GB
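The paragraph in the first hunk describes the training recipe: a 7B base model loaded with 4-bit quantization and fine-tuned with LoRA adapters so that the base weights stay largely untouched. Below is a minimal sketch of that kind of setup with Transformers and PEFT; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values from the linked openchat-3.5-0106_LoRA.yaml.

```python
# Minimal QLoRA-style sketch: load openchat-3.5-0106 in 4-bit and attach a LoRA adapter.
# Hyperparameters (r, lora_alpha, lora_dropout, target_modules) are illustrative assumptions,
# NOT the values from the project's openchat-3.5-0106_LoRA.yaml config.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "openchat/openchat-3.5-0106"

# 4-bit quantization so the 7B model fits on consumer GPUs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA: train small low-rank adapters instead of the full weight matrices,
# leaving the base model's prior knowledge largely intact.
lora_config = LoraConfig(
    r=16,                      # assumed rank
    lora_alpha=32,             # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed projection layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 7B weights is trainable
```

With a configuration along these lines, only the adapter weights are optimized (with AdamW, as listed under Training Hyperparameters), which is what keeps memory usage low enough for consumer GPUs.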
 
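The carbon-footprint figures in the last hunk follow the usual energy × carbon-intensity arithmetic. The sketch below reproduces that calculation; the training duration is a placeholder, since the number of GPU hours is not stated in this diff.

```python
# Sketch of the carbon footprint arithmetic implied above.
# The 400 W per GPU and 0.083 kg CO2/kWh figures come from the model card;
# the training duration below is a PLACEHOLDER, not a reported value.
num_gpus = 4                # 4 x Nvidia A100 80GB
power_per_gpu_kw = 0.400    # 400 W per GPU, expressed in kW
carbon_intensity = 0.083    # kg CO2 per kWh (app.electricitymaps.com)
training_hours = 10.0       # hypothetical duration, for illustration only

energy_kwh = num_gpus * power_per_gpu_kw * training_hours
emissions_kg = energy_kwh * carbon_intensity
print(f"{energy_kwh:.1f} kWh -> {emissions_kg:.2f} kg CO2eq")
# 4 GPUs * 0.4 kW * 10 h = 16 kWh; 16 kWh * 0.083 kg/kWh ≈ 1.33 kg CO2eq
```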