Update README.md
README.md
CHANGED
@@ -255,7 +255,7 @@ To train the model, we have developed our own training and annotation library: [
For the hackathon, we decided to train a model with 7 billion parameters, since with 4-bit quantization it is possible to run the model on consumer hardware. After analyzing the performance of a large number of LLMs, we chose [openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) due to its high performance without the need for pretraining. To minimally disturb the prior knowledge of the model that enables this performance, we opted to use the *Low-Rank Adaptation* (LoRA) training technique.
-The exact training configuration is available at []()
+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml

#### Training Hyperparameters
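The paragraph above leans on two techniques: 4-bit quantization, so that the 7B model fits on consumer hardware, and LoRA, so that fine-tuning barely disturbs the base model's prior knowledge. As a rough, non-authoritative illustration (not code taken from this repository), loading the resulting model in 4-bit could look like the sketch below. The quantization settings and the prompt are assumptions; `somosnlp/NoticIA-7B` is simply the repository id that appears in the URLs of this diff.

```python
# Minimal sketch (not from the model card): load the model in 4-bit so it fits
# on a consumer GPU. Quantization settings and the prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "somosnlp/NoticIA-7B"  # repository id taken from the URLs in this diff

# 4-bit NF4 quantization via bitsandbytes; the compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Placeholder request; the real prompt format is described in the model card.
messages = [{"role": "user", "content": "<headline and article text here>"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

At 4-bit precision the 7B weights occupy roughly 4 GB of VRAM, which is what makes the consumer-hardware claim above plausible.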
@@ -269,12 +269,11 @@ The exact training configuration is available at []()
- **Optimizer:** AdamW
- **Software**: Huggingface, Peft, Pytorch, Deepspeed

+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml

## Evaluation
-
-
### Testing Data, Factors & Metrics
#### Testing Data
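To make the hyperparameter list in the hunk above more concrete, here is a minimal sketch of how these pieces usually fit together: PEFT wraps the base model with low-rank adapters, and the Hugging Face Trainer runs AdamW (optionally driven by a DeepSpeed config). Every numeric value and path below (rank, learning rate, batch size, output directory) is an illustrative assumption, not the project's configuration; the authoritative settings are in the linked openchat-3.5-0106_LoRA.yaml.

```python
# Minimal LoRA training sketch, not the project's training script. All values
# here are illustrative assumptions; see openchat-3.5-0106_LoRA.yaml for the
# configuration actually used.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_id = "openchat/openchat-3.5-0106"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Low-rank adapters on the attention projections; rank/alpha/dropout are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Trainer defaults to AdamW; a DeepSpeed ZeRO config could be passed via `deepspeed=`.
args = TrainingArguments(
    output_dir="noticia-7b-lora",      # illustrative path
    per_device_train_batch_size=4,     # assumption
    learning_rate=2e-4,                # assumption
    num_train_epochs=3,                # assumption
    bf16=True,
    optim="adamw_torch",
    # deepspeed="ds_zero3.json",       # optional, hypothetical config file
)

trainer = Trainer(model=model, args=args, train_dataset=None)  # supply a tokenized dataset
# trainer.train()
```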
@@ -302,7 +301,6 @@ After training, our model acquires the ability to perform summaries with a capac
## Environmental Impact
-
For the carbon footprint estimation, we assumed a power consumption of 400 W per GPU and a carbon intensity of 0.083 kg/kWh: https://app.electricitymaps.com/map
- **Hardware Type:** 4 x Nvidia A100 80GB
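The estimate above reduces to simple arithmetic: energy in kWh is GPUs × watts × hours / 1000, and the footprint is that energy multiplied by the 0.083 kg CO₂/kWh intensity. A small sketch of that calculation, with the training duration left as a placeholder because it is not stated in this section:

```python
# Carbon-footprint arithmetic as described above. The GPU count, per-GPU power
# draw, and carbon intensity come from the model card; the training duration
# is a PLACEHOLDER, not a reported figure.
NUM_GPUS = 4                 # 4 x Nvidia A100 80GB
POWER_W = 400                # assumed draw per GPU, in watts
CARBON_KG_PER_KWH = 0.083    # grid carbon intensity (electricitymaps.com)

def co2_kg(train_hours: float) -> float:
    """Estimated kilograms of CO2 emitted for a given number of training hours."""
    energy_kwh = NUM_GPUS * POWER_W * train_hours / 1000.0
    return energy_kwh * CARBON_KG_PER_KWH

# Hypothetical 10-hour run: 4 * 400 * 10 / 1000 = 16 kWh -> 16 * 0.083 ≈ 1.33 kg CO2
print(f"{co2_kg(10):.2f} kg CO2")
```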