Update README.md
README.md
CHANGED
@@ -255,7 +255,7 @@ To train the model, we have developed our own training and annotation library: [
For the hackathon, we decided to train a model with 7 billion parameters, since with 4-bit quantization it is possible to run the model on consumer hardware. After analyzing the performance of a large number of LLMs, we chose [openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) due to its high performance without the need for pretraining. To minimally disturb the prior knowledge of the model that enables this performance, we opted to use the *Low-Rank Adaptation* (LoRA) training technique.
-The exact training configuration is available at []()
+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml

#### Training Hyperparameters
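The paragraph above leans on two techniques: 4-bit quantization, so that the 7B model fits on consumer hardware, and LoRA, so that fine-tuning barely disturbs the base model's prior knowledge. As a rough, non-authoritative illustration (not code taken from this repository), loading the resulting model in 4-bit could look like the sketch below. The quantization settings and the prompt are assumptions; `somosnlp/NoticIA-7B` is simply the repository id that appears in the URLs of this diff.

```python
# Minimal sketch (not from the model card): load the model in 4-bit so it fits
# on a consumer GPU. Quantization settings and the prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "somosnlp/NoticIA-7B"  # repository id taken from the URLs in this diff

# 4-bit NF4 quantization via bitsandbytes; the compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Placeholder request; the real prompt format is described in the model card.
messages = [{"role": "user", "content": "<headline and article text here>"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

At 4-bit precision the 7B weights occupy roughly 4 GB of VRAM, which is what makes the consumer-hardware claim above plausible.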
@@ -269,12 +269,11 @@ The exact training configuration is available at []()
- **Optimizer:** AdamW
- **Software**: Huggingface, Peft, Pytorch, Deepspeed

+The exact training configuration is available at: https://huggingface.co/somosnlp/NoticIA-7B/blob/main/openchat-3.5-0106_LoRA.yaml

## Evaluation
-
-
### Testing Data, Factors & Metrics
#### Testing Data
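To make the hyperparameter list in the hunk above more concrete, here is a minimal sketch of how these pieces usually fit together: PEFT wraps the base model with low-rank adapters, and the Hugging Face Trainer runs AdamW (optionally driven by a DeepSpeed config). Every numeric value and path below (rank, learning rate, batch size, output directory) is an illustrative assumption, not the project's configuration; the authoritative settings are in the linked openchat-3.5-0106_LoRA.yaml.

```python
# Minimal LoRA training sketch, not the project's training script. All values
# here are illustrative assumptions; see openchat-3.5-0106_LoRA.yaml for the
# configuration actually used.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_id = "openchat/openchat-3.5-0106"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Low-rank adapters on the attention projections; rank/alpha/dropout are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Trainer defaults to AdamW; a DeepSpeed ZeRO config could be passed via `deepspeed=`.
args = TrainingArguments(
    output_dir="noticia-7b-lora",      # illustrative path
    per_device_train_batch_size=4,     # assumption
    learning_rate=2e-4,                # assumption
    num_train_epochs=3,                # assumption
    bf16=True,
    optim="adamw_torch",
    # deepspeed="ds_zero3.json",       # optional, hypothetical config file
)

trainer = Trainer(model=model, args=args, train_dataset=None)  # supply a tokenized dataset
# trainer.train()
```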
@@ -302,7 +301,6 @@ After training, our model acquires the ability to perform summaries with a capac
## Environmental Impact
-
For the carbon footprint estimation, we assumed a power consumption of 400 W per GPU and a carbon intensity of 0.083 kg/kWh: https://app.electricitymaps.com/map
- **Hardware Type:** 4 x Nvidia A100 80GB
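The estimate above reduces to simple arithmetic: energy in kWh is GPUs × watts × hours / 1000, and the footprint is that energy multiplied by the 0.083 kg CO₂/kWh intensity. A small sketch of that calculation, with the training duration left as a placeholder because it is not stated in this section:

```python
# Carbon-footprint arithmetic as described above. The GPU count, per-GPU power
# draw, and carbon intensity come from the model card; the training duration
# is a PLACEHOLDER, not a reported figure.
NUM_GPUS = 4                 # 4 x Nvidia A100 80GB
POWER_W = 400                # assumed draw per GPU, in watts
CARBON_KG_PER_KWH = 0.083    # grid carbon intensity (electricitymaps.com)

def co2_kg(train_hours: float) -> float:
    """Estimated kilograms of CO2 emitted for a given number of training hours."""
    energy_kwh = NUM_GPUS * POWER_W * train_hours / 1000.0
    return energy_kwh * CARBON_KG_PER_KWH

# Hypothetical 10-hour run: 4 * 400 * 10 / 1000 = 16 kWh -> 16 * 0.083 ≈ 1.33 kg CO2
print(f"{co2_kg(10):.2f} kg CO2")
```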