nicholasKluge committed
Commit: 6c5d3de
Parent(s): 9bb929e
Upload 12 files
Files changed:
- AIRA_FineTuning.ipynb +0 -0
- Aira_emissions.csv +1 -1
- README.md +8 -9
- config.json +1 -1
- generation_config.json +1 -1
- pytorch_model.bin +1 -1
- training_stats.parquet +1 -1
- vocab.json +0 -0
AIRA_FineTuning.ipynb
CHANGED
The diff for this file is too large to render.
See raw diff
Aira_emissions.csv
CHANGED
@@ -1,2 +1,2 @@
 timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
-2023-06-
+2023-06-26T22:38:01,Aira_emissions,bd08affb-b1e2-4849-8513-a85a02cf0f84,3690.1905386447906,0.0009893192359507477,2.6809435057358087e-07,42.5,296.394,31.30528450012207,0.04356464091208248,0.34052867170535045,0.03207338637952947,0.41616669899696207,Canada,CAN,quebec,,,Linux-5.15.107+-x86_64-with-glibc2.31,3.10.12,2.2.4,12,Intel(R) Xeon(R) CPU @ 2.20GHz,1,1 x NVIDIA A100-SXM4-40GB,-71.2,46.8,83.48075866699219,machine,N,1.0
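The row above is standard CodeCarbon output (codecarbon 2.2.4, per the `codecarbon_version` column). A minimal sketch of how such a file is produced; the tracked `train()` function is a placeholder for the actual fine-tuning loop:

```python
# A minimal CodeCarbon sketch; train() is a hypothetical stand-in for the
# fine-tuning loop that produced the row above.
from codecarbon import EmissionsTracker

def train():
    pass  # placeholder for the actual fine-tuning loop

tracker = EmissionsTracker(project_name="Aira_emissions",
                           output_file="Aira_emissions.csv")
tracker.start()
try:
    train()
finally:
    emissions = tracker.stop()  # kg CO2eq; one row is appended to the CSV

print(f"Estimated emissions: {emissions:.4f} kg CO2eq")
```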
README.md
CHANGED
@@ -44,7 +44,6 @@ inference:
 
 The dataset used to train this model combines the following sources of data: the [`synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset, the [`databricks_dolly_15k`](https://huggingface.co/datasets/HuggingFaceH4/databricks_dolly_15k) dataset, the [`instruction-dataset`](https://huggingface.co/datasets/HuggingFaceH4/instruction-dataset) dataset, and a subset of [Aira's](https://github.com/Nkluge-correa/Aira-EXPERT) fine-tuning dataset, focused on Q&A related to Ethics, AI, AI safety, and other related topics. The dataset is available in both Portuguese and English.
 
-
 Check our gradio-demo in [Spaces](https://huggingface.co/spaces/nicholasKluge/Aira-Demo).
 
 ## Details
@@ -56,22 +55,22 @@ Check our gradio-demo in [Spaces](https://huggingface.co/spaces/nicholasKluge/Ai
 - **Batch size:** 32
 - **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e2, learning_rate = 5e-4, epsilon = 1e-8)
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
-- **Emissions:** 0.
-- **Total Energy Consumption:** 0.
+- **Emissions:** 0.0009 KgCO2 (Canada)
+- **Total Energy Consumption:** 0.41 kWh
 
 | Epoch/Loss|Training|Validation|
 |---|---|---|
-| 1 |0.
-| 2 |0.
-| 3 |0.
-| 4 |0.
-| 5 |0.
+| 1 |0.947100|0.774946|
+| 2 |0.737357|0.730962|
+| 3 |0.657410|0.710232|
+| 4 |0.597437|0.705064|
+| 5 |0.551684|0.704830|
 
 This repository has the notebook used to train this model.
 
 ## Usage
 
-Two special tokens are used to mark the user side of the interaction and the model's response:
+Two special tokens are used to mark the user side of the interaction and the model's response:
 
 `<|startoftext|>`What is a language model?`<|endoftext|>`A language model is a probability distribution over a vocabulary.`<|endoftext|>`
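The Details list in the diff above pins the optimizer hyperparameters. A minimal sketch of that setup, assuming a linear warmup schedule and a GPT-2 stand-in for the base model (neither is named in this diff):

```python
# A sketch of the optimizer setup listed under Details (AdamW, lr = 5e-4,
# eps = 1e-8, 100 warmup steps). The base model ("gpt2"), the linear-warmup
# scheduler, and total_steps are assumptions, not taken from this diff.
import torch
from transformers import AutoModelForCausalLM, get_linear_schedule_with_warmup

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base model
total_steps = 5 * 1000  # placeholder: epochs * batches_per_epoch

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, eps=1e-8)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,  # warmup_steps = 1e2
    num_training_steps=total_steps,
)
```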
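The Usage section added above wraps each user turn in the two special tokens. A minimal sketch of applying that format with `transformers`; the repository id `nicholasKluge/Aira` is a placeholder, since this diff does not name the model's hub path:

```python
# A sketch of the prompt format from the Usage section; the repo id is a
# placeholder for the model's actual hub path.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Aira")
model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Aira")

# The user turn is framed by <|startoftext|> ... <|endoftext|>; generation
# then stops when the model emits <|endoftext|> again (eos_token_id = 50256).
prompt = "<|startoftext|>What is a language model?<|endoftext|>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```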
config.json
CHANGED
@@ -33,7 +33,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.30.
+  "transformers_version": "4.30.2",
   "use_cache": true,
   "vocab_size": 50259
 }
generation_config.json
CHANGED
@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 50256,
   "eos_token_id": 50256,
-  "transformers_version": "4.30.
+  "transformers_version": "4.30.2"
 }
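This file supplies the model's default generation settings when loaded through `transformers`; a minimal sketch, again with a placeholder repo id:

```python
# Read the generation defaults shown above; the repo id is a placeholder.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("nicholasKluge/Aira")
print(gen_config.bos_token_id, gen_config.eos_token_id)  # 50256 50256
```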
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7f01da44af4eef5e609983099507e6c2e6c92bb149afa3723d555cdf3a32c4c5
 size 497813341
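This is a Git LFS pointer, not the weights themselves; the `oid` records the SHA-256 of the real 497 MB file. A minimal sketch of verifying a downloaded copy against it:

```python
# Check a downloaded pytorch_model.bin against the sha256 in the LFS pointer.
import hashlib

sha256 = hashlib.sha256()
with open("pytorch_model.bin", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha256.update(chunk)

assert sha256.hexdigest() == (
    "7f01da44af4eef5e609983099507e6c2e6c92bb149afa3723d555cdf3a32c4c5"
)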
training_stats.parquet
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:63cec774a93f84808183ddf0dacaca250ff645a5e6883cdfd4ea3f96a0cce3fa
 size 3108
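The updated parquet presumably holds the per-epoch statistics behind the README's loss table. A minimal sketch of inspecting it, assuming `pandas` with a parquet engine (e.g. `pyarrow`) is installed:

```python
# Inspect the logged training statistics; assumes pandas + pyarrow.
import pandas as pd

stats = pd.read_parquet("training_stats.parquet")
print(stats.head())
```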
vocab.json
CHANGED
The diff for this file is too large to render.
See raw diff