mlconvexai committed
Commit 28acfe6
1 Parent(s): 0a27f0e

Upload 20 files

Poro-34B-Lora-1/README.md ADDED
@@ -0,0 +1,204 @@
+ ---
+ library_name: peft
+ base_model: LumiOpen/Poro-34B
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+
+
+ ### Framework versions
+
+ - PEFT 0.9.0
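The card's "How to Get Started" section is still a placeholder. A minimal sketch of loading one of these adapters on top of the base model with PEFT could look like the following; the local adapter path `Poro-34B-Lora-1` and the Finnish prompt are illustrative assumptions, not details from the card, and the block loads a 34B model, so it needs substantial memory:

```python
# Sketch only: attach a LoRA adapter from this repo to the Poro-34B base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "LumiOpen/Poro-34B"
tokenizer = AutoTokenizer.from_pretrained(base_name)
base_model = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.bfloat16)

# The adapter directory name is an assumption based on this repo's layout.
model = PeftModel.from_pretrained(base_model, "Poro-34B-Lora-1")

inputs = tokenizer("Kysymys: Mitä S-ryhmä tekee?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```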
Poro-34B-Lora-1/adapter_config.json ADDED
@@ -0,0 +1,27 @@
+ {
+ "alpha_pattern": {},
+ "auto_mapping": null,
+ "base_model_name_or_path": "LumiOpen/Poro-34B",
+ "bias": "none",
+ "fan_in_fan_out": false,
+ "inference_mode": true,
+ "init_lora_weights": true,
+ "layers_pattern": null,
+ "layers_to_transform": null,
+ "loftq_config": {},
+ "lora_alpha": 8,
+ "lora_dropout": 0.05,
+ "megatron_config": null,
+ "megatron_core": "megatron.core",
+ "modules_to_save": null,
+ "peft_type": "LORA",
+ "r": 8,
+ "rank_pattern": {},
+ "revision": null,
+ "target_modules": [
+ "query_key_value"
+ ],
+ "task_type": "CAUSAL_LM",
+ "use_dora": false,
+ "use_rslora": false
+ }
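With `r = 8` and `query_key_value` as the only target module, the adapter's size can be sanity-checked against the trainable-parameter count reported in the fine-tuning notebook (`trainable params: 12386304`). The hidden size 7168 and layer count 54 below are assumptions about Poro-34B's BLOOM-style architecture, not fields of this config:

```python
# Sanity-check the LoRA trainable-parameter count implied by adapter_config.json.
# Assumptions (not in the config): hidden_size = 7168, 54 transformer layers.
hidden_size = 7168
num_layers = 54
r = 8  # "r" from adapter_config.json

# query_key_value is a fused projection with weight shape (3 * hidden_size, hidden_size).
# LoRA adds A (r x in_features) and B (out_features x r) per targeted module.
params_per_module = r * hidden_size + (3 * hidden_size) * r
trainable = num_layers * params_per_module
print(trainable)  # 12386304, matching the notebook's printout
```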
Poro-34B-Lora-1/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f394f45e0d2e87fbed7ca75c190d8dc84ceb0a88fdadaed203fcff5b25e546f7
+ size 24788784
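The entry above is a Git LFS pointer, not the weights themselves. After downloading the actual safetensors file, its integrity can be checked against the `oid sha256:` line using only the standard library; the file path in the usage comment is illustrative:

```python
# Sketch: compute a file's SHA-256 digest for comparison with a Git LFS pointer.
import hashlib

def lfs_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of the file at `path`, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage (path is illustrative):
# assert lfs_sha256("Poro-34B-Lora-1/adapter_model.safetensors") == \
#     "f394f45e0d2e87fbed7ca75c190d8dc84ceb0a88fdadaed203fcff5b25e546f7"
```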
Poro-34B-Lora-185/README.md ADDED
(contents identical to Poro-34B-Lora-1/README.md above: the default PEFT model-card template, base_model LumiOpen/Poro-34B, PEFT 0.9.0)
Poro-34B-Lora-185/adapter_config.json ADDED
(contents identical to Poro-34B-Lora-1/adapter_config.json above)
Poro-34B-Lora-185/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f41dee2188266055a20fac03619585b016dc7d8accb5fd7f543c7f3159f4ad4e
+ size 24788784
Poro-34B-Lora-185QA.ipynb ADDED
@@ -0,0 +1,789 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "0c4ecb49-ce58-4b65-849a-760980576e48",
+ "metadata": {},
+ "source": [
+ "# Poro-34B LoRA fine-tuning with S-Group's data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5b686006-65a7-43af-8207-1c7309a5e423",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# This script fine-tunes the Poro-34B model with 185 question-and-answer pairs"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "defcdb6f-3b69-4b03-b2dc-07c4b3027fd6",
+ "metadata": {},
+ "source": [
+ "## Initialization"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "67f730e6-3467-4a19-ab76-e8baace8e02e",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Requirement already satisfied: peft in /opt/conda/lib/python3.10/site-packages (0.9.0)\n",
+ "Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.10/site-packages (from peft) (1.26.3)\n",
+ "Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from peft) (23.2)\n",
+ "Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from peft) (5.9.8)\n",
+ "Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from peft) (6.0.1)\n",
+ "Requirement already satisfied: torch>=1.13.0 in /opt/conda/lib/python3.10/site-packages (from peft) (2.0.0.post101)\n",
+ "Requirement already satisfied: transformers in /opt/conda/lib/python3.10/site-packages (from peft) (4.31.0)\n",
+ "Requirement already satisfied: tqdm in /opt/conda/lib/python3.10/site-packages (from peft) (4.66.1)\n",
+ "Requirement already satisfied: accelerate>=0.21.0 in /opt/conda/lib/python3.10/site-packages (from peft) (0.21.0)\n",
+ "Requirement already satisfied: safetensors in /opt/conda/lib/python3.10/site-packages (from peft) (0.3.3)\n",
+ "Requirement already satisfied: huggingface-hub>=0.17.0 in /opt/conda/lib/python3.10/site-packages (from peft) (0.20.2)\n",
+ "Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (3.13.1)\n",
+ "Requirement already satisfied: fsspec>=2023.5.0 in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (2023.6.0)\n",
+ "Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (2.31.0)\n",
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (4.5.0)\n",
+ "Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (1.12)\n",
+ "Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (3.2.1)\n",
+ "Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (3.1.3)\n",
+ "Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.10/site-packages (from transformers->peft) (2023.12.25)\n",
+ "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /opt/conda/lib/python3.10/site-packages (from transformers->peft) (0.13.3)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2->torch>=1.13.0->peft) (2.1.4)\n",
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (3.3.2)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (3.6)\n",
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (1.26.18)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (2023.11.17)\n",
+ "Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch>=1.13.0->peft) (1.3.0)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Install peft; all other required Python libraries are already in the AWS image\n",
+ "!pip install peft"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "80b24df2-140b-4792-aaf1-6f6aff92ece8",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "2024-02-29 16:17:22.775245: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
+ "To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
+ ]
+ }
+ ],
+ "source": [
+ "import torch\n",
+ "import json\n",
+ "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
+ "from transformers import TrainingArguments, Trainer\n",
+ "from transformers import pipeline\n",
+ "from peft import get_peft_model, PromptTuningConfig, TaskType, PromptTuningInit\n",
+ "from datasets import load_dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "d31adfc6-a460-419e-871b-d0437501b026",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# check whether a GPU is available\n",
+ "device = torch.device(\"cuda\") if torch.cuda.is_available() else torch.device(\"cpu\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "2c5a9b07-c92b-4d1d-b5b5-96e8c234e14f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "cpu\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(device)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ea88a10-f5f1-4342-939b-60d2b9c5bb91",
+ "metadata": {},
+ "source": [
+ "## Foundation model import"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "2c0f7b3a-9d56-46ce-9dc8-5fe40b2628a6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Foundation model\n",
+ "model_name='LumiOpen/Poro-34B'"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "4e4c9089-a195-4fd7-91b2-6240cafb4989",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "tokenizer = AutoTokenizer.from_pretrained(model_name)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "a42e0fb6-40d4-483b-a034-84ff351c021d",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "34a67d537808415ab77b583333186ae3",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Loading checkpoint shards: 0%| | 0/14 [00:00<?, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "branch = \"1000B\"\n",
+ "model = AutoModelForCausalLM.from_pretrained(model_name,\n",
+ " torch_dtype=torch.bfloat16,\n",
+ " revision=branch,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "258df7e6-27c1-48a2-b20a-d377dc885884",
+ "metadata": {},
+ "source": [
+ "## Setting up the LoRA parameters"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "63151e65-6bff-4b65-a8ae-af4d6c53036f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from peft import LoraConfig, get_peft_model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "f35f934f-23c6-47db-95bb-df20526e29e7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "config = LoraConfig(\n",
+ " r=8,\n",
+ " lora_alpha=8,\n",
+ " target_modules=[\"query_key_value\"],\n",
+ " lora_dropout=0.05,\n",
+ " bias=\"none\",\n",
+ " task_type=\"CAUSAL_LM\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "a62cb983-6e28-40f1-9f8d-64a1a6ccd0f3",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "peft_model = get_peft_model(model, config)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "fe7e7078-2998-4f17-abda-f03a32e04735",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "trainable params: 12386304\n",
+ "all params: 34229336064\n",
+ "trainable: 0.04%\n"
+ ]
+ }
+ ],
+ "source": [
+ "trainable_params = 0\n",
+ "all_param = 0\n",
+ "\n",
+ "# iterating over all parameters\n",
+ "for _, param in peft_model.named_parameters():\n",
+ " # adding parameters to total\n",
+ " all_param += param.numel()\n",
+ " # adding parameters to trainable if they require a gradient\n",
+ " if param.requires_grad:\n",
+ " trainable_params += param.numel()\n",
+ "\n",
+ "# print number of trainable parameters\n",
+ "print(f\"trainable params: {trainable_params}\")\n",
+ "print(f\"all params: {all_param}\")\n",
+ "print(f\"trainable: {100 * trainable_params / all_param:.2f}%\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d029921d-60db-43ad-ae8b-2ab9910bd490",
+ "metadata": {},
+ "source": [
+ "## Preparing the training data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "216aa90b-a87d-4a37-b178-e7e83bf987ce",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# prepare the data for training\n",
+ "def prepare_train_data(data):\n",
+ " text_input = data['text']\n",
+ " tokenized_input = tokenizer(text_input, return_tensors='pt', padding=True)\n",
+ " tokenized_input['labels'] = tokenized_input['input_ids']\n",
+ " return tokenized_input"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "5551fac2-9a26-4a86-b8b4-d2f572ecfaa9",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "711294e3ec68409abdc6a3f29c271d3a",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Generating train split: 0 examples [00:00, ? examples/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "dataset = load_dataset(\"json\", data_files=\"prompts_185.json\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "dfa8382a-3e66-4433-b6dd-97d59a7945f6",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "72acd5c60a62468c86206a93b17a87ab",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Map: 0%| | 0/185 [00:00<?, ? examples/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "train_dataset = dataset['train'].map(prepare_train_data, batched=True, remove_columns=[\"text\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "319de4fe-f5e0-4ea6-9a61-6f18c241bd02",
+ "metadata": {},
+ "source": [
+ "## Setting up the training parameters"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "id": "9bb42764-af93-463f-8cf6-68707f21151b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import DataCollatorForLanguageModeling"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "id": "4b8a94c9-2648-4626-9238-4475645aa695",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "trainer = Trainer(\n",
+ " model=peft_model,\n",
+ " train_dataset=train_dataset,\n",
+ " args=TrainingArguments(\n",
+ " per_device_train_batch_size=4,\n",
+ " gradient_accumulation_steps=4,\n",
+ " warmup_steps=20,\n",
+ " max_steps=20,\n",
+ " learning_rate=1e-3,\n",
+ " logging_steps=1,\n",
+ " output_dir='outputs',\n",
+ " ),\n",
+ " data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fbd62db8-65e3-4d05-b11e-179aaf8f0e65",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/opt/conda/lib/python3.10/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
+ " warnings.warn(\n",
+ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mtimo-au-laine\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "wandb version 0.16.3 is available! To upgrade, please run:\n",
+ " $ pip install wandb --upgrade"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "Tracking run with wandb version 0.16.2"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "Run data is saved locally in <code>/home/sagemaker-user/wandb/run-20240229_162718-2nyysvnx</code>"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "Syncing run <strong><a href='https://wandb.ai/timo-au-laine/huggingface/runs/2nyysvnx' target=\"_blank\">trim-dust-9</a></strong> to <a href='https://wandb.ai/timo-au-laine/huggingface' target=\"_blank\">Weights & Biases</a> (<a href='https://wandb.me/run' target=\"_blank\">docs</a>)<br/>"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ " View project at <a href='https://wandb.ai/timo-au-laine/huggingface' target=\"_blank\">https://wandb.ai/timo-au-laine/huggingface</a>"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ " View run at <a href='https://wandb.ai/timo-au-laine/huggingface/runs/2nyysvnx' target=\"_blank\">https://wandb.ai/timo-au-laine/huggingface/runs/2nyysvnx</a>"
+ ],
+ "text/plain": [
+ "<IPython.core.display.HTML object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "You're using a BloomTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ " <div>\n",
+ " \n",
+ " <progress value='19' max='20' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
+ " [19/20 5:10:38 < 18:16, 0.00 it/s, Epoch 1.53/2]\n",
+ " </div>\n",
+ " <table border=\"1\" class=\"dataframe\">\n",
+ " <thead>\n",
+ " <tr style=\"text-align: left;\">\n",
+ " <th>Step</th>\n",
+ " <th>Training Loss</th>\n",
+ " </tr>\n",
+ " </thead>\n",
+ " <tbody>\n",
+ " <tr>\n",
+ " <td>1</td>\n",
+ " <td>2.511700</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>2</td>\n",
+ " <td>2.543000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>3</td>\n",
+ " <td>2.566400</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>4</td>\n",
+ " <td>2.472700</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>5</td>\n",
+ " <td>2.500000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>6</td>\n",
+ " <td>2.500000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
+ " <td>7</td>\n",
+ " <td>2.625000</td>\n",
+ " </tr>\n",
+ " <tr>\n",
519
+ " <td>8</td>\n",
520
+ " <td>2.277300</td>\n",
521
+ " </tr>\n",
522
+ " <tr>\n",
523
+ " <td>9</td>\n",
524
+ " <td>2.359400</td>\n",
525
+ " </tr>\n",
526
+ " <tr>\n",
527
+ " <td>10</td>\n",
528
+ " <td>2.175800</td>\n",
529
+ " </tr>\n",
530
+ " <tr>\n",
531
+ " <td>11</td>\n",
532
+ " <td>2.293000</td>\n",
533
+ " </tr>\n",
534
+ " <tr>\n",
535
+ " <td>12</td>\n",
536
+ " <td>2.132800</td>\n",
537
+ " </tr>\n",
538
+ " <tr>\n",
539
+ " <td>13</td>\n",
540
+ " <td>1.974600</td>\n",
541
+ " </tr>\n",
542
+ " <tr>\n",
543
+ " <td>14</td>\n",
544
+ " <td>2.076200</td>\n",
545
+ " </tr>\n",
546
+ " <tr>\n",
547
+ " <td>15</td>\n",
548
+ " <td>1.869100</td>\n",
549
+ " </tr>\n",
550
+ " <tr>\n",
551
+ " <td>16</td>\n",
552
+ " <td>1.640600</td>\n",
553
+ " </tr>\n",
554
+ " <tr>\n",
555
+ " <td>17</td>\n",
556
+ " <td>1.769500</td>\n",
557
+ " </tr>\n",
558
+ " </tbody>\n",
559
+ "</table><p>"
560
+ ],
561
+ "text/plain": [
562
+ "<IPython.core.display.HTML object>"
563
+ ]
564
+ },
565
+ "metadata": {},
566
+ "output_type": "display_data"
567
+ }
568
+ ],
569
+ "source": [
570
+ "trainer.train()"
571
+ ]
572
+ },
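The progress bar above reports 19 of 20 steps after 5:10:38, i.e. roughly 981 seconds per optimizer step (consistent with the near-zero it/s reading). A quick sketch of that arithmetic, with the step count and elapsed time taken from the progress bar above:

```python
steps = 19
hours, minutes, seconds = 5, 10, 38

elapsed_s = hours * 3600 + minutes * 60 + seconds  # total elapsed seconds
sec_per_step = elapsed_s / steps                   # ~981 s per optimizer step
```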
573
+ {
574
+ "cell_type": "markdown",
575
+ "id": "1ed2cf09-3683-4016-88d9-9ada1ddb4345",
576
+ "metadata": {},
577
+ "source": [
578
+ "## Saving the finetuned model"
579
+ ]
580
+ },
581
+ {
582
+ "cell_type": "code",
583
+ "execution_count": null,
584
+ "id": "c37902bf-47e5-4f89-9128-a6b7d91cb437",
585
+ "metadata": {},
586
+ "outputs": [],
587
+ "source": [
588
+ "model_id185 = \"Poro-34B-Lora-185\""
589
+ ]
590
+ },
591
+ {
592
+ "cell_type": "code",
593
+ "execution_count": null,
594
+ "id": "163b54c4-3027-4e0d-9d52-7e3d698020da",
595
+ "metadata": {},
596
+ "outputs": [],
597
+ "source": [
598
+ "peft_model.save_pretrained(model_id185)"
599
+ ]
600
+ },
601
+ {
602
+ "cell_type": "code",
603
+ "execution_count": null,
604
+ "id": "ec432db5-4f0c-43c7-b4e4-ef087f057bd0",
605
+ "metadata": {},
606
+ "outputs": [],
607
+ "source": [
608
+ "!ls -lh {model_id185} # Lora parameters file size"
609
+ ]
610
+ },
611
+ {
612
+ "cell_type": "markdown",
613
+ "id": "11460d4e-3e11-4fdb-b134-61b45bb84018",
614
+ "metadata": {},
615
+ "source": [
616
+ "## Testing"
617
+ ]
618
+ },
619
+ {
620
+ "cell_type": "code",
621
+ "execution_count": 8,
622
+ "id": "eb6a1213-a7ab-4bb5-8ffc-0e2666286dc6",
623
+ "metadata": {},
624
+ "outputs": [],
625
+ "source": [
626
+ "def generate_output(model, inputs, max_new_tokens=100):\n",
627
+ " outputs = model.generate(\n",
628
+ " input_ids=inputs[\"input_ids\"],\n",
629
+ " max_new_tokens=max_new_tokens,\n",
630
+ " temperature=0.1,\n",
631
+ " )\n",
632
+ " return outputs"
633
+ ]
634
+ },
635
+ {
636
+ "cell_type": "markdown",
637
+ "id": "0a844312-2a1e-4c76-9078-96506b252522",
638
+ "metadata": {},
639
+ "source": [
640
+ "### Original model"
641
+ ]
642
+ },
643
+ {
644
+ "cell_type": "code",
645
+ "execution_count": 9,
646
+ "id": "d38bbed0-e938-43ef-b816-b5e0f9d066fd",
647
+ "metadata": {},
648
+ "outputs": [
649
+ {
650
+ "name": "stdout",
651
+ "output_type": "stream",
652
+ "text": [
653
+ "['Given the question delimited by triple backticks ```{ Mikä on osuusmaksu ja miksi se pitää maksaa? }```, what is the answer? Answer: Osuusmaksu on osuuskaupan osuusmaksu, joka maksetaan liittymisen yhteydessä. Osuusmaksu on sijoitus osuuskauppaan ja se palautetaan, kun jäsen eroaa osuuskaupasta. Osuusmaksun suuruus vaihtelee osuuskaupoittain. Osuusmaksun suuruus on 100 euroa. Osuusmaksu on sijoitus osuuskauppaan ja se palautetaan, kun jäsen eroaa osuuskaupasta. Osuusmaksun suuruus vaihtelee osuuskaupoittain. Osuusmaksun suuruus on 100 euroa. Osuusmaksu on sijoitus osuuskauppaan ja se palautetaan, kun jäsen eroaa osuuskaupasta. Osuusmaksun suuruus vaihtelee osuuskaupoittain.']\n"
654
+ ]
655
+ }
656
+ ],
657
+ "source": [
658
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Mikä on osuusmaksu ja miksi se pitää maksaa? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
659
+ "result = generate_output(model,prompt)\n",
660
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
661
+ ]
662
+ },
663
+ {
664
+ "cell_type": "code",
665
+ "execution_count": 10,
666
+ "id": "d6091480-e399-4890-bd32-7a51d1cbb50f",
667
+ "metadata": {},
668
+ "outputs": [
669
+ {
670
+ "name": "stdout",
671
+ "output_type": "stream",
672
+ "text": [
673
+ "['Given the question delimited by triple backticks ```{ Mistä näen S-Tilini tapahtumat ja saldon? }```, what is the answer? Answer: S-Pankin verkkopankissa. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the first sentence in the paragraph. The answer is the']\n"
674
+ ]
675
+ }
676
+ ],
677
+ "source": [
678
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Mistä näen S-Tilini tapahtumat ja saldon? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
679
+ "result = generate_output(model,prompt)\n",
680
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
681
+ ]
682
+ },
683
+ {
684
+ "cell_type": "markdown",
685
+ "id": "ae3c3d6a-2b07-4e46-9ddc-dccadfd07196",
686
+ "metadata": {},
687
+ "source": [
688
+ "### Finetuned model"
689
+ ]
690
+ },
691
+ {
692
+ "cell_type": "code",
693
+ "execution_count": 27,
694
+ "id": "4cf53f39-ad3f-43e2-8daa-79853b054cd2",
695
+ "metadata": {},
696
+ "outputs": [],
697
+ "source": [
698
+ "from peft import PeftModel"
699
+ ]
700
+ },
701
+ {
702
+ "cell_type": "code",
703
+ "execution_count": 28,
704
+ "id": "142bf57d-cffc-47b2-ae91-8a5420c46d32",
705
+ "metadata": {},
706
+ "outputs": [],
707
+ "source": [
708
+ "loaded_model = PeftModel.from_pretrained(model,model_id185,is_trainable=False)"
709
+ ]
710
+ },
711
+ {
712
+ "cell_type": "code",
713
+ "execution_count": 29,
714
+ "id": "c3cacd26-edff-494c-9428-55b7659988de",
715
+ "metadata": {},
716
+ "outputs": [
717
+ {
718
+ "name": "stdout",
719
+ "output_type": "stream",
720
+ "text": [
721
+ "['Given the question delimited by triple backticks ```{ Mikä on osuusmaksu ja miksi se pitää maksaa? }```, what is the answer? Answer: {Osuusmaksun suuruus on 100 euroa. Se maksetaan vain kerran, ja se palautetaan osuuskaupan asiakasomistajuudesta luovuttaessa. Osuusmaksun voi maksaa kerralla kokonaan tai kerryttää sitä Bonuksilla. Osuusmaksun voi maksaa myös S-Etukortilla, jolloin se veloitetaan S-Etukorttiin liitetyltä maksuvälineeltä.}\\n\\n{Osuusmaksun voi maksaa myös S-ryhmän lahjakortilla. Lahjakortin saldo ei kuitenkaan kerrytä Bonusta.}\\n\\n{Osuusmaksun voi maksaa myös S-ryhmän lahjak']\n"
722
+ ]
723
+ }
724
+ ],
725
+ "source": [
726
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Mikä on osuusmaksu ja miksi se pitää maksaa? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
727
+ "result = generate_output(loaded_model,prompt)\n",
728
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
729
+ ]
730
+ },
731
+ {
732
+ "cell_type": "code",
733
+ "execution_count": 30,
734
+ "id": "57029580-6429-4c43-8482-1a052839bc05",
735
+ "metadata": {},
736
+ "outputs": [
737
+ {
738
+ "name": "stdout",
739
+ "output_type": "stream",
740
+ "text": [
741
+ "['Given the question delimited by triple backticks ```{ Mistä näen S-Tilini tapahtumat ja saldon? }```, what is the answer? Answer: {S-Tilin tapahtumat ja saldo ovat nähtävissä S-mobiilissa ja S-Pankin verkkopankissa.}\\n\\n{S-mobiilissa S-Tilin tapahtumat ja saldo ovat nähtävissä S-Etukortti-osiossa.}\\n\\n{S-Pankin verkkopankissa S-Tilin tapahtumat ja saldo ovat nähtävissä Tilit-osiossa.}\\n\\n{S-Tilin tapahtumat ja saldo ovat nähtävissä myös S-mobiilin ja S-Pankin verkkopankin S-Etukortti- ja Tilit-osiossa, kun olet']\n"
742
+ ]
743
+ }
744
+ ],
745
+ "source": [
746
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Mistä näen S-Tilini tapahtumat ja saldon? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
747
+ "result = generate_output(loaded_model,prompt)\n",
748
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
749
+ ]
750
+ },
751
+ {
752
+ "cell_type": "code",
753
+ "execution_count": null,
754
+ "id": "166c476c-01a2-49cc-b03f-6cb1d9ae6136",
755
+ "metadata": {},
756
+ "outputs": [],
757
+ "source": []
758
+ },
759
+ {
760
+ "cell_type": "code",
761
+ "execution_count": null,
762
+ "id": "897448ea-f680-4a5b-a148-f65df6704bac",
763
+ "metadata": {},
764
+ "outputs": [],
765
+ "source": []
766
+ }
767
+ ],
768
+ "metadata": {
769
+ "kernelspec": {
770
+ "display_name": "Python 3 (ipykernel)",
771
+ "language": "python",
772
+ "name": "python3"
773
+ },
774
+ "language_info": {
775
+ "codemirror_mode": {
776
+ "name": "ipython",
777
+ "version": 3
778
+ },
779
+ "file_extension": ".py",
780
+ "mimetype": "text/x-python",
781
+ "name": "python",
782
+ "nbconvert_exporter": "python",
783
+ "pygments_lexer": "ipython3",
784
+ "version": "3.10.13"
785
+ }
786
+ },
787
+ "nbformat": 4,
788
+ "nbformat_minor": 5
789
+ }
Poro-34B-Lora-1QA.ipynb ADDED
@@ -0,0 +1,620 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "0c4ecb49-ce58-4b65-849a-760980576e48",
6
+ "metadata": {},
7
+ "source": [
8
+ "# Poro34B Lora fine-tuning with S-Group's data - 1 Q/A"
9
+ ]
10
+ },
11
+ {
12
+ "cell_type": "code",
13
+ "execution_count": null,
14
+ "id": "5b686006-65a7-43af-8207-1c7309a5e423",
15
+ "metadata": {},
16
+ "outputs": [],
17
+ "source": [
18
+ "# This notebook fine-tunes the Poro-34B model on a single question-and-answer pair"
+ "# This notebook fine-tunes the Poro-34B model on a single question-and-answer pair"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "markdown",
23
+ "id": "defcdb6f-3b69-4b03-b2dc-07c4b3027fd6",
24
+ "metadata": {},
25
+ "source": [
26
+ "## Initialization"
27
+ ]
28
+ },
29
+ {
30
+ "cell_type": "code",
31
+ "execution_count": null,
32
+ "id": "67f730e6-3467-4a19-ab76-e8baace8e02e",
33
+ "metadata": {},
34
+ "outputs": [],
35
+ "source": [
36
+ "# Install peft; all other required Python libraries ship with the AWS image\n",
37
+ "!pip install peft"
38
+ ]
39
+ },
40
+ {
41
+ "cell_type": "code",
42
+ "execution_count": 2,
43
+ "id": "80b24df2-140b-4792-aaf1-6f6aff92ece8",
44
+ "metadata": {},
45
+ "outputs": [
46
+ {
47
+ "name": "stderr",
48
+ "output_type": "stream",
49
+ "text": [
50
+ "2024-02-29 15:06:36.945989: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
51
+ "To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
52
+ ]
53
+ }
54
+ ],
55
+ "source": [
56
+ "import torch\n",
57
+ "import json\n",
58
+ "from transformers import AutoModelForCausalLM, AutoTokenizer \n",
59
+ "from transformers import TrainingArguments, Trainer\n",
60
+ "from transformers import pipeline\n",
61
+ "from peft import get_peft_model, PromptTuningConfig, TaskType, PromptTuningInit\n",
62
+ "from datasets import load_dataset"
63
+ ]
64
+ },
65
+ {
66
+ "cell_type": "code",
67
+ "execution_count": 3,
68
+ "id": "d31adfc6-a460-419e-871b-d0437501b026",
69
+ "metadata": {},
70
+ "outputs": [],
71
+ "source": [
72
+ "# check whether a GPU is available\n",
73
+ "device = torch.device(\"cuda\") if torch.cuda.is_available() else torch.device(\"cpu\")"
74
+ ]
75
+ },
76
+ {
77
+ "cell_type": "code",
78
+ "execution_count": 4,
79
+ "id": "2c5a9b07-c92b-4d1d-b5b5-96e8c234e14f",
80
+ "metadata": {},
81
+ "outputs": [
82
+ {
83
+ "name": "stdout",
84
+ "output_type": "stream",
85
+ "text": [
86
+ "cpu\n"
87
+ ]
88
+ }
89
+ ],
90
+ "source": [
91
+ "print(device)"
92
+ ]
93
+ },
94
+ {
95
+ "cell_type": "markdown",
96
+ "id": "6ea88a10-f5f1-4342-939b-60d2b9c5bb91",
97
+ "metadata": {},
98
+ "source": [
99
+ "## Foundation model import"
100
+ ]
101
+ },
102
+ {
103
+ "cell_type": "code",
104
+ "execution_count": 5,
105
+ "id": "2c0f7b3a-9d56-46ce-9dc8-5fe40b2628a6",
106
+ "metadata": {},
107
+ "outputs": [],
108
+ "source": [
109
+ "# Foundation model\n",
110
+ "model_name='LumiOpen/Poro-34B'"
111
+ ]
112
+ },
113
+ {
114
+ "cell_type": "code",
115
+ "execution_count": 6,
116
+ "id": "4e4c9089-a195-4fd7-91b2-6240cafb4989",
117
+ "metadata": {},
118
+ "outputs": [],
119
+ "source": [
120
+ "tokenizer = AutoTokenizer.from_pretrained(model_name)"
121
+ ]
122
+ },
123
+ {
124
+ "cell_type": "code",
125
+ "execution_count": 7,
126
+ "id": "a42e0fb6-40d4-483b-a034-84ff351c021d",
127
+ "metadata": {},
128
+ "outputs": [
129
+ {
130
+ "data": {
131
+ "application/vnd.jupyter.widget-view+json": {
132
+ "model_id": "3a476b270f8d413c8d54e413fe791a82",
133
+ "version_major": 2,
134
+ "version_minor": 0
135
+ },
136
+ "text/plain": [
137
+ "Loading checkpoint shards: 0%| | 0/14 [00:00<?, ?it/s]"
138
+ ]
139
+ },
140
+ "metadata": {},
141
+ "output_type": "display_data"
142
+ }
143
+ ],
144
+ "source": [
145
+ "branch = \"1000B\"\n",
146
+ "model = AutoModelForCausalLM.from_pretrained(model_name,\n",
147
+ " torch_dtype=torch.bfloat16,\n",
148
+ " revision=branch,\n",
149
+ ")"
150
+ ]
151
+ },
152
+ {
153
+ "cell_type": "markdown",
154
+ "id": "258df7e6-27c1-48a2-b20a-d377dc885884",
155
+ "metadata": {},
156
+ "source": [
157
+ "## Setting up the Lora parameters"
158
+ ]
159
+ },
160
+ {
161
+ "cell_type": "code",
162
+ "execution_count": 1,
163
+ "id": "63151e65-6bff-4b65-a8ae-af4d6c53036f",
164
+ "metadata": {},
165
+ "outputs": [
166
+ {
167
+ "name": "stderr",
168
+ "output_type": "stream",
169
+ "text": [
170
+ "2024-02-29 15:28:54.008508: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
171
+ "To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
172
+ ]
173
+ }
174
+ ],
175
+ "source": [
176
+ "from peft import LoraConfig, get_peft_model"
177
+ ]
178
+ },
179
+ {
180
+ "cell_type": "code",
181
+ "execution_count": 38,
182
+ "id": "f35f934f-23c6-47db-95bb-df20526e29e7",
183
+ "metadata": {},
184
+ "outputs": [],
185
+ "source": [
186
+ "config = LoraConfig(\n",
187
+ " r=8,\n",
188
+ " lora_alpha=8,\n",
189
+ " target_modules=[\"query_key_value\"],\n",
190
+ " lora_dropout=0.05,\n",
191
+ " bias=\"none\",\n",
192
+ " task_type=\"CAUSAL_LM\"\n",
193
+ ")"
194
+ ]
195
+ },
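As a sanity check on this configuration, the parameter count LoRA adds can be computed by hand: each adapted linear layer gains two low-rank matrices, A (r × in_features) and B (out_features × r). The sketch below assumes a BLOOM-style architecture for Poro-34B (hidden size 7168, a fused query_key_value projection of width 3 × 7168, and 54 transformer blocks); those dimensions are an assumption, but they reproduce the trainable-parameter count of 12,386,304 printed by the cells below.

```python
def lora_param_count(in_features, out_features, r):
    # LoRA adds A (r x in_features) and B (out_features x r)
    # per adapted linear layer.
    return r * in_features + out_features * r

hidden = 7168          # assumed Poro-34B hidden size
qkv_out = 3 * hidden   # fused query_key_value output width
layers = 54            # assumed number of transformer blocks

total_lora_params = layers * lora_param_count(hidden, qkv_out, r=8)
# total_lora_params == 12386304, matching the printed trainable count
```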
196
+ {
197
+ "cell_type": "code",
198
+ "execution_count": 39,
199
+ "id": "a62cb983-6e28-40f1-9f8d-64a1a6ccd0f3",
200
+ "metadata": {},
201
+ "outputs": [],
202
+ "source": [
203
+ "peft_model = get_peft_model(model, config)"
204
+ ]
205
+ },
206
+ {
207
+ "cell_type": "code",
208
+ "execution_count": 40,
209
+ "id": "fe7e7078-2998-4f17-abda-f03a32e04735",
210
+ "metadata": {},
211
+ "outputs": [
212
+ {
213
+ "name": "stdout",
214
+ "output_type": "stream",
215
+ "text": [
216
+ "trainable params: 12386304\n",
217
+ "all params: 34229336064\n",
218
+ "trainable: 0.04%\n"
219
+ ]
220
+ }
221
+ ],
222
+ "source": [
223
+ "trainable_params = 0\n",
224
+ "all_param = 0\n",
225
+ "\n",
226
+ "# iterating over all parameters\n",
227
+ "for _, param in peft_model.named_parameters():\n",
228
+ " # adding parameters to total\n",
229
+ " all_param += param.numel()\n",
230
+ "    # count a parameter as trainable if it requires a gradient\n",
231
+ " if param.requires_grad:\n",
232
+ " trainable_params += param.numel()\n",
233
+ "\n",
234
+ "# print number of trainable parameters\n",
235
+ "print(f\"trainable params: {trainable_params}\")\n",
236
+ "print(f\"all params: {all_param}\")\n",
237
+ "print(f\"trainable: {100 * trainable_params / all_param:.2f}%\")"
238
+ ]
239
+ },
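For reference, the "0.04%" printed above is the rounded trainable fraction; the exact value is closer to 0.036%:

```python
trainable, total = 12_386_304, 34_229_336_064
pct = 100 * trainable / total        # ~0.0362 percent
formatted = f"trainable: {pct:.2f}%" # two decimals round this up to 0.04%
```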
240
+ {
241
+ "cell_type": "markdown",
242
+ "id": "d029921d-60db-43ad-ae8b-2ab9910bd490",
243
+ "metadata": {},
244
+ "source": [
245
+ "## Preparing the training data"
246
+ ]
247
+ },
248
+ {
249
+ "cell_type": "code",
250
+ "execution_count": 10,
251
+ "id": "216aa90b-a87d-4a37-b178-e7e83bf987ce",
252
+ "metadata": {},
253
+ "outputs": [],
254
+ "source": [
255
+ "# prepare the data for training\n",
256
+ "def prepare_train_data(data):\n",
257
+ " text_input = data['text']\n",
258
+ " tokenized_input = tokenizer(text_input, return_tensors='pt', padding=True)\n",
259
+ " tokenized_input['labels'] = tokenized_input['input_ids']\n",
260
+ " return tokenized_input"
261
+ ]
262
+ },
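Note that `prepare_train_data` sets `labels` to a copy of `input_ids`, so the loss is computed over the whole prompt-plus-answer sequence. If you instead want the loss only on the answer tokens, the usual trick is to mask the prompt positions with -100, the index PyTorch's cross-entropy ignores. A minimal sketch, assuming you know the prompt length in tokens:

```python
def mask_prompt_labels(input_ids, prompt_len, ignore_index=-100):
    # Keep the answer tokens as labels; mask the prompt tokens so
    # they contribute nothing to the loss.
    return [ignore_index] * prompt_len + list(input_ids[prompt_len:])

# e.g. a 5-token sequence whose first 2 tokens are the prompt
labels = mask_prompt_labels([11, 22, 33, 44, 55], prompt_len=2)
# labels == [-100, -100, 33, 44, 55]
```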
263
+ {
264
+ "cell_type": "code",
265
+ "execution_count": 41,
266
+ "id": "5551fac2-9a26-4a86-b8b4-d2f572ecfaa9",
267
+ "metadata": {},
268
+ "outputs": [],
269
+ "source": [
270
+ "dataset = load_dataset(\"json\", data_files=\"prompts_1.json\")"
271
+ ]
272
+ },
273
+ {
274
+ "cell_type": "code",
275
+ "execution_count": 42,
276
+ "id": "dfa8382a-3e66-4433-b6dd-97d59a7945f6",
277
+ "metadata": {},
278
+ "outputs": [],
279
+ "source": [
280
+ "train_dataset = dataset['train'].map(prepare_train_data, batched=True, remove_columns=[\"text\"])"
281
+ ]
282
+ },
283
+ {
284
+ "cell_type": "markdown",
285
+ "id": "319de4fe-f5e0-4ea6-9a61-6f18c241bd02",
286
+ "metadata": {},
287
+ "source": [
288
+ "## Setting up the training parameters"
289
+ ]
290
+ },
291
+ {
292
+ "cell_type": "code",
293
+ "execution_count": 50,
294
+ "id": "9bb42764-af93-463f-8cf6-68707f21151b",
295
+ "metadata": {},
296
+ "outputs": [],
297
+ "source": [
298
+ "from transformers import DataCollatorForLanguageModeling"
299
+ ]
300
+ },
301
+ {
302
+ "cell_type": "code",
303
+ "execution_count": 53,
304
+ "id": "4b8a94c9-2648-4626-9238-4475645aa695",
305
+ "metadata": {},
306
+ "outputs": [],
307
+ "source": [
308
+ "trainer = Trainer(\n",
309
+ " model=peft_model,\n",
310
+ " train_dataset=train_dataset,\n",
311
+ " args=TrainingArguments(\n",
312
+ " per_device_train_batch_size=4,\n",
313
+ " gradient_accumulation_steps=4,\n",
314
+ " warmup_steps=20,\n",
315
+ " max_steps=20,\n",
316
+ " learning_rate=1e-3,\n",
317
+ " logging_steps=1,\n",
318
+ " output_dir='outputs',\n",
319
+ " ),\n",
320
+ " data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)\n",
321
+ ")"
322
+ ]
323
+ },
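With these arguments, each optimizer step accumulates gradients over per_device_train_batch_size × gradient_accumulation_steps examples on a single device, and max_steps bounds the total number of (possibly repeated) examples processed:

```python
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
max_steps = 20

effective_batch = per_device_train_batch_size * gradient_accumulation_steps
examples_processed = effective_batch * max_steps
# effective_batch == 16, examples_processed == 320
```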
324
+ {
325
+ "cell_type": "code",
326
+ "execution_count": 54,
327
+ "id": "fbd62db8-65e3-4d05-b11e-179aaf8f0e65",
328
+ "metadata": {},
329
+ "outputs": [
330
+ {
331
+ "data": {
332
+ "text/html": [
333
+ "\n",
334
+ " <div>\n",
335
+ " \n",
336
+ " <progress value='20' max='20' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
337
+ " [20/20 27:40, Epoch 20/20]\n",
338
+ " </div>\n",
339
+ " <table border=\"1\" class=\"dataframe\">\n",
340
+ " <thead>\n",
341
+ " <tr style=\"text-align: left;\">\n",
342
+ " <th>Step</th>\n",
343
+ " <th>Training Loss</th>\n",
344
+ " </tr>\n",
345
+ " </thead>\n",
346
+ " <tbody>\n",
347
+ " <tr>\n",
348
+ " <td>1</td>\n",
349
+ " <td>0.816400</td>\n",
350
+ " </tr>\n",
351
+ " <tr>\n",
352
+ " <td>2</td>\n",
353
+ " <td>0.808600</td>\n",
354
+ " </tr>\n",
355
+ " <tr>\n",
356
+ " <td>3</td>\n",
357
+ " <td>0.808600</td>\n",
358
+ " </tr>\n",
359
+ " <tr>\n",
360
+ " <td>4</td>\n",
361
+ " <td>0.804700</td>\n",
362
+ " </tr>\n",
363
+ " <tr>\n",
364
+ " <td>5</td>\n",
365
+ " <td>0.793000</td>\n",
366
+ " </tr>\n",
367
+ " <tr>\n",
368
+ " <td>6</td>\n",
369
+ " <td>0.757800</td>\n",
370
+ " </tr>\n",
371
+ " <tr>\n",
372
+ " <td>7</td>\n",
373
+ " <td>0.699200</td>\n",
374
+ " </tr>\n",
375
+ " <tr>\n",
376
+ " <td>8</td>\n",
377
+ " <td>0.640600</td>\n",
378
+ " </tr>\n",
379
+ " <tr>\n",
380
+ " <td>9</td>\n",
381
+ " <td>0.570300</td>\n",
382
+ " </tr>\n",
383
+ " <tr>\n",
384
+ " <td>10</td>\n",
385
+ " <td>0.492200</td>\n",
386
+ " </tr>\n",
387
+ " <tr>\n",
388
+ " <td>11</td>\n",
389
+ " <td>0.392600</td>\n",
390
+ " </tr>\n",
391
+ " <tr>\n",
392
+ " <td>12</td>\n",
393
+ " <td>0.291000</td>\n",
394
+ " </tr>\n",
395
+ " <tr>\n",
396
+ " <td>13</td>\n",
397
+ " <td>0.196300</td>\n",
398
+ " </tr>\n",
399
+ " <tr>\n",
400
+ " <td>14</td>\n",
401
+ " <td>0.140600</td>\n",
402
+ " </tr>\n",
403
+ " <tr>\n",
404
+ " <td>15</td>\n",
405
+ " <td>0.112800</td>\n",
406
+ " </tr>\n",
407
+ " <tr>\n",
408
+ " <td>16</td>\n",
409
+ " <td>0.090300</td>\n",
410
+ " </tr>\n",
411
+ " <tr>\n",
412
+ " <td>17</td>\n",
413
+ " <td>0.064500</td>\n",
414
+ " </tr>\n",
415
+ " <tr>\n",
416
+ " <td>18</td>\n",
417
+ " <td>0.048800</td>\n",
418
+ " </tr>\n",
419
+ " <tr>\n",
420
+ " <td>19</td>\n",
421
+ " <td>0.024900</td>\n",
422
+ " </tr>\n",
423
+ " <tr>\n",
424
+ " <td>20</td>\n",
425
+ " <td>0.018900</td>\n",
426
+ " </tr>\n",
427
+ " </tbody>\n",
428
+ "</table><p>"
429
+ ],
430
+ "text/plain": [
431
+ "<IPython.core.display.HTML object>"
432
+ ]
433
+ },
434
+ "metadata": {},
435
+ "output_type": "display_data"
436
+ },
437
+ {
438
+ "data": {
439
+ "text/plain": [
440
+ "TrainOutput(global_step=20, training_loss=0.428607177734375, metrics={'train_runtime': 1747.2099, 'train_samples_per_second': 0.183, 'train_steps_per_second': 0.011, 'total_flos': 219858091622400.0, 'train_loss': 0.428607177734375, 'epoch': 20.0})"
441
+ ]
442
+ },
443
+ "execution_count": 54,
444
+ "metadata": {},
445
+ "output_type": "execute_result"
446
+ }
447
+ ],
448
+ "source": [
449
+ "trainer.train()"
450
+ ]
451
+ },
452
+ {
453
+ "cell_type": "markdown",
454
+ "id": "1ed2cf09-3683-4016-88d9-9ada1ddb4345",
455
+ "metadata": {},
456
+ "source": [
457
+ "## Saving the finetuned model"
458
+ ]
459
+ },
460
+ {
461
+ "cell_type": "code",
462
+ "execution_count": 13,
463
+ "id": "c37902bf-47e5-4f89-9128-a6b7d91cb437",
464
+ "metadata": {},
465
+ "outputs": [],
466
+ "source": [
467
+ "model_id = \"Poro-34B-Lora-1\""
468
+ ]
469
+ },
470
+ {
471
+ "cell_type": "code",
472
+ "execution_count": null,
473
+ "id": "163b54c4-3027-4e0d-9d52-7e3d698020da",
474
+ "metadata": {},
475
+ "outputs": [],
476
+ "source": [
477
+ "peft_model.save_pretrained(model_id)"
478
+ ]
479
+ },
480
+ {
481
+ "cell_type": "code",
482
+ "execution_count": null,
483
+ "id": "ec432db5-4f0c-43c7-b4e4-ef087f057bd0",
484
+ "metadata": {},
485
+ "outputs": [],
486
+ "source": [
487
+ "!ls -lh {model_id} # Lora parameters file size"
488
+ ]
489
+ },
490
+ {
491
+ "cell_type": "markdown",
492
+ "id": "11460d4e-3e11-4fdb-b134-61b45bb84018",
493
+ "metadata": {},
494
+ "source": [
495
+ "## Testing"
496
+ ]
497
+ },
498
+ {
499
+ "cell_type": "code",
500
+ "execution_count": 8,
501
+ "id": "eb6a1213-a7ab-4bb5-8ffc-0e2666286dc6",
502
+ "metadata": {},
503
+ "outputs": [],
504
+ "source": [
505
+ "def generate_output(model, inputs, max_new_tokens=100):\n",
506
+ " outputs = model.generate(\n",
507
+ " input_ids=inputs[\"input_ids\"],\n",
508
+ " max_new_tokens=max_new_tokens,\n",
509
+ " temperature=0.1,\n",
510
+ " )\n",
511
+ " return outputs"
512
+ ]
513
+ },
514
+ {
515
+ "cell_type": "markdown",
516
+ "id": "0a844312-2a1e-4c76-9078-96506b252522",
517
+ "metadata": {},
518
+ "source": [
519
+ "### Original model"
520
+ ]
521
+ },
522
+ {
523
+ "cell_type": "code",
524
+ "execution_count": 9,
525
+ "id": "d38bbed0-e938-43ef-b816-b5e0f9d066fd",
526
+ "metadata": {},
527
+ "outputs": [
528
+ {
529
+ "name": "stdout",
530
+ "output_type": "stream",
531
+ "text": [
532
+ "['Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer: ```{ Voit vaihtaa uutiskirjeen sähköpostiosoitteen kirjautumalla sisään ja menemällä Oma tili -osioon. }```\\n']\n"
533
+ ]
534
+ }
535
+ ],
536
+ "source": [
537
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
538
+ "result = generate_output(model,prompt)\n",
539
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
540
+ ]
541
+ },
542
+ {
543
+ "cell_type": "markdown",
544
+ "id": "ae3c3d6a-2b07-4e46-9ddc-dccadfd07196",
545
+ "metadata": {},
546
+ "source": [
547
+ "### Finetuned model"
548
+ ]
549
+ },
550
+ {
551
+ "cell_type": "code",
552
+ "execution_count": 11,
553
+ "id": "4cf53f39-ad3f-43e2-8daa-79853b054cd2",
554
+ "metadata": {},
555
+ "outputs": [],
556
+ "source": [
557
+ "from peft import PeftModel"
558
+ ]
559
+ },
560
+ {
561
+ "cell_type": "code",
562
+ "execution_count": 14,
563
+ "id": "142bf57d-cffc-47b2-ae91-8a5420c46d32",
564
+ "metadata": {},
565
+ "outputs": [],
566
+ "source": [
567
+ "loaded_model = PeftModel.from_pretrained(model,model_id,is_trainable=False)"
568
+ ]
569
+ },
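`PeftModel.from_pretrained` keeps the adapter separate from the frozen base weights and applies it on the fly. Conceptually, a LoRA update could also be folded into the base weight as W' = W + (alpha/r)·B·A. A toy sketch of that merge with plain nested lists standing in for tensors (the tiny matrices here are made up purely for illustration):

```python
def merge_lora(W, A, B, alpha, r):
    # Merged weight: W + (alpha / r) * (B @ A), using nested lists
    # instead of tensors to keep the example dependency-free.
    scale = alpha / r
    rows, cols = len(W), len(W[0])
    delta = [[scale * sum(B[i][k] * A[k][j] for k in range(r))
              for j in range(cols)] for i in range(rows)]
    return [[W[i][j] + delta[i][j] for j in range(cols)] for i in range(rows)]

W = [[1.0, 0.0], [0.0, 1.0]]   # base weight (2 x 2)
A = [[1.0, 2.0]]               # LoRA A (r=1 x 2)
B = [[1.0], [1.0]]             # LoRA B (2 x r=1)
merged = merge_lora(W, A, B, alpha=1, r=1)
# merged == [[2.0, 2.0], [1.0, 3.0]]
```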
570
+ {
571
+ "cell_type": "code",
572
+ "execution_count": 15,
573
+ "id": "c3cacd26-edff-494c-9428-55b7659988de",
574
+ "metadata": {},
575
+ "outputs": [
576
+ {
577
+ "name": "stdout",
578
+ "output_type": "stream",
579
+ "text": [
580
+ "['Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer: { Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen. }.\\nKuinka vaihdan uutiskirjeen sähköpostiosoitteen?\\nPeruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.\\nPeruuta uutiskirjeen tilaus kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.\\nPeruuta uutiskirjeen tilaus kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje']\n"
581
+ ]
582
+ }
583
+ ],
584
+ "source": [
585
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
586
+ "result = generate_output(loaded_model,prompt)\n",
587
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
588
+ ]
589
+ },
590
+ {
591
+ "cell_type": "code",
592
+ "execution_count": null,
593
+ "id": "166c476c-01a2-49cc-b03f-6cb1d9ae6136",
594
+ "metadata": {},
595
+ "outputs": [],
596
+ "source": []
597
+ }
598
+ ],
599
+ "metadata": {
600
+ "kernelspec": {
601
+ "display_name": "Python 3 (ipykernel)",
602
+ "language": "python",
603
+ "name": "python3"
604
+ },
605
+ "language_info": {
606
+ "codemirror_mode": {
607
+ "name": "ipython",
608
+ "version": 3
609
+ },
610
+ "file_extension": ".py",
611
+ "mimetype": "text/x-python",
612
+ "name": "python",
613
+ "nbconvert_exporter": "python",
614
+ "pygments_lexer": "ipython3",
615
+ "version": "3.10.13"
616
+ }
617
+ },
618
+ "nbformat": 4,
619
+ "nbformat_minor": 5
620
+ }
Poro-34B-Lora-2/README.md ADDED
@@ -0,0 +1,204 @@
+ ---
+ library_name: peft
+ base_model: LumiOpen/Poro-34B
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+
+
+ ### Framework versions
+
+ - PEFT 0.9.0
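
The model card's "How to Get Started" section above is still a placeholder. The fine-tuning notebooks in this same commit query the adapter with prompts of the form `Given the question delimited by triple backticks ```{ question }```, what is the answer? Answer:`. As a minimal sketch of that prompt construction (the helper name is hypothetical, not part of this repo):

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the delimiter format used by the fine-tuning notebooks."""
    ticks = "`" * 3  # avoids literal triple backticks in the source
    return (
        "Given the question delimited by triple backticks "
        f"{ticks}{{ {question} }}{ticks}, what is the answer? Answer:"
    )

print(build_prompt("Miksi sähköpostiosoite tulee vahvistaa?"))
```

The resulting string can be passed to the tokenizer exactly as the hard-coded prompts in the notebooks are.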
Poro-34B-Lora-2/adapter_config.json ADDED
@@ -0,0 +1,27 @@
+ {
+ "alpha_pattern": {},
+ "auto_mapping": null,
+ "base_model_name_or_path": "LumiOpen/Poro-34B",
+ "bias": "none",
+ "fan_in_fan_out": false,
+ "inference_mode": true,
+ "init_lora_weights": true,
+ "layers_pattern": null,
+ "layers_to_transform": null,
+ "loftq_config": {},
+ "lora_alpha": 8,
+ "lora_dropout": 0.05,
+ "megatron_config": null,
+ "megatron_core": "megatron.core",
+ "modules_to_save": null,
+ "peft_type": "LORA",
+ "r": 8,
+ "rank_pattern": {},
+ "revision": null,
+ "target_modules": [
+ "query_key_value"
+ ],
+ "task_type": "CAUSAL_LM",
+ "use_dora": false,
+ "use_rslora": false
+ }
Poro-34B-Lora-2/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2097d27cf08d1fd1f7026c70dc0d1d994a56cad1db1361424a2baea1eb0ed64
+ size 24788784
Poro-34B-Lora-2QA.ipynb ADDED
@@ -0,0 +1,808 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "983e6b42-12bb-4513-af47-c8c4e34e3177",
6
+ "metadata": {},
7
+ "source": [
8
+ "# Poro34B Lora fine-tuning with S-Group's data - 2 Q/A"
9
+ ]
10
+ },
11
+ {
12
+ "cell_type": "code",
13
+ "execution_count": null,
14
+ "id": "e5f5a80c-0501-41a1-80a9-fb5792b45fea",
15
+ "metadata": {},
16
+ "outputs": [],
17
+ "source": [
18
+ "# This script finetunes the Poro34B model with 1 Question and Answer pair"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "markdown",
23
+ "id": "6441fdf4-ed64-447a-b2c6-542738dc2658",
24
+ "metadata": {},
25
+ "source": [
26
+ "## Initialization"
27
+ ]
28
+ },
29
+ {
30
+ "cell_type": "code",
31
+ "execution_count": 1,
32
+ "id": "67f730e6-3467-4a19-ab76-e8baace8e02e",
33
+ "metadata": {},
34
+ "outputs": [
35
+ {
36
+ "name": "stdout",
37
+ "output_type": "stream",
38
+ "text": [
39
+ "Requirement already satisfied: peft in /opt/conda/lib/python3.10/site-packages (0.9.0)\n",
40
+ "Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.10/site-packages (from peft) (1.26.3)\n",
41
+ "Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from peft) (23.2)\n",
42
+ "Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from peft) (5.9.8)\n",
43
+ "Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from peft) (6.0.1)\n",
44
+ "Requirement already satisfied: torch>=1.13.0 in /opt/conda/lib/python3.10/site-packages (from peft) (2.0.0.post101)\n",
45
+ "Requirement already satisfied: transformers in /opt/conda/lib/python3.10/site-packages (from peft) (4.31.0)\n",
46
+ "Requirement already satisfied: tqdm in /opt/conda/lib/python3.10/site-packages (from peft) (4.66.1)\n",
47
+ "Requirement already satisfied: accelerate>=0.21.0 in /opt/conda/lib/python3.10/site-packages (from peft) (0.21.0)\n",
48
+ "Requirement already satisfied: safetensors in /opt/conda/lib/python3.10/site-packages (from peft) (0.3.3)\n",
49
+ "Requirement already satisfied: huggingface-hub>=0.17.0 in /opt/conda/lib/python3.10/site-packages (from peft) (0.20.2)\n",
50
+ "Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (3.13.1)\n",
51
+ "Requirement already satisfied: fsspec>=2023.5.0 in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (2023.6.0)\n",
52
+ "Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (2.31.0)\n",
53
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.10/site-packages (from huggingface-hub>=0.17.0->peft) (4.5.0)\n",
54
+ "Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (1.12)\n",
55
+ "Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (3.2.1)\n",
56
+ "Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch>=1.13.0->peft) (3.1.3)\n",
57
+ "Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.10/site-packages (from transformers->peft) (2023.12.25)\n",
58
+ "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /opt/conda/lib/python3.10/site-packages (from transformers->peft) (0.13.3)\n",
59
+ "Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2->torch>=1.13.0->peft) (2.1.4)\n",
60
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (3.3.2)\n",
61
+ "Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (3.6)\n",
62
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (1.26.18)\n",
63
+ "Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->huggingface-hub>=0.17.0->peft) (2023.11.17)\n",
64
+ "Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch>=1.13.0->peft) (1.3.0)\n"
65
+ ]
66
+ }
67
+ ],
68
+ "source": [
69
+ "!pip install peft"
70
+ ]
71
+ },
72
+ {
73
+ "cell_type": "code",
74
+ "execution_count": 2,
75
+ "id": "80b24df2-140b-4792-aaf1-6f6aff92ece8",
76
+ "metadata": {},
77
+ "outputs": [
78
+ {
79
+ "name": "stderr",
80
+ "output_type": "stream",
81
+ "text": [
82
+ "2024-02-29 16:09:52.067525: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
83
+ "To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
84
+ ]
85
+ }
86
+ ],
87
+ "source": [
88
+ "import torch\n",
89
+ "import json\n",
90
+ "from transformers import AutoModelForCausalLM, AutoTokenizer \n",
91
+ "from transformers import TrainingArguments, Trainer\n",
92
+ "from transformers import pipeline\n",
93
+ "from peft import get_peft_model, PromptTuningConfig, TaskType, PromptTuningInit\n",
94
+ "from datasets import load_dataset"
95
+ ]
96
+ },
97
+ {
98
+ "cell_type": "code",
99
+ "execution_count": 3,
100
+ "id": "d31adfc6-a460-419e-871b-d0437501b026",
101
+ "metadata": {},
102
+ "outputs": [],
103
+ "source": [
104
+ "device = torch.device(\"cuda\") if torch.cuda.is_available() else torch.device(\"cpu\")"
105
+ ]
106
+ },
107
+ {
108
+ "cell_type": "code",
109
+ "execution_count": 4,
110
+ "id": "2c5a9b07-c92b-4d1d-b5b5-96e8c234e14f",
111
+ "metadata": {},
112
+ "outputs": [
113
+ {
114
+ "name": "stdout",
115
+ "output_type": "stream",
116
+ "text": [
117
+ "cpu\n"
118
+ ]
119
+ }
120
+ ],
121
+ "source": [
122
+ "print(device)"
123
+ ]
124
+ },
125
+ {
126
+ "cell_type": "markdown",
127
+ "id": "4b1ec235-cdd1-4b68-b9b0-fe214cbeb2be",
128
+ "metadata": {},
129
+ "source": [
130
+ "## Foundation model import"
131
+ ]
132
+ },
133
+ {
134
+ "cell_type": "code",
135
+ "execution_count": 3,
136
+ "id": "2c0f7b3a-9d56-46ce-9dc8-5fe40b2628a6",
137
+ "metadata": {},
138
+ "outputs": [],
139
+ "source": [
140
+ "model_name='LumiOpen/Poro-34B'"
141
+ ]
142
+ },
143
+ {
144
+ "cell_type": "code",
145
+ "execution_count": 4,
146
+ "id": "4e4c9089-a195-4fd7-91b2-6240cafb4989",
147
+ "metadata": {},
148
+ "outputs": [],
149
+ "source": [
150
+ "tokenizer = AutoTokenizer.from_pretrained(model_name)"
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": null,
156
+ "id": "a42e0fb6-40d4-483b-a034-84ff351c021d",
157
+ "metadata": {},
158
+ "outputs": [
159
+ {
160
+ "data": {
161
+ "application/vnd.jupyter.widget-view+json": {
162
+ "model_id": "2784e3e4025e44d4aa5edb0aa58a2aaa",
163
+ "version_major": 2,
164
+ "version_minor": 0
165
+ },
166
+ "text/plain": [
167
+ "Loading checkpoint shards: 0%| | 0/14 [00:00<?, ?it/s]"
168
+ ]
169
+ },
170
+ "metadata": {},
171
+ "output_type": "display_data"
172
+ }
173
+ ],
174
+ "source": [
175
+ "branch = \"1000B\"\n",
176
+ "model = AutoModelForCausalLM.from_pretrained(model_name,\n",
177
+ " torch_dtype=torch.bfloat16,\n",
178
+ " revision=branch,\n",
179
+ ")"
180
+ ]
181
+ },
182
+ {
183
+ "cell_type": "markdown",
184
+ "id": "32bce199-3107-4329-87e4-1c33b9e70c30",
185
+ "metadata": {},
186
+ "source": [
187
+ "## Setting up the Lora parameters"
188
+ ]
189
+ },
190
+ {
191
+ "cell_type": "code",
192
+ "execution_count": 8,
193
+ "id": "63151e65-6bff-4b65-a8ae-af4d6c53036f",
194
+ "metadata": {},
195
+ "outputs": [],
196
+ "source": [
197
+ "from peft import LoraConfig, get_peft_model"
198
+ ]
199
+ },
200
+ {
201
+ "cell_type": "code",
202
+ "execution_count": 9,
203
+ "id": "f35f934f-23c6-47db-95bb-df20526e29e7",
204
+ "metadata": {},
205
+ "outputs": [],
206
+ "source": [
207
+ "config = LoraConfig(\n",
208
+ " r=8,\n",
209
+ " lora_alpha=8,\n",
210
+ " target_modules=[\"query_key_value\"],\n",
211
+ " lora_dropout=0.05,\n",
212
+ " bias=\"none\",\n",
213
+ " task_type=\"CAUSAL_LM\"\n",
214
+ ")"
215
+ ]
216
+ },
217
+ {
218
+ "cell_type": "code",
219
+ "execution_count": 10,
220
+ "id": "a62cb983-6e28-40f1-9f8d-64a1a6ccd0f3",
221
+ "metadata": {},
222
+ "outputs": [],
223
+ "source": [
224
+ "peft_model = get_peft_model(model, config)"
225
+ ]
226
+ },
227
+ {
228
+ "cell_type": "code",
229
+ "execution_count": 11,
230
+ "id": "fe7e7078-2998-4f17-abda-f03a32e04735",
231
+ "metadata": {},
232
+ "outputs": [
233
+ {
234
+ "name": "stdout",
235
+ "output_type": "stream",
236
+ "text": [
237
+ "trainable params: 12386304\n",
238
+ "all params: 34229336064\n",
239
+ "trainable: 0.04%\n"
240
+ ]
241
+ }
242
+ ],
243
+ "source": [
244
+ "trainable_params = 0\n",
245
+ "all_param = 0\n",
246
+ "\n",
247
+ "# iterating over all parameters\n",
248
+ "for _, param in peft_model.named_parameters():\n",
249
+ " # adding parameters to total\n",
250
+ " all_param += param.numel()\n",
251
+ " # adding parameters to trainable if they require a graident\n",
252
+ " if param.requires_grad:\n",
253
+ " trainable_params += param.numel()\n",
254
+ "\n",
255
+ "# print number of trainable parameters\n",
256
+ "print(f\"trainable params: {trainable_params}\")\n",
257
+ "print(f\"all params: {all_param}\")\n",
258
+ "print(f\"trainable: {100 * trainable_params / all_param:.2f}%\")"
259
+ ]
260
+ },
261
+ {
262
+ "cell_type": "markdown",
263
+ "id": "6c1b2946-5575-4de0-bf99-a05fa3994964",
264
+ "metadata": {},
265
+ "source": [
266
+ "## Preparing the training data"
267
+ ]
268
+ },
269
+ {
270
+ "cell_type": "code",
271
+ "execution_count": 12,
272
+ "id": "216aa90b-a87d-4a37-b178-e7e83bf987ce",
273
+ "metadata": {},
274
+ "outputs": [],
275
+ "source": [
276
+ "# prepare the data for training\n",
277
+ "def prepare_train_data(data):\n",
278
+ " text_input = data['text']\n",
279
+ " tokenized_input = tokenizer(text_input, return_tensors='pt', padding=True)\n",
280
+ " tokenized_input['labels'] = tokenized_input['input_ids']\n",
281
+ " return tokenized_input"
282
+ ]
283
+ },
284
+ {
285
+ "cell_type": "code",
286
+ "execution_count": 13,
287
+ "id": "5551fac2-9a26-4a86-b8b4-d2f572ecfaa9",
288
+ "metadata": {},
289
+ "outputs": [
290
+ {
291
+ "data": {
292
+ "application/vnd.jupyter.widget-view+json": {
293
+ "model_id": "df4aebc8dac54280a04c82106b935f82",
294
+ "version_major": 2,
295
+ "version_minor": 0
296
+ },
297
+ "text/plain": [
298
+ "Generating train split: 0 examples [00:00, ? examples/s]"
299
+ ]
300
+ },
301
+ "metadata": {},
302
+ "output_type": "display_data"
303
+ }
304
+ ],
305
+ "source": [
306
+ "dataset = load_dataset(\"json\", data_files=\"prompts_2.json\")"
307
+ ]
308
+ },
309
+ {
310
+ "cell_type": "code",
311
+ "execution_count": 14,
312
+ "id": "dfa8382a-3e66-4433-b6dd-97d59a7945f6",
313
+ "metadata": {},
314
+ "outputs": [
315
+ {
316
+ "data": {
317
+ "application/vnd.jupyter.widget-view+json": {
318
+ "model_id": "a20ed74f503748eeac691e1fbf026fec",
319
+ "version_major": 2,
320
+ "version_minor": 0
321
+ },
322
+ "text/plain": [
323
+ "Map: 0%| | 0/2 [00:00<?, ? examples/s]"
324
+ ]
325
+ },
326
+ "metadata": {},
327
+ "output_type": "display_data"
328
+ }
329
+ ],
330
+ "source": [
331
+ "train_dataset = dataset['train'].map(prepare_train_data, batched=True, remove_columns=[\"text\"])"
332
+ ]
333
+ },
334
+ {
335
+ "cell_type": "markdown",
336
+ "id": "19e453fc-728e-4ad9-b44c-e6b95b37707b",
337
+ "metadata": {},
338
+ "source": [
339
+ "## Setting up the training parameters"
340
+ ]
341
+ },
342
+ {
343
+ "cell_type": "code",
344
+ "execution_count": 15,
345
+ "id": "9bb42764-af93-463f-8cf6-68707f21151b",
346
+ "metadata": {},
347
+ "outputs": [],
348
+ "source": [
349
+ "from transformers import DataCollatorForLanguageModeling"
350
+ ]
351
+ },
352
+ {
353
+ "cell_type": "code",
354
+ "execution_count": 16,
355
+ "id": "4b8a94c9-2648-4626-9238-4475645aa695",
356
+ "metadata": {},
357
+ "outputs": [],
358
+ "source": [
359
+ "trainer = Trainer(\n",
360
+ " model=peft_model,\n",
361
+ " train_dataset=train_dataset,\n",
362
+ " args=TrainingArguments(\n",
363
+ " per_device_train_batch_size=4,\n",
364
+ " gradient_accumulation_steps=4,\n",
365
+ " warmup_steps=20,\n",
366
+ " max_steps=20,\n",
367
+ " learning_rate=1e-3,\n",
368
+ " logging_steps=1,\n",
369
+ " output_dir='outputs',\n",
370
+ " ),\n",
371
+ " data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)\n",
372
+ ")"
373
+ ]
374
+ },
375
+ {
376
+ "cell_type": "code",
377
+ "execution_count": 17,
378
+ "id": "fbd62db8-65e3-4d05-b11e-179aaf8f0e65",
379
+ "metadata": {},
380
+ "outputs": [
381
+ {
382
+ "name": "stderr",
383
+ "output_type": "stream",
384
+ "text": [
385
+ "/opt/conda/lib/python3.10/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
386
+ " warnings.warn(\n",
387
+ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mtimo-au-laine\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\n"
388
+ ]
389
+ },
390
+ {
391
+ "data": {
392
+ "text/html": [
393
+ "wandb version 0.16.3 is available! To upgrade, please run:\n",
394
+ " $ pip install wandb --upgrade"
395
+ ],
396
+ "text/plain": [
397
+ "<IPython.core.display.HTML object>"
398
+ ]
399
+ },
400
+ "metadata": {},
401
+ "output_type": "display_data"
402
+ },
403
+ {
404
+ "data": {
405
+ "text/html": [
406
+ "Tracking run with wandb version 0.16.2"
407
+ ],
408
+ "text/plain": [
409
+ "<IPython.core.display.HTML object>"
410
+ ]
411
+ },
412
+ "metadata": {},
413
+ "output_type": "display_data"
414
+ },
415
+ {
416
+ "data": {
417
+ "text/html": [
418
+ "Run data is saved locally in <code>/home/sagemaker-user/wandb/run-20240229_153102-95aeur18</code>"
419
+ ],
420
+ "text/plain": [
421
+ "<IPython.core.display.HTML object>"
422
+ ]
423
+ },
424
+ "metadata": {},
425
+ "output_type": "display_data"
426
+ },
427
+ {
428
+ "data": {
429
+ "text/html": [
430
+ "Syncing run <strong><a href='https://wandb.ai/timo-au-laine/huggingface/runs/95aeur18' target=\"_blank\">brisk-fog-8</a></strong> to <a href='https://wandb.ai/timo-au-laine/huggingface' target=\"_blank\">Weights & Biases</a> (<a href='https://wandb.me/run' target=\"_blank\">docs</a>)<br/>"
431
+ ],
432
+ "text/plain": [
433
+ "<IPython.core.display.HTML object>"
434
+ ]
435
+ },
436
+ "metadata": {},
437
+ "output_type": "display_data"
438
+ },
439
+ {
440
+ "data": {
441
+ "text/html": [
442
+ " View project at <a href='https://wandb.ai/timo-au-laine/huggingface' target=\"_blank\">https://wandb.ai/timo-au-laine/huggingface</a>"
443
+ ],
444
+ "text/plain": [
445
+ "<IPython.core.display.HTML object>"
446
+ ]
447
+ },
448
+ "metadata": {},
449
+ "output_type": "display_data"
450
+ },
451
+ {
452
+ "data": {
453
+ "text/html": [
454
+ " View run at <a href='https://wandb.ai/timo-au-laine/huggingface/runs/95aeur18' target=\"_blank\">https://wandb.ai/timo-au-laine/huggingface/runs/95aeur18</a>"
455
+ ],
456
+ "text/plain": [
457
+ "<IPython.core.display.HTML object>"
458
+ ]
459
+ },
460
+ "metadata": {},
461
+ "output_type": "display_data"
462
+ },
463
+ {
464
+ "name": "stderr",
465
+ "output_type": "stream",
466
+ "text": [
467
+ "You're using a BloomTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n"
468
+ ]
469
+ },
470
+ {
471
+ "data": {
472
+ "text/html": [
473
+ "\n",
474
+ " <div>\n",
475
+ " \n",
476
+ " <progress value='20' max='20' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
477
+ " [20/20 29:32, Epoch 20/20]\n",
478
+ " </div>\n",
479
+ " <table border=\"1\" class=\"dataframe\">\n",
480
+ " <thead>\n",
481
+ " <tr style=\"text-align: left;\">\n",
482
+ " <th>Step</th>\n",
483
+ " <th>Training Loss</th>\n",
484
+ " </tr>\n",
485
+ " </thead>\n",
486
+ " <tbody>\n",
487
+ " <tr>\n",
488
+ " <td>1</td>\n",
489
+ " <td>0.687500</td>\n",
490
+ " </tr>\n",
491
+ " <tr>\n",
492
+ " <td>2</td>\n",
493
+ " <td>0.683600</td>\n",
494
+ " </tr>\n",
495
+ " <tr>\n",
496
+ " <td>3</td>\n",
497
+ " <td>0.683600</td>\n",
498
+ " </tr>\n",
499
+ " <tr>\n",
500
+ " <td>4</td>\n",
501
+ " <td>0.683600</td>\n",
502
+ " </tr>\n",
503
+ " <tr>\n",
504
+ " <td>5</td>\n",
505
+ " <td>0.683600</td>\n",
506
+ " </tr>\n",
507
+ " <tr>\n",
508
+ " <td>6</td>\n",
509
+ " <td>0.671900</td>\n",
510
+ " </tr>\n",
511
+ " <tr>\n",
512
+ " <td>7</td>\n",
513
+ " <td>0.656200</td>\n",
514
+ " </tr>\n",
515
+ " <tr>\n",
516
+ " <td>8</td>\n",
517
+ " <td>0.617200</td>\n",
518
+ " </tr>\n",
519
+ " <tr>\n",
520
+ " <td>9</td>\n",
521
+ " <td>0.585900</td>\n",
522
+ " </tr>\n",
523
+ " <tr>\n",
524
+ " <td>10</td>\n",
525
+ " <td>0.543000</td>\n",
526
+ " </tr>\n",
527
+ " <tr>\n",
528
+ " <td>11</td>\n",
529
+ " <td>0.488300</td>\n",
530
+ " </tr>\n",
531
+ " <tr>\n",
532
+ " <td>12</td>\n",
533
+ " <td>0.423800</td>\n",
534
+ " </tr>\n",
535
+ " <tr>\n",
536
+ " <td>13</td>\n",
537
+ " <td>0.357400</td>\n",
538
+ " </tr>\n",
539
+ " <tr>\n",
540
+ " <td>14</td>\n",
541
+ " <td>0.281200</td>\n",
542
+ " </tr>\n",
543
+ " <tr>\n",
544
+ " <td>15</td>\n",
545
+ " <td>0.204100</td>\n",
546
+ " </tr>\n",
547
+ " <tr>\n",
548
+ " <td>16</td>\n",
549
+ " <td>0.137700</td>\n",
550
+ " </tr>\n",
551
+ " <tr>\n",
552
+ " <td>17</td>\n",
553
+ " <td>0.088400</td>\n",
554
+ " </tr>\n",
555
+ " <tr>\n",
556
+ " <td>18</td>\n",
557
+ " <td>0.058100</td>\n",
558
+ " </tr>\n",
559
+ " <tr>\n",
560
+ " <td>19</td>\n",
561
+ " <td>0.043500</td>\n",
562
+ " </tr>\n",
563
+ " <tr>\n",
564
+ " <td>20</td>\n",
565
+ " <td>0.027600</td>\n",
566
+ " </tr>\n",
567
+ " </tbody>\n",
568
+ "</table><p>"
569
+ ],
570
+ "text/plain": [
571
+ "<IPython.core.display.HTML object>"
572
+ ]
573
+ },
574
+ "metadata": {},
575
+ "output_type": "display_data"
576
+ },
577
+ {
578
+ "data": {
579
+ "text/plain": [
580
+ "TrainOutput(global_step=20, training_loss=0.43031005859375, metrics={'train_runtime': 1884.4748, 'train_samples_per_second': 0.17, 'train_steps_per_second': 0.011, 'total_flos': 1415086626078720.0, 'train_loss': 0.43031005859375, 'epoch': 20.0})"
581
+ ]
582
+ },
583
+ "execution_count": 17,
584
+ "metadata": {},
585
+ "output_type": "execute_result"
586
+ }
587
+ ],
588
+ "source": [
589
+ "trainer.train()"
590
+ ]
591
+ },
592
+ {
593
+ "cell_type": "markdown",
594
+ "id": "efc3ef10-e402-481a-8abd-4174444d668a",
595
+ "metadata": {},
596
+ "source": [
597
+ "## Saving the finetuned model"
598
+ ]
599
+ },
600
+ {
601
+ "cell_type": "code",
602
+ "execution_count": 18,
603
+ "id": "c37902bf-47e5-4f89-9128-a6b7d91cb437",
604
+ "metadata": {},
605
+ "outputs": [],
606
+ "source": [
607
+ "model_id2 = \"Poro-34B-Lora-2\""
608
+ ]
609
+ },
610
+ {
611
+ "cell_type": "code",
612
+ "execution_count": 19,
613
+ "id": "163b54c4-3027-4e0d-9d52-7e3d698020da",
614
+ "metadata": {},
615
+ "outputs": [],
616
+ "source": [
617
+ "peft_model.save_pretrained(model_id2)"
618
+ ]
619
+ },
620
+ {
621
+ "cell_type": "code",
622
+ "execution_count": null,
623
+ "id": "ec432db5-4f0c-43c7-b4e4-ef087f057bd0",
624
+ "metadata": {},
625
+ "outputs": [],
626
+ "source": [
627
+ "!ls -lh {model_id2}"
628
+ ]
629
+ },
630
+ {
631
+ "cell_type": "markdown",
632
+ "id": "059b6ee5-bfc0-4dcf-901c-ef869bedbb90",
633
+ "metadata": {},
634
+ "source": [
635
+ "## Testing"
636
+ ]
637
+ },
638
+ {
639
+ "cell_type": "code",
640
+ "execution_count": null,
641
+ "id": "eb6a1213-a7ab-4bb5-8ffc-0e2666286dc6",
642
+ "metadata": {},
643
+ "outputs": [],
644
+ "source": [
645
+ "def generate_output(model, inputs, max_new_tokens=100):\n",
646
+ " outputs = model.generate(\n",
647
+ " input_ids=inputs[\"input_ids\"],\n",
648
+ " max_new_tokens=max_new_tokens,\n",
649
+ " temperature=0.1,\n",
650
+ " )\n",
651
+ " return outputs"
652
+ ]
653
+ },
654
+ {
655
+ "cell_type": "markdown",
656
+ "id": "619f7e85-3310-40d1-8706-1289676b98ca",
657
+ "metadata": {},
658
+ "source": [
659
+ "### Original model"
660
+ ]
661
+ },
662
+ {
663
+ "cell_type": "code",
664
+ "execution_count": null,
665
+ "id": "d38bbed0-e938-43ef-b816-b5e0f9d066fd",
666
+ "metadata": {},
667
+ "outputs": [
668
+ {
669
+ "name": "stdout",
670
+ "output_type": "stream",
671
+ "text": [
672
+ "['Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer: ```{ Voit vaihtaa uutiskirjeen sähköpostiosoitteen kirjautumalla sisään ja menemällä Oma tili -osioon. }```\\n']\n"
673
+ ]
674
+ }
675
+ ],
676
+ "source": [
677
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
678
+ "result = generate_output(model,prompt)\n",
679
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
680
+ ]
681
+ },
682
+ {
683
+ "cell_type": "code",
684
+ "execution_count": null,
685
+ "id": "6d354261-2dbb-4ecd-a75e-3af454c3c051",
686
+ "metadata": {},
687
+ "outputs": [
688
+ {
689
+ "name": "stdout",
690
+ "output_type": "stream",
691
+ "text": [
692
+ "['Given the question delimited by triple backticks ```{ Miksi sähköpostiosoite tulee vahvistaa? }```, what is the answer? Answer: Because the email address needs to be confirmed.']\n"
693
+ ]
694
+ }
695
+ ],
696
+ "source": [
697
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Miksi sähköpostiosoite tulee vahvistaa? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
698
+ "result = generate_output(model,prompt)\n",
699
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
700
+ ]
701
+ },
702
+ {
703
+ "cell_type": "markdown",
704
+ "id": "3afd691c-b853-41e1-abc5-3e462dd8f9a0",
705
+ "metadata": {},
706
+ "source": [
707
+ "### Finetuned model"
708
+ ]
709
+ },
710
+ {
711
+ "cell_type": "code",
712
+ "execution_count": 23,
713
+ "id": "4cf53f39-ad3f-43e2-8daa-79853b054cd2",
714
+ "metadata": {},
715
+ "outputs": [],
716
+ "source": [
717
+ "from peft import PeftModel"
718
+ ]
719
+ },
720
+ {
721
+ "cell_type": "code",
722
+ "execution_count": 24,
723
+ "id": "142bf57d-cffc-47b2-ae91-8a5420c46d32",
724
+ "metadata": {},
725
+ "outputs": [],
726
+ "source": [
727
+ "loaded_model = PeftModel.from_pretrained(model,model_id2,is_trainable=False)"
728
+ ]
729
+ },
730
+ {
731
+ "cell_type": "code",
732
+ "execution_count": 25,
733
+ "id": "c3cacd26-edff-494c-9428-55b7659988de",
734
+ "metadata": {},
735
+ "outputs": [
736
+ {
737
+ "name": "stdout",
738
+ "output_type": "stream",
739
+ "text": [
740
+ "['Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer: { Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen. }\\n\\n### Answering a Given Question\\n\\nGiven a question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer: { Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen. }\\n\\n## Data Sources\\n\\nThe data for currently supported']\n"
741
+ ]
742
+ }
743
+ ],
744
+ "source": [
745
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
746
+ "result = generate_output(loaded_model,prompt)\n",
747
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
748
+ ]
749
+ },
750
+ {
751
+ "cell_type": "code",
752
+ "execution_count": 26,
753
+ "id": "b1ad5b3e-8f22-4763-b3e1-91b604731048",
754
+ "metadata": {},
755
+ "outputs": [
756
+ {
757
+ "name": "stdout",
758
+ "output_type": "stream",
759
+ "text": [
760
+ "['Given the question delimited by triple backticks ```{ Miksi sähköpostiosoite tulee vahvistaa? }```, what is the answer? Answer: {Sähköpostiosoitteiden vahvistaminen on yleisesti käytössä oleva tapa varmistua siitä, että henkilöllä itsellään on pääsy hänen tiedoissaan olevaan sähköpostiosoitteeseen.\\n\\nSähköpostiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-käyttäjätilin. Kun lähetät vahvistusviestin omissa tiedoissasi näkyvään sähköpostiosoitteeseen ja vahvistat itse osoitteen oikeaksi sähköpostiisi lähetetyllä vahvistuskoodilla, saamme varmistuksen, että osoitteesi on voimassa ja kuuluu juuri sinulle.\\n\\nJos asiakastiedoissasi olevasta sähköpostiosoitteesta puuttuu vielä vahvistus, näkyy']\n"
761
+ ]
762
+ }
763
+ ],
764
+ "source": [
765
+ "prompt = tokenizer('Given the question delimited by triple backticks ```{ Miksi sähköpostiosoite tulee vahvistaa? }```, what is the answer? Answer:', return_tensors=\"pt\")\n",
766
+ "result = generate_output(loaded_model, prompt)\n",
767
+ "print(tokenizer.batch_decode(result, skip_special_tokens=True))"
768
+ ]
769
+ },
770
+ {
771
+ "cell_type": "code",
772
+ "execution_count": null,
773
+ "id": "166c476c-01a2-49cc-b03f-6cb1d9ae6136",
774
+ "metadata": {},
775
+ "outputs": [],
776
+ "source": []
777
+ },
778
+ {
779
+ "cell_type": "code",
780
+ "execution_count": null,
781
+ "id": "09523d41-51e5-4cc6-b68b-c848063e5095",
782
+ "metadata": {},
783
+ "outputs": [],
784
+ "source": []
785
+ }
786
+ ],
787
+ "metadata": {
788
+ "kernelspec": {
789
+ "display_name": "Python 3 (ipykernel)",
790
+ "language": "python",
791
+ "name": "python3"
792
+ },
793
+ "language_info": {
794
+ "codemirror_mode": {
795
+ "name": "ipython",
796
+ "version": 3
797
+ },
798
+ "file_extension": ".py",
799
+ "mimetype": "text/x-python",
800
+ "name": "python",
801
+ "nbconvert_exporter": "python",
802
+ "pygments_lexer": "ipython3",
803
+ "version": "3.10.13"
804
+ }
805
+ },
806
+ "nbformat": 4,
807
+ "nbformat_minor": 5
808
+ }
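The generations printed above embed the answer inside `{ ... }` after `Answer:`. A small post-processing helper (hypothetical, not part of the notebook) can recover just the answer text from such an output:

```python
import re

def extract_answer(generated: str) -> str:
    # Pull the first { ... } block that follows "Answer:" in a generation
    # produced with the prompt template used in this notebook.
    m = re.search(r"Answer:\s*\{(.*?)\}", generated, flags=re.DOTALL)
    return m.group(1).strip() if m else ""

out = ("Given the question delimited by triple backticks ```{ Kysymys? }```, "
       "what is the answer? Answer: { Peruuta ensin vanha tilaus. }")
print(extract_answer(out))  # -> "Peruuta ensin vanha tilaus."
```

The lazy `(.*?)` stops at the first closing brace, so any braces in the question part are ignored because the match is anchored on `Answer:`.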
Prepare_data.ipynb ADDED
@@ -0,0 +1,504 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "57176cfe-974a-430b-b51b-11f5eae772f7",
6
+ "metadata": {
7
+ "id": "mGXJ4xe93YSr"
8
+ },
9
+ "source": [
10
+ "# Creating a dataset from Excel to JSON"
11
+ ]
12
+ },
13
+ {
14
+ "cell_type": "code",
15
+ "execution_count": null,
16
+ "id": "9e6e6551",
17
+ "metadata": {
18
+ "id": "9e6e6551"
19
+ },
20
+ "outputs": [],
21
+ "source": [
22
+ "import pandas as pd\n",
23
+ "import json"
24
+ ]
25
+ },
26
+ {
27
+ "cell_type": "code",
28
+ "execution_count": null,
29
+ "id": "a410636f",
30
+ "metadata": {
31
+ "id": "a410636f"
32
+ },
33
+ "outputs": [],
34
+ "source": [
35
+ "pd.set_option('display.max_rows', None)\n",
36
+ "pd.set_option('display.max_columns', None)\n",
37
+ "pd.set_option('display.width', None)\n",
38
+ "pd.set_option('display.max_colwidth', None)"
39
+ ]
40
+ },
41
+ {
42
+ "cell_type": "code",
43
+ "execution_count": null,
44
+ "id": "9a95c015",
45
+ "metadata": {
46
+ "id": "9a95c015"
47
+ },
48
+ "outputs": [],
49
+ "source": [
50
+ "# read the dataset from an Excel file\n",
51
+ "dataset = pd.read_excel(\"S-Kanava_2.xlsx\")"
52
+ ]
53
+ },
54
+ {
55
+ "cell_type": "code",
56
+ "execution_count": null,
57
+ "id": "5cbbfcdb",
58
+ "metadata": {
59
+ "colab": {
60
+ "base_uri": "https://localhost:8080/"
61
+ },
62
+ "id": "5cbbfcdb",
63
+ "outputId": "d4a21e61-6a4e-4d67-cb28-315f4decba3e"
64
+ },
65
+ "outputs": [
66
+ {
67
+ "data": {
68
+ "text/plain": [
69
+ "(2, 2)"
70
+ ]
71
+ },
72
+ "execution_count": 21,
73
+ "metadata": {},
74
+ "output_type": "execute_result"
75
+ }
76
+ ],
77
+ "source": [
78
+ "dataset.shape"
79
+ ]
80
+ },
81
+ {
82
+ "cell_type": "code",
83
+ "execution_count": null,
84
+ "id": "6d0f665a",
85
+ "metadata": {
86
+ "colab": {
87
+ "base_uri": "https://localhost:8080/"
88
+ },
89
+ "id": "6d0f665a",
90
+ "outputId": "9dfe6ffa-bfb2-4752-a0cd-97b182b34cce"
91
+ },
92
+ "outputs": [
93
+ {
94
+ "data": {
95
+ "text/plain": [
96
+ "(2, 2)"
97
+ ]
98
+ },
99
+ "execution_count": 22,
100
+ "metadata": {},
101
+ "output_type": "execute_result"
102
+ }
103
+ ],
104
+ "source": [
105
+ "dataset = dataset.dropna()\n",
106
+ "dataset.shape"
107
+ ]
108
+ },
109
+ {
110
+ "cell_type": "code",
111
+ "execution_count": null,
112
+ "id": "446d22c9",
113
+ "metadata": {
114
+ "colab": {
115
+ "base_uri": "https://localhost:8080/",
116
+ "height": 230
117
+ },
118
+ "id": "446d22c9",
119
+ "outputId": "9d08ba83-7011-475d-ed4a-fd5736fa1e95"
120
+ },
121
+ "outputs": [
122
+ {
123
+ "data": {
124
+ "application/vnd.google.colaboratory.intrinsic+json": {
125
+ "summary": "{\n \"name\": \"dataset\",\n \"rows\": 2,\n \"fields\": [\n {\n \"column\": \"Kysymys\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 2,\n \"samples\": [\n \"Miksi s\\u00e4hk\\u00f6postiosoite tulee vahvistaa?\",\n \"Kuinka vaihdan uutiskirjeen s\\u00e4hk\\u00f6postiosoitteen?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Vastaus\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 2,\n \"samples\": [\n \"S\\u00e4hk\\u00f6postiosoitteiden vahvistaminen on yleisesti k\\u00e4yt\\u00f6ss\\u00e4 oleva tapa varmistua siit\\u00e4, ett\\u00e4 henkil\\u00f6ll\\u00e4 itsell\\u00e4\\u00e4n on p\\u00e4\\u00e4sy h\\u00e4nen tiedoissaan olevaan s\\u00e4hk\\u00f6postiosoitteeseen.\\n\\nS\\u00e4hk\\u00f6postiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-k\\u00e4ytt\\u00e4j\\u00e4tilin. Kun l\\u00e4het\\u00e4t vahvistusviestin omissa tiedoissasi n\\u00e4kyv\\u00e4\\u00e4n s\\u00e4hk\\u00f6postiosoitteeseen ja vahvistat itse osoitteen oikeaksi s\\u00e4hk\\u00f6postiisi l\\u00e4hetetyll\\u00e4 vahvistuskoodilla, saamme varmistuksen, ett\\u00e4 osoitteesi on voimassa ja kuuluu juuri sinulle.\\n\\nJos asiakastiedoissasi olevasta s\\u00e4hk\\u00f6postiosoitteesta puuttuu viel\\u00e4 vahvistus, n\\u00e4kyy osoitteen yhteydess\\u00e4 Vahvista -painike.\\n\\nBonustiedot ja muut henkil\\u00f6kohtaiset tiedotteet l\\u00e4hetet\\u00e4\\u00e4n vain vahvistettuun s\\u00e4hk\\u00f6postiosoitteeseen. Bonustilanteesi voit kuitenkin jatkossakin tarkastaa S-mobiilista, S-k\\u00e4ytt\\u00e4j\\u00e4tililt\\u00e4 sek\\u00e4 toimipaikkojen S-Etukorttip\\u00e4\\u00e4tteilt\\u00e4.\",\n \"Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan \\u201cPeruuta tilaus\\u201d -linkist\\u00e4.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
126
+ "type": "dataframe",
127
+ "variable_name": "dataset"
128
+ },
129
+ "text/html": [
130
+ "\n",
131
+ " <div id=\"df-6dd34700-e7fc-44b2-ad18-b7f45e54b1ee\" class=\"colab-df-container\">\n",
132
+ " <div>\n",
133
+ "<style scoped>\n",
134
+ " .dataframe tbody tr th:only-of-type {\n",
135
+ " vertical-align: middle;\n",
136
+ " }\n",
137
+ "\n",
138
+ " .dataframe tbody tr th {\n",
139
+ " vertical-align: top;\n",
140
+ " }\n",
141
+ "\n",
142
+ " .dataframe thead th {\n",
143
+ " text-align: right;\n",
144
+ " }\n",
145
+ "</style>\n",
146
+ "<table border=\"1\" class=\"dataframe\">\n",
147
+ " <thead>\n",
148
+ " <tr style=\"text-align: right;\">\n",
149
+ " <th></th>\n",
150
+ " <th>Kysymys</th>\n",
151
+ " <th>Vastaus</th>\n",
152
+ " </tr>\n",
153
+ " </thead>\n",
154
+ " <tbody>\n",
155
+ " <tr>\n",
156
+ " <th>0</th>\n",
157
+ " <td>Kuinka vaihdan uutiskirjeen sähköpostiosoitteen?</td>\n",
158
+ " <td>Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.</td>\n",
159
+ " </tr>\n",
160
+ " <tr>\n",
161
+ " <th>1</th>\n",
162
+ " <td>Miksi sähköpostiosoite tulee vahvistaa?</td>\n",
163
+ " <td>Sähköpostiosoitteiden vahvistaminen on yleisesti käytössä oleva tapa varmistua siitä, että henkilöllä itsellään on pääsy hänen tiedoissaan olevaan sähköpostiosoitteeseen.\\n\\nSähköpostiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-käyttäjätilin. Kun lähetät vahvistusviestin omissa tiedoissasi näkyvään sähköpostiosoitteeseen ja vahvistat itse osoitteen oikeaksi sähköpostiisi lähetetyllä vahvistuskoodilla, saamme varmistuksen, että osoitteesi on voimassa ja kuuluu juuri sinulle.\\n\\nJos asiakastiedoissasi olevasta sähköpostiosoitteesta puuttuu vielä vahvistus, näkyy osoitteen yhteydessä Vahvista -painike.\\n\\nBonustiedot ja muut henkilökohtaiset tiedotteet lähetetään vain vahvistettuun sähköpostiosoitteeseen. Bonustilanteesi voit kuitenkin jatkossakin tarkastaa S-mobiilista, S-käyttäjätililtä sekä toimipaikkojen S-Etukorttipäätteiltä.</td>\n",
164
+ " </tr>\n",
165
+ " </tbody>\n",
166
+ "</table>\n",
167
+ "</div>\n",
168
+ " <div class=\"colab-df-buttons\">\n",
169
+ "\n",
170
+ " <div class=\"colab-df-container\">\n",
171
+ " <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-6dd34700-e7fc-44b2-ad18-b7f45e54b1ee')\"\n",
172
+ " title=\"Convert this dataframe to an interactive table.\"\n",
173
+ " style=\"display:none;\">\n",
174
+ "\n",
175
+ " <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
176
+ " <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
177
+ " </svg>\n",
178
+ " </button>\n",
179
+ "\n",
180
+ " <style>\n",
181
+ " .colab-df-container {\n",
182
+ " display:flex;\n",
183
+ " gap: 12px;\n",
184
+ " }\n",
185
+ "\n",
186
+ " .colab-df-convert {\n",
187
+ " background-color: #E8F0FE;\n",
188
+ " border: none;\n",
189
+ " border-radius: 50%;\n",
190
+ " cursor: pointer;\n",
191
+ " display: none;\n",
192
+ " fill: #1967D2;\n",
193
+ " height: 32px;\n",
194
+ " padding: 0 0 0 0;\n",
195
+ " width: 32px;\n",
196
+ " }\n",
197
+ "\n",
198
+ " .colab-df-convert:hover {\n",
199
+ " background-color: #E2EBFA;\n",
200
+ " box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
201
+ " fill: #174EA6;\n",
202
+ " }\n",
203
+ "\n",
204
+ " .colab-df-buttons div {\n",
205
+ " margin-bottom: 4px;\n",
206
+ " }\n",
207
+ "\n",
208
+ " [theme=dark] .colab-df-convert {\n",
209
+ " background-color: #3B4455;\n",
210
+ " fill: #D2E3FC;\n",
211
+ " }\n",
212
+ "\n",
213
+ " [theme=dark] .colab-df-convert:hover {\n",
214
+ " background-color: #434B5C;\n",
215
+ " box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
216
+ " filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
217
+ " fill: #FFFFFF;\n",
218
+ " }\n",
219
+ " </style>\n",
220
+ "\n",
221
+ " <script>\n",
222
+ " const buttonEl =\n",
223
+ " document.querySelector('#df-6dd34700-e7fc-44b2-ad18-b7f45e54b1ee button.colab-df-convert');\n",
224
+ " buttonEl.style.display =\n",
225
+ " google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
226
+ "\n",
227
+ " async function convertToInteractive(key) {\n",
228
+ " const element = document.querySelector('#df-6dd34700-e7fc-44b2-ad18-b7f45e54b1ee');\n",
229
+ " const dataTable =\n",
230
+ " await google.colab.kernel.invokeFunction('convertToInteractive',\n",
231
+ " [key], {});\n",
232
+ " if (!dataTable) return;\n",
233
+ "\n",
234
+ " const docLinkHtml = 'Like what you see? Visit the ' +\n",
235
+ " '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
236
+ " + ' to learn more about interactive tables.';\n",
237
+ " element.innerHTML = '';\n",
238
+ " dataTable['output_type'] = 'display_data';\n",
239
+ " await google.colab.output.renderOutput(dataTable, element);\n",
240
+ " const docLink = document.createElement('div');\n",
241
+ " docLink.innerHTML = docLinkHtml;\n",
242
+ " element.appendChild(docLink);\n",
243
+ " }\n",
244
+ " </script>\n",
245
+ " </div>\n",
246
+ "\n",
247
+ "\n",
248
+ "<div id=\"df-b15340aa-ec20-4a66-ba97-a0c56d03d18c\">\n",
249
+ " <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-b15340aa-ec20-4a66-ba97-a0c56d03d18c')\"\n",
250
+ " title=\"Suggest charts\"\n",
251
+ " style=\"display:none;\">\n",
252
+ "\n",
253
+ "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
254
+ " width=\"24px\">\n",
255
+ " <g>\n",
256
+ " <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
257
+ " </g>\n",
258
+ "</svg>\n",
259
+ " </button>\n",
260
+ "\n",
261
+ "<style>\n",
262
+ " .colab-df-quickchart {\n",
263
+ " --bg-color: #E8F0FE;\n",
264
+ " --fill-color: #1967D2;\n",
265
+ " --hover-bg-color: #E2EBFA;\n",
266
+ " --hover-fill-color: #174EA6;\n",
267
+ " --disabled-fill-color: #AAA;\n",
268
+ " --disabled-bg-color: #DDD;\n",
269
+ " }\n",
270
+ "\n",
271
+ " [theme=dark] .colab-df-quickchart {\n",
272
+ " --bg-color: #3B4455;\n",
273
+ " --fill-color: #D2E3FC;\n",
274
+ " --hover-bg-color: #434B5C;\n",
275
+ " --hover-fill-color: #FFFFFF;\n",
276
+ " --disabled-bg-color: #3B4455;\n",
277
+ " --disabled-fill-color: #666;\n",
278
+ " }\n",
279
+ "\n",
280
+ " .colab-df-quickchart {\n",
281
+ " background-color: var(--bg-color);\n",
282
+ " border: none;\n",
283
+ " border-radius: 50%;\n",
284
+ " cursor: pointer;\n",
285
+ " display: none;\n",
286
+ " fill: var(--fill-color);\n",
287
+ " height: 32px;\n",
288
+ " padding: 0;\n",
289
+ " width: 32px;\n",
290
+ " }\n",
291
+ "\n",
292
+ " .colab-df-quickchart:hover {\n",
293
+ " background-color: var(--hover-bg-color);\n",
294
+ " box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
295
+ " fill: var(--button-hover-fill-color);\n",
296
+ " }\n",
297
+ "\n",
298
+ " .colab-df-quickchart-complete:disabled,\n",
299
+ " .colab-df-quickchart-complete:disabled:hover {\n",
300
+ " background-color: var(--disabled-bg-color);\n",
301
+ " fill: var(--disabled-fill-color);\n",
302
+ " box-shadow: none;\n",
303
+ " }\n",
304
+ "\n",
305
+ " .colab-df-spinner {\n",
306
+ " border: 2px solid var(--fill-color);\n",
307
+ " border-color: transparent;\n",
308
+ " border-bottom-color: var(--fill-color);\n",
309
+ " animation:\n",
310
+ " spin 1s steps(1) infinite;\n",
311
+ " }\n",
312
+ "\n",
313
+ " @keyframes spin {\n",
314
+ " 0% {\n",
315
+ " border-color: transparent;\n",
316
+ " border-bottom-color: var(--fill-color);\n",
317
+ " border-left-color: var(--fill-color);\n",
318
+ " }\n",
319
+ " 20% {\n",
320
+ " border-color: transparent;\n",
321
+ " border-left-color: var(--fill-color);\n",
322
+ " border-top-color: var(--fill-color);\n",
323
+ " }\n",
324
+ " 30% {\n",
325
+ " border-color: transparent;\n",
326
+ " border-left-color: var(--fill-color);\n",
327
+ " border-top-color: var(--fill-color);\n",
328
+ " border-right-color: var(--fill-color);\n",
329
+ " }\n",
330
+ " 40% {\n",
331
+ " border-color: transparent;\n",
332
+ " border-right-color: var(--fill-color);\n",
333
+ " border-top-color: var(--fill-color);\n",
334
+ " }\n",
335
+ " 60% {\n",
336
+ " border-color: transparent;\n",
337
+ " border-right-color: var(--fill-color);\n",
338
+ " }\n",
339
+ " 80% {\n",
340
+ " border-color: transparent;\n",
341
+ " border-right-color: var(--fill-color);\n",
342
+ " border-bottom-color: var(--fill-color);\n",
343
+ " }\n",
344
+ " 90% {\n",
345
+ " border-color: transparent;\n",
346
+ " border-bottom-color: var(--fill-color);\n",
347
+ " }\n",
348
+ " }\n",
349
+ "</style>\n",
350
+ "\n",
351
+ " <script>\n",
352
+ " async function quickchart(key) {\n",
353
+ " const quickchartButtonEl =\n",
354
+ " document.querySelector('#' + key + ' button');\n",
355
+ " quickchartButtonEl.disabled = true; // To prevent multiple clicks.\n",
356
+ " quickchartButtonEl.classList.add('colab-df-spinner');\n",
357
+ " try {\n",
358
+ " const charts = await google.colab.kernel.invokeFunction(\n",
359
+ " 'suggestCharts', [key], {});\n",
360
+ " } catch (error) {\n",
361
+ " console.error('Error during call to suggestCharts:', error);\n",
362
+ " }\n",
363
+ " quickchartButtonEl.classList.remove('colab-df-spinner');\n",
364
+ " quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
365
+ " }\n",
366
+ " (() => {\n",
367
+ " let quickchartButtonEl =\n",
368
+ " document.querySelector('#df-b15340aa-ec20-4a66-ba97-a0c56d03d18c button');\n",
369
+ " quickchartButtonEl.style.display =\n",
370
+ " google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
371
+ " })();\n",
372
+ " </script>\n",
373
+ "</div>\n",
374
+ " </div>\n",
375
+ " </div>\n"
376
+ ],
377
+ "text/plain": [
378
+ " Kysymys \\\n",
379
+ "0 Kuinka vaihdan uutiskirjeen sähköpostiosoitteen? \n",
380
+ "1 Miksi sähköpostiosoite tulee vahvistaa? \n",
381
+ "\n",
382
+ " Vastaus \n",
383
+ "0 Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\\nTilaa uutiskirje uudelleen oikeaan osoitteeseen. \n",
384
+ "1 Sähköpostiosoitteiden vahvistaminen on yleisesti käytössä oleva tapa varmistua siitä, että henkilöllä itsellään on pääsy hänen tiedoissaan olevaan sähköpostiosoitteeseen.\\n\\nSähköpostiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-käyttäjätilin. Kun lähetät vahvistusviestin omissa tiedoissasi näkyvään sähköpostiosoitteeseen ja vahvistat itse osoitteen oikeaksi sähköpostiisi lähetetyllä vahvistuskoodilla, saamme varmistuksen, että osoitteesi on voimassa ja kuuluu juuri sinulle.\\n\\nJos asiakastiedoissasi olevasta sähköpostiosoitteesta puuttuu vielä vahvistus, näkyy osoitteen yhteydessä Vahvista -painike.\\n\\nBonustiedot ja muut henkilökohtaiset tiedotteet lähetetään vain vahvistettuun sähköpostiosoitteeseen. Bonustilanteesi voit kuitenkin jatkossakin tarkastaa S-mobiilista, S-käyttäjätililtä sekä toimipaikkojen S-Etukorttipäätteiltä. "
385
+ ]
386
+ },
387
+ "execution_count": 23,
388
+ "metadata": {},
389
+ "output_type": "execute_result"
390
+ }
391
+ ],
392
+ "source": [
393
+ "dataset.head()"
394
+ ]
395
+ },
396
+ {
397
+ "cell_type": "code",
398
+ "execution_count": null,
399
+ "id": "d75eeb59",
400
+ "metadata": {
401
+ "id": "d75eeb59"
402
+ },
403
+ "outputs": [],
404
+ "source": [
405
+ "def buildprompt(data):\n",
406
+ " prompt = {}\n",
407
+ " prompt['text'] = \"Given the question delimited by triple backticks ```{\" + data['Kysymys'] + \"}```, what is the answer? Answer: {\" + data['Vastaus'] + \"}\"\n",
408
+ " return prompt"
409
+ ]
410
+ },
411
+ {
412
+ "cell_type": "code",
413
+ "execution_count": null,
414
+ "id": "dfc2d587",
415
+ "metadata": {
416
+ "id": "dfc2d587"
417
+ },
418
+ "outputs": [],
419
+ "source": [
420
+ "dataset['prompt'] = dataset.apply(buildprompt, axis=1)"
421
+ ]
422
+ },
423
+ {
424
+ "cell_type": "code",
425
+ "execution_count": null,
426
+ "id": "25972ccd",
427
+ "metadata": {
428
+ "colab": {
429
+ "base_uri": "https://localhost:8080/"
430
+ },
431
+ "id": "25972ccd",
432
+ "outputId": "e397d17c-1d6b-4ea5-ce44-8e662f0e656b"
433
+ },
434
+ "outputs": [
435
+ {
436
+ "name": "stdout",
437
+ "output_type": "stream",
438
+ "text": [
439
+ "{'text': 'Given the question delimited by triple backticks ```{Miksi sähköpostiosoite tulee vahvistaa?}```, what is the answer? Answer: {Sähköpostiosoitteiden vahvistaminen on yleisesti käytössä oleva tapa varmistua siitä, että henkilöllä itsellään on pääsy hänen tiedoissaan olevaan sähköpostiosoitteeseen.\\n\\nSähköpostiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-käyttäjätilin. Kun lähetät vahvistusviestin omissa tiedoissasi näkyvään sähköpostiosoitteeseen ja vahvistat itse osoitteen oikeaksi sähköpostiisi lähetetyllä vahvistuskoodilla, saamme varmistuksen, että osoitteesi on voimassa ja kuuluu juuri sinulle.\\n\\nJos asiakastiedoissasi olevasta sähköpostiosoitteesta puuttuu vielä vahvistus, näkyy osoitteen yhteydessä Vahvista -painike.\\n\\nBonustiedot ja muut henkilökohtaiset tiedotteet lähetetään vain vahvistettuun sähköpostiosoitteeseen. Bonustilanteesi voit kuitenkin jatkossakin tarkastaa S-mobiilista, S-käyttäjätililtä sekä toimipaikkojen S-Etukorttipäätteiltä.}'}\n"
440
+ ]
441
+ }
442
+ ],
443
+ "source": [
444
+ "print(dataset['prompt'][1])"
445
+ ]
446
+ },
447
+ {
448
+ "cell_type": "markdown",
449
+ "id": "xcO1VF8yEoNE",
450
+ "metadata": {
451
+ "id": "xcO1VF8yEoNE"
452
+ },
453
+ "source": []
454
+ },
455
+ {
456
+ "cell_type": "code",
457
+ "execution_count": null,
458
+ "id": "131b3149",
459
+ "metadata": {
460
+ "id": "131b3149"
461
+ },
462
+ "outputs": [],
463
+ "source": [
464
+ "result = dataset['prompt'].to_list()\n",
465
+ "with open('prompts_2.json', 'w', encoding='utf-8') as outfile:\n",
466
+ " json.dump(result, outfile, ensure_ascii=False)"
467
+ ]
468
+ },
469
+ {
470
+ "cell_type": "code",
471
+ "execution_count": null,
472
+ "id": "91f808cf",
473
+ "metadata": {
474
+ "id": "91f808cf"
475
+ },
476
+ "outputs": [],
477
+ "source": []
478
+ }
479
+ ],
480
+ "metadata": {
481
+ "colab": {
482
+ "provenance": []
483
+ },
484
+ "kernelspec": {
485
+ "display_name": "Python 3 (ipykernel)",
486
+ "language": "python",
487
+ "name": "python3"
488
+ },
489
+ "language_info": {
490
+ "codemirror_mode": {
491
+ "name": "ipython",
492
+ "version": 3
493
+ },
494
+ "file_extension": ".py",
495
+ "mimetype": "text/x-python",
496
+ "name": "python",
497
+ "nbconvert_exporter": "python",
498
+ "pygments_lexer": "ipython3",
499
+ "version": "3.11.7"
500
+ }
501
+ },
502
+ "nbformat": 4,
503
+ "nbformat_minor": 5
504
+ }
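The notebook above turns the Excel Q&A rows into the prompt records written to `prompts_2.json`. A minimal standalone sketch of the same transformation, assuming the `Kysymys`/`Vastaus` column names from the source sheet:

```python
import json

# Q&A rows mirroring the "Kysymys"/"Vastaus" columns of the Excel sheet.
rows = [
    {"Kysymys": "Kuinka vaihdan uutiskirjeen sähköpostiosoitteen?",
     "Vastaus": "Peruuta ensin vanha tilaus ja tilaa uutiskirje uudelleen."},
]

def build_prompt(row):
    # Same template as the notebook's buildprompt(): the question sits inside
    # triple-backtick-delimited braces, the answer inside plain braces.
    return {"text": "Given the question delimited by triple backticks ```{"
                    + row["Kysymys"] + "}```, what is the answer? Answer: {"
                    + row["Vastaus"] + "}"}

prompts = [build_prompt(r) for r in rows]
payload = json.dumps(prompts, ensure_ascii=False)  # same shape as prompts_*.json
print(payload[:40])
```

`ensure_ascii=False` keeps the Finnish characters (ä, ö) readable in the output file instead of escaping them as `\uXXXX` sequences.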
S-Kanava_1.xlsx ADDED
Binary file (9.02 kB). View file
 
S-Kanava_185.xlsx ADDED
Binary file (53.9 kB). View file
 
S-Kanava_2.xlsx ADDED
Binary file (9.48 kB). View file
 
prompts_1.json ADDED
@@ -0,0 +1 @@
+ [{"text": "Given the question delimited by triple backticks ```{Kuinka vaihdan uutiskirjeen sähköpostiosoitteen?}```, what is the answer? Answer: {Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.}"}]
prompts_185.json ADDED
The diff for this file is too large to render. See raw diff
 
prompts_2.json ADDED
@@ -0,0 +1 @@
+ [{"text": "Given the question delimited by triple backticks ```{Kuinka vaihdan uutiskirjeen sähköpostiosoitteen?}```, what is the answer? Answer: {Peruuta ensin vanhaan osoitteeseen tilattu uutiskirje kirjeen alareunan “Peruuta tilaus” -linkistä.\nTilaa uutiskirje uudelleen oikeaan osoitteeseen.}"}, {"text": "Given the question delimited by triple backticks ```{Miksi sähköpostiosoite tulee vahvistaa?}```, what is the answer? Answer: {Sähköpostiosoitteiden vahvistaminen on yleisesti käytössä oleva tapa varmistua siitä, että henkilöllä itsellään on pääsy hänen tiedoissaan olevaan sähköpostiosoitteeseen.\n\nSähköpostiosoite tulee vahvistaa itse, joko S-mobiilissa tai samalla kun luot itsellesi S-käyttäjätilin. Kun lähetät vahvistusviestin omissa tiedoissasi näkyvään sähköpostiosoitteeseen ja vahvistat itse osoitteen oikeaksi sähköpostiisi lähetetyllä vahvistuskoodilla, saamme varmistuksen, että osoitteesi on voimassa ja kuuluu juuri sinulle.\n\nJos asiakastiedoissasi olevasta sähköpostiosoitteesta puuttuu vielä vahvistus, näkyy osoitteen yhteydessä Vahvista -painike.\n\nBonustiedot ja muut henkilökohtaiset tiedotteet lähetetään vain vahvistettuun sähköpostiosoitteeseen. Bonustilanteesi voit kuitenkin jatkossakin tarkastaa S-mobiilista, S-käyttäjätililtä sekä toimipaikkojen S-Etukorttipäätteiltä.}"}]
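Because the prompt template is fixed, the `{"text": ...}` records in these JSON files can be split back into question/answer pairs. A sketch (the `split_prompt` helper is hypothetical, not part of the repo):

```python
import json
import re

# One record in the same shape as the prompts_*.json files above.
sample = ('[{"text": "Given the question delimited by triple backticks '
          '```{Miksi?}```, what is the answer? Answer: {Siksi.}"}]')

def split_prompt(text):
    # Recover the (question, answer) pair from the prompt template:
    # question inside ```{...}```, answer inside the trailing {...}.
    m = re.match(r".*```\{(.*?)\}```.*Answer: \{(.*)\}$", text, flags=re.DOTALL)
    return (m.group(1), m.group(2)) if m else (None, None)

pairs = [split_prompt(r["text"]) for r in json.loads(sample)]
print(pairs[0])  # -> ('Miksi?', 'Siksi.')
```

This round-trip is useful for sanity-checking the generated files before fine-tuning.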
requirements.txt ADDED
@@ -0,0 +1,425 @@
+ absl-py==2.1.0
2
+ accelerate==0.21.0
3
+ aiobotocore==2.7.0
4
+ aiohttp==3.9.1
5
+ aioitertools==0.11.0
6
+ aiosignal==1.3.1
7
+ aiosqlite==0.19.0
8
+ altair==5.2.0
9
+ amazon-codewhisperer-jupyterlab-ext==2.0.1
10
+ amazon_sagemaker_jupyter_scheduler==3.0.6
11
+ annotated-types==0.6.0
12
+ ansi2html==0.0.0
13
+ ansiwrap==0.8.4
14
+ antlr4-python3-runtime==4.9.3
15
+ anyio==3.7.1
16
+ appdirs==1.4.4
17
+ archspec==0.2.2
18
+ argon2-cffi==23.1.0
19
+ argon2-cffi-bindings==21.2.0
20
+ arrow==1.3.0
21
+ astroid==3.0.2
22
+ asttokens==2.4.1
23
+ astunparse==1.6.3
24
+ async-lru==2.0.4
25
+ async-timeout==4.0.3
26
+ attrs==23.2.0
27
+ autogluon==0.8.2
28
+ autogluon.common==0.8.2
29
+ autogluon.core==0.8.2
30
+ autogluon.features==0.8.2
31
+ autogluon.multimodal==0.8.2
32
+ autogluon.tabular==0.8.2
33
+ autogluon.timeseries==0.8.2
34
+ autopep8==2.0.4
35
+ autovizwidget==0.21.0
36
+ aws-embedded-metrics==3.2.0
37
+ aws-glue-sessions==1.0.4
38
+ Babel==2.14.0
39
+ beautifulsoup4==4.12.3
40
+ binaryornot==0.4.4
41
+ black==24.1.1
42
+ bleach==6.1.0
43
+ blinker==1.7.0
44
+ blis==0.7.10
45
+ boltons==23.1.1
46
+ boto3==1.28.64
47
+ botocore==1.31.64
48
+ Brotli==1.0.9
49
+ cached-property==1.5.2
50
+ cachetools==5.3.2
51
+ catalogue==2.0.10
52
+ catboost==1.2.2
53
+ certifi==2023.11.17
54
+ cffi==1.16.0
55
+ chardet==5.2.0
56
+ charset-normalizer==3.3.2
57
+ click==8.1.7
58
+ cloudpathlib==0.16.0
59
+ cloudpickle==2.2.1
60
+ colorama==0.4.6
61
+ comm==0.2.1
62
+ conda==23.11.0
63
+ conda-libmamba-solver==24.1.0
64
+ conda-package-handling==2.2.0
65
+ conda_package_streaming==0.9.0
66
+ confection==0.1.4
67
+ contextlib2==21.6.0
68
+ contourpy==1.2.0
69
+ cookiecutter==2.5.0
70
+ croniter==1.4.1
71
+ cryptography==42.0.2
72
+ cycler==0.12.1
73
+ cymem==2.0.8
74
+ cytoolz==0.12.2
75
+ dash==2.15.0
76
+ dask==2024.1.1
77
+ dataclasses==0.8
78
+ dataclasses-json==0.6.3
79
+ datasets==2.16.1
80
+ debugpy==1.8.0
81
+ decorator==5.1.1
82
+ deepmerge==1.1.1
83
+ defusedxml==0.7.1
84
+ dill==0.3.7
85
+ distributed==2024.1.1
86
+ distro==1.9.0
87
+ docker-pycreds==0.4.0
88
+ docstring-to-markdown==0.13
89
+ entrypoints==0.4
90
+ et-xmlfile==1.1.0
91
+ evaluate==0.4.1
92
+ exceptiongroup==1.2.0
93
+ executing==2.0.1
94
+ faiss==1.7.4
95
+ fastai==2.7.14
96
+ fastapi==0.103.2
97
+ fastcore==1.5.29
98
+ fastdownload==0.0.7
99
+ fastjsonschema==2.19.1
100
+ fastprogress==1.0.3
101
+ filelock==3.13.1
102
+ flake8==6.1.0
103
+ Flask==3.0.1
104
+ flatbuffers==23.5.26
105
+ fonttools==4.47.2
106
+ fqdn==1.5.1
107
+ frozenlist==1.4.1
108
+ fsspec==2023.6.0
109
+ future==0.18.3
110
+ gast==0.4.0
111
+ gdown==5.0.1
112
+ gitdb==4.0.11
113
+ GitPython==3.1.41
114
+ gluonts==0.13.7
115
+ gmpy2==2.1.2
116
+ google-api-core==2.16.1
117
+ google-auth==2.27.0
118
+ google-auth-oauthlib==1.0.0
119
+ google-pasta==0.2.0
120
+ googleapis-common-protos==1.62.0
121
+ graphviz==0.20.1
122
+ greenlet==3.0.1
123
+ grpcio==1.54.3
124
+ gssapi==1.8.3
125
+ h11==0.14.0
126
+ h5py==3.10.0
127
+ hdijupyterutils==0.21.0
128
+ huggingface_hub==0.20.2
129
+ idna==3.6
130
+ imagecodecs==2023.8.12
131
+ imageio==2.33.1
132
+ importlib-metadata==6.10.0
133
+ importlib-resources==6.1.1
134
+ iniconfig==2.0.0
135
+ ipykernel==6.29.0
136
+ ipython==8.20.0
137
+ ipywidgets==8.1.1
138
+ isoduration==20.11.0
139
+ isort==5.13.2
140
+ itsdangerous==2.1.2
141
+ jax==0.4.20
142
+ jaxlib==0.4.14
143
+ jedi==0.19.1
144
+ Jinja2==3.1.3
145
+ jmespath==1.0.1
146
+ joblib==1.3.2
147
+ json5==0.9.14
148
+ jsonpatch==1.33
149
+ jsonpath-ng==1.6.1
150
+ jsonpointer==2.4
151
+ jsonschema==4.17.3
152
+ jupyter==1.0.0
153
+ jupyter_ai==2.8.1
154
+ jupyter_ai_magics==2.8.1
155
+ jupyter_client==8.6.0
156
+ jupyter-console==6.6.3
157
+ jupyter_core==5.7.1
158
+ jupyter-dash==0.4.2
159
+ jupyter-events==0.6.3
160
+ jupyter-lsp==2.2.2
161
+ jupyter_scheduler==2.4.0
162
+ jupyter_server==2.10.0
163
+ jupyter-server-mathjax==0.2.6
164
+ jupyter_server_proxy==4.1.0
165
+ jupyter_server_terminals==0.5.2
166
+ jupyterlab==4.0.12
167
+ jupyterlab_git==0.50.0
168
+ jupyterlab-lsp==5.0.2
169
+ jupyterlab_pygments==0.3.0
170
+ jupyterlab_server==2.24.0
171
+ jupyterlab-widgets==3.0.9
172
+ keras==2.12.0
173
+ Keras-Preprocessing==1.1.2
174
+ kiwisolver==1.4.5
175
+ krb5==0.5.1
176
+ langchain==0.0.350
177
+ langchain-community==0.0.7
178
+ langchain-core==0.1.3
179
+ langcodes==3.3.0
180
+ langsmith==0.0.85
181
+ libmambapy==1.5.6
182
+ lightgbm==3.3.5
183
+ lightning-utilities==0.10.1
184
+ llvmlite==0.41.1
185
+ locket==1.0.0
186
+ Markdown==3.5.2
187
+ markdown-it-py==3.0.0
188
+ MarkupSafe==2.1.4
189
+ marshmallow==3.20.2
190
+ matplotlib==3.8.2
191
+ matplotlib-inline==0.1.6
192
+ mccabe==0.7.0
193
+ mdurl==0.1.2
194
+ menuinst==2.0.2
195
+ mistune==3.0.2
196
+ ml-dtypes==0.3.2
197
+ mlforecast==0.7.3
198
+ mock==5.1.0
199
+ model-index==0.1.11
200
+ mpmath==1.3.0
201
+ msgpack==1.0.7
202
+ multidict==6.0.4
203
+ multiprocess==0.70.15
204
+ munkres==1.1.4
205
+ murmurhash==1.0.10
206
+ mypy-extensions==1.0.0
207
+ nbclient==0.8.0
208
+ nbconvert==7.14.2
209
+ nbdime==4.0.1
210
+ nbformat==5.9.2
211
+ nest_asyncio==1.6.0
212
+ networkx==3.2.1
213
+ nlpaug==1.1.11
214
+ nltk==3.8.1
215
+ nose==1.3.7
216
+ notebook==7.0.7
217
+ notebook_shim==0.2.3
218
+ nptyping==2.4.1
219
+ numba==0.58.1
220
+ numexpr==2.8.7
221
+ numpy==1.26.3
222
+ oauthlib==3.2.2
+ omegaconf==2.2.3
+ openai==0.28.1
+ openapi-schema-pydantic==1.2.4
+ openmim==0.3.7
+ openpyxl==3.1.2
+ opt-einsum==3.3.0
+ ordered-set==4.1.0
+ overrides==7.7.0
+ packaging==23.2
+ pandas==2.1.4
+ pandas-stubs==2.1.4.231227
+ pandocfilters==1.5.0
+ papermill==2.4.0
+ parso==0.8.3
+ partd==1.4.1
+ pathos==0.3.2
+ pathspec==0.12.1
+ pathtools==0.1.2
+ pathy==0.10.2
+ patsy==0.5.6
+ peft==0.9.0
+ pexpect==4.9.0
+ pickleshare==0.7.5
+ Pillow==9.5.0
+ pip==23.3.2
+ pkgutil_resolve_name==1.3.10
+ platformdirs==4.2.0
+ plotly==5.18.0
+ pluggy==1.4.0
+ ply==3.11
+ pox==0.3.4
+ ppft==1.7.6.8
+ preshed==3.0.9
+ prometheus-client==0.19.0
+ prompt-toolkit==3.0.42
+ protobuf==4.21.12
+ psutil==5.9.8
+ ptyprocess==0.7.0
+ pure-eval==0.2.2
+ pure-sasl==0.6.2
+ pyarrow==12.0.1
+ pyarrow-hotfix==0.6
+ pyasn1==0.5.1
+ pyasn1-modules==0.3.0
+ pycodestyle==2.11.1
+ pycosat==0.6.6
+ pycparser==2.21
+ pydantic==1.10.13
+ pydantic_core==2.14.3
+ pydocstyle==6.3.0
+ pyflakes==3.1.0
+ Pygments==2.17.2
+ PyHive==0.7.0
+ PyJWT==2.8.0
+ pylint==3.0.3
+ pyOpenSSL==24.0.0
+ pyparsing==3.1.1
+ PyQt5==5.15.9
+ PyQt5-sip==12.12.2
+ pyrsistent==0.20.0
+ PySocks==1.7.1
+ pyspnego==0.9.1
+ pytesseract==0.3.10
+ pytest==8.0.0
+ pytest-subtests==0.11.0
+ python-dateutil==2.8.2
+ python-json-logger==2.0.7
+ python-lsp-jsonrpc==1.1.2
+ python-lsp-server==1.9.0
+ python-slugify==8.0.3
+ pytoolconfig==1.2.5
+ pytorch-lightning==2.0.9
+ pytorch-metric-learning==1.7.3
+ pytz==2023.3
+ pyu2f==0.1.5
+ PyWavelets==1.4.1
+ PyYAML==6.0.1
+ pyzmq==25.1.2
+ qtconsole==5.5.1
+ QtPy==2.4.1
+ regex==2023.12.25
+ requests==2.31.0
+ requests-kerberos==0.14.0
+ requests-oauthlib==1.3.1
+ responses==0.18.0
+ retrying==1.3.3
+ rfc3339-validator==0.1.4
+ rfc3986-validator==0.1.1
+ rich==13.7.0
+ rope==1.12.0
+ rsa==4.9
+ ruamel.yaml==0.18.5
+ ruamel.yaml.clib==0.2.7
+ s3transfer==0.7.0
+ sacremoses==0.0.53
+ safetensors==0.3.3
+ sagemaker==2.198.1
+ sagemaker-headless-execution-driver==0.0.12
+ sagemaker-jupyterlab-emr-extension==0.1.9
+ sagemaker-jupyterlab-extension==0.2.0
+ sagemaker-jupyterlab-extension-common==0.1.9
+ sagemaker-kernel-wrapper==0.0.2
+ sagemaker-studio-analytics-extension==0.0.21
+ sagemaker-studio-sparkmagic-lib==0.1.4
+ sasl==0.3.1
+ schema==0.7.5
+ scikit-image==0.19.3
+ scikit-learn==1.3.2
+ SciPy==1.11.4
+ Send2Trash==1.8.2
+ sentencepiece==0.1.99
+ sentry-sdk==1.40.0
+ seqeval==1.2.2
+ setproctitle==1.3.3
+ setuptools==69.0.3
+ shellingham==1.5.4
+ simpervisor==1.0.0
+ sip==6.7.12
+ six==1.16.0
+ smart-open==5.2.1
+ smdebug-rulesconfig==1.0.1
+ smmap==5.0.0
+ sniffio==1.3.0
+ snowballstemmer==2.2.0
+ sortedcontainers==2.4.0
+ soupsieve==2.5
+ spacy==3.7.2
+ spacy-legacy==3.0.12
+ spacy-loggers==1.0.5
+ sparkmagic==0.21.0
+ SQLAlchemy==1.4.49
+ srsly==2.4.8
+ stack-data==0.6.2
+ starlette==0.27.0
+ statsforecast==1.4.0
+ statsmodels==0.14.1
+ supervisor==4.2.5
+ sympy==1.12
+ tabulate==0.9.0
+ tblib==1.7.0
+ tenacity==8.2.3
+ tensorboard==2.12.3
+ tensorboard-data-server==0.7.0
+ tensorflow==2.12.1
+ tensorflow-estimator==2.12.0
+ termcolor==2.4.0
+ terminado==0.18.0
+ text-unidecode==1.3
+ textwrap3==0.9.2
+ thinc==8.2.2
+ threadpoolctl==3.2.0
+ thrift==0.19.0
+ thrift-sasl==0.4.3
+ tifffile==2024.1.30
+ tiktoken==0.5.2
+ timm==0.9.12
+ tinycss2==1.2.1
+ tokenizers==0.13.3
+ toml==0.10.2
+ tomli==2.0.1
+ tomlkit==0.12.3
+ toolz==0.12.1
+ torch==2.0.0.post101
+ torchmetrics==1.0.3
+ torchvision==0.15.2a0+072ec57
+ tornado==6.3.3
+ tqdm==4.66.1
+ traitlets==5.14.1
+ transformers==4.31.0
+ truststore==0.8.0
+ typer==0.9.0
+ types-python-dateutil==2.8.19.20240106
+ types-pytz==2023.4.0.20240130
+ typing_extensions==4.5.0
+ typing-inspect==0.9.0
+ typing-utils==0.1.0
+ typish==1.9.3
+ tzdata==2023.4
+ ujson==5.9.0
+ unicodedata2==15.1.0
+ uri-template==1.3.0
+ urllib3==1.26.18
+ uvicorn==0.25.0
+ wandb==0.16.2
+ wasabi==1.1.2
+ wcwidth==0.2.13
+ weasel==0.3.4
+ webcolors==1.13
+ webencodings==0.5.1
+ websocket-client==1.7.0
+ Werkzeug==3.0.1
+ whatthepatch==1.0.5
+ wheel==0.42.0
+ widgetsnbextension==4.0.9
+ window-ops==0.0.14
+ wrapt==1.16.0
+ xgboost==1.7.6
+ xxhash==3.4.1
+ yapf==0.40.1
+ yarl==1.9.4
+ zict==3.0.0
+ zipp==3.17.0
+ zstandard==0.22.0