alonzogarbanzo committed
Update README.md
results: []
---

# Bloom-1b7-creative-writing-IT

This model is a fine-tuned version of [bigscience/bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) on a creative-writing (short story) dataset:

https://huggingface.co/datasets/adambjorn/UnrelatedForgettingOverhead/viewer/creative
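
As a quick usage sketch, the checkpoint can be loaded with the standard `transformers` causal-LM classes. The repo id below is an assumption based on this card's author and title, and the prompt format mirrors the instruction-tuning format described under Training procedure:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id (author/model-name); adjust to wherever the weights are hosted.
model_id = "alonzogarbanzo/Bloom-1b7-creative-writing-IT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Prompt built the same way as the training examples below: prompt + title + "</s>" + "Story: "
prompt = "Write a creative short story based on the following title: The Last Lighthouse</s>Story: "

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```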

## Model description

More information needed

## Training and evaluation data

The training and evaluation data are available here: https://huggingface.co/datasets/adambjorn/UnrelatedForgettingOverhead/viewer/creative
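
For reference, a minimal way to load that data with the `datasets` library; the `creative` config name is inferred from the viewer URL above, so verify it against the dataset card:

```python
from datasets import load_dataset

# Config name inferred from the dataset viewer URL; the split name is an assumption.
ds = load_dataset("adambjorn/UnrelatedForgettingOverhead", "creative")
print(ds)  # inspect the available splits
print(ds["train"][0])  # examples should include 'title' and 'selftext' fields
```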

## Training procedure

The model was instruction-tuned on the dataset as follows.

Given the set of prompts:

```python
prompts = [
    "Write a creative short story based on the following title:",
    "Here is a title for a story. Craft a short narrative around it:",
    "Using the title given, develop a short story:",
    "Imagine a short story that starts with this title:",
    "Create a brief story with the following title:",
]
```

each training example is generated by concatenating a randomly chosen prompt with the example's `title` and `selftext` fields:

```python
import random

concatenated_texts = [
    random.choice(prompts) + " " + title + "</s>" + "Story: " + selftext
    for title, selftext in zip(titles, selftexts)
]
```
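
To make the format concrete, here is what one training example looks like, using an illustrative (made-up) title and selftext:

```python
# Illustrative values; real 'title'/'selftext' pairs come from the dataset above.
title = "The Clockmaker's Last Gift"
selftext = "The old clockmaker wound the final spring and waited..."

example = random.choice(prompts) + " " + title + "</s>" + "Story: " + selftext
# e.g. "Write a creative short story based on the following title: The Clockmaker's Last Gift</s>Story: The old clockmaker wound the final spring and waited..."
```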

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

Final reported loss: {'loss': 0.0135, 'grad_norm': 0.6041152477264404, 'learning_rate': 7.446808510638299e-07, 'epoch': 9.89}

Averages over the tuning run: {'train_runtime': 1111.4187, 'train_samples_per_second': 1.71, 'train_steps_per_second': 0.423, 'train_loss': 0.4682149670225509, 'epoch': 9.89}

### Framework versions