Commit c6099d8 (verified), committed by PseudoTerminal X
1 Parent(s): 0370680

Trained for 1 epoch and 49000 steps.


Trained with datasets ['text-embeds-pixart-filter', 'photo-concept-bucket', 'midjourney-v6-520k-raw', 'sfwbooru', 'nijijourney-v6-520k-raw', 'dalle3']
Learning rate 1e-06, batch size 24, and 1 gradient accumulation steps.
Used DDPM noise scheduler for training with epsilon prediction type and rescaled_betas_zero_snr=False
Using 'linspace' timestep spacing.
Base model: ptx0/pixart-900m-1024-ft-large
VAE: madebyollin/sdxl-vae-fp16-fix
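
The scheduler settings above (DDPM, epsilon prediction, no zero-SNR rescaling) can be illustrated with a small standard-library sketch. The beta range, the linear schedule, and the 1000-step count are assumptions chosen for illustration; the commit message does not state them, and this is not the trainer's own code.

```python
import math

# Assumed values: the commit does not state the beta schedule or step count.
NUM_TRAIN_TIMESTEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02


def linear_betas(n=NUM_TRAIN_TIMESTEPS, start=BETA_START, end=BETA_END):
    """Linearly spaced betas (an assumed schedule, for illustration only)."""
    return [start + (end - start) * i / (n - 1) for i in range(n)]


def alphas_cumprod(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, eps, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def epsilon_loss(eps_pred, eps):
    """With epsilon prediction, the regression target is the added noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)


abar = alphas_cumprod(linear_betas())
# Without zero-SNR rescaling, the last timestep keeps a small amount of signal.
terminal_snr = abar[-1] / (1.0 - abar[-1])
```

Under these assumed betas `terminal_snr` is small but strictly positive, which is the situation that `rescaled_betas_zero_snr=False` leaves in place.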

README.md CHANGED
@@ -47,7 +47,7 @@ You may reuse the base model text encoder for inference.
  ## Training settings
  
  - Training epochs: 1
- - Training steps: 48000
+ - Training steps: 49000
  - Learning rate: 1e-06
  - Effective batch size: 192
  - Micro-batch size: 24
@@ -64,180 +64,44 @@ You may reuse the base model text encoder for inference.
  
  ### photo-concept-bucket
  - Repeats: 0
- - Total number of images: ~564672
- - Total number of aspect buckets: 34
- - Resolution: 1.0 megapixels
- - Cropped: False
- - Crop style: None
- - Crop aspect: None
- ### moviecollection
- - Repeats: 15
- - Total number of images: ~768
- - Total number of aspect buckets: 11
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### experimental
- - Repeats: 0
- - Total number of images: ~1728
- - Total number of aspect buckets: 11
+ - Total number of images: ~567360
+ - Total number of aspect buckets: 4
  - Resolution: 1.0 megapixels
  - Cropped: True
  - Crop style: random
  - Crop aspect: random
- ### ethnic
- - Repeats: 0
- - Total number of images: ~1152
- - Total number of aspect buckets: 7
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### sports
+ ### midjourney-v6-520k-raw
  - Repeats: 0
- - Total number of images: ~576
+ - Total number of images: ~390912
  - Total number of aspect buckets: 1
  - Resolution: 1.0 megapixels
  - Cropped: True
  - Crop style: random
  - Crop aspect: square
- ### architecture
+ ### sfwbooru
  - Repeats: 0
- - Total number of images: ~4224
+ - Total number of images: ~233664
  - Total number of aspect buckets: 1
  - Resolution: 1.0 megapixels
  - Cropped: True
  - Crop style: random
  - Crop aspect: square
- ### shutterstock
- - Repeats: 0
- - Total number of images: ~14016
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### cinemamix-1mp
- - Repeats: 0
- - Total number of images: ~7296
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### nsfw-1024
- - Repeats: 0
- - Total number of images: ~10368
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### anatomy
- - Repeats: 5
- - Total number of images: ~15168
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### bg20k-1024
- - Repeats: 0
- - Total number of images: ~89088
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### yoga
- - Repeats: 0
- - Total number of images: ~2880
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### photo-aesthetics
+ ### nijijourney-v6-520k-raw
  - Repeats: 0
- - Total number of images: ~28608
- - Total number of aspect buckets: 17
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### text-1mp
- - Repeats: 125
- - Total number of images: ~12864
- - Total number of aspect buckets: 3
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### movieposters
- - Repeats: 10
- - Total number of images: ~192
+ - Total number of images: ~416064
  - Total number of aspect buckets: 1
  - Resolution: 1.0 megapixels
  - Cropped: True
  - Crop style: random
  - Crop aspect: square
- ### normalnudes
- - Repeats: 10
- - Total number of images: ~384
- - Total number of aspect buckets: 8
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### pixel-art
- - Repeats: 0
- - Total number of images: ~384
- - Total number of aspect buckets: 11
- - Resolution: 1.0 megapixels
- - Cropped: True
- - Crop style: random
- - Crop aspect: random
- ### signs
+ ### dalle3
  - Repeats: 0
- - Total number of images: ~384
+ - Total number of images: ~1889680
  - Total number of aspect buckets: 1
  - Resolution: 1.0 megapixels
  - Cropped: True
  - Crop style: random
  - Crop aspect: square
- ### midjourney-v6-520k-raw
- - Repeats: 0
- - Total number of images: ~513792
- - Total number of aspect buckets: 58
- - Resolution: 1.0 megapixels
- - Cropped: False
- - Crop style: None
- - Crop aspect: None
- ### sfwbooru
- - Repeats: 0
- - Total number of images: ~271488
- - Total number of aspect buckets: 73
- - Resolution: 1.0 megapixels
- - Cropped: False
- - Crop style: None
- - Crop aspect: None
- ### nijijourney-v6-520k-raw
- - Repeats: 0
- - Total number of images: ~516288
- - Total number of aspect buckets: 48
- - Resolution: 1.0 megapixels
- - Cropped: False
- - Crop style: None
- - Crop aspect: None
- ### dalle3
- - Repeats: 0
- - Total number of images: ~1119168
- - Total number of aspect buckets: 31
- - Resolution: 1.0 megapixels
- - Cropped: False
- - Crop style: None
- - Crop aspect: None
  
  
  ## Inference
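
The batch figures in the README diff can be cross-checked with a little arithmetic. This sketch assumes the common convention that effective batch size equals micro-batch size times gradient accumulation steps times the number of data-parallel workers; the commit does not state the worker count, so the value computed here is an inference, and both helper names are hypothetical.

```python
def data_parallel_workers(effective_batch, micro_batch, grad_accum_steps):
    """Infer the worker count, assuming effective = micro * accum * workers."""
    per_worker = micro_batch * grad_accum_steps
    if effective_batch % per_worker:
        raise ValueError("effective batch is not a multiple of micro_batch * grad_accum_steps")
    return effective_batch // per_worker


def samples_seen(steps, effective_batch):
    """Total samples drawn after `steps` optimizer steps."""
    return steps * effective_batch


# Values taken from the commit message and the README training settings.
workers = data_parallel_workers(effective_batch=192, micro_batch=24, grad_accum_steps=1)
total = samples_seen(steps=49000, effective_batch=192)
print(workers, total)
```

Under that assumption the numbers are consistent with eight data-parallel workers.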
optimizer.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5d0be51ff8aeeb782101c46a02642918315b9a58e73c71bad06e7747fe99f4e6
+ oid sha256:31436e08edb8a07440cce2917fa89004f961ab7231c1a3661bbc223235e74181
  size 5451415117
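
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough sketch, a downloaded file could be checked against its pointer as follows. The function names are hypothetical, and the in-memory stream at the end merely stands in for a real checkpoint file opened in binary mode.

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches_pointer(stream, pointer, chunk_size=1 << 20):
    """Hash a binary stream in chunks and compare size and digest to the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 object IDs are handled in this sketch")
    digest, size = hashlib.sha256(), 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    return size == pointer["size"] and digest.hexdigest() == pointer["digest"]


# New pointer for optimizer.bin, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:31436e08edb8a07440cce2917fa89004f961ab7231c1a3661bbc223235e74181
size 5451415117
"""
pointer = parse_lfs_pointer(POINTER)

# Tiny stand-in payload; a real check would pass open(path, "rb") instead.
payload = b"hello"
demo_pointer = {"algo": "sha256", "digest": hashlib.sha256(payload).hexdigest(), "size": len(payload)}
demo_ok = stream_matches_pointer(io.BytesIO(payload), demo_pointer)
```

Reading in chunks keeps memory use flat, which matters for multi-gigabyte files such as `optimizer.bin`.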
random_states_0.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7261614624f985fb7151c899f96dd187df51cc6d2b2e13ba3b8d4024cb0b8dfe
- size 16100
+ oid sha256:642a92b3aa2cf8b8d4ae5f316f2a801766499a1de1c2814d9fb38d3570e395be
+ size 16036
scheduler.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9c60b0e4b960d5252fe080b28e96bee2bdd8a054c9be9d237c03574578d18016
+ oid sha256:24151575a07d0cb2bff32d0adaa4632b4d7035e26a5f5d7418148e7a8289bc63
  size 1000
training_state-dalle3.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c3f887bfbe58f353f8efdfc6cc9f2e7a4f493831c0ad768a1461c04fa0d48c36
- size 16471566
+ oid sha256:021479396c8e26c8703c6d3289b88f6d27f60c50d9378d9f6b233c21ef6eb3e3
+ size 18150691
training_state-midjourney-v6-520k-raw.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fce901404fbd7e24ea7299ad8e02d1966388f9511a12f55e303f8b86937617df
- size 6230950
+ oid sha256:51365d239748bf1c9254199ae86ec71707e258e32f6b502e9462265a4dbe26b2
+ size 4739976
training_state-nijijourney-v6-520k-raw.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b9cb4113c645a0329ef20e27ba715c174682d18e277eda12a8abf2da776169c5
- size 6712778
+ oid sha256:51b8582efb189da781d1653f4b91e2e883731dc06c5b574d04473df0c0e55e72
+ size 5409000
training_state-photo-concept-bucket.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:84f2a5908ae617088ef63fda479dd0d59e55a8a10ba5ba303b238f3ebf07b1fb
- size 5270267
+ oid sha256:6e9cbc69396a3a3bbc39cf06195e833c68ecab342f53495c6a8b5b22fa5b2982
+ size 5295188
training_state-sfwbooru.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_state.json CHANGED
@@ -1 +1 @@
- {"global_step": 48000, "epoch_step": 1, "epoch": 2, "exhausted_backends": ["pixel-art", "signs", "sports", "ethnic", "experimental", "movieposters", "normalnudes", "yoga", "cinemamix-1mp", "architecture", "moviecollection", "shutterstock", "nsfw-1024", "photo-aesthetics", "bg20k-1024", "anatomy", "sfwbooru", "nijijourney-v6-520k-raw", "midjourney-v6-520k-raw", "photo-concept-bucket"], "repeats": {"bookcovers": 0, "signs": 0, "normalnudes": 0, "nijijourney": 0, "movieposters": 0, "celebrities": 0, "pixel-art": 0, "propagandaposters": 0, "sports": 0, "moviecollection": 0, "gay": 0, "experimental": 0, "yoga": 0, "ethnic": 0, "cinemamix-1mp": 0, "architecture": 0, "mj-60": 0, "text-1mp": 65, "shutterstock": 0, "nsfw-1024": 0, "photo-aesthetics": 0, "anatomy": 0, "bg20k-1024": 0, "sfwbooru": 0, "midjourney-v6-520k-raw": 0, "nijijourney-v6-520k-raw": 0, "photo-concept-bucket": 0, "dalle3": 0}}
+ {"global_step": 49000, "epoch_step": 1, "epoch": 2, "exhausted_backends": ["pixel-art", "signs", "sports", "ethnic", "experimental", "movieposters", "normalnudes", "yoga", "cinemamix-1mp", "architecture", "moviecollection", "shutterstock", "nsfw-1024", "photo-aesthetics", "bg20k-1024", "anatomy", "sfwbooru", "nijijourney-v6-520k-raw", "midjourney-v6-520k-raw", "photo-concept-bucket"], "repeats": {"bookcovers": 0, "signs": 0, "normalnudes": 0, "nijijourney": 0, "movieposters": 0, "celebrities": 0, "pixel-art": 0, "propagandaposters": 0, "sports": 0, "moviecollection": 0, "gay": 0, "experimental": 0, "yoga": 0, "ethnic": 0, "cinemamix-1mp": 0, "architecture": 0, "mj-60": 0, "text-1mp": 65, "shutterstock": 0, "nsfw-1024": 0, "photo-aesthetics": 0, "anatomy": 0, "bg20k-1024": 0, "sfwbooru": 0, "midjourney-v6-520k-raw": 0, "nijijourney-v6-520k-raw": 0, "photo-concept-bucket": 0, "dalle3": 0}}
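
The state file is plain JSON, so it can be inspected with the standard library. The sketch below uses a trimmed sample of the new state and a hypothetical `summarize` helper; reading `exhausted_backends` as "datasets already fully consumed" is an assumption based on the field name, not something the commit states.

```python
import json

# Trimmed sample of the new training_state.json; the real file lists more backends.
STATE_JSON = """
{"global_step": 49000, "epoch_step": 1, "epoch": 2,
 "exhausted_backends": ["sfwbooru", "nijijourney-v6-520k-raw",
                        "midjourney-v6-520k-raw", "photo-concept-bucket"],
 "repeats": {"text-1mp": 65, "sfwbooru": 0, "midjourney-v6-520k-raw": 0,
             "nijijourney-v6-520k-raw": 0, "photo-concept-bucket": 0, "dalle3": 0}}
"""

# The five image datasets listed in the updated README.
ACTIVE = [
    "photo-concept-bucket",
    "midjourney-v6-520k-raw",
    "sfwbooru",
    "nijijourney-v6-520k-raw",
    "dalle3",
]


def summarize(state, active):
    """Report the step, active datasets not marked exhausted, and nonzero repeat counters."""
    exhausted = set(state["exhausted_backends"])
    return {
        "step": state["global_step"],
        "remaining": sorted(name for name in active if name not in exhausted),
        "nonzero_repeats": {k: v for k, v in state["repeats"].items() if v},
    }


summary = summarize(json.loads(STATE_JSON), ACTIVE)
print(summary)
```

On this sample, `dalle3` is the only active dataset not listed as exhausted, and `text-1mp` is the only entry with a nonzero repeat counter.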
transformer/config.json CHANGED
@@ -1,7 +1,7 @@
  {
  "_class_name": "PixArtTransformer2DModel",
  "_diffusers_version": "0.30.0.dev0",
- "_name_or_path": "/home/ubuntu/training/models/checkpoint-46000",
+ "_name_or_path": "/home/ubuntu/training/models/checkpoint-48000",
  "activation_fn": "gelu-approximate",
  "attention_bias": true,
  "attention_head_dim": 72,
transformer/diffusion_pytorch_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a47ee0ec97a4701d87142fd35a7984efb5e8a1e1fb940844b21e10fb366670cc
+ oid sha256:cccac28f9d23032a0cfafe5cafb6f80a3ac40496ba900554ac200818f976ae95
  size 1816969728