INFO:nncf:Ignored adding weight sparsifier in scope: BertForSequenceClassification/BertModel[bert]/BertPooler[pooler]/NNCFLinear[dense]/linear_0
INFO:nncf:Ignored adding weight sparsifier in scope: BertForSequenceClassification/NNCFLinear[classifier]/linear_0
INFO:nncf:Not adding activation input quantizer for operation: 6 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 9 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 23 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 26 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 32 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 33 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 38 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 39 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 52 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 55 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 61 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 62 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 67 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 68 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 81 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 84 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 90 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 91 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 96 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 97 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 110 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 113 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 119 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 120 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 125 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 126 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 139 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 142 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 148 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 149 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 154 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 155 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 168 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 171 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 177 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 178 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 183 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 184 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 197 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 200 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 206 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 207 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 212 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 213 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[6]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 226 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 229 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 235 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 236 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 241 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 242 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[7]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 255 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 258 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 264 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 265 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 270 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 271 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[8]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 284 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 287 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 293 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 294 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 299 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 300 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[9]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 313 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 316 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 322 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 323 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 328 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 329 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[10]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 342 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertAttention[attention]/BertSelfAttention[self]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 345 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 351 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertAttention[attention]/BertSelfOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 352 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 357 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertOutput[output]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 358 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[11]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Collecting tensor statistics |████████████████| 1 / 1
INFO:nncf:BatchNorm statistics adaptation |██              | 1 / 7
INFO:nncf:BatchNorm statistics adaptation |████            | 2 / 7
INFO:nncf:BatchNorm statistics adaptation |██████          | 3 / 7
INFO:nncf:BatchNorm statistics adaptation |█████████       | 4 / 7
INFO:nncf:BatchNorm statistics adaptation |███████████     | 5 / 7
INFO:nncf:BatchNorm statistics adaptation |█████████████   | 6 / 7
INFO:nncf:BatchNorm statistics adaptation |████████████████| 7 / 7
WARNING:nncf:Number of potential building blocks is too much. The processing time can be high. Shallow the accepted range for the length of building blocks via max_block_size and min_block_size to accelerate the search process.
INFO:nncf:Movement sparsity scheduler updates importance threshold and regularizationfactor per optimizer step, but steps_per_epoch was not set in config. Will measure the actual steps per epoch as signaled by a .epoch_step() call.
INFO:nncf:Statistics of the sparsified model:
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 ||            Statistic's name             | Value |
Epoch 0 |+=========================================+=======+
Epoch 0 || Sparsity level of the whole model       | 0.000 |
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 || Sparsity level of all sparsified layers | 0     |
Epoch 0 |+-----------------------------------------+-------+
Epoch 0 |
Epoch 0 |Statistics by sparsified layers:
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 ||     Layer's name     | Weight's shape | Sparsity level | Weight's percentage |
Epoch 0 |+======================+================+================+=====================+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[0]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[1]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[2]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[3]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[4]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[5]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[6]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[7]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[8]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[qu |                |                |                     |
Epoch 0 || ery]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0          |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[ke |                |                |                     |
Epoch 0 || y]/linear_0/bias     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfAttention |                |                |                     |
Epoch 0 || [self]/NNCFLinear[va |                |                |                     |
Epoch 0 || lue]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtAttention[attentio |                |                |                     |
Epoch 0 || n]/BertSelfOutput[ou |                |                |                     |
Epoch 0 || tput]/NNCFLinear[den |                |                |                     |
Epoch 0 || se]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0      |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtIntermediate[inter |                |                |                     |
Epoch 0 || mediate]/NNCFLinear[ |                |                |                     |
Epoch 0 || dense]/linear_0/bias |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0                  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[9]/Be |                |                |                     |
Epoch 0 || rtOutput[output]/NNC |                |                |                     |
Epoch 0 || FLinear[dense]/linea |                |                |                     |
Epoch 0 || r_0/bias             |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[q |                |                |                     |
Epoch 0 || uery]/linear_0       |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[q |                |                |                     |
Epoch 0 || uery]/linear_0/bias  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[k |                |                |                     |
Epoch 0 || ey]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[k |                |                |                     |
Epoch 0 || ey]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[v |                |                |                     |
Epoch 0 || alue]/linear_0       |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[v |                |                |                     |
Epoch 0 || alue]/linear_0/bias  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfOutput[o |                |                |                     |
Epoch 0 || utput]/NNCFLinear[de |                |                |                     |
Epoch 0 || nse]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfOutput[o |                |                |                     |
Epoch 0 || utput]/NNCFLinear[de |                |                |                     |
Epoch 0 || nse]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertIntermediate[inte |                |                |                     |
Epoch 0 || rmediate]/NNCFLinear |                |                |                     |
Epoch 0 || [dense]/linear_0     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertIntermediate[inte |                |                |                     |
Epoch 0 || rmediate]/NNCFLinear |                |                |                     |
Epoch 0 || [dense]/linear_0/bia |                |                |                     |
Epoch 0 || s                    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertOutput[output]/NN |                |                |                     |
Epoch 0 || CFLinear[dense]/line |                |                |                     |
Epoch 0 || ar_0                 |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[10]/B |                |                |                     |
Epoch 0 || ertOutput[output]/NN |                |                |                     |
Epoch 0 || CFLinear[dense]/line |                |                |                     |
Epoch 0 || ar_0/bias            |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[q |                |                |                     |
Epoch 0 || uery]/linear_0       |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[q |                |                |                     |
Epoch 0 || uery]/linear_0/bias  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[k |                |                |                     |
Epoch 0 || ey]/linear_0         |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[k |                |                |                     |
Epoch 0 || ey]/linear_0/bias    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[v |                |                |                     |
Epoch 0 || alue]/linear_0       |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfAttentio |                |                |                     |
Epoch 0 || n[self]/NNCFLinear[v |                |                |                     |
Epoch 0 || alue]/linear_0/bias  |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 768]     | 0              | 0.694               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfOutput[o |                |                |                     |
Epoch 0 || utput]/NNCFLinear[de |                |                |                     |
Epoch 0 || nse]/linear_0        |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertAttention[attenti |                |                |                     |
Epoch 0 || on]/BertSelfOutput[o |                |                |                     |
Epoch 0 || utput]/NNCFLinear[de |                |                |                     |
Epoch 0 || nse]/linear_0/bias   |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072, 768]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertIntermediate[inte |                |                |                     |
Epoch 0 || rmediate]/NNCFLinear |                |                |                     |
Epoch 0 || [dense]/linear_0     |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [3072]         | 0              | 0.004               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertIntermediate[inte |                |                |                     |
Epoch 0 || rmediate]/NNCFLinear |                |                |                     |
Epoch 0 || [dense]/linear_0/bia |                |                |                     |
Epoch 0 || s                    |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768, 3072]    | 0              | 2.775               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertOutput[output]/NN |                |                |                     |
Epoch 0 || CFLinear[dense]/line |                |                |                     |
Epoch 0 || ar_0                 |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 || BertForSequenceClass | [768]          | 0              | 0.001               |
Epoch 0 || ification/BertModel[ |                |                |                     |
Epoch 0 || bert]/BertEncoder[en |                |                |                     |
Epoch 0 || coder]/ModuleList[la |                |                |                     |
Epoch 0 || yer]/BertLayer[11]/B |                |                |                     |
Epoch 0 || ertOutput[output]/NN |                |                |                     |
Epoch 0 || CFLinear[dense]/line |                |                |                     |
Epoch 0 || ar_0/bias            |                |                |                     |
Epoch 0 |+----------------------+----------------+----------------+---------------------+
Epoch 0 |
Epoch 0 |Statistics of the movement-sparsity algorithm:
Epoch 0 |+----------------------------------+-------+
Epoch 0 ||         Statistic's name         | Value |
Epoch 0 |+==================================+=======+
Epoch 0 || Mask Importance Threshold        | -inf  |
Epoch 0 |+----------------------------------+-------+
Epoch 0 || Importance Regularization Factor | 0     |
Epoch 0 |+----------------------------------+-------+
Epoch 0 |
Epoch 0 |Statistics of the quantization algorithm:
Epoch 0 |+--------------------------------+-------+
Epoch 0 ||        Statistic's name        | Value |
Epoch 0 |+================================+=======+
Epoch 0 || Ratio of enabled quantizations | 100   |
Epoch 0 |+--------------------------------+-------+
Epoch 0 |
Epoch 0 |Statistics of the quantization share:
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 ||         Statistic's name         |        Value         |
Epoch 0 |+==================================+======================+
Epoch 0 || Symmetric WQs / All placed WQs   | 100.00 % (77 / 77)   |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Asymmetric WQs / All placed WQs  | 0.00 % (0 / 77)      |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Signed WQs / All placed WQs      | 100.00 % (77 / 77)   |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Unsigned WQs / All placed WQs    | 0.00 % (0 / 77)      |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Per-tensor WQs / All placed WQs  | 3.90 % (3 / 77)      |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Per-channel WQs / All placed WQs | 96.10 % (74 / 77)    |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Placed WQs / Potential WQs       | 75.49 % (77 / 102)   |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Symmetric AQs / All placed AQs   | 23.76 % (24 / 101)   |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Asymmetric AQs / All placed AQs  | 76.24 % (77 / 101)   |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Signed AQs / All placed AQs      | 100.00 % (101 / 101) |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Unsigned AQs / All placed AQs    | 0.00 % (0 / 101)     |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Per-tensor AQs / All placed AQs  | 100.00 % (101 / 101) |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 || Per-channel AQs / All placed AQs | 0.00 % (0 / 101)     |
Epoch 0 |+----------------------------------+----------------------+
Epoch 0 |
Epoch 0 |Statistics of the bitwidth distribution:
Epoch 0 |+--------------+---------------------+--------------------+--------------------+
Epoch 0 || Num bits (N) | N-bits WQs / Placed |    N-bits AQs /    | N-bits Qs / Placed |
Epoch 0 ||              |         WQs         |     Placed AQs     |         Qs         |
Epoch 0 |+==============+=====================+====================+====================+
Epoch 0 || 8            | 100.00 % (77 / 77)  | 100.00 % (101 /    | 100.00 % (178 /    |
Epoch 0 ||              |                     | 101)               | 178)               |
Epoch 0 |+--------------+---------------------+--------------------+--------------------+
INFO:nncf:Movement sparsity scheduler updates importance threshold and regularization factor per optimizer step, but steps_per_epoch was not set in config. Will measure the actual steps per epoch as signaled by a .epoch_step() call.