program(1.0)
[buildInfo = dict<tensor<string, []>, tensor<string, []>>({{"coremlc-component-MIL", "3304.5.2"}, {"coremlc-version", "3304.6.2"}})]
{
    func main<ios16>(tensor<int32, [1]> cache_length, tensor<fp16, [1, 448]> decoder_key_padding_mask, tensor<fp16, [1, 1280, 1, 1500]> encoder_output_embeds, tensor<int32, [1]> input_ids, tensor<fp16, [1, 5120, 1, 448]> key_cache, tensor<fp16, [1, 448]> kv_cache_update_mask, tensor<fp16, [1, 5120, 1, 448]> value_cache) {
            tensor<int32, []> var_24_axis_0 = const()[name = tensor<string, []>("op_24_axis_0"), val = tensor<int32, []>(0)];
            tensor<int32, []> var_24_batch_dims_0 = const()[name = tensor<string, []>("op_24_batch_dims_0"), val = tensor<int32, []>(0)];
            tensor<fp16, [51866, 1280]> embed_tokens_weight_to_fp16 = const()[name = tensor<string, []>("embed_tokens_weight_to_fp16"), val = tensor<fp16, [51866, 1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(64)))];
            tensor<fp16, [1, 1280]> var_24_cast_fp16 = gather(axis = var_24_axis_0, batch_dims = var_24_batch_dims_0, indices = input_ids, x = embed_tokens_weight_to_fp16)[name = tensor<string, []>("op_24_cast_fp16")];
            tensor<int32, []> var_31_axis_0 = const()[name = tensor<string, []>("op_31_axis_0"), val = tensor<int32, []>(0)];
            tensor<int32, []> var_31_batch_dims_0 = const()[name = tensor<string, []>("op_31_batch_dims_0"), val = tensor<int32, []>(0)];
            tensor<fp16, [448, 1280]> embed_positions_inlier_module_weight_to_fp16 = const()[name = tensor<string, []>("embed_positions_inlier_module_weight_to_fp16"), val = tensor<fp16, [448, 1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(132777088)))];
            tensor<fp16, [1, 1280]> var_31_cast_fp16 = gather(axis = var_31_axis_0, batch_dims = var_31_batch_dims_0, indices = cache_length, x = embed_positions_inlier_module_weight_to_fp16)[name = tensor<string, []>("op_31_cast_fp16")];
            tensor<int32, []> var_33_axis_0 = const()[name = tensor<string, []>("op_33_axis_0"), val = tensor<int32, []>(0)];
            tensor<int32, []> var_33_batch_dims_0 = const()[name = tensor<string, []>("op_33_batch_dims_0"), val = tensor<int32, []>(0)];
            tensor<fp16, [448, 1280]> embed_positions_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [71680]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(133941312))), name = tensor<string, []>("embed_positions_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [8582]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(133924032))), shape = tensor<uint32, [2]>([448, 1280])];
            tensor<fp16, [1, 1280]> var_33_cast_fp16 = gather(axis = var_33_axis_0, batch_dims = var_33_batch_dims_0, indices = cache_length, x = embed_positions_outlier_module_weight_to_fp16_sparsified)[name = tensor<string, []>("op_33_cast_fp16")];
            tensor<fp16, [1, 1280]> var_34_cast_fp16 = add(x = var_31_cast_fp16, y = var_33_cast_fp16)[name = tensor<string, []>("op_34_cast_fp16")];
            tensor<fp16, [1, 1280]> hidden_states_1_cast_fp16 = add(x = var_24_cast_fp16, y = var_34_cast_fp16)[name = tensor<string, []>("hidden_states_1_cast_fp16")];
            tensor<int32, [1]> var_48_axes_0 = const()[name = tensor<string, []>("op_48_axes_0"), val = tensor<int32, [1]>([2])];
            tensor<fp16, [1, 1280, 1]> var_48_cast_fp16 = expand_dims(axes = var_48_axes_0, x = hidden_states_1_cast_fp16)[name = tensor<string, []>("op_48_cast_fp16")];
            tensor<int32, [1]> inputs_1_axes_0 = const()[name = tensor<string, []>("inputs_1_axes_0"), val = tensor<int32, [1]>([3])];
            tensor<fp16, [1, 1280, 1, 1]> inputs_1_cast_fp16 = expand_dims(axes = inputs_1_axes_0, x = var_48_cast_fp16)[name = tensor<string, []>("inputs_1_cast_fp16")];
            tensor<int32, [4]> tile_0 = const()[name = tensor<string, []>("tile_0"), val = tensor<int32, [4]>([1280, 1280, 1280, 1280])];
            tensor<int32, []> var_53_axis_0 = const()[name = tensor<string, []>("op_53_axis_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1, 1280, 1, 448]> var_53_cast_fp16_0, tensor<fp16, [1, 1280, 1, 448]> var_53_cast_fp16_1, tensor<fp16, [1, 1280, 1, 448]> var_53_cast_fp16_2, tensor<fp16, [1, 1280, 1, 448]> var_53_cast_fp16_3 = split(axis = var_53_axis_0, split_sizes = tile_0, x = key_cache)[name = tensor<string, []>("op_53_cast_fp16")];
            tensor<int32, [4]> tile_1 = const()[name = tensor<string, []>("tile_1"), val = tensor<int32, [4]>([1280, 1280, 1280, 1280])];
            tensor<int32, []> var_60_axis_0 = const()[name = tensor<string, []>("op_60_axis_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1, 1280, 1, 448]> var_60_cast_fp16_0, tensor<fp16, [1, 1280, 1, 448]> var_60_cast_fp16_1, tensor<fp16, [1, 1280, 1, 448]> var_60_cast_fp16_2, tensor<fp16, [1, 1280, 1, 448]> var_60_cast_fp16_3 = split(axis = var_60_axis_0, split_sizes = tile_1, x = value_cache)[name = tensor<string, []>("op_60_cast_fp16")];
            tensor<int32, []> var_70 = const()[name = tensor<string, []>("op_70"), val = tensor<int32, []>(3)];
            tensor<int32, [1]> out_1_axes_0 = const()[name = tensor<string, []>("out_1_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_96_to_fp16 = const()[name = tensor<string, []>("op_96_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_1_cast_fp16 = layer_norm(axes = out_1_axes_0, epsilon = var_96_to_fp16, x = inputs_1_cast_fp16)[name = tensor<string, []>("out_1_cast_fp16")];
            tensor<fp16, [1280]> obj_1_mean_0_to_fp16 = const()[name = tensor<string, []>("obj_1_mean_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134013056)))];
            tensor<fp16, [1280]> obj_1_variance_0_to_fp16 = const()[name = tensor<string, []>("obj_1_variance_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134015680)))];
            tensor<fp16, [1280]> obj_1_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_1_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134018304)))];
            tensor<fp16, [1280]> obj_1_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_1_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134020928)))];
            tensor<fp16, []> obj_1_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_1_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_1_cast_fp16 = batch_norm(beta = obj_1_beta_0_to_fp16, epsilon = obj_1_epsilon_0_to_fp16, gamma = obj_1_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_1_cast_fp16)[name = tensor<string, []>("obj_1_cast_fp16")];
            tensor<string, []> var_118_pad_type_0 = const()[name = tensor<string, []>("op_118_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_118_strides_0 = const()[name = tensor<string, []>("op_118_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_118_pad_0 = const()[name = tensor<string, []>("op_118_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_118_dilations_0 = const()[name = tensor<string, []>("op_118_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_118_groups_0 = const()[name = tensor<string, []>("op_118_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134023552))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134842816))), name = tensor<string, []>("layers_0_self_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_self_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_self_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134842944)))];
            tensor<fp16, [1, 1280, 1, 1]> var_118_cast_fp16 = conv(bias = layers_0_self_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_118_dilations_0, groups = var_118_groups_0, pad = var_118_pad_0, pad_type = var_118_pad_type_0, strides = var_118_strides_0, weight = layers_0_self_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_1_cast_fp16)[name = tensor<string, []>("op_118_cast_fp16")];
            tensor<string, []> var_124_pad_type_0 = const()[name = tensor<string, []>("op_124_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_124_strides_0 = const()[name = tensor<string, []>("op_124_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_124_pad_0 = const()[name = tensor<string, []>("op_124_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_124_dilations_0 = const()[name = tensor<string, []>("op_124_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_124_groups_0 = const()[name = tensor<string, []>("op_124_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134918592))), name = tensor<string, []>("layers_0_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [36461]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(134845568))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_124_cast_fp16 = conv(dilations = var_124_dilations_0, groups = var_124_groups_0, pad = var_124_pad_0, pad_type = var_124_pad_type_0, strides = var_124_strides_0, weight = layers_0_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_1_cast_fp16)[name = tensor<string, []>("op_124_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_1_cast_fp16 = add(x = var_118_cast_fp16, y = var_124_cast_fp16)[name = tensor<string, []>("query_1_cast_fp16")];
            tensor<string, []> var_133_pad_type_0 = const()[name = tensor<string, []>("op_133_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_133_strides_0 = const()[name = tensor<string, []>("op_133_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_133_pad_0 = const()[name = tensor<string, []>("op_133_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_133_dilations_0 = const()[name = tensor<string, []>("op_133_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_133_groups_0 = const()[name = tensor<string, []>("op_133_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(135123456))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(135942720))), name = tensor<string, []>("layers_0_self_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_133_cast_fp16 = conv(dilations = var_133_dilations_0, groups = var_133_groups_0, pad = var_133_pad_0, pad_type = var_133_pad_type_0, strides = var_133_strides_0, weight = layers_0_self_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = obj_1_cast_fp16)[name = tensor<string, []>("op_133_cast_fp16")];
            tensor<string, []> var_139_pad_type_0 = const()[name = tensor<string, []>("op_139_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_139_strides_0 = const()[name = tensor<string, []>("op_139_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_139_pad_0 = const()[name = tensor<string, []>("op_139_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_139_dilations_0 = const()[name = tensor<string, []>("op_139_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_139_groups_0 = const()[name = tensor<string, []>("op_139_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(135976320))), name = tensor<string, []>("layers_0_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [16673]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(135942848))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_139_cast_fp16 = conv(dilations = var_139_dilations_0, groups = var_139_groups_0, pad = var_139_pad_0, pad_type = var_139_pad_type_0, strides = var_139_strides_0, weight = layers_0_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = obj_1_cast_fp16)[name = tensor<string, []>("op_139_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_key_1_cast_fp16 = add(x = var_133_cast_fp16, y = var_139_cast_fp16)[name = tensor<string, []>("current_key_1_cast_fp16")];
            tensor<string, []> var_149_pad_type_0 = const()[name = tensor<string, []>("op_149_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_149_strides_0 = const()[name = tensor<string, []>("op_149_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_149_pad_0 = const()[name = tensor<string, []>("op_149_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_149_dilations_0 = const()[name = tensor<string, []>("op_149_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_149_groups_0 = const()[name = tensor<string, []>("op_149_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(136181184))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(137000448))), name = tensor<string, []>("layers_0_self_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_self_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_self_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(137000576)))];
            tensor<fp16, [1, 1280, 1, 1]> var_149_cast_fp16 = conv(bias = layers_0_self_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_149_dilations_0, groups = var_149_groups_0, pad = var_149_pad_0, pad_type = var_149_pad_type_0, strides = var_149_strides_0, weight = layers_0_self_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = obj_1_cast_fp16)[name = tensor<string, []>("op_149_cast_fp16")];
            tensor<string, []> var_155_pad_type_0 = const()[name = tensor<string, []>("op_155_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_155_strides_0 = const()[name = tensor<string, []>("op_155_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_155_pad_0 = const()[name = tensor<string, []>("op_155_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_155_dilations_0 = const()[name = tensor<string, []>("op_155_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_155_groups_0 = const()[name = tensor<string, []>("op_155_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(137046720))), name = tensor<string, []>("layers_0_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [21721]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(137003200))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_155_cast_fp16 = conv(dilations = var_155_dilations_0, groups = var_155_groups_0, pad = var_155_pad_0, pad_type = var_155_pad_type_0, strides = var_155_strides_0, weight = layers_0_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = obj_1_cast_fp16)[name = tensor<string, []>("op_155_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_value_1_cast_fp16 = add(x = var_149_cast_fp16, y = var_155_cast_fp16)[name = tensor<string, []>("current_value_1_cast_fp16")];
            tensor<int32, [1]> var_158_axes_0 = const()[name = tensor<string, []>("op_158_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, [1, 1, 448]> var_158_cast_fp16 = expand_dims(axes = var_158_axes_0, x = kv_cache_update_mask)[name = tensor<string, []>("op_158_cast_fp16")];
            tensor<int32, [1]> var_159_axes_0 = const()[name = tensor<string, []>("op_159_axes_0"), val = tensor<int32, [1]>([2])];
            tensor<fp16, [1, 1, 1, 448]> var_159_cast_fp16 = expand_dims(axes = var_159_axes_0, x = var_158_cast_fp16)[name = tensor<string, []>("op_159_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_161_cast_fp16 = mul(x = current_key_1_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_161_cast_fp16")];
            tensor<fp16, []> var_71_to_fp16 = const()[name = tensor<string, []>("op_71_to_fp16"), val = tensor<fp16, []>(0x1p+0)];
            tensor<fp16, [1, 1, 1, 448]> var_162_cast_fp16 = sub(x = var_71_to_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_162_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_163_cast_fp16 = mul(x = var_53_cast_fp16_0, y = var_162_cast_fp16)[name = tensor<string, []>("op_163_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> key_1_cast_fp16 = add(x = var_161_cast_fp16, y = var_163_cast_fp16)[name = tensor<string, []>("key_1_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_165_cast_fp16 = mul(x = current_value_1_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_165_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_167_cast_fp16 = mul(x = var_60_cast_fp16_0, y = var_162_cast_fp16)[name = tensor<string, []>("op_167_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> value_1_cast_fp16 = add(x = var_165_cast_fp16, y = var_167_cast_fp16)[name = tensor<string, []>("value_1_cast_fp16")];
            tensor<int32, [4]> var_170 = const()[name = tensor<string, []>("op_170"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_1_cast_fp16 = reshape(shape = var_170, x = query_1_cast_fp16)[name = tensor<string, []>("mh_q_1_cast_fp16")];
            tensor<fp16, []> var_172_to_fp16 = const()[name = tensor<string, []>("op_172_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_173_cast_fp16 = mul(x = mh_q_1_cast_fp16, y = var_172_to_fp16)[name = tensor<string, []>("op_173_cast_fp16")];
            tensor<int32, [4]> var_174 = const()[name = tensor<string, []>("op_174"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_175_cast_fp16 = reshape(shape = var_174, x = key_1_cast_fp16)[name = tensor<string, []>("op_175_cast_fp16")];
            tensor<bool, []> mh_w_1_transpose_x_0 = const()[name = tensor<string, []>("mh_w_1_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_1_transpose_y_0 = const()[name = tensor<string, []>("mh_w_1_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 448]> mh_w_1_cast_fp16 = matmul(transpose_x = mh_w_1_transpose_x_0, transpose_y = mh_w_1_transpose_y_0, x = var_173_cast_fp16, y = var_175_cast_fp16)[name = tensor<string, []>("mh_w_1_cast_fp16")];
            tensor<int32, [1]> var_179_axes_0 = const()[name = tensor<string, []>("op_179_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, [1, 1, 448]> var_179_cast_fp16 = expand_dims(axes = var_179_axes_0, x = decoder_key_padding_mask)[name = tensor<string, []>("op_179_cast_fp16")];
            tensor<int32, [1]> var_180_axes_0 = const()[name = tensor<string, []>("op_180_axes_0"), val = tensor<int32, [1]>([2])];
            tensor<fp16, [1, 1, 1, 448]> var_180_cast_fp16 = expand_dims(axes = var_180_axes_0, x = var_179_cast_fp16)[name = tensor<string, []>("op_180_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> mh_w_3_cast_fp16 = add(x = mh_w_1_cast_fp16, y = var_180_cast_fp16)[name = tensor<string, []>("mh_w_3_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> var_183_cast_fp16 = softmax(axis = var_70, x = mh_w_3_cast_fp16)[name = tensor<string, []>("op_183_cast_fp16")];
            tensor<int32, [4]> var_184 = const()[name = tensor<string, []>("op_184"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_185_cast_fp16 = reshape(shape = var_184, x = value_1_cast_fp16)[name = tensor<string, []>("op_185_cast_fp16")];
            tensor<bool, []> attn_1_transpose_x_0 = const()[name = tensor<string, []>("attn_1_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_1_transpose_y_0 = const()[name = tensor<string, []>("attn_1_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_1_cast_fp16 = matmul(transpose_x = attn_1_transpose_x_0, transpose_y = attn_1_transpose_y_0, x = var_185_cast_fp16, y = var_183_cast_fp16)[name = tensor<string, []>("attn_1_cast_fp16")];
            tensor<int32, [4]> var_188 = const()[name = tensor<string, []>("op_188"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_1_cast_fp16 = reshape(shape = var_188, x = attn_1_cast_fp16)[name = tensor<string, []>("input_1_cast_fp16")];
            tensor<string, []> var_198_pad_type_0 = const()[name = tensor<string, []>("op_198_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_198_strides_0 = const()[name = tensor<string, []>("op_198_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_198_pad_0 = const()[name = tensor<string, []>("op_198_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_198_dilations_0 = const()[name = tensor<string, []>("op_198_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_198_groups_0 = const()[name = tensor<string, []>("op_198_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(137251584))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138070848))), name = tensor<string, []>("layers_0_self_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_self_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_self_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138070976)))];
            tensor<fp16, [1, 1280, 1, 1]> var_198_cast_fp16 = conv(bias = layers_0_self_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_198_dilations_0, groups = var_198_groups_0, pad = var_198_pad_0, pad_type = var_198_pad_type_0, strides = var_198_strides_0, weight = layers_0_self_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_1_cast_fp16)[name = tensor<string, []>("op_198_cast_fp16")];
            tensor<string, []> var_204_pad_type_0 = const()[name = tensor<string, []>("op_204_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_204_strides_0 = const()[name = tensor<string, []>("op_204_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_204_pad_0 = const()[name = tensor<string, []>("op_204_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_204_dilations_0 = const()[name = tensor<string, []>("op_204_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_204_groups_0 = const()[name = tensor<string, []>("op_204_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138130624))), name = tensor<string, []>("layers_0_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [28455]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138073600))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_204_cast_fp16 = conv(dilations = var_204_dilations_0, groups = var_204_groups_0, pad = var_204_pad_0, pad_type = var_204_pad_type_0, strides = var_204_strides_0, weight = layers_0_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_1_cast_fp16)[name = tensor<string, []>("op_204_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_7_cast_fp16 = add(x = var_198_cast_fp16, y = var_204_cast_fp16)[name = tensor<string, []>("obj_7_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_3_cast_fp16 = add(x = inputs_1_cast_fp16, y = obj_7_cast_fp16)[name = tensor<string, []>("inputs_3_cast_fp16")];
            tensor<int32, [1]> out_3_axes_0 = const()[name = tensor<string, []>("out_3_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_219_to_fp16 = const()[name = tensor<string, []>("op_219_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_3_cast_fp16 = layer_norm(axes = out_3_axes_0, epsilon = var_219_to_fp16, x = inputs_3_cast_fp16)[name = tensor<string, []>("out_3_cast_fp16")];
            tensor<fp16, [1280]> obj_9_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_9_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138335488)))];
            tensor<fp16, [1280]> obj_9_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_9_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138338112)))];
            tensor<fp16, []> obj_9_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_9_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_9_cast_fp16 = batch_norm(beta = obj_9_beta_0_to_fp16, epsilon = obj_9_epsilon_0_to_fp16, gamma = obj_9_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_3_cast_fp16)[name = tensor<string, []>("obj_9_cast_fp16")];
            tensor<string, []> var_241_pad_type_0 = const()[name = tensor<string, []>("op_241_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_241_strides_0 = const()[name = tensor<string, []>("op_241_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_241_pad_0 = const()[name = tensor<string, []>("op_241_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_241_dilations_0 = const()[name = tensor<string, []>("op_241_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_241_groups_0 = const()[name = tensor<string, []>("op_241_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(138340736))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(139160000))), name = tensor<string, []>("layers_0_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_encoder_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_encoder_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(139160128)))];
            tensor<fp16, [1, 1280, 1, 1]> var_241_cast_fp16 = conv(bias = layers_0_encoder_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_241_dilations_0, groups = var_241_groups_0, pad = var_241_pad_0, pad_type = var_241_pad_type_0, strides = var_241_strides_0, weight = layers_0_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_9_cast_fp16)[name = tensor<string, []>("op_241_cast_fp16")];
            tensor<string, []> var_247_pad_type_0 = const()[name = tensor<string, []>("op_247_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_247_strides_0 = const()[name = tensor<string, []>("op_247_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_247_pad_0 = const()[name = tensor<string, []>("op_247_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_247_dilations_0 = const()[name = tensor<string, []>("op_247_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_247_groups_0 = const()[name = tensor<string, []>("op_247_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(139188224))), name = tensor<string, []>("layers_0_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [12701]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(139162752))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_247_cast_fp16 = conv(dilations = var_247_dilations_0, groups = var_247_groups_0, pad = var_247_pad_0, pad_type = var_247_pad_type_0, strides = var_247_strides_0, weight = layers_0_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_9_cast_fp16)[name = tensor<string, []>("op_247_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_3_cast_fp16 = add(x = var_241_cast_fp16, y = var_247_cast_fp16)[name = tensor<string, []>("query_3_cast_fp16")];
            tensor<string, []> var_256_pad_type_0 = const()[name = tensor<string, []>("op_256_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_256_strides_0 = const()[name = tensor<string, []>("op_256_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_256_pad_0 = const()[name = tensor<string, []>("op_256_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_256_dilations_0 = const()[name = tensor<string, []>("op_256_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_256_groups_0 = const()[name = tensor<string, []>("op_256_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(139393088))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(140212352))), name = tensor<string, []>("layers_0_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_256_cast_fp16 = conv(dilations = var_256_dilations_0, groups = var_256_groups_0, pad = var_256_pad_0, pad_type = var_256_pad_type_0, strides = var_256_strides_0, weight = layers_0_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_256_cast_fp16")];
            tensor<string, []> var_262_pad_type_0 = const()[name = tensor<string, []>("op_262_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_262_strides_0 = const()[name = tensor<string, []>("op_262_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_262_pad_0 = const()[name = tensor<string, []>("op_262_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_262_dilations_0 = const()[name = tensor<string, []>("op_262_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_262_groups_0 = const()[name = tensor<string, []>("op_262_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(140272448))), name = tensor<string, []>("layers_0_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [29949]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(140212480))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_262_cast_fp16 = conv(dilations = var_262_dilations_0, groups = var_262_groups_0, pad = var_262_pad_0, pad_type = var_262_pad_type_0, strides = var_262_strides_0, weight = layers_0_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_262_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> key_3_cast_fp16 = add(x = var_256_cast_fp16, y = var_262_cast_fp16)[name = tensor<string, []>("key_3_cast_fp16")];
            tensor<string, []> var_272_pad_type_0 = const()[name = tensor<string, []>("op_272_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_272_strides_0 = const()[name = tensor<string, []>("op_272_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_272_pad_0 = const()[name = tensor<string, []>("op_272_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_272_dilations_0 = const()[name = tensor<string, []>("op_272_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_272_groups_0 = const()[name = tensor<string, []>("op_272_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(140477312))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(141296576))), name = tensor<string, []>("layers_0_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_encoder_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_encoder_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(141296704)))];
            tensor<fp16, [1, 1280, 1, 1500]> var_272_cast_fp16 = conv(bias = layers_0_encoder_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_272_dilations_0, groups = var_272_groups_0, pad = var_272_pad_0, pad_type = var_272_pad_type_0, strides = var_272_strides_0, weight = layers_0_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_272_cast_fp16")];
            tensor<string, []> var_278_pad_type_0 = const()[name = tensor<string, []>("op_278_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_278_strides_0 = const()[name = tensor<string, []>("op_278_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_278_pad_0 = const()[name = tensor<string, []>("op_278_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_278_dilations_0 = const()[name = tensor<string, []>("op_278_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_278_groups_0 = const()[name = tensor<string, []>("op_278_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(141310592))), name = tensor<string, []>("layers_0_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5596]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(141299328))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_278_cast_fp16 = conv(dilations = var_278_dilations_0, groups = var_278_groups_0, pad = var_278_pad_0, pad_type = var_278_pad_type_0, strides = var_278_strides_0, weight = layers_0_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_278_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> value_3_cast_fp16 = add(x = var_272_cast_fp16, y = var_278_cast_fp16)[name = tensor<string, []>("value_3_cast_fp16")];
            tensor<int32, [4]> var_281 = const()[name = tensor<string, []>("op_281"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_3_cast_fp16 = reshape(shape = var_281, x = query_3_cast_fp16)[name = tensor<string, []>("mh_q_3_cast_fp16")];
            tensor<fp16, []> var_283_to_fp16 = const()[name = tensor<string, []>("op_283_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_284_cast_fp16 = mul(x = mh_q_3_cast_fp16, y = var_283_to_fp16)[name = tensor<string, []>("op_284_cast_fp16")];
            tensor<int32, [4]> var_285 = const()[name = tensor<string, []>("op_285"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_286_cast_fp16 = reshape(shape = var_285, x = key_3_cast_fp16)[name = tensor<string, []>("op_286_cast_fp16")];
            tensor<bool, []> mh_w_5_transpose_x_0 = const()[name = tensor<string, []>("mh_w_5_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_5_transpose_y_0 = const()[name = tensor<string, []>("mh_w_5_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 1500]> mh_w_5_cast_fp16 = matmul(transpose_x = mh_w_5_transpose_x_0, transpose_y = mh_w_5_transpose_y_0, x = var_284_cast_fp16, y = var_286_cast_fp16)[name = tensor<string, []>("mh_w_5_cast_fp16")];
            tensor<fp16, [1, 20, 1, 1500]> obj_13_cast_fp16 = softmax(axis = var_70, x = mh_w_5_cast_fp16)[name = tensor<string, []>("obj_13_cast_fp16")];
            tensor<int32, [4]> var_290 = const()[name = tensor<string, []>("op_290"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_291_cast_fp16 = reshape(shape = var_290, x = value_3_cast_fp16)[name = tensor<string, []>("op_291_cast_fp16")];
            tensor<bool, []> attn_3_transpose_x_0 = const()[name = tensor<string, []>("attn_3_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_3_transpose_y_0 = const()[name = tensor<string, []>("attn_3_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_3_cast_fp16 = matmul(transpose_x = attn_3_transpose_x_0, transpose_y = attn_3_transpose_y_0, x = var_291_cast_fp16, y = obj_13_cast_fp16)[name = tensor<string, []>("attn_3_cast_fp16")];
            tensor<int32, [4]> var_294 = const()[name = tensor<string, []>("op_294"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_3_cast_fp16 = reshape(shape = var_294, x = attn_3_cast_fp16)[name = tensor<string, []>("input_3_cast_fp16")];
            tensor<string, []> var_304_pad_type_0 = const()[name = tensor<string, []>("op_304_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_304_strides_0 = const()[name = tensor<string, []>("op_304_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_304_pad_0 = const()[name = tensor<string, []>("op_304_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_304_dilations_0 = const()[name = tensor<string, []>("op_304_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_304_groups_0 = const()[name = tensor<string, []>("op_304_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(141515456))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142334720))), name = tensor<string, []>("layers_0_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_0_encoder_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_encoder_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142334848)))];
            tensor<fp16, [1, 1280, 1, 1]> var_304_cast_fp16 = conv(bias = layers_0_encoder_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_304_dilations_0, groups = var_304_groups_0, pad = var_304_pad_0, pad_type = var_304_pad_type_0, strides = var_304_strides_0, weight = layers_0_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_3_cast_fp16)[name = tensor<string, []>("op_304_cast_fp16")];
            tensor<string, []> var_310_pad_type_0 = const()[name = tensor<string, []>("op_310_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_310_strides_0 = const()[name = tensor<string, []>("op_310_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_310_pad_0 = const()[name = tensor<string, []>("op_310_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_310_dilations_0 = const()[name = tensor<string, []>("op_310_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_310_groups_0 = const()[name = tensor<string, []>("op_310_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_0_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142349632))), name = tensor<string, []>("layers_0_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [6041]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142337472))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_310_cast_fp16 = conv(dilations = var_310_dilations_0, groups = var_310_groups_0, pad = var_310_pad_0, pad_type = var_310_pad_type_0, strides = var_310_strides_0, weight = layers_0_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_3_cast_fp16)[name = tensor<string, []>("op_310_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_11_cast_fp16 = add(x = var_304_cast_fp16, y = var_310_cast_fp16)[name = tensor<string, []>("obj_11_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_5_cast_fp16 = add(x = inputs_3_cast_fp16, y = obj_11_cast_fp16)[name = tensor<string, []>("inputs_5_cast_fp16")];
            tensor<int32, [1]> out_5_axes_0 = const()[name = tensor<string, []>("out_5_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_321_to_fp16 = const()[name = tensor<string, []>("op_321_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_5_cast_fp16 = layer_norm(axes = out_5_axes_0, epsilon = var_321_to_fp16, x = inputs_5_cast_fp16)[name = tensor<string, []>("out_5_cast_fp16")];
            tensor<fp16, [1280]> input_5_gamma_0_to_fp16 = const()[name = tensor<string, []>("input_5_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142554496)))];
            tensor<fp16, [1280]> input_5_beta_0_to_fp16 = const()[name = tensor<string, []>("input_5_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142557120)))];
            tensor<fp16, []> input_5_epsilon_0_to_fp16 = const()[name = tensor<string, []>("input_5_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> input_5_cast_fp16 = batch_norm(beta = input_5_beta_0_to_fp16, epsilon = input_5_epsilon_0_to_fp16, gamma = input_5_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_5_cast_fp16)[name = tensor<string, []>("input_5_cast_fp16")];
            tensor<string, []> var_339_pad_type_0 = const()[name = tensor<string, []>("op_339_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_339_strides_0 = const()[name = tensor<string, []>("op_339_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_339_pad_0 = const()[name = tensor<string, []>("op_339_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_339_dilations_0 = const()[name = tensor<string, []>("op_339_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_339_groups_0 = const()[name = tensor<string, []>("op_339_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_0_fc1_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(142559744))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(145836608))), name = tensor<string, []>("layers_0_fc1_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [5120]> layers_0_fc1_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_fc1_inlier_module_bias_to_fp16"), val = tensor<fp16, [5120]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(145836736)))];
            tensor<fp16, [1, 5120, 1, 1]> var_339_cast_fp16 = conv(bias = layers_0_fc1_inlier_module_bias_to_fp16, dilations = var_339_dilations_0, groups = var_339_groups_0, pad = var_339_pad_0, pad_type = var_339_pad_type_0, strides = var_339_strides_0, weight = layers_0_fc1_inlier_module_weight_to_fp16_palettized, x = input_5_cast_fp16)[name = tensor<string, []>("op_339_cast_fp16")];
            tensor<string, []> var_345_pad_type_0 = const()[name = tensor<string, []>("op_345_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_345_strides_0 = const()[name = tensor<string, []>("op_345_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_345_pad_0 = const()[name = tensor<string, []>("op_345_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_345_dilations_0 = const()[name = tensor<string, []>("op_345_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_345_groups_0 = const()[name = tensor<string, []>("op_345_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_0_fc1_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(145948608))), name = tensor<string, []>("layers_0_fc1_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [50752]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(145847040))), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [1, 5120, 1, 1]> var_345_cast_fp16 = conv(dilations = var_345_dilations_0, groups = var_345_groups_0, pad = var_345_pad_0, pad_type = var_345_pad_type_0, strides = var_345_strides_0, weight = layers_0_fc1_outlier_module_weight_to_fp16_sparsified, x = input_5_cast_fp16)[name = tensor<string, []>("op_345_cast_fp16")];
            tensor<fp16, [1, 5120, 1, 1]> input_7_cast_fp16 = add(x = var_339_cast_fp16, y = var_345_cast_fp16)[name = tensor<string, []>("input_7_cast_fp16")];
            tensor<string, []> input_9_mode_0 = const()[name = tensor<string, []>("input_9_mode_0"), val = tensor<string, []>("EXACT")];
            tensor<fp16, [1, 5120, 1, 1]> input_9_cast_fp16 = gelu(mode = input_9_mode_0, x = input_7_cast_fp16)[name = tensor<string, []>("input_9_cast_fp16")];
            tensor<string, []> var_356_pad_type_0 = const()[name = tensor<string, []>("op_356_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_356_strides_0 = const()[name = tensor<string, []>("op_356_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_356_pad_0 = const()[name = tensor<string, []>("op_356_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_356_dilations_0 = const()[name = tensor<string, []>("op_356_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_356_groups_0 = const()[name = tensor<string, []>("op_356_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_0_fc2_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(146767872))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(150044736))), name = tensor<string, []>("layers_0_fc2_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1280]> layers_0_fc2_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_0_fc2_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(150044864)))];
            tensor<fp16, [1, 1280, 1, 1]> var_356_cast_fp16 = conv(bias = layers_0_fc2_inlier_module_bias_to_fp16, dilations = var_356_dilations_0, groups = var_356_groups_0, pad = var_356_pad_0, pad_type = var_356_pad_type_0, strides = var_356_strides_0, weight = layers_0_fc2_inlier_module_weight_to_fp16_palettized, x = input_9_cast_fp16)[name = tensor<string, []>("op_356_cast_fp16")];
            tensor<string, []> var_362_pad_type_0 = const()[name = tensor<string, []>("op_362_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_362_strides_0 = const()[name = tensor<string, []>("op_362_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_362_pad_0 = const()[name = tensor<string, []>("op_362_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_362_dilations_0 = const()[name = tensor<string, []>("op_362_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_362_groups_0 = const()[name = tensor<string, []>("op_362_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_0_fc2_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(150230016))), name = tensor<string, []>("layers_0_fc2_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [91213]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(150047488))), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_362_cast_fp16 = conv(dilations = var_362_dilations_0, groups = var_362_groups_0, pad = var_362_pad_0, pad_type = var_362_pad_type_0, strides = var_362_strides_0, weight = layers_0_fc2_outlier_module_weight_to_fp16_sparsified, x = input_9_cast_fp16)[name = tensor<string, []>("op_362_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> hidden_states_3_cast_fp16 = add(x = var_356_cast_fp16, y = var_362_cast_fp16)[name = tensor<string, []>("hidden_states_3_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_7_cast_fp16 = add(x = inputs_5_cast_fp16, y = hidden_states_3_cast_fp16)[name = tensor<string, []>("inputs_7_cast_fp16")];
            tensor<int32, []> var_374 = const()[name = tensor<string, []>("op_374"), val = tensor<int32, []>(3)];
            tensor<int32, [1]> out_7_axes_0 = const()[name = tensor<string, []>("out_7_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_400_to_fp16 = const()[name = tensor<string, []>("op_400_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_7_cast_fp16 = layer_norm(axes = out_7_axes_0, epsilon = var_400_to_fp16, x = inputs_7_cast_fp16)[name = tensor<string, []>("out_7_cast_fp16")];
            tensor<fp16, [1280]> obj_15_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_15_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151049280)))];
            tensor<fp16, [1280]> obj_15_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_15_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151051904)))];
            tensor<fp16, []> obj_15_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_15_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_15_cast_fp16 = batch_norm(beta = obj_15_beta_0_to_fp16, epsilon = obj_15_epsilon_0_to_fp16, gamma = obj_15_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_7_cast_fp16)[name = tensor<string, []>("obj_15_cast_fp16")];
            tensor<string, []> var_422_pad_type_0 = const()[name = tensor<string, []>("op_422_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_422_strides_0 = const()[name = tensor<string, []>("op_422_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_422_pad_0 = const()[name = tensor<string, []>("op_422_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_422_dilations_0 = const()[name = tensor<string, []>("op_422_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_422_groups_0 = const()[name = tensor<string, []>("op_422_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151054528))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151873792))), name = tensor<string, []>("layers_1_self_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_self_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_self_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151873920)))];
            tensor<fp16, [1, 1280, 1, 1]> var_422_cast_fp16 = conv(bias = layers_1_self_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_422_dilations_0, groups = var_422_groups_0, pad = var_422_pad_0, pad_type = var_422_pad_type_0, strides = var_422_strides_0, weight = layers_1_self_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_15_cast_fp16)[name = tensor<string, []>("op_422_cast_fp16")];
            tensor<string, []> var_428_pad_type_0 = const()[name = tensor<string, []>("op_428_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_428_strides_0 = const()[name = tensor<string, []>("op_428_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_428_pad_0 = const()[name = tensor<string, []>("op_428_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_428_dilations_0 = const()[name = tensor<string, []>("op_428_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_428_groups_0 = const()[name = tensor<string, []>("op_428_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151936640))), name = tensor<string, []>("layers_1_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [29985]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(151876544))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_428_cast_fp16 = conv(dilations = var_428_dilations_0, groups = var_428_groups_0, pad = var_428_pad_0, pad_type = var_428_pad_type_0, strides = var_428_strides_0, weight = layers_1_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_15_cast_fp16)[name = tensor<string, []>("op_428_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_5_cast_fp16 = add(x = var_422_cast_fp16, y = var_428_cast_fp16)[name = tensor<string, []>("query_5_cast_fp16")];
            tensor<string, []> var_437_pad_type_0 = const()[name = tensor<string, []>("op_437_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_437_strides_0 = const()[name = tensor<string, []>("op_437_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_437_pad_0 = const()[name = tensor<string, []>("op_437_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_437_dilations_0 = const()[name = tensor<string, []>("op_437_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_437_groups_0 = const()[name = tensor<string, []>("op_437_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(152141504))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(152960768))), name = tensor<string, []>("layers_1_self_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_437_cast_fp16 = conv(dilations = var_437_dilations_0, groups = var_437_groups_0, pad = var_437_pad_0, pad_type = var_437_pad_type_0, strides = var_437_strides_0, weight = layers_1_self_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = obj_15_cast_fp16)[name = tensor<string, []>("op_437_cast_fp16")];
            tensor<string, []> var_443_pad_type_0 = const()[name = tensor<string, []>("op_443_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_443_strides_0 = const()[name = tensor<string, []>("op_443_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_443_pad_0 = const()[name = tensor<string, []>("op_443_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_443_dilations_0 = const()[name = tensor<string, []>("op_443_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_443_groups_0 = const()[name = tensor<string, []>("op_443_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(153007552))), name = tensor<string, []>("layers_1_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [23287]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(152960896))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_443_cast_fp16 = conv(dilations = var_443_dilations_0, groups = var_443_groups_0, pad = var_443_pad_0, pad_type = var_443_pad_type_0, strides = var_443_strides_0, weight = layers_1_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = obj_15_cast_fp16)[name = tensor<string, []>("op_443_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_key_3_cast_fp16 = add(x = var_437_cast_fp16, y = var_443_cast_fp16)[name = tensor<string, []>("current_key_3_cast_fp16")];
            tensor<string, []> var_453_pad_type_0 = const()[name = tensor<string, []>("op_453_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_453_strides_0 = const()[name = tensor<string, []>("op_453_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_453_pad_0 = const()[name = tensor<string, []>("op_453_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_453_dilations_0 = const()[name = tensor<string, []>("op_453_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_453_groups_0 = const()[name = tensor<string, []>("op_453_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(153212416))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(154031680))), name = tensor<string, []>("layers_1_self_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_self_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_self_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(154031808)))];
            tensor<fp16, [1, 1280, 1, 1]> var_453_cast_fp16 = conv(bias = layers_1_self_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_453_dilations_0, groups = var_453_groups_0, pad = var_453_pad_0, pad_type = var_453_pad_type_0, strides = var_453_strides_0, weight = layers_1_self_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = obj_15_cast_fp16)[name = tensor<string, []>("op_453_cast_fp16")];
            tensor<string, []> var_459_pad_type_0 = const()[name = tensor<string, []>("op_459_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_459_strides_0 = const()[name = tensor<string, []>("op_459_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_459_pad_0 = const()[name = tensor<string, []>("op_459_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_459_dilations_0 = const()[name = tensor<string, []>("op_459_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_459_groups_0 = const()[name = tensor<string, []>("op_459_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(154057088))), name = tensor<string, []>("layers_1_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [11267]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(154034432))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_459_cast_fp16 = conv(dilations = var_459_dilations_0, groups = var_459_groups_0, pad = var_459_pad_0, pad_type = var_459_pad_type_0, strides = var_459_strides_0, weight = layers_1_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = obj_15_cast_fp16)[name = tensor<string, []>("op_459_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_value_3_cast_fp16 = add(x = var_453_cast_fp16, y = var_459_cast_fp16)[name = tensor<string, []>("current_value_3_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_465_cast_fp16 = mul(x = current_key_3_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_465_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_467_cast_fp16 = mul(x = var_53_cast_fp16_1, y = var_162_cast_fp16)[name = tensor<string, []>("op_467_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> key_5_cast_fp16 = add(x = var_465_cast_fp16, y = var_467_cast_fp16)[name = tensor<string, []>("key_5_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_469_cast_fp16 = mul(x = current_value_3_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_469_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_471_cast_fp16 = mul(x = var_60_cast_fp16_1, y = var_162_cast_fp16)[name = tensor<string, []>("op_471_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> value_5_cast_fp16 = add(x = var_469_cast_fp16, y = var_471_cast_fp16)[name = tensor<string, []>("value_5_cast_fp16")];
            tensor<int32, [4]> var_474 = const()[name = tensor<string, []>("op_474"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_5_cast_fp16 = reshape(shape = var_474, x = query_5_cast_fp16)[name = tensor<string, []>("mh_q_5_cast_fp16")];
            tensor<fp16, []> var_476_to_fp16 = const()[name = tensor<string, []>("op_476_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_477_cast_fp16 = mul(x = mh_q_5_cast_fp16, y = var_476_to_fp16)[name = tensor<string, []>("op_477_cast_fp16")];
            tensor<int32, [4]> var_478 = const()[name = tensor<string, []>("op_478"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_479_cast_fp16 = reshape(shape = var_478, x = key_5_cast_fp16)[name = tensor<string, []>("op_479_cast_fp16")];
            tensor<bool, []> mh_w_7_transpose_x_0 = const()[name = tensor<string, []>("mh_w_7_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_7_transpose_y_0 = const()[name = tensor<string, []>("mh_w_7_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 448]> mh_w_7_cast_fp16 = matmul(transpose_x = mh_w_7_transpose_x_0, transpose_y = mh_w_7_transpose_y_0, x = var_477_cast_fp16, y = var_479_cast_fp16)[name = tensor<string, []>("mh_w_7_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> mh_w_9_cast_fp16 = add(x = mh_w_7_cast_fp16, y = var_180_cast_fp16)[name = tensor<string, []>("mh_w_9_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> var_487_cast_fp16 = softmax(axis = var_374, x = mh_w_9_cast_fp16)[name = tensor<string, []>("op_487_cast_fp16")];
            tensor<int32, [4]> var_488 = const()[name = tensor<string, []>("op_488"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_489_cast_fp16 = reshape(shape = var_488, x = value_5_cast_fp16)[name = tensor<string, []>("op_489_cast_fp16")];
            tensor<bool, []> attn_5_transpose_x_0 = const()[name = tensor<string, []>("attn_5_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_5_transpose_y_0 = const()[name = tensor<string, []>("attn_5_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_5_cast_fp16 = matmul(transpose_x = attn_5_transpose_x_0, transpose_y = attn_5_transpose_y_0, x = var_489_cast_fp16, y = var_487_cast_fp16)[name = tensor<string, []>("attn_5_cast_fp16")];
            tensor<int32, [4]> var_492 = const()[name = tensor<string, []>("op_492"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_11_cast_fp16 = reshape(shape = var_492, x = attn_5_cast_fp16)[name = tensor<string, []>("input_11_cast_fp16")];
            tensor<string, []> var_502_pad_type_0 = const()[name = tensor<string, []>("op_502_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_502_strides_0 = const()[name = tensor<string, []>("op_502_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_502_pad_0 = const()[name = tensor<string, []>("op_502_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_502_dilations_0 = const()[name = tensor<string, []>("op_502_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_502_groups_0 = const()[name = tensor<string, []>("op_502_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(154261952))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155081216))), name = tensor<string, []>("layers_1_self_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_self_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_self_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155081344)))];
            tensor<fp16, [1, 1280, 1, 1]> var_502_cast_fp16 = conv(bias = layers_1_self_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_502_dilations_0, groups = var_502_groups_0, pad = var_502_pad_0, pad_type = var_502_pad_type_0, strides = var_502_strides_0, weight = layers_1_self_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_11_cast_fp16)[name = tensor<string, []>("op_502_cast_fp16")];
            tensor<string, []> var_508_pad_type_0 = const()[name = tensor<string, []>("op_508_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_508_strides_0 = const()[name = tensor<string, []>("op_508_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_508_pad_0 = const()[name = tensor<string, []>("op_508_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_508_dilations_0 = const()[name = tensor<string, []>("op_508_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_508_groups_0 = const()[name = tensor<string, []>("op_508_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155108416))), name = tensor<string, []>("layers_1_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [12187]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155083968))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_508_cast_fp16 = conv(dilations = var_508_dilations_0, groups = var_508_groups_0, pad = var_508_pad_0, pad_type = var_508_pad_type_0, strides = var_508_strides_0, weight = layers_1_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_11_cast_fp16)[name = tensor<string, []>("op_508_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_21_cast_fp16 = add(x = var_502_cast_fp16, y = var_508_cast_fp16)[name = tensor<string, []>("obj_21_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_9_cast_fp16 = add(x = inputs_7_cast_fp16, y = obj_21_cast_fp16)[name = tensor<string, []>("inputs_9_cast_fp16")];
            tensor<int32, [1]> out_9_axes_0 = const()[name = tensor<string, []>("out_9_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_523_to_fp16 = const()[name = tensor<string, []>("op_523_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_9_cast_fp16 = layer_norm(axes = out_9_axes_0, epsilon = var_523_to_fp16, x = inputs_9_cast_fp16)[name = tensor<string, []>("out_9_cast_fp16")];
            tensor<fp16, [1280]> obj_23_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_23_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155313280)))];
            tensor<fp16, [1280]> obj_23_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_23_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155315904)))];
            tensor<fp16, []> obj_23_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_23_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_23_cast_fp16 = batch_norm(beta = obj_23_beta_0_to_fp16, epsilon = obj_23_epsilon_0_to_fp16, gamma = obj_23_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_9_cast_fp16)[name = tensor<string, []>("obj_23_cast_fp16")];
            tensor<string, []> var_545_pad_type_0 = const()[name = tensor<string, []>("op_545_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_545_strides_0 = const()[name = tensor<string, []>("op_545_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_545_pad_0 = const()[name = tensor<string, []>("op_545_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_545_dilations_0 = const()[name = tensor<string, []>("op_545_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_545_groups_0 = const()[name = tensor<string, []>("op_545_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(155318528))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(156137792))), name = tensor<string, []>("layers_1_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_encoder_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_encoder_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(156137920)))];
            tensor<fp16, [1, 1280, 1, 1]> var_545_cast_fp16 = conv(bias = layers_1_encoder_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_545_dilations_0, groups = var_545_groups_0, pad = var_545_pad_0, pad_type = var_545_pad_type_0, strides = var_545_strides_0, weight = layers_1_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_23_cast_fp16)[name = tensor<string, []>("op_545_cast_fp16")];
            tensor<string, []> var_551_pad_type_0 = const()[name = tensor<string, []>("op_551_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_551_strides_0 = const()[name = tensor<string, []>("op_551_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_551_pad_0 = const()[name = tensor<string, []>("op_551_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_551_dilations_0 = const()[name = tensor<string, []>("op_551_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_551_groups_0 = const()[name = tensor<string, []>("op_551_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(156183616))), name = tensor<string, []>("layers_1_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [21483]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(156140544))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_551_cast_fp16 = conv(dilations = var_551_dilations_0, groups = var_551_groups_0, pad = var_551_pad_0, pad_type = var_551_pad_type_0, strides = var_551_strides_0, weight = layers_1_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_23_cast_fp16)[name = tensor<string, []>("op_551_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_7_cast_fp16 = add(x = var_545_cast_fp16, y = var_551_cast_fp16)[name = tensor<string, []>("query_7_cast_fp16")];
            tensor<string, []> var_560_pad_type_0 = const()[name = tensor<string, []>("op_560_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_560_strides_0 = const()[name = tensor<string, []>("op_560_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_560_pad_0 = const()[name = tensor<string, []>("op_560_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_560_dilations_0 = const()[name = tensor<string, []>("op_560_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_560_groups_0 = const()[name = tensor<string, []>("op_560_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(156388480))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(157207744))), name = tensor<string, []>("layers_1_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_560_cast_fp16 = conv(dilations = var_560_dilations_0, groups = var_560_groups_0, pad = var_560_pad_0, pad_type = var_560_pad_type_0, strides = var_560_strides_0, weight = layers_1_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_560_cast_fp16")];
            tensor<string, []> var_566_pad_type_0 = const()[name = tensor<string, []>("op_566_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_566_strides_0 = const()[name = tensor<string, []>("op_566_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_566_pad_0 = const()[name = tensor<string, []>("op_566_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_566_dilations_0 = const()[name = tensor<string, []>("op_566_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_566_groups_0 = const()[name = tensor<string, []>("op_566_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(157251328))), name = tensor<string, []>("layers_1_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [21667]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(157207872))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_566_cast_fp16 = conv(dilations = var_566_dilations_0, groups = var_566_groups_0, pad = var_566_pad_0, pad_type = var_566_pad_type_0, strides = var_566_strides_0, weight = layers_1_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_566_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> key_7_cast_fp16 = add(x = var_560_cast_fp16, y = var_566_cast_fp16)[name = tensor<string, []>("key_7_cast_fp16")];
            tensor<string, []> var_576_pad_type_0 = const()[name = tensor<string, []>("op_576_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_576_strides_0 = const()[name = tensor<string, []>("op_576_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_576_pad_0 = const()[name = tensor<string, []>("op_576_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_576_dilations_0 = const()[name = tensor<string, []>("op_576_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_576_groups_0 = const()[name = tensor<string, []>("op_576_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(157456192))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(158275456))), name = tensor<string, []>("layers_1_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_encoder_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_encoder_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(158275584)))];
            tensor<fp16, [1, 1280, 1, 1500]> var_576_cast_fp16 = conv(bias = layers_1_encoder_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_576_dilations_0, groups = var_576_groups_0, pad = var_576_pad_0, pad_type = var_576_pad_type_0, strides = var_576_strides_0, weight = layers_1_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_576_cast_fp16")];
            tensor<string, []> var_582_pad_type_0 = const()[name = tensor<string, []>("op_582_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_582_strides_0 = const()[name = tensor<string, []>("op_582_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_582_pad_0 = const()[name = tensor<string, []>("op_582_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_582_dilations_0 = const()[name = tensor<string, []>("op_582_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_582_groups_0 = const()[name = tensor<string, []>("op_582_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(158289408))), name = tensor<string, []>("layers_1_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5557]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(158278208))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_582_cast_fp16 = conv(dilations = var_582_dilations_0, groups = var_582_groups_0, pad = var_582_pad_0, pad_type = var_582_pad_type_0, strides = var_582_strides_0, weight = layers_1_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_582_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> value_7_cast_fp16 = add(x = var_576_cast_fp16, y = var_582_cast_fp16)[name = tensor<string, []>("value_7_cast_fp16")];
            tensor<int32, [4]> var_585 = const()[name = tensor<string, []>("op_585"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_7_cast_fp16 = reshape(shape = var_585, x = query_7_cast_fp16)[name = tensor<string, []>("mh_q_7_cast_fp16")];
            tensor<fp16, []> var_587_to_fp16 = const()[name = tensor<string, []>("op_587_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_588_cast_fp16 = mul(x = mh_q_7_cast_fp16, y = var_587_to_fp16)[name = tensor<string, []>("op_588_cast_fp16")];
            tensor<int32, [4]> var_589 = const()[name = tensor<string, []>("op_589"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_590_cast_fp16 = reshape(shape = var_589, x = key_7_cast_fp16)[name = tensor<string, []>("op_590_cast_fp16")];
            tensor<bool, []> mh_w_11_transpose_x_0 = const()[name = tensor<string, []>("mh_w_11_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_11_transpose_y_0 = const()[name = tensor<string, []>("mh_w_11_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 1500]> mh_w_11_cast_fp16 = matmul(transpose_x = mh_w_11_transpose_x_0, transpose_y = mh_w_11_transpose_y_0, x = var_588_cast_fp16, y = var_590_cast_fp16)[name = tensor<string, []>("mh_w_11_cast_fp16")];
            tensor<fp16, [1, 20, 1, 1500]> obj_27_cast_fp16 = softmax(axis = var_374, x = mh_w_11_cast_fp16)[name = tensor<string, []>("obj_27_cast_fp16")];
            tensor<int32, [4]> var_594 = const()[name = tensor<string, []>("op_594"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_595_cast_fp16 = reshape(shape = var_594, x = value_7_cast_fp16)[name = tensor<string, []>("op_595_cast_fp16")];
            tensor<bool, []> attn_7_transpose_x_0 = const()[name = tensor<string, []>("attn_7_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_7_transpose_y_0 = const()[name = tensor<string, []>("attn_7_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_7_cast_fp16 = matmul(transpose_x = attn_7_transpose_x_0, transpose_y = attn_7_transpose_y_0, x = var_595_cast_fp16, y = obj_27_cast_fp16)[name = tensor<string, []>("attn_7_cast_fp16")];
            tensor<int32, [4]> var_598 = const()[name = tensor<string, []>("op_598"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_13_cast_fp16 = reshape(shape = var_598, x = attn_7_cast_fp16)[name = tensor<string, []>("input_13_cast_fp16")];
            tensor<string, []> var_608_pad_type_0 = const()[name = tensor<string, []>("op_608_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_608_strides_0 = const()[name = tensor<string, []>("op_608_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_608_pad_0 = const()[name = tensor<string, []>("op_608_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_608_dilations_0 = const()[name = tensor<string, []>("op_608_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_608_groups_0 = const()[name = tensor<string, []>("op_608_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(158494272))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159313536))), name = tensor<string, []>("layers_1_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_1_encoder_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_encoder_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159313664)))];
            tensor<fp16, [1, 1280, 1, 1]> var_608_cast_fp16 = conv(bias = layers_1_encoder_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_608_dilations_0, groups = var_608_groups_0, pad = var_608_pad_0, pad_type = var_608_pad_type_0, strides = var_608_strides_0, weight = layers_1_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_13_cast_fp16)[name = tensor<string, []>("op_608_cast_fp16")];
            tensor<string, []> var_614_pad_type_0 = const()[name = tensor<string, []>("op_614_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_614_strides_0 = const()[name = tensor<string, []>("op_614_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_614_pad_0 = const()[name = tensor<string, []>("op_614_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_614_dilations_0 = const()[name = tensor<string, []>("op_614_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_614_groups_0 = const()[name = tensor<string, []>("op_614_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_1_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159326656))), name = tensor<string, []>("layers_1_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5143]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159316288))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_614_cast_fp16 = conv(dilations = var_614_dilations_0, groups = var_614_groups_0, pad = var_614_pad_0, pad_type = var_614_pad_type_0, strides = var_614_strides_0, weight = layers_1_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_13_cast_fp16)[name = tensor<string, []>("op_614_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_25_cast_fp16 = add(x = var_608_cast_fp16, y = var_614_cast_fp16)[name = tensor<string, []>("obj_25_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_11_cast_fp16 = add(x = inputs_9_cast_fp16, y = obj_25_cast_fp16)[name = tensor<string, []>("inputs_11_cast_fp16")];
            tensor<int32, [1]> out_11_axes_0 = const()[name = tensor<string, []>("out_11_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_625_to_fp16 = const()[name = tensor<string, []>("op_625_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_11_cast_fp16 = layer_norm(axes = out_11_axes_0, epsilon = var_625_to_fp16, x = inputs_11_cast_fp16)[name = tensor<string, []>("out_11_cast_fp16")];
            tensor<fp16, [1280]> input_15_gamma_0_to_fp16 = const()[name = tensor<string, []>("input_15_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159531520)))];
            tensor<fp16, [1280]> input_15_beta_0_to_fp16 = const()[name = tensor<string, []>("input_15_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159534144)))];
            tensor<fp16, []> input_15_epsilon_0_to_fp16 = const()[name = tensor<string, []>("input_15_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> input_15_cast_fp16 = batch_norm(beta = input_15_beta_0_to_fp16, epsilon = input_15_epsilon_0_to_fp16, gamma = input_15_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_11_cast_fp16)[name = tensor<string, []>("input_15_cast_fp16")];
            tensor<string, []> var_643_pad_type_0 = const()[name = tensor<string, []>("op_643_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_643_strides_0 = const()[name = tensor<string, []>("op_643_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_643_pad_0 = const()[name = tensor<string, []>("op_643_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_643_dilations_0 = const()[name = tensor<string, []>("op_643_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_643_groups_0 = const()[name = tensor<string, []>("op_643_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_1_fc1_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(159536768))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(162813632))), name = tensor<string, []>("layers_1_fc1_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [5120]> layers_1_fc1_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_fc1_inlier_module_bias_to_fp16"), val = tensor<fp16, [5120]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(162813760)))];
            tensor<fp16, [1, 5120, 1, 1]> var_643_cast_fp16 = conv(bias = layers_1_fc1_inlier_module_bias_to_fp16, dilations = var_643_dilations_0, groups = var_643_groups_0, pad = var_643_pad_0, pad_type = var_643_pad_type_0, strides = var_643_strides_0, weight = layers_1_fc1_inlier_module_weight_to_fp16_palettized, x = input_15_cast_fp16)[name = tensor<string, []>("op_643_cast_fp16")];
            tensor<string, []> var_649_pad_type_0 = const()[name = tensor<string, []>("op_649_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_649_strides_0 = const()[name = tensor<string, []>("op_649_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_649_pad_0 = const()[name = tensor<string, []>("op_649_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_649_dilations_0 = const()[name = tensor<string, []>("op_649_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_649_groups_0 = const()[name = tensor<string, []>("op_649_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_1_fc1_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(162909312))), name = tensor<string, []>("layers_1_fc1_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [42562]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(162824064))), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [1, 5120, 1, 1]> var_649_cast_fp16 = conv(dilations = var_649_dilations_0, groups = var_649_groups_0, pad = var_649_pad_0, pad_type = var_649_pad_type_0, strides = var_649_strides_0, weight = layers_1_fc1_outlier_module_weight_to_fp16_sparsified, x = input_15_cast_fp16)[name = tensor<string, []>("op_649_cast_fp16")];
            tensor<fp16, [1, 5120, 1, 1]> input_17_cast_fp16 = add(x = var_643_cast_fp16, y = var_649_cast_fp16)[name = tensor<string, []>("input_17_cast_fp16")];
            tensor<string, []> input_19_mode_0 = const()[name = tensor<string, []>("input_19_mode_0"), val = tensor<string, []>("EXACT")];
            tensor<fp16, [1, 5120, 1, 1]> input_19_cast_fp16 = gelu(mode = input_19_mode_0, x = input_17_cast_fp16)[name = tensor<string, []>("input_19_cast_fp16")];
            tensor<string, []> var_660_pad_type_0 = const()[name = tensor<string, []>("op_660_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_660_strides_0 = const()[name = tensor<string, []>("op_660_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_660_pad_0 = const()[name = tensor<string, []>("op_660_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_660_dilations_0 = const()[name = tensor<string, []>("op_660_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_660_groups_0 = const()[name = tensor<string, []>("op_660_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_1_fc2_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(163728576))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167005440))), name = tensor<string, []>("layers_1_fc2_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1280]> layers_1_fc2_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_1_fc2_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167005568)))];
            tensor<fp16, [1, 1280, 1, 1]> var_660_cast_fp16 = conv(bias = layers_1_fc2_inlier_module_bias_to_fp16, dilations = var_660_dilations_0, groups = var_660_groups_0, pad = var_660_pad_0, pad_type = var_660_pad_type_0, strides = var_660_strides_0, weight = layers_1_fc2_inlier_module_weight_to_fp16_palettized, x = input_19_cast_fp16)[name = tensor<string, []>("op_660_cast_fp16")];
            tensor<string, []> var_666_pad_type_0 = const()[name = tensor<string, []>("op_666_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_666_strides_0 = const()[name = tensor<string, []>("op_666_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_666_pad_0 = const()[name = tensor<string, []>("op_666_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_666_dilations_0 = const()[name = tensor<string, []>("op_666_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_666_groups_0 = const()[name = tensor<string, []>("op_666_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_1_fc2_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167096192))), name = tensor<string, []>("layers_1_fc2_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [43939]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167008192))), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_666_cast_fp16 = conv(dilations = var_666_dilations_0, groups = var_666_groups_0, pad = var_666_pad_0, pad_type = var_666_pad_type_0, strides = var_666_strides_0, weight = layers_1_fc2_outlier_module_weight_to_fp16_sparsified, x = input_19_cast_fp16)[name = tensor<string, []>("op_666_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> hidden_states_5_cast_fp16 = add(x = var_660_cast_fp16, y = var_666_cast_fp16)[name = tensor<string, []>("hidden_states_5_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_13_cast_fp16 = add(x = inputs_11_cast_fp16, y = hidden_states_5_cast_fp16)[name = tensor<string, []>("inputs_13_cast_fp16")];
            tensor<int32, []> var_678 = const()[name = tensor<string, []>("op_678"), val = tensor<int32, []>(3)];
            tensor<int32, [1]> out_13_axes_0 = const()[name = tensor<string, []>("out_13_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_704_to_fp16 = const()[name = tensor<string, []>("op_704_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_13_cast_fp16 = layer_norm(axes = out_13_axes_0, epsilon = var_704_to_fp16, x = inputs_13_cast_fp16)[name = tensor<string, []>("out_13_cast_fp16")];
            tensor<fp16, [1280]> obj_29_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_29_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167915456)))];
            tensor<fp16, [1280]> obj_29_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_29_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167918080)))];
            tensor<fp16, []> obj_29_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_29_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_29_cast_fp16 = batch_norm(beta = obj_29_beta_0_to_fp16, epsilon = obj_29_epsilon_0_to_fp16, gamma = obj_29_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_13_cast_fp16)[name = tensor<string, []>("obj_29_cast_fp16")];
            tensor<string, []> var_726_pad_type_0 = const()[name = tensor<string, []>("op_726_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_726_strides_0 = const()[name = tensor<string, []>("op_726_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_726_pad_0 = const()[name = tensor<string, []>("op_726_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_726_dilations_0 = const()[name = tensor<string, []>("op_726_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_726_groups_0 = const()[name = tensor<string, []>("op_726_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(167920704))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(168739968))), name = tensor<string, []>("layers_2_self_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_self_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_self_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(168740096)))];
            tensor<fp16, [1, 1280, 1, 1]> var_726_cast_fp16 = conv(bias = layers_2_self_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_726_dilations_0, groups = var_726_groups_0, pad = var_726_pad_0, pad_type = var_726_pad_type_0, strides = var_726_strides_0, weight = layers_2_self_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_29_cast_fp16)[name = tensor<string, []>("op_726_cast_fp16")];
            tensor<string, []> var_732_pad_type_0 = const()[name = tensor<string, []>("op_732_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_732_strides_0 = const()[name = tensor<string, []>("op_732_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_732_pad_0 = const()[name = tensor<string, []>("op_732_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_732_dilations_0 = const()[name = tensor<string, []>("op_732_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_732_groups_0 = const()[name = tensor<string, []>("op_732_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(168774976))), name = tensor<string, []>("layers_2_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [16094]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(168742720))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_732_cast_fp16 = conv(dilations = var_732_dilations_0, groups = var_732_groups_0, pad = var_732_pad_0, pad_type = var_732_pad_type_0, strides = var_732_strides_0, weight = layers_2_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_29_cast_fp16)[name = tensor<string, []>("op_732_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_9_cast_fp16 = add(x = var_726_cast_fp16, y = var_732_cast_fp16)[name = tensor<string, []>("query_9_cast_fp16")];
            tensor<string, []> var_741_pad_type_0 = const()[name = tensor<string, []>("op_741_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_741_strides_0 = const()[name = tensor<string, []>("op_741_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_741_pad_0 = const()[name = tensor<string, []>("op_741_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_741_dilations_0 = const()[name = tensor<string, []>("op_741_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_741_groups_0 = const()[name = tensor<string, []>("op_741_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(168979840))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(169799104))), name = tensor<string, []>("layers_2_self_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_741_cast_fp16 = conv(dilations = var_741_dilations_0, groups = var_741_groups_0, pad = var_741_pad_0, pad_type = var_741_pad_type_0, strides = var_741_strides_0, weight = layers_2_self_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = obj_29_cast_fp16)[name = tensor<string, []>("op_741_cast_fp16")];
            tensor<string, []> var_747_pad_type_0 = const()[name = tensor<string, []>("op_747_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_747_strides_0 = const()[name = tensor<string, []>("op_747_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_747_pad_0 = const()[name = tensor<string, []>("op_747_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_747_dilations_0 = const()[name = tensor<string, []>("op_747_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_747_groups_0 = const()[name = tensor<string, []>("op_747_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(169836736))), name = tensor<string, []>("layers_2_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [18690]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(169799232))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_747_cast_fp16 = conv(dilations = var_747_dilations_0, groups = var_747_groups_0, pad = var_747_pad_0, pad_type = var_747_pad_type_0, strides = var_747_strides_0, weight = layers_2_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = obj_29_cast_fp16)[name = tensor<string, []>("op_747_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_key_5_cast_fp16 = add(x = var_741_cast_fp16, y = var_747_cast_fp16)[name = tensor<string, []>("current_key_5_cast_fp16")];
            tensor<string, []> var_757_pad_type_0 = const()[name = tensor<string, []>("op_757_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_757_strides_0 = const()[name = tensor<string, []>("op_757_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_757_pad_0 = const()[name = tensor<string, []>("op_757_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_757_dilations_0 = const()[name = tensor<string, []>("op_757_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_757_groups_0 = const()[name = tensor<string, []>("op_757_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(170041600))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(170860864))), name = tensor<string, []>("layers_2_self_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_self_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_self_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(170860992)))];
            tensor<fp16, [1, 1280, 1, 1]> var_757_cast_fp16 = conv(bias = layers_2_self_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_757_dilations_0, groups = var_757_groups_0, pad = var_757_pad_0, pad_type = var_757_pad_type_0, strides = var_757_strides_0, weight = layers_2_self_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = obj_29_cast_fp16)[name = tensor<string, []>("op_757_cast_fp16")];
            tensor<string, []> var_763_pad_type_0 = const()[name = tensor<string, []>("op_763_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_763_strides_0 = const()[name = tensor<string, []>("op_763_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_763_pad_0 = const()[name = tensor<string, []>("op_763_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_763_dilations_0 = const()[name = tensor<string, []>("op_763_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_763_groups_0 = const()[name = tensor<string, []>("op_763_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(170876544))), name = tensor<string, []>("layers_2_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [6431]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(170863616))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_763_cast_fp16 = conv(dilations = var_763_dilations_0, groups = var_763_groups_0, pad = var_763_pad_0, pad_type = var_763_pad_type_0, strides = var_763_strides_0, weight = layers_2_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = obj_29_cast_fp16)[name = tensor<string, []>("op_763_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_value_5_cast_fp16 = add(x = var_757_cast_fp16, y = var_763_cast_fp16)[name = tensor<string, []>("current_value_5_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_769_cast_fp16 = mul(x = current_key_5_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_769_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_771_cast_fp16 = mul(x = var_53_cast_fp16_2, y = var_162_cast_fp16)[name = tensor<string, []>("op_771_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> key_9_cast_fp16 = add(x = var_769_cast_fp16, y = var_771_cast_fp16)[name = tensor<string, []>("key_9_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_773_cast_fp16 = mul(x = current_value_5_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_773_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_775_cast_fp16 = mul(x = var_60_cast_fp16_2, y = var_162_cast_fp16)[name = tensor<string, []>("op_775_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> value_9_cast_fp16 = add(x = var_773_cast_fp16, y = var_775_cast_fp16)[name = tensor<string, []>("value_9_cast_fp16")];
            tensor<int32, [4]> var_778 = const()[name = tensor<string, []>("op_778"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_9_cast_fp16 = reshape(shape = var_778, x = query_9_cast_fp16)[name = tensor<string, []>("mh_q_9_cast_fp16")];
            tensor<fp16, []> var_780_to_fp16 = const()[name = tensor<string, []>("op_780_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_781_cast_fp16 = mul(x = mh_q_9_cast_fp16, y = var_780_to_fp16)[name = tensor<string, []>("op_781_cast_fp16")];
            tensor<int32, [4]> var_782 = const()[name = tensor<string, []>("op_782"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_783_cast_fp16 = reshape(shape = var_782, x = key_9_cast_fp16)[name = tensor<string, []>("op_783_cast_fp16")];
            tensor<bool, []> mh_w_13_transpose_x_0 = const()[name = tensor<string, []>("mh_w_13_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_13_transpose_y_0 = const()[name = tensor<string, []>("mh_w_13_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 448]> mh_w_13_cast_fp16 = matmul(transpose_x = mh_w_13_transpose_x_0, transpose_y = mh_w_13_transpose_y_0, x = var_781_cast_fp16, y = var_783_cast_fp16)[name = tensor<string, []>("mh_w_13_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> mh_w_15_cast_fp16 = add(x = mh_w_13_cast_fp16, y = var_180_cast_fp16)[name = tensor<string, []>("mh_w_15_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> var_791_cast_fp16 = softmax(axis = var_678, x = mh_w_15_cast_fp16)[name = tensor<string, []>("op_791_cast_fp16")];
            tensor<int32, [4]> var_792 = const()[name = tensor<string, []>("op_792"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_793_cast_fp16 = reshape(shape = var_792, x = value_9_cast_fp16)[name = tensor<string, []>("op_793_cast_fp16")];
            tensor<bool, []> attn_9_transpose_x_0 = const()[name = tensor<string, []>("attn_9_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_9_transpose_y_0 = const()[name = tensor<string, []>("attn_9_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_9_cast_fp16 = matmul(transpose_x = attn_9_transpose_x_0, transpose_y = attn_9_transpose_y_0, x = var_793_cast_fp16, y = var_791_cast_fp16)[name = tensor<string, []>("attn_9_cast_fp16")];
            tensor<int32, [4]> var_796 = const()[name = tensor<string, []>("op_796"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_21_cast_fp16 = reshape(shape = var_796, x = attn_9_cast_fp16)[name = tensor<string, []>("input_21_cast_fp16")];
            tensor<string, []> var_806_pad_type_0 = const()[name = tensor<string, []>("op_806_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_806_strides_0 = const()[name = tensor<string, []>("op_806_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_806_pad_0 = const()[name = tensor<string, []>("op_806_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_806_dilations_0 = const()[name = tensor<string, []>("op_806_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_806_groups_0 = const()[name = tensor<string, []>("op_806_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(171081408))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(171900672))), name = tensor<string, []>("layers_2_self_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_self_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_self_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(171900800)))];
            tensor<fp16, [1, 1280, 1, 1]> var_806_cast_fp16 = conv(bias = layers_2_self_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_806_dilations_0, groups = var_806_groups_0, pad = var_806_pad_0, pad_type = var_806_pad_type_0, strides = var_806_strides_0, weight = layers_2_self_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_21_cast_fp16)[name = tensor<string, []>("op_806_cast_fp16")];
            tensor<string, []> var_812_pad_type_0 = const()[name = tensor<string, []>("op_812_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_812_strides_0 = const()[name = tensor<string, []>("op_812_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_812_pad_0 = const()[name = tensor<string, []>("op_812_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_812_dilations_0 = const()[name = tensor<string, []>("op_812_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_812_groups_0 = const()[name = tensor<string, []>("op_812_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(171914880))), name = tensor<string, []>("layers_2_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5678]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(171903424))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_812_cast_fp16 = conv(dilations = var_812_dilations_0, groups = var_812_groups_0, pad = var_812_pad_0, pad_type = var_812_pad_type_0, strides = var_812_strides_0, weight = layers_2_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_21_cast_fp16)[name = tensor<string, []>("op_812_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_35_cast_fp16 = add(x = var_806_cast_fp16, y = var_812_cast_fp16)[name = tensor<string, []>("obj_35_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_15_cast_fp16 = add(x = inputs_13_cast_fp16, y = obj_35_cast_fp16)[name = tensor<string, []>("inputs_15_cast_fp16")];
            tensor<int32, [1]> out_15_axes_0 = const()[name = tensor<string, []>("out_15_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_827_to_fp16 = const()[name = tensor<string, []>("op_827_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_15_cast_fp16 = layer_norm(axes = out_15_axes_0, epsilon = var_827_to_fp16, x = inputs_15_cast_fp16)[name = tensor<string, []>("out_15_cast_fp16")];
            tensor<fp16, [1280]> obj_37_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_37_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172119744)))];
            tensor<fp16, [1280]> obj_37_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_37_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172122368)))];
            tensor<fp16, []> obj_37_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_37_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_37_cast_fp16 = batch_norm(beta = obj_37_beta_0_to_fp16, epsilon = obj_37_epsilon_0_to_fp16, gamma = obj_37_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_15_cast_fp16)[name = tensor<string, []>("obj_37_cast_fp16")];
            tensor<string, []> var_849_pad_type_0 = const()[name = tensor<string, []>("op_849_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_849_strides_0 = const()[name = tensor<string, []>("op_849_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_849_pad_0 = const()[name = tensor<string, []>("op_849_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_849_dilations_0 = const()[name = tensor<string, []>("op_849_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_849_groups_0 = const()[name = tensor<string, []>("op_849_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172124992))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172944256))), name = tensor<string, []>("layers_2_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_encoder_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_encoder_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172944384)))];
            tensor<fp16, [1, 1280, 1, 1]> var_849_cast_fp16 = conv(bias = layers_2_encoder_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_849_dilations_0, groups = var_849_groups_0, pad = var_849_pad_0, pad_type = var_849_pad_type_0, strides = var_849_strides_0, weight = layers_2_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_37_cast_fp16)[name = tensor<string, []>("op_849_cast_fp16")];
            tensor<string, []> var_855_pad_type_0 = const()[name = tensor<string, []>("op_855_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_855_strides_0 = const()[name = tensor<string, []>("op_855_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_855_pad_0 = const()[name = tensor<string, []>("op_855_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_855_dilations_0 = const()[name = tensor<string, []>("op_855_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_855_groups_0 = const()[name = tensor<string, []>("op_855_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172974720))), name = tensor<string, []>("layers_2_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [13824]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(172947008))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_855_cast_fp16 = conv(dilations = var_855_dilations_0, groups = var_855_groups_0, pad = var_855_pad_0, pad_type = var_855_pad_type_0, strides = var_855_strides_0, weight = layers_2_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_37_cast_fp16)[name = tensor<string, []>("op_855_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_11_cast_fp16 = add(x = var_849_cast_fp16, y = var_855_cast_fp16)[name = tensor<string, []>("query_11_cast_fp16")];
            tensor<string, []> var_864_pad_type_0 = const()[name = tensor<string, []>("op_864_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_864_strides_0 = const()[name = tensor<string, []>("op_864_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_864_pad_0 = const()[name = tensor<string, []>("op_864_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_864_dilations_0 = const()[name = tensor<string, []>("op_864_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_864_groups_0 = const()[name = tensor<string, []>("op_864_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(173179584))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(173998848))), name = tensor<string, []>("layers_2_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_864_cast_fp16 = conv(dilations = var_864_dilations_0, groups = var_864_groups_0, pad = var_864_pad_0, pad_type = var_864_pad_type_0, strides = var_864_strides_0, weight = layers_2_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_864_cast_fp16")];
            tensor<string, []> var_870_pad_type_0 = const()[name = tensor<string, []>("op_870_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_870_strides_0 = const()[name = tensor<string, []>("op_870_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_870_pad_0 = const()[name = tensor<string, []>("op_870_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_870_dilations_0 = const()[name = tensor<string, []>("op_870_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_870_groups_0 = const()[name = tensor<string, []>("op_870_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(174026816))), name = tensor<string, []>("layers_2_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [13879]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(173998976))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_870_cast_fp16 = conv(dilations = var_870_dilations_0, groups = var_870_groups_0, pad = var_870_pad_0, pad_type = var_870_pad_type_0, strides = var_870_strides_0, weight = layers_2_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_870_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> key_11_cast_fp16 = add(x = var_864_cast_fp16, y = var_870_cast_fp16)[name = tensor<string, []>("key_11_cast_fp16")];
            tensor<string, []> var_880_pad_type_0 = const()[name = tensor<string, []>("op_880_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_880_strides_0 = const()[name = tensor<string, []>("op_880_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_880_pad_0 = const()[name = tensor<string, []>("op_880_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_880_dilations_0 = const()[name = tensor<string, []>("op_880_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_880_groups_0 = const()[name = tensor<string, []>("op_880_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(174231680))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(175050944))), name = tensor<string, []>("layers_2_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_encoder_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_encoder_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(175051072)))];
            tensor<fp16, [1, 1280, 1, 1500]> var_880_cast_fp16 = conv(bias = layers_2_encoder_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_880_dilations_0, groups = var_880_groups_0, pad = var_880_pad_0, pad_type = var_880_pad_type_0, strides = var_880_strides_0, weight = layers_2_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_880_cast_fp16")];
            tensor<string, []> var_886_pad_type_0 = const()[name = tensor<string, []>("op_886_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_886_strides_0 = const()[name = tensor<string, []>("op_886_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_886_pad_0 = const()[name = tensor<string, []>("op_886_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_886_dilations_0 = const()[name = tensor<string, []>("op_886_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_886_groups_0 = const()[name = tensor<string, []>("op_886_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(175065280))), name = tensor<string, []>("layers_2_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5756]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(175053696))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_886_cast_fp16 = conv(dilations = var_886_dilations_0, groups = var_886_groups_0, pad = var_886_pad_0, pad_type = var_886_pad_type_0, strides = var_886_strides_0, weight = layers_2_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_886_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> value_11_cast_fp16 = add(x = var_880_cast_fp16, y = var_886_cast_fp16)[name = tensor<string, []>("value_11_cast_fp16")];
            tensor<int32, [4]> var_889 = const()[name = tensor<string, []>("op_889"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_11_cast_fp16 = reshape(shape = var_889, x = query_11_cast_fp16)[name = tensor<string, []>("mh_q_11_cast_fp16")];
            tensor<fp16, []> var_891_to_fp16 = const()[name = tensor<string, []>("op_891_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_892_cast_fp16 = mul(x = mh_q_11_cast_fp16, y = var_891_to_fp16)[name = tensor<string, []>("op_892_cast_fp16")];
            tensor<int32, [4]> var_893 = const()[name = tensor<string, []>("op_893"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_894_cast_fp16 = reshape(shape = var_893, x = key_11_cast_fp16)[name = tensor<string, []>("op_894_cast_fp16")];
            tensor<bool, []> mh_w_17_transpose_x_0 = const()[name = tensor<string, []>("mh_w_17_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_17_transpose_y_0 = const()[name = tensor<string, []>("mh_w_17_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 1500]> mh_w_17_cast_fp16 = matmul(transpose_x = mh_w_17_transpose_x_0, transpose_y = mh_w_17_transpose_y_0, x = var_892_cast_fp16, y = var_894_cast_fp16)[name = tensor<string, []>("mh_w_17_cast_fp16")];
            tensor<fp16, [1, 20, 1, 1500]> obj_41_cast_fp16 = softmax(axis = var_678, x = mh_w_17_cast_fp16)[name = tensor<string, []>("obj_41_cast_fp16")];
            tensor<int32, [4]> var_898 = const()[name = tensor<string, []>("op_898"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_899_cast_fp16 = reshape(shape = var_898, x = value_11_cast_fp16)[name = tensor<string, []>("op_899_cast_fp16")];
            tensor<bool, []> attn_11_transpose_x_0 = const()[name = tensor<string, []>("attn_11_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_11_transpose_y_0 = const()[name = tensor<string, []>("attn_11_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_11_cast_fp16 = matmul(transpose_x = attn_11_transpose_x_0, transpose_y = attn_11_transpose_y_0, x = var_899_cast_fp16, y = obj_41_cast_fp16)[name = tensor<string, []>("attn_11_cast_fp16")];
            tensor<int32, [4]> var_902 = const()[name = tensor<string, []>("op_902"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_23_cast_fp16 = reshape(shape = var_902, x = attn_11_cast_fp16)[name = tensor<string, []>("input_23_cast_fp16")];
            tensor<string, []> var_912_pad_type_0 = const()[name = tensor<string, []>("op_912_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_912_strides_0 = const()[name = tensor<string, []>("op_912_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_912_pad_0 = const()[name = tensor<string, []>("op_912_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_912_dilations_0 = const()[name = tensor<string, []>("op_912_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_912_groups_0 = const()[name = tensor<string, []>("op_912_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(175270144))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176089408))), name = tensor<string, []>("layers_2_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_2_encoder_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_encoder_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176089536)))];
            tensor<fp16, [1, 1280, 1, 1]> var_912_cast_fp16 = conv(bias = layers_2_encoder_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_912_dilations_0, groups = var_912_groups_0, pad = var_912_pad_0, pad_type = var_912_pad_type_0, strides = var_912_strides_0, weight = layers_2_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_23_cast_fp16)[name = tensor<string, []>("op_912_cast_fp16")];
            tensor<string, []> var_918_pad_type_0 = const()[name = tensor<string, []>("op_918_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_918_strides_0 = const()[name = tensor<string, []>("op_918_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_918_pad_0 = const()[name = tensor<string, []>("op_918_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_918_dilations_0 = const()[name = tensor<string, []>("op_918_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_918_groups_0 = const()[name = tensor<string, []>("op_918_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_2_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176105152))), name = tensor<string, []>("layers_2_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [6438]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176092160))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_918_cast_fp16 = conv(dilations = var_918_dilations_0, groups = var_918_groups_0, pad = var_918_pad_0, pad_type = var_918_pad_type_0, strides = var_918_strides_0, weight = layers_2_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_23_cast_fp16)[name = tensor<string, []>("op_918_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_39_cast_fp16 = add(x = var_912_cast_fp16, y = var_918_cast_fp16)[name = tensor<string, []>("obj_39_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_17_cast_fp16 = add(x = inputs_15_cast_fp16, y = obj_39_cast_fp16)[name = tensor<string, []>("inputs_17_cast_fp16")];
            tensor<int32, [1]> out_17_axes_0 = const()[name = tensor<string, []>("out_17_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_932_to_fp16 = const()[name = tensor<string, []>("op_932_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_17_cast_fp16 = layer_norm(axes = out_17_axes_0, epsilon = var_932_to_fp16, x = inputs_17_cast_fp16)[name = tensor<string, []>("out_17_cast_fp16")];
            tensor<fp16, [1280]> input_25_gamma_0_to_fp16 = const()[name = tensor<string, []>("input_25_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176310016)))];
            tensor<fp16, [1280]> input_25_beta_0_to_fp16 = const()[name = tensor<string, []>("input_25_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176312640)))];
            tensor<fp16, []> input_25_epsilon_0_to_fp16 = const()[name = tensor<string, []>("input_25_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> input_25_cast_fp16 = batch_norm(beta = input_25_beta_0_to_fp16, epsilon = input_25_epsilon_0_to_fp16, gamma = input_25_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_17_cast_fp16)[name = tensor<string, []>("input_25_cast_fp16")];
            tensor<string, []> var_950_pad_type_0 = const()[name = tensor<string, []>("op_950_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_950_strides_0 = const()[name = tensor<string, []>("op_950_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_950_pad_0 = const()[name = tensor<string, []>("op_950_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_950_dilations_0 = const()[name = tensor<string, []>("op_950_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_950_groups_0 = const()[name = tensor<string, []>("op_950_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_2_fc1_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(176315264))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(179592128))), name = tensor<string, []>("layers_2_fc1_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [5120]> layers_2_fc1_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_fc1_inlier_module_bias_to_fp16"), val = tensor<fp16, [5120]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(179592256)))];
            tensor<fp16, [1, 5120, 1, 1]> var_950_cast_fp16 = conv(bias = layers_2_fc1_inlier_module_bias_to_fp16, dilations = var_950_dilations_0, groups = var_950_groups_0, pad = var_950_pad_0, pad_type = var_950_pad_type_0, strides = var_950_strides_0, weight = layers_2_fc1_inlier_module_weight_to_fp16_palettized, x = input_25_cast_fp16)[name = tensor<string, []>("op_950_cast_fp16")];
            tensor<string, []> var_956_pad_type_0 = const()[name = tensor<string, []>("op_956_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_956_strides_0 = const()[name = tensor<string, []>("op_956_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_956_pad_0 = const()[name = tensor<string, []>("op_956_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_956_dilations_0 = const()[name = tensor<string, []>("op_956_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_956_groups_0 = const()[name = tensor<string, []>("op_956_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_2_fc1_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(179764480))), name = tensor<string, []>("layers_2_fc1_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [80920]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(179602560))), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [1, 5120, 1, 1]> var_956_cast_fp16 = conv(dilations = var_956_dilations_0, groups = var_956_groups_0, pad = var_956_pad_0, pad_type = var_956_pad_type_0, strides = var_956_strides_0, weight = layers_2_fc1_outlier_module_weight_to_fp16_sparsified, x = input_25_cast_fp16)[name = tensor<string, []>("op_956_cast_fp16")];
            tensor<fp16, [1, 5120, 1, 1]> input_27_cast_fp16 = add(x = var_950_cast_fp16, y = var_956_cast_fp16)[name = tensor<string, []>("input_27_cast_fp16")];
            tensor<string, []> input_29_mode_0 = const()[name = tensor<string, []>("input_29_mode_0"), val = tensor<string, []>("EXACT")];
            tensor<fp16, [1, 5120, 1, 1]> input_29_cast_fp16 = gelu(mode = input_29_mode_0, x = input_27_cast_fp16)[name = tensor<string, []>("input_29_cast_fp16")];
            tensor<string, []> var_967_pad_type_0 = const()[name = tensor<string, []>("op_967_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_967_strides_0 = const()[name = tensor<string, []>("op_967_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_967_pad_0 = const()[name = tensor<string, []>("op_967_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_967_dilations_0 = const()[name = tensor<string, []>("op_967_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_967_groups_0 = const()[name = tensor<string, []>("op_967_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_2_fc2_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(180583744))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(183860608))), name = tensor<string, []>("layers_2_fc2_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1280]> layers_2_fc2_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_2_fc2_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(183860736)))];
            tensor<fp16, [1, 1280, 1, 1]> var_967_cast_fp16 = conv(bias = layers_2_fc2_inlier_module_bias_to_fp16, dilations = var_967_dilations_0, groups = var_967_groups_0, pad = var_967_pad_0, pad_type = var_967_pad_type_0, strides = var_967_strides_0, weight = layers_2_fc2_inlier_module_weight_to_fp16_palettized, x = input_29_cast_fp16)[name = tensor<string, []>("op_967_cast_fp16")];
            tensor<string, []> var_973_pad_type_0 = const()[name = tensor<string, []>("op_973_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_973_strides_0 = const()[name = tensor<string, []>("op_973_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_973_pad_0 = const()[name = tensor<string, []>("op_973_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_973_dilations_0 = const()[name = tensor<string, []>("op_973_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_973_groups_0 = const()[name = tensor<string, []>("op_973_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_2_fc2_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(183943552))), name = tensor<string, []>("layers_2_fc2_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [40054]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(183863360))), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_973_cast_fp16 = conv(dilations = var_973_dilations_0, groups = var_973_groups_0, pad = var_973_pad_0, pad_type = var_973_pad_type_0, strides = var_973_strides_0, weight = layers_2_fc2_outlier_module_weight_to_fp16_sparsified, x = input_29_cast_fp16)[name = tensor<string, []>("op_973_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> hidden_states_7_cast_fp16 = add(x = var_967_cast_fp16, y = var_973_cast_fp16)[name = tensor<string, []>("hidden_states_7_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_19_cast_fp16 = add(x = inputs_17_cast_fp16, y = hidden_states_7_cast_fp16)[name = tensor<string, []>("inputs_19_cast_fp16")];
            tensor<int32, []> var_986 = const()[name = tensor<string, []>("op_986"), val = tensor<int32, []>(3)];
            tensor<int32, [1]> out_19_axes_0 = const()[name = tensor<string, []>("out_19_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_1012_to_fp16 = const()[name = tensor<string, []>("op_1012_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_19_cast_fp16 = layer_norm(axes = out_19_axes_0, epsilon = var_1012_to_fp16, x = inputs_19_cast_fp16)[name = tensor<string, []>("out_19_cast_fp16")];
            tensor<fp16, [1280]> obj_43_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_43_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(184762816)))];
            tensor<fp16, [1280]> obj_43_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_43_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(184765440)))];
            tensor<fp16, []> obj_43_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_43_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_43_cast_fp16 = batch_norm(beta = obj_43_beta_0_to_fp16, epsilon = obj_43_epsilon_0_to_fp16, gamma = obj_43_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_19_cast_fp16)[name = tensor<string, []>("obj_43_cast_fp16")];
            tensor<string, []> var_1034_pad_type_0 = const()[name = tensor<string, []>("op_1034_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1034_strides_0 = const()[name = tensor<string, []>("op_1034_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1034_pad_0 = const()[name = tensor<string, []>("op_1034_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1034_dilations_0 = const()[name = tensor<string, []>("op_1034_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1034_groups_0 = const()[name = tensor<string, []>("op_1034_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(184768064))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(185587328))), name = tensor<string, []>("layers_3_self_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_self_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_self_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(185587456)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1034_cast_fp16 = conv(bias = layers_3_self_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_1034_dilations_0, groups = var_1034_groups_0, pad = var_1034_pad_0, pad_type = var_1034_pad_type_0, strides = var_1034_strides_0, weight = layers_3_self_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1034_cast_fp16")];
            tensor<string, []> var_1040_pad_type_0 = const()[name = tensor<string, []>("op_1040_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1040_strides_0 = const()[name = tensor<string, []>("op_1040_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1040_pad_0 = const()[name = tensor<string, []>("op_1040_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1040_dilations_0 = const()[name = tensor<string, []>("op_1040_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1040_groups_0 = const()[name = tensor<string, []>("op_1040_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(185611520))), name = tensor<string, []>("layers_3_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [10664]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(185590080))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1040_cast_fp16 = conv(dilations = var_1040_dilations_0, groups = var_1040_groups_0, pad = var_1040_pad_0, pad_type = var_1040_pad_type_0, strides = var_1040_strides_0, weight = layers_3_self_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1040_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_13_cast_fp16 = add(x = var_1034_cast_fp16, y = var_1040_cast_fp16)[name = tensor<string, []>("query_13_cast_fp16")];
            tensor<string, []> var_1049_pad_type_0 = const()[name = tensor<string, []>("op_1049_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1049_strides_0 = const()[name = tensor<string, []>("op_1049_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1049_pad_0 = const()[name = tensor<string, []>("op_1049_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1049_dilations_0 = const()[name = tensor<string, []>("op_1049_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1049_groups_0 = const()[name = tensor<string, []>("op_1049_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(185816384))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(186635648))), name = tensor<string, []>("layers_3_self_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1049_cast_fp16 = conv(dilations = var_1049_dilations_0, groups = var_1049_groups_0, pad = var_1049_pad_0, pad_type = var_1049_pad_type_0, strides = var_1049_strides_0, weight = layers_3_self_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1049_cast_fp16")];
            tensor<string, []> var_1055_pad_type_0 = const()[name = tensor<string, []>("op_1055_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1055_strides_0 = const()[name = tensor<string, []>("op_1055_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1055_pad_0 = const()[name = tensor<string, []>("op_1055_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1055_dilations_0 = const()[name = tensor<string, []>("op_1055_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1055_groups_0 = const()[name = tensor<string, []>("op_1055_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(186656640))), name = tensor<string, []>("layers_3_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [10387]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(186635776))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1055_cast_fp16 = conv(dilations = var_1055_dilations_0, groups = var_1055_groups_0, pad = var_1055_pad_0, pad_type = var_1055_pad_type_0, strides = var_1055_strides_0, weight = layers_3_self_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1055_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_key_cast_fp16 = add(x = var_1049_cast_fp16, y = var_1055_cast_fp16)[name = tensor<string, []>("current_key_cast_fp16")];
            tensor<string, []> var_1065_pad_type_0 = const()[name = tensor<string, []>("op_1065_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1065_strides_0 = const()[name = tensor<string, []>("op_1065_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1065_pad_0 = const()[name = tensor<string, []>("op_1065_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1065_dilations_0 = const()[name = tensor<string, []>("op_1065_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1065_groups_0 = const()[name = tensor<string, []>("op_1065_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(186861504))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(187680768))), name = tensor<string, []>("layers_3_self_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_self_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_self_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(187680896)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1065_cast_fp16 = conv(bias = layers_3_self_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_1065_dilations_0, groups = var_1065_groups_0, pad = var_1065_pad_0, pad_type = var_1065_pad_type_0, strides = var_1065_strides_0, weight = layers_3_self_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1065_cast_fp16")];
            tensor<string, []> var_1071_pad_type_0 = const()[name = tensor<string, []>("op_1071_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1071_strides_0 = const()[name = tensor<string, []>("op_1071_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1071_pad_0 = const()[name = tensor<string, []>("op_1071_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1071_dilations_0 = const()[name = tensor<string, []>("op_1071_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1071_groups_0 = const()[name = tensor<string, []>("op_1071_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(187698304))), name = tensor<string, []>("layers_3_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [7342]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(187683520))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1071_cast_fp16 = conv(dilations = var_1071_dilations_0, groups = var_1071_groups_0, pad = var_1071_pad_0, pad_type = var_1071_pad_type_0, strides = var_1071_strides_0, weight = layers_3_self_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = obj_43_cast_fp16)[name = tensor<string, []>("op_1071_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> current_value_cast_fp16 = add(x = var_1065_cast_fp16, y = var_1071_cast_fp16)[name = tensor<string, []>("current_value_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_1077_cast_fp16 = mul(x = current_key_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_1077_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_1079_cast_fp16 = mul(x = var_53_cast_fp16_3, y = var_162_cast_fp16)[name = tensor<string, []>("op_1079_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> key_13_cast_fp16 = add(x = var_1077_cast_fp16, y = var_1079_cast_fp16)[name = tensor<string, []>("key_13_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_1081_cast_fp16 = mul(x = current_value_cast_fp16, y = var_159_cast_fp16)[name = tensor<string, []>("op_1081_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> var_1083_cast_fp16 = mul(x = var_60_cast_fp16_3, y = var_162_cast_fp16)[name = tensor<string, []>("op_1083_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 448]> value_13_cast_fp16 = add(x = var_1081_cast_fp16, y = var_1083_cast_fp16)[name = tensor<string, []>("value_13_cast_fp16")];
            tensor<int32, [4]> var_1086 = const()[name = tensor<string, []>("op_1086"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_13_cast_fp16 = reshape(shape = var_1086, x = query_13_cast_fp16)[name = tensor<string, []>("mh_q_13_cast_fp16")];
            tensor<fp16, []> var_1088_to_fp16 = const()[name = tensor<string, []>("op_1088_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_1089_cast_fp16 = mul(x = mh_q_13_cast_fp16, y = var_1088_to_fp16)[name = tensor<string, []>("op_1089_cast_fp16")];
            tensor<int32, [4]> var_1090 = const()[name = tensor<string, []>("op_1090"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_1091_cast_fp16 = reshape(shape = var_1090, x = key_13_cast_fp16)[name = tensor<string, []>("op_1091_cast_fp16")];
            tensor<bool, []> mh_w_19_transpose_x_0 = const()[name = tensor<string, []>("mh_w_19_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_19_transpose_y_0 = const()[name = tensor<string, []>("mh_w_19_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 448]> mh_w_19_cast_fp16 = matmul(transpose_x = mh_w_19_transpose_x_0, transpose_y = mh_w_19_transpose_y_0, x = var_1089_cast_fp16, y = var_1091_cast_fp16)[name = tensor<string, []>("mh_w_19_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> mh_w_21_cast_fp16 = add(x = mh_w_19_cast_fp16, y = var_180_cast_fp16)[name = tensor<string, []>("mh_w_21_cast_fp16")];
            tensor<fp16, [1, 20, 1, 448]> var_1099_cast_fp16 = softmax(axis = var_986, x = mh_w_21_cast_fp16)[name = tensor<string, []>("op_1099_cast_fp16")];
            tensor<int32, [4]> var_1100 = const()[name = tensor<string, []>("op_1100"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 448]> var_1101_cast_fp16 = reshape(shape = var_1100, x = value_13_cast_fp16)[name = tensor<string, []>("op_1101_cast_fp16")];
            tensor<bool, []> attn_13_transpose_x_0 = const()[name = tensor<string, []>("attn_13_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_13_transpose_y_0 = const()[name = tensor<string, []>("attn_13_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_13_cast_fp16 = matmul(transpose_x = attn_13_transpose_x_0, transpose_y = attn_13_transpose_y_0, x = var_1101_cast_fp16, y = var_1099_cast_fp16)[name = tensor<string, []>("attn_13_cast_fp16")];
            tensor<int32, [4]> var_1104 = const()[name = tensor<string, []>("op_1104"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_31_cast_fp16 = reshape(shape = var_1104, x = attn_13_cast_fp16)[name = tensor<string, []>("input_31_cast_fp16")];
            tensor<string, []> var_1114_pad_type_0 = const()[name = tensor<string, []>("op_1114_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1114_strides_0 = const()[name = tensor<string, []>("op_1114_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1114_pad_0 = const()[name = tensor<string, []>("op_1114_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1114_dilations_0 = const()[name = tensor<string, []>("op_1114_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1114_groups_0 = const()[name = tensor<string, []>("op_1114_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(187903168))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188722432))), name = tensor<string, []>("layers_3_self_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_self_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_self_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188722560)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1114_cast_fp16 = conv(bias = layers_3_self_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_1114_dilations_0, groups = var_1114_groups_0, pad = var_1114_pad_0, pad_type = var_1114_pad_type_0, strides = var_1114_strides_0, weight = layers_3_self_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_31_cast_fp16)[name = tensor<string, []>("op_1114_cast_fp16")];
            tensor<string, []> var_1120_pad_type_0 = const()[name = tensor<string, []>("op_1120_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1120_strides_0 = const()[name = tensor<string, []>("op_1120_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1120_pad_0 = const()[name = tensor<string, []>("op_1120_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1120_dilations_0 = const()[name = tensor<string, []>("op_1120_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1120_groups_0 = const()[name = tensor<string, []>("op_1120_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188739712))), name = tensor<string, []>("layers_3_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [7219]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188725184))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1120_cast_fp16 = conv(dilations = var_1120_dilations_0, groups = var_1120_groups_0, pad = var_1120_pad_0, pad_type = var_1120_pad_type_0, strides = var_1120_strides_0, weight = layers_3_self_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_31_cast_fp16)[name = tensor<string, []>("op_1120_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_49_cast_fp16 = add(x = var_1114_cast_fp16, y = var_1120_cast_fp16)[name = tensor<string, []>("obj_49_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_21_cast_fp16 = add(x = inputs_19_cast_fp16, y = obj_49_cast_fp16)[name = tensor<string, []>("inputs_21_cast_fp16")];
            tensor<int32, [1]> out_21_axes_0 = const()[name = tensor<string, []>("out_21_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_1135_to_fp16 = const()[name = tensor<string, []>("op_1135_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_21_cast_fp16 = layer_norm(axes = out_21_axes_0, epsilon = var_1135_to_fp16, x = inputs_21_cast_fp16)[name = tensor<string, []>("out_21_cast_fp16")];
            tensor<fp16, [1280]> obj_51_gamma_0_to_fp16 = const()[name = tensor<string, []>("obj_51_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188944576)))];
            tensor<fp16, [1280]> obj_51_beta_0_to_fp16 = const()[name = tensor<string, []>("obj_51_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188947200)))];
            tensor<fp16, []> obj_51_epsilon_0_to_fp16 = const()[name = tensor<string, []>("obj_51_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> obj_51_cast_fp16 = batch_norm(beta = obj_51_beta_0_to_fp16, epsilon = obj_51_epsilon_0_to_fp16, gamma = obj_51_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_21_cast_fp16)[name = tensor<string, []>("obj_51_cast_fp16")];
            tensor<string, []> var_1157_pad_type_0 = const()[name = tensor<string, []>("op_1157_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1157_strides_0 = const()[name = tensor<string, []>("op_1157_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1157_pad_0 = const()[name = tensor<string, []>("op_1157_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1157_dilations_0 = const()[name = tensor<string, []>("op_1157_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1157_groups_0 = const()[name = tensor<string, []>("op_1157_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(188949824))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(189769088))), name = tensor<string, []>("layers_3_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_encoder_attn_q_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_encoder_attn_q_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(189769216)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1157_cast_fp16 = conv(bias = layers_3_encoder_attn_q_proj_inlier_module_bias_to_fp16, dilations = var_1157_dilations_0, groups = var_1157_groups_0, pad = var_1157_pad_0, pad_type = var_1157_pad_type_0, strides = var_1157_strides_0, weight = layers_3_encoder_attn_q_proj_inlier_module_weight_to_fp16_palettized, x = obj_51_cast_fp16)[name = tensor<string, []>("op_1157_cast_fp16")];
            tensor<string, []> var_1163_pad_type_0 = const()[name = tensor<string, []>("op_1163_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1163_strides_0 = const()[name = tensor<string, []>("op_1163_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1163_pad_0 = const()[name = tensor<string, []>("op_1163_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1163_dilations_0 = const()[name = tensor<string, []>("op_1163_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1163_groups_0 = const()[name = tensor<string, []>("op_1163_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(189787264))), name = tensor<string, []>("layers_3_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [7675]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(189771840))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1163_cast_fp16 = conv(dilations = var_1163_dilations_0, groups = var_1163_groups_0, pad = var_1163_pad_0, pad_type = var_1163_pad_type_0, strides = var_1163_strides_0, weight = layers_3_encoder_attn_q_proj_outlier_module_weight_to_fp16_sparsified, x = obj_51_cast_fp16)[name = tensor<string, []>("op_1163_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> query_cast_fp16 = add(x = var_1157_cast_fp16, y = var_1163_cast_fp16)[name = tensor<string, []>("query_cast_fp16")];
            tensor<string, []> var_1172_pad_type_0 = const()[name = tensor<string, []>("op_1172_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1172_strides_0 = const()[name = tensor<string, []>("op_1172_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1172_pad_0 = const()[name = tensor<string, []>("op_1172_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1172_dilations_0 = const()[name = tensor<string, []>("op_1172_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1172_groups_0 = const()[name = tensor<string, []>("op_1172_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(189992128))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(190811392))), name = tensor<string, []>("layers_3_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_1172_cast_fp16 = conv(dilations = var_1172_dilations_0, groups = var_1172_groups_0, pad = var_1172_pad_0, pad_type = var_1172_pad_type_0, strides = var_1172_strides_0, weight = layers_3_encoder_attn_k_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_1172_cast_fp16")];
            tensor<string, []> var_1178_pad_type_0 = const()[name = tensor<string, []>("op_1178_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1178_strides_0 = const()[name = tensor<string, []>("op_1178_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1178_pad_0 = const()[name = tensor<string, []>("op_1178_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1178_dilations_0 = const()[name = tensor<string, []>("op_1178_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1178_groups_0 = const()[name = tensor<string, []>("op_1178_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(190834240))), name = tensor<string, []>("layers_3_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [11308]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(190811520))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_1178_cast_fp16 = conv(dilations = var_1178_dilations_0, groups = var_1178_groups_0, pad = var_1178_pad_0, pad_type = var_1178_pad_type_0, strides = var_1178_strides_0, weight = layers_3_encoder_attn_k_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_1178_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> key_cast_fp16 = add(x = var_1172_cast_fp16, y = var_1178_cast_fp16)[name = tensor<string, []>("key_cast_fp16")];
            tensor<string, []> var_1188_pad_type_0 = const()[name = tensor<string, []>("op_1188_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1188_strides_0 = const()[name = tensor<string, []>("op_1188_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1188_pad_0 = const()[name = tensor<string, []>("op_1188_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1188_dilations_0 = const()[name = tensor<string, []>("op_1188_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1188_groups_0 = const()[name = tensor<string, []>("op_1188_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(191039104))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(191858368))), name = tensor<string, []>("layers_3_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_encoder_attn_v_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_encoder_attn_v_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(191858496)))];
            tensor<fp16, [1, 1280, 1, 1500]> var_1188_cast_fp16 = conv(bias = layers_3_encoder_attn_v_proj_inlier_module_bias_to_fp16, dilations = var_1188_dilations_0, groups = var_1188_groups_0, pad = var_1188_pad_0, pad_type = var_1188_pad_type_0, strides = var_1188_strides_0, weight = layers_3_encoder_attn_v_proj_inlier_module_weight_to_fp16_palettized, x = encoder_output_embeds)[name = tensor<string, []>("op_1188_cast_fp16")];
            tensor<string, []> var_1194_pad_type_0 = const()[name = tensor<string, []>("op_1194_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1194_strides_0 = const()[name = tensor<string, []>("op_1194_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1194_pad_0 = const()[name = tensor<string, []>("op_1194_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1194_dilations_0 = const()[name = tensor<string, []>("op_1194_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1194_groups_0 = const()[name = tensor<string, []>("op_1194_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(191874944))), name = tensor<string, []>("layers_3_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [6870]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(191861120))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1500]> var_1194_cast_fp16 = conv(dilations = var_1194_dilations_0, groups = var_1194_groups_0, pad = var_1194_pad_0, pad_type = var_1194_pad_type_0, strides = var_1194_strides_0, weight = layers_3_encoder_attn_v_proj_outlier_module_weight_to_fp16_sparsified, x = encoder_output_embeds)[name = tensor<string, []>("op_1194_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1500]> value_cast_fp16 = add(x = var_1188_cast_fp16, y = var_1194_cast_fp16)[name = tensor<string, []>("value_cast_fp16")];
            tensor<int32, [4]> var_1197 = const()[name = tensor<string, []>("op_1197"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1]> mh_q_cast_fp16 = reshape(shape = var_1197, x = query_cast_fp16)[name = tensor<string, []>("mh_q_cast_fp16")];
            tensor<fp16, []> var_1199_to_fp16 = const()[name = tensor<string, []>("op_1199_to_fp16"), val = tensor<fp16, []>(0x1p-3)];
            tensor<fp16, [1, 20, 64, 1]> var_1200_cast_fp16 = mul(x = mh_q_cast_fp16, y = var_1199_to_fp16)[name = tensor<string, []>("op_1200_cast_fp16")];
            tensor<int32, [4]> var_1201 = const()[name = tensor<string, []>("op_1201"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_1202_cast_fp16 = reshape(shape = var_1201, x = key_cast_fp16)[name = tensor<string, []>("op_1202_cast_fp16")];
            tensor<bool, []> mh_w_transpose_x_0 = const()[name = tensor<string, []>("mh_w_transpose_x_0"), val = tensor<bool, []>(true)];
            tensor<bool, []> mh_w_transpose_y_0 = const()[name = tensor<string, []>("mh_w_transpose_y_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 20, 1, 1500]> mh_w_cast_fp16 = matmul(transpose_x = mh_w_transpose_x_0, transpose_y = mh_w_transpose_y_0, x = var_1200_cast_fp16, y = var_1202_cast_fp16)[name = tensor<string, []>("mh_w_cast_fp16")];
            tensor<fp16, [1, 20, 1, 1500]> obj_55_cast_fp16 = softmax(axis = var_986, x = mh_w_cast_fp16)[name = tensor<string, []>("obj_55_cast_fp16")];
            tensor<int32, [4]> var_1206 = const()[name = tensor<string, []>("op_1206"), val = tensor<int32, [4]>([1, 20, 64, -1])];
            tensor<fp16, [1, 20, 64, 1500]> var_1207_cast_fp16 = reshape(shape = var_1206, x = value_cast_fp16)[name = tensor<string, []>("op_1207_cast_fp16")];
            tensor<bool, []> attn_transpose_x_0 = const()[name = tensor<string, []>("attn_transpose_x_0"), val = tensor<bool, []>(false)];
            tensor<bool, []> attn_transpose_y_0 = const()[name = tensor<string, []>("attn_transpose_y_0"), val = tensor<bool, []>(true)];
            tensor<fp16, [1, 20, 64, 1]> attn_cast_fp16 = matmul(transpose_x = attn_transpose_x_0, transpose_y = attn_transpose_y_0, x = var_1207_cast_fp16, y = obj_55_cast_fp16)[name = tensor<string, []>("attn_cast_fp16")];
            tensor<int32, [4]> var_1210 = const()[name = tensor<string, []>("op_1210"), val = tensor<int32, [4]>([1, 1280, 1, -1])];
            tensor<fp16, [1, 1280, 1, 1]> input_33_cast_fp16 = reshape(shape = var_1210, x = attn_cast_fp16)[name = tensor<string, []>("input_33_cast_fp16")];
            tensor<string, []> var_1220_pad_type_0 = const()[name = tensor<string, []>("op_1220_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1220_strides_0 = const()[name = tensor<string, []>("op_1220_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1220_pad_0 = const()[name = tensor<string, []>("op_1220_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1220_dilations_0 = const()[name = tensor<string, []>("op_1220_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1220_groups_0 = const()[name = tensor<string, []>("op_1220_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(192079808))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(192899072))), name = tensor<string, []>("layers_3_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1280]> layers_3_encoder_attn_o_proj_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_encoder_attn_o_proj_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(192899200)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1220_cast_fp16 = conv(bias = layers_3_encoder_attn_o_proj_inlier_module_bias_to_fp16, dilations = var_1220_dilations_0, groups = var_1220_groups_0, pad = var_1220_pad_0, pad_type = var_1220_pad_type_0, strides = var_1220_strides_0, weight = layers_3_encoder_attn_o_proj_inlier_module_weight_to_fp16_palettized, x = input_33_cast_fp16)[name = tensor<string, []>("op_1220_cast_fp16")];
            tensor<string, []> var_1226_pad_type_0 = const()[name = tensor<string, []>("op_1226_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1226_strides_0 = const()[name = tensor<string, []>("op_1226_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1226_pad_0 = const()[name = tensor<string, []>("op_1226_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1226_dilations_0 = const()[name = tensor<string, []>("op_1226_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1226_groups_0 = const()[name = tensor<string, []>("op_1226_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 1280, 1, 1]> layers_3_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [204800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(192913536))), name = tensor<string, []>("layers_3_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [5809]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(192901824))), shape = tensor<uint32, [4]>([1280, 1280, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1226_cast_fp16 = conv(dilations = var_1226_dilations_0, groups = var_1226_groups_0, pad = var_1226_pad_0, pad_type = var_1226_pad_type_0, strides = var_1226_strides_0, weight = layers_3_encoder_attn_o_proj_outlier_module_weight_to_fp16_sparsified, x = input_33_cast_fp16)[name = tensor<string, []>("op_1226_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> obj_53_cast_fp16 = add(x = var_1220_cast_fp16, y = var_1226_cast_fp16)[name = tensor<string, []>("obj_53_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_23_cast_fp16 = add(x = inputs_21_cast_fp16, y = obj_53_cast_fp16)[name = tensor<string, []>("inputs_23_cast_fp16")];
            tensor<int32, [1]> out_23_axes_0 = const()[name = tensor<string, []>("out_23_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_1240_to_fp16 = const()[name = tensor<string, []>("op_1240_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_23_cast_fp16 = layer_norm(axes = out_23_axes_0, epsilon = var_1240_to_fp16, x = inputs_23_cast_fp16)[name = tensor<string, []>("out_23_cast_fp16")];
            tensor<fp16, [1280]> input_35_gamma_0_to_fp16 = const()[name = tensor<string, []>("input_35_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(193118400)))];
            tensor<fp16, [1280]> input_35_beta_0_to_fp16 = const()[name = tensor<string, []>("input_35_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(193121024)))];
            tensor<fp16, []> input_35_epsilon_0_to_fp16 = const()[name = tensor<string, []>("input_35_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> input_35_cast_fp16 = batch_norm(beta = input_35_beta_0_to_fp16, epsilon = input_35_epsilon_0_to_fp16, gamma = input_35_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_23_cast_fp16)[name = tensor<string, []>("input_35_cast_fp16")];
            tensor<string, []> var_1258_pad_type_0 = const()[name = tensor<string, []>("op_1258_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1258_strides_0 = const()[name = tensor<string, []>("op_1258_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1258_pad_0 = const()[name = tensor<string, []>("op_1258_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1258_dilations_0 = const()[name = tensor<string, []>("op_1258_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1258_groups_0 = const()[name = tensor<string, []>("op_1258_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_3_fc1_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [3276800]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(193123648))), lut = tensor<fp16, [16]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(196400512))), name = tensor<string, []>("layers_3_fc1_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [5120]> layers_3_fc1_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_fc1_inlier_module_bias_to_fp16"), val = tensor<fp16, [5120]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(196400640)))];
            tensor<fp16, [1, 5120, 1, 1]> var_1258_cast_fp16 = conv(bias = layers_3_fc1_inlier_module_bias_to_fp16, dilations = var_1258_dilations_0, groups = var_1258_groups_0, pad = var_1258_pad_0, pad_type = var_1258_pad_type_0, strides = var_1258_strides_0, weight = layers_3_fc1_inlier_module_weight_to_fp16_palettized, x = input_35_cast_fp16)[name = tensor<string, []>("op_1258_cast_fp16")];
            tensor<string, []> var_1264_pad_type_0 = const()[name = tensor<string, []>("op_1264_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1264_strides_0 = const()[name = tensor<string, []>("op_1264_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1264_pad_0 = const()[name = tensor<string, []>("op_1264_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1264_dilations_0 = const()[name = tensor<string, []>("op_1264_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1264_groups_0 = const()[name = tensor<string, []>("op_1264_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [5120, 1280, 1, 1]> layers_3_fc1_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(196463680))), name = tensor<string, []>("layers_3_fc1_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [26331]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(196410944))), shape = tensor<uint32, [4]>([5120, 1280, 1, 1])];
            tensor<fp16, [1, 5120, 1, 1]> var_1264_cast_fp16 = conv(dilations = var_1264_dilations_0, groups = var_1264_groups_0, pad = var_1264_pad_0, pad_type = var_1264_pad_type_0, strides = var_1264_strides_0, weight = layers_3_fc1_outlier_module_weight_to_fp16_sparsified, x = input_35_cast_fp16)[name = tensor<string, []>("op_1264_cast_fp16")];
            tensor<fp16, [1, 5120, 1, 1]> input_37_cast_fp16 = add(x = var_1258_cast_fp16, y = var_1264_cast_fp16)[name = tensor<string, []>("input_37_cast_fp16")];
            tensor<string, []> input_mode_0 = const()[name = tensor<string, []>("input_mode_0"), val = tensor<string, []>("EXACT")];
            tensor<fp16, [1, 5120, 1, 1]> input_cast_fp16 = gelu(mode = input_mode_0, x = input_37_cast_fp16)[name = tensor<string, []>("input_cast_fp16")];
            tensor<string, []> var_1275_pad_type_0 = const()[name = tensor<string, []>("op_1275_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1275_strides_0 = const()[name = tensor<string, []>("op_1275_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1275_pad_0 = const()[name = tensor<string, []>("op_1275_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1275_dilations_0 = const()[name = tensor<string, []>("op_1275_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1275_groups_0 = const()[name = tensor<string, []>("op_1275_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_3_fc2_inlier_module_weight_to_fp16_palettized = constexpr_lut_to_dense()[indices = tensor<uint8, [4915200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(197282944))), lut = tensor<fp16, [64]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(202198208))), name = tensor<string, []>("layers_3_fc2_inlier_module_weight_to_fp16_palettized"), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1280]> layers_3_fc2_inlier_module_bias_to_fp16 = const()[name = tensor<string, []>("layers_3_fc2_inlier_module_bias_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(202198400)))];
            tensor<fp16, [1, 1280, 1, 1]> var_1275_cast_fp16 = conv(bias = layers_3_fc2_inlier_module_bias_to_fp16, dilations = var_1275_dilations_0, groups = var_1275_groups_0, pad = var_1275_pad_0, pad_type = var_1275_pad_type_0, strides = var_1275_strides_0, weight = layers_3_fc2_inlier_module_weight_to_fp16_palettized, x = input_cast_fp16)[name = tensor<string, []>("op_1275_cast_fp16")];
            tensor<string, []> var_1281_pad_type_0 = const()[name = tensor<string, []>("op_1281_pad_type_0"), val = tensor<string, []>("valid")];
            tensor<int32, [2]> var_1281_strides_0 = const()[name = tensor<string, []>("op_1281_strides_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, [4]> var_1281_pad_0 = const()[name = tensor<string, []>("op_1281_pad_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [2]> var_1281_dilations_0 = const()[name = tensor<string, []>("op_1281_dilations_0"), val = tensor<int32, [2]>([1, 1])];
            tensor<int32, []> var_1281_groups_0 = const()[name = tensor<string, []>("op_1281_groups_0"), val = tensor<int32, []>(1)];
            tensor<fp16, [1280, 5120, 1, 1]> layers_3_fc2_outlier_module_weight_to_fp16_sparsified = constexpr_sparse_to_dense()[mask = tensor<uint8, [819200]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(202271552))), name = tensor<string, []>("layers_3_fc2_outlier_module_weight_to_fp16_sparsified"), nonzero_data = tensor<fp16, [35232]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(202201024))), shape = tensor<uint32, [4]>([1280, 5120, 1, 1])];
            tensor<fp16, [1, 1280, 1, 1]> var_1281_cast_fp16 = conv(dilations = var_1281_dilations_0, groups = var_1281_groups_0, pad = var_1281_pad_0, pad_type = var_1281_pad_type_0, strides = var_1281_strides_0, weight = layers_3_fc2_outlier_module_weight_to_fp16_sparsified, x = input_cast_fp16)[name = tensor<string, []>("op_1281_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> hidden_states_9_cast_fp16 = add(x = var_1275_cast_fp16, y = var_1281_cast_fp16)[name = tensor<string, []>("hidden_states_9_cast_fp16")];
            tensor<fp16, [1, 1280, 1, 1]> inputs_cast_fp16 = add(x = inputs_23_cast_fp16, y = hidden_states_9_cast_fp16)[name = tensor<string, []>("inputs_cast_fp16")];
            tensor<int32, [1]> out_axes_0 = const()[name = tensor<string, []>("out_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, []> var_1301_to_fp16 = const()[name = tensor<string, []>("op_1301_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> out_cast_fp16 = layer_norm(axes = out_axes_0, epsilon = var_1301_to_fp16, x = inputs_cast_fp16)[name = tensor<string, []>("out_cast_fp16")];
            tensor<fp16, [1280]> hidden_states_gamma_0_to_fp16 = const()[name = tensor<string, []>("hidden_states_gamma_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(203090816)))];
            tensor<fp16, [1280]> hidden_states_beta_0_to_fp16 = const()[name = tensor<string, []>("hidden_states_beta_0_to_fp16"), val = tensor<fp16, [1280]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(203093440)))];
            tensor<fp16, []> hidden_states_epsilon_0_to_fp16 = const()[name = tensor<string, []>("hidden_states_epsilon_0_to_fp16"), val = tensor<fp16, []>(0x1.5p-17)];
            tensor<fp16, [1, 1280, 1, 1]> hidden_states_cast_fp16 = batch_norm(beta = hidden_states_beta_0_to_fp16, epsilon = hidden_states_epsilon_0_to_fp16, gamma = hidden_states_gamma_0_to_fp16, mean = obj_1_mean_0_to_fp16, variance = obj_1_variance_0_to_fp16, x = out_cast_fp16)[name = tensor<string, []>("hidden_states_cast_fp16")];
            tensor<int32, [1]> var_1312_axes_0 = const()[name = tensor<string, []>("op_1312_axes_0"), val = tensor<int32, [1]>([2])];
            tensor<fp16, [1, 1280, 1]> var_1312_cast_fp16 = squeeze(axes = var_1312_axes_0, x = hidden_states_cast_fp16)[name = tensor<string, []>("op_1312_cast_fp16")];
            tensor<int32, [3]> var_1315_perm_0 = const()[name = tensor<string, []>("op_1315_perm_0"), val = tensor<int32, [3]>([0, 2, 1])];
            tensor<fp16, [51866]> linear_0_bias_0_to_fp16 = const()[name = tensor<string, []>("linear_0_bias_0_to_fp16"), val = tensor<fp16, [51866]>(BLOBFILE(path = tensor<string, []>("@model_path/weights/weight.bin"), offset = tensor<uint64, []>(203096064)))];
            tensor<fp16, [1, 1, 1280]> var_1315_cast_fp16 = transpose(perm = var_1315_perm_0, x = var_1312_cast_fp16)[name = tensor<string, []>("transpose_0")];
            tensor<fp16, [1, 1, 51866]> logits = linear(bias = linear_0_bias_0_to_fp16, weight = embed_tokens_weight_to_fp16, x = var_1315_cast_fp16)[name = tensor<string, []>("linear_0_cast_fp16")];
            tensor<int32, []> var_1319 = const()[name = tensor<string, []>("op_1319"), val = tensor<int32, []>(1)];
            tensor<bool, []> obj_59_interleave_0 = const()[name = tensor<string, []>("obj_59_interleave_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 5120, 1, 1]> key_cache_updates = concat(axis = var_1319, interleave = obj_59_interleave_0, values = (current_key_1_cast_fp16, current_key_3_cast_fp16, current_key_5_cast_fp16, current_key_cast_fp16))[name = tensor<string, []>("obj_59_cast_fp16")];
            tensor<int32, []> var_1322 = const()[name = tensor<string, []>("op_1322"), val = tensor<int32, []>(1)];
            tensor<bool, []> obj_61_interleave_0 = const()[name = tensor<string, []>("obj_61_interleave_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 5120, 1, 1]> value_cache_updates = concat(axis = var_1322, interleave = obj_61_interleave_0, values = (current_value_1_cast_fp16, current_value_3_cast_fp16, current_value_5_cast_fp16, current_value_cast_fp16))[name = tensor<string, []>("obj_61_cast_fp16")];
            tensor<int32, [4]> var_1333_begin_0 = const()[name = tensor<string, []>("op_1333_begin_0"), val = tensor<int32, [4]>([0, 4, 0, 0])];
            tensor<int32, [4]> var_1333_end_0 = const()[name = tensor<string, []>("op_1333_end_0"), val = tensor<int32, [4]>([1, 5, 1, 1500])];
            tensor<bool, [4]> var_1333_end_mask_0 = const()[name = tensor<string, []>("op_1333_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1333_cast_fp16 = slice_by_index(begin = var_1333_begin_0, end = var_1333_end_0, end_mask = var_1333_end_mask_0, x = obj_41_cast_fp16)[name = tensor<string, []>("op_1333_cast_fp16")];
            tensor<int32, [4]> var_1336_begin_0 = const()[name = tensor<string, []>("op_1336_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1336_end_0 = const()[name = tensor<string, []>("op_1336_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1336_end_mask_0 = const()[name = tensor<string, []>("op_1336_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1336_squeeze_mask_0 = const()[name = tensor<string, []>("op_1336_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1336_cast_fp16 = slice_by_index(begin = var_1336_begin_0, end = var_1336_end_0, end_mask = var_1336_end_mask_0, squeeze_mask = var_1336_squeeze_mask_0, x = var_1333_cast_fp16)[name = tensor<string, []>("op_1336_cast_fp16")];
            tensor<int32, [4]> var_1351_begin_0 = const()[name = tensor<string, []>("op_1351_begin_0"), val = tensor<int32, [4]>([0, 11, 0, 0])];
            tensor<int32, [4]> var_1351_end_0 = const()[name = tensor<string, []>("op_1351_end_0"), val = tensor<int32, [4]>([1, 12, 1, 1500])];
            tensor<bool, [4]> var_1351_end_mask_0 = const()[name = tensor<string, []>("op_1351_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1351_cast_fp16 = slice_by_index(begin = var_1351_begin_0, end = var_1351_end_0, end_mask = var_1351_end_mask_0, x = obj_41_cast_fp16)[name = tensor<string, []>("op_1351_cast_fp16")];
            tensor<int32, [4]> var_1354_begin_0 = const()[name = tensor<string, []>("op_1354_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1354_end_0 = const()[name = tensor<string, []>("op_1354_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1354_end_mask_0 = const()[name = tensor<string, []>("op_1354_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1354_squeeze_mask_0 = const()[name = tensor<string, []>("op_1354_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1354_cast_fp16 = slice_by_index(begin = var_1354_begin_0, end = var_1354_end_0, end_mask = var_1354_end_mask_0, squeeze_mask = var_1354_squeeze_mask_0, x = var_1351_cast_fp16)[name = tensor<string, []>("op_1354_cast_fp16")];
            tensor<int32, [4]> var_1369_begin_0 = const()[name = tensor<string, []>("op_1369_begin_0"), val = tensor<int32, [4]>([0, 3, 0, 0])];
            tensor<int32, [4]> var_1369_end_0 = const()[name = tensor<string, []>("op_1369_end_0"), val = tensor<int32, [4]>([1, 4, 1, 1500])];
            tensor<bool, [4]> var_1369_end_mask_0 = const()[name = tensor<string, []>("op_1369_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1369_cast_fp16 = slice_by_index(begin = var_1369_begin_0, end = var_1369_end_0, end_mask = var_1369_end_mask_0, x = obj_55_cast_fp16)[name = tensor<string, []>("op_1369_cast_fp16")];
            tensor<int32, [4]> var_1372_begin_0 = const()[name = tensor<string, []>("op_1372_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1372_end_0 = const()[name = tensor<string, []>("op_1372_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1372_end_mask_0 = const()[name = tensor<string, []>("op_1372_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1372_squeeze_mask_0 = const()[name = tensor<string, []>("op_1372_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1372_cast_fp16 = slice_by_index(begin = var_1372_begin_0, end = var_1372_end_0, end_mask = var_1372_end_mask_0, squeeze_mask = var_1372_squeeze_mask_0, x = var_1369_cast_fp16)[name = tensor<string, []>("op_1372_cast_fp16")];
            tensor<int32, [4]> var_1387_begin_0 = const()[name = tensor<string, []>("op_1387_begin_0"), val = tensor<int32, [4]>([0, 6, 0, 0])];
            tensor<int32, [4]> var_1387_end_0 = const()[name = tensor<string, []>("op_1387_end_0"), val = tensor<int32, [4]>([1, 7, 1, 1500])];
            tensor<bool, [4]> var_1387_end_mask_0 = const()[name = tensor<string, []>("op_1387_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1387_cast_fp16 = slice_by_index(begin = var_1387_begin_0, end = var_1387_end_0, end_mask = var_1387_end_mask_0, x = obj_55_cast_fp16)[name = tensor<string, []>("op_1387_cast_fp16")];
            tensor<int32, [4]> var_1390_begin_0 = const()[name = tensor<string, []>("op_1390_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1390_end_0 = const()[name = tensor<string, []>("op_1390_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1390_end_mask_0 = const()[name = tensor<string, []>("op_1390_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1390_squeeze_mask_0 = const()[name = tensor<string, []>("op_1390_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1390_cast_fp16 = slice_by_index(begin = var_1390_begin_0, end = var_1390_end_0, end_mask = var_1390_end_mask_0, squeeze_mask = var_1390_squeeze_mask_0, x = var_1387_cast_fp16)[name = tensor<string, []>("op_1390_cast_fp16")];
            tensor<int32, [4]> var_1405_begin_0 = const()[name = tensor<string, []>("op_1405_begin_0"), val = tensor<int32, [4]>([0, 11, 0, 0])];
            tensor<int32, [4]> var_1405_end_0 = const()[name = tensor<string, []>("op_1405_end_0"), val = tensor<int32, [4]>([1, 12, 1, 1500])];
            tensor<bool, [4]> var_1405_end_mask_0 = const()[name = tensor<string, []>("op_1405_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1405_cast_fp16 = slice_by_index(begin = var_1405_begin_0, end = var_1405_end_0, end_mask = var_1405_end_mask_0, x = obj_55_cast_fp16)[name = tensor<string, []>("op_1405_cast_fp16")];
            tensor<int32, [4]> var_1408_begin_0 = const()[name = tensor<string, []>("op_1408_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1408_end_0 = const()[name = tensor<string, []>("op_1408_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1408_end_mask_0 = const()[name = tensor<string, []>("op_1408_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1408_squeeze_mask_0 = const()[name = tensor<string, []>("op_1408_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1408_cast_fp16 = slice_by_index(begin = var_1408_begin_0, end = var_1408_end_0, end_mask = var_1408_end_mask_0, squeeze_mask = var_1408_squeeze_mask_0, x = var_1405_cast_fp16)[name = tensor<string, []>("op_1408_cast_fp16")];
            tensor<int32, [4]> var_1423_begin_0 = const()[name = tensor<string, []>("op_1423_begin_0"), val = tensor<int32, [4]>([0, 14, 0, 0])];
            tensor<int32, [4]> var_1423_end_0 = const()[name = tensor<string, []>("op_1423_end_0"), val = tensor<int32, [4]>([1, 15, 1, 1500])];
            tensor<bool, [4]> var_1423_end_mask_0 = const()[name = tensor<string, []>("op_1423_end_mask_0"), val = tensor<bool, [4]>([true, false, true, true])];
            tensor<fp16, [1, 1, 1, 1500]> var_1423_cast_fp16 = slice_by_index(begin = var_1423_begin_0, end = var_1423_end_0, end_mask = var_1423_end_mask_0, x = obj_55_cast_fp16)[name = tensor<string, []>("op_1423_cast_fp16")];
            tensor<int32, [4]> var_1426_begin_0 = const()[name = tensor<string, []>("op_1426_begin_0"), val = tensor<int32, [4]>([0, 0, 0, 0])];
            tensor<int32, [4]> var_1426_end_0 = const()[name = tensor<string, []>("op_1426_end_0"), val = tensor<int32, [4]>([1, 1, 1, 1500])];
            tensor<bool, [4]> var_1426_end_mask_0 = const()[name = tensor<string, []>("op_1426_end_mask_0"), val = tensor<bool, [4]>([true, true, false, true])];
            tensor<bool, [4]> var_1426_squeeze_mask_0 = const()[name = tensor<string, []>("op_1426_squeeze_mask_0"), val = tensor<bool, [4]>([false, false, true, false])];
            tensor<fp16, [1, 1, 1500]> var_1426_cast_fp16 = slice_by_index(begin = var_1426_begin_0, end = var_1426_end_0, end_mask = var_1426_end_mask_0, squeeze_mask = var_1426_squeeze_mask_0, x = var_1423_cast_fp16)[name = tensor<string, []>("op_1426_cast_fp16")];
            tensor<int32, []> var_1433 = const()[name = tensor<string, []>("op_1433"), val = tensor<int32, []>(1)];
            tensor<bool, []> var_1434_interleave_0 = const()[name = tensor<string, []>("op_1434_interleave_0"), val = tensor<bool, []>(false)];
            tensor<fp16, [1, 6, 1500]> var_1434_cast_fp16 = concat(axis = var_1433, interleave = var_1434_interleave_0, values = (var_1336_cast_fp16, var_1354_cast_fp16, var_1372_cast_fp16, var_1390_cast_fp16, var_1408_cast_fp16, var_1426_cast_fp16))[name = tensor<string, []>("op_1434_cast_fp16")];
            tensor<bool, []> var_1437 = const()[name = tensor<string, []>("op_1437"), val = tensor<bool, []>(false)];
            tensor<int32, [1]> obj_axes_0 = const()[name = tensor<string, []>("obj_axes_0"), val = tensor<int32, [1]>([1])];
            tensor<fp16, [1, 1500]> alignment_heads_weights = reduce_mean(axes = obj_axes_0, keep_dims = var_1437, x = var_1434_cast_fp16)[name = tensor<string, []>("obj_cast_fp16")];
        } -> (logits, key_cache_updates, value_cache_updates, alignment_heads_weights);
}