Commit a774be5 (verified) by PumeTu: Add files using upload-large-folder tool
2025-07-18,19:18:46 | INFO | Running in distributed mode with multiple processes. Device: cuda:0. Process (global: 0, local: 0), total 16.
2025-07-18,19:18:46 | INFO | Parsing model identifier. Schema: None, Identifier: ViT-T-16
2025-07-18,19:18:46 | INFO | Loaded built-in ViT-T-16 model config.
2025-07-18,19:18:46 | INFO | No potential checkpoint path found from config source or pretrained arg.
2025-07-18,19:18:46 | INFO | Instantiating model architecture: CLIP
2025-07-18,19:18:47 | WARNING | Model ViT-T-16 initialized partially.
2025-07-18,19:18:47 | INFO | Final image preprocessing configuration set: {'size': (224, 224), 'mode': 'RGB', 'mean': (0.48145466, 0.4578275, 0.40821073), 'std': (0.26862954, 0.26130258, 0.27577711), 'interpolation': 'bicubic', 'resize_mode': 'shortest', 'fill_color': 0}
2025-07-18,19:18:47 | INFO | Model ViT-T-16 creation process complete.
2025-07-18,19:18:47 | INFO | Parsing model identifier. Schema: local-dir, Identifier: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14
2025-07-18,19:18:47 | WARNING | Ignoring `pretrained='None'` because `model_name` has 'local-dir' schema.
2025-07-18,19:18:47 | INFO | Attempting to load config from local dir: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14/open_clip_config.json
2025-07-18,19:18:47 | INFO | Loaded model config and preprocess from: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14/open_clip_config.json
2025-07-18,19:18:47 | INFO | Found preferred checkpoint file: open_clip_pytorch_model.bin in /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14
2025-07-18,19:18:47 | INFO | Found CLIP weights in local folder: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14/open_clip_pytorch_model.bin
2025-07-18,19:18:47 | INFO | Instantiating model architecture: CLIP
2025-07-18,19:18:50 | INFO | Loading full pretrained weights from: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14/open_clip_pytorch_model.bin
2025-07-18,19:18:55 | INFO | Final image preprocessing configuration set: {'size': (224, 224), 'mode': 'RGB', 'mean': [0.48145466, 0.4578275, 0.40821073], 'std': [0.26862954, 0.26130258, 0.27577711], 'interpolation': 'bicubic', 'resize_mode': 'shortest', 'fill_color': 0}
2025-07-18,19:18:55 | INFO | Model local-dir:/ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14 creation process complete.
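The preprocessing configuration logged above applies CLIP's per-channel normalization with the listed mean/std. A minimal sketch of that step, assuming pixel values already scaled to [0, 1] (the resize/interpolation stages of the pipeline are omitted):

```python
# CLIP-style per-channel normalization, values from the logged
# "Final image preprocessing configuration" line above.
MEAN = (0.48145466, 0.4578275, 0.40821073)
STD = (0.26862954, 0.26130258, 0.27577711)

def normalize_pixel(rgb):
    """Normalize one RGB pixel (values in [0, 1]) channel-wise: (x - mean) / std."""
    return tuple((x - m) / s for x, m, s in zip(rgb, MEAN, STD))

# A mid-gray pixel lands near zero in every channel after normalization.
print(normalize_pixel((0.5, 0.5, 0.5)))
```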
2025-07-18,19:18:55 | INFO | Model:
2025-07-18,19:18:55 | INFO | CLIP(
  (visual): VisionTransformer(
    (conv1): Conv2d(3, 192, kernel_size=(16, 16), stride=(16, 16), bias=False)
    (patch_dropout): Identity()
    (ln_pre): LayerNorm((192,), eps=1e-05, elementwise_affine=True)
    (transformer): Transformer(
      (resblocks): ModuleList(
        (0-11): 12 x ResidualAttentionBlock(
          (ln_1): LayerNorm((192,), eps=1e-05, elementwise_affine=True)
          (attn): MultiheadAttention(
            (out_proj): NonDynamicallyQuantizableLinear(in_features=192, out_features=192, bias=True)
          )
          (ls_1): Identity()
          (ln_2): LayerNorm((192,), eps=1e-05, elementwise_affine=True)
          (mlp): Sequential(
            (c_fc): Linear(in_features=192, out_features=768, bias=True)
            (gelu): GELU(approximate='none')
            (c_proj): Linear(in_features=768, out_features=192, bias=True)
          )
          (ls_2): Identity()
        )
      )
    )
    (ln_post): LayerNorm((192,), eps=1e-05, elementwise_affine=True)
  )
  (transformer): Transformer(
    (resblocks): ModuleList(
      (0-11): 12 x ResidualAttentionBlock(
        (ln_1): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
        (attn): MultiheadAttention(
          (out_proj): NonDynamicallyQuantizableLinear(in_features=384, out_features=384, bias=True)
        )
        (ls_1): Identity()
        (ln_2): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
        (mlp): Sequential(
          (c_fc): Linear(in_features=384, out_features=1536, bias=True)
          (gelu): GELU(approximate='none')
          (c_proj): Linear(in_features=1536, out_features=384, bias=True)
        )
        (ls_2): Identity()
      )
    )
  )
  (token_embedding): Embedding(49408, 384)
  (ln_final): LayerNorm((384,), eps=1e-05, elementwise_affine=True)
)
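The module dump shows a 192-wide, 12-layer vision tower (ViT-T/16) and a 384-wide, 12-layer text tower, each with the usual 4x MLP expansion. A tiny sketch checking those shapes against the `c_fc`/`c_proj` lines above:

```python
# Transformer MLP shapes read off the module dump above.
def mlp_dims(width, mlp_ratio=4):
    """Hidden sizes of a ResidualAttentionBlock MLP: width -> width * ratio -> width."""
    return (width, width * mlp_ratio, width)

assert mlp_dims(192) == (192, 768, 192)    # vision tower: c_fc 192->768, c_proj 768->192
assert mlp_dims(384) == (384, 1536, 384)   # text tower:   c_fc 384->1536, c_proj 1536->384
```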
2025-07-18,19:18:55 | INFO | Params:
2025-07-18,19:18:55 | INFO | accum_freq: 1
2025-07-18,19:18:55 | INFO | aug_cfg: {}
2025-07-18,19:18:55 | INFO | batch_size: 256
2025-07-18,19:18:55 | INFO | beta1: 0.9
2025-07-18,19:18:55 | INFO | beta2: 0.98
2025-07-18,19:18:55 | INFO | cache_dir: None
2025-07-18,19:18:55 | INFO | checkpoint_path: ./models/ViT-T-16-CC9M-distill-DFN2B-CLIP-ViT-L-14-MSE/checkpoints
2025-07-18,19:18:55 | INFO | coca_caption_loss_weight: 2.0
2025-07-18,19:18:55 | INFO | coca_contrastive_loss_weight: 1.0
2025-07-18,19:18:55 | INFO | copy_codebase: False
2025-07-18,19:18:55 | INFO | csv_caption_key: text
2025-07-18,19:18:55 | INFO | csv_img_key: image
2025-07-18,19:18:55 | INFO | csv_separator:
2025-07-18,19:18:55 | INFO | dataset_resampled: False
2025-07-18,19:18:55 | INFO | dataset_type: auto
2025-07-18,19:18:55 | INFO | ddp_static_graph: False
2025-07-18,19:18:55 | INFO | debug: False
2025-07-18,19:18:55 | INFO | delete_previous_checkpoint: False
2025-07-18,19:18:55 | INFO | device: cuda:0
2025-07-18,19:18:55 | INFO | dist_backend: None
2025-07-18,19:18:55 | INFO | dist_url: None
2025-07-18,19:18:55 | INFO | distill: True
2025-07-18,19:18:55 | INFO | distill_model: local-dir:/ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14
2025-07-18,19:18:55 | INFO | distill_pretrained: None
2025-07-18,19:18:55 | INFO | distributed: True
2025-07-18,19:18:55 | INFO | epochs: 32
2025-07-18,19:18:55 | INFO | epochs_cooldown: None
2025-07-18,19:18:55 | INFO | eps: 1e-06
2025-07-18,19:18:55 | INFO | force_context_length: None
2025-07-18,19:18:55 | INFO | force_custom_text: False
2025-07-18,19:18:55 | INFO | force_image_size: None
2025-07-18,19:18:55 | INFO | force_patch_dropout: None
2025-07-18,19:18:55 | INFO | force_quick_gelu: False
2025-07-18,19:18:55 | INFO | gather_with_grad: False
2025-07-18,19:18:55 | INFO | grad_checkpointing: False
2025-07-18,19:18:55 | INFO | grad_clip_norm: None
2025-07-18,19:18:55 | INFO | horovod: False
2025-07-18,19:18:55 | INFO | image_interpolation: None
2025-07-18,19:18:55 | INFO | image_mean: None
2025-07-18,19:18:55 | INFO | image_resize_mode: None
2025-07-18,19:18:55 | INFO | image_std: None
2025-07-18,19:18:55 | INFO | imagenet_v2: None
2025-07-18,19:18:55 | INFO | imagenet_val: /ist-project/scads/pumet/datasets/imagenet-1k/validation
2025-07-18,19:18:55 | INFO | local_loss: False
2025-07-18,19:18:55 | INFO | local_rank: 0
2025-07-18,19:18:55 | INFO | lock_image: False
2025-07-18,19:18:55 | INFO | lock_image_freeze_bn_stats: False
2025-07-18,19:18:55 | INFO | lock_image_unlocked_groups: 0
2025-07-18,19:18:55 | INFO | lock_text: False
2025-07-18,19:18:55 | INFO | lock_text_freeze_layer_norm: False
2025-07-18,19:18:55 | INFO | lock_text_unlocked_layers: 0
2025-07-18,19:18:55 | INFO | log_every_n_steps: 100
2025-07-18,19:18:55 | INFO | log_level: 20
2025-07-18,19:18:55 | INFO | log_local: False
2025-07-18,19:18:55 | INFO | log_path: ./models/ViT-T-16-CC9M-distill-DFN2B-CLIP-ViT-L-14-MSE/out.log
2025-07-18,19:18:55 | INFO | logs: ./models/
2025-07-18,19:18:55 | INFO | loss_dist_impl: None
2025-07-18,19:18:55 | INFO | lr: 0.0005
2025-07-18,19:18:55 | INFO | lr_cooldown_end: 0.0
2025-07-18,19:18:55 | INFO | lr_cooldown_power: 1.0
2025-07-18,19:18:55 | INFO | lr_scheduler: cosine
2025-07-18,19:18:55 | INFO | model: ViT-T-16
2025-07-18,19:18:55 | INFO | momentum: None
2025-07-18,19:18:55 | INFO | name: ViT-T-16-CC9M-distill-DFN2B-CLIP-ViT-L-14-MSE
2025-07-18,19:18:55 | INFO | no_set_device_rank: False
2025-07-18,19:18:55 | INFO | opt: adamw
2025-07-18,19:18:55 | INFO | precision: amp
2025-07-18,19:18:55 | INFO | pretrained:
2025-07-18,19:18:55 | INFO | pretrained_image: False
2025-07-18,19:18:55 | INFO | rank: 0
2025-07-18,19:18:55 | INFO | remote_sync: None
2025-07-18,19:18:55 | INFO | remote_sync_frequency: 300
2025-07-18,19:18:55 | INFO | remote_sync_protocol: s3
2025-07-18,19:18:55 | INFO | report_to: none
2025-07-18,19:18:55 | INFO | resume: None
2025-07-18,19:18:55 | INFO | s_embed: 512
2025-07-18,19:18:55 | INFO | save_frequency: 1
2025-07-18,19:18:55 | INFO | save_most_recent: False
2025-07-18,19:18:55 | INFO | seed: 0
2025-07-18,19:18:55 | INFO | siglip: False
2025-07-18,19:18:55 | INFO | skip_scheduler: False
2025-07-18,19:18:55 | INFO | t_embed: 768
2025-07-18,19:18:55 | INFO | tensorboard: False
2025-07-18,19:18:55 | INFO | tensorboard_path:
2025-07-18,19:18:55 | INFO | torchcompile: False
2025-07-18,19:18:55 | INFO | torchscript: False
2025-07-18,19:18:55 | INFO | trace: False
2025-07-18,19:18:55 | INFO | train_data: /ist-project/scads/pumet/datasets/cc9m/cc9m.csv
2025-07-18,19:18:55 | INFO | train_data_upsampling_factors: None
2025-07-18,19:18:55 | INFO | train_num_samples: None
2025-07-18,19:18:55 | INFO | use_bn_sync: False
2025-07-18,19:18:55 | INFO | use_bnb_linear: None
2025-07-18,19:18:55 | INFO | val_data: None
2025-07-18,19:18:55 | INFO | val_frequency: 1
2025-07-18,19:18:55 | INFO | val_num_samples: None
2025-07-18,19:18:55 | INFO | wandb: False
2025-07-18,19:18:55 | INFO | wandb_notes:
2025-07-18,19:18:55 | INFO | wandb_project_name: open-clip
2025-07-18,19:18:55 | INFO | warmup: 2000
2025-07-18,19:18:55 | INFO | wd: 0.2
2025-07-18,19:18:55 | INFO | workers: 8
2025-07-18,19:18:55 | INFO | world_size: 16
2025-07-18,19:18:55 | INFO | zeroshot_frequency: 2
2025-07-18,19:18:56 | INFO | Created AdamW (adamw) optimizer: lr: 0.0005, betas: (0.9, 0.98), eps: 1e-06, weight_decay: 0.2, amsgrad: False, foreach: None, maximize: False, capturable: False, differentiable: False, fused: None
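The LR column in the train lines below follows a linear warmup (warmup: 2000) into a cosine decay (lr_scheduler: cosine) from the peak lr of 0.0005. A sketch of that schedule; `total_steps` is an assumption (32 epochs at roughly 2275 optimizer steps per epoch with a global batch of 4096), not a value from the log:

```python
import math

# Linear warmup into cosine decay, using the logged hyperparameters:
# lr: 0.0005, warmup: 2000, lr_scheduler: cosine.
BASE_LR = 0.0005
WARMUP = 2000

def cosine_lr(step, total_steps=32 * 2275):  # total_steps is an estimate, see lead-in
    if step < WARMUP:
        return BASE_LR * step / WARMUP       # linear warmup
    progress = (step - WARMUP) / (total_steps - WARMUP)
    return 0.5 * (1 + math.cos(math.pi * progress)) * BASE_LR

# Matches the LR column: step 100 -> 0.000025, step 200 -> 0.000050,
# and the schedule peaks at 0.0005 once warmup completes.
print(cosine_lr(100), cosine_lr(200), cosine_lr(2000))
```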
2025-07-18,19:18:56 | INFO | Parsing tokenizer identifier. Schema: None, Identifier: ViT-T-16
2025-07-18,19:18:56 | INFO | Attempting to load config from built-in: ViT-T-16
2025-07-18,19:18:56 | INFO | Using default SimpleTokenizer.
2025-07-18,19:18:56 | INFO | Parsing tokenizer identifier. Schema: local-dir, Identifier: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14
2025-07-18,19:18:56 | INFO | Attempting to load config from local-dir: /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14 at /ist-project/scads/pumet/models/DFN2B-CLIP-ViT-L-14/open_clip_config.json
2025-07-18,19:18:56 | INFO | Using default SimpleTokenizer.
2025-07-18,19:19:19 | INFO | Start epoch 0
2025-07-18,19:20:03 | INFO | Train Epoch: 0 [ 4096/9319509 (0%)] Data (t): 35.984 Batch (t): 43.954, 93.1877/s, 5.82423/s/gpu LR: 0.000000 Logit Scale: 14.286 Contrastive_loss: 8.3942 (8.3942) Fd_loss: 10.258 (10.258) Loss: 18.652 (18.652)
2025-07-18,19:22:21 | INFO | Train Epoch: 0 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.379, 2997.61/s, 187.351/s/gpu LR: 0.000025 Logit Scale: 14.288 Contrastive_loss: 8.3110 (8.3526) Fd_loss: 5.6115 (7.9346) Loss: 13.922 (16.287)
2025-07-18,19:24:38 | INFO | Train Epoch: 0 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.364, 3002.10/s, 187.631/s/gpu LR: 0.000050 Logit Scale: 14.333 Contrastive_loss: 8.2311 (8.3121) Fd_loss: 3.8947 (6.5880) Loss: 12.126 (14.900)
2025-07-18,19:26:54 | INFO | Train Epoch: 0 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.365, 2992.12/s, 187.007/s/gpu LR: 0.000075 Logit Scale: 14.422 Contrastive_loss: 8.1049 (8.2603) Fd_loss: 3.5709 (5.8337) Loss: 11.676 (14.094)
2025-07-18,19:29:11 | INFO | Train Epoch: 0 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.366, 2999.99/s, 187.499/s/gpu LR: 0.000100 Logit Scale: 14.543 Contrastive_loss: 7.8386 (8.1760) Fd_loss: 3.3967 (5.3463) Loss: 11.235 (13.522)
2025-07-18,19:31:27 | INFO | Train Epoch: 0 [2052096/9319509 (22%)] Data (t): 0.000 Batch (t): 1.365, 3000.16/s, 187.510/s/gpu LR: 0.000125 Logit Scale: 14.691 Contrastive_loss: 7.6377 (8.0863) Fd_loss: 3.2785 (5.0017) Loss: 10.916 (13.088)
2025-07-18,19:33:44 | INFO | Train Epoch: 0 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.365, 3003.94/s, 187.747/s/gpu LR: 0.000150 Logit Scale: 14.841 Contrastive_loss: 7.4240 (7.9916) Fd_loss: 3.1789 (4.7413) Loss: 10.603 (12.733)
2025-07-18,19:36:00 | INFO | Train Epoch: 0 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.365, 2999.16/s, 187.447/s/gpu LR: 0.000175 Logit Scale: 15.034 Contrastive_loss: 7.0954 (7.8796) Fd_loss: 3.1148 (4.5380) Loss: 10.210 (12.418)
2025-07-18,19:38:17 | INFO | Train Epoch: 0 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.366, 3004.98/s, 187.811/s/gpu LR: 0.000200 Logit Scale: 15.254 Contrastive_loss: 6.9640 (7.7779) Fd_loss: 3.0420 (4.3718) Loss: 10.006 (12.150)
2025-07-18,19:40:33 | INFO | Train Epoch: 0 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.365, 3002.64/s, 187.665/s/gpu LR: 0.000225 Logit Scale: 15.481 Contrastive_loss: 6.7229 (7.6724) Fd_loss: 2.9618 (4.2308) Loss: 9.6848 (11.903)
2025-07-18,19:42:50 | INFO | Train Epoch: 0 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.365, 3012.89/s, 188.306/s/gpu LR: 0.000250 Logit Scale: 15.760 Contrastive_loss: 6.5058 (7.5663) Fd_loss: 2.9118 (4.1109) Loss: 9.4175 (11.677)
2025-07-18,19:45:06 | INFO | Train Epoch: 0 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.365, 2997.80/s, 187.362/s/gpu LR: 0.000275 Logit Scale: 16.076 Contrastive_loss: 6.3884 (7.4682) Fd_loss: 2.8603 (4.0066) Loss: 9.2487 (11.475)
2025-07-18,19:47:23 | INFO | Train Epoch: 0 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.366, 2998.31/s, 187.394/s/gpu LR: 0.000300 Logit Scale: 16.451 Contrastive_loss: 6.2021 (7.3708) Fd_loss: 2.7908 (3.9131) Loss: 8.9929 (11.284)
2025-07-18,19:49:39 | INFO | Train Epoch: 0 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.366, 2996.08/s, 187.255/s/gpu LR: 0.000325 Logit Scale: 16.879 Contrastive_loss: 6.0316 (7.2751) Fd_loss: 2.7461 (3.8298) Loss: 8.7777 (11.105)
2025-07-18,19:51:56 | INFO | Train Epoch: 0 [5738496/9319509 (62%)] Data (t): 0.000 Batch (t): 1.365, 2993.64/s, 187.102/s/gpu LR: 0.000350 Logit Scale: 17.367 Contrastive_loss: 5.9137 (7.1844) Fd_loss: 2.6996 (3.7544) Loss: 8.6133 (10.939)
2025-07-18,19:54:12 | INFO | Train Epoch: 0 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.365, 2999.21/s, 187.451/s/gpu LR: 0.000375 Logit Scale: 17.913 Contrastive_loss: 5.7663 (7.0957) Fd_loss: 2.6410 (3.6848) Loss: 8.4074 (10.781)
2025-07-18,19:56:30 | INFO | Train Epoch: 0 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.375, 3001.30/s, 187.581/s/gpu LR: 0.000400 Logit Scale: 18.519 Contrastive_loss: 5.7593 (7.0171) Fd_loss: 2.6072 (3.6214) Loss: 8.3665 (10.639)
2025-07-18,19:58:48 | INFO | Train Epoch: 0 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.383, 2921.30/s, 182.581/s/gpu LR: 0.000425 Logit Scale: 19.196 Contrastive_loss: 5.6354 (6.9404) Fd_loss: 2.5651 (3.5628) Loss: 8.2005 (10.503)
2025-07-18,20:01:06 | INFO | Train Epoch: 0 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.377, 2945.02/s, 184.064/s/gpu LR: 0.000450 Logit Scale: 19.903 Contrastive_loss: 5.4676 (6.8628) Fd_loss: 2.5231 (3.5080) Loss: 7.9907 (10.371)
2025-07-18,20:03:24 | INFO | Train Epoch: 0 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.378, 2993.86/s, 187.116/s/gpu LR: 0.000475 Logit Scale: 20.718 Contrastive_loss: 5.3806 (6.7887) Fd_loss: 2.4855 (3.4569) Loss: 7.8661 (10.246)
2025-07-18,20:05:40 | INFO | Train Epoch: 0 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.364, 3007.16/s, 187.948/s/gpu LR: 0.000500 Logit Scale: 21.595 Contrastive_loss: 5.1813 (6.7122) Fd_loss: 2.4421 (3.4086) Loss: 7.6233 (10.121)
2025-07-18,20:07:56 | INFO | Train Epoch: 0 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.363, 3002.32/s, 187.645/s/gpu LR: 0.000500 Logit Scale: 22.511 Contrastive_loss: 5.1832 (6.6427) Fd_loss: 2.4081 (3.3631) Loss: 7.5913 (10.006)
2025-07-18,20:10:13 | INFO | Train Epoch: 0 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.365, 2997.10/s, 187.319/s/gpu LR: 0.000500 Logit Scale: 23.204 Contrastive_loss: 5.0312 (6.5726) Fd_loss: 2.3603 (3.3195) Loss: 7.3915 (9.8921)
2025-07-18,20:11:54 | INFO | Train Epoch: 0 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.368, 2999.46/s, 187.466/s/gpu LR: 0.000500 Logit Scale: 23.892 Contrastive_loss: 4.9292 (6.5041) Fd_loss: 2.3542 (3.2793) Loss: 7.2834 (9.7834)
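Each train line reports losses as `value (running mean over the epoch)`, and the total Loss is the sum of Contrastive_loss and Fd_loss (the feature-distillation term against the DFN2B teacher). A sketch of that bookkeeping, reproduced from the first two steps of epoch 0:

```python
# Running-mean bookkeeping as printed in the log: "value (running mean)".
class AverageMeter:
    def __init__(self):
        self.sum, self.count = 0.0, 0

    def update(self, val):
        self.sum += val
        self.count += 1

    @property
    def avg(self):
        return self.sum / self.count

contrastive = AverageMeter()
for v in (8.3942, 8.3110):          # first two Contrastive_loss values of epoch 0
    contrastive.update(v)

# The log shows "Contrastive_loss: 8.3110 (8.3526)", and Loss = Contrastive + Fd:
print(round(contrastive.avg, 4))    # 8.3526
print(round(8.3942 + 10.258, 3))    # 18.652, matching "Loss: 18.652 (18.652)"
```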
2025-07-18,20:11:57 | INFO | Start epoch 1
2025-07-18,20:12:11 | INFO | Train Epoch: 1 [ 4096/9319509 (0%)] Data (t): 8.494 Batch (t): 13.587, 301.466/s, 18.8416/s/gpu LR: 0.000500 Logit Scale: 23.901 Contrastive_loss: 4.8777 (4.8777) Fd_loss: 2.3509 (2.3509) Loss: 7.2286 (7.2286)
2025-07-18,20:14:27 | INFO | Train Epoch: 1 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.364, 3004.94/s, 187.809/s/gpu LR: 0.000500 Logit Scale: 24.883 Contrastive_loss: 4.7079 (4.7928) Fd_loss: 2.3163 (2.3336) Loss: 7.0242 (7.1264)
2025-07-18,20:16:44 | INFO | Train Epoch: 1 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.364, 3005.33/s, 187.833/s/gpu LR: 0.000500 Logit Scale: 25.900 Contrastive_loss: 4.6553 (4.7470) Fd_loss: 2.3002 (2.3224) Loss: 6.9554 (7.0694)
2025-07-18,20:19:00 | INFO | Train Epoch: 1 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.365, 2999.01/s, 187.438/s/gpu LR: 0.000500 Logit Scale: 26.914 Contrastive_loss: 4.6195 (4.7151) Fd_loss: 2.2833 (2.3127) Loss: 6.9028 (7.0278)
2025-07-18,20:21:16 | INFO | Train Epoch: 1 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.364, 2997.40/s, 187.337/s/gpu LR: 0.000500 Logit Scale: 27.910 Contrastive_loss: 4.4645 (4.6650) Fd_loss: 2.2464 (2.2994) Loss: 6.7109 (6.9644)
2025-07-18,20:23:33 | INFO | Train Epoch: 1 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.364, 2983.44/s, 186.465/s/gpu LR: 0.000500 Logit Scale: 28.876 Contrastive_loss: 4.3571 (4.6137) Fd_loss: 2.2281 (2.2875) Loss: 6.5852 (6.9012)
2025-07-18,20:25:49 | INFO | Train Epoch: 1 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.366, 3005.66/s, 187.853/s/gpu LR: 0.000500 Logit Scale: 29.740 Contrastive_loss: 4.3527 (4.5764) Fd_loss: 2.1993 (2.2749) Loss: 6.5520 (6.8513)
2025-07-18,20:28:06 | INFO | Train Epoch: 1 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.364, 3003.18/s, 187.699/s/gpu LR: 0.000500 Logit Scale: 30.646 Contrastive_loss: 4.3109 (4.5432) Fd_loss: 2.1962 (2.2651) Loss: 6.5070 (6.8083)
2025-07-18,20:30:22 | INFO | Train Epoch: 1 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.364, 3009.18/s, 188.074/s/gpu LR: 0.000500 Logit Scale: 31.533 Contrastive_loss: 4.1600 (4.5006) Fd_loss: 2.1852 (2.2562) Loss: 6.3452 (6.7568)
2025-07-18,20:32:39 | INFO | Train Epoch: 1 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.363, 3015.47/s, 188.467/s/gpu LR: 0.000500 Logit Scale: 32.438 Contrastive_loss: 4.2513 (4.4757) Fd_loss: 2.1609 (2.2467) Loss: 6.4122 (6.7224)
2025-07-18,20:34:55 | INFO | Train Epoch: 1 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.363, 2993.72/s, 187.108/s/gpu LR: 0.000500 Logit Scale: 33.253 Contrastive_loss: 4.0635 (4.4382) Fd_loss: 2.1373 (2.2367) Loss: 6.2008 (6.6749)
2025-07-18,20:37:11 | INFO | Train Epoch: 1 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.362, 2999.62/s, 187.476/s/gpu LR: 0.000500 Logit Scale: 33.986 Contrastive_loss: 3.9612 (4.3985) Fd_loss: 2.1182 (2.2269) Loss: 6.0795 (6.6253)
2025-07-18,20:39:27 | INFO | Train Epoch: 1 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.363, 3002.48/s, 187.655/s/gpu LR: 0.000499 Logit Scale: 34.706 Contrastive_loss: 3.9482 (4.3638) Fd_loss: 2.1161 (2.2183) Loss: 6.0643 (6.5822)
2025-07-18,20:41:44 | INFO | Train Epoch: 1 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.364, 3000.54/s, 187.534/s/gpu LR: 0.000499 Logit Scale: 35.166 Contrastive_loss: 3.9132 (4.3316) Fd_loss: 2.0940 (2.2095) Loss: 6.0072 (6.5411)
2025-07-18,20:44:00 | INFO | Train Epoch: 1 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.363, 2993.23/s, 187.077/s/gpu LR: 0.000499 Logit Scale: 35.783 Contrastive_loss: 3.8612 (4.3003) Fd_loss: 2.0927 (2.2017) Loss: 5.9540 (6.5020)
2025-07-18,20:46:16 | INFO | Train Epoch: 1 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.362, 3005.67/s, 187.854/s/gpu LR: 0.000499 Logit Scale: 36.391 Contrastive_loss: 3.8334 (4.2711) Fd_loss: 2.0762 (2.1938) Loss: 5.9096 (6.4649)
2025-07-18,20:48:33 | INFO | Train Epoch: 1 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.363, 2994.00/s, 187.125/s/gpu LR: 0.000499 Logit Scale: 36.963 Contrastive_loss: 3.8185 (4.2445) Fd_loss: 2.0586 (2.1859) Loss: 5.8771 (6.4304)
2025-07-18,20:50:49 | INFO | Train Epoch: 1 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.363, 3012.49/s, 188.281/s/gpu LR: 0.000499 Logit Scale: 37.539 Contrastive_loss: 3.7063 (4.2146) Fd_loss: 2.0481 (2.1782) Loss: 5.7545 (6.3928)
2025-07-18,20:53:05 | INFO | Train Epoch: 1 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.363, 2975.18/s, 185.949/s/gpu LR: 0.000499 Logit Scale: 38.042 Contrastive_loss: 3.6909 (4.1870) Fd_loss: 2.0337 (2.1706) Loss: 5.7246 (6.3576)
2025-07-18,20:55:21 | INFO | Train Epoch: 1 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.363, 3011.34/s, 188.209/s/gpu LR: 0.000499 Logit Scale: 38.623 Contrastive_loss: 3.6754 (4.1614) Fd_loss: 2.0251 (2.1633) Loss: 5.7005 (6.3248)
2025-07-18,20:57:38 | INFO | Train Epoch: 1 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.362, 3009.67/s, 188.105/s/gpu LR: 0.000499 Logit Scale: 39.136 Contrastive_loss: 3.5259 (4.1312) Fd_loss: 2.0165 (2.1564) Loss: 5.5423 (6.2875)
2025-07-18,20:59:54 | INFO | Train Epoch: 1 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.362, 3000.40/s, 187.525/s/gpu LR: 0.000499 Logit Scale: 39.647 Contrastive_loss: 3.5411 (4.1043) Fd_loss: 2.0169 (2.1500) Loss: 5.5579 (6.2544)
2025-07-18,21:02:10 | INFO | Train Epoch: 1 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.362, 3013.10/s, 188.318/s/gpu LR: 0.000498 Logit Scale: 40.135 Contrastive_loss: 3.4587 (4.0763) Fd_loss: 1.9873 (2.1429) Loss: 5.4460 (6.2192)
2025-07-18,21:03:51 | INFO | Train Epoch: 1 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.364, 3013.60/s, 188.350/s/gpu LR: 0.000498 Logit Scale: 40.414 Contrastive_loss: 3.4323 (4.0494) Fd_loss: 1.9932 (2.1367) Loss: 5.4255 (6.1861)
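The two throughput figures per line are global samples/s and samples/s/gpu (global divided by world_size: 16). A sketch of that arithmetic; the logged batch times are smoothed, so this reproduces the figures only approximately:

```python
# Throughput columns in each train line: global samples/s and samples/s/gpu.
WORLD_SIZE = 16                   # from the Params dump (world_size: 16)
GLOBAL_BATCH = 256 * WORLD_SIZE   # per-process batch_size x processes = 4096

def throughput(batch_time_s):
    """Return (global samples/s, per-GPU samples/s) for one optimizer step."""
    total = GLOBAL_BATCH / batch_time_s
    return total, total / WORLD_SIZE

total, per_gpu = throughput(1.364)   # batch time from a typical log line
# Lands near the logged ~3000/s and ~187.5/s/gpu.
```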
2025-07-18,21:03:52 | INFO | Starting zero-shot imagenet.
2025-07-18,21:03:52 | INFO | Building zero-shot classifier
2025-07-18,21:04:07 | INFO | Using classifier
2025-07-18,21:05:26 | INFO | Finished zero-shot imagenet.
2025-07-18,21:05:26 | INFO | Eval Epoch: 2 imagenet-zeroshot-val-top1: 0.1019 imagenet-zeroshot-val-top5: 0.2529
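The eval lines report standard top-1/top-5 accuracy on the ImageNet validation split. A minimal sketch of the top-k metric on hypothetical scores (not the actual zero-shot classifier code):

```python
def topk_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)

# Tiny hypothetical example: 2 samples, 4 classes.
scores = [[0.1, 0.7, 0.1, 0.1],   # top-1 prediction: class 1
          [0.4, 0.3, 0.2, 0.1]]   # top-1 prediction: class 0
labels = [1, 2]
print(topk_accuracy(scores, labels, 1))  # 0.5
print(topk_accuracy(scores, labels, 3))  # 1.0
```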
2025-07-18,21:05:27 | INFO | Start epoch 2
2025-07-18,21:05:32 | INFO | Train Epoch: 2 [ 4096/9319509 (0%)] Data (t): 3.864 Batch (t): 5.203, 787.218/s, 49.2011/s/gpu LR: 0.000498 Logit Scale: 40.423 Contrastive_loss: 3.3331 (3.3331) Fd_loss: 1.9888 (1.9888) Loss: 5.3219 (5.3219)
2025-07-18,21:07:48 | INFO | Train Epoch: 2 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.358, 2992.93/s, 187.058/s/gpu LR: 0.000498 Logit Scale: 41.335 Contrastive_loss: 3.3713 (3.3522) Fd_loss: 1.9848 (1.9868) Loss: 5.3561 (5.3390)
2025-07-18,21:10:04 | INFO | Train Epoch: 2 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.363, 3001.24/s, 187.578/s/gpu LR: 0.000498 Logit Scale: 41.883 Contrastive_loss: 3.2970 (3.3338) Fd_loss: 1.9780 (1.9838) Loss: 5.2750 (5.3176)
2025-07-18,21:12:21 | INFO | Train Epoch: 2 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.363, 3008.87/s, 188.054/s/gpu LR: 0.000498 Logit Scale: 42.382 Contrastive_loss: 3.1889 (3.2976) Fd_loss: 1.9888 (1.9851) Loss: 5.1777 (5.2827)
2025-07-18,21:14:37 | INFO | Train Epoch: 2 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.365, 3006.76/s, 187.923/s/gpu LR: 0.000498 Logit Scale: 42.787 Contrastive_loss: 3.1797 (3.2740) Fd_loss: 1.9665 (1.9814) Loss: 5.1462 (5.2554)
2025-07-18,21:16:53 | INFO | Train Epoch: 2 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.362, 3010.77/s, 188.173/s/gpu LR: 0.000498 Logit Scale: 43.129 Contrastive_loss: 3.2554 (3.2709) Fd_loss: 1.9528 (1.9766) Loss: 5.2082 (5.2475)
2025-07-18,21:19:09 | INFO | Train Epoch: 2 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.362, 2999.25/s, 187.453/s/gpu LR: 0.000498 Logit Scale: 43.445 Contrastive_loss: 3.2132 (3.2627) Fd_loss: 1.9406 (1.9715) Loss: 5.1538 (5.2341)
2025-07-18,21:21:26 | INFO | Train Epoch: 2 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.363, 3007.59/s, 187.974/s/gpu LR: 0.000497 Logit Scale: 43.813 Contrastive_loss: 3.2043 (3.2554) Fd_loss: 1.9356 (1.9670) Loss: 5.1399 (5.2224)
2025-07-18,21:23:42 | INFO | Train Epoch: 2 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.363, 3002.94/s, 187.684/s/gpu LR: 0.000497 Logit Scale: 44.122 Contrastive_loss: 3.2070 (3.2500) Fd_loss: 1.9354 (1.9635) Loss: 5.1424 (5.2135)
2025-07-18,21:25:58 | INFO | Train Epoch: 2 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.362, 3003.85/s, 187.740/s/gpu LR: 0.000497 Logit Scale: 44.474 Contrastive_loss: 3.1681 (3.2418) Fd_loss: 1.9269 (1.9598) Loss: 5.0950 (5.2016)
2025-07-18,21:28:14 | INFO | Train Epoch: 2 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.362, 3010.11/s, 188.132/s/gpu LR: 0.000497 Logit Scale: 44.756 Contrastive_loss: 3.0908 (3.2281) Fd_loss: 1.9298 (1.9571) Loss: 5.0206 (5.1852)
2025-07-18,21:30:31 | INFO | Train Epoch: 2 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.362, 3001.34/s, 187.584/s/gpu LR: 0.000497 Logit Scale: 45.119 Contrastive_loss: 3.0895 (3.2165) Fd_loss: 1.9108 (1.9532) Loss: 5.0004 (5.1698)
2025-07-18,21:32:47 | INFO | Train Epoch: 2 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.363, 3010.13/s, 188.133/s/gpu LR: 0.000497 Logit Scale: 45.422 Contrastive_loss: 3.0727 (3.2055) Fd_loss: 1.9089 (1.9498) Loss: 4.9816 (5.1553)
2025-07-18,21:35:03 | INFO | Train Epoch: 2 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.362, 3011.88/s, 188.243/s/gpu LR: 0.000496 Logit Scale: 45.739 Contrastive_loss: 3.0892 (3.1972) Fd_loss: 1.9107 (1.9470) Loss: 5.0000 (5.1442)
2025-07-18,21:37:19 | INFO | Train Epoch: 2 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.362, 3010.02/s, 188.126/s/gpu LR: 0.000496 Logit Scale: 46.076 Contrastive_loss: 3.0424 (3.1868) Fd_loss: 1.8970 (1.9437) Loss: 4.9394 (5.1305)
2025-07-18,21:39:36 | INFO | Train Epoch: 2 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.365, 3001.95/s, 187.622/s/gpu LR: 0.000496 Logit Scale: 46.347 Contrastive_loss: 2.9998 (3.1752) Fd_loss: 1.9076 (1.9414) Loss: 4.9074 (5.1166)
2025-07-18,21:41:52 | INFO | Train Epoch: 2 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.363, 3009.64/s, 188.103/s/gpu LR: 0.000496 Logit Scale: 46.643 Contrastive_loss: 2.9578 (3.1624) Fd_loss: 1.8804 (1.9378) Loss: 4.8382 (5.1002)
2025-07-18,21:44:09 | INFO | Train Epoch: 2 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.363, 2999.53/s, 187.471/s/gpu LR: 0.000496 Logit Scale: 46.991 Contrastive_loss: 3.0298 (3.1550) Fd_loss: 1.8811 (1.9347) Loss: 4.9109 (5.0897)
2025-07-18,21:46:25 | INFO | Train Epoch: 2 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.363, 3004.59/s, 187.787/s/gpu LR: 0.000495 Logit Scale: 47.336 Contrastive_loss: 2.9478 (3.1441) Fd_loss: 1.8741 (1.9315) Loss: 4.8219 (5.0756)
2025-07-18,21:48:41 | INFO | Train Epoch: 2 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.362, 3008.91/s, 188.057/s/gpu LR: 0.000495 Logit Scale: 47.578 Contrastive_loss: 3.0151 (3.1377) Fd_loss: 1.8738 (1.9286) Loss: 4.8889 (5.0663)
2025-07-18,21:50:57 | INFO | Train Epoch: 2 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.363, 3010.34/s, 188.146/s/gpu LR: 0.000495 Logit Scale: 47.893 Contrastive_loss: 2.8522 (3.1241) Fd_loss: 1.8777 (1.9262) Loss: 4.7299 (5.0503)
2025-07-18,21:53:14 | INFO | Train Epoch: 2 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.364, 3003.47/s, 187.717/s/gpu LR: 0.000495 Logit Scale: 48.144 Contrastive_loss: 2.8882 (3.1133) Fd_loss: 1.8686 (1.9236) Loss: 4.7568 (5.0369)
2025-07-18,21:55:30 | INFO | Train Epoch: 2 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.362, 3000.50/s, 187.531/s/gpu LR: 0.000494 Logit Scale: 48.489 Contrastive_loss: 2.9283 (3.1053) Fd_loss: 1.8693 (1.9212) Loss: 4.7976 (5.0265)
2025-07-18,21:57:11 | INFO | Train Epoch: 2 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.365, 3005.82/s, 187.864/s/gpu LR: 0.000494 Logit Scale: 48.765 Contrastive_loss: 2.8534 (3.0948) Fd_loss: 1.8475 (1.9181) Loss: 4.7009 (5.0129)
2025-07-18,21:57:13 | INFO | Start epoch 3
2025-07-18,21:57:23 | INFO | Train Epoch: 3 [ 4096/9319509 (0%)] Data (t): 8.765 Batch (t): 10.532, 388.911/s, 24.3069/s/gpu LR: 0.000494 Logit Scale: 48.766 Contrastive_loss: 2.7222 (2.7222) Fd_loss: 1.8561 (1.8561) Loss: 4.5783 (4.5783)
2025-07-18,21:59:40 | INFO | Train Epoch: 3 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.365, 2998.43/s, 187.402/s/gpu LR: 0.000494 Logit Scale: 49.767 Contrastive_loss: 2.7031 (2.7127) Fd_loss: 1.8735 (1.8648) Loss: 4.5766 (4.5774)
2025-07-18,22:01:56 | INFO | Train Epoch: 3 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.362, 3010.91/s, 188.182/s/gpu LR: 0.000494 Logit Scale: 50.229 Contrastive_loss: 2.6967 (2.7073) Fd_loss: 1.8510 (1.8602) Loss: 4.5477 (4.5675)
2025-07-18,22:04:12 | INFO | Train Epoch: 3 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.363, 3013.51/s, 188.344/s/gpu LR: 0.000494 Logit Scale: 50.579 Contrastive_loss: 2.7549 (2.7192) Fd_loss: 1.8543 (1.8587) Loss: 4.6093 (4.5779)
2025-07-18,22:06:29 | INFO | Train Epoch: 3 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.368, 3012.13/s, 188.258/s/gpu LR: 0.000493 Logit Scale: 50.841 Contrastive_loss: 2.6917 (2.7137) Fd_loss: 1.8619 (1.8594) Loss: 4.5536 (4.5731)
2025-07-18,22:08:45 | INFO | Train Epoch: 3 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.363, 2985.14/s, 186.571/s/gpu LR: 0.000493 Logit Scale: 51.071 Contrastive_loss: 2.6471 (2.7026) Fd_loss: 1.8458 (1.8571) Loss: 4.4929 (4.5597)
2025-07-18,22:11:02 | INFO | Train Epoch: 3 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.362, 3011.51/s, 188.219/s/gpu LR: 0.000493 Logit Scale: 51.329 Contrastive_loss: 2.6815 (2.6996) Fd_loss: 1.8355 (1.8540) Loss: 4.5170 (4.5536)
2025-07-18,22:13:18 | INFO | Train Epoch: 3 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.362, 3005.69/s, 187.856/s/gpu LR: 0.000493 Logit Scale: 51.533 Contrastive_loss: 2.6753 (2.6966) Fd_loss: 1.8336 (1.8515) Loss: 4.5089 (4.5480)
2025-07-18,22:15:34 | INFO | Train Epoch: 3 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.363, 3001.98/s, 187.623/s/gpu LR: 0.000492 Logit Scale: 51.749 Contrastive_loss: 2.6213 (2.6882) Fd_loss: 1.8369 (1.8498) Loss: 4.4582 (4.5380)
2025-07-18,22:17:50 | INFO | Train Epoch: 3 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.362, 3014.47/s, 188.405/s/gpu LR: 0.000492 Logit Scale: 52.014 Contrastive_loss: 2.6142 (2.6808) Fd_loss: 1.8284 (1.8477) Loss: 4.4426 (4.5285)
2025-07-18,22:20:06 | INFO | Train Epoch: 3 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.362, 3017.80/s, 188.612/s/gpu LR: 0.000492 Logit Scale: 52.183 Contrastive_loss: 2.6182 (2.6751) Fd_loss: 1.8378 (1.8468) Loss: 4.4560 (4.5219)
2025-07-18,22:22:23 | INFO | Train Epoch: 3 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.362, 3018.03/s, 188.627/s/gpu LR: 0.000491 Logit Scale: 52.454 Contrastive_loss: 2.6187 (2.6704) Fd_loss: 1.8233 (1.8448) Loss: 4.4420 (4.5153)
2025-07-18,22:24:39 | INFO | Train Epoch: 3 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.363, 2985.68/s, 186.605/s/gpu LR: 0.000491 Logit Scale: 52.660 Contrastive_loss: 2.6129 (2.6660) Fd_loss: 1.8179 (1.8428) Loss: 4.4307 (4.5087)
2025-07-18,22:26:56 | INFO | Train Epoch: 3 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.365, 2979.34/s, 186.209/s/gpu LR: 0.000491 Logit Scale: 52.889 Contrastive_loss: 2.5863 (2.6603) Fd_loss: 1.8107 (1.8405) Loss: 4.3969 (4.5008)
2025-07-18,22:29:12 | INFO | Train Epoch: 3 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.363, 3007.47/s, 187.967/s/gpu LR: 0.000491 Logit Scale: 53.084 Contrastive_loss: 2.6378 (2.6588) Fd_loss: 1.8250 (1.8394) Loss: 4.4629 (4.4982)
2025-07-18,22:31:28 | INFO | Train Epoch: 3 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.363, 3008.78/s, 188.049/s/gpu LR: 0.000490 Logit Scale: 53.351 Contrastive_loss: 2.6047 (2.6554) Fd_loss: 1.8219 (1.8383) Loss: 4.4266 (4.4938)
2025-07-18,22:33:44 | INFO | Train Epoch: 3 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.362, 2997.60/s, 187.350/s/gpu LR: 0.000490 Logit Scale: 53.551 Contrastive_loss: 2.5398 (2.6486) Fd_loss: 1.8141 (1.8369) Loss: 4.3539 (4.4855)
2025-07-18,22:36:01 | INFO | Train Epoch: 3 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.362, 2995.20/s, 187.200/s/gpu LR: 0.000490 Logit Scale: 53.768 Contrastive_loss: 2.6282 (2.6475) Fd_loss: 1.8053 (1.8352) Loss: 4.4335 (4.4826)
2025-07-18,22:38:17 | INFO | Train Epoch: 3 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.362, 3007.50/s, 187.969/s/gpu LR: 0.000489 Logit Scale: 54.095 Contrastive_loss: 2.5515 (2.6424) Fd_loss: 1.8085 (1.8338) Loss: 4.3599 (4.4762)
2025-07-18,22:40:33 | INFO | Train Epoch: 3 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.362, 3003.24/s, 187.702/s/gpu LR: 0.000489 Logit Scale: 54.341 Contrastive_loss: 2.5923 (2.6399) Fd_loss: 1.8000 (1.8321) Loss: 4.3922 (4.4720)
2025-07-18,22:42:49 | INFO | Train Epoch: 3 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.362, 2995.04/s, 187.190/s/gpu LR: 0.000489 Logit Scale: 54.523 Contrastive_loss: 2.5342 (2.6349) Fd_loss: 1.8044 (1.8308) Loss: 4.3386 (4.4656)
2025-07-18,22:45:06 | INFO | Train Epoch: 3 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.363, 3013.81/s, 188.363/s/gpu LR: 0.000488 Logit Scale: 54.735 Contrastive_loss: 2.5834 (2.6325) Fd_loss: 1.7930 (1.8290) Loss: 4.3764 (4.4616)
2025-07-18,22:47:22 | INFO | Train Epoch: 3 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.363, 3003.09/s, 187.693/s/gpu LR: 0.000488 Logit Scale: 54.988 Contrastive_loss: 2.5878 (2.6306) Fd_loss: 1.7901 (1.8273) Loss: 4.3778 (4.4579)
2025-07-18,22:49:03 | INFO | Train Epoch: 3 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.363, 3017.31/s, 188.582/s/gpu LR: 0.000488 Logit Scale: 55.160 Contrastive_loss: 2.4775 (2.6242) Fd_loss: 1.8001 (1.8262) Loss: 4.2776 (4.4504)
2025-07-18,22:49:04 | INFO | Starting zero-shot imagenet.
2025-07-18,22:49:04 | INFO | Building zero-shot classifier
2025-07-18,22:49:19 | INFO | Using classifier
2025-07-18,22:50:39 | INFO | Finished zero-shot imagenet.
2025-07-18,22:50:39 | INFO | Eval Epoch: 4 imagenet-zeroshot-val-top1: 0.1662 imagenet-zeroshot-val-top5: 0.3723
2025-07-18,22:50:40 | INFO | Start epoch 4
2025-07-18,22:50:46 | INFO | Train Epoch: 4 [ 4096/9319509 (0%)] Data (t): 4.405 Batch (t): 5.807, 705.412/s, 44.0882/s/gpu LR: 0.000488 Logit Scale: 55.166 Contrastive_loss: 2.3275 (2.3275) Fd_loss: 1.7704 (1.7704) Loss: 4.0979 (4.0979)
2025-07-18,22:53:02 | INFO | Train Epoch: 4 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.360, 2990.06/s, 186.879/s/gpu LR: 0.000487 Logit Scale: 56.324 Contrastive_loss: 2.3286 (2.3281) Fd_loss: 1.7909 (1.7807) Loss: 4.1196 (4.1087)
2025-07-18,22:55:18 | INFO | Train Epoch: 4 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.365, 3003.37/s, 187.711/s/gpu LR: 0.000487 Logit Scale: 56.765 Contrastive_loss: 2.3343 (2.3301) Fd_loss: 1.8014 (1.7876) Loss: 4.1357 (4.1177)
2025-07-18,22:57:34 | INFO | Train Epoch: 4 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.365, 2990.97/s, 186.936/s/gpu LR: 0.000487 Logit Scale: 56.980 Contrastive_loss: 2.3490 (2.3349) Fd_loss: 1.7917 (1.7886) Loss: 4.1407 (4.1235)
2025-07-18,22:59:51 | INFO | Train Epoch: 4 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.364, 3008.85/s, 188.053/s/gpu LR: 0.000486 Logit Scale: 57.278 Contrastive_loss: 2.3424 (2.3364) Fd_loss: 1.7722 (1.7853) Loss: 4.1145 (4.1217)
2025-07-18,23:02:07 | INFO | Train Epoch: 4 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.363, 3015.81/s, 188.488/s/gpu LR: 0.000486 Logit Scale: 57.446 Contrastive_loss: 2.3993 (2.3469) Fd_loss: 1.7761 (1.7838) Loss: 4.1754 (4.1306)
2025-07-18,23:04:24 | INFO | Train Epoch: 4 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.363, 3002.11/s, 187.632/s/gpu LR: 0.000486 Logit Scale: 57.665 Contrastive_loss: 2.3738 (2.3507) Fd_loss: 1.7808 (1.7834) Loss: 4.1546 (4.1341)
2025-07-18,23:06:40 | INFO | Train Epoch: 4 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.362, 3012.13/s, 188.258/s/gpu LR: 0.000485 Logit Scale: 57.792 Contrastive_loss: 2.3486 (2.3504) Fd_loss: 1.7773 (1.7826) Loss: 4.1259 (4.1330)
2025-07-18,23:08:56 | INFO | Train Epoch: 4 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.363, 3009.67/s, 188.104/s/gpu LR: 0.000485 Logit Scale: 58.005 Contrastive_loss: 2.3668 (2.3523) Fd_loss: 1.7766 (1.7819) Loss: 4.1434 (4.1342)
2025-07-18,23:11:12 | INFO | Train Epoch: 4 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.363, 3003.33/s, 187.708/s/gpu LR: 0.000484 Logit Scale: 58.173 Contrastive_loss: 2.3388 (2.3509) Fd_loss: 1.7670 (1.7804) Loss: 4.1057 (4.1314)
2025-07-18,23:13:29 | INFO | Train Epoch: 4 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.363, 3000.41/s, 187.526/s/gpu LR: 0.000484 Logit Scale: 58.325 Contrastive_loss: 2.3423 (2.3501) Fd_loss: 1.7602 (1.7786) Loss: 4.1026 (4.1287)
2025-07-18,23:15:45 | INFO | Train Epoch: 4 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.363, 3004.54/s, 187.784/s/gpu LR: 0.000484 Logit Scale: 58.377 Contrastive_loss: 3.0296 (2.4068) Fd_loss: 1.9043 (1.7891) Loss: 4.9339 (4.1958)
2025-07-18,23:18:01 | INFO | Train Epoch: 4 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.363, 2998.35/s, 187.397/s/gpu LR: 0.000483 Logit Scale: 58.596 Contrastive_loss: 2.0741 (2.3812) Fd_loss: 1.7923 (1.7893) Loss: 3.8664 (4.1705)
2025-07-18,23:20:17 | INFO | Train Epoch: 4 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.362, 3016.15/s, 188.510/s/gpu LR: 0.000483 Logit Scale: 59.234 Contrastive_loss: 2.0042 (2.3542) Fd_loss: 1.7859 (1.7891) Loss: 3.7902 (4.1433)
2025-07-18,23:22:34 | INFO | Train Epoch: 4 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.362, 3020.24/s, 188.765/s/gpu LR: 0.000482 Logit Scale: 59.694 Contrastive_loss: 2.0072 (2.3311) Fd_loss: 1.7844 (1.7888) Loss: 3.7916 (4.1199)
2025-07-18,23:24:50 | INFO | Train Epoch: 4 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.363, 3003.70/s, 187.731/s/gpu LR: 0.000482 Logit Scale: 60.206 Contrastive_loss: 1.9483 (2.3072) Fd_loss: 1.7745 (1.7879) Loss: 3.7228 (4.0951)
2025-07-18,23:27:06 | INFO | Train Epoch: 4 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.363, 3007.30/s, 187.956/s/gpu LR: 0.000482 Logit Scale: 60.559 Contrastive_loss: 1.9881 (2.2884) Fd_loss: 1.7728 (1.7870) Loss: 3.7609 (4.0754)
2025-07-18,23:29:23 | INFO | Train Epoch: 4 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.365, 2981.98/s, 186.374/s/gpu LR: 0.000481 Logit Scale: 60.969 Contrastive_loss: 1.9283 (2.2684) Fd_loss: 1.7747 (1.7863) Loss: 3.7030 (4.0547)
2025-07-18,23:31:39 | INFO | Train Epoch: 4 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.366, 3005.45/s, 187.840/s/gpu LR: 0.000481 Logit Scale: 61.401 Contrastive_loss: 1.9877 (2.2536) Fd_loss: 1.7655 (1.7852) Loss: 3.7532 (4.0388)
2025-07-18,23:33:56 | INFO | Train Epoch: 4 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.364, 3002.15/s, 187.634/s/gpu LR: 0.000480 Logit Scale: 61.715 Contrastive_loss: 1.9370 (2.2378) Fd_loss: 1.7584 (1.7839) Loss: 3.6954 (4.0217)
2025-07-18,23:36:12 | INFO | Train Epoch: 4 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.364, 3012.34/s, 188.271/s/gpu LR: 0.000480 Logit Scale: 61.941 Contrastive_loss: 1.9571 (2.2244) Fd_loss: 1.7595 (1.7827) Loss: 3.7166 (4.0071)
2025-07-18,23:38:28 | INFO | Train Epoch: 4 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.363, 3008.80/s, 188.050/s/gpu LR: 0.000479 Logit Scale: 62.242 Contrastive_loss: 1.9318 (2.2111) Fd_loss: 1.7499 (1.7812) Loss: 3.6817 (3.9924)
2025-07-18,23:40:45 | INFO | Train Epoch: 4 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.363, 3013.20/s, 188.325/s/gpu LR: 0.000479 Logit Scale: 62.511 Contrastive_loss: 1.9324 (2.1990) Fd_loss: 1.7582 (1.7802) Loss: 3.6906 (3.9792)
2025-07-18,23:42:26 | INFO | Train Epoch: 4 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.364, 3012.41/s, 188.276/s/gpu LR: 0.000479 Logit Scale: 62.720 Contrastive_loss: 1.9705 (2.1895) Fd_loss: 1.7571 (1.7793) Loss: 3.7276 (3.9687)
2025-07-18,23:42:27 | INFO | Start epoch 5
2025-07-18,23:42:39 | INFO | Train Epoch: 5 [ 4096/9319509 (0%)] Data (t): 9.263 Batch (t): 11.144, 367.561/s, 22.9726/s/gpu LR: 0.000479 Logit Scale: 62.716 Contrastive_loss: 1.7395 (1.7395) Fd_loss: 1.7403 (1.7403) Loss: 3.4798 (3.4798)
2025-07-18,23:44:55 | INFO | Train Epoch: 5 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.365, 3012.71/s, 188.294/s/gpu LR: 0.000478 Logit Scale: 64.004 Contrastive_loss: 1.7827 (1.7611) Fd_loss: 1.7540 (1.7472) Loss: 3.5367 (3.5083)
2025-07-18,23:47:11 | INFO | Train Epoch: 5 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.361, 3005.11/s, 187.819/s/gpu LR: 0.000478 Logit Scale: 64.533 Contrastive_loss: 1.7159 (1.7460) Fd_loss: 1.7603 (1.7515) Loss: 3.4762 (3.4976)
2025-07-18,23:49:27 | INFO | Train Epoch: 5 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.361, 3005.21/s, 187.826/s/gpu LR: 0.000477 Logit Scale: 64.814 Contrastive_loss: 1.7788 (1.7542) Fd_loss: 1.7388 (1.7484) Loss: 3.5176 (3.5026)
2025-07-18,23:51:43 | INFO | Train Epoch: 5 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.362, 3010.58/s, 188.161/s/gpu LR: 0.000477 Logit Scale: 65.162 Contrastive_loss: 1.7939 (1.7622) Fd_loss: 1.7423 (1.7472) Loss: 3.5362 (3.5093)
2025-07-18,23:54:00 | INFO | Train Epoch: 5 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.362, 3015.44/s, 188.465/s/gpu LR: 0.000476 Logit Scale: 65.419 Contrastive_loss: 1.8534 (1.7774) Fd_loss: 1.7450 (1.7468) Loss: 3.5984 (3.5242)
2025-07-18,23:56:16 | INFO | Train Epoch: 5 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.361, 3009.96/s, 188.123/s/gpu LR: 0.000476 Logit Scale: 65.600 Contrastive_loss: 1.8228 (1.7839) Fd_loss: 1.7332 (1.7449) Loss: 3.5561 (3.5287)
2025-07-18,23:58:32 | INFO | Train Epoch: 5 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.363, 2999.06/s, 187.441/s/gpu LR: 0.000475 Logit Scale: 65.769 Contrastive_loss: 1.8082 (1.7869) Fd_loss: 1.7436 (1.7447) Loss: 3.5519 (3.5316)
2025-07-19,00:00:48 | INFO | Train Epoch: 5 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.365, 3007.94/s, 187.996/s/gpu LR: 0.000475 Logit Scale: 65.959 Contrastive_loss: 1.7627 (1.7842) Fd_loss: 1.7453 (1.7448) Loss: 3.5081 (3.5290)
2025-07-19,00:03:05 | INFO | Train Epoch: 5 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.363, 2993.70/s, 187.106/s/gpu LR: 0.000474 Logit Scale: 65.911 Contrastive_loss: 1.8162 (1.7874) Fd_loss: 1.7429 (1.7446) Loss: 3.5590 (3.5320)
2025-07-19,00:05:21 | INFO | Train Epoch: 5 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.362, 3004.16/s, 187.760/s/gpu LR: 0.000474 Logit Scale: 66.101 Contrastive_loss: 1.8876 (1.7965) Fd_loss: 1.7411 (1.7443) Loss: 3.6287 (3.5408)
2025-07-19,00:07:37 | INFO | Train Epoch: 5 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.362, 3014.44/s, 188.402/s/gpu LR: 0.000473 Logit Scale: 66.269 Contrastive_loss: 1.8671 (1.8024) Fd_loss: 1.7361 (1.7436) Loss: 3.6032 (3.5460)
2025-07-19,00:09:53 | INFO | Train Epoch: 5 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.361, 3003.40/s, 187.712/s/gpu LR: 0.000473 Logit Scale: 66.540 Contrastive_loss: 1.8308 (1.8046) Fd_loss: 1.7447 (1.7437) Loss: 3.5755 (3.5483)
2025-07-19,00:12:09 | INFO | Train Epoch: 5 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.361, 3007.38/s, 187.962/s/gpu LR: 0.000472 Logit Scale: 66.722 Contrastive_loss: 1.8075 (1.8048) Fd_loss: 1.7339 (1.7430) Loss: 3.5414 (3.5478)
2025-07-19,00:14:25 | INFO | Train Epoch: 5 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.361, 3008.39/s, 188.024/s/gpu LR: 0.000472 Logit Scale: 66.901 Contrastive_loss: 1.8452 (1.8075) Fd_loss: 1.7346 (1.7424) Loss: 3.5797 (3.5499)
2025-07-19,00:16:42 | INFO | Train Epoch: 5 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.361, 3002.02/s, 187.626/s/gpu LR: 0.000471 Logit Scale: 67.055 Contrastive_loss: 1.7641 (1.8048) Fd_loss: 1.7216 (1.7411) Loss: 3.4857 (3.5459)
2025-07-19,00:18:58 | INFO | Train Epoch: 5 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.360, 3010.68/s, 188.167/s/gpu LR: 0.000471 Logit Scale: 67.209 Contrastive_loss: 1.7513 (1.8016) Fd_loss: 1.7367 (1.7409) Loss: 3.4881 (3.5425)
2025-07-19,00:21:14 | INFO | Train Epoch: 5 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.360, 3005.71/s, 187.857/s/gpu LR: 0.000470 Logit Scale: 67.255 Contrastive_loss: 1.6600 (1.7938) Fd_loss: 1.7483 (1.7413) Loss: 3.4083 (3.5350)
2025-07-19,00:23:30 | INFO | Train Epoch: 5 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.360, 3015.32/s, 188.458/s/gpu LR: 0.000470 Logit Scale: 68.339 Contrastive_loss: 1.7257 (1.7902) Fd_loss: 1.7325 (1.7408) Loss: 3.4582 (3.5310)
2025-07-19,00:25:46 | INFO | Train Epoch: 5 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.363, 3008.88/s, 188.055/s/gpu LR: 0.000469 Logit Scale: 69.059 Contrastive_loss: 1.6327 (1.7823) Fd_loss: 1.7149 (1.7395) Loss: 3.3476 (3.5218)
2025-07-19,00:28:02 | INFO | Train Epoch: 5 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.363, 3016.87/s, 188.554/s/gpu LR: 0.000469 Logit Scale: 69.591 Contrastive_loss: 1.6268 (1.7749) Fd_loss: 1.7293 (1.7390) Loss: 3.3561 (3.5139)
2025-07-19,00:30:18 | INFO | Train Epoch: 5 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.362, 3003.00/s, 187.687/s/gpu LR: 0.000468 Logit Scale: 69.961 Contrastive_loss: 1.6722 (1.7702) Fd_loss: 1.7398 (1.7391) Loss: 3.4120 (3.5093)
2025-07-19,00:32:34 | INFO | Train Epoch: 5 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.361, 3005.08/s, 187.817/s/gpu LR: 0.000468 Logit Scale: 70.282 Contrastive_loss: 1.6563 (1.7653) Fd_loss: 1.7313 (1.7387) Loss: 3.3875 (3.5040)
2025-07-19,00:34:15 | INFO | Train Epoch: 5 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.362, 3003.36/s, 187.710/s/gpu LR: 0.000467 Logit Scale: 70.528 Contrastive_loss: 1.6355 (1.7599) Fd_loss: 1.7267 (1.7382) Loss: 3.3622 (3.4981)
2025-07-19,00:34:16 | INFO | Starting zero-shot imagenet.
2025-07-19,00:34:16 | INFO | Building zero-shot classifier
2025-07-19,00:34:31 | INFO | Using classifier
2025-07-19,00:35:49 | INFO | Finished zero-shot imagenet.
2025-07-19,00:35:49 | INFO | Eval Epoch: 6 imagenet-zeroshot-val-top1: 0.1807 imagenet-zeroshot-val-top5: 0.3958
2025-07-19,00:35:50 | INFO | Start epoch 6
2025-07-19,00:35:56 | INFO | Train Epoch: 6 [ 4096/9319509 (0%)] Data (t): 5.090 Batch (t): 6.429, 637.086/s, 39.8179/s/gpu LR: 0.000467 Logit Scale: 70.533 Contrastive_loss: 1.4360 (1.4360) Fd_loss: 1.7242 (1.7242) Loss: 3.1602 (3.1602)
2025-07-19,00:38:13 | INFO | Train Epoch: 6 [ 413696/9319509 (4%)] Data (t): 0.000 Batch (t): 1.361, 2997.25/s, 187.328/s/gpu LR: 0.000467 Logit Scale: 71.929 Contrastive_loss: 1.4816 (1.4588) Fd_loss: 1.7252 (1.7247) Loss: 3.2068 (3.1835)
2025-07-19,00:40:29 | INFO | Train Epoch: 6 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.362, 3012.43/s, 188.277/s/gpu LR: 0.000466 Logit Scale: 72.527 Contrastive_loss: 1.4468 (1.4548) Fd_loss: 1.7190 (1.7228) Loss: 3.1659 (3.1776)
2025-07-19,00:42:45 | INFO | Train Epoch: 6 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.361, 3003.33/s, 187.708/s/gpu LR: 0.000466 Logit Scale: 73.017 Contrastive_loss: 1.4786 (1.4608) Fd_loss: 1.7171 (1.7214) Loss: 3.1957 (3.1822)
2025-07-19,00:45:01 | INFO | Train Epoch: 6 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.361, 3001.97/s, 187.623/s/gpu LR: 0.000465 Logit Scale: 73.257 Contrastive_loss: 1.4534 (1.4593) Fd_loss: 1.7196 (1.7210) Loss: 3.1731 (3.1803)
2025-07-19,00:47:17 | INFO | Train Epoch: 6 [2052096/9319509 (22%)] Data (t): 0.000 Batch (t): 1.360, 3014.26/s, 188.391/s/gpu LR: 0.000465 Logit Scale: 73.602 Contrastive_loss: 1.5075 (1.4673) Fd_loss: 1.7110 (1.7194) Loss: 3.2185 (3.1867)
2025-07-19,00:49:33 | INFO | Train Epoch: 6 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.361, 3012.01/s, 188.250/s/gpu LR: 0.000464 Logit Scale: 73.339 Contrastive_loss: 1.5904 (1.4849) Fd_loss: 1.7377 (1.7220) Loss: 3.3281 (3.2069)
2025-07-19,00:51:49 | INFO | Train Epoch: 6 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.361, 3009.06/s, 188.066/s/gpu LR: 0.000463 Logit Scale: 73.536 Contrastive_loss: 1.5724 (1.4958) Fd_loss: 1.7246 (1.7223) Loss: 3.2970 (3.2182)
2025-07-19,00:54:05 | INFO | Train Epoch: 6 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.360, 3010.00/s, 188.125/s/gpu LR: 0.000463 Logit Scale: 73.575 Contrastive_loss: 1.5987 (1.5073) Fd_loss: 1.7248 (1.7226) Loss: 3.3235 (3.2299)
2025-07-19,00:56:21 | INFO | Train Epoch: 6 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.360, 3007.04/s, 187.940/s/gpu LR: 0.000462 Logit Scale: 73.566 Contrastive_loss: 1.5413 (1.5107) Fd_loss: 1.7268 (1.7230) Loss: 3.2680 (3.2337)
2025-07-19,00:58:37 | INFO | Train Epoch: 6 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.362, 3002.68/s, 187.667/s/gpu LR: 0.000462 Logit Scale: 73.679 Contrastive_loss: 1.5035 (1.5100) Fd_loss: 1.7097 (1.7218) Loss: 3.2132 (3.2318)
2025-07-19,01:00:53 | INFO | Train Epoch: 6 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.362, 3006.77/s, 187.923/s/gpu LR: 0.000461 Logit Scale: 73.776 Contrastive_loss: 1.5695 (1.5150) Fd_loss: 1.7102 (1.7208) Loss: 3.2797 (3.2358)
2025-07-19,01:03:10 | INFO | Train Epoch: 6 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.362, 3000.78/s, 187.549/s/gpu LR: 0.000460 Logit Scale: 73.847 Contrastive_loss: 1.5547 (1.5180) Fd_loss: 1.7262 (1.7212) Loss: 3.2808 (3.2393)
2025-07-19,01:05:26 | INFO | Train Epoch: 6 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.362, 3010.52/s, 188.157/s/gpu LR: 0.000460 Logit Scale: 73.912 Contrastive_loss: 1.5168 (1.5179) Fd_loss: 1.6993 (1.7197) Loss: 3.2160 (3.2376)
2025-07-19,01:07:42 | INFO | Train Epoch: 6 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.362, 3011.60/s, 188.225/s/gpu LR: 0.000459 Logit Scale: 73.996 Contrastive_loss: 1.4725 (1.5149) Fd_loss: 1.7091 (1.7190) Loss: 3.1817 (3.2339)
2025-07-19,01:09:58 | INFO | Train Epoch: 6 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.361, 3006.15/s, 187.884/s/gpu LR: 0.000459 Logit Scale: 74.186 Contrastive_loss: 1.4950 (1.5137) Fd_loss: 1.7065 (1.7182) Loss: 3.2015 (3.2319)
2025-07-19,01:12:14 | INFO | Train Epoch: 6 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.361, 3008.43/s, 188.027/s/gpu LR: 0.000458 Logit Scale: 74.226 Contrastive_loss: 1.5002 (1.5129) Fd_loss: 1.6996 (1.7171) Loss: 3.1999 (3.2300)
2025-07-19,01:14:30 | INFO | Train Epoch: 6 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.361, 3004.47/s, 187.779/s/gpu LR: 0.000457 Logit Scale: 74.387 Contrastive_loss: 1.5185 (1.5132) Fd_loss: 1.6900 (1.7156) Loss: 3.2085 (3.2288)
2025-07-19,01:16:46 | INFO | Train Epoch: 6 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.361, 3005.20/s, 187.825/s/gpu LR: 0.000457 Logit Scale: 74.518 Contrastive_loss: 1.4978 (1.5124) Fd_loss: 1.6849 (1.7140) Loss: 3.1827 (3.2264)
2025-07-19,01:19:03 | INFO | Train Epoch: 6 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.361, 3009.02/s, 188.064/s/gpu LR: 0.000456 Logit Scale: 74.686 Contrastive_loss: 1.5233 (1.5129) Fd_loss: 1.7005 (1.7133) Loss: 3.2238 (3.2262)
2025-07-19,01:21:19 | INFO | Train Epoch: 6 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.361, 3012.49/s, 188.281/s/gpu LR: 0.000456 Logit Scale: 74.803 Contrastive_loss: 1.5512 (1.5147) Fd_loss: 1.6873 (1.7121) Loss: 3.2385 (3.2268)
2025-07-19,01:23:35 | INFO | Train Epoch: 6 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.360, 3014.77/s, 188.423/s/gpu LR: 0.000455 Logit Scale: 74.923 Contrastive_loss: 1.4967 (1.5139) Fd_loss: 1.6921 (1.7112) Loss: 3.1888 (3.2251)
2025-07-19,01:25:51 | INFO | Train Epoch: 6 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.362, 3005.62/s, 187.851/s/gpu LR: 0.000454 Logit Scale: 75.059 Contrastive_loss: 1.5003 (1.5133) Fd_loss: 1.6886 (1.7102) Loss: 3.1889 (3.2235)
2025-07-19,01:27:32 | INFO | Train Epoch: 6 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.365, 3007.51/s, 187.969/s/gpu LR: 0.000454 Logit Scale: 75.091 Contrastive_loss: 1.5515 (1.5149) Fd_loss: 1.6928 (1.7095) Loss: 3.2443 (3.2244)
2025-07-19,01:27:34 | INFO | Start epoch 7
2025-07-19,01:27:46 | INFO | Train Epoch: 7 [ 4096/9319509 (0%)] Data (t): 10.151 Batch (t): 12.378, 330.899/s, 20.6812/s/gpu LR: 0.000454 Logit Scale: 75.090 Contrastive_loss: 1.3377 (1.3377) Fd_loss: 1.6885 (1.6885) Loss: 3.0261 (3.0261)
2025-07-19,01:30:02 | INFO | Train Epoch: 7 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.365, 3008.13/s, 188.008/s/gpu LR: 0.000453 Logit Scale: 76.525 Contrastive_loss: 1.3086 (1.3231) Fd_loss: 1.6801 (1.6843) Loss: 2.9887 (3.0074)
2025-07-19,01:32:19 | INFO | Train Epoch: 7 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.362, 3001.08/s, 187.567/s/gpu LR: 0.000452 Logit Scale: 76.860 Contrastive_loss: 1.3413 (1.3292) Fd_loss: 1.6816 (1.6834) Loss: 3.0228 (3.0126)
2025-07-19,01:34:35 | INFO | Train Epoch: 7 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.361, 3017.53/s, 188.596/s/gpu LR: 0.000452 Logit Scale: 77.040 Contrastive_loss: 1.3663 (1.3385) Fd_loss: 1.7010 (1.6878) Loss: 3.0674 (3.0263)
2025-07-19,01:36:51 | INFO | Train Epoch: 7 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.361, 3017.36/s, 188.585/s/gpu LR: 0.000451 Logit Scale: 77.202 Contrastive_loss: 1.3748 (1.3457) Fd_loss: 1.6913 (1.6885) Loss: 3.0660 (3.0342)
2025-07-19,01:39:07 | INFO | Train Epoch: 7 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.361, 3001.18/s, 187.574/s/gpu LR: 0.000451 Logit Scale: 77.270 Contrastive_loss: 1.4364 (1.3608) Fd_loss: 1.6828 (1.6875) Loss: 3.1192 (3.0484)
2025-07-19,01:41:23 | INFO | Train Epoch: 7 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.361, 2994.81/s, 187.176/s/gpu LR: 0.000450 Logit Scale: 77.420 Contrastive_loss: 1.3744 (1.3628) Fd_loss: 1.6779 (1.6862) Loss: 3.0523 (3.0489)
2025-07-19,01:43:39 | INFO | Train Epoch: 7 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.361, 3017.22/s, 188.577/s/gpu LR: 0.000449 Logit Scale: 77.526 Contrastive_loss: 1.3435 (1.3604) Fd_loss: 1.6829 (1.6858) Loss: 3.0264 (3.0461)
2025-07-19,01:45:55 | INFO | Train Epoch: 7 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.361, 3017.99/s, 188.625/s/gpu LR: 0.000449 Logit Scale: 77.509 Contrastive_loss: 1.4248 (1.3675) Fd_loss: 1.6835 (1.6855) Loss: 3.1083 (3.0530)
2025-07-19,01:48:11 | INFO | Train Epoch: 7 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.361, 3002.25/s, 187.641/s/gpu LR: 0.000448 Logit Scale: 77.762 Contrastive_loss: 1.4180 (1.3726) Fd_loss: 1.6904 (1.6860) Loss: 3.1083 (3.0586)
2025-07-19,01:50:27 | INFO | Train Epoch: 7 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.360, 3019.26/s, 188.704/s/gpu LR: 0.000447 Logit Scale: 77.817 Contrastive_loss: 1.3861 (1.3738) Fd_loss: 1.6841 (1.6858) Loss: 3.0703 (3.0596)
2025-07-19,01:52:43 | INFO | Train Epoch: 7 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.361, 3016.51/s, 188.532/s/gpu LR: 0.000446 Logit Scale: 77.868 Contrastive_loss: 1.3938 (1.3755) Fd_loss: 1.6636 (1.6840) Loss: 3.0573 (3.0594)
2025-07-19,01:54:59 | INFO | Train Epoch: 7 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.361, 3018.01/s, 188.626/s/gpu LR: 0.000446 Logit Scale: 78.064 Contrastive_loss: 1.4519 (1.3814) Fd_loss: 1.6792 (1.6836) Loss: 3.1310 (3.0650)
2025-07-19,01:57:15 | INFO | Train Epoch: 7 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.360, 3014.10/s, 188.381/s/gpu LR: 0.000445 Logit Scale: 78.073 Contrastive_loss: 1.4264 (1.3846) Fd_loss: 1.6858 (1.6838) Loss: 3.1122 (3.0683)
2025-07-19,01:59:31 | INFO | Train Epoch: 7 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.360, 3015.98/s, 188.499/s/gpu LR: 0.000444 Logit Scale: 78.152 Contrastive_loss: 1.4161 (1.3867) Fd_loss: 1.6723 (1.6830) Loss: 3.0885 (3.0697)
2025-07-19,02:01:48 | INFO | Train Epoch: 7 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.361, 3014.28/s, 188.393/s/gpu LR: 0.000444 Logit Scale: 78.395 Contrastive_loss: 1.3673 (1.3855) Fd_loss: 1.6707 (1.6822) Loss: 3.0380 (3.0677)
2025-07-19,02:04:04 | INFO | Train Epoch: 7 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.364, 3001.55/s, 187.597/s/gpu LR: 0.000443 Logit Scale: 78.347 Contrastive_loss: 1.4515 (1.3894) Fd_loss: 1.6639 (1.6811) Loss: 3.1154 (3.0705)
2025-07-19,02:06:20 | INFO | Train Epoch: 7 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.364, 3011.94/s, 188.246/s/gpu LR: 0.000442 Logit Scale: 78.531 Contrastive_loss: 1.3920 (1.3895) Fd_loss: 1.6718 (1.6806) Loss: 3.0639 (3.0701)
2025-07-19,02:08:36 | INFO | Train Epoch: 7 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.362, 3011.88/s, 188.242/s/gpu LR: 0.000442 Logit Scale: 78.500 Contrastive_loss: 1.4301 (1.3916) Fd_loss: 1.6672 (1.6799) Loss: 3.0973 (3.0716)
2025-07-19,02:10:53 | INFO | Train Epoch: 7 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.362, 3004.23/s, 187.764/s/gpu LR: 0.000441 Logit Scale: 78.684 Contrastive_loss: 1.3986 (1.3920) Fd_loss: 1.6599 (1.6789) Loss: 3.0585 (3.0709)
2025-07-19,02:13:09 | INFO | Train Epoch: 7 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.361, 3011.76/s, 188.235/s/gpu LR: 0.000440 Logit Scale: 78.750 Contrastive_loss: 1.3976 (1.3923) Fd_loss: 1.6622 (1.6781) Loss: 3.0597 (3.0704)
2025-07-19,02:15:25 | INFO | Train Epoch: 7 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.361, 3013.86/s, 188.366/s/gpu LR: 0.000439 Logit Scale: 78.915 Contrastive_loss: 1.3722 (1.3913) Fd_loss: 1.6550 (1.6771) Loss: 3.0272 (3.0684)
2025-07-19,02:17:41 | INFO | Train Epoch: 7 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.360, 3012.38/s, 188.274/s/gpu LR: 0.000439 Logit Scale: 79.054 Contrastive_loss: 1.4096 (1.3921) Fd_loss: 1.6657 (1.6766) Loss: 3.0752 (3.0687)
2025-07-19,02:19:22 | INFO | Train Epoch: 7 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.362, 3014.91/s, 188.432/s/gpu LR: 0.000438 Logit Scale: 79.157 Contrastive_loss: 1.4397 (1.3941) Fd_loss: 1.6579 (1.6758) Loss: 3.0977 (3.0699)
2025-07-19,02:19:23 | INFO | Starting zero-shot imagenet.
2025-07-19,02:19:23 | INFO | Building zero-shot classifier
2025-07-19,02:19:38 | INFO | Using classifier
2025-07-19,02:20:48 | INFO | Finished zero-shot imagenet.
2025-07-19,02:20:48 | INFO | Eval Epoch: 8 imagenet-zeroshot-val-top1: 0.2099 imagenet-zeroshot-val-top5: 0.4443
2025-07-19,02:20:49 | INFO | Start epoch 8
2025-07-19,02:20:56 | INFO | Train Epoch: 8 [ 4096/9319509 (0%)] Data (t): 5.331 Batch (t): 6.682, 612.962/s, 38.3101/s/gpu LR: 0.000438 Logit Scale: 79.163 Contrastive_loss: 1.2734 (1.2734) Fd_loss: 1.6644 (1.6644) Loss: 2.9379 (2.9379)
2025-07-19,02:23:12 | INFO | Train Epoch: 8 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.362, 3009.77/s, 188.111/s/gpu LR: 0.000437 Logit Scale: 80.645 Contrastive_loss: 1.2033 (1.2384) Fd_loss: 1.6604 (1.6624) Loss: 2.8637 (2.9008)
2025-07-19,02:25:28 | INFO | Train Epoch: 8 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.363, 3010.36/s, 188.148/s/gpu LR: 0.000437 Logit Scale: 81.037 Contrastive_loss: 1.2351 (1.2373) Fd_loss: 1.6606 (1.6618) Loss: 2.8958 (2.8991)
2025-07-19,02:27:44 | INFO | Train Epoch: 8 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.363, 3012.71/s, 188.294/s/gpu LR: 0.000436 Logit Scale: 81.281 Contrastive_loss: 1.2637 (1.2439) Fd_loss: 1.6599 (1.6613) Loss: 2.9235 (2.9052)
2025-07-19,02:30:01 | INFO | Train Epoch: 8 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.364, 3007.50/s, 187.969/s/gpu LR: 0.000435 Logit Scale: 81.368 Contrastive_loss: 1.2830 (1.2517) Fd_loss: 1.6594 (1.6610) Loss: 2.9424 (2.9127)
2025-07-19,02:32:17 | INFO | Train Epoch: 8 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.365, 3001.92/s, 187.620/s/gpu LR: 0.000434 Logit Scale: 81.481 Contrastive_loss: 1.2236 (1.2470) Fd_loss: 1.6546 (1.6599) Loss: 2.8782 (2.9069)
2025-07-19,02:34:34 | INFO | Train Epoch: 8 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.363, 3014.38/s, 188.399/s/gpu LR: 0.000434 Logit Scale: 81.535 Contrastive_loss: 1.2420 (1.2463) Fd_loss: 1.6623 (1.6602) Loss: 2.9043 (2.9065)
2025-07-19,02:36:50 | INFO | Train Epoch: 8 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.362, 3011.02/s, 188.189/s/gpu LR: 0.000433 Logit Scale: 81.543 Contrastive_loss: 1.3199 (1.2555) Fd_loss: 1.6526 (1.6593) Loss: 2.9725 (2.9148)
2025-07-19,02:39:06 | INFO | Train Epoch: 8 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.363, 3010.14/s, 188.134/s/gpu LR: 0.000432 Logit Scale: 81.611 Contrastive_loss: 1.3518 (1.2662) Fd_loss: 1.6616 (1.6595) Loss: 3.0134 (2.9257)
2025-07-19,02:41:22 | INFO | Train Epoch: 8 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.362, 3000.38/s, 187.524/s/gpu LR: 0.000431 Logit Scale: 81.644 Contrastive_loss: 1.2765 (1.2672) Fd_loss: 1.6640 (1.6600) Loss: 2.9406 (2.9272)
2025-07-19,02:43:39 | INFO | Train Epoch: 8 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.363, 3003.61/s, 187.726/s/gpu LR: 0.000431 Logit Scale: 81.853 Contrastive_loss: 1.2825 (1.2686) Fd_loss: 1.6512 (1.6592) Loss: 2.9337 (2.9278)
2025-07-19,02:45:55 | INFO | Train Epoch: 8 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.363, 3003.61/s, 187.726/s/gpu LR: 0.000430 Logit Scale: 81.972 Contrastive_loss: 1.3199 (1.2729) Fd_loss: 1.6514 (1.6585) Loss: 2.9713 (2.9314)
2025-07-19,02:48:11 | INFO | Train Epoch: 8 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.365, 3003.00/s, 187.687/s/gpu LR: 0.000429 Logit Scale: 81.959 Contrastive_loss: 1.2873 (1.2740) Fd_loss: 1.6567 (1.6584) Loss: 2.9440 (2.9324)
2025-07-19,02:50:28 | INFO | Train Epoch: 8 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.365, 3004.17/s, 187.760/s/gpu LR: 0.000428 Logit Scale: 82.056 Contrastive_loss: 1.2683 (1.2736) Fd_loss: 1.6513 (1.6579) Loss: 2.9196 (2.9315)
2025-07-19,02:52:44 | INFO | Train Epoch: 8 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.364, 2997.17/s, 187.323/s/gpu LR: 0.000428 Logit Scale: 82.124 Contrastive_loss: 1.3073 (1.2758) Fd_loss: 1.6549 (1.6577) Loss: 2.9622 (2.9335)
2025-07-19,02:55:01 | INFO | Train Epoch: 8 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.366, 2998.43/s, 187.402/s/gpu LR: 0.000427 Logit Scale: 82.224 Contrastive_loss: 1.3014 (1.2774) Fd_loss: 1.6461 (1.6570) Loss: 2.9475 (2.9344)
2025-07-19,02:57:18 | INFO | Train Epoch: 8 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.366, 3003.38/s, 187.711/s/gpu LR: 0.000426 Logit Scale: 82.352 Contrastive_loss: 1.3493 (1.2817) Fd_loss: 1.6473 (1.6564) Loss: 2.9966 (2.9381)
2025-07-19,02:59:34 | INFO | Train Epoch: 8 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.365, 3006.33/s, 187.896/s/gpu LR: 0.000425 Logit Scale: 82.427 Contrastive_loss: 1.2822 (1.2817) Fd_loss: 1.6472 (1.6559) Loss: 2.9294 (2.9376)
2025-07-19,03:01:51 | INFO | Train Epoch: 8 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.367, 3006.01/s, 187.876/s/gpu LR: 0.000424 Logit Scale: 82.598 Contrastive_loss: 1.3472 (1.2851) Fd_loss: 1.6508 (1.6556) Loss: 2.9979 (2.9408)
2025-07-19,03:04:07 | INFO | Train Epoch: 8 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.365, 3007.44/s, 187.965/s/gpu LR: 0.000424 Logit Scale: 82.692 Contrastive_loss: 1.3329 (1.2875) Fd_loss: 1.6419 (1.6549) Loss: 2.9748 (2.9425)
2025-07-19,03:06:24 | INFO | Train Epoch: 8 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2991.05/s, 186.941/s/gpu LR: 0.000423 Logit Scale: 82.677 Contrastive_loss: 1.3930 (1.2925) Fd_loss: 1.6545 (1.6549) Loss: 3.0474 (2.9475)
2025-07-19,03:08:40 | INFO | Train Epoch: 8 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.366, 3004.40/s, 187.775/s/gpu LR: 0.000422 Logit Scale: 82.503 Contrastive_loss: 1.3503 (1.2952) Fd_loss: 1.6563 (1.6550) Loss: 3.0066 (2.9501)
2025-07-19,03:10:57 | INFO | Train Epoch: 8 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.366, 2992.11/s, 187.007/s/gpu LR: 0.000421 Logit Scale: 82.608 Contrastive_loss: 1.3479 (1.2975) Fd_loss: 1.6454 (1.6546) Loss: 2.9933 (2.9520)
2025-07-19,03:12:38 | INFO | Train Epoch: 8 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 2997.57/s, 187.348/s/gpu LR: 0.000421 Logit Scale: 82.788 Contrastive_loss: 1.3033 (1.2977) Fd_loss: 1.6418 (1.6540) Loss: 2.9451 (2.9517)
2025-07-19,03:12:40 | INFO | Start epoch 9
2025-07-19,03:12:53 | INFO | Train Epoch: 9 [ 4096/9319509 (0%)] Data (t): 10.658 Batch (t): 12.478, 328.258/s, 20.5161/s/gpu LR: 0.000421 Logit Scale: 82.799 Contrastive_loss: 1.0951 (1.0951) Fd_loss: 1.6382 (1.6382) Loss: 2.7333 (2.7333)
2025-07-19,03:15:09 | INFO | Train Epoch: 9 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.369, 3003.55/s, 187.722/s/gpu LR: 0.000420 Logit Scale: 84.385 Contrastive_loss: 1.1504 (1.1228) Fd_loss: 1.6480 (1.6431) Loss: 2.7985 (2.7659)
2025-07-19,03:17:26 | INFO | Train Epoch: 9 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.368, 2989.10/s, 186.819/s/gpu LR: 0.000419 Logit Scale: 84.857 Contrastive_loss: 1.1968 (1.1474) Fd_loss: 1.6346 (1.6403) Loss: 2.8315 (2.7877)
2025-07-19,03:19:43 | INFO | Train Epoch: 9 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.365, 3007.84/s, 187.990/s/gpu LR: 0.000418 Logit Scale: 85.128 Contrastive_loss: 1.1820 (1.1561) Fd_loss: 1.6444 (1.6413) Loss: 2.8263 (2.7974)
2025-07-19,03:21:59 | INFO | Train Epoch: 9 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.366, 2987.95/s, 186.747/s/gpu LR: 0.000417 Logit Scale: 85.231 Contrastive_loss: 1.1939 (1.1636) Fd_loss: 1.6343 (1.6399) Loss: 2.8281 (2.8035)
2025-07-19,03:24:16 | INFO | Train Epoch: 9 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.366, 3002.65/s, 187.665/s/gpu LR: 0.000416 Logit Scale: 85.350 Contrastive_loss: 1.1552 (1.1622) Fd_loss: 1.6412 (1.6401) Loss: 2.7964 (2.8024)
2025-07-19,03:26:33 | INFO | Train Epoch: 9 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.365, 2990.72/s, 186.920/s/gpu LR: 0.000416 Logit Scale: 85.484 Contrastive_loss: 1.2060 (1.1685) Fd_loss: 1.6343 (1.6393) Loss: 2.8402 (2.8078)
2025-07-19,03:28:49 | INFO | Train Epoch: 9 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.367, 3006.43/s, 187.902/s/gpu LR: 0.000415 Logit Scale: 85.566 Contrastive_loss: 1.2374 (1.1771) Fd_loss: 1.6415 (1.6396) Loss: 2.8789 (2.8167)
2025-07-19,03:31:06 | INFO | Train Epoch: 9 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.365, 3001.37/s, 187.586/s/gpu LR: 0.000414 Logit Scale: 85.632 Contrastive_loss: 1.2293 (1.1829) Fd_loss: 1.6228 (1.6377) Loss: 2.8521 (2.8206)
2025-07-19,03:33:22 | INFO | Train Epoch: 9 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.367, 2996.53/s, 187.283/s/gpu LR: 0.000413 Logit Scale: 85.771 Contrastive_loss: 1.1911 (1.1837) Fd_loss: 1.6464 (1.6386) Loss: 2.8374 (2.8223)
2025-07-19,03:35:39 | INFO | Train Epoch: 9 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.366, 2993.91/s, 187.119/s/gpu LR: 0.000412 Logit Scale: 85.831 Contrastive_loss: 1.1976 (1.1850) Fd_loss: 1.6295 (1.6377) Loss: 2.8272 (2.8227)
2025-07-19,03:37:56 | INFO | Train Epoch: 9 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.366, 2998.27/s, 187.392/s/gpu LR: 0.000411 Logit Scale: 85.882 Contrastive_loss: 1.1878 (1.1852) Fd_loss: 1.6243 (1.6366) Loss: 2.8120 (2.8218)
2025-07-19,03:40:12 | INFO | Train Epoch: 9 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2998.44/s, 187.402/s/gpu LR: 0.000411 Logit Scale: 85.977 Contrastive_loss: 1.2028 (1.1866) Fd_loss: 1.6516 (1.6378) Loss: 2.8544 (2.8243)
2025-07-19,03:42:29 | INFO | Train Epoch: 9 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.365, 3003.05/s, 187.691/s/gpu LR: 0.000410 Logit Scale: 86.045 Contrastive_loss: 1.2249 (1.1893) Fd_loss: 1.6290 (1.6371) Loss: 2.8539 (2.8264)
2025-07-19,03:44:46 | INFO | Train Epoch: 9 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 2996.38/s, 187.274/s/gpu LR: 0.000409 Logit Scale: 86.137 Contrastive_loss: 1.2144 (1.1910) Fd_loss: 1.6333 (1.6369) Loss: 2.8477 (2.8279)
2025-07-19,03:47:02 | INFO | Train Epoch: 9 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.366, 3010.32/s, 188.145/s/gpu LR: 0.000408 Logit Scale: 86.199 Contrastive_loss: 1.2554 (1.1950) Fd_loss: 1.6284 (1.6364) Loss: 2.8839 (2.8314)
2025-07-19,03:49:19 | INFO | Train Epoch: 9 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.366, 2995.31/s, 187.207/s/gpu LR: 0.000407 Logit Scale: 86.251 Contrastive_loss: 1.2021 (1.1954) Fd_loss: 1.6425 (1.6367) Loss: 2.8446 (2.8321)
2025-07-19,03:51:35 | INFO | Train Epoch: 9 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.366, 3004.14/s, 187.759/s/gpu LR: 0.000406 Logit Scale: 86.365 Contrastive_loss: 1.2103 (1.1963) Fd_loss: 1.6310 (1.6364) Loss: 2.8413 (2.8327)
2025-07-19,03:53:52 | INFO | Train Epoch: 9 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.365, 2996.04/s, 187.252/s/gpu LR: 0.000405 Logit Scale: 86.256 Contrastive_loss: 1.2391 (1.1985) Fd_loss: 1.6388 (1.6365) Loss: 2.8779 (2.8350)
2025-07-19,03:56:09 | INFO | Train Epoch: 9 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 2999.89/s, 187.493/s/gpu LR: 0.000405 Logit Scale: 86.405 Contrastive_loss: 1.2343 (1.2003) Fd_loss: 1.6390 (1.6367) Loss: 2.8734 (2.8370)
2025-07-19,03:58:25 | INFO | Train Epoch: 9 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.365, 3001.39/s, 187.587/s/gpu LR: 0.000404 Logit Scale: 86.373 Contrastive_loss: 1.2288 (1.2017) Fd_loss: 1.6307 (1.6364) Loss: 2.8596 (2.8380)
2025-07-19,04:00:43 | INFO | Train Epoch: 9 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.384, 3005.59/s, 187.850/s/gpu LR: 0.000403 Logit Scale: 86.493 Contrastive_loss: 1.2177 (1.2024) Fd_loss: 1.6242 (1.6358) Loss: 2.8419 (2.8382)
2025-07-19,04:03:00 | INFO | Train Epoch: 9 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.366, 3000.00/s, 187.500/s/gpu LR: 0.000402 Logit Scale: 86.541 Contrastive_loss: 1.2093 (1.2027) Fd_loss: 1.6283 (1.6355) Loss: 2.8375 (2.8382)
2025-07-19,04:04:41 | INFO | Train Epoch: 9 [9318400/9319509 (100%)] Data (t): 0.001 Batch (t): 1.367, 2991.04/s, 186.940/s/gpu LR: 0.000401 Logit Scale: 86.649 Contrastive_loss: 1.2202 (1.2034) Fd_loss: 1.6281 (1.6352) Loss: 2.8483 (2.8386)
2025-07-19,04:04:42 | INFO | Starting zero-shot imagenet.
2025-07-19,04:04:42 | INFO | Building zero-shot classifier
2025-07-19,04:04:57 | INFO | Using classifier
2025-07-19,04:06:19 | INFO | Finished zero-shot imagenet.
2025-07-19,04:06:19 | INFO | Eval Epoch: 10 imagenet-zeroshot-val-top1: 0.2274 imagenet-zeroshot-val-top5: 0.4670
2025-07-19,04:06:20 | INFO | Start epoch 10
2025-07-19,04:06:26 | INFO | Train Epoch: 10 [ 4096/9319509 (0%)] Data (t): 4.581 Batch (t): 5.932, 690.541/s, 43.1588/s/gpu LR: 0.000401 Logit Scale: 86.654 Contrastive_loss: 1.0843 (1.0843) Fd_loss: 1.6220 (1.6220) Loss: 2.7064 (2.7064)
2025-07-19,04:08:42 | INFO | Train Epoch: 10 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.363, 3010.10/s, 188.131/s/gpu LR: 0.000400 Logit Scale: 88.312 Contrastive_loss: 1.0347 (1.0595) Fd_loss: 1.6365 (1.6292) Loss: 2.6711 (2.6888)
2025-07-19,04:10:59 | INFO | Train Epoch: 10 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.367, 2982.35/s, 186.397/s/gpu LR: 0.000400 Logit Scale: 88.746 Contrastive_loss: 1.0748 (1.0646) Fd_loss: 1.6338 (1.6308) Loss: 2.7086 (2.6954)
2025-07-19,04:13:15 | INFO | Train Epoch: 10 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.366, 3008.09/s, 188.006/s/gpu LR: 0.000399 Logit Scale: 88.908 Contrastive_loss: 1.0910 (1.0712) Fd_loss: 1.6199 (1.6280) Loss: 2.7109 (2.6993)
2025-07-19,04:15:32 | INFO | Train Epoch: 10 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.366, 2994.82/s, 187.176/s/gpu LR: 0.000398 Logit Scale: 89.136 Contrastive_loss: 1.1024 (1.0774) Fd_loss: 1.6349 (1.6294) Loss: 2.7373 (2.7069)
2025-07-19,04:17:49 | INFO | Train Epoch: 10 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.367, 3007.74/s, 187.983/s/gpu LR: 0.000397 Logit Scale: 89.146 Contrastive_loss: 1.0870 (1.0790) Fd_loss: 1.6329 (1.6300) Loss: 2.7200 (2.7090)
2025-07-19,04:20:05 | INFO | Train Epoch: 10 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.366, 3000.69/s, 187.543/s/gpu LR: 0.000396 Logit Scale: 89.168 Contrastive_loss: 1.1266 (1.0858) Fd_loss: 1.6302 (1.6300) Loss: 2.7568 (2.7159)
2025-07-19,04:22:22 | INFO | Train Epoch: 10 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.367, 2995.29/s, 187.206/s/gpu LR: 0.000395 Logit Scale: 89.277 Contrastive_loss: 1.1182 (1.0899) Fd_loss: 1.6261 (1.6295) Loss: 2.7443 (2.7194)
2025-07-19,04:24:39 | INFO | Train Epoch: 10 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.367, 2994.16/s, 187.135/s/gpu LR: 0.000394 Logit Scale: 89.268 Contrastive_loss: 1.0787 (1.0886) Fd_loss: 1.6324 (1.6299) Loss: 2.7112 (2.7185)
2025-07-19,04:26:55 | INFO | Train Epoch: 10 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.367, 3005.48/s, 187.842/s/gpu LR: 0.000393 Logit Scale: 89.323 Contrastive_loss: 1.1522 (1.0950) Fd_loss: 1.6123 (1.6281) Loss: 2.7645 (2.7231)
2025-07-19,04:29:12 | INFO | Train Epoch: 10 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 3004.78/s, 187.798/s/gpu LR: 0.000392 Logit Scale: 89.294 Contrastive_loss: 1.1393 (1.0990) Fd_loss: 1.6396 (1.6292) Loss: 2.7789 (2.7282)
2025-07-19,04:31:29 | INFO | Train Epoch: 10 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.366, 2986.31/s, 186.645/s/gpu LR: 0.000391 Logit Scale: 89.430 Contrastive_loss: 1.1258 (1.1013) Fd_loss: 1.6129 (1.6278) Loss: 2.7387 (2.7291)
2025-07-19,04:33:45 | INFO | Train Epoch: 10 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.368, 3003.74/s, 187.734/s/gpu LR: 0.000391 Logit Scale: 89.567 Contrastive_loss: 1.1678 (1.1064) Fd_loss: 1.6090 (1.6264) Loss: 2.7768 (2.7327)
2025-07-19,04:36:02 | INFO | Train Epoch: 10 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.366, 2994.96/s, 187.185/s/gpu LR: 0.000390 Logit Scale: 89.514 Contrastive_loss: 1.2001 (1.1131) Fd_loss: 1.6171 (1.6257) Loss: 2.8172 (2.7388)
2025-07-19,04:38:19 | INFO | Train Epoch: 10 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 2995.19/s, 187.199/s/gpu LR: 0.000389 Logit Scale: 89.583 Contrastive_loss: 1.1989 (1.1188) Fd_loss: 1.6158 (1.6250) Loss: 2.8147 (2.7438)
2025-07-19,04:40:35 | INFO | Train Epoch: 10 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.367, 2988.11/s, 186.757/s/gpu LR: 0.000388 Logit Scale: 89.557 Contrastive_loss: 1.2244 (1.1254) Fd_loss: 1.6307 (1.6254) Loss: 2.8551 (2.7508)
2025-07-19,04:42:52 | INFO | Train Epoch: 10 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.366, 2982.25/s, 186.391/s/gpu LR: 0.000387 Logit Scale: 89.447 Contrastive_loss: 1.1890 (1.1291) Fd_loss: 1.6186 (1.6250) Loss: 2.8075 (2.7541)
2025-07-19,04:45:09 | INFO | Train Epoch: 10 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.369, 2987.54/s, 186.721/s/gpu LR: 0.000386 Logit Scale: 89.715 Contrastive_loss: 1.1286 (1.1291) Fd_loss: 1.6160 (1.6245) Loss: 2.7447 (2.7536)
2025-07-19,04:47:25 | INFO | Train Epoch: 10 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.366, 2993.49/s, 187.093/s/gpu LR: 0.000385 Logit Scale: 89.634 Contrastive_loss: 1.1443 (1.1299) Fd_loss: 1.6198 (1.6242) Loss: 2.7641 (2.7541)
2025-07-19,04:49:42 | INFO | Train Epoch: 10 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.368, 2998.10/s, 187.382/s/gpu LR: 0.000384 Logit Scale: 89.768 Contrastive_loss: 1.1614 (1.1315) Fd_loss: 1.6255 (1.6243) Loss: 2.7870 (2.7558)
2025-07-19,04:51:59 | INFO | Train Epoch: 10 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2993.21/s, 187.075/s/gpu LR: 0.000383 Logit Scale: 89.823 Contrastive_loss: 1.1613 (1.1329) Fd_loss: 1.6021 (1.6233) Loss: 2.7635 (2.7562)
2025-07-19,04:54:16 | INFO | Train Epoch: 10 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.368, 2990.59/s, 186.912/s/gpu LR: 0.000382 Logit Scale: 89.962 Contrastive_loss: 1.1339 (1.1329) Fd_loss: 1.6158 (1.6229) Loss: 2.7497 (2.7559)
2025-07-19,04:56:32 | INFO | Train Epoch: 10 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2997.51/s, 187.344/s/gpu LR: 0.000381 Logit Scale: 89.978 Contrastive_loss: 1.2065 (1.1361) Fd_loss: 1.6046 (1.6221) Loss: 2.8112 (2.7583)
2025-07-19,04:58:14 | INFO | Train Epoch: 10 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.369, 3001.93/s, 187.621/s/gpu LR: 0.000380 Logit Scale: 90.034 Contrastive_loss: 1.1496 (1.1367) Fd_loss: 1.6143 (1.6218) Loss: 2.7639 (2.7585)
2025-07-19,04:58:16 | INFO | Start epoch 11
2025-07-19,04:58:27 | INFO | Train Epoch: 11 [ 4096/9319509 (0%)] Data (t): 9.569 Batch (t): 11.312, 362.090/s, 22.6306/s/gpu LR: 0.000380 Logit Scale: 90.036 Contrastive_loss: 1.0145 (1.0145) Fd_loss: 1.6030 (1.6030) Loss: 2.6176 (2.6176)
2025-07-19,05:00:44 | INFO | Train Epoch: 11 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.371, 2990.72/s, 186.920/s/gpu LR: 0.000380 Logit Scale: 91.712 Contrastive_loss: 0.95636 (0.98545) Fd_loss: 1.6070 (1.6050) Loss: 2.5634 (2.5905)
2025-07-19,05:03:01 | INFO | Train Epoch: 11 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.369, 3013.89/s, 188.368/s/gpu LR: 0.000379 Logit Scale: 92.199 Contrastive_loss: 1.0446 (1.0052) Fd_loss: 1.6148 (1.6083) Loss: 2.6594 (2.6135)
2025-07-19,05:05:18 | INFO | Train Epoch: 11 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.370, 2992.91/s, 187.057/s/gpu LR: 0.000378 Logit Scale: 92.274 Contrastive_loss: 0.98948 (1.0012) Fd_loss: 1.6262 (1.6128) Loss: 2.6157 (2.6140)
2025-07-19,05:07:35 | INFO | Train Epoch: 11 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.368, 3006.74/s, 187.921/s/gpu LR: 0.000377 Logit Scale: 92.499 Contrastive_loss: 1.0443 (1.0099) Fd_loss: 1.6158 (1.6134) Loss: 2.6602 (2.6232)
2025-07-19,05:09:51 | INFO | Train Epoch: 11 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.366, 3006.43/s, 187.902/s/gpu LR: 0.000376 Logit Scale: 92.572 Contrastive_loss: 1.0512 (1.0167) Fd_loss: 1.6068 (1.6123) Loss: 2.6580 (2.6290)
2025-07-19,05:12:08 | INFO | Train Epoch: 11 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.368, 3010.96/s, 188.185/s/gpu LR: 0.000375 Logit Scale: 92.544 Contrastive_loss: 1.0553 (1.0223) Fd_loss: 1.6079 (1.6117) Loss: 2.6632 (2.6339)
2025-07-19,05:14:25 | INFO | Train Epoch: 11 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.367, 2991.96/s, 186.997/s/gpu LR: 0.000374 Logit Scale: 92.702 Contrastive_loss: 1.0916 (1.0309) Fd_loss: 1.6091 (1.6113) Loss: 2.7008 (2.6423)
2025-07-19,05:16:42 | INFO | Train Epoch: 11 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.369, 2996.62/s, 187.289/s/gpu LR: 0.000373 Logit Scale: 92.755 Contrastive_loss: 1.0193 (1.0296) Fd_loss: 1.6099 (1.6112) Loss: 2.6292 (2.6408)
2025-07-19,05:18:59 | INFO | Train Epoch: 11 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.368, 3002.54/s, 187.659/s/gpu LR: 0.000372 Logit Scale: 92.831 Contrastive_loss: 1.0672 (1.0334) Fd_loss: 1.6077 (1.6108) Loss: 2.6749 (2.6442)
2025-07-19,05:21:15 | INFO | Train Epoch: 11 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.369, 2992.76/s, 187.048/s/gpu LR: 0.000371 Logit Scale: 92.866 Contrastive_loss: 1.1118 (1.0405) Fd_loss: 1.6101 (1.6108) Loss: 2.7219 (2.6513)
2025-07-19,05:23:32 | INFO | Train Epoch: 11 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.370, 2997.08/s, 187.318/s/gpu LR: 0.000370 Logit Scale: 92.828 Contrastive_loss: 1.0989 (1.0454) Fd_loss: 1.5965 (1.6096) Loss: 2.6954 (2.6550)
2025-07-19,05:25:49 | INFO | Train Epoch: 11 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2995.12/s, 187.195/s/gpu LR: 0.000369 Logit Scale: 92.945 Contrastive_loss: 1.0656 (1.0469) Fd_loss: 1.6100 (1.6096) Loss: 2.6756 (2.6565)
2025-07-19,05:28:06 | INFO | Train Epoch: 11 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.369, 2993.35/s, 187.085/s/gpu LR: 0.000368 Logit Scale: 92.991 Contrastive_loss: 1.1124 (1.0516) Fd_loss: 1.5984 (1.6088) Loss: 2.7108 (2.6604)
2025-07-19,05:30:23 | INFO | Train Epoch: 11 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 2999.67/s, 187.479/s/gpu LR: 0.000367 Logit Scale: 93.089 Contrastive_loss: 1.1112 (1.0556) Fd_loss: 1.6038 (1.6085) Loss: 2.7150 (2.6641)
2025-07-19,05:32:40 | INFO | Train Epoch: 11 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.368, 2987.65/s, 186.728/s/gpu LR: 0.000366 Logit Scale: 93.124 Contrastive_loss: 1.1432 (1.0611) Fd_loss: 1.6096 (1.6085) Loss: 2.7529 (2.6696)
2025-07-19,05:34:57 | INFO | Train Epoch: 11 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.370, 3002.74/s, 187.671/s/gpu LR: 0.000365 Logit Scale: 93.142 Contrastive_loss: 1.1152 (1.0642) Fd_loss: 1.6024 (1.6082) Loss: 2.7176 (2.6724)
2025-07-19,05:37:13 | INFO | Train Epoch: 11 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.367, 2986.58/s, 186.661/s/gpu LR: 0.000364 Logit Scale: 93.263 Contrastive_loss: 1.1117 (1.0669) Fd_loss: 1.6087 (1.6082) Loss: 2.7204 (2.6751)
2025-07-19,05:39:30 | INFO | Train Epoch: 11 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.368, 2994.80/s, 187.175/s/gpu LR: 0.000363 Logit Scale: 93.336 Contrastive_loss: 1.0867 (1.0679) Fd_loss: 1.6043 (1.6080) Loss: 2.6911 (2.6759)
2025-07-19,05:41:47 | INFO | Train Epoch: 11 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.368, 2987.01/s, 186.688/s/gpu LR: 0.000362 Logit Scale: 93.215 Contrastive_loss: 1.0938 (1.0692) Fd_loss: 1.6017 (1.6077) Loss: 2.6955 (2.6769)
2025-07-19,05:44:04 | INFO | Train Epoch: 11 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 2994.24/s, 187.140/s/gpu LR: 0.000361 Logit Scale: 93.340 Contrastive_loss: 1.1214 (1.0717) Fd_loss: 1.6033 (1.6075) Loss: 2.7247 (2.6792)
2025-07-19,05:46:20 | INFO | Train Epoch: 11 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.368, 3003.46/s, 187.716/s/gpu LR: 0.000360 Logit Scale: 93.404 Contrastive_loss: 1.1291 (1.0743) Fd_loss: 1.6019 (1.6072) Loss: 2.7310 (2.6815)
2025-07-19,05:48:37 | INFO | Train Epoch: 11 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.367, 2993.28/s, 187.080/s/gpu LR: 0.000359 Logit Scale: 93.405 Contrastive_loss: 1.0974 (1.0753) Fd_loss: 1.5886 (1.6064) Loss: 2.6860 (2.6817)
2025-07-19,05:50:18 | INFO | Train Epoch: 11 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 3004.85/s, 187.803/s/gpu LR: 0.000358 Logit Scale: 93.382 Contrastive_loss: 1.1438 (1.0782) Fd_loss: 1.5829 (1.6054) Loss: 2.7267 (2.6836)
2025-07-19,05:50:20 | INFO | Starting zero-shot imagenet.
2025-07-19,05:50:20 | INFO | Building zero-shot classifier
2025-07-19,05:50:35 | INFO | Using classifier
2025-07-19,05:51:59 | INFO | Finished zero-shot imagenet.
2025-07-19,05:51:59 | INFO | Eval Epoch: 12 imagenet-zeroshot-val-top1: 0.2340 imagenet-zeroshot-val-top5: 0.4800
2025-07-19,05:52:00 | INFO | Start epoch 12
2025-07-19,05:52:05 | INFO | Train Epoch: 12 [ 4096/9319509 (0%)] Data (t): 4.320 Batch (t): 5.663, 723.260/s, 45.2038/s/gpu LR: 0.000358 Logit Scale: 93.377 Contrastive_loss: 0.99930 (0.99930) Fd_loss: 1.5994 (1.5994) Loss: 2.5987 (2.5987)
2025-07-19,05:54:22 | INFO | Train Epoch: 12 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.369, 3006.31/s, 187.894/s/gpu LR: 0.000357 Logit Scale: 95.198 Contrastive_loss: 0.96720 (0.98325) Fd_loss: 1.5982 (1.5988) Loss: 2.5654 (2.5821)
2025-07-19,05:56:39 | INFO | Train Epoch: 12 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.368, 3009.07/s, 188.067/s/gpu LR: 0.000356 Logit Scale: 95.590 Contrastive_loss: 0.96863 (0.97838) Fd_loss: 1.5951 (1.5976) Loss: 2.5637 (2.5760)
2025-07-19,05:58:56 | INFO | Train Epoch: 12 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.368, 2994.20/s, 187.138/s/gpu LR: 0.000355 Logit Scale: 95.823 Contrastive_loss: 1.0142 (0.98734) Fd_loss: 1.5902 (1.5957) Loss: 2.6044 (2.5831)
2025-07-19,06:01:13 | INFO | Train Epoch: 12 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.370, 2971.51/s, 185.719/s/gpu LR: 0.000354 Logit Scale: 96.110 Contrastive_loss: 0.96033 (0.98194) Fd_loss: 1.5966 (1.5959) Loss: 2.5570 (2.5779)
2025-07-19,06:03:30 | INFO | Train Epoch: 12 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 3007.58/s, 187.974/s/gpu LR: 0.000353 Logit Scale: 96.029 Contrastive_loss: 0.98645 (0.98269) Fd_loss: 1.5877 (1.5945) Loss: 2.5742 (2.5772)
2025-07-19,06:05:46 | INFO | Train Epoch: 12 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.368, 2998.04/s, 187.377/s/gpu LR: 0.000352 Logit Scale: 96.207 Contrastive_loss: 0.98397 (0.98287) Fd_loss: 1.5967 (1.5949) Loss: 2.5807 (2.5777)
2025-07-19,06:08:03 | INFO | Train Epoch: 12 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.368, 2980.34/s, 186.271/s/gpu LR: 0.000351 Logit Scale: 96.158 Contrastive_loss: 1.0549 (0.99187) Fd_loss: 1.6046 (1.5961) Loss: 2.6595 (2.5879)
2025-07-19,06:10:20 | INFO | Train Epoch: 12 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.368, 2987.94/s, 186.746/s/gpu LR: 0.000350 Logit Scale: 96.142 Contrastive_loss: 1.0116 (0.99406) Fd_loss: 1.5904 (1.5954) Loss: 2.6020 (2.5895)
2025-07-19,06:12:37 | INFO | Train Epoch: 12 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2993.74/s, 187.109/s/gpu LR: 0.000349 Logit Scale: 96.252 Contrastive_loss: 1.0148 (0.99614) Fd_loss: 1.5961 (1.5955) Loss: 2.6109 (2.5917)
2025-07-19,06:14:54 | INFO | Train Epoch: 12 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 2979.02/s, 186.189/s/gpu LR: 0.000348 Logit Scale: 96.290 Contrastive_loss: 0.98324 (0.99496) Fd_loss: 1.5927 (1.5953) Loss: 2.5760 (2.5902)
2025-07-19,06:17:11 | INFO | Train Epoch: 12 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.370, 2984.01/s, 186.500/s/gpu LR: 0.000347 Logit Scale: 96.214 Contrastive_loss: 1.0318 (0.99804) Fd_loss: 1.6007 (1.5957) Loss: 2.6325 (2.5937)
2025-07-19,06:19:27 | INFO | Train Epoch: 12 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 3005.74/s, 187.859/s/gpu LR: 0.000346 Logit Scale: 96.316 Contrastive_loss: 1.0518 (1.0022) Fd_loss: 1.5887 (1.5952) Loss: 2.6404 (2.5973)
2025-07-19,06:21:44 | INFO | Train Epoch: 12 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.368, 2996.67/s, 187.292/s/gpu LR: 0.000345 Logit Scale: 96.358 Contrastive_loss: 1.0698 (1.0070) Fd_loss: 1.5864 (1.5945) Loss: 2.6561 (2.6015)
2025-07-19,06:24:01 | INFO | Train Epoch: 12 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 3007.72/s, 187.982/s/gpu LR: 0.000344 Logit Scale: 96.442 Contrastive_loss: 1.0615 (1.0106) Fd_loss: 1.5793 (1.5935) Loss: 2.6408 (2.6042)
2025-07-19,06:26:18 | INFO | Train Epoch: 12 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 2993.11/s, 187.070/s/gpu LR: 0.000343 Logit Scale: 96.577 Contrastive_loss: 1.0354 (1.0122) Fd_loss: 1.5816 (1.5928) Loss: 2.6170 (2.6050)
2025-07-19,06:28:35 | INFO | Train Epoch: 12 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.369, 3002.89/s, 187.680/s/gpu LR: 0.000342 Logit Scale: 96.505 Contrastive_loss: 1.0622 (1.0151) Fd_loss: 1.5880 (1.5925) Loss: 2.6502 (2.6076)
2025-07-19,06:30:51 | INFO | Train Epoch: 12 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.367, 2986.81/s, 186.676/s/gpu LR: 0.000341 Logit Scale: 96.449 Contrastive_loss: 1.0817 (1.0188) Fd_loss: 1.5868 (1.5922) Loss: 2.6685 (2.6110)
2025-07-19,06:33:08 | INFO | Train Epoch: 12 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2988.85/s, 186.803/s/gpu LR: 0.000340 Logit Scale: 96.613 Contrastive_loss: 1.0575 (1.0209) Fd_loss: 1.5789 (1.5915) Loss: 2.6364 (2.6123)
2025-07-19,06:35:25 | INFO | Train Epoch: 12 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 3005.99/s, 187.874/s/gpu LR: 0.000339 Logit Scale: 96.608 Contrastive_loss: 1.1223 (1.0259) Fd_loss: 1.5860 (1.5912) Loss: 2.7084 (2.6171)
2025-07-19,06:37:41 | INFO | Train Epoch: 12 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 2981.08/s, 186.318/s/gpu LR: 0.000338 Logit Scale: 96.676 Contrastive_loss: 1.0306 (1.0261) Fd_loss: 1.5871 (1.5910) Loss: 2.6177 (2.6172)
2025-07-19,06:39:58 | INFO | Train Epoch: 12 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.368, 2999.27/s, 187.455/s/gpu LR: 0.000337 Logit Scale: 96.570 Contrastive_loss: 0.99407 (1.0247) Fd_loss: 1.5943 (1.5912) Loss: 2.5883 (2.6159)
2025-07-19,06:42:15 | INFO | Train Epoch: 12 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.367, 2994.00/s, 187.125/s/gpu LR: 0.000336 Logit Scale: 96.707 Contrastive_loss: 1.0393 (1.0253) Fd_loss: 1.5704 (1.5903) Loss: 2.6097 (2.6156)
2025-07-19,06:43:56 | INFO | Train Epoch: 12 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 3005.65/s, 187.853/s/gpu LR: 0.000335 Logit Scale: 96.873 Contrastive_loss: 1.0452 (1.0262) Fd_loss: 1.5757 (1.5897) Loss: 2.6209 (2.6158)
2025-07-19,06:43:58 | INFO | Start epoch 13
2025-07-19,06:44:09 | INFO | Train Epoch: 13 [ 4096/9319509 (0%)] Data (t): 8.623 Batch (t): 10.727, 381.824/s, 23.8640/s/gpu LR: 0.000335 Logit Scale: 96.868 Contrastive_loss: 0.88774 (0.88774) Fd_loss: 1.5771 (1.5771) Loss: 2.4648 (2.4648)
2025-07-19,06:46:26 | INFO | Train Epoch: 13 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.373, 2999.34/s, 187.459/s/gpu LR: 0.000334 Logit Scale: 98.612 Contrastive_loss: 0.86593 (0.87683) Fd_loss: 1.5836 (1.5803) Loss: 2.4495 (2.4572)
2025-07-19,06:48:43 | INFO | Train Epoch: 13 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.369, 2988.45/s, 186.778/s/gpu LR: 0.000333 Logit Scale: 99.110 Contrastive_loss: 0.92725 (0.89364) Fd_loss: 1.5921 (1.5842) Loss: 2.5193 (2.4779)
2025-07-19,06:51:00 | INFO | Train Epoch: 13 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.368, 2992.17/s, 187.011/s/gpu LR: 0.000332 Logit Scale: 99.319 Contrastive_loss: 0.87578 (0.88918) Fd_loss: 1.5834 (1.5840) Loss: 2.4592 (2.4732)
2025-07-19,06:53:17 | INFO | Train Epoch: 13 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.369, 2977.63/s, 186.102/s/gpu LR: 0.000331 Logit Scale: 99.566 Contrastive_loss: 0.91281 (0.89390) Fd_loss: 1.5824 (1.5837) Loss: 2.4952 (2.4776)
2025-07-19,06:55:34 | INFO | Train Epoch: 13 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.369, 2994.99/s, 187.187/s/gpu LR: 0.000330 Logit Scale: 99.623 Contrastive_loss: 0.91935 (0.89814) Fd_loss: 1.5802 (1.5831) Loss: 2.4995 (2.4812)
2025-07-19,06:57:50 | INFO | Train Epoch: 13 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.367, 2985.64/s, 186.603/s/gpu LR: 0.000329 Logit Scale: 99.462 Contrastive_loss: 0.93414 (0.90329) Fd_loss: 1.5857 (1.5835) Loss: 2.5199 (2.4868)
2025-07-19,07:00:07 | INFO | Train Epoch: 13 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.369, 2988.35/s, 186.772/s/gpu LR: 0.000328 Logit Scale: 99.532 Contrastive_loss: 0.96727 (0.91128) Fd_loss: 1.5871 (1.5839) Loss: 2.5544 (2.4952)
2025-07-19,07:02:24 | INFO | Train Epoch: 13 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.367, 2998.51/s, 187.407/s/gpu LR: 0.000327 Logit Scale: 99.682 Contrastive_loss: 0.94597 (0.91514) Fd_loss: 1.5870 (1.5843) Loss: 2.5330 (2.4994)
2025-07-19,07:04:41 | INFO | Train Epoch: 13 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2984.67/s, 186.542/s/gpu LR: 0.000326 Logit Scale: 99.626 Contrastive_loss: 0.93134 (0.91676) Fd_loss: 1.5813 (1.5840) Loss: 2.5127 (2.5007)
2025-07-19,07:06:58 | INFO | Train Epoch: 13 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.369, 2970.05/s, 185.628/s/gpu LR: 0.000325 Logit Scale: 99.591 Contrastive_loss: 0.93196 (0.91814) Fd_loss: 1.5887 (1.5844) Loss: 2.5207 (2.5026)
2025-07-19,07:09:14 | INFO | Train Epoch: 13 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.368, 2965.50/s, 185.344/s/gpu LR: 0.000323 Logit Scale: 99.565 Contrastive_loss: 1.0273 (0.92724) Fd_loss: 1.5780 (1.5839) Loss: 2.6053 (2.5111)
2025-07-19,07:11:31 | INFO | Train Epoch: 13 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.369, 3001.80/s, 187.613/s/gpu LR: 0.000322 Logit Scale: 99.693 Contrastive_loss: 0.94811 (0.92884) Fd_loss: 1.5919 (1.5845) Loss: 2.5400 (2.5133)
2025-07-19,07:13:48 | INFO | Train Epoch: 13 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.367, 2996.64/s, 187.290/s/gpu LR: 0.000321 Logit Scale: 99.702 Contrastive_loss: 0.95366 (0.93061) Fd_loss: 1.5731 (1.5837) Loss: 2.5268 (2.5143)
2025-07-19,07:16:05 | INFO | Train Epoch: 13 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 3007.29/s, 187.955/s/gpu LR: 0.000320 Logit Scale: 99.720 Contrastive_loss: 0.95660 (0.93235) Fd_loss: 1.5812 (1.5835) Loss: 2.5378 (2.5159)
2025-07-19,07:18:22 | INFO | Train Epoch: 13 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.368, 3008.95/s, 188.059/s/gpu LR: 0.000319 Logit Scale: 99.660 Contrastive_loss: 1.0176 (0.93768) Fd_loss: 1.5742 (1.5829) Loss: 2.5919 (2.5206)
2025-07-19,07:20:38 | INFO | Train Epoch: 13 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.367, 2992.27/s, 187.017/s/gpu LR: 0.000318 Logit Scale: 99.664 Contrastive_loss: 0.98059 (0.94020) Fd_loss: 1.5650 (1.5819) Loss: 2.5456 (2.5221)
2025-07-19,07:22:55 | INFO | Train Epoch: 13 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.369, 2998.20/s, 187.388/s/gpu LR: 0.000317 Logit Scale: 99.780 Contrastive_loss: 1.0195 (0.94461) Fd_loss: 1.5730 (1.5814) Loss: 2.5925 (2.5260)
2025-07-19,07:25:12 | INFO | Train Epoch: 13 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.367, 2990.67/s, 186.917/s/gpu LR: 0.000316 Logit Scale: 99.748 Contrastive_loss: 0.99790 (0.94741) Fd_loss: 1.5736 (1.5810) Loss: 2.5715 (2.5284)
2025-07-19,07:27:29 | INFO | Train Epoch: 13 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.369, 3004.20/s, 187.763/s/gpu LR: 0.000315 Logit Scale: 99.764 Contrastive_loss: 0.99803 (0.94994) Fd_loss: 1.5703 (1.5804) Loss: 2.5683 (2.5304)
2025-07-19,07:29:45 | INFO | Train Epoch: 13 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2994.32/s, 187.145/s/gpu LR: 0.000314 Logit Scale: 99.726 Contrastive_loss: 1.0080 (0.95271) Fd_loss: 1.5687 (1.5799) Loss: 2.5767 (2.5326)
2025-07-19,07:32:02 | INFO | Train Epoch: 13 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.368, 2992.92/s, 187.057/s/gpu LR: 0.000313 Logit Scale: 99.799 Contrastive_loss: 0.96299 (0.95317) Fd_loss: 1.5676 (1.5793) Loss: 2.5306 (2.5325)
2025-07-19,07:34:19 | INFO | Train Epoch: 13 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2987.99/s, 186.749/s/gpu LR: 0.000312 Logit Scale: 99.990 Contrastive_loss: 0.99442 (0.95497) Fd_loss: 1.5808 (1.5794) Loss: 2.5752 (2.5344)
2025-07-19,07:36:00 | INFO | Train Epoch: 13 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.367, 3004.46/s, 187.778/s/gpu LR: 0.000311 Logit Scale: 99.905 Contrastive_loss: 1.0045 (0.95703) Fd_loss: 1.5687 (1.5789) Loss: 2.5732 (2.5360)
2025-07-19,07:36:01 | INFO | Starting zero-shot imagenet.
2025-07-19,07:36:01 | INFO | Building zero-shot classifier
2025-07-19,07:36:16 | INFO | Using classifier
2025-07-19,07:37:36 | INFO | Finished zero-shot imagenet.
2025-07-19,07:37:36 | INFO | Eval Epoch: 14 imagenet-zeroshot-val-top1: 0.2529 imagenet-zeroshot-val-top5: 0.5099
2025-07-19,07:37:37 | INFO | Start epoch 14
2025-07-19,07:37:43 | INFO | Train Epoch: 14 [ 4096/9319509 (0%)] Data (t): 4.740 Batch (t): 6.083, 673.369/s, 42.0856/s/gpu LR: 0.000311 Logit Scale: 99.901 Contrastive_loss: 0.82850 (0.82850) Fd_loss: 1.5719 (1.5719) Loss: 2.4004 (2.4004)
2025-07-19,07:40:00 | INFO | Train Epoch: 14 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.369, 3008.29/s, 188.018/s/gpu LR: 0.000310 Logit Scale: 100.000 Contrastive_loss: 0.81326 (0.82088) Fd_loss: 1.5650 (1.5684) Loss: 2.3783 (2.3893)
2025-07-19,07:42:16 | INFO | Train Epoch: 14 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.368, 3004.38/s, 187.774/s/gpu LR: 0.000309 Logit Scale: 99.999 Contrastive_loss: 0.83674 (0.82617) Fd_loss: 1.5638 (1.5669) Loss: 2.4005 (2.3930)
2025-07-19,07:44:33 | INFO | Train Epoch: 14 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 2980.49/s, 186.281/s/gpu LR: 0.000308 Logit Scale: 100.000 Contrastive_loss: 0.85567 (0.83354) Fd_loss: 1.5795 (1.5700) Loss: 2.4351 (2.4036)
2025-07-19,07:46:50 | INFO | Train Epoch: 14 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.367, 3000.31/s, 187.520/s/gpu LR: 0.000307 Logit Scale: 99.074 Contrastive_loss: 1.0494 (0.87671) Fd_loss: 1.6006 (1.5761) Loss: 2.6500 (2.4528)
2025-07-19,07:49:07 | INFO | Train Epoch: 14 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2995.71/s, 187.232/s/gpu LR: 0.000306 Logit Scale: 98.115 Contrastive_loss: 1.1271 (0.91844) Fd_loss: 1.5967 (1.5796) Loss: 2.7238 (2.4980)
2025-07-19,07:51:24 | INFO | Train Epoch: 14 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.367, 2999.71/s, 187.482/s/gpu LR: 0.000304 Logit Scale: 98.370 Contrastive_loss: 0.83491 (0.90651) Fd_loss: 1.5816 (1.5799) Loss: 2.4166 (2.4864)
2025-07-19,07:53:40 | INFO | Train Epoch: 14 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.368, 2994.29/s, 187.143/s/gpu LR: 0.000303 Logit Scale: 98.432 Contrastive_loss: 0.82526 (0.89635) Fd_loss: 1.5889 (1.5810) Loss: 2.4142 (2.4773)
2025-07-19,07:55:57 | INFO | Train Epoch: 14 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.369, 3006.38/s, 187.899/s/gpu LR: 0.000302 Logit Scale: 98.377 Contrastive_loss: 0.88251 (0.89482) Fd_loss: 1.5798 (1.5809) Loss: 2.4624 (2.4757)
2025-07-19,07:58:14 | INFO | Train Epoch: 14 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.367, 2992.62/s, 187.039/s/gpu LR: 0.000301 Logit Scale: 98.301 Contrastive_loss: 0.86606 (0.89194) Fd_loss: 1.5753 (1.5803) Loss: 2.4414 (2.4723)
2025-07-19,08:00:31 | INFO | Train Epoch: 14 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.369, 2982.03/s, 186.377/s/gpu LR: 0.000300 Logit Scale: 98.361 Contrastive_loss: 0.87119 (0.89005) Fd_loss: 1.5807 (1.5803) Loss: 2.4519 (2.4704)
2025-07-19,08:02:47 | INFO | Train Epoch: 14 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.366, 3009.27/s, 188.080/s/gpu LR: 0.000299 Logit Scale: 98.546 Contrastive_loss: 0.90847 (0.89159) Fd_loss: 1.5684 (1.5794) Loss: 2.4768 (2.4709)
2025-07-19,08:05:04 | INFO | Train Epoch: 14 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2993.33/s, 187.083/s/gpu LR: 0.000298 Logit Scale: 98.551 Contrastive_loss: 0.86419 (0.88948) Fd_loss: 1.5586 (1.5778) Loss: 2.4228 (2.4672)
2025-07-19,08:07:21 | INFO | Train Epoch: 14 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.368, 3000.75/s, 187.547/s/gpu LR: 0.000297 Logit Scale: 98.538 Contrastive_loss: 0.86261 (0.88756) Fd_loss: 1.5691 (1.5771) Loss: 2.4317 (2.4647)
2025-07-19,08:09:38 | INFO | Train Epoch: 14 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.366, 2986.44/s, 186.652/s/gpu LR: 0.000296 Logit Scale: 98.512 Contrastive_loss: 0.87403 (0.88666) Fd_loss: 1.5753 (1.5770) Loss: 2.4493 (2.4637)
2025-07-19,08:11:54 | INFO | Train Epoch: 14 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.369, 2982.51/s, 186.407/s/gpu LR: 0.000295 Logit Scale: 98.625 Contrastive_loss: 0.87871 (0.88616) Fd_loss: 1.5740 (1.5768) Loss: 2.4527 (2.4630)
2025-07-19,08:14:11 | INFO | Train Epoch: 14 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.366, 2990.17/s, 186.885/s/gpu LR: 0.000294 Logit Scale: 98.623 Contrastive_loss: 0.87570 (0.88555) Fd_loss: 1.5680 (1.5763) Loss: 2.4437 (2.4619)
2025-07-19,08:16:28 | INFO | Train Epoch: 14 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.368, 3000.31/s, 187.520/s/gpu LR: 0.000293 Logit Scale: 98.572 Contrastive_loss: 0.86882 (0.88462) Fd_loss: 1.5583 (1.5753) Loss: 2.4272 (2.4599)
2025-07-19,08:18:45 | INFO | Train Epoch: 14 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2995.31/s, 187.207/s/gpu LR: 0.000291 Logit Scale: 98.621 Contrastive_loss: 0.91731 (0.88634) Fd_loss: 1.5728 (1.5752) Loss: 2.4901 (2.4615)
2025-07-19,08:21:01 | INFO | Train Epoch: 14 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.366, 2977.65/s, 186.103/s/gpu LR: 0.000290 Logit Scale: 98.668 Contrastive_loss: 0.87269 (0.88566) Fd_loss: 1.5645 (1.5746) Loss: 2.4372 (2.4603)
2025-07-19,08:23:18 | INFO | Train Epoch: 14 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 3001.83/s, 187.615/s/gpu LR: 0.000289 Logit Scale: 98.631 Contrastive_loss: 0.88068 (0.88542) Fd_loss: 1.5718 (1.5745) Loss: 2.4525 (2.4599)
2025-07-19,08:25:35 | INFO | Train Epoch: 14 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.366, 3000.93/s, 187.558/s/gpu LR: 0.000288 Logit Scale: 98.659 Contrastive_loss: 0.88322 (0.88532) Fd_loss: 1.5639 (1.5740) Loss: 2.4472 (2.4593)
2025-07-19,08:27:52 | INFO | Train Epoch: 14 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2992.16/s, 187.010/s/gpu LR: 0.000287 Logit Scale: 98.707 Contrastive_loss: 0.87263 (0.88477) Fd_loss: 1.5645 (1.5736) Loss: 2.4371 (2.4584)
2025-07-19,08:29:33 | INFO | Train Epoch: 14 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.369, 2999.25/s, 187.453/s/gpu LR: 0.000286 Logit Scale: 98.631 Contrastive_loss: 0.93846 (0.88700) Fd_loss: 1.5595 (1.5730) Loss: 2.4979 (2.4600)
2025-07-19,08:29:35 | INFO | Start epoch 15
2025-07-19,08:29:45 | INFO | Train Epoch: 15 [ 4096/9319509 (0%)] Data (t): 8.385 Batch (t): 10.205, 401.381/s, 25.0863/s/gpu LR: 0.000286 Logit Scale: 98.630 Contrastive_loss: 0.71142 (0.71142) Fd_loss: 1.5565 (1.5565) Loss: 2.2679 (2.2679)
2025-07-19,08:32:02 | INFO | Train Epoch: 15 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.371, 3005.01/s, 187.813/s/gpu LR: 0.000285 Logit Scale: 100.000 Contrastive_loss: 0.71378 (0.71260) Fd_loss: 1.5552 (1.5558) Loss: 2.2690 (2.2684)
2025-07-19,08:34:19 | INFO | Train Epoch: 15 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.369, 2988.22/s, 186.764/s/gpu LR: 0.000284 Logit Scale: 100.000 Contrastive_loss: 0.72955 (0.71825) Fd_loss: 1.5745 (1.5620) Loss: 2.3040 (2.2803)
2025-07-19,08:36:36 | INFO | Train Epoch: 15 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.368, 2994.89/s, 187.180/s/gpu LR: 0.000283 Logit Scale: 100.000 Contrastive_loss: 0.73647 (0.72280) Fd_loss: 1.5608 (1.5617) Loss: 2.2972 (2.2845)
2025-07-19,08:38:52 | INFO | Train Epoch: 15 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.367, 2969.06/s, 185.566/s/gpu LR: 0.000282 Logit Scale: 100.000 Contrastive_loss: 0.76300 (0.73084) Fd_loss: 1.5631 (1.5620) Loss: 2.3261 (2.2929)
2025-07-19,08:41:09 | INFO | Train Epoch: 15 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2987.47/s, 186.717/s/gpu LR: 0.000281 Logit Scale: 99.908 Contrastive_loss: 0.73894 (0.73219) Fd_loss: 1.5617 (1.5620) Loss: 2.3007 (2.2942)
2025-07-19,08:43:26 | INFO | Train Epoch: 15 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.367, 2993.92/s, 187.120/s/gpu LR: 0.000280 Logit Scale: 100.000 Contrastive_loss: 0.79111 (0.74061) Fd_loss: 1.5626 (1.5620) Loss: 2.3537 (2.3027)
2025-07-19,08:45:43 | INFO | Train Epoch: 15 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.366, 3000.60/s, 187.537/s/gpu LR: 0.000279 Logit Scale: 99.977 Contrastive_loss: 0.76806 (0.74404) Fd_loss: 1.5579 (1.5615) Loss: 2.3260 (2.3056)
2025-07-19,08:47:59 | INFO | Train Epoch: 15 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.367, 2984.68/s, 186.543/s/gpu LR: 0.000277 Logit Scale: 99.933 Contrastive_loss: 0.78599 (0.74870) Fd_loss: 1.5521 (1.5605) Loss: 2.3381 (2.3092)
2025-07-19,08:50:16 | INFO | Train Epoch: 15 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.369, 3008.64/s, 188.040/s/gpu LR: 0.000276 Logit Scale: 99.969 Contrastive_loss: 0.77740 (0.75157) Fd_loss: 1.5625 (1.5607) Loss: 2.3399 (2.3122)
2025-07-19,08:52:33 | INFO | Train Epoch: 15 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.365, 3007.22/s, 187.951/s/gpu LR: 0.000275 Logit Scale: 99.942 Contrastive_loss: 0.76300 (0.75261) Fd_loss: 1.5610 (1.5607) Loss: 2.3240 (2.3133)
2025-07-19,08:54:49 | INFO | Train Epoch: 15 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.367, 2994.52/s, 187.157/s/gpu LR: 0.000274 Logit Scale: 99.983 Contrastive_loss: 0.77073 (0.75412) Fd_loss: 1.5508 (1.5599) Loss: 2.3216 (2.3140)
2025-07-19,08:57:06 | INFO | Train Epoch: 15 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.368, 2998.69/s, 187.418/s/gpu LR: 0.000273 Logit Scale: 99.927 Contrastive_loss: 0.82455 (0.75954) Fd_loss: 1.5558 (1.5596) Loss: 2.3804 (2.3191)
2025-07-19,08:59:23 | INFO | Train Epoch: 15 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.366, 2997.34/s, 187.334/s/gpu LR: 0.000272 Logit Scale: 99.982 Contrastive_loss: 0.83534 (0.76495) Fd_loss: 1.5501 (1.5589) Loss: 2.3854 (2.3238)
2025-07-19,09:01:39 | INFO | Train Epoch: 15 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.365, 2995.72/s, 187.232/s/gpu LR: 0.000271 Logit Scale: 99.971 Contrastive_loss: 0.83637 (0.76971) Fd_loss: 1.5489 (1.5582) Loss: 2.3853 (2.3279)
2025-07-19,09:03:56 | INFO | Train Epoch: 15 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 3009.15/s, 188.072/s/gpu LR: 0.000270 Logit Scale: 99.964 Contrastive_loss: 0.84000 (0.77411) Fd_loss: 1.5537 (1.5579) Loss: 2.3937 (2.3321)
2025-07-19,09:06:13 | INFO | Train Epoch: 15 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.366, 2996.77/s, 187.298/s/gpu LR: 0.000269 Logit Scale: 99.958 Contrastive_loss: 0.80895 (0.77616) Fd_loss: 1.5593 (1.5580) Loss: 2.3683 (2.3342)
2025-07-19,09:08:29 | INFO | Train Epoch: 15 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.366, 2989.69/s, 186.855/s/gpu LR: 0.000267 Logit Scale: 99.955 Contrastive_loss: 0.83824 (0.77961) Fd_loss: 1.5422 (1.5572) Loss: 2.3805 (2.3368)
2025-07-19,09:10:46 | INFO | Train Epoch: 15 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.367, 2997.08/s, 187.318/s/gpu LR: 0.000266 Logit Scale: 99.891 Contrastive_loss: 0.85421 (0.78353) Fd_loss: 1.5548 (1.5570) Loss: 2.4090 (2.3406)
2025-07-19,09:13:03 | INFO | Train Epoch: 15 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.366, 3009.68/s, 188.105/s/gpu LR: 0.000265 Logit Scale: 99.988 Contrastive_loss: 0.86289 (0.78750) Fd_loss: 1.5443 (1.5564) Loss: 2.4072 (2.3439)
2025-07-19,09:15:19 | INFO | Train Epoch: 15 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2997.33/s, 187.333/s/gpu LR: 0.000264 Logit Scale: 99.998 Contrastive_loss: 0.89253 (0.79250) Fd_loss: 1.5430 (1.5558) Loss: 2.4355 (2.3483)
2025-07-19,09:17:36 | INFO | Train Epoch: 15 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.367, 2993.17/s, 187.073/s/gpu LR: 0.000263 Logit Scale: 99.946 Contrastive_loss: 0.81220 (0.79340) Fd_loss: 1.5508 (1.5555) Loss: 2.3630 (2.3489)
2025-07-19,09:19:53 | INFO | Train Epoch: 15 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.365, 3001.13/s, 187.570/s/gpu LR: 0.000262 Logit Scale: 99.959 Contrastive_loss: 0.85691 (0.79616) Fd_loss: 1.5536 (1.5554) Loss: 2.4105 (2.3516)
2025-07-19,09:21:34 | INFO | Train Epoch: 15 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.367, 3007.34/s, 187.959/s/gpu LR: 0.000261 Logit Scale: 99.920 Contrastive_loss: 0.82530 (0.79737) Fd_loss: 1.5411 (1.5548) Loss: 2.3664 (2.3522)
2025-07-19,09:21:35 | INFO | Starting zero-shot imagenet.
2025-07-19,09:21:35 | INFO | Building zero-shot classifier
2025-07-19,09:21:50 | INFO | Using classifier
2025-07-19,09:23:04 | INFO | Finished zero-shot imagenet.
2025-07-19,09:23:04 | INFO | Eval Epoch: 16 imagenet-zeroshot-val-top1: 0.2542 imagenet-zeroshot-val-top5: 0.5089
2025-07-19,09:23:05 | INFO | Start epoch 16
2025-07-19,09:23:11 | INFO | Train Epoch: 16 [ 4096/9319509 (0%)] Data (t): 4.940 Batch (t): 6.287, 651.497/s, 40.7186/s/gpu LR: 0.000261 Logit Scale: 99.919 Contrastive_loss: 0.64543 (0.64543) Fd_loss: 1.5450 (1.5450) Loss: 2.1904 (2.1904)
2025-07-19,09:25:28 | INFO | Train Epoch: 16 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.375, 3009.35/s, 188.085/s/gpu LR: 0.000260 Logit Scale: 100.000 Contrastive_loss: 0.68991 (0.66767) Fd_loss: 1.5506 (1.5478) Loss: 2.2405 (2.2154)
2025-07-19,09:27:48 | INFO | Train Epoch: 16 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.396, 2983.78/s, 186.486/s/gpu LR: 0.000259 Logit Scale: 100.000 Contrastive_loss: 0.73555 (0.69030) Fd_loss: 1.5412 (1.5456) Loss: 2.2768 (2.2359)
2025-07-19,09:30:06 | INFO | Train Epoch: 16 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.378, 2973.24/s, 185.827/s/gpu LR: 0.000258 Logit Scale: 100.000 Contrastive_loss: 0.69794 (0.69221) Fd_loss: 1.5516 (1.5471) Loss: 2.2495 (2.2393)
2025-07-19,09:32:23 | INFO | Train Epoch: 16 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.368, 2994.31/s, 187.144/s/gpu LR: 0.000257 Logit Scale: 100.000 Contrastive_loss: 0.69283 (0.69233) Fd_loss: 1.5455 (1.5468) Loss: 2.2383 (2.2391)
2025-07-19,09:34:39 | INFO | Train Epoch: 16 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.366, 2988.78/s, 186.799/s/gpu LR: 0.000256 Logit Scale: 100.000 Contrastive_loss: 0.70900 (0.69511) Fd_loss: 1.5338 (1.5446) Loss: 2.2428 (2.2397)
2025-07-19,09:37:02 | INFO | Train Epoch: 16 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.434, 3001.27/s, 187.579/s/gpu LR: 0.000254 Logit Scale: 100.000 Contrastive_loss: 0.72943 (0.70001) Fd_loss: 1.5412 (1.5441) Loss: 2.2707 (2.2441)
2025-07-19,09:39:19 | INFO | Train Epoch: 16 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.366, 3003.08/s, 187.692/s/gpu LR: 0.000253 Logit Scale: 100.000 Contrastive_loss: 0.77964 (0.70996) Fd_loss: 1.5501 (1.5449) Loss: 2.3297 (2.2548)
2025-07-19,09:41:36 | INFO | Train Epoch: 16 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.365, 2981.49/s, 186.343/s/gpu LR: 0.000252 Logit Scale: 99.996 Contrastive_loss: 0.69334 (0.70812) Fd_loss: 1.5347 (1.5437) Loss: 2.2281 (2.2519)
2025-07-19,09:43:52 | INFO | Train Epoch: 16 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 3007.11/s, 187.944/s/gpu LR: 0.000251 Logit Scale: 99.975 Contrastive_loss: 0.73867 (0.71117) Fd_loss: 1.5473 (1.5441) Loss: 2.2860 (2.2553)
2025-07-19,09:46:12 | INFO | Train Epoch: 16 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.399, 3001.56/s, 187.597/s/gpu LR: 0.000250 Logit Scale: 100.000 Contrastive_loss: 0.75898 (0.71552) Fd_loss: 1.5332 (1.5431) Loss: 2.2922 (2.2586)
2025-07-19,09:48:29 | INFO | Train Epoch: 16 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.365, 3003.26/s, 187.704/s/gpu LR: 0.000249 Logit Scale: 99.967 Contrastive_loss: 0.74096 (0.71764) Fd_loss: 1.5517 (1.5438) Loss: 2.2927 (2.2615)
2025-07-19,09:50:45 | INFO | Train Epoch: 16 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.366, 2982.42/s, 186.401/s/gpu LR: 0.000248 Logit Scale: 99.975 Contrastive_loss: 0.71342 (0.71731) Fd_loss: 1.5380 (1.5434) Loss: 2.2514 (2.2607)
2025-07-19,09:53:02 | INFO | Train Epoch: 16 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.368, 2991.72/s, 186.982/s/gpu LR: 0.000247 Logit Scale: 99.935 Contrastive_loss: 0.74897 (0.71958) Fd_loss: 1.5403 (1.5432) Loss: 2.2893 (2.2627)
2025-07-19,09:55:19 | INFO | Train Epoch: 16 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.366, 2983.61/s, 186.476/s/gpu LR: 0.000246 Logit Scale: 99.952 Contrastive_loss: 0.79383 (0.72453) Fd_loss: 1.5402 (1.5430) Loss: 2.3340 (2.2675)
2025-07-19,09:57:36 | INFO | Train Epoch: 16 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 3002.62/s, 187.664/s/gpu LR: 0.000244 Logit Scale: 100.000 Contrastive_loss: 0.75625 (0.72651) Fd_loss: 1.5366 (1.5426) Loss: 2.2928 (2.2691)
2025-07-19,09:59:52 | INFO | Train Epoch: 16 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.367, 2999.35/s, 187.459/s/gpu LR: 0.000243 Logit Scale: 99.954 Contrastive_loss: 0.75798 (0.72836) Fd_loss: 1.5424 (1.5425) Loss: 2.3003 (2.2709)
2025-07-19,10:02:09 | INFO | Train Epoch: 16 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.365, 2986.83/s, 186.677/s/gpu LR: 0.000242 Logit Scale: 99.977 Contrastive_loss: 0.75937 (0.73008) Fd_loss: 1.5328 (1.5420) Loss: 2.2922 (2.2721)
2025-07-19,10:04:26 | INFO | Train Epoch: 16 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2995.58/s, 187.224/s/gpu LR: 0.000241 Logit Scale: 99.988 Contrastive_loss: 0.75360 (0.73132) Fd_loss: 1.5473 (1.5423) Loss: 2.3009 (2.2736)
2025-07-19,10:06:42 | INFO | Train Epoch: 16 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 2993.23/s, 187.077/s/gpu LR: 0.000240 Logit Scale: 99.983 Contrastive_loss: 0.72665 (0.73109) Fd_loss: 1.5282 (1.5416) Loss: 2.2548 (2.2727)
2025-07-19,10:08:59 | INFO | Train Epoch: 16 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.365, 2996.10/s, 187.256/s/gpu LR: 0.000239 Logit Scale: 99.935 Contrastive_loss: 0.78284 (0.73355) Fd_loss: 1.5306 (1.5411) Loss: 2.3134 (2.2746)
2025-07-19,10:11:16 | INFO | Train Epoch: 16 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 2998.43/s, 187.402/s/gpu LR: 0.000238 Logit Scale: 99.999 Contrastive_loss: 0.76478 (0.73497) Fd_loss: 1.5181 (1.5400) Loss: 2.2829 (2.2750)
2025-07-19,10:13:32 | INFO | Train Epoch: 16 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 3000.59/s, 187.537/s/gpu LR: 0.000237 Logit Scale: 99.972 Contrastive_loss: 0.77679 (0.73679) Fd_loss: 1.5342 (1.5398) Loss: 2.3110 (2.2765)
2025-07-19,10:15:14 | INFO | Train Epoch: 16 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.368, 2983.97/s, 186.498/s/gpu LR: 0.000236 Logit Scale: 99.967 Contrastive_loss: 0.79482 (0.73921) Fd_loss: 1.5244 (1.5391) Loss: 2.3192 (2.2783)
2025-07-19,10:15:16 | INFO | Start epoch 17
2025-07-19,10:15:26 | INFO | Train Epoch: 17 [ 4096/9319509 (0%)] Data (t): 8.353 Batch (t): 10.369, 395.014/s, 24.6884/s/gpu LR: 0.000236 Logit Scale: 99.966 Contrastive_loss: 0.61990 (0.61990) Fd_loss: 1.5250 (1.5250) Loss: 2.1449 (2.1449)
2025-07-19,10:17:43 | INFO | Train Epoch: 17 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.369, 2996.65/s, 187.290/s/gpu LR: 0.000235 Logit Scale: 100.000 Contrastive_loss: 0.59993 (0.60992) Fd_loss: 1.5437 (1.5343) Loss: 2.1436 (2.1443)
2025-07-19,10:20:00 | INFO | Train Epoch: 17 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.372, 3000.16/s, 187.510/s/gpu LR: 0.000234 Logit Scale: 100.000 Contrastive_loss: 0.62522 (0.61502) Fd_loss: 1.5410 (1.5365) Loss: 2.1662 (2.1516)
2025-07-19,10:22:17 | INFO | Train Epoch: 17 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.367, 2999.20/s, 187.450/s/gpu LR: 0.000233 Logit Scale: 100.000 Contrastive_loss: 0.65432 (0.62484) Fd_loss: 1.5134 (1.5308) Loss: 2.1677 (2.1556)
2025-07-19,10:24:33 | INFO | Train Epoch: 17 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.367, 2998.16/s, 187.385/s/gpu LR: 0.000231 Logit Scale: 100.000 Contrastive_loss: 0.66087 (0.63205) Fd_loss: 1.5307 (1.5307) Loss: 2.1916 (2.1628)
2025-07-19,10:26:50 | INFO | Train Epoch: 17 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2990.53/s, 186.908/s/gpu LR: 0.000230 Logit Scale: 100.000 Contrastive_loss: 0.67136 (0.63860) Fd_loss: 1.5349 (1.5314) Loss: 2.2063 (2.1700)
2025-07-19,10:29:07 | INFO | Train Epoch: 17 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.367, 3000.60/s, 187.538/s/gpu LR: 0.000229 Logit Scale: 99.984 Contrastive_loss: 0.65968 (0.64161) Fd_loss: 1.5356 (1.5320) Loss: 2.1953 (2.1737)
2025-07-19,10:31:24 | INFO | Train Epoch: 17 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.366, 2982.46/s, 186.404/s/gpu LR: 0.000228 Logit Scale: 100.000 Contrastive_loss: 0.68559 (0.64711) Fd_loss: 1.5340 (1.5323) Loss: 2.2196 (2.1794)
2025-07-19,10:33:40 | INFO | Train Epoch: 17 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.368, 2993.53/s, 187.096/s/gpu LR: 0.000227 Logit Scale: 99.945 Contrastive_loss: 0.65785 (0.64830) Fd_loss: 1.5228 (1.5312) Loss: 2.1807 (2.1795)
2025-07-19,10:35:57 | INFO | Train Epoch: 17 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2990.72/s, 186.920/s/gpu LR: 0.000226 Logit Scale: 100.000 Contrastive_loss: 0.72860 (0.65633) Fd_loss: 1.5238 (1.5305) Loss: 2.2524 (2.1868)
2025-07-19,10:38:14 | INFO | Train Epoch: 17 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 2996.63/s, 187.289/s/gpu LR: 0.000225 Logit Scale: 100.000 Contrastive_loss: 0.69050 (0.65944) Fd_loss: 1.5253 (1.5300) Loss: 2.2158 (2.1895)
2025-07-19,10:40:31 | INFO | Train Epoch: 17 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.369, 2976.95/s, 186.059/s/gpu LR: 0.000224 Logit Scale: 99.994 Contrastive_loss: 0.69314 (0.66225) Fd_loss: 1.5441 (1.5312) Loss: 2.2373 (2.1934)
2025-07-19,10:42:47 | INFO | Train Epoch: 17 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 3006.19/s, 187.887/s/gpu LR: 0.000223 Logit Scale: 100.000 Contrastive_loss: 0.71292 (0.66614) Fd_loss: 1.5200 (1.5303) Loss: 2.2329 (2.1965)
2025-07-19,10:45:04 | INFO | Train Epoch: 17 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.366, 3003.74/s, 187.734/s/gpu LR: 0.000221 Logit Scale: 100.000 Contrastive_loss: 0.74036 (0.67144) Fd_loss: 1.5275 (1.5301) Loss: 2.2679 (2.2016)
2025-07-19,10:47:21 | INFO | Train Epoch: 17 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.365, 2988.81/s, 186.801/s/gpu LR: 0.000220 Logit Scale: 100.000 Contrastive_loss: 0.72870 (0.67526) Fd_loss: 1.5195 (1.5294) Loss: 2.2482 (2.2047)
2025-07-19,10:49:37 | INFO | Train Epoch: 17 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.365, 3011.57/s, 188.223/s/gpu LR: 0.000219 Logit Scale: 99.965 Contrastive_loss: 0.70785 (0.67730) Fd_loss: 1.5248 (1.5291) Loss: 2.2326 (2.2064)
2025-07-19,10:51:54 | INFO | Train Epoch: 17 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.365, 3002.99/s, 187.687/s/gpu LR: 0.000218 Logit Scale: 99.997 Contrastive_loss: 0.69419 (0.67829) Fd_loss: 1.5216 (1.5287) Loss: 2.2158 (2.2070)
2025-07-19,10:54:10 | INFO | Train Epoch: 17 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.366, 2977.44/s, 186.090/s/gpu LR: 0.000217 Logit Scale: 99.947 Contrastive_loss: 0.67894 (0.67833) Fd_loss: 1.5136 (1.5279) Loss: 2.1926 (2.2062)
2025-07-19,10:56:27 | INFO | Train Epoch: 17 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.366, 2984.68/s, 186.542/s/gpu LR: 0.000216 Logit Scale: 100.000 Contrastive_loss: 0.73439 (0.68128) Fd_loss: 1.5090 (1.5269) Loss: 2.2434 (2.2081)
2025-07-19,10:58:43 | INFO | Train Epoch: 17 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.365, 2988.53/s, 186.783/s/gpu LR: 0.000215 Logit Scale: 99.981 Contrastive_loss: 0.69821 (0.68213) Fd_loss: 1.5196 (1.5265) Loss: 2.2179 (2.2086)
2025-07-19,11:01:00 | INFO | Train Epoch: 17 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.365, 3004.51/s, 187.782/s/gpu LR: 0.000214 Logit Scale: 99.937 Contrastive_loss: 0.69960 (0.68296) Fd_loss: 1.5311 (1.5267) Loss: 2.2307 (2.2097)
2025-07-19,11:03:16 | INFO | Train Epoch: 17 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.366, 2999.23/s, 187.452/s/gpu LR: 0.000213 Logit Scale: 99.960 Contrastive_loss: 0.70942 (0.68416) Fd_loss: 1.5262 (1.5267) Loss: 2.2357 (2.2109)
2025-07-19,11:05:33 | INFO | Train Epoch: 17 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.365, 3000.18/s, 187.512/s/gpu LR: 0.000212 Logit Scale: 99.973 Contrastive_loss: 0.71418 (0.68547) Fd_loss: 1.5209 (1.5264) Loss: 2.2351 (2.2119)
2025-07-19,11:07:14 | INFO | Train Epoch: 17 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.366, 3002.12/s, 187.633/s/gpu LR: 0.000211 Logit Scale: 99.993 Contrastive_loss: 0.70613 (0.68633) Fd_loss: 1.5226 (1.5263) Loss: 2.2287 (2.2126)
2025-07-19,11:07:15 | INFO | Starting zero-shot imagenet.
2025-07-19,11:07:15 | INFO | Building zero-shot classifier
2025-07-19,11:07:31 | INFO | Using classifier
2025-07-19,11:08:45 | INFO | Finished zero-shot imagenet.
2025-07-19,11:08:45 | INFO | Eval Epoch: 18 imagenet-zeroshot-val-top1: 0.2653 imagenet-zeroshot-val-top5: 0.5279
2025-07-19,11:08:45 | INFO | Start epoch 18
2025-07-19,11:08:51 | INFO | Train Epoch: 18 [ 4096/9319509 (0%)] Data (t): 4.223 Batch (t): 5.575, 734.672/s, 45.9170/s/gpu LR: 0.000211 Logit Scale: 99.994 Contrastive_loss: 0.55734 (0.55734) Fd_loss: 1.5174 (1.5174) Loss: 2.0747 (2.0747)
2025-07-19,11:11:07 | INFO | Train Epoch: 18 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.361, 2988.65/s, 186.790/s/gpu LR: 0.000210 Logit Scale: 100.000 Contrastive_loss: 0.56366 (0.56050) Fd_loss: 1.5191 (1.5182) Loss: 2.0827 (2.0787)
2025-07-19,11:13:24 | INFO | Train Epoch: 18 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.372, 2982.93/s, 186.433/s/gpu LR: 0.000209 Logit Scale: 100.000 Contrastive_loss: 0.55443 (0.55848) Fd_loss: 1.5197 (1.5187) Loss: 2.0741 (2.0772)
2025-07-19,11:15:41 | INFO | Train Epoch: 18 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.371, 2993.55/s, 187.097/s/gpu LR: 0.000207 Logit Scale: 100.000 Contrastive_loss: 0.59281 (0.56706) Fd_loss: 1.5241 (1.5201) Loss: 2.1169 (2.0871)
2025-07-19,11:17:58 | INFO | Train Epoch: 18 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.365, 2997.10/s, 187.319/s/gpu LR: 0.000206 Logit Scale: 100.000 Contrastive_loss: 0.61698 (0.57704) Fd_loss: 1.5128 (1.5186) Loss: 2.1298 (2.0957)
2025-07-19,11:20:15 | INFO | Train Epoch: 18 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.367, 2995.75/s, 187.234/s/gpu LR: 0.000205 Logit Scale: 100.000 Contrastive_loss: 0.59128 (0.57942) Fd_loss: 1.5164 (1.5182) Loss: 2.1077 (2.0977)
2025-07-19,11:22:31 | INFO | Train Epoch: 18 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.368, 2986.31/s, 186.645/s/gpu LR: 0.000204 Logit Scale: 99.997 Contrastive_loss: 0.62404 (0.58579) Fd_loss: 1.5144 (1.5177) Loss: 2.1384 (2.1035)
2025-07-19,11:24:48 | INFO | Train Epoch: 18 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.365, 3003.66/s, 187.729/s/gpu LR: 0.000203 Logit Scale: 100.000 Contrastive_loss: 0.60191 (0.58781) Fd_loss: 1.5237 (1.5184) Loss: 2.1256 (2.1062)
2025-07-19,11:27:05 | INFO | Train Epoch: 18 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.367, 2994.80/s, 187.175/s/gpu LR: 0.000202 Logit Scale: 100.000 Contrastive_loss: 0.62681 (0.59214) Fd_loss: 1.5194 (1.5186) Loss: 2.1463 (2.1107)
2025-07-19,11:29:21 | INFO | Train Epoch: 18 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.367, 3003.84/s, 187.740/s/gpu LR: 0.000201 Logit Scale: 100.000 Contrastive_loss: 0.60056 (0.59298) Fd_loss: 1.5147 (1.5182) Loss: 2.1153 (2.1112)
2025-07-19,11:31:38 | INFO | Train Epoch: 18 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.366, 3002.74/s, 187.671/s/gpu LR: 0.000200 Logit Scale: 99.948 Contrastive_loss: 0.63229 (0.59656) Fd_loss: 1.5178 (1.5181) Loss: 2.1501 (2.1147)
2025-07-19,11:33:55 | INFO | Train Epoch: 18 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.368, 2996.53/s, 187.283/s/gpu LR: 0.000199 Logit Scale: 100.000 Contrastive_loss: 0.65566 (0.60148) Fd_loss: 1.5142 (1.5178) Loss: 2.1699 (2.1193)
2025-07-19,11:36:11 | INFO | Train Epoch: 18 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.366, 3006.57/s, 187.910/s/gpu LR: 0.000198 Logit Scale: 100.000 Contrastive_loss: 0.65084 (0.60528) Fd_loss: 1.5181 (1.5178) Loss: 2.1690 (2.1231)
2025-07-19,11:38:28 | INFO | Train Epoch: 18 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.364, 3010.60/s, 188.162/s/gpu LR: 0.000197 Logit Scale: 100.000 Contrastive_loss: 0.66712 (0.60970) Fd_loss: 1.5087 (1.5172) Loss: 2.1758 (2.1269)
2025-07-19,11:40:44 | INFO | Train Epoch: 18 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.366, 3001.22/s, 187.576/s/gpu LR: 0.000196 Logit Scale: 100.000 Contrastive_loss: 0.68045 (0.61441) Fd_loss: 1.5246 (1.5177) Loss: 2.2050 (2.1321)
2025-07-19,11:43:01 | INFO | Train Epoch: 18 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.367, 3006.06/s, 187.879/s/gpu LR: 0.000194 Logit Scale: 100.000 Contrastive_loss: 0.60942 (0.61410) Fd_loss: 1.5250 (1.5181) Loss: 2.1344 (2.1322)
2025-07-19,11:45:18 | INFO | Train Epoch: 18 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.365, 3002.64/s, 187.665/s/gpu LR: 0.000193 Logit Scale: 99.990 Contrastive_loss: 0.65546 (0.61653) Fd_loss: 1.5186 (1.5182) Loss: 2.1741 (2.1347)
2025-07-19,11:47:34 | INFO | Train Epoch: 18 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.366, 2988.76/s, 186.797/s/gpu LR: 0.000192 Logit Scale: 100.000 Contrastive_loss: 0.66667 (0.61932) Fd_loss: 1.5116 (1.5178) Loss: 2.1782 (2.1371)
2025-07-19,11:49:51 | INFO | Train Epoch: 18 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.367, 3004.14/s, 187.759/s/gpu LR: 0.000191 Logit Scale: 99.997 Contrastive_loss: 0.69425 (0.62326) Fd_loss: 1.5148 (1.5176) Loss: 2.2090 (2.1409)
2025-07-19,11:52:07 | INFO | Train Epoch: 18 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.365, 2998.83/s, 187.427/s/gpu LR: 0.000190 Logit Scale: 100.000 Contrastive_loss: 0.69062 (0.62663) Fd_loss: 1.5085 (1.5172) Loss: 2.1991 (2.1438)
2025-07-19,11:54:24 | INFO | Train Epoch: 18 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.367, 2977.43/s, 186.090/s/gpu LR: 0.000189 Logit Scale: 100.000 Contrastive_loss: 0.68624 (0.62947) Fd_loss: 1.5133 (1.5170) Loss: 2.1995 (2.1465)
2025-07-19,11:56:41 | INFO | Train Epoch: 18 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.367, 3001.72/s, 187.607/s/gpu LR: 0.000188 Logit Scale: 100.000 Contrastive_loss: 0.66247 (0.63097) Fd_loss: 1.5085 (1.5166) Loss: 2.1709 (2.1476)
2025-07-19,11:58:57 | INFO | Train Epoch: 18 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.365, 3005.84/s, 187.865/s/gpu LR: 0.000187 Logit Scale: 99.997 Contrastive_loss: 0.63734 (0.63125) Fd_loss: 1.5089 (1.5163) Loss: 2.1463 (2.1475)
2025-07-19,12:00:38 | INFO | Train Epoch: 18 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.365, 3009.54/s, 188.096/s/gpu LR: 0.000186 Logit Scale: 100.000 Contrastive_loss: 0.70662 (0.63439) Fd_loss: 1.5235 (1.5166) Loss: 2.2301 (2.1510)
2025-07-19,12:00:40 | INFO | Start epoch 19
2025-07-19,12:00:50 | INFO | Train Epoch: 19 [ 4096/9319509 (0%)] Data (t): 8.038 Batch (t): 10.198, 401.639/s, 25.1024/s/gpu LR: 0.000186 Logit Scale: 100.000 Contrastive_loss: 0.48845 (0.48845) Fd_loss: 1.5195 (1.5195) Loss: 2.0079 (2.0079)
2025-07-19,12:03:07 | INFO | Train Epoch: 19 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.367, 2976.73/s, 186.046/s/gpu LR: 0.000185 Logit Scale: 100.000 Contrastive_loss: 0.52664 (0.50755) Fd_loss: 1.5165 (1.5180) Loss: 2.0432 (2.0256)
2025-07-19,12:05:24 | INFO | Train Epoch: 19 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.373, 2984.08/s, 186.505/s/gpu LR: 0.000184 Logit Scale: 100.000 Contrastive_loss: 0.53397 (0.51635) Fd_loss: 1.5141 (1.5167) Loss: 2.0480 (2.0330)
2025-07-19,12:07:41 | INFO | Train Epoch: 19 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 3002.10/s, 187.631/s/gpu LR: 0.000183 Logit Scale: 100.000 Contrastive_loss: 0.53682 (0.52147) Fd_loss: 1.5068 (1.5142) Loss: 2.0436 (2.0357)
2025-07-19,12:09:58 | INFO | Train Epoch: 19 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.369, 2985.28/s, 186.580/s/gpu LR: 0.000182 Logit Scale: 100.000 Contrastive_loss: 0.55677 (0.52853) Fd_loss: 1.5057 (1.5125) Loss: 2.0625 (2.0410)
2025-07-19,12:12:15 | INFO | Train Epoch: 19 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.367, 2995.81/s, 187.238/s/gpu LR: 0.000181 Logit Scale: 100.000 Contrastive_loss: 0.58266 (0.53755) Fd_loss: 1.5157 (1.5130) Loss: 2.0984 (2.0506)
2025-07-19,12:14:31 | INFO | Train Epoch: 19 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.365, 2998.11/s, 187.382/s/gpu LR: 0.000180 Logit Scale: 100.000 Contrastive_loss: 0.55397 (0.53990) Fd_loss: 1.5184 (1.5138) Loss: 2.0724 (2.0537)
2025-07-19,12:16:48 | INFO | Train Epoch: 19 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.368, 3004.33/s, 187.771/s/gpu LR: 0.000179 Logit Scale: 100.000 Contrastive_loss: 0.58487 (0.54552) Fd_loss: 1.5029 (1.5125) Loss: 2.0878 (2.0580)
2025-07-19,12:19:05 | INFO | Train Epoch: 19 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.367, 2995.11/s, 187.195/s/gpu LR: 0.000178 Logit Scale: 99.999 Contrastive_loss: 0.50848 (0.54140) Fd_loss: 1.5145 (1.5127) Loss: 2.0229 (2.0541)
2025-07-19,12:21:21 | INFO | Train Epoch: 19 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.365, 3008.74/s, 188.046/s/gpu LR: 0.000177 Logit Scale: 99.999 Contrastive_loss: 0.58265 (0.54553) Fd_loss: 1.5012 (1.5115) Loss: 2.0839 (2.0571)
2025-07-19,12:23:38 | INFO | Train Epoch: 19 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 2966.17/s, 185.386/s/gpu LR: 0.000175 Logit Scale: 100.000 Contrastive_loss: 0.55358 (0.54626) Fd_loss: 1.5136 (1.5117) Loss: 2.0672 (2.0580)
2025-07-19,12:25:55 | INFO | Train Epoch: 19 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.368, 2995.87/s, 187.242/s/gpu LR: 0.000174 Logit Scale: 100.000 Contrastive_loss: 0.59139 (0.55002) Fd_loss: 1.5063 (1.5113) Loss: 2.0977 (2.0613)
2025-07-19,12:28:11 | INFO | Train Epoch: 19 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.366, 2997.14/s, 187.321/s/gpu LR: 0.000173 Logit Scale: 100.000 Contrastive_loss: 0.59889 (0.55378) Fd_loss: 1.5057 (1.5108) Loss: 2.1046 (2.0646)
2025-07-19,12:30:28 | INFO | Train Epoch: 19 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.367, 3000.01/s, 187.501/s/gpu LR: 0.000172 Logit Scale: 99.976 Contrastive_loss: 0.61275 (0.55799) Fd_loss: 1.5071 (1.5106) Loss: 2.1199 (2.0686)
2025-07-19,12:32:45 | INFO | Train Epoch: 19 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 3004.27/s, 187.767/s/gpu LR: 0.000171 Logit Scale: 99.982 Contrastive_loss: 0.56925 (0.55874) Fd_loss: 1.5153 (1.5109) Loss: 2.0845 (2.0696)
2025-07-19,12:35:01 | INFO | Train Epoch: 19 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.365, 3004.08/s, 187.755/s/gpu LR: 0.000170 Logit Scale: 99.988 Contrastive_loss: 0.59593 (0.56107) Fd_loss: 1.5093 (1.5108) Loss: 2.1052 (2.0719)
2025-07-19,12:37:18 | INFO | Train Epoch: 19 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.366, 2993.92/s, 187.120/s/gpu LR: 0.000169 Logit Scale: 99.982 Contrastive_loss: 0.57544 (0.56191) Fd_loss: 1.5141 (1.5110) Loss: 2.0895 (2.0729)
2025-07-19,12:39:35 | INFO | Train Epoch: 19 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.368, 2999.37/s, 187.461/s/gpu LR: 0.000168 Logit Scale: 100.000 Contrastive_loss: 0.62974 (0.56568) Fd_loss: 1.5026 (1.5105) Loss: 2.1323 (2.0762)
2025-07-19,12:41:51 | INFO | Train Epoch: 19 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.366, 3008.47/s, 188.029/s/gpu LR: 0.000167 Logit Scale: 100.000 Contrastive_loss: 0.60689 (0.56785) Fd_loss: 1.5067 (1.5103) Loss: 2.1136 (2.0782)
2025-07-19,12:44:08 | INFO | Train Epoch: 19 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.366, 2982.84/s, 186.428/s/gpu LR: 0.000166 Logit Scale: 99.989 Contrastive_loss: 0.61907 (0.57041) Fd_loss: 1.5054 (1.5101) Loss: 2.1245 (2.0805)
2025-07-19,12:46:25 | INFO | Train Epoch: 19 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 2987.58/s, 186.724/s/gpu LR: 0.000165 Logit Scale: 100.000 Contrastive_loss: 0.59324 (0.57150) Fd_loss: 1.5027 (1.5097) Loss: 2.0959 (2.0812)
2025-07-19,12:48:41 | INFO | Train Epoch: 19 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 3003.54/s, 187.721/s/gpu LR: 0.000164 Logit Scale: 100.000 Contrastive_loss: 0.60395 (0.57297) Fd_loss: 1.5066 (1.5096) Loss: 2.1106 (2.0826)
2025-07-19,12:50:58 | INFO | Train Epoch: 19 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.367, 2999.03/s, 187.439/s/gpu LR: 0.000163 Logit Scale: 99.971 Contrastive_loss: 0.60508 (0.57437) Fd_loss: 1.4931 (1.5089) Loss: 2.0982 (2.0832)
2025-07-19,12:52:40 | INFO | Train Epoch: 19 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 2988.69/s, 186.793/s/gpu LR: 0.000162 Logit Scale: 99.978 Contrastive_loss: 0.63802 (0.57702) Fd_loss: 1.4958 (1.5083) Loss: 2.1338 (2.0853)
2025-07-19,12:52:41 | INFO | Starting zero-shot imagenet.
2025-07-19,12:52:41 | INFO | Building zero-shot classifier
2025-07-19,12:52:56 | INFO | Using classifier
2025-07-19,12:54:13 | INFO | Finished zero-shot imagenet.
2025-07-19,12:54:13 | INFO | Eval Epoch: 20 imagenet-zeroshot-val-top1: 0.2666 imagenet-zeroshot-val-top5: 0.5288
2025-07-19,12:54:13 | INFO | Start epoch 20
2025-07-19,12:54:20 | INFO | Train Epoch: 20 [ 4096/9319509 (0%)] Data (t): 5.034 Batch (t): 6.380, 641.985/s, 40.1240/s/gpu LR: 0.000162 Logit Scale: 99.973 Contrastive_loss: 0.48818 (0.48818) Fd_loss: 1.4989 (1.4989) Loss: 1.9871 (1.9871)
2025-07-19,12:56:36 | INFO | Train Epoch: 20 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.361, 3003.56/s, 187.723/s/gpu LR: 0.000161 Logit Scale: 100.000 Contrastive_loss: 0.48693 (0.48756) Fd_loss: 1.5103 (1.5046) Loss: 1.9972 (1.9922)
2025-07-19,12:58:53 | INFO | Train Epoch: 20 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.372, 2989.49/s, 186.843/s/gpu LR: 0.000160 Logit Scale: 100.000 Contrastive_loss: 0.49218 (0.48910) Fd_loss: 1.5009 (1.5034) Loss: 1.9931 (1.9925)
2025-07-19,13:01:10 | INFO | Train Epoch: 20 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.368, 3002.88/s, 187.680/s/gpu LR: 0.000159 Logit Scale: 100.000 Contrastive_loss: 0.50076 (0.49201) Fd_loss: 1.5025 (1.5032) Loss: 2.0032 (1.9952)
2025-07-19,13:03:27 | INFO | Train Epoch: 20 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.367, 2993.60/s, 187.100/s/gpu LR: 0.000158 Logit Scale: 100.000 Contrastive_loss: 0.49285 (0.49218) Fd_loss: 1.5064 (1.5038) Loss: 1.9993 (1.9960)
2025-07-19,13:05:43 | INFO | Train Epoch: 20 [2052096/9319509 (22%)] Data (t): 0.000 Batch (t): 1.369, 3000.62/s, 187.539/s/gpu LR: 0.000157 Logit Scale: 100.000 Contrastive_loss: 0.53239 (0.49888) Fd_loss: 1.5049 (1.5040) Loss: 2.0373 (2.0029)
2025-07-19,13:08:00 | INFO | Train Epoch: 20 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.366, 2999.51/s, 187.470/s/gpu LR: 0.000156 Logit Scale: 100.000 Contrastive_loss: 0.51181 (0.50073) Fd_loss: 1.5090 (1.5047) Loss: 2.0209 (2.0055)
2025-07-19,13:10:17 | INFO | Train Epoch: 20 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.365, 2993.73/s, 187.108/s/gpu LR: 0.000155 Logit Scale: 100.000 Contrastive_loss: 0.52271 (0.50348) Fd_loss: 1.4955 (1.5036) Loss: 2.0182 (2.0070)
2025-07-19,13:12:33 | INFO | Train Epoch: 20 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.368, 3005.67/s, 187.854/s/gpu LR: 0.000154 Logit Scale: 100.000 Contrastive_loss: 0.54618 (0.50822) Fd_loss: 1.5032 (1.5035) Loss: 2.0494 (2.0117)
2025-07-19,13:14:50 | INFO | Train Epoch: 20 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.368, 2988.24/s, 186.765/s/gpu LR: 0.000153 Logit Scale: 100.000 Contrastive_loss: 0.52437 (0.50984) Fd_loss: 1.4957 (1.5027) Loss: 2.0201 (2.0126)
2025-07-19,13:17:07 | INFO | Train Epoch: 20 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.365, 3003.80/s, 187.738/s/gpu LR: 0.000152 Logit Scale: 100.000 Contrastive_loss: 0.52763 (0.51145) Fd_loss: 1.4963 (1.5022) Loss: 2.0239 (2.0136)
2025-07-19,13:19:23 | INFO | Train Epoch: 20 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.368, 3005.46/s, 187.841/s/gpu LR: 0.000151 Logit Scale: 100.000 Contrastive_loss: 0.53442 (0.51337) Fd_loss: 1.4954 (1.5016) Loss: 2.0298 (2.0150)
2025-07-19,13:21:40 | INFO | Train Epoch: 20 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.368, 2998.55/s, 187.410/s/gpu LR: 0.000150 Logit Scale: 99.995 Contrastive_loss: 0.53843 (0.51529) Fd_loss: 1.5022 (1.5016) Loss: 2.0406 (2.0169)
2025-07-19,13:23:57 | INFO | Train Epoch: 20 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.366, 3004.74/s, 187.796/s/gpu LR: 0.000149 Logit Scale: 100.000 Contrastive_loss: 0.53773 (0.51690) Fd_loss: 1.4977 (1.5014) Loss: 2.0354 (2.0183)
2025-07-19,13:26:14 | INFO | Train Epoch: 20 [5738496/9319509 (62%)] Data (t): 0.000 Batch (t): 1.368, 2998.05/s, 187.378/s/gpu LR: 0.000148 Logit Scale: 100.000 Contrastive_loss: 0.51006 (0.51644) Fd_loss: 1.4887 (1.5005) Loss: 1.9987 (2.0170)
2025-07-19,13:28:30 | INFO | Train Epoch: 20 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.367, 2995.14/s, 187.196/s/gpu LR: 0.000147 Logit Scale: 100.000 Contrastive_loss: 0.54536 (0.51825) Fd_loss: 1.4963 (1.5003) Loss: 2.0417 (2.0185)
2025-07-19,13:30:47 | INFO | Train Epoch: 20 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.366, 2998.06/s, 187.379/s/gpu LR: 0.000146 Logit Scale: 100.000 Contrastive_loss: 0.56147 (0.52079) Fd_loss: 1.4935 (1.4999) Loss: 2.0550 (2.0207)
2025-07-19,13:33:04 | INFO | Train Epoch: 20 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.369, 2990.90/s, 186.932/s/gpu LR: 0.000145 Logit Scale: 100.000 Contrastive_loss: 0.53873 (0.52179) Fd_loss: 1.4995 (1.4998) Loss: 2.0382 (2.0216)
2025-07-19,13:35:20 | INFO | Train Epoch: 20 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.367, 3002.28/s, 187.643/s/gpu LR: 0.000144 Logit Scale: 99.993 Contrastive_loss: 0.53308 (0.52238) Fd_loss: 1.4961 (1.4996) Loss: 2.0291 (2.0220)
2025-07-19,13:37:37 | INFO | Train Epoch: 20 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.365, 3014.42/s, 188.402/s/gpu LR: 0.000143 Logit Scale: 100.000 Contrastive_loss: 0.52890 (0.52271) Fd_loss: 1.4970 (1.4995) Loss: 2.0259 (2.0222)
2025-07-19,13:39:53 | INFO | Train Epoch: 20 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.366, 3003.27/s, 187.704/s/gpu LR: 0.000142 Logit Scale: 100.000 Contrastive_loss: 0.53895 (0.52348) Fd_loss: 1.5039 (1.4997) Loss: 2.0429 (2.0232)
2025-07-19,13:42:10 | INFO | Train Epoch: 20 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.367, 3011.97/s, 188.248/s/gpu LR: 0.000141 Logit Scale: 100.000 Contrastive_loss: 0.55081 (0.52472) Fd_loss: 1.4831 (1.4990) Loss: 2.0339 (2.0237)
2025-07-19,13:44:27 | INFO | Train Epoch: 20 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.366, 3003.49/s, 187.718/s/gpu LR: 0.000140 Logit Scale: 100.000 Contrastive_loss: 0.54519 (0.52561) Fd_loss: 1.4831 (1.4983) Loss: 2.0283 (2.0239)
2025-07-19,13:46:08 | INFO | Train Epoch: 20 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.366, 3010.95/s, 188.184/s/gpu LR: 0.000139 Logit Scale: 99.991 Contrastive_loss: 0.54613 (0.52647) Fd_loss: 1.4933 (1.4981) Loss: 2.0395 (2.0245)
2025-07-19,13:46:10 | INFO | Start epoch 21
2025-07-19,13:46:20 | INFO | Train Epoch: 21 [ 4096/9319509 (0%)] Data (t): 9.215 Batch (t): 10.579, 387.168/s, 24.1980/s/gpu LR: 0.000139 Logit Scale: 99.993 Contrastive_loss: 0.42521 (0.42521) Fd_loss: 1.4952 (1.4952) Loss: 1.9204 (1.9204)
2025-07-19,13:48:37 | INFO | Train Epoch: 21 [ 413696/9319509 (4%)] Data (t): 0.000 Batch (t): 1.368, 2992.06/s, 187.004/s/gpu LR: 0.000138 Logit Scale: 100.000 Contrastive_loss: 0.41023 (0.41772) Fd_loss: 1.4940 (1.4946) Loss: 1.9042 (1.9123)
2025-07-19,13:50:54 | INFO | Train Epoch: 21 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.369, 2993.92/s, 187.120/s/gpu LR: 0.000137 Logit Scale: 100.000 Contrastive_loss: 0.42951 (0.42165) Fd_loss: 1.4894 (1.4929) Loss: 1.9189 (1.9145)
2025-07-19,13:53:11 | INFO | Train Epoch: 21 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 3003.38/s, 187.711/s/gpu LR: 0.000136 Logit Scale: 100.000 Contrastive_loss: 0.44258 (0.42688) Fd_loss: 1.4878 (1.4916) Loss: 1.9304 (1.9185)
2025-07-19,13:55:28 | INFO | Train Epoch: 21 [1642496/9319509 (18%)] Data (t): 0.000 Batch (t): 1.369, 2986.66/s, 186.666/s/gpu LR: 0.000135 Logit Scale: 100.000 Contrastive_loss: 0.43073 (0.42765) Fd_loss: 1.4789 (1.4891) Loss: 1.9096 (1.9167)
2025-07-19,13:57:45 | INFO | Train Epoch: 21 [2052096/9319509 (22%)] Data (t): 0.000 Batch (t): 1.367, 2995.29/s, 187.206/s/gpu LR: 0.000134 Logit Scale: 100.000 Contrastive_loss: 0.43934 (0.42960) Fd_loss: 1.4938 (1.4899) Loss: 1.9332 (1.9195)
2025-07-19,14:00:01 | INFO | Train Epoch: 21 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.367, 3005.32/s, 187.833/s/gpu LR: 0.000133 Logit Scale: 100.000 Contrastive_loss: 0.42959 (0.42960) Fd_loss: 1.4913 (1.4901) Loss: 1.9209 (1.9197)
2025-07-19,14:02:18 | INFO | Train Epoch: 21 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.368, 3003.68/s, 187.730/s/gpu LR: 0.000132 Logit Scale: 100.000 Contrastive_loss: 0.47264 (0.43498) Fd_loss: 1.4938 (1.4905) Loss: 1.9664 (1.9255)
2025-07-19,14:04:35 | INFO | Train Epoch: 21 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.366, 2995.57/s, 187.223/s/gpu LR: 0.000131 Logit Scale: 100.000 Contrastive_loss: 0.46468 (0.43828) Fd_loss: 1.4923 (1.4907) Loss: 1.9569 (1.9290)
2025-07-19,14:06:51 | INFO | Train Epoch: 21 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.365, 3002.91/s, 187.682/s/gpu LR: 0.000130 Logit Scale: 100.000 Contrastive_loss: 0.45761 (0.44021) Fd_loss: 1.4901 (1.4907) Loss: 1.9478 (1.9309)
2025-07-19,14:09:08 | INFO | Train Epoch: 21 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.368, 2984.22/s, 186.514/s/gpu LR: 0.000129 Logit Scale: 100.000 Contrastive_loss: 0.46782 (0.44272) Fd_loss: 1.4853 (1.4902) Loss: 1.9531 (1.9329)
2025-07-19,14:11:25 | INFO | Train Epoch: 21 [4509696/9319509 (48%)] Data (t): 0.000 Batch (t): 1.367, 3007.13/s, 187.946/s/gpu LR: 0.000128 Logit Scale: 100.000 Contrastive_loss: 0.45913 (0.44409) Fd_loss: 1.4969 (1.4907) Loss: 1.9561 (1.9348)
2025-07-19,14:13:41 | INFO | Train Epoch: 21 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.366, 2996.06/s, 187.254/s/gpu LR: 0.000127 Logit Scale: 100.000 Contrastive_loss: 0.44221 (0.44394) Fd_loss: 1.4925 (1.4909) Loss: 1.9347 (1.9348)
2025-07-19,14:15:58 | INFO | Train Epoch: 21 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.367, 2987.16/s, 186.697/s/gpu LR: 0.000126 Logit Scale: 100.000 Contrastive_loss: 0.48455 (0.44684) Fd_loss: 1.4883 (1.4907) Loss: 1.9728 (1.9375)
2025-07-19,14:18:15 | INFO | Train Epoch: 21 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 2999.41/s, 187.463/s/gpu LR: 0.000125 Logit Scale: 100.000 Contrastive_loss: 0.50275 (0.45057) Fd_loss: 1.4910 (1.4907) Loss: 1.9937 (1.9413)
2025-07-19,14:20:31 | INFO | Train Epoch: 21 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.367, 2995.53/s, 187.221/s/gpu LR: 0.000124 Logit Scale: 99.998 Contrastive_loss: 0.47449 (0.45207) Fd_loss: 1.4860 (1.4904) Loss: 1.9605 (1.9425)
2025-07-19,14:22:48 | INFO | Train Epoch: 21 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.368, 2994.94/s, 187.183/s/gpu LR: 0.000123 Logit Scale: 100.000 Contrastive_loss: 0.46785 (0.45299) Fd_loss: 1.4827 (1.4900) Loss: 1.9506 (1.9430)
2025-07-19,14:25:05 | INFO | Train Epoch: 21 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.367, 2998.73/s, 187.421/s/gpu LR: 0.000122 Logit Scale: 100.000 Contrastive_loss: 0.47706 (0.45433) Fd_loss: 1.4893 (1.4899) Loss: 1.9664 (1.9443)
2025-07-19,14:27:21 | INFO | Train Epoch: 21 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.366, 3006.80/s, 187.925/s/gpu LR: 0.000121 Logit Scale: 100.000 Contrastive_loss: 0.46360 (0.45482) Fd_loss: 1.4806 (1.4894) Loss: 1.9442 (1.9443)
2025-07-19,14:29:38 | INFO | Train Epoch: 21 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.368, 3009.12/s, 188.070/s/gpu LR: 0.000120 Logit Scale: 100.000 Contrastive_loss: 0.49259 (0.45671) Fd_loss: 1.4752 (1.4887) Loss: 1.9678 (1.9454)
2025-07-19,14:31:55 | INFO | Train Epoch: 21 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 3002.30/s, 187.643/s/gpu LR: 0.000120 Logit Scale: 100.000 Contrastive_loss: 0.46957 (0.45732) Fd_loss: 1.4829 (1.4884) Loss: 1.9525 (1.9458)
2025-07-19,14:34:12 | INFO | Train Epoch: 21 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.366, 3006.70/s, 187.919/s/gpu LR: 0.000119 Logit Scale: 100.000 Contrastive_loss: 0.48957 (0.45879) Fd_loss: 1.4897 (1.4885) Loss: 1.9793 (1.9473)
2025-07-19,14:36:28 | INFO | Train Epoch: 21 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.367, 2986.04/s, 186.627/s/gpu LR: 0.000118 Logit Scale: 100.000 Contrastive_loss: 0.50683 (0.46087) Fd_loss: 1.4924 (1.4887) Loss: 1.9993 (1.9496)
2025-07-19,14:38:10 | INFO | Train Epoch: 21 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.367, 3003.29/s, 187.705/s/gpu LR: 0.000117 Logit Scale: 100.000 Contrastive_loss: 0.49169 (0.46216) Fd_loss: 1.4720 (1.4880) Loss: 1.9637 (1.9501)
2025-07-19,14:38:11 | INFO | Starting zero-shot imagenet.
2025-07-19,14:38:11 | INFO | Building zero-shot classifier
2025-07-19,14:38:26 | INFO | Using classifier
2025-07-19,14:39:40 | INFO | Finished zero-shot imagenet.
2025-07-19,14:39:40 | INFO | Eval Epoch: 22 imagenet-zeroshot-val-top1: 0.2744 imagenet-zeroshot-val-top5: 0.5393
2025-07-19,14:39:41 | INFO | Start epoch 22
2025-07-19,14:39:46 | INFO | Train Epoch: 22 [ 4096/9319509 (0%)] Data (t): 4.184 Batch (t): 5.551, 737.822/s, 46.1139/s/gpu LR: 0.000117 Logit Scale: 100.000 Contrastive_loss: 0.38662 (0.38662) Fd_loss: 1.4838 (1.4838) Loss: 1.8704 (1.8704)
2025-07-19,14:42:03 | INFO | Train Epoch: 22 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.363, 3001.93/s, 187.621/s/gpu LR: 0.000116 Logit Scale: 100.000 Contrastive_loss: 0.38743 (0.38702) Fd_loss: 1.4723 (1.4781) Loss: 1.8598 (1.8651)
2025-07-19,14:44:20 | INFO | Train Epoch: 22 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.374, 2979.91/s, 186.245/s/gpu LR: 0.000115 Logit Scale: 100.000 Contrastive_loss: 0.39877 (0.39094) Fd_loss: 1.4757 (1.4773) Loss: 1.8745 (1.8682)
2025-07-19,14:46:37 | INFO | Train Epoch: 22 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.370, 3000.67/s, 187.542/s/gpu LR: 0.000114 Logit Scale: 100.000 Contrastive_loss: 0.39024 (0.39076) Fd_loss: 1.4667 (1.4746) Loss: 1.8570 (1.8654)
2025-07-19,14:48:54 | INFO | Train Epoch: 22 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.368, 2986.46/s, 186.654/s/gpu LR: 0.000113 Logit Scale: 100.000 Contrastive_loss: 0.40708 (0.39403) Fd_loss: 1.4770 (1.4751) Loss: 1.8841 (1.8691)
2025-07-19,14:51:11 | INFO | Train Epoch: 22 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.367, 2995.48/s, 187.217/s/gpu LR: 0.000112 Logit Scale: 100.000 Contrastive_loss: 0.39467 (0.39413) Fd_loss: 1.4818 (1.4762) Loss: 1.8765 (1.8704)
2025-07-19,14:53:27 | INFO | Train Epoch: 22 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.367, 3003.51/s, 187.719/s/gpu LR: 0.000111 Logit Scale: 100.000 Contrastive_loss: 0.39869 (0.39479) Fd_loss: 1.4715 (1.4756) Loss: 1.8702 (1.8703)
2025-07-19,14:55:44 | INFO | Train Epoch: 22 [2871296/9319509 (31%)] Data (t): 0.000 Batch (t): 1.368, 2992.85/s, 187.053/s/gpu LR: 0.000110 Logit Scale: 100.000 Contrastive_loss: 0.41154 (0.39688) Fd_loss: 1.4710 (1.4750) Loss: 1.8826 (1.8719)
2025-07-19,14:58:01 | INFO | Train Epoch: 22 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.368, 2993.21/s, 187.075/s/gpu LR: 0.000109 Logit Scale: 100.000 Contrastive_loss: 0.39398 (0.39656) Fd_loss: 1.4729 (1.4748) Loss: 1.8669 (1.8713)
2025-07-19,15:00:18 | INFO | Train Epoch: 22 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.368, 3001.48/s, 187.593/s/gpu LR: 0.000109 Logit Scale: 100.000 Contrastive_loss: 0.41253 (0.39815) Fd_loss: 1.4645 (1.4737) Loss: 1.8770 (1.8719)
2025-07-19,15:02:34 | INFO | Train Epoch: 22 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.368, 3004.73/s, 187.795/s/gpu LR: 0.000108 Logit Scale: 100.000 Contrastive_loss: 0.41232 (0.39944) Fd_loss: 1.4944 (1.4756) Loss: 1.9068 (1.8751)
2025-07-19,15:04:51 | INFO | Train Epoch: 22 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.369, 2997.43/s, 187.339/s/gpu LR: 0.000107 Logit Scale: 100.000 Contrastive_loss: 0.41152 (0.40045) Fd_loss: 1.4717 (1.4753) Loss: 1.8832 (1.8757)
2025-07-19,15:07:08 | INFO | Train Epoch: 22 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.368, 3001.53/s, 187.595/s/gpu LR: 0.000106 Logit Scale: 100.000 Contrastive_loss: 0.42862 (0.40261) Fd_loss: 1.4663 (1.4746) Loss: 1.8950 (1.8772)
2025-07-19,15:09:25 | INFO | Train Epoch: 22 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.367, 3000.84/s, 187.553/s/gpu LR: 0.000105 Logit Scale: 100.000 Contrastive_loss: 0.41423 (0.40344) Fd_loss: 1.4667 (1.4740) Loss: 1.8809 (1.8775)
2025-07-19,15:11:42 | INFO | Train Epoch: 22 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 2993.48/s, 187.093/s/gpu LR: 0.000104 Logit Scale: 100.000 Contrastive_loss: 0.42044 (0.40458) Fd_loss: 1.4812 (1.4745) Loss: 1.9016 (1.8791)
2025-07-19,15:13:58 | INFO | Train Epoch: 22 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 2991.01/s, 186.938/s/gpu LR: 0.000103 Logit Scale: 100.000 Contrastive_loss: 0.43966 (0.40677) Fd_loss: 1.4657 (1.4740) Loss: 1.9054 (1.8807)
2025-07-19,15:16:15 | INFO | Train Epoch: 22 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.368, 2995.75/s, 187.235/s/gpu LR: 0.000102 Logit Scale: 100.000 Contrastive_loss: 0.42786 (0.40801) Fd_loss: 1.4750 (1.4740) Loss: 1.9028 (1.8820)
2025-07-19,15:18:32 | INFO | Train Epoch: 22 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.368, 2998.64/s, 187.415/s/gpu LR: 0.000101 Logit Scale: 100.000 Contrastive_loss: 0.43832 (0.40969) Fd_loss: 1.4661 (1.4736) Loss: 1.9045 (1.8833)
2025-07-19,15:20:49 | INFO | Train Epoch: 22 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2987.90/s, 186.743/s/gpu LR: 0.000100 Logit Scale: 100.000 Contrastive_loss: 0.43264 (0.41090) Fd_loss: 1.4713 (1.4735) Loss: 1.9040 (1.8844)
2025-07-19,15:23:05 | INFO | Train Epoch: 22 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.368, 2994.87/s, 187.179/s/gpu LR: 0.000100 Logit Scale: 100.000 Contrastive_loss: 0.41753 (0.41123) Fd_loss: 1.4746 (1.4735) Loss: 1.8921 (1.8848)
2025-07-19,15:25:22 | INFO | Train Epoch: 22 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 2994.73/s, 187.171/s/gpu LR: 0.000099 Logit Scale: 100.000 Contrastive_loss: 0.44534 (0.41286) Fd_loss: 1.4579 (1.4728) Loss: 1.9033 (1.8856)
2025-07-19,15:27:39 | INFO | Train Epoch: 22 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 3000.20/s, 187.513/s/gpu LR: 0.000098 Logit Scale: 100.000 Contrastive_loss: 0.43749 (0.41398) Fd_loss: 1.4780 (1.4730) Loss: 1.9155 (1.8870)
2025-07-19,15:29:56 | INFO | Train Epoch: 22 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2994.74/s, 187.171/s/gpu LR: 0.000097 Logit Scale: 100.000 Contrastive_loss: 0.42497 (0.41446) Fd_loss: 1.4650 (1.4727) Loss: 1.8900 (1.8871)
2025-07-19,15:31:37 | INFO | Train Epoch: 22 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 3003.81/s, 187.738/s/gpu LR: 0.000096 Logit Scale: 100.000 Contrastive_loss: 0.42332 (0.41482) Fd_loss: 1.4680 (1.4725) Loss: 1.8913 (1.8873)
2025-07-19,15:31:39 | INFO | Start epoch 23
2025-07-19,15:31:49 | INFO | Train Epoch: 23 [ 4096/9319509 (0%)] Data (t): 8.600 Batch (t): 10.310, 397.297/s, 24.8311/s/gpu LR: 0.000096 Logit Scale: 100.000 Contrastive_loss: 0.35011 (0.35011) Fd_loss: 1.4629 (1.4629) Loss: 1.8130 (1.8130)
2025-07-19,15:34:06 | INFO | Train Epoch: 23 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.368, 3013.79/s, 188.362/s/gpu LR: 0.000095 Logit Scale: 100.000 Contrastive_loss: 0.33474 (0.34242) Fd_loss: 1.4683 (1.4656) Loss: 1.8030 (1.8080)
2025-07-19,15:36:23 | INFO | Train Epoch: 23 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.373, 2977.70/s, 186.106/s/gpu LR: 0.000095 Logit Scale: 100.000 Contrastive_loss: 0.35253 (0.34579) Fd_loss: 1.4602 (1.4638) Loss: 1.8127 (1.8096)
2025-07-19,15:38:40 | INFO | Train Epoch: 23 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 3005.71/s, 187.857/s/gpu LR: 0.000094 Logit Scale: 100.000 Contrastive_loss: 0.34900 (0.34659) Fd_loss: 1.4662 (1.4644) Loss: 1.8152 (1.8110)
2025-07-19,15:40:57 | INFO | Train Epoch: 23 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.369, 2985.03/s, 186.564/s/gpu LR: 0.000093 Logit Scale: 100.000 Contrastive_loss: 0.38321 (0.35392) Fd_loss: 1.4545 (1.4624) Loss: 1.8377 (1.8163)
2025-07-19,15:43:14 | INFO | Train Epoch: 23 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.367, 3001.84/s, 187.615/s/gpu LR: 0.000092 Logit Scale: 100.000 Contrastive_loss: 0.35704 (0.35444) Fd_loss: 1.4683 (1.4634) Loss: 1.8253 (1.8178)
2025-07-19,15:45:30 | INFO | Train Epoch: 23 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.366, 3000.18/s, 187.511/s/gpu LR: 0.000091 Logit Scale: 100.000 Contrastive_loss: 0.37896 (0.35794) Fd_loss: 1.4613 (1.4631) Loss: 1.8403 (1.8210)
2025-07-19,15:47:47 | INFO | Train Epoch: 23 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.367, 2986.01/s, 186.626/s/gpu LR: 0.000090 Logit Scale: 100.000 Contrastive_loss: 0.35025 (0.35698) Fd_loss: 1.4636 (1.4632) Loss: 1.8139 (1.8201)
2025-07-19,15:50:04 | INFO | Train Epoch: 23 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.367, 2994.54/s, 187.159/s/gpu LR: 0.000089 Logit Scale: 100.000 Contrastive_loss: 0.36592 (0.35797) Fd_loss: 1.4678 (1.4637) Loss: 1.8337 (1.8217)
2025-07-19,15:52:21 | INFO | Train Epoch: 23 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2999.95/s, 187.497/s/gpu LR: 0.000089 Logit Scale: 100.000 Contrastive_loss: 0.38663 (0.36084) Fd_loss: 1.4559 (1.4629) Loss: 1.8425 (1.8237)
2025-07-19,15:54:37 | INFO | Train Epoch: 23 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 2980.15/s, 186.260/s/gpu LR: 0.000088 Logit Scale: 100.000 Contrastive_loss: 0.36485 (0.36120) Fd_loss: 1.4672 (1.4633) Loss: 1.8320 (1.8245)
2025-07-19,15:56:54 | INFO | Train Epoch: 23 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.366, 2993.16/s, 187.072/s/gpu LR: 0.000087 Logit Scale: 100.000 Contrastive_loss: 0.37775 (0.36258) Fd_loss: 1.4534 (1.4625) Loss: 1.8312 (1.8251)
2025-07-19,15:59:11 | INFO | Train Epoch: 23 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2997.22/s, 187.326/s/gpu LR: 0.000086 Logit Scale: 100.000 Contrastive_loss: 0.36351 (0.36265) Fd_loss: 1.4575 (1.4621) Loss: 1.8210 (1.8247)
2025-07-19,16:01:27 | INFO | Train Epoch: 23 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.368, 2997.48/s, 187.342/s/gpu LR: 0.000085 Logit Scale: 100.000 Contrastive_loss: 0.38314 (0.36412) Fd_loss: 1.4563 (1.4617) Loss: 1.8395 (1.8258)
2025-07-19,16:03:44 | INFO | Train Epoch: 23 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 3000.12/s, 187.508/s/gpu LR: 0.000084 Logit Scale: 100.000 Contrastive_loss: 0.37413 (0.36478) Fd_loss: 1.4657 (1.4619) Loss: 1.8398 (1.8267)
2025-07-19,16:06:01 | INFO | Train Epoch: 23 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.366, 2999.30/s, 187.456/s/gpu LR: 0.000084 Logit Scale: 100.000 Contrastive_loss: 0.35258 (0.36402) Fd_loss: 1.4556 (1.4616) Loss: 1.8082 (1.8256)
2025-07-19,16:08:18 | INFO | Train Epoch: 23 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.368, 3003.37/s, 187.711/s/gpu LR: 0.000083 Logit Scale: 100.000 Contrastive_loss: 0.36474 (0.36406) Fd_loss: 1.4557 (1.4612) Loss: 1.8204 (1.8253)
2025-07-19,16:10:34 | INFO | Train Epoch: 23 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.369, 3004.42/s, 187.777/s/gpu LR: 0.000082 Logit Scale: 100.000 Contrastive_loss: 0.38484 (0.36522) Fd_loss: 1.4545 (1.4608) Loss: 1.8393 (1.8260)
2025-07-19,16:12:51 | INFO | Train Epoch: 23 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.369, 2981.63/s, 186.352/s/gpu LR: 0.000081 Logit Scale: 100.000 Contrastive_loss: 0.36130 (0.36501) Fd_loss: 1.4629 (1.4609) Loss: 1.8242 (1.8260)
2025-07-19,16:15:08 | INFO | Train Epoch: 23 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 2999.15/s, 187.447/s/gpu LR: 0.000080 Logit Scale: 100.000 Contrastive_loss: 0.35702 (0.36461) Fd_loss: 1.4559 (1.4607) Loss: 1.8129 (1.8253)
2025-07-19,16:17:25 | INFO | Train Epoch: 23 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2999.21/s, 187.451/s/gpu LR: 0.000079 Logit Scale: 100.000 Contrastive_loss: 0.38930 (0.36579) Fd_loss: 1.4571 (1.4605) Loss: 1.8464 (1.8263)
2025-07-19,16:19:42 | INFO | Train Epoch: 23 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.368, 2987.33/s, 186.708/s/gpu LR: 0.000079 Logit Scale: 100.000 Contrastive_loss: 0.41551 (0.36805) Fd_loss: 1.4558 (1.4603) Loss: 1.8713 (1.8284)
2025-07-19,16:21:58 | INFO | Train Epoch: 23 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2989.77/s, 186.861/s/gpu LR: 0.000078 Logit Scale: 100.000 Contrastive_loss: 0.35975 (0.36769) Fd_loss: 1.4545 (1.4601) Loss: 1.8143 (1.8277)
2025-07-19,16:23:40 | INFO | Train Epoch: 23 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.370, 3002.12/s, 187.632/s/gpu LR: 0.000077 Logit Scale: 100.000 Contrastive_loss: 0.39868 (0.36898) Fd_loss: 1.4517 (1.4597) Loss: 1.8504 (1.8287)
2025-07-19,16:23:41 | INFO | Starting zero-shot imagenet.
2025-07-19,16:23:41 | INFO | Building zero-shot classifier
2025-07-19,16:23:56 | INFO | Using classifier
2025-07-19,16:25:09 | INFO | Finished zero-shot imagenet.
2025-07-19,16:25:09 | INFO | Eval Epoch: 24 imagenet-zeroshot-val-top1: 0.2854 imagenet-zeroshot-val-top5: 0.5493
2025-07-19,16:25:09 | INFO | Start epoch 24
2025-07-19,16:25:15 | INFO | Train Epoch: 24 [ 4096/9319509 (0%)] Data (t): 4.576 Batch (t): 5.929, 690.892/s, 43.1808/s/gpu LR: 0.000077 Logit Scale: 100.000 Contrastive_loss: 0.28608 (0.28608) Fd_loss: 1.4564 (1.4564) Loss: 1.7425 (1.7425)
2025-07-19,16:27:31 | INFO | Train Epoch: 24 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.363, 3011.80/s, 188.238/s/gpu LR: 0.000076 Logit Scale: 100.000 Contrastive_loss: 0.28572 (0.28590) Fd_loss: 1.4615 (1.4590) Loss: 1.7473 (1.7449)
2025-07-19,16:29:49 | INFO | Train Epoch: 24 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.373, 3000.48/s, 187.530/s/gpu LR: 0.000076 Logit Scale: 100.000 Contrastive_loss: 0.30872 (0.29351) Fd_loss: 1.4533 (1.4571) Loss: 1.7621 (1.7506)
2025-07-19,16:32:06 | INFO | Train Epoch: 24 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 2992.51/s, 187.032/s/gpu LR: 0.000075 Logit Scale: 100.000 Contrastive_loss: 0.34015 (0.30517) Fd_loss: 1.4518 (1.4558) Loss: 1.7920 (1.7609)
2025-07-19,16:34:22 | INFO | Train Epoch: 24 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.368, 3002.88/s, 187.680/s/gpu LR: 0.000074 Logit Scale: 100.000 Contrastive_loss: 0.32398 (0.30893) Fd_loss: 1.4566 (1.4559) Loss: 1.7806 (1.7649)
2025-07-19,16:36:39 | INFO | Train Epoch: 24 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2997.46/s, 187.341/s/gpu LR: 0.000073 Logit Scale: 100.000 Contrastive_loss: 0.29993 (0.30743) Fd_loss: 1.4560 (1.4559) Loss: 1.7559 (1.7634)
2025-07-19,16:38:56 | INFO | Train Epoch: 24 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.366, 2999.41/s, 187.463/s/gpu LR: 0.000072 Logit Scale: 100.000 Contrastive_loss: 0.31578 (0.30862) Fd_loss: 1.4448 (1.4544) Loss: 1.7605 (1.7630)
2025-07-19,16:41:13 | INFO | Train Epoch: 24 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.366, 2996.44/s, 187.278/s/gpu LR: 0.000072 Logit Scale: 100.000 Contrastive_loss: 0.30826 (0.30858) Fd_loss: 1.4547 (1.4544) Loss: 1.7630 (1.7630)
2025-07-19,16:43:29 | INFO | Train Epoch: 24 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.368, 2998.82/s, 187.426/s/gpu LR: 0.000071 Logit Scale: 100.000 Contrastive_loss: 0.30903 (0.30863) Fd_loss: 1.4466 (1.4535) Loss: 1.7557 (1.7622)
2025-07-19,16:45:46 | INFO | Train Epoch: 24 [3690496/9319509 (40%)] Data (t): 0.000 Batch (t): 1.366, 3005.48/s, 187.843/s/gpu LR: 0.000070 Logit Scale: 100.000 Contrastive_loss: 0.32928 (0.31069) Fd_loss: 1.4493 (1.4531) Loss: 1.7786 (1.7638)
2025-07-19,16:48:03 | INFO | Train Epoch: 24 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 3005.01/s, 187.813/s/gpu LR: 0.000069 Logit Scale: 100.000 Contrastive_loss: 0.31736 (0.31130) Fd_loss: 1.4547 (1.4533) Loss: 1.7720 (1.7646)
2025-07-19,16:50:19 | INFO | Train Epoch: 24 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.368, 2983.50/s, 186.469/s/gpu LR: 0.000069 Logit Scale: 100.000 Contrastive_loss: 0.31629 (0.31172) Fd_loss: 1.4500 (1.4530) Loss: 1.7663 (1.7647)
2025-07-19,16:52:36 | INFO | Train Epoch: 24 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2989.86/s, 186.866/s/gpu LR: 0.000068 Logit Scale: 100.000 Contrastive_loss: 0.34526 (0.31430) Fd_loss: 1.4528 (1.4530) Loss: 1.7980 (1.7673)
2025-07-19,16:54:53 | INFO | Train Epoch: 24 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.368, 2999.55/s, 187.472/s/gpu LR: 0.000067 Logit Scale: 100.000 Contrastive_loss: 0.32124 (0.31479) Fd_loss: 1.4444 (1.4524) Loss: 1.7656 (1.7671)
2025-07-19,16:57:10 | INFO | Train Epoch: 24 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 2996.24/s, 187.265/s/gpu LR: 0.000066 Logit Scale: 100.000 Contrastive_loss: 0.29236 (0.31330) Fd_loss: 1.4598 (1.4529) Loss: 1.7522 (1.7662)
2025-07-19,16:59:26 | INFO | Train Epoch: 24 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.367, 2978.41/s, 186.150/s/gpu LR: 0.000066 Logit Scale: 100.000 Contrastive_loss: 0.36128 (0.31630) Fd_loss: 1.4517 (1.4528) Loss: 1.8130 (1.7691)
2025-07-19,17:01:43 | INFO | Train Epoch: 24 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.368, 2994.84/s, 187.178/s/gpu LR: 0.000065 Logit Scale: 100.000 Contrastive_loss: 0.31880 (0.31644) Fd_loss: 1.4474 (1.4525) Loss: 1.7662 (1.7689)
2025-07-19,17:04:00 | INFO | Train Epoch: 24 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.367, 2977.50/s, 186.094/s/gpu LR: 0.000064 Logit Scale: 100.000 Contrastive_loss: 0.32934 (0.31716) Fd_loss: 1.4594 (1.4529) Loss: 1.7888 (1.7700)
2025-07-19,17:06:16 | INFO | Train Epoch: 24 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.366, 2999.84/s, 187.490/s/gpu LR: 0.000063 Logit Scale: 100.000 Contrastive_loss: 0.32568 (0.31761) Fd_loss: 1.4589 (1.4532) Loss: 1.7846 (1.7708)
2025-07-19,17:08:33 | INFO | Train Epoch: 24 [7786496/9319509 (84%)] Data (t): 0.000 Batch (t): 1.368, 2998.67/s, 187.417/s/gpu LR: 0.000063 Logit Scale: 100.000 Contrastive_loss: 0.33043 (0.31825) Fd_loss: 1.4367 (1.4524) Loss: 1.7672 (1.7706)
2025-07-19,17:10:50 | INFO | Train Epoch: 24 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.367, 3002.17/s, 187.635/s/gpu LR: 0.000062 Logit Scale: 100.000 Contrastive_loss: 0.33078 (0.31885) Fd_loss: 1.4396 (1.4517) Loss: 1.7704 (1.7706)
2025-07-19,17:13:07 | INFO | Train Epoch: 24 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 3004.20/s, 187.762/s/gpu LR: 0.000061 Logit Scale: 100.000 Contrastive_loss: 0.33970 (0.31979) Fd_loss: 1.4482 (1.4516) Loss: 1.7879 (1.7714)
2025-07-19,17:15:23 | INFO | Train Epoch: 24 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.367, 2996.00/s, 187.250/s/gpu LR: 0.000060 Logit Scale: 100.000 Contrastive_loss: 0.34870 (0.32105) Fd_loss: 1.4434 (1.4512) Loss: 1.7921 (1.7723)
2025-07-19,17:17:05 | INFO | Train Epoch: 24 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.369, 3000.34/s, 187.521/s/gpu LR: 0.000060 Logit Scale: 100.000 Contrastive_loss: 0.32376 (0.32116) Fd_loss: 1.4378 (1.4507) Loss: 1.7616 (1.7718)
2025-07-19,17:17:07 | INFO | Start epoch 25
2025-07-19,17:17:16 | INFO | Train Epoch: 25 [ 4096/9319509 (0%)] Data (t): 8.059 Batch (t): 9.615, 425.998/s, 26.6249/s/gpu LR: 0.000060 Logit Scale: 100.000 Contrastive_loss: 0.25536 (0.25536) Fd_loss: 1.4388 (1.4388) Loss: 1.6941 (1.6941)
2025-07-19,17:19:33 | INFO | Train Epoch: 25 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.368, 3001.91/s, 187.619/s/gpu LR: 0.000059 Logit Scale: 100.000 Contrastive_loss: 0.27605 (0.26571) Fd_loss: 1.4383 (1.4385) Loss: 1.7144 (1.7042)
2025-07-19,17:21:50 | INFO | Train Epoch: 25 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.369, 2987.51/s, 186.719/s/gpu LR: 0.000058 Logit Scale: 100.000 Contrastive_loss: 0.25840 (0.26327) Fd_loss: 1.4463 (1.4411) Loss: 1.7047 (1.7044)
2025-07-19,17:24:07 | INFO | Train Epoch: 25 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.370, 2990.26/s, 186.891/s/gpu LR: 0.000058 Logit Scale: 100.000 Contrastive_loss: 0.23418 (0.25600) Fd_loss: 1.4583 (1.4454) Loss: 1.6925 (1.7014)
2025-07-19,17:26:24 | INFO | Train Epoch: 25 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.370, 2996.96/s, 187.310/s/gpu LR: 0.000057 Logit Scale: 100.000 Contrastive_loss: 0.26687 (0.25817) Fd_loss: 1.4446 (1.4453) Loss: 1.7115 (1.7034)
2025-07-19,17:28:41 | INFO | Train Epoch: 25 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2977.89/s, 186.118/s/gpu LR: 0.000056 Logit Scale: 100.000 Contrastive_loss: 0.27580 (0.26111) Fd_loss: 1.4361 (1.4437) Loss: 1.7119 (1.7048)
2025-07-19,17:30:58 | INFO | Train Epoch: 25 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.367, 3000.73/s, 187.546/s/gpu LR: 0.000056 Logit Scale: 100.000 Contrastive_loss: 0.25539 (0.26029) Fd_loss: 1.4312 (1.4419) Loss: 1.6866 (1.7022)
2025-07-19,17:33:14 | INFO | Train Epoch: 25 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.368, 3002.60/s, 187.662/s/gpu LR: 0.000055 Logit Scale: 100.000 Contrastive_loss: 0.26839 (0.26131) Fd_loss: 1.4270 (1.4401) Loss: 1.6954 (1.7014)
2025-07-19,17:35:31 | INFO | Train Epoch: 25 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.368, 2997.47/s, 187.342/s/gpu LR: 0.000054 Logit Scale: 100.000 Contrastive_loss: 0.26465 (0.26168) Fd_loss: 1.4453 (1.4407) Loss: 1.7099 (1.7023)
2025-07-19,17:37:48 | INFO | Train Epoch: 25 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.367, 2984.74/s, 186.546/s/gpu LR: 0.000054 Logit Scale: 100.000 Contrastive_loss: 0.26274 (0.26178) Fd_loss: 1.4331 (1.4399) Loss: 1.6959 (1.7017)
2025-07-19,17:40:05 | INFO | Train Epoch: 25 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.366, 2996.57/s, 187.286/s/gpu LR: 0.000053 Logit Scale: 100.000 Contrastive_loss: 0.28378 (0.26378) Fd_loss: 1.4414 (1.4400) Loss: 1.7251 (1.7038)
2025-07-19,17:42:21 | INFO | Train Epoch: 25 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.369, 2991.22/s, 186.952/s/gpu LR: 0.000052 Logit Scale: 100.000 Contrastive_loss: 0.28762 (0.26577) Fd_loss: 1.4375 (1.4398) Loss: 1.7252 (1.7056)
2025-07-19,17:44:38 | INFO | Train Epoch: 25 [4919296/9319509 (53%)] Data (t): 0.000 Batch (t): 1.368, 2998.24/s, 187.390/s/gpu LR: 0.000051 Logit Scale: 100.000 Contrastive_loss: 0.27821 (0.26673) Fd_loss: 1.4406 (1.4399) Loss: 1.7188 (1.7066)
2025-07-19,17:46:55 | INFO | Train Epoch: 25 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.367, 2990.21/s, 186.888/s/gpu LR: 0.000051 Logit Scale: 100.000 Contrastive_loss: 0.30689 (0.26960) Fd_loss: 1.4320 (1.4393) Loss: 1.7388 (1.7089)
2025-07-19,17:49:12 | INFO | Train Epoch: 25 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 2992.76/s, 187.048/s/gpu LR: 0.000050 Logit Scale: 100.000 Contrastive_loss: 0.30551 (0.27199) Fd_loss: 1.4295 (1.4387) Loss: 1.7350 (1.7107)
2025-07-19,17:51:29 | INFO | Train Epoch: 25 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.369, 2988.25/s, 186.766/s/gpu LR: 0.000049 Logit Scale: 100.000 Contrastive_loss: 0.27413 (0.27212) Fd_loss: 1.4306 (1.4382) Loss: 1.7047 (1.7103)
2025-07-19,17:53:45 | INFO | Train Epoch: 25 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.368, 3002.13/s, 187.633/s/gpu LR: 0.000049 Logit Scale: 100.000 Contrastive_loss: 0.29408 (0.27342) Fd_loss: 1.4399 (1.4383) Loss: 1.7339 (1.7117)
2025-07-19,17:56:02 | INFO | Train Epoch: 25 [6967296/9319509 (75%)] Data (t): 0.000 Batch (t): 1.367, 2977.33/s, 186.083/s/gpu LR: 0.000048 Logit Scale: 100.000 Contrastive_loss: 0.29519 (0.27463) Fd_loss: 1.4388 (1.4383) Loss: 1.7340 (1.7129)
2025-07-19,17:58:19 | INFO | Train Epoch: 25 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2995.01/s, 187.188/s/gpu LR: 0.000048 Logit Scale: 100.000 Contrastive_loss: 0.29392 (0.27564) Fd_loss: 1.4424 (1.4385) Loss: 1.7364 (1.7141)
2025-07-19,18:00:36 | INFO | Train Epoch: 25 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 2994.17/s, 187.135/s/gpu LR: 0.000047 Logit Scale: 100.000 Contrastive_loss: 0.29192 (0.27645) Fd_loss: 1.4321 (1.4382) Loss: 1.7240 (1.7146)
2025-07-19,18:02:52 | INFO | Train Epoch: 25 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2987.95/s, 186.747/s/gpu LR: 0.000046 Logit Scale: 100.000 Contrastive_loss: 0.27568 (0.27642) Fd_loss: 1.4329 (1.4379) Loss: 1.7086 (1.7144)
2025-07-19,18:05:09 | INFO | Train Epoch: 25 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 2999.15/s, 187.447/s/gpu LR: 0.000046 Logit Scale: 100.000 Contrastive_loss: 0.30621 (0.27777) Fd_loss: 1.4260 (1.4374) Loss: 1.7322 (1.7152)
2025-07-19,18:07:26 | INFO | Train Epoch: 25 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.369, 2994.64/s, 187.165/s/gpu LR: 0.000045 Logit Scale: 100.000 Contrastive_loss: 0.29644 (0.27858) Fd_loss: 1.4320 (1.4372) Loss: 1.7284 (1.7157)
2025-07-19,18:09:07 | INFO | Train Epoch: 25 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.370, 3012.29/s, 188.268/s/gpu LR: 0.000044 Logit Scale: 100.000 Contrastive_loss: 0.28918 (0.27902) Fd_loss: 1.4364 (1.4371) Loss: 1.7256 (1.7162)
2025-07-19,18:09:09 | INFO | Starting zero-shot imagenet.
2025-07-19,18:09:09 | INFO | Building zero-shot classifier
2025-07-19,18:09:26 | INFO | Using classifier
2025-07-19,18:10:49 | INFO | Finished zero-shot imagenet.
2025-07-19,18:10:49 | INFO | Eval Epoch: 26 imagenet-zeroshot-val-top1: 0.2905 imagenet-zeroshot-val-top5: 0.5597
2025-07-19,18:10:50 | INFO | Start epoch 26
2025-07-19,18:10:56 | INFO | Train Epoch: 26 [ 4096/9319509 (0%)] Data (t): 4.782 Batch (t): 6.130, 668.176/s, 41.7610/s/gpu LR: 0.000044 Logit Scale: 100.000 Contrastive_loss: 0.21445 (0.21445) Fd_loss: 1.4372 (1.4372) Loss: 1.6517 (1.6517)
2025-07-19,18:13:12 | INFO | Train Epoch: 26 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.362, 2991.57/s, 186.973/s/gpu LR: 0.000044 Logit Scale: 100.000 Contrastive_loss: 0.20757 (0.21101) Fd_loss: 1.4370 (1.4371) Loss: 1.6446 (1.6481)
2025-07-19,18:15:29 | INFO | Train Epoch: 26 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.371, 2997.26/s, 187.329/s/gpu LR: 0.000043 Logit Scale: 100.000 Contrastive_loss: 0.23981 (0.22061) Fd_loss: 1.4248 (1.4330) Loss: 1.6646 (1.6536)
2025-07-19,18:17:46 | INFO | Train Epoch: 26 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.370, 2995.18/s, 187.199/s/gpu LR: 0.000043 Logit Scale: 100.000 Contrastive_loss: 0.25644 (0.22957) Fd_loss: 1.4285 (1.4319) Loss: 1.6849 (1.6614)
2025-07-19,18:20:03 | INFO | Train Epoch: 26 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.368, 2989.97/s, 186.873/s/gpu LR: 0.000042 Logit Scale: 100.000 Contrastive_loss: 0.25360 (0.23438) Fd_loss: 1.4230 (1.4301) Loss: 1.6766 (1.6645)
2025-07-19,18:22:20 | INFO | Train Epoch: 26 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.368, 2971.04/s, 185.690/s/gpu LR: 0.000041 Logit Scale: 100.000 Contrastive_loss: 0.22538 (0.23288) Fd_loss: 1.4290 (1.4299) Loss: 1.6544 (1.6628)
2025-07-19,18:24:36 | INFO | Train Epoch: 26 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.367, 3001.00/s, 187.563/s/gpu LR: 0.000041 Logit Scale: 100.000 Contrastive_loss: 0.25173 (0.23557) Fd_loss: 1.4210 (1.4286) Loss: 1.6727 (1.6642)
2025-07-19,18:26:53 | INFO | Train Epoch: 26 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.367, 2996.52/s, 187.282/s/gpu LR: 0.000040 Logit Scale: 100.000 Contrastive_loss: 0.22885 (0.23473) Fd_loss: 1.4214 (1.4277) Loss: 1.6503 (1.6625)
2025-07-19,18:29:10 | INFO | Train Epoch: 26 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.367, 2990.26/s, 186.892/s/gpu LR: 0.000040 Logit Scale: 100.000 Contrastive_loss: 0.24598 (0.23598) Fd_loss: 1.4333 (1.4284) Loss: 1.6793 (1.6643)
2025-07-19,18:31:27 | INFO | Train Epoch: 26 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.367, 2997.34/s, 187.334/s/gpu LR: 0.000039 Logit Scale: 100.000 Contrastive_loss: 0.22950 (0.23533) Fd_loss: 1.4392 (1.4294) Loss: 1.6687 (1.6648)
2025-07-19,18:33:43 | INFO | Train Epoch: 26 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.367, 2996.28/s, 187.267/s/gpu LR: 0.000038 Logit Scale: 100.000 Contrastive_loss: 0.24796 (0.23648) Fd_loss: 1.4253 (1.4291) Loss: 1.6733 (1.6655)
2025-07-19,18:36:00 | INFO | Train Epoch: 26 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.368, 2989.14/s, 186.821/s/gpu LR: 0.000038 Logit Scale: 100.000 Contrastive_loss: 0.23526 (0.23638) Fd_loss: 1.4322 (1.4293) Loss: 1.6675 (1.6657)
2025-07-19,18:38:17 | INFO | Train Epoch: 26 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.367, 2982.76/s, 186.422/s/gpu LR: 0.000037 Logit Scale: 100.000 Contrastive_loss: 0.27865 (0.23963) Fd_loss: 1.4242 (1.4289) Loss: 1.7028 (1.6686)
2025-07-19,18:40:34 | INFO | Train Epoch: 26 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.367, 3013.15/s, 188.322/s/gpu LR: 0.000037 Logit Scale: 100.000 Contrastive_loss: 0.25389 (0.24065) Fd_loss: 1.4215 (1.4284) Loss: 1.6754 (1.6690)
2025-07-19,18:42:50 | INFO | Train Epoch: 26 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.367, 3004.89/s, 187.806/s/gpu LR: 0.000036 Logit Scale: 100.000 Contrastive_loss: 0.23927 (0.24056) Fd_loss: 1.4235 (1.4281) Loss: 1.6628 (1.6686)
2025-07-19,18:45:07 | INFO | Train Epoch: 26 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 2999.71/s, 187.482/s/gpu LR: 0.000035 Logit Scale: 100.000 Contrastive_loss: 0.23889 (0.24045) Fd_loss: 1.4268 (1.4280) Loss: 1.6657 (1.6684)
2025-07-19,18:47:24 | INFO | Train Epoch: 26 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.368, 3005.78/s, 187.862/s/gpu LR: 0.000035 Logit Scale: 100.000 Contrastive_loss: 0.25049 (0.24104) Fd_loss: 1.4255 (1.4278) Loss: 1.6760 (1.6689)
2025-07-19,18:49:41 | INFO | Train Epoch: 26 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.368, 2996.26/s, 187.266/s/gpu LR: 0.000034 Logit Scale: 100.000 Contrastive_loss: 0.23628 (0.24078) Fd_loss: 1.4281 (1.4279) Loss: 1.6644 (1.6686)
2025-07-19,18:51:57 | INFO | Train Epoch: 26 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.368, 2992.30/s, 187.019/s/gpu LR: 0.000034 Logit Scale: 100.000 Contrastive_loss: 0.25090 (0.24131) Fd_loss: 1.4250 (1.4277) Loss: 1.6759 (1.6690)
2025-07-19,18:54:14 | INFO | Train Epoch: 26 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.367, 3004.38/s, 187.774/s/gpu LR: 0.000033 Logit Scale: 100.000 Contrastive_loss: 0.25529 (0.24201) Fd_loss: 1.4235 (1.4275) Loss: 1.6787 (1.6695)
2025-07-19,18:56:31 | INFO | Train Epoch: 26 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.367, 2977.92/s, 186.120/s/gpu LR: 0.000033 Logit Scale: 100.000 Contrastive_loss: 0.24317 (0.24207) Fd_loss: 1.4265 (1.4275) Loss: 1.6697 (1.6695)
2025-07-19,18:58:48 | INFO | Train Epoch: 26 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.368, 2999.57/s, 187.473/s/gpu LR: 0.000032 Logit Scale: 100.000 Contrastive_loss: 0.26139 (0.24294) Fd_loss: 1.4167 (1.4270) Loss: 1.6780 (1.6699)
2025-07-19,19:01:04 | INFO | Train Epoch: 26 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.368, 2987.46/s, 186.716/s/gpu LR: 0.000032 Logit Scale: 100.000 Contrastive_loss: 0.24685 (0.24311) Fd_loss: 1.4253 (1.4269) Loss: 1.6722 (1.6700)
2025-07-19,19:02:46 | INFO | Train Epoch: 26 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.369, 3000.39/s, 187.525/s/gpu LR: 0.000031 Logit Scale: 100.000 Contrastive_loss: 0.24356 (0.24313) Fd_loss: 1.4275 (1.4269) Loss: 1.6710 (1.6700)
2025-07-19,19:02:48 | INFO | Start epoch 27
2025-07-19,19:02:58 | INFO | Train Epoch: 27 [ 4096/9319509 (0%)] Data (t): 9.097 Batch (t): 10.805, 379.084/s, 23.6928/s/gpu LR: 0.000031 Logit Scale: 100.000 Contrastive_loss: 0.21229 (0.21229) Fd_loss: 1.4215 (1.4215) Loss: 1.6338 (1.6338)
2025-07-19,19:05:15 | INFO | Train Epoch: 27 [ 413696/9319509 (4%)] Data (t): 0.000 Batch (t): 1.366, 2995.14/s, 187.196/s/gpu LR: 0.000031 Logit Scale: 100.000 Contrastive_loss: 0.22587 (0.21908) Fd_loss: 1.4150 (1.4183) Loss: 1.6409 (1.6374)
2025-07-19,19:07:32 | INFO | Train Epoch: 27 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.370, 2994.53/s, 187.158/s/gpu LR: 0.000030 Logit Scale: 100.000 Contrastive_loss: 0.21912 (0.21909) Fd_loss: 1.4191 (1.4186) Loss: 1.6383 (1.6377)
2025-07-19,19:09:49 | INFO | Train Epoch: 27 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.374, 3002.03/s, 187.627/s/gpu LR: 0.000030 Logit Scale: 100.000 Contrastive_loss: 0.20111 (0.21460) Fd_loss: 1.4235 (1.4198) Loss: 1.6246 (1.6344)
2025-07-19,19:12:06 | INFO | Train Epoch: 27 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.371, 2992.72/s, 187.045/s/gpu LR: 0.000029 Logit Scale: 100.000 Contrastive_loss: 0.22107 (0.21589) Fd_loss: 1.4185 (1.4195) Loss: 1.6395 (1.6354)
2025-07-19,19:14:23 | INFO | Train Epoch: 27 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.369, 2980.60/s, 186.287/s/gpu LR: 0.000029 Logit Scale: 100.000 Contrastive_loss: 0.20525 (0.21412) Fd_loss: 1.4243 (1.4203) Loss: 1.6295 (1.6344)
2025-07-19,19:16:40 | INFO | Train Epoch: 27 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.369, 2991.03/s, 186.939/s/gpu LR: 0.000028 Logit Scale: 100.000 Contrastive_loss: 0.22422 (0.21556) Fd_loss: 1.4178 (1.4200) Loss: 1.6420 (1.6355)
2025-07-19,19:18:57 | INFO | Train Epoch: 27 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.367, 3005.09/s, 187.818/s/gpu LR: 0.000028 Logit Scale: 100.000 Contrastive_loss: 0.20299 (0.21399) Fd_loss: 1.4306 (1.4213) Loss: 1.6336 (1.6353)
2025-07-19,19:21:14 | INFO | Train Epoch: 27 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.368, 2992.49/s, 187.031/s/gpu LR: 0.000027 Logit Scale: 100.000 Contrastive_loss: 0.21434 (0.21403) Fd_loss: 1.4242 (1.4216) Loss: 1.6385 (1.6356)
2025-07-19,19:23:31 | INFO | Train Epoch: 27 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2992.49/s, 187.031/s/gpu LR: 0.000027 Logit Scale: 100.000 Contrastive_loss: 0.21862 (0.21449) Fd_loss: 1.4127 (1.4207) Loss: 1.6313 (1.6352)
2025-07-19,19:25:47 | INFO | Train Epoch: 27 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.368, 2990.23/s, 186.889/s/gpu LR: 0.000026 Logit Scale: 100.000 Contrastive_loss: 0.22416 (0.21537) Fd_loss: 1.4214 (1.4208) Loss: 1.6456 (1.6362)
2025-07-19,19:28:04 | INFO | Train Epoch: 27 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.369, 2978.07/s, 186.129/s/gpu LR: 0.000026 Logit Scale: 100.000 Contrastive_loss: 0.22021 (0.21577) Fd_loss: 1.4208 (1.4208) Loss: 1.6410 (1.6366)
2025-07-19,19:30:21 | INFO | Train Epoch: 27 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.369, 2967.84/s, 185.490/s/gpu LR: 0.000025 Logit Scale: 100.000 Contrastive_loss: 0.23305 (0.21710) Fd_loss: 1.4192 (1.4207) Loss: 1.6523 (1.6378)
2025-07-19,19:32:38 | INFO | Train Epoch: 27 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.368, 3000.62/s, 187.539/s/gpu LR: 0.000025 Logit Scale: 100.000 Contrastive_loss: 0.20796 (0.21645) Fd_loss: 1.4123 (1.4201) Loss: 1.6203 (1.6365)
2025-07-19,19:34:55 | INFO | Train Epoch: 27 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.368, 3005.81/s, 187.863/s/gpu LR: 0.000024 Logit Scale: 100.000 Contrastive_loss: 0.21104 (0.21609) Fd_loss: 1.4118 (1.4195) Loss: 1.6229 (1.6356)
2025-07-19,19:37:11 | INFO | Train Epoch: 27 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.368, 2995.93/s, 187.246/s/gpu LR: 0.000024 Logit Scale: 100.000 Contrastive_loss: 0.20652 (0.21549) Fd_loss: 1.4187 (1.4195) Loss: 1.6252 (1.6350)
2025-07-19,19:39:28 | INFO | Train Epoch: 27 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.369, 2991.01/s, 186.938/s/gpu LR: 0.000023 Logit Scale: 100.000 Contrastive_loss: 0.22933 (0.21630) Fd_loss: 1.4100 (1.4189) Loss: 1.6393 (1.6352)
2025-07-19,19:41:45 | INFO | Train Epoch: 27 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.367, 2988.23/s, 186.765/s/gpu LR: 0.000023 Logit Scale: 100.000 Contrastive_loss: 0.22152 (0.21659) Fd_loss: 1.4169 (1.4188) Loss: 1.6384 (1.6354)
2025-07-19,19:44:02 | INFO | Train Epoch: 27 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.367, 2987.17/s, 186.698/s/gpu LR: 0.000022 Logit Scale: 100.000 Contrastive_loss: 0.21803 (0.21667) Fd_loss: 1.4113 (1.4184) Loss: 1.6294 (1.6351)
2025-07-19,19:46:19 | INFO | Train Epoch: 27 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.370, 2989.22/s, 186.826/s/gpu LR: 0.000022 Logit Scale: 100.000 Contrastive_loss: 0.22939 (0.21730) Fd_loss: 1.4188 (1.4184) Loss: 1.6482 (1.6357)
2025-07-19,19:48:36 | INFO | Train Epoch: 27 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.368, 2991.49/s, 186.968/s/gpu LR: 0.000021 Logit Scale: 100.000 Contrastive_loss: 0.24826 (0.21878) Fd_loss: 1.4122 (1.4181) Loss: 1.6605 (1.6369)
2025-07-19,19:50:52 | INFO | Train Epoch: 27 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.367, 2990.46/s, 186.904/s/gpu LR: 0.000021 Logit Scale: 100.000 Contrastive_loss: 0.21062 (0.21841) Fd_loss: 1.4254 (1.4185) Loss: 1.6361 (1.6369)
2025-07-19,19:53:09 | INFO | Train Epoch: 27 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.369, 2993.82/s, 187.113/s/gpu LR: 0.000020 Logit Scale: 100.000 Contrastive_loss: 0.22986 (0.21890) Fd_loss: 1.4030 (1.4178) Loss: 1.6328 (1.6367)
2025-07-19,19:54:51 | INFO | Train Epoch: 27 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.370, 3004.02/s, 187.751/s/gpu LR: 0.000020 Logit Scale: 100.000 Contrastive_loss: 0.23054 (0.21939) Fd_loss: 1.4202 (1.4179) Loss: 1.6508 (1.6373)
2025-07-19,19:54:52 | INFO | Starting zero-shot imagenet.
2025-07-19,19:54:52 | INFO | Building zero-shot classifier
2025-07-19,19:55:07 | INFO | Using classifier
2025-07-19,19:56:32 | INFO | Finished zero-shot imagenet.
2025-07-19,19:56:32 | INFO | Eval Epoch: 28 imagenet-zeroshot-val-top1: 0.2944 imagenet-zeroshot-val-top5: 0.5631
2025-07-19,19:56:33 | INFO | Start epoch 28
2025-07-19,19:56:38 | INFO | Train Epoch: 28 [ 4096/9319509 (0%)] Data (t): 4.416 Batch (t): 5.762, 710.859/s, 44.4287/s/gpu LR: 0.000020 Logit Scale: 100.000 Contrastive_loss: 0.20177 (0.20177) Fd_loss: 1.4270 (1.4270) Loss: 1.6287 (1.6287)
2025-07-19,19:58:54 | INFO | Train Epoch: 28 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.362, 3008.38/s, 188.024/s/gpu LR: 0.000020 Logit Scale: 100.000 Contrastive_loss: 0.19361 (0.19769) Fd_loss: 1.4130 (1.4200) Loss: 1.6066 (1.6177)
2025-07-19,20:01:11 | INFO | Train Epoch: 28 [ 823296/9319509 (9%)] Data (t): 0.000 Batch (t): 1.368, 2991.45/s, 186.966/s/gpu LR: 0.000019 Logit Scale: 100.000 Contrastive_loss: 0.18933 (0.19490) Fd_loss: 1.4186 (1.4195) Loss: 1.6079 (1.6144)
2025-07-19,20:03:28 | INFO | Train Epoch: 28 [1232896/9319509 (13%)] Data (t): 0.000 Batch (t): 1.371, 2986.44/s, 186.652/s/gpu LR: 0.000019 Logit Scale: 100.000 Contrastive_loss: 0.19706 (0.19544) Fd_loss: 1.4035 (1.4155) Loss: 1.6005 (1.6109)
2025-07-19,20:05:45 | INFO | Train Epoch: 28 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.369, 2994.71/s, 187.169/s/gpu LR: 0.000018 Logit Scale: 100.000 Contrastive_loss: 0.20091 (0.19654) Fd_loss: 1.4087 (1.4142) Loss: 1.6097 (1.6107)
2025-07-19,20:08:02 | INFO | Train Epoch: 28 [2052096/9319509 (22%)] Data (t): 0.000 Batch (t): 1.368, 3007.55/s, 187.972/s/gpu LR: 0.000018 Logit Scale: 100.000 Contrastive_loss: 0.20045 (0.19719) Fd_loss: 1.4028 (1.4123) Loss: 1.6032 (1.6094)
2025-07-19,20:10:19 | INFO | Train Epoch: 28 [2461696/9319509 (26%)] Data (t): 0.000 Batch (t): 1.368, 2996.16/s, 187.260/s/gpu LR: 0.000018 Logit Scale: 100.000 Contrastive_loss: 0.19348 (0.19666) Fd_loss: 1.4146 (1.4126) Loss: 1.6081 (1.6093)
2025-07-19,20:12:36 | INFO | Train Epoch: 28 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.368, 2991.85/s, 186.991/s/gpu LR: 0.000017 Logit Scale: 100.000 Contrastive_loss: 0.19149 (0.19601) Fd_loss: 1.4079 (1.4120) Loss: 1.5994 (1.6080)
2025-07-19,20:14:52 | INFO | Train Epoch: 28 [3280896/9319509 (35%)] Data (t): 0.000 Batch (t): 1.367, 2978.36/s, 186.148/s/gpu LR: 0.000017 Logit Scale: 100.000 Contrastive_loss: 0.18355 (0.19463) Fd_loss: 1.4098 (1.4118) Loss: 1.5933 (1.6064)
2025-07-19,20:17:09 | INFO | Train Epoch: 28 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.368, 2989.69/s, 186.856/s/gpu LR: 0.000016 Logit Scale: 100.000 Contrastive_loss: 0.18845 (0.19401) Fd_loss: 1.4003 (1.4106) Loss: 1.5887 (1.6046)
2025-07-19,20:19:26 | INFO | Train Epoch: 28 [4100096/9319509 (44%)] Data (t): 0.000 Batch (t): 1.367, 2992.81/s, 187.050/s/gpu LR: 0.000016 Logit Scale: 100.000 Contrastive_loss: 0.19991 (0.19455) Fd_loss: 1.4223 (1.4117) Loss: 1.6222 (1.6062)
2025-07-19,20:21:42 | INFO | Train Epoch: 28 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.367, 2986.44/s, 186.652/s/gpu LR: 0.000016 Logit Scale: 100.000 Contrastive_loss: 0.18617 (0.19385) Fd_loss: 1.4129 (1.4118) Loss: 1.5991 (1.6056)
2025-07-19,20:23:59 | INFO | Train Epoch: 28 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.368, 2994.17/s, 187.136/s/gpu LR: 0.000015 Logit Scale: 100.000 Contrastive_loss: 0.19169 (0.19368) Fd_loss: 1.4129 (1.4119) Loss: 1.6046 (1.6055)
2025-07-19,20:26:16 | INFO | Train Epoch: 28 [5328896/9319509 (57%)] Data (t): 0.000 Batch (t): 1.368, 3000.09/s, 187.506/s/gpu LR: 0.000015 Logit Scale: 100.000 Contrastive_loss: 0.19109 (0.19350) Fd_loss: 1.4015 (1.4111) Loss: 1.5926 (1.6046)
2025-07-19,20:28:33 | INFO | Train Epoch: 28 [5738496/9319509 (62%)] Data (t): 0.000 Batch (t): 1.367, 2996.69/s, 187.293/s/gpu LR: 0.000014 Logit Scale: 100.000 Contrastive_loss: 0.21106 (0.19467) Fd_loss: 1.4093 (1.4110) Loss: 1.6204 (1.6057)
2025-07-19,20:30:50 | INFO | Train Epoch: 28 [6148096/9319509 (66%)] Data (t): 0.000 Batch (t): 1.367, 3007.91/s, 187.994/s/gpu LR: 0.000014 Logit Scale: 100.000 Contrastive_loss: 0.18991 (0.19437) Fd_loss: 1.4114 (1.4110) Loss: 1.6013 (1.6054)
2025-07-19,20:33:06 | INFO | Train Epoch: 28 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.368, 3004.75/s, 187.797/s/gpu LR: 0.000014 Logit Scale: 100.000 Contrastive_loss: 0.20826 (0.19519) Fd_loss: 1.4104 (1.4110) Loss: 1.6186 (1.6062)
2025-07-19,20:35:23 | INFO | Train Epoch: 28 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.368, 3001.95/s, 187.622/s/gpu LR: 0.000013 Logit Scale: 100.000 Contrastive_loss: 0.20232 (0.19558) Fd_loss: 1.4106 (1.4110) Loss: 1.6129 (1.6065)
2025-07-19,20:37:40 | INFO | Train Epoch: 28 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.367, 2999.82/s, 187.488/s/gpu LR: 0.000013 Logit Scale: 100.000 Contrastive_loss: 0.21360 (0.19653) Fd_loss: 1.3922 (1.4100) Loss: 1.6058 (1.6065)
2025-07-19,20:39:56 | INFO | Train Epoch: 28 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.366, 2988.03/s, 186.752/s/gpu LR: 0.000013 Logit Scale: 100.000 Contrastive_loss: 0.19894 (0.19665) Fd_loss: 1.4108 (1.4100) Loss: 1.6098 (1.6067)
2025-07-19,20:42:13 | INFO | Train Epoch: 28 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.365, 2998.55/s, 187.410/s/gpu LR: 0.000012 Logit Scale: 100.000 Contrastive_loss: 0.21013 (0.19729) Fd_loss: 1.4097 (1.4100) Loss: 1.6198 (1.6073)
2025-07-19,20:44:29 | INFO | Train Epoch: 28 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.365, 2998.29/s, 187.393/s/gpu LR: 0.000012 Logit Scale: 100.000 Contrastive_loss: 0.19472 (0.19718) Fd_loss: 1.4112 (1.4101) Loss: 1.6059 (1.6072)
2025-07-19,20:46:46 | INFO | Train Epoch: 28 [9015296/9319509 (97%)] Data (t): 0.000 Batch (t): 1.366, 3008.89/s, 188.056/s/gpu LR: 0.000012 Logit Scale: 100.000 Contrastive_loss: 0.19444 (0.19706) Fd_loss: 1.4091 (1.4100) Loss: 1.6036 (1.6071)
2025-07-19,20:48:27 | INFO | Train Epoch: 28 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.367, 3003.51/s, 187.719/s/gpu LR: 0.000011 Logit Scale: 100.000 Contrastive_loss: 0.18981 (0.19676) Fd_loss: 1.4025 (1.4097) Loss: 1.5923 (1.6065)
2025-07-19,20:48:29 | INFO | Start epoch 29
2025-07-19,20:48:40 | INFO | Train Epoch: 29 [ 4096/9319509 (0%)] Data (t): 9.290 Batch (t): 10.652, 384.537/s, 24.0336/s/gpu LR: 0.000011 Logit Scale: 100.000 Contrastive_loss: 0.18026 (0.18026) Fd_loss: 1.4073 (1.4073) Loss: 1.5876 (1.5876)
2025-07-19,20:50:57 | INFO | Train Epoch: 29 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.367, 2996.04/s, 187.253/s/gpu LR: 0.000011 Logit Scale: 100.000 Contrastive_loss: 0.18440 (0.18233) Fd_loss: 1.4090 (1.4082) Loss: 1.5935 (1.5905)
2025-07-19,20:53:13 | INFO | Train Epoch: 29 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.366, 3016.92/s, 188.558/s/gpu LR: 0.000011 Logit Scale: 100.000 Contrastive_loss: 0.18511 (0.18326) Fd_loss: 1.4125 (1.4096) Loss: 1.5976 (1.5929)
2025-07-19,20:55:30 | INFO | Train Epoch: 29 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.369, 2996.01/s, 187.251/s/gpu LR: 0.000010 Logit Scale: 100.000 Contrastive_loss: 0.17034 (0.18003) Fd_loss: 1.4159 (1.4112) Loss: 1.5863 (1.5912)
2025-07-19,20:57:47 | INFO | Train Epoch: 29 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.369, 3007.58/s, 187.974/s/gpu LR: 0.000010 Logit Scale: 100.000 Contrastive_loss: 0.20175 (0.18437) Fd_loss: 1.4008 (1.4091) Loss: 1.6026 (1.5935)
2025-07-19,21:00:03 | INFO | Train Epoch: 29 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.365, 2998.69/s, 187.418/s/gpu LR: 0.000010 Logit Scale: 100.000 Contrastive_loss: 0.18605 (0.18465) Fd_loss: 1.4030 (1.4081) Loss: 1.5890 (1.5928)
2025-07-19,21:02:20 | INFO | Train Epoch: 29 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.365, 2995.94/s, 187.246/s/gpu LR: 0.000009 Logit Scale: 100.000 Contrastive_loss: 0.18604 (0.18485) Fd_loss: 1.3997 (1.4069) Loss: 1.5858 (1.5918)
2025-07-19,21:04:37 | INFO | Train Epoch: 29 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.366, 3004.47/s, 187.779/s/gpu LR: 0.000009 Logit Scale: 100.000 Contrastive_loss: 0.18200 (0.18450) Fd_loss: 1.4065 (1.4069) Loss: 1.5885 (1.5914)
2025-07-19,21:06:53 | INFO | Train Epoch: 29 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.364, 3002.16/s, 187.635/s/gpu LR: 0.000009 Logit Scale: 100.000 Contrastive_loss: 0.19684 (0.18587) Fd_loss: 1.3979 (1.4059) Loss: 1.5947 (1.5917)
2025-07-19,21:09:09 | INFO | Train Epoch: 29 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.365, 3007.62/s, 187.977/s/gpu LR: 0.000009 Logit Scale: 100.000 Contrastive_loss: 0.18502 (0.18578) Fd_loss: 1.4051 (1.4058) Loss: 1.5901 (1.5916)
2025-07-19,21:11:26 | INFO | Train Epoch: 29 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.365, 2998.57/s, 187.410/s/gpu LR: 0.000008 Logit Scale: 100.000 Contrastive_loss: 0.18326 (0.18555) Fd_loss: 1.4153 (1.4066) Loss: 1.5986 (1.5922)
2025-07-19,21:13:43 | INFO | Train Epoch: 29 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.365, 3008.72/s, 188.045/s/gpu LR: 0.000008 Logit Scale: 100.000 Contrastive_loss: 0.19777 (0.18657) Fd_loss: 1.4069 (1.4067) Loss: 1.6046 (1.5932)
2025-07-19,21:15:59 | INFO | Train Epoch: 29 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.366, 3010.91/s, 188.182/s/gpu LR: 0.000008 Logit Scale: 100.000 Contrastive_loss: 0.19082 (0.18690) Fd_loss: 1.4111 (1.4070) Loss: 1.6019 (1.5939)
2025-07-19,21:18:16 | INFO | Train Epoch: 29 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.365, 2995.31/s, 187.207/s/gpu LR: 0.000007 Logit Scale: 100.000 Contrastive_loss: 0.17648 (0.18615) Fd_loss: 1.4041 (1.4068) Loss: 1.5806 (1.5930)
2025-07-19,21:20:32 | INFO | Train Epoch: 29 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.366, 3000.40/s, 187.525/s/gpu LR: 0.000007 Logit Scale: 100.000 Contrastive_loss: 0.18303 (0.18595) Fd_loss: 1.4027 (1.4065) Loss: 1.5857 (1.5925)
2025-07-19,21:22:49 | INFO | Train Epoch: 29 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.365, 3020.63/s, 188.789/s/gpu LR: 0.000007 Logit Scale: 100.000 Contrastive_loss: 0.18140 (0.18566) Fd_loss: 1.3986 (1.4060) Loss: 1.5800 (1.5917)
2025-07-19,21:25:05 | INFO | Train Epoch: 29 [6557696/9319509 (70%)] Data (t): 0.000 Batch (t): 1.365, 3015.17/s, 188.448/s/gpu LR: 0.000007 Logit Scale: 100.000 Contrastive_loss: 0.19539 (0.18623) Fd_loss: 1.3934 (1.4053) Loss: 1.5888 (1.5915)
2025-07-19,21:27:22 | INFO | Train Epoch: 29 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.365, 3007.17/s, 187.948/s/gpu LR: 0.000006 Logit Scale: 100.000 Contrastive_loss: 0.18301 (0.18605) Fd_loss: 1.3966 (1.4048) Loss: 1.5796 (1.5909)
2025-07-19,21:29:38 | INFO | Train Epoch: 29 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.365, 3000.03/s, 187.502/s/gpu LR: 0.000006 Logit Scale: 100.000 Contrastive_loss: 0.17864 (0.18566) Fd_loss: 1.4046 (1.4048) Loss: 1.5833 (1.5905)
2025-07-19,21:31:55 | INFO | Train Epoch: 29 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.365, 3007.71/s, 187.982/s/gpu LR: 0.000006 Logit Scale: 100.000 Contrastive_loss: 0.17899 (0.18533) Fd_loss: 1.4060 (1.4049) Loss: 1.5850 (1.5902)
2025-07-19,21:34:11 | INFO | Train Epoch: 29 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.366, 3001.32/s, 187.582/s/gpu LR: 0.000006 Logit Scale: 100.000 Contrastive_loss: 0.19096 (0.18560) Fd_loss: 1.3952 (1.4044) Loss: 1.5861 (1.5900)
2025-07-19,21:36:28 | INFO | Train Epoch: 29 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.365, 2975.39/s, 185.962/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.17425 (0.18508) Fd_loss: 1.3928 (1.4039) Loss: 1.5670 (1.5890)
2025-07-19,21:38:44 | INFO | Train Epoch: 29 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.365, 3000.60/s, 187.537/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.18715 (0.18517) Fd_loss: 1.4023 (1.4038) Loss: 1.5895 (1.5890)
2025-07-19,21:40:25 | INFO | Train Epoch: 29 [9318400/9319509 (100%)] Data (t): 0.003 Batch (t): 1.366, 3015.17/s, 188.448/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.17434 (0.18472) Fd_loss: 1.4007 (1.4037) Loss: 1.5751 (1.5884)
2025-07-19,21:40:27 | INFO | Starting zero-shot imagenet.
2025-07-19,21:40:27 | INFO | Building zero-shot classifier
2025-07-19,21:40:42 | INFO | Using classifier
2025-07-19,21:42:04 | INFO | Finished zero-shot imagenet.
2025-07-19,21:42:04 | INFO | Eval Epoch: 30 imagenet-zeroshot-val-top1: 0.2965 imagenet-zeroshot-val-top5: 0.5663
2025-07-19,21:42:05 | INFO | Start epoch 30
2025-07-19,21:42:10 | INFO | Train Epoch: 30 [ 4096/9319509 (0%)] Data (t): 4.291 Batch (t): 5.635, 726.864/s, 45.4290/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.15394 (0.15394) Fd_loss: 1.3991 (1.3991) Loss: 1.5530 (1.5530)
2025-07-19,21:44:26 | INFO | Train Epoch: 30 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.362, 3016.76/s, 188.548/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.18032 (0.16713) Fd_loss: 1.3967 (1.3979) Loss: 1.5770 (1.5650)
2025-07-19,21:46:43 | INFO | Train Epoch: 30 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.364, 2991.16/s, 186.947/s/gpu LR: 0.000005 Logit Scale: 100.000 Contrastive_loss: 0.16939 (0.16788) Fd_loss: 1.3989 (1.3983) Loss: 1.5683 (1.5661)
2025-07-19,21:49:00 | INFO | Train Epoch: 30 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.372, 2986.11/s, 186.632/s/gpu LR: 0.000004 Logit Scale: 100.000 Contrastive_loss: 0.18507 (0.17218) Fd_loss: 1.3962 (1.3977) Loss: 1.5812 (1.5699)
2025-07-19,21:51:17 | INFO | Train Epoch: 30 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.367, 3005.40/s, 187.838/s/gpu LR: 0.000004 Logit Scale: 100.000 Contrastive_loss: 0.17466 (0.17267) Fd_loss: 1.4054 (1.3993) Loss: 1.5800 (1.5719)
2025-07-19,21:53:33 | INFO | Train Epoch: 30 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.365, 3000.50/s, 187.531/s/gpu LR: 0.000004 Logit Scale: 100.000 Contrastive_loss: 0.17096 (0.17239) Fd_loss: 1.4070 (1.4006) Loss: 1.5780 (1.5729)
2025-07-19,21:55:50 | INFO | Train Epoch: 30 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.364, 3005.90/s, 187.868/s/gpu LR: 0.000004 Logit Scale: 100.000 Contrastive_loss: 0.18627 (0.17437) Fd_loss: 1.4081 (1.4016) Loss: 1.5944 (1.5760)
2025-07-19,21:58:06 | INFO | Train Epoch: 30 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.366, 2995.25/s, 187.203/s/gpu LR: 0.000004 Logit Scale: 100.000 Contrastive_loss: 0.17392 (0.17431) Fd_loss: 1.4045 (1.4020) Loss: 1.5784 (1.5763)
2025-07-19,22:00:23 | INFO | Train Epoch: 30 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.366, 3017.94/s, 188.621/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.15901 (0.17261) Fd_loss: 1.4114 (1.4030) Loss: 1.5704 (1.5756)
2025-07-19,22:02:39 | INFO | Train Epoch: 30 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.364, 3005.94/s, 187.871/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.16590 (0.17194) Fd_loss: 1.4006 (1.4028) Loss: 1.5665 (1.5747)
2025-07-19,22:04:56 | INFO | Train Epoch: 30 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.364, 2999.45/s, 187.465/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.17216 (0.17196) Fd_loss: 1.4132 (1.4037) Loss: 1.5853 (1.5757)
2025-07-19,22:07:12 | INFO | Train Epoch: 30 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.365, 3006.83/s, 187.927/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.17675 (0.17236) Fd_loss: 1.4051 (1.4038) Loss: 1.5818 (1.5762)
2025-07-19,22:09:29 | INFO | Train Epoch: 30 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.365, 3009.52/s, 188.095/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.15294 (0.17087) Fd_loss: 1.4000 (1.4036) Loss: 1.5530 (1.5744)
2025-07-19,22:11:45 | INFO | Train Epoch: 30 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.364, 2981.66/s, 186.354/s/gpu LR: 0.000003 Logit Scale: 100.000 Contrastive_loss: 0.16034 (0.17011) Fd_loss: 1.4030 (1.4035) Loss: 1.5633 (1.5736)
2025-07-19,22:14:02 | INFO | Train Epoch: 30 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.365, 3006.84/s, 187.928/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.16057 (0.16948) Fd_loss: 1.3975 (1.4031) Loss: 1.5580 (1.5726)
2025-07-19,22:16:18 | INFO | Train Epoch: 30 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.364, 3006.29/s, 187.893/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.16749 (0.16935) Fd_loss: 1.3948 (1.4026) Loss: 1.5623 (1.5719)
2025-07-19,22:18:35 | INFO | Train Epoch: 30 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.365, 3002.91/s, 187.682/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.18773 (0.17044) Fd_loss: 1.4035 (1.4026) Loss: 1.5912 (1.5731)
2025-07-19,22:20:51 | INFO | Train Epoch: 30 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.365, 3001.44/s, 187.590/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.17462 (0.17067) Fd_loss: 1.4045 (1.4027) Loss: 1.5791 (1.5734)
2025-07-19,22:23:08 | INFO | Train Epoch: 30 [7376896/9319509 (79%)] Data (t): 0.000 Batch (t): 1.364, 2995.18/s, 187.199/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.17463 (0.17088) Fd_loss: 1.3995 (1.4026) Loss: 1.5741 (1.5734)
2025-07-19,22:25:24 | INFO | Train Epoch: 30 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.364, 3003.60/s, 187.725/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.19343 (0.17200) Fd_loss: 1.3843 (1.4017) Loss: 1.5777 (1.5737)
2025-07-19,22:27:41 | INFO | Train Epoch: 30 [8196096/9319509 (88%)] Data (t): 0.000 Batch (t): 1.365, 2995.18/s, 187.199/s/gpu LR: 0.000002 Logit Scale: 100.000 Contrastive_loss: 0.16270 (0.17156) Fd_loss: 1.3947 (1.4013) Loss: 1.5574 (1.5729)
2025-07-19,22:29:57 | INFO | Train Epoch: 30 [8605696/9319509 (92%)] Data (t): 0.000 Batch (t): 1.365, 3003.81/s, 187.738/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.17540 (0.17173) Fd_loss: 1.3942 (1.4010) Loss: 1.5696 (1.5727)
2025-07-19,22:32:14 | INFO | Train Epoch: 30 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.365, 3006.16/s, 187.885/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.16903 (0.17162) Fd_loss: 1.4122 (1.4015) Loss: 1.5813 (1.5731)
2025-07-19,22:33:55 | INFO | Train Epoch: 30 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.367, 3011.45/s, 188.216/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.16858 (0.17149) Fd_loss: 1.4015 (1.4015) Loss: 1.5701 (1.5730)
2025-07-19,22:33:56 | INFO | Start epoch 31
2025-07-19,22:34:07 | INFO | Train Epoch: 31 [ 4096/9319509 (0%)] Data (t): 8.542 Batch (t): 10.191, 401.913/s, 25.1195/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.17843 (0.17843) Fd_loss: 1.3919 (1.3919) Loss: 1.5703 (1.5703)
2025-07-19,22:36:23 | INFO | Train Epoch: 31 [ 413696/9319509 (4%)] Data (t): 0.001 Batch (t): 1.367, 2988.69/s, 186.793/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.15524 (0.16684) Fd_loss: 1.3937 (1.3928) Loss: 1.5490 (1.5596)
2025-07-19,22:38:40 | INFO | Train Epoch: 31 [ 823296/9319509 (9%)] Data (t): 0.001 Batch (t): 1.365, 3003.37/s, 187.711/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.18693 (0.17353) Fd_loss: 1.3852 (1.3903) Loss: 1.5722 (1.5638)
2025-07-19,22:41:00 | INFO | Train Epoch: 31 [1232896/9319509 (13%)] Data (t): 0.001 Batch (t): 1.405, 3007.91/s, 187.995/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.16315 (0.17094) Fd_loss: 1.3918 (1.3907) Loss: 1.5549 (1.5616)
2025-07-19,22:43:18 | INFO | Train Epoch: 31 [1642496/9319509 (18%)] Data (t): 0.001 Batch (t): 1.377, 2998.87/s, 187.430/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.15973 (0.16870) Fd_loss: 1.3892 (1.3904) Loss: 1.5489 (1.5591)
2025-07-19,22:45:36 | INFO | Train Epoch: 31 [2052096/9319509 (22%)] Data (t): 0.001 Batch (t): 1.377, 2993.92/s, 187.120/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.17631 (0.16997) Fd_loss: 1.3992 (1.3918) Loss: 1.5755 (1.5618)
2025-07-19,22:47:52 | INFO | Train Epoch: 31 [2461696/9319509 (26%)] Data (t): 0.001 Batch (t): 1.364, 2991.72/s, 186.983/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.16271 (0.16893) Fd_loss: 1.3948 (1.3923) Loss: 1.5575 (1.5612)
2025-07-19,22:50:11 | INFO | Train Epoch: 31 [2871296/9319509 (31%)] Data (t): 0.001 Batch (t): 1.391, 3007.22/s, 187.952/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.17219 (0.16934) Fd_loss: 1.4024 (1.3935) Loss: 1.5746 (1.5629)
2025-07-19,22:52:32 | INFO | Train Epoch: 31 [3280896/9319509 (35%)] Data (t): 0.001 Batch (t): 1.408, 2995.07/s, 187.192/s/gpu LR: 0.000001 Logit Scale: 100.000 Contrastive_loss: 0.17412 (0.16987) Fd_loss: 1.3940 (1.3936) Loss: 1.5681 (1.5634)
2025-07-19,22:54:49 | INFO | Train Epoch: 31 [3690496/9319509 (40%)] Data (t): 0.001 Batch (t): 1.365, 3008.21/s, 188.013/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16670 (0.16955) Fd_loss: 1.4061 (1.3948) Loss: 1.5728 (1.5644)
2025-07-19,22:57:05 | INFO | Train Epoch: 31 [4100096/9319509 (44%)] Data (t): 0.001 Batch (t): 1.364, 3003.04/s, 187.690/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16993 (0.16959) Fd_loss: 1.3987 (1.3952) Loss: 1.5686 (1.5648)
2025-07-19,22:59:23 | INFO | Train Epoch: 31 [4509696/9319509 (48%)] Data (t): 0.001 Batch (t): 1.378, 3000.40/s, 187.525/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.18417 (0.17080) Fd_loss: 1.4003 (1.3956) Loss: 1.5844 (1.5664)
2025-07-19,23:01:40 | INFO | Train Epoch: 31 [4919296/9319509 (53%)] Data (t): 0.001 Batch (t): 1.374, 2996.28/s, 187.268/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.17197 (0.17089) Fd_loss: 1.3910 (1.3953) Loss: 1.5630 (1.5661)
2025-07-19,23:03:57 | INFO | Train Epoch: 31 [5328896/9319509 (57%)] Data (t): 0.001 Batch (t): 1.364, 3013.16/s, 188.322/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.17943 (0.17150) Fd_loss: 1.4125 (1.3965) Loss: 1.5920 (1.5680)
2025-07-19,23:06:13 | INFO | Train Epoch: 31 [5738496/9319509 (62%)] Data (t): 0.001 Batch (t): 1.364, 3011.50/s, 188.219/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.17914 (0.17201) Fd_loss: 1.3937 (1.3963) Loss: 1.5728 (1.5683)
2025-07-19,23:08:29 | INFO | Train Epoch: 31 [6148096/9319509 (66%)] Data (t): 0.001 Batch (t): 1.364, 3005.47/s, 187.842/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.17802 (0.17239) Fd_loss: 1.3895 (1.3959) Loss: 1.5675 (1.5683)
2025-07-19,23:10:46 | INFO | Train Epoch: 31 [6557696/9319509 (70%)] Data (t): 0.001 Batch (t): 1.364, 3006.29/s, 187.893/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.18277 (0.17300) Fd_loss: 1.3991 (1.3961) Loss: 1.5818 (1.5691)
2025-07-19,23:13:02 | INFO | Train Epoch: 31 [6967296/9319509 (75%)] Data (t): 0.001 Batch (t): 1.365, 3008.87/s, 188.054/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.17536 (0.17313) Fd_loss: 1.4045 (1.3965) Loss: 1.5799 (1.5697)
2025-07-19,23:15:19 | INFO | Train Epoch: 31 [7376896/9319509 (79%)] Data (t): 0.001 Batch (t): 1.364, 3003.63/s, 187.727/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16428 (0.17266) Fd_loss: 1.3935 (1.3964) Loss: 1.5578 (1.5690)
2025-07-19,23:17:35 | INFO | Train Epoch: 31 [7786496/9319509 (84%)] Data (t): 0.001 Batch (t): 1.364, 2998.17/s, 187.386/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16584 (0.17232) Fd_loss: 1.3972 (1.3964) Loss: 1.5630 (1.5687)
2025-07-19,23:19:52 | INFO | Train Epoch: 31 [8196096/9319509 (88%)] Data (t): 0.001 Batch (t): 1.364, 3011.88/s, 188.243/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.15415 (0.17146) Fd_loss: 1.4041 (1.3968) Loss: 1.5583 (1.5682)
2025-07-19,23:22:08 | INFO | Train Epoch: 31 [8605696/9319509 (92%)] Data (t): 0.001 Batch (t): 1.365, 3005.50/s, 187.844/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16774 (0.17129) Fd_loss: 1.3910 (1.3965) Loss: 1.5588 (1.5678)
2025-07-19,23:24:24 | INFO | Train Epoch: 31 [9015296/9319509 (97%)] Data (t): 0.001 Batch (t): 1.364, 3005.94/s, 187.871/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16824 (0.17115) Fd_loss: 1.4006 (1.3967) Loss: 1.5688 (1.5679)
2025-07-19,23:26:06 | INFO | Train Epoch: 31 [9318400/9319509 (100%)] Data (t): 0.002 Batch (t): 1.366, 3009.54/s, 188.096/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 0.16928 (0.17108) Fd_loss: 1.3982 (1.3968) Loss: 1.5675 (1.5678)
2025-07-19,23:26:07 | INFO | Starting zero-shot imagenet.
2025-07-19,23:26:07 | INFO | Building zero-shot classifier
2025-07-19,23:26:21 | INFO | Using classifier
2025-07-19,23:27:39 | INFO | Finished zero-shot imagenet.
2025-07-19,23:27:39 | INFO | Eval Epoch: 32 imagenet-zeroshot-val-top1: 0.2970 imagenet-zeroshot-val-top5: 0.5685