baptiste-pasquier/distilcamembert-allocine

@@ -1,10 +1,10 @@
 ## TextAttack Model Card
 This `cmarkea/distilcamembert-base` model was fine-tuned using TextAttack and the `allocine` dataset loaded using the `datasets` library. The model was fine-tuned
-for 1 epoch with a batch size of 16,
     a maximum sequence length of 512, and an initial learning rate of 5e-05.
 Since this was a classification task, the model was trained with a cross-entropy loss function.
-The best score the model achieved on this task was 0.9692, as measured by the
-eval set accuracy, found after 1 epoch.
 For more information, check out [TextAttack on Github](https://github.com/QData/TextAttack).

 ## TextAttack Model Card
 This `cmarkea/distilcamembert-base` model was fine-tuned using TextAttack and the `allocine` dataset loaded using the `datasets` library. The model was fine-tuned
+for 3 epochs with a batch size of 64,
     a maximum sequence length of 512, and an initial learning rate of 5e-05.
 Since this was a classification task, the model was trained with a cross-entropy loss function.
+The best score the model achieved on this task was 0.9707, as measured by the
+eval set accuracy, found after 3 epochs.
 For more information, check out [TextAttack on Github](https://github.com/QData/TextAttack).

pytorch_model.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:556004d4af12dfeee38a8f4aadc5ab961bcf74b235a00cc1f7fe08cf09d2fd15
 size 272425205

 version https://git-lfs.github.com/spec/v1
+oid sha256:5cf75746015301662f1a8677d6e45d6c0efba735b927358db3cf1c57975d0875
 size 272425205

train_log.txt

@@ -1,17 +1,29 @@
-Writing logs to ./outputs/2023-02-13-00-40-42-038240/train_log.txt.
-Wrote original training args to ./outputs/2023-02-13-00-40-42-038240/training_args.json.
 ***** Running training *****
   Num examples = 160000
-  Num epochs = 1
-  Num clean epochs = 1
-  Instantaneous batch size per device = 16
-  Total train batch size (w. parallel, distributed & accumulation) = 16
   Gradient accumulation steps = 1
-  Total optimization steps = 10000
 ==========================================================
 Epoch 1
-Running clean epoch 1/1
-Train accuracy: 95.02%
-Eval accuracy: 96.92%
-Best score found. Saved model to ./outputs/2023-02-13-00-40-42-038240/best_model/
-Wrote README to ./outputs/2023-02-13-00-40-42-038240/README.md.

+Writing logs to ./outputs/2023-02-12-23-30-37-265125/train_log.txt.
+Wrote original training args to ./outputs/2023-02-12-23-30-37-265125/training_args.json.
 ***** Running training *****
   Num examples = 160000
+  Num epochs = 3
+  Num clean epochs = 3
+  Instantaneous batch size per device = 64
+  Total train batch size (w. parallel, distributed & accumulation) = 64
   Gradient accumulation steps = 1
+  Total optimization steps = 7500
 ==========================================================
 Epoch 1
+Running clean epoch 1/3
+Train accuracy: 94.11%
+Eval accuracy: 96.77%
+Best score found. Saved model to ./outputs/2023-02-12-23-30-37-265125/best_model/
+==========================================================
+Epoch 2
+Running clean epoch 2/3
+Train accuracy: 97.52%
+Eval accuracy: 96.95%
+Best score found. Saved model to ./outputs/2023-02-12-23-30-37-265125/best_model/
+==========================================================
+Epoch 3
+Running clean epoch 3/3
+Train accuracy: 98.70%
+Eval accuracy: 97.07%
+Best score found. Saved model to ./outputs/2023-02-12-23-30-37-265125/best_model/
+Wrote README to ./outputs/2023-02-12-23-30-37-265125/README.md.
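
The "Total optimization steps" values in the two logs (10000 before, 7500 after) are consistent with the example count, batch size, and epoch count that each log reports. The sketch below reproduces that arithmetic. It assumes a single device and that the step count is the number of batches per epoch (rounded up) times the number of epochs; the function name `total_optimization_steps` is hypothetical and is not taken from TextAttack.

```python
import math


def total_optimization_steps(num_examples, batch_size, num_epochs, grad_accum_steps=1):
    """Estimate optimizer steps for a run (assumed formula, single device)."""
    # One optimizer step consumes batch_size * grad_accum_steps examples.
    steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum_steps))
    return steps_per_epoch * num_epochs


# Values copied from the two training logs above.
old_steps = total_optimization_steps(num_examples=160_000, batch_size=16, num_epochs=1)
new_steps = total_optimization_steps(num_examples=160_000, batch_size=64, num_epochs=3)

print(old_steps, new_steps)  # 10000 7500
```

Under these assumptions, the new run performs fewer optimizer steps in total even though it passes over the data three times, because each step now covers four times as many examples.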

training_args.json

@@ -9,14 +9,14 @@
     "dataset_eval_split": "validation",
     "filter_train_by_labels": null,
     "filter_eval_by_labels": null,
-    "num_epochs": 1,
     "num_clean_epochs": 1,
     "attack_epoch_interval": 1,
     "early_stopping_epochs": null,
     "learning_rate": 5e-05,
     "num_warmup_steps": 500,
     "weight_decay": 0.01,
-    "per_device_train_batch_size": 16,
     "per_device_eval_batch_size": 32,
     "gradient_accumulation_steps": 1,
     "random_seed": 786,
@@ -26,11 +26,11 @@
     "num_train_adv_examples": -1,
     "query_budget_train": null,
     "attack_num_workers_per_device": 1,
-    "output_dir": "./outputs/2023-02-13-00-40-42-038240",
     "checkpoint_interval_steps": null,
     "checkpoint_interval_epochs": null,
     "save_last": true,
-    "log_to_tb": true,
     "tb_log_dir": null,
     "log_to_wandb": false,
     "wandb_project": "textattack",

     "dataset_eval_split": "validation",
     "filter_train_by_labels": null,
     "filter_eval_by_labels": null,
+    "num_epochs": 3,
     "num_clean_epochs": 1,
     "attack_epoch_interval": 1,
     "early_stopping_epochs": null,
     "learning_rate": 5e-05,
     "num_warmup_steps": 500,
     "weight_decay": 0.01,
+    "per_device_train_batch_size": 64,
     "per_device_eval_batch_size": 32,
     "gradient_accumulation_steps": 1,
     "random_seed": 786,
     "num_train_adv_examples": -1,
     "query_budget_train": null,
     "attack_num_workers_per_device": 1,
+    "output_dir": "./outputs/2023-02-12-23-30-37-265125",
     "checkpoint_interval_steps": null,
     "checkpoint_interval_epochs": null,
     "save_last": true,
+    "log_to_tb": false,
     "tb_log_dir": null,
     "log_to_wandb": false,
     "wandb_project": "textattack",