Update README.md
README.md
CHANGED
@@ -30,6 +30,38 @@ We present the dev results on SQuAD 1.1/2.0 and MNLI tasks.
 | DeBERTa-v3-small+SiFT | -/- | -/- | 88.8 |
 
 
+#### Fine-tuning with HF transformers
+
+```bash
+#!/bin/bash
+pip install datasets
+export TASK_NAME=mnli
+
+output_dir="ds_results"
+
+num_gpus=8
+
+batch_size=8
+
+python -m torch.distributed.launch --nproc_per_node=${num_gpus} \
+  run_glue.py \
+  --model_name_or_path microsoft/deberta-v3-small \
+  --task_name $TASK_NAME \
+  --do_train \
+  --do_eval \
+  --evaluation_strategy steps \
+  --max_seq_length 256 \
+  --warmup_steps 1500 \
+  --per_device_train_batch_size ${batch_size} \
+  --learning_rate 3e-5 \
+  --num_train_epochs 4 \
+  --output_dir $output_dir \
+  --overwrite_output_dir \
+  --logging_steps 1000 \
+  --logging_dir $output_dir
+
+```
+
 ### Citation
 
 If you find DeBERTa useful for your work, please cite the following paper:
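Note on the added script: it starts one process per GPU, so the global train batch size is `num_gpus` times `--per_device_train_batch_size`. The dry-run sketch below works that out and prints a possible `torchrun` launch line. It assumes no gradient accumulation (the script sets none) and uses `torchrun` as a substitute for `python -m torch.distributed.launch`, which is not part of this change. The variables `effective_batch` and `launch_cmd` are illustrative names, and the command is only printed, never executed.

```shell
#!/bin/bash
# Dry-run sketch: derive the global batch size implied by the README script
# and print a comparable launch line without running any training.
# Assumption: gradient accumulation is left unset, as in the README script.

num_gpus=8
batch_size=8   # per-device train batch size, as in the README script

# One process per GPU, each consuming batch_size examples per optimizer step.
effective_batch=$((num_gpus * batch_size))
echo "global train batch size: ${effective_batch}"

# Assumption: torchrun (the newer PyTorch launcher) stands in for
# `python -m torch.distributed.launch`. Only a subset of flags is shown.
launch_cmd="torchrun --nproc_per_node=${num_gpus} run_glue.py \
--model_name_or_path microsoft/deberta-v3-small \
--task_name mnli \
--per_device_train_batch_size ${batch_size}"
echo "${launch_cmd}"
```

When running on fewer GPUs, raising `batch_size` so that the product stays the same keeps the global batch size comparable, memory permitting.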