myrkur committed on
Commit 31a6a90 · verified · 1 Parent(s): 82429cc

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -69,7 +69,7 @@ The model was fine-tuned on a custom dataset with **2.5 billion Persian tokens**
  - **Optimizer**: AdamW
  - **Learning Rate**: 6e-4
  - **Batch Size**: 32
- - **Epochs**: 3
+ - **Epochs**: 2
  - **Scheduler**: Inverse square root
  - **Precision**: bfloat16 for faster computation and lower memory usage
  - **Masking Strategy**: Whole Word Masking (WWM) with a probability of 30%
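
For context on the "Inverse square root" scheduler entry, here is a minimal sketch of one common form of that schedule: linear warmup to the peak rate, then decay proportional to 1/sqrt(step). The README only gives the peak learning rate (6e-4); the function name `inverse_sqrt_lr` and the `warmup_steps` value are assumptions made for illustration, not settings taken from this repository.

```python
import math


def inverse_sqrt_lr(step: int, peak_lr: float = 6e-4, warmup_steps: int = 1000) -> float:
    """Return the learning rate at a 1-indexed optimizer step.

    peak_lr matches the README's learning rate; warmup_steps is a
    hypothetical value chosen only for this sketch.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step <= warmup_steps:
        # Linear warmup from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * step / warmup_steps
    # After warmup, decay proportionally to 1 / sqrt(step).
    return peak_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    for s in (1, 500, 1000, 4000, 16000):
        print(s, inverse_sqrt_lr(s))
```

With these assumed defaults the rate reaches 6e-4 at step 1000 and halves each time the step count quadruples. Because the decay depends on the step count rather than on a fixed training length, changing the number of epochs from 3 to 2 would shorten the run without reshaping the curve under this form of the schedule.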