This is a fine-tuned version of RuRoBERTa-large for linguistic acceptability classification on the RuCoLA benchmark.
The hyperparameters used for fine-tuning are as follows (a configuration sketch follows the list):
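A minimal inference sketch with the `transformers` library is shown below. The repository id and label names are placeholders, since they are not spelled out in this card; check the model page and `config.json` for the actual values.

```python
# Minimal inference sketch (requires the transformers and torch packages).
# "<this-repo-id>" is a placeholder for this model's Hugging Face repository id.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "<this-repo-id>"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

sentence = "Мама мыла раму."  # example Russian sentence to classify
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
label_id = logits.argmax(dim=-1).item()
print(model.config.id2label[label_id])  # label names as defined in config.json
```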
- 5 training epochs (with early stopping based on validation MCC)
- Peak learning rate: 1e-5, linear warmup for 10% of total training time
- Weight decay: 1e-4
- Batch size: 32
- Random seed: 5
- Optimizer: torch.optim.AdamW
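The sketch below shows one way these hyperparameters could map onto a standard `transformers` `Trainer` setup; it is illustrative, not the exact training script. The base checkpoint name is assumed to be `ai-forever/ruRoberta-large`, and `train_dataset` / `val_dataset` stand for tokenized RuCoLA train and validation splits prepared beforehand.

```python
# Hedged configuration sketch mirroring the listed hyperparameters.
import numpy as np
from sklearn.metrics import matthews_corrcoef
from transformers import (
    AutoModelForSequenceClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

def compute_metrics(eval_pred):
    # MCC on the validation set, used for model selection / early stopping
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"mcc": matthews_corrcoef(labels, preds)}

training_args = TrainingArguments(
    output_dir="ruroberta-large-rucola",
    num_train_epochs=5,              # 5 training epochs
    learning_rate=1e-5,              # peak learning rate
    warmup_ratio=0.1,                # linear warmup for 10% of training
    lr_scheduler_type="linear",
    weight_decay=1e-4,
    per_device_train_batch_size=32,
    seed=5,
    optim="adamw_torch",             # torch.optim.AdamW
    eval_strategy="epoch",           # "evaluation_strategy" in older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="mcc",
)

model = AutoModelForSequenceClassification.from_pretrained(
    "ai-forever/ruRoberta-large", num_labels=2  # assumed base checkpoint id
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,     # placeholder: tokenized RuCoLA train split
    eval_dataset=val_dataset,        # placeholder: tokenized RuCoLA validation split
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],
)
trainer.train()
```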