---
library_name: transformers
tags:
- generated_from_trainer
license: mit
datasets:
- SetFit/mnli
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: modernbert-setfit-nli
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: SetFit/mnli
      type: SetFit/mnli
      args: SetFit/mnli
    metrics:
    - type: precision
      value: 0.8463114754098361
      name: Precision
    - type: recall
      value: 0.8463114754098361
      name: Recall
    - type: f1
      value: 0.8463114754098361
      name: F1
    - type: accuracy
      value: 0.8463114754098361
      name: Accuracy
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
---

# modernbert-setfit-nli

## Model Description

This model is a fine-tuned version of [`answerdotai/ModernBERT-base`](https://huggingface.co/answerdotai/ModernBERT-base) trained on a subset of the [SetFit/mnli](https://huggingface.co/datasets/SetFit/mnli) dataset. It is trained for natural language inference (NLI), where the goal is to determine the relationship between two text inputs (entailment, contradiction, or neutral).

## Intended Uses & Limitations

### Intended Uses

- **Natural Language Inference (NLI):** Suitable for classifying the relationship between a pair of sentences (a usage sketch is provided at the end of this card).
- **Text Understanding Tasks:** Can be applied to other tasks that require sentence-pair classification.

### Limitations

- **Dataset-Specific Biases:** The model was fine-tuned on 30,000 samples from the SetFit/mnli dataset and may not generalize well to domains that differ significantly from the training data.
- **Context Length:** The tokenizer's maximum sequence length is 512 tokens; longer inputs are truncated.
- **Resource Intensive:** Efficient inference on large datasets may require a modern GPU.

This model is a starting point for NLI tasks and may need further fine-tuning for domain-specific applications.

## Training Details

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3

### Framework versions

- Transformers 4.48.0
- PyTorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0

## References

- **GitHub Repository:** The training code is available in my [GitHub repository](https://github.com/sfarrukhm/model_finetune.git).
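
## How to Use

A minimal inference sketch (an illustration, not code from the training repository): the repository id `your-username/modernbert-setfit-nli` below is a placeholder for wherever this model is hosted, and the label names returned depend on the `id2label` mapping stored in the exported config.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder repository id; replace with the actual Hub path of this model.
model_id = "your-username/modernbert-setfit-nli"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Encode the sentence pair; inputs beyond 512 tokens are truncated.
inputs = tokenizer(premise, hypothesis, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()
# The label string depends on how the config was exported
# (e.g. LABEL_0/LABEL_1/LABEL_2 or entailment/neutral/contradiction).
print(model.config.id2label[predicted_class])
```

The `truncation=True, max_length=512` arguments mirror the context-length limitation noted above: longer premise/hypothesis pairs are cut off rather than rejected.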
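
## Training Configuration Sketch

The hyperparameters listed under "Training Details" map roughly onto `transformers.TrainingArguments` as sketched below. This is an assumption drawn from the listed values, not the actual training script; dataset loading, tokenization, and the `Trainer` call are omitted (see the GitHub repository for the full code).

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the listed hyperparameters;
# not the original training script.
training_args = TrainingArguments(
    output_dir="modernbert-setfit-nli",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",        # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```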