bert-finetuned-mrpc: Fine-tuned for Sequence Classification
This model is a fine-tuned version of bert-base-uncased for sequence classification, specifically paraphrase detection on MRPC.
Dataset
- Name: MRPC (Microsoft Research Paraphrase Corpus)
- Description: The MRPC dataset consists of sentence pairs automatically extracted from online news sources, with human annotations indicating whether the sentences in each pair are semantically equivalent.
- Source: The dataset is part of the GLUE benchmark.
Model description
This model is a fine-tuned version of BERT-base-uncased, specifically trained to determine if two sentences are paraphrases of each other. The model outputs 1 if the sentences are equivalent and 0 if they are not.
- Model architecture: BertForSequenceClassification
- Task: sequence-classification
- Training dataset: GLUE MRPC dataset
- Number of parameters: 109,483,778
- Maximum sequence length: 512
- Vocab size: 30522
- Hidden size: 768
- Number of attention heads: 12
- Number of hidden layers: 12
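The snippet below is a minimal inference sketch using the Hugging Face transformers library. The repo id "bert-finetuned-mrpc" is a placeholder; substitute the actual model path on the Hub.

```python
# Minimal usage sketch (the repo id below is a placeholder).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-finetuned-mrpc"  # replace with the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

sentence1 = "The company posted record profits this quarter."
sentence2 = "This quarter, the firm reported its highest profits ever."

# Encode the two sentences as a single pair, as BERT was trained on MRPC.
inputs = tokenizer(sentence1, sentence2, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Label 1 = paraphrase (equivalent), label 0 = not a paraphrase.
prediction = logits.argmax(dim=-1).item()
print("paraphrase" if prediction == 1 else "not paraphrase")
```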
Intended Uses & Limitations
Intended Uses
Paraphrase Detection: This model can be used to determine if two sentences are paraphrases of each other, which is useful in applications like duplicate question detection in forums, semantic search, and text summarization.
Educational Purposes: Can be used to demonstrate fine-tuning of transformer models on specific tasks.
Limitations
Dataset Bias: The MRPC dataset contains sentence pairs from specific news sources, which might introduce bias. The model might not perform well on text from other domains.
Context Limitations: The model evaluates sentences pairwise without considering broader context, which might lead to incorrect paraphrase detections in complex contexts.
Training procedure
- Optimizer: AdamW
- Learning rate: 5e-5
- Epochs: 3
- Batch size: 8
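As a rough illustration of how these hyperparameters could be applied, the sketch below fine-tunes bert-base-uncased on GLUE MRPC with the Trainer API. It is a reconstruction under the listed settings, not the original training script.

```python
# Sketch of the fine-tuning setup, assuming the hyperparameters listed above.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

raw_datasets = load_dataset("glue", "mrpc")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Tokenize each sentence pair; padding is handled dynamically by the Trainer.
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)

tokenized = raw_datasets.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="bert-finetuned-mrpc",
    learning_rate=5e-5,              # from the hyperparameters above
    num_train_epochs=3,
    per_device_train_batch_size=8,
    optim="adamw_torch",             # AdamW optimizer
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```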
Evaluation results
- Accuracy: 0.8505
- F1: 0.8943
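These metrics correspond to the GLUE MRPC metric (accuracy and F1) on the validation split. A hedged sketch of that computation, assuming `model` and `tokenizer` are loaded as in the usage example above:

```python
# Sketch of reproducing the reported metrics on the MRPC validation split.
# Assumes `model` and `tokenizer` are already loaded (see usage example).
import torch
import evaluate
from datasets import load_dataset

metric = evaluate.load("glue", "mrpc")   # reports accuracy and F1
validation = load_dataset("glue", "mrpc", split="validation")

predictions = []
for example in validation:
    inputs = tokenizer(example["sentence1"], example["sentence2"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(int(logits.argmax(dim=-1)))

results = metric.compute(predictions=predictions,
                         references=validation["label"])
print(results)  # e.g. {'accuracy': ..., 'f1': ...}
```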