|
--- |
|
language: |
|
- fa |
|
library_name: transformers |
|
widget: |
|
- text: "ز سوزناکی گفتار من [MASK] بگریست" |
|
example_title: "Poetry 1" |
|
- text: "نظر از تو برنگیرم همه [MASK] تا بمیرم که تو در دلم نشستی و سر مقام داری" |
|
example_title: "Poetry 2" |
|
- text: "هر ساعتم اندرون بجوشد [MASK] را وآگاهی نیست مردم بیرون را" |
|
example_title: "Poetry 3" |
|
- text: "غلام همت آن رند عافیت سوزم که در گدا صفتی [MASK] داند" |
|
example_title: "Poetry 4" |
|
- text: "این [MASK] اولشه." |
|
example_title: "Informal 1" |
|
- text: "دیگه خسته شدم! [MASK] اینم شد کار؟!" |
|
example_title: "Informal 2" |
|
- text: "فکر نکنم به موقع برسیم. بهتره [MASK] این یکی بشیم." |
|
example_title: "Informal 3" |
|
- text: "تا صبح بیدار موندم و داشتم برای [MASK] آماده می شدم." |
|
example_title: "Informal 4" |
|
- text: "زندگی بدون [MASK] خستهکننده است." |
|
example_title: "Formal 1" |
|
- text: "در حکم اولیه این شرکت مجاز به فعالیت شد ولی پس از بررسی مجدد، مجوز این شرکت [MASK] شد." |
|
example_title: "Formal 2" |
|
--- |
|
|
|
|
|
# FaBERT: Pre-training BERT on Persian Blogs |
|
|
|
## Model Details |
|
|
|
FaBERT is a Persian BERT-base model pre-trained on the diverse HmBlogs corpus, which covers both informal and formal Persian text. Across a range of Natural Language Understanding (NLU) tasks, FaBERT consistently delivers notable improvements over comparable Persian and multilingual models while keeping a compact model size. The model is available on Hugging Face and can be used directly with the standard `transformers` API.
|
|
|
## Features |
|
- Pre-trained on the diverse HmBlogs corpus, consisting of more than 50 GB of text from Persian blogs
|
- Remarkable performance across various downstream NLP tasks |
|
- BERT architecture with 124 million parameters |
|
|
|
## Useful Links |
|
- **Repository:** [FaBERT on GitHub](https://github.com/SBU-NLP-LAB/FaBERT)
|
- **Paper:** [arXiv preprint](https://arxiv.org/abs/2402.06617) |
|
|
|
## Usage |
|
|
|
### Loading the Model with MLM head |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForMaskedLM |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("sbunlp/fabert") # make sure to use the default fast tokenizer |
|
model = AutoModelForMaskedLM.from_pretrained("sbunlp/fabert") |
|
``` |
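
For a quick sanity check, the model can also be loaded through the standard `fill-mask` pipeline. The sketch below reuses one of the widget examples above; the `top_k` value and printed fields are just the generic pipeline output, nothing FaBERT-specific.

```python
from transformers import pipeline

# Build a fill-mask pipeline on top of FaBERT (uses the default fast tokenizer).
fill_mask = pipeline("fill-mask", model="sbunlp/fabert")

# Widget example above: "Life without [MASK] is boring."
predictions = fill_mask("زندگی بدون [MASK] خسته‌کننده است.", top_k=5)

for pred in predictions:
    # Each prediction contains the candidate token and its score.
    print(pred["token_str"], round(pred["score"], 3))
```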
|
### Downstream Tasks |
|
|
|
Similar to the original English BERT, FaBERT can be [fine-tuned](https://huggingface.co/docs/transformers/en/training) on many downstream tasks.
|
|
|
Examples on Persian datasets are available in our [GitHub repository](#useful-links). |
|
|
|
**Make sure to use the default fast tokenizer.**
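
For downstream fine-tuning, the usual `transformers` `Trainer` workflow applies. The sketch below is only an illustration of that generic setup, not a recipe from the paper: the tiny in-memory dataset, label count, and hyperparameters are placeholders to be replaced with a real Persian task.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# The default fast tokenizer is the one to use with FaBERT.
tokenizer = AutoTokenizer.from_pretrained("sbunlp/fabert")
model = AutoModelForSequenceClassification.from_pretrained("sbunlp/fabert", num_labels=2)

# Tiny placeholder dataset; swap in a real Persian classification dataset.
train_ds = Dataset.from_dict({
    "text": ["این فیلم عالی بود.", "اصلا خوشم نیامد."],  # "This movie was great." / "I did not like it at all."
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="fabert-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

# With a tokenizer passed in, the Trainer pads each batch dynamically.
trainer = Trainer(model=model, args=args, train_dataset=train_ds, tokenizer=tokenizer)
trainer.train()
```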
|
|
|
## Training Details |
|
|
|
FaBERT was pre-trained with the Masked Language Modeling objective using Whole Word Masking (MLM with WWM), and the resulting perplexity on the validation set was 7.76.
|
|
|
| Hyperparameter | Value | |
|
|-------------------|:--------------:| |
|
| Batch Size | 32 | |
|
| Optimizer | Adam | |
|
| Learning Rate | 6e-5 | |
|
| Weight Decay | 0.01 | |
|
| Total Steps | 18 Million | |
|
| Warmup Steps | 1.8 Million | |
|
| Precision Format | TF32 | |
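
The exact pre-training pipeline is not reproduced here, but as a rough illustration of the whole word masking objective, `transformers` provides `DataCollatorForWholeWordMask`, which masks every WordPiece of a selected word together. The snippet below only shows how such a collator pairs with FaBERT's tokenizer; the sample sentence and the 15% masking probability are illustrative assumptions, not the authors' configuration.

```python
from transformers import AutoTokenizer, DataCollatorForWholeWordMask

tokenizer = AutoTokenizer.from_pretrained("sbunlp/fabert")

# Masks all WordPiece sub-tokens of a chosen word together and lets the model
# predict them, i.e. the MLM (WWM) objective described above.
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

# Illustrative sample sentence ("This is a sample sentence for testing.").
encoded = tokenizer(["این یک جمله نمونه برای آزمایش است."])
batch = collator([{"input_ids": ids} for ids in encoded["input_ids"]])

print(batch["input_ids"])  # inputs with some whole words replaced by [MASK]
print(batch["labels"])     # original token ids at masked positions, -100 elsewhere
```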
|
|
|
## Evaluation |
|
|
|
Here are some key performance results for the FaBERT model: |
|
|
|
**Sentiment Analysis** |
|
| Task | FaBERT | ParsBERT | XLM-R | |
|
|:-------------|:------:|:--------:|:-----:| |
|
| MirasOpinion | **87.51** | 86.73 | 84.92 | |
|
| MirasIrony | 74.82 | 71.08 | **75.51** | |
|
| DeepSentiPers | **79.85** | 74.94 | 79.00 | |
|
|
|
**Named Entity Recognition** |
|
| Task | FaBERT | ParsBERT | XLM-R | |
|
|:-------------|:------:|:--------:|:-----:| |
|
| PEYMA | **91.39** | 91.24 | 90.91 | |
|
| ParsTwiner | **82.22** | 81.13 | 79.50 | |
|
| MultiCoNER v2 | 57.92 | **58.09** | 51.47 | |
|
|
|
**Question Answering** |
|
| Task | FaBERT | ParsBERT | XLM-R | |
|
|:-------------|:------:|:--------:|:-----:| |
|
| ParsiNLU | **55.87** | 44.89 | 42.55 | |
|
| PQuAD | 87.34 | 86.89 | **87.60** | |
|
| PCoQA | **53.51** | 50.96 | 51.12 | |
|
|
|
**Natural Language Inference & QQP** |
|
| Task | FaBERT | ParsBERT | XLM-R | |
|
|:-------------|:------:|:--------:|:-----:| |
|
| FarsTail | **84.45** | 82.52 | 83.50 | |
|
| SBU-NLI | **66.65** | 58.41 | 58.85 | |
|
| ParsiNLU QQP | **82.62** | 77.60 | 79.74 | |
|
|
|
**Model Size**
|
| | FaBERT | ParsBERT | XLM-R | |
|
|:-------------|:------:|:--------:|:-----:| |
|
| Parameter Count (M) | 124 | 162 | 278 | |
|
| Vocabulary Size (K) | 50 | 100 | 250 | |
|
|
|
For a more detailed performance analysis, refer to [the paper](https://arxiv.org/abs/2402.06617).
|
|
|
## How to Cite |
|
|
|
If you use FaBERT in your research or projects, please cite it using the following BibTeX: |
|
|
|
```bibtex |
|
@article{masumi2024fabert, |
|
title={FaBERT: Pre-training BERT on Persian Blogs}, |
|
author={Masumi, Mostafa and Majd, Seyed Soroush and Shamsfard, Mehrnoush and Beigy, Hamid}, |
|
journal={arXiv preprint arXiv:2402.06617}, |
|
year={2024} |
|
} |
|
``` |
|
|