ParsBERT (v3.0)

A Transformer-based Model for Persian Language Understanding

ParsBERT v3.0, the new version of BERT for Persian, is now available. It handles the zero-width non-joiner (ZWNJ) character used in Persian writing, and it was trained on new corpora of multiple types with a new vocabulary.
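For readers unfamiliar with the character, the sketch below illustrates what the zero-width non-joiner is at the Unicode level. It uses only the Python standard library and does not touch the model; the example word and the helper name `count_zwnj` are illustrative choices, not part of ParsBERT.

```python
import unicodedata

# Zero-width non-joiner: an invisible character (U+200C) that keeps
# adjacent letters from joining without inserting a visible space.
ZWNJ = "\u200c"

# Illustrative word: "mi-ravam" ("I go"), written as prefix + ZWNJ + stem.
PREFIX = "\u0645\u06cc"       # the two letters of the prefix "mi"
STEM = "\u0631\u0648\u0645"   # the three letters of the stem "ravam"

with_zwnj = PREFIX + ZWNJ + STEM
without_zwnj = PREFIX + STEM


def count_zwnj(text: str) -> int:
    """Return how many ZWNJ characters occur in text (hypothetical helper)."""
    return text.count(ZWNJ)


print(unicodedata.name(ZWNJ))
print(count_zwnj(with_zwnj), count_zwnj(without_zwnj))
```

The two strings differ by exactly one code point even though the difference is invisible in many editors, which is why a tokenizer and vocabulary need to treat the character deliberately.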

Introduction

ParsBERT is a monolingual language model based on Google’s BERT architecture. It is pre-trained on large Persian corpora covering various writing styles and numerous subjects (e.g., scientific texts, novels, news).
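A minimal loading sketch, assuming the Hugging Face `transformers` library is installed and that the checkpoint is published under the model ID shown on this page (`HooshvareLab/bert-fa-zwnj-base`). The function name `load_parsbert` is hypothetical, and the generic `AutoModel` class is used here only as a safe default; a task-specific class may be more appropriate for your use case.

```python
MODEL_ID = "HooshvareLab/bert-fa-zwnj-base"  # model ID as listed on this page


def load_parsbert(model_id: str = MODEL_ID):
    """Load the tokenizer and encoder (hypothetical helper).

    The first call downloads the files from the Hugging Face Hub,
    so it needs network access.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_parsbert()` and then `tokenizer.tokenize(text)` on a Persian string is one way to inspect how the vocabulary splits the input.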

Paper presenting ParsBERT: arXiv:2005.12515

BibTeX entry and citation info

Please cite ParsBERT in publications as follows:

@article{ParsBERT,
    title={ParsBERT: Transformer-based Model for Persian Language Understanding},
    author={Mehrdad Farahani and Mohammad Gharachorloo and Marzieh Farahani and Mohammad Manthouri},
    journal={ArXiv},
    year={2020},
    volume={abs/2005.12515}
}

Questions?

Post a GitHub issue on the ParsBERT Issues repo.

Paper for HooshvareLab/bert-fa-zwnj-base

ParsBERT: Transformer-based Model for Persian Language Understanding

Paper • 2005.12515 • Published May 26, 2020