datasets:
- AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv1
language:
- ar
base_model:
- UBC-NLP/ARBERT
pipeline_tag: fill-mask
ArBERTV1_EL is a transformer-based Arabic language model fine-tuned on the Entity Linking (EL) task. It uses Knowledge Graphs (KGs) for the intrinsic evaluation of Masked Language Modeling (MLM) models, without evaluating the EL model directly. Training on the EL task is intended to let the model incorporate structured knowledge during pre-training.
Uses
Direct Use
Filling masked tokens in Arabic text, particularly in contexts enriched with knowledge from KGs.
Downstream Use
Can be further fine-tuned for Arabic NLP tasks that require semantic understanding, such as text classification or question answering.
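As a starting point for the text classification case, the sketch below loads the checkpoint with a sequence classification head. It assumes the standard `transformers` auto classes; the helper names `build_label_maps` and `load_classifier` and the label set are hypothetical, and the classification head is newly initialised, so it has to be trained on labelled data before its outputs mean anything.

```python
def build_label_maps(labels):
    """Build the id2label / label2id dicts stored in the model config."""
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_classifier(labels, model_name="AfnanTS/ArBERTV1_EL"):
    """Load the tokenizer and the encoder with an untrained classification head."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # The head on top of the encoder is randomly initialised (assumption:
    # the checkpoint ships only the MLM weights), so fine-tuning is required.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

For example, `load_classifier(["negative", "positive"])` would return a tokenizer and model ready to be passed to a training loop of your choice.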
How to Get Started with the Model
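from transformers import pipeline
# Load the fill-mask pipeline with the fine-tuned checkpoint
fill_mask = pipeline("fill-mask", model="AfnanTS/ArBERTV1_EL")
# Example input: "The [MASK] language is very important."
fill_mask("اللغة [MASK] مهمة جدا.")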
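To inspect more than the raw return value, the sketch below ranks several candidates for the masked position. It assumes the usual `transformers` fill-mask output (a list of dicts with `token_str` and `score` keys) and the `top_k` call argument; the helper names `summarize_predictions` and `show_top_predictions` are illustrative and not part of the model's API.

```python
def summarize_predictions(predictions):
    """Turn fill-mask output into (token, score) pairs, best first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 4)) for p in ranked]


def show_top_predictions(text, k=5, model_name="AfnanTS/ArBERTV1_EL"):
    """Run the fill-mask pipeline on `text` and print the k best candidates."""
    # Imported lazily so summarize_predictions works without transformers.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_name)
    # Assumes `text` contains exactly one mask token (e.g. "[MASK]");
    # with several masks the pipeline returns a nested list instead.
    for token, score in summarize_predictions(fill_mask(text, top_k=k)):
        print(f"{token}\t{score}")
```

Calling `show_top_predictions("اللغة [MASK] مهمة جدا.")` downloads the model on first use and prints one candidate per line with its score.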
Training Details
Training Data
Trained on the ArLAMA dataset, which is designed to represent Knowledge Graphs in natural language.
Training Procedure
Continued pre-training of the ArBERTv1 model using the Entity Linking (EL) task.