---
|
base_model: aubmindlab/bert-base-arabertv02 |
|
tags: |
|
- generated_from_trainer |
|
model-index: |
|
- name: SA-History-NASEEJ-QA |
|
results: [] |
|
language: |
|
- ar |
|
library_name: transformers |
|
widget: |
|
- text: من كان الأكبر سنًا من آل سعود وتولى الإمارة؟
  context: >-
    بعد وفاة سعود بن محمد بن مقرن تولى الإمارة زيد بن مرخان بن وطبان، وكان
    الأكبر سناً من آل سعود، ولكن حكمه لم يمتد طويلًا لكبر سنه، مما دعا مقرن
    بن محمد بن مقرن إلى انتزاع الإمارة منه، لكن الأمور لم تستمر طويلًا لمقرن،
    وذلك عندما حاول الغدر بزيد بن مرخان الذي كان يحكم قبله، مما دعا محمد بن
    سعود ومقرن بن عبدالله إلى قتله، وكان ذلك سنة 1139هـ/1727م.

    بعد ذلك عاد إلى الإمارة زيد بن مرخان، إلا أنه عندما هجم على إمارة
    العيينة سعت - بعد ذلك - إلى التحايل عليه وطلبت التفاوض معه، وعندما ذهب
    تم قتله، وبعد قتل زيد بن مرخان تولى محمد بن سعود بن مقرن الإمارة في
    الدرعية سنة 1139هـ/1727م، وظل حكمه حتى سنة 1179هـ/1765م.
  example_title: تاريخ المملكة العربية السعودية
|
pipeline_tag: question-answering |
|
--- |
|
|
|
|
|
|
|
|
# Naseej-SA-History-QA |
|
|
|
This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) on a custom Arabic question answering dataset covering the history of Saudi Arabia.
|
It achieves the following results on the evaluation set: |
|
- Loss: 3.0791 |
|
|
|
## Model description |
|
The Naseej-SA-History-QA model is a fine-tuned version of the aubmindlab/bert-base-arabertv02 pre-trained BERT model, tailored and optimized for question answering about the history of Saudi Arabia. Given a question and an Arabic passage of historical context, it is designed to comprehend that context and extract an accurate answer from the passage.
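
Below is a minimal usage sketch with the `transformers` question-answering pipeline. The model id is a placeholder for wherever this checkpoint is actually hosted, and the context is a shortened excerpt of the widget example above:

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual path of this checkpoint.
qa = pipeline("question-answering", model="Naseej/SA-History-NASEEJ-QA")

result = qa(
    question="من كان الأكبر سنًا من آل سعود وتولى الإمارة؟",
    context="بعد وفاة سعود بن محمد بن مقرن تولى الإمارة زيد بن مرخان بن وطبان، وكان الأكبر سناً من آل سعود.",
)
print(result["answer"], round(result["score"], 3))  # answer span and confidence
```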
|
|
|
|
|
## Intended uses & limitations |
|
The Naseej-SA-History-QA model is intended for answering questions about the history of Saudi Arabia. It can be employed in educational and research settings to help students, scholars, and researchers look up information about Saudi Arabian history, and in NLP applications where historical context is a key factor, such as educational platforms, historical archives, and translation tools.

Limitations:

- The model's performance is contingent on the quality and accuracy of the data it was fine-tuned on; it may struggle with questions that deviate significantly from the training distribution.
- Its understanding of historical events is bounded by its training data, so it may not perform well on questions about more recent or less documented events.
- It may not fully comprehend nuanced or highly specific historical inquiries that require contextual understanding beyond the scope of its training data.
|
|
|
## Training and evaluation data |
|
The Naseej-SA-History-QA model was trained using a custom dataset comprising historical questions and corresponding context passages related to the history of Saudi Arabia. The dataset covers various historical periods and events, providing the model with a wide range of historical context to learn from. |
|
|
|
The evaluation set used during training was designed to assess the model's performance on question answering tasks. The evaluation set includes a variety of questions and context passages that challenge the model's ability to accurately answer questions about Saudi Arabian history. |
|
|
|
## Training procedure |
|
The Naseej-SA-History-QA model was fine-tuned using the aubmindlab/bert-base-arabertv02 pre-trained BERT model. The training process involved several key steps: |
|
|
|
**Dataset preparation:** A custom dataset was curated for training, consisting of pairs of historical questions and corresponding context passages, both in Arabic. The context passages provide the historical background needed to answer each question.
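
Since the dataset itself is not published, the following sketch only illustrates the assumed record layout, using a standard SQuAD-style extractive QA format (the field names and the offset value are assumptions, not the released data):

```python
# Hypothetical SQuAD-style training record: the answer text must occur
# verbatim in the context, together with its character start offset.
example = {
    "question": "من كان الأكبر سنًا من آل سعود وتولى الإمارة؟",
    "context": (
        "بعد وفاة سعود بن محمد بن مقرن تولى الإمارة زيد بن مرخان بن وطبان، "
        "وكان الأكبر سناً من آل سعود، ولكن حكمه لم يمتد طويلًا لكبر سنه."
    ),
    "answers": {
        "text": ["زيد بن مرخان بن وطبان"],
        "answer_start": [43],  # illustrative character offset, not a real value
    },
}
```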
|
|
|
**Tokenization:** The dataset was tokenized using the Tokenizers library, which converts the question and context text into the numerical token IDs the model processes.
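
As a rough illustration (reusing the hypothetical `example` record above), extractive QA preprocessing typically encodes the question and context together and splits long contexts into overlapping windows; the `max_length` and `stride` values here are assumptions, not reported settings:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv02")

encoded = tokenizer(
    example["question"],
    example["context"],
    truncation="only_second",     # truncate only the context, never the question
    max_length=384,               # assumed window size
    stride=128,                   # assumed overlap between consecutive windows
    return_overflowing_tokens=True,
    return_offsets_mapping=True,  # token-to-character map for locating answer spans
)
```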
|
|
|
**Model fine-tuning:** The tokenized dataset was used to fine-tune the aubmindlab/bert-base-arabertv02 base model with the Transformers library, adapting it to the specific task of question answering about Saudi Arabian history.
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (the sketch after this list shows how they map onto `TrainingArguments`):
|
- learning_rate: 2e-05 |
|
- train_batch_size: 16 |
|
- eval_batch_size: 16 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 9 |
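
A minimal sketch of wiring these hyperparameters into `Trainer`; the output directory and the tokenized `train_dataset`/`eval_dataset` variables are placeholders, and Adam's betas and epsilon are left at their defaults, which match the values listed above:

```python
from transformers import (
    AutoModelForQuestionAnswering,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

model = AutoModelForQuestionAnswering.from_pretrained(
    "aubmindlab/bert-base-arabertv02"
)

args = TrainingArguments(
    output_dir="sa-history-naseej-qa",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=9,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",  # matches the per-epoch validation losses below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: tokenized as described above
    eval_dataset=eval_dataset,    # placeholder
    data_collator=default_data_collator,
)
trainer.train()
```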
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 11   | 4.9014          |
| No log        | 2.0   | 22   | 4.7432          |
| No log        | 3.0   | 33   | 4.6212          |
| No log        | 4.0   | 44   | 4.6347          |
| No log        | 5.0   | 55   | 4.6101          |
| No log        | 6.0   | 66   | 4.6209          |
| No log        | 7.0   | 77   | 4.6445          |
| No log        | 8.0   | 88   | 4.6284          |
| No log        | 9.0   | 99   | 4.6226          |

The training loss column shows `No log` because each epoch has only 11 optimization steps, fewer than the `Trainer`'s default logging interval of 500 steps.
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.32.0 |
|
- Pytorch 2.0.1 |
|
- Datasets 2.14.4 |
|
- Tokenizers 0.13.3 |