Update README.md
README.md
CHANGED
@@ -27,12 +27,16 @@ This Named Entity Recognition (NER) model is designed to extract book titles fro
 The model has been fine-tuned and evaluated on a Dutch dataset of 12,535 book reviews from the Leeuwarder Courant, containing 23,529 labeled book titles. The dataset uses the IO tagging scheme. The data was split into a training set (70%), a validation set (15%), and a test set (15%). Training used the Majority or Minority loss function and achieved an F1 score of 84.3%, precision of 83.4%, and recall of 85.2% on the test set.
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/661fcac6ccc447675983951b/Ap95lefSlrwJGDg6eupVF.png)
 
-
+## Model Description
 
 - **Model type:** XLM-RoBERTa
 - **Language(s):** Dutch
 - **Fine-tuned from model:** [FacebookAI/xlm-roberta-large-finetuned-conll03-english](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english)
 
+## Model Flaws
+- Struggles to identify the subtitles of book titles accurately.
+- When a book title is mentioned multiple times within the same review, the model tends to mark it only once, missing subsequent occurrences.
+
 ## Uses
 
 This model is intended for extracting book titles from Dutch texts and is particularly useful for text analysis in the literary domain.
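
The README mentions IO tagging and notes that repeated mentions of a title within one review are often missed. The sketch below illustrates both points under stated assumptions. It is not the authors' pipeline. The label name `I-TITLE`, the helper names `io_tags_to_spans` and `propagate_titles`, and the example sentence are hypothetical, and the exact-string matching is one possible post-processing workaround rather than anything the model card describes.

```python
import re
from typing import List, Tuple


def io_tags_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Group consecutive non-"O" tags into (start, end) token spans (end exclusive).

    IO tagging has no "B-" marker, so two titles that directly follow
    each other would be merged into a single span.
    """
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag != "O" and start is None:
            start = i                    # a span opens here
        elif tag == "O" and start is not None:
            spans.append((start, i))     # the span closed on the previous token
            start = None
    if start is not None:                # span runs to the end of the sequence
        spans.append((start, len(tags)))
    return spans


def propagate_titles(text: str, titles: List[str]) -> List[Tuple[int, int, str]]:
    """Return character offsets of every exact occurrence of each detected title."""
    found = set()
    for title in titles:
        for match in re.finditer(re.escape(title), text):
            found.add((match.start(), match.end(), title))
    return sorted(found)


# Hypothetical example: only the first mention carries a title tag.
tokens = ["In", "De", "Avonden", "beschrijft", "Reve", "een", "winter", ".",
          "De", "Avonden", "verscheen", "in", "1947", "."]
tags = ["O", "I-TITLE", "I-TITLE"] + ["O"] * 11

spans = io_tags_to_spans(tags)
titles = [" ".join(tokens[s:e]) for s, e in spans]
mentions = propagate_titles(" ".join(tokens), titles)
```

Here `spans` holds the single tagged span, while `mentions` also recovers the second, untagged occurrence of the same title. Exact matching will miss inflected or abbreviated mentions and can over-match very short titles, so it should be treated as a heuristic.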