license: cc-by-nc-sa-4.0
base_model: microsoft/layoutlmv2-base-uncased
tags:
- generated_from_trainer
datasets:
- cord
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: layoutlmv2-finetuned-cord
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: cord
      type: cord
      config: cord
      split: validation
      args: cord
    metrics:
    - name: Precision
      type: precision
      value: 0.9652945924132365
    - name: Recall
      type: recall
      value: 0.9676375404530745
    - name: F1
      type: f1
      value: 0.9664646464646465
    - name: Accuracy
      type: accuracy
      value: 0.9702653247941445
Overfitting issue
I used this Colab notebook to fine-tune LayoutLMv2ForTokenClassification on the CORD dataset: https://colab.research.google.com/drive/1AXh3G3-VmbMWlwbSvesVIurzNlcezTce?usp=sharing
Here is the resulting model: https://huggingface.co/doc2txt/layoutlmv2-finetuned-cord
- F1: 0.9665
The results are indeed very good on the test set. However, on any other receipt (printed or PDF) the predictions are completely off.
So for some reason the model seems to be overfitting to the CORD dataset, even though I test it on similar images.
I don't think there is data leakage, unless the CORD dataset is not clean (I assume it is).
What could be the reason for this? Is it some inherent property of LayoutLM? The LayoutLM models are somewhat old, and the project seems abandoned...
I don't have much experience, so I would appreciate any info. Thanks!
Here is example code showing how to run this model on a folder of images: https://huggingface.co/doc2txt/layoutlmv2-finetuned-cord/blob/main/LayoutLMv2Main_cord2_gOcr_folder.py
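One thing that may be worth checking when running on receipts outside CORD is the box format. LayoutLMv2 expects every word box as (x0, y0, x1, y1) scaled to the 0-1000 range relative to the page size, so boxes passed in raw pixel coordinates would not match what the model saw during fine-tuning. The sketch below shows that normalization, assuming the OCR engine returns pixel coordinates. `normalize_box` is a hypothetical helper written for illustration and is not taken from the linked script, and this is only one possible cause of the mismatch.

```python
def normalize_box(box, width, height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range."""
    x0, y0, x1, y1 = box
    # Clamp to the page so OCR boxes that spill past the edge stay valid,
    # and sort so that x0 <= x1 and y0 <= y1.
    x0, x1 = sorted((min(max(x0, 0), width), min(max(x1, 0), width)))
    y0, y1 = sorted((min(max(y0, 0), height), min(max(y1, 0), height)))
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


# A word box on a 500 x 1000 pixel receipt image (made-up numbers).
print(normalize_box((50, 100, 250, 140), width=500, height=1000))
```

Comparing a few normalized boxes and the word order from the custom OCR against a CORD sample is a cheap way to see whether the inputs look alike before blaming the model itself.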
layoutlmv2-finetuned-cord
This model is a fine-tuned version of microsoft/layoutlmv2-base-uncased on the cord dataset. It achieves the following results on the evaluation set:
- Loss: 0.2819
- Precision: 0.9653
- Recall: 0.9676
- F1: 0.9665
- Accuracy: 0.9703
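As a sanity check on these numbers, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the unrounded precision and recall values listed in the model metadata. It only verifies that the reported figures are internally consistent and says nothing about performance outside CORD.

```python
# Unrounded values copied from the model-index metadata.
precision = 0.9652945924132365
recall = 0.9676375404530745

# F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))
```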
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
- mixed_precision_training: Native AMP
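To make the schedule concrete: with a linear scheduler, a warmup ratio of 0.1, and the 2000 optimizer steps shown in the results table, the learning rate would rise for the first 200 steps and then decay linearly to zero. The sketch below follows the usual linear-warmup, linear-decay rule. `linear_lr` is a hypothetical helper for illustration, not the Trainer's own API.

```python
PEAK_LR = 5e-05                   # learning_rate from the list above
TOTAL_STEPS = 2000                # 5 epochs x 400 steps per epoch
WARMUP_STEPS = TOTAL_STEPS // 10  # lr_scheduler_warmup_ratio = 0.1


def linear_lr(step):
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at the final step.
    remaining = (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, remaining)


for step in (0, 100, 200, 1100, 2000):
    print(step, linear_lr(step))
```

Under these assumptions the peak is reached at step 200, the rate is back to half the peak at step 1100, and it reaches zero at step 2000.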
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
No log | 1.0 | 400 | 1.2752 | 0.8527 | 0.8382 | 0.8454 | 0.8481 |
1.9583 | 2.0 | 800 | 0.6372 | 0.8799 | 0.8948 | 0.8873 | 0.9021 |
0.7097 | 3.0 | 1200 | 0.4255 | 0.9241 | 0.9264 | 0.9253 | 0.9414 |
0.3845 | 4.0 | 1600 | 0.3021 | 0.9414 | 0.9482 | 0.9448 | 0.9611 |
0.2699 | 5.0 | 2000 | 0.2819 | 0.9653 | 0.9676 | 0.9665 | 0.9703 |
Framework versions
- Transformers 4.37.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1