metadata
license: cc-by-nc-sa-4.0
language:
  - en
library_name: transformers
tags:
  - finance
metrics:
  - accuracy

Model

This model is a fine-tuned version of microsoft/layoutlmv3-base, trained on the Financial Documents Clustering dataset from Kaggle.

It classifies each document image into one of the following five classes:

  • Income Statements
  • Balance Sheets
  • Cash Flows
  • Notes
  • Others
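As a rough illustration of how a prediction maps back to these classes, the sketch below picks the highest-scoring logit and returns its label. The index order in `LABELS` is an assumption taken from the order of the list above, and `predict_label` is a hypothetical helper; the authoritative mapping is whatever the checkpoint stores in `model.config.id2label`.

```python
from typing import List, Sequence

# ASSUMPTION: class indices follow the order of the list above.
# Check model.config.id2label on the actual checkpoint before relying on this.
LABELS: List[str] = [
    "Income Statements",
    "Balance Sheets",
    "Cash Flows",
    "Notes",
    "Others",
]


def predict_label(logits: Sequence[float]) -> str:
    """Return the label whose logit is largest (hypothetical helper)."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best_index]
```

With a real model, `logits` would be a single row of the classification head's output converted to a plain list.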

Training

The OCR data (words and their bounding boxes) used by this model comes from EasyOCR instead of Tesseract, the default OCR engine for LayoutLMv3 in transformers.
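Because the OCR step is external, the EasyOCR output has to be converted into the words and boxes that LayoutLMv3 expects: one `[x0, y0, x1, y1]` box per word, scaled to the 0-1000 range. The sketch below shows one way to do that conversion. The function name `easyocr_to_layoutlm` is hypothetical and this is not necessarily the preprocessing used to train this model.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four corner points, recognized text, confidence).
OcrResult = Tuple[Sequence[Sequence[float]], str, float]


def easyocr_to_layoutlm(
    results: Sequence[OcrResult], image_width: int, image_height: int
) -> Tuple[List[str], List[List[int]]]:
    """Convert EasyOCR detections to words and 0-1000 normalized boxes."""
    words: List[str] = []
    boxes: List[List[int]] = []
    for corners, text, _confidence in results:
        xs = [point[0] for point in corners]
        ys = [point[1] for point in corners]
        # Collapse the four-point polygon to an axis-aligned box,
        # then scale to the 0-1000 coordinate space LayoutLMv3 uses.
        box = [
            int(1000 * min(xs) / image_width),
            int(1000 * min(ys) / image_height),
            int(1000 * max(xs) / image_width),
            int(1000 * max(ys) / image_height),
        ]
        # Clamp in case a detection extends slightly past the image edge.
        boxes.append([max(0, min(1000, value)) for value in box])
        words.append(text)
    return words, boxes
```

In practice, `results` would come from `easyocr.Reader(["en"]).readtext(image_path)`, and the returned words and boxes would be passed to a `LayoutLMv3Processor` loaded with `apply_ocr=False` so that the processor does not run Tesseract itself.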

Libraries

  • transformers 4.25.1
  • pytorch-lightning 1.8.6
  • torchmetrics 0.11.0
  • easyocr 1.6.2