---
license: cc-by-nc-sa-4.0
language:
- en
library_name: transformers
tags:
- finance
metrics:
- accuracy
---

## Model

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base), trained on the [Financial Documents Clustering Kaggle Dataset](https://www.kaggle.com/datasets/drcrabkg/financial-statements-clustering). It classifies document images into one of the following five classes:

- Income Statements
- Balance Sheets
- Cash Flows
- Notes
- Others

## Training

This model was trained on OCR output from [EasyOCR](https://github.com/JaidedAI/EasyOCR) rather than the Tesseract engine that the LayoutLMv3 processor uses by default. For inference, run EasyOCR on the input images as well so that the text and bounding boxes match what the model saw during training.

## Libraries

- transformers 4.25.1
- pytorch-lightning 1.8.6
- torchmetrics 0.11.0
- easyocr 1.6.2
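
## Example

Because the model relies on EasyOCR rather than Tesseract, the processor's built-in OCR has to be switched off and the text and boxes supplied by hand. The following is a minimal sketch of one way to do that, not the author's training code. The helper names `easyocr_to_layoutlm` and `classify` are hypothetical, and so is the `model_id` argument: this card does not state the repository id, so pass your own. The sketch also assumes that EasyOCR's text segments can be passed in place of individual words and that the checkpoint's `id2label` mapping holds the five class names.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four corner points, text, confidence).
OcrResult = Tuple[Sequence[Sequence[float]], str, float]


def easyocr_to_layoutlm(
    results: Sequence[OcrResult], width: int, height: int
) -> Tuple[List[str], List[List[int]]]:
    """Turn EasyOCR detections into text segments and 0-1000 boxes.

    Hypothetical helper: LayoutLMv3 expects [x0, y0, x1, y1] boxes
    normalized to a 0-1000 grid, while EasyOCR returns four corner
    points in pixel coordinates.
    """
    words: List[str] = []
    boxes: List[List[int]] = []
    for points, text, _confidence in results:
        if not text.strip():
            continue  # skip empty detections
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        # Axis-aligned box around the (possibly rotated) quadrilateral,
        # scaled to the 0-1000 grid.
        scaled = [
            1000 * min(xs) / width,
            1000 * min(ys) / height,
            1000 * max(xs) / width,
            1000 * max(ys) / height,
        ]
        # Clamp, since detections can extend slightly past the image edge.
        boxes.append([max(0, min(1000, int(v))) for v in scaled])
        words.append(text)
    return words, boxes


def classify(image_path: str, model_id: str) -> str:
    """Classify one document image. `model_id` is an assumption (see above)."""
    # Heavy imports are kept local so the helper above stays dependency-free.
    import easyocr
    import torch
    from PIL import Image
    from transformers import (
        LayoutLMv3ForSequenceClassification,
        LayoutLMv3Processor,
    )

    image = Image.open(image_path).convert("RGB")
    reader = easyocr.Reader(["en"])
    words, boxes = easyocr_to_layoutlm(reader.readtext(image_path), *image.size)

    # apply_ocr=False disables the built-in Tesseract step.
    processor = LayoutLMv3Processor.from_pretrained(
        "microsoft/layoutlmv3-base", apply_ocr=False
    )
    model = LayoutLMv3ForSequenceClassification.from_pretrained(model_id)
    model.eval()

    encoding = processor(
        image, words, boxes=boxes, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**encoding).logits
    return model.config.id2label[int(logits.argmax(-1))]
```

A call would look like `classify("statement.png", "your-namespace/your-model")`, where both arguments are placeholders. Pages on which EasyOCR finds no text are not handled here and would need a guard before the processor call.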