INTRODUCTION
can change depending on the context in which it is used. As an artifact of the
communication channel, not all documents are born digital, and document
quality can vary greatly: some documents are handwritten, scanned at low
resolution, or merely photographed. Furthermore, documents often do not follow
standardized templates and can be highly variable in terms of layout,
structure, and content. Finally, the longer the document, the more
computationally demanding it becomes to process, and the more likely it is to
induce errors, which can be harder to detect.
Addressing the inherent challenges of document processing, and achieving high
levels of accuracy, processing speed, reliability, robustness, and scalability in
DU forms the applied scope of this thesis.
(II) Consider the earlier example of the birth certificate. While I might not
much appreciate the manual handling of this document, I would be pretty upset
if they had registered my baby girl’s name (Feliz, in the Spanish spelling,
without an accent on the ‘e’) incorrectly, as this could have further
repercussions. Whereas such an error might be easily rectified, this is not so
easy in the case of a mortgage application, where wrong information could lead
to a rejection of the application or, even worse, a loan agreement with the
wrong terms and conditions. This demonstrates that, even when full automation
of document processing is in high demand, it is not always desirable if the
risk of failure is too large.
Nevertheless, much of the potential for automation remains untapped, and
organizations are increasingly looking for solutions to fully automate their
document processing workflows. However, full automation, implying perfect
recognition of document categories and impeccable information extraction, is an
unattainable goal with the current state of technology [79].
The more realistic objective is Intelligent Automation (IA) (elaborated on in
Section 2.4), where the goal is to have the machine estimate confidence in its
predictions, deriving business value from as high a volume as possible of
perfect predictions (Straight-Through-Processing, STP) without incurring extra
costs (False Positives, FP).
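The trade-off between STP and FP can be made concrete with a minimal sketch.
It assumes one particular reading of the terms above: a prediction is
automated when its confidence reaches a threshold, an automated and correct
prediction counts as STP, an automated but wrong prediction counts as FP, and
everything else is routed to human review. The function name, the threshold
values, and the toy data are hypothetical and serve only as illustration.

```python
from typing import Dict, Sequence


def automation_report(
    confidences: Sequence[float],
    correct: Sequence[bool],
    threshold: float,
) -> Dict[str, float]:
    """Summarize confidence-thresholded automation (hypothetical helper).

    Assumed definitions, not taken from the source:
      - automated: confidence >= threshold
      - STP: automated and correct
      - FP: automated and wrong
      - review: not automated, handled by a human
    All rates are fractions of the total number of predictions.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Correctness flags of the predictions that pass the threshold.
    automated = [ok for c, ok in zip(confidences, correct) if c >= threshold]
    stp = sum(1 for ok in automated if ok)
    fp = len(automated) - stp

    return {
        "stp_rate": stp / total,
        "fp_rate": fp / total,
        "review_rate": (total - len(automated)) / total,
    }


# Toy data: five predictions with their confidence and correctness.
toy_confidences = [0.99, 0.95, 0.90, 0.60, 0.40]
toy_correct = [True, True, False, True, False]

strict = automation_report(toy_confidences, toy_correct, threshold=0.92)
lenient = automation_report(toy_confidences, toy_correct, threshold=0.50)
```

On this toy data, the strict threshold automates two predictions with no
false positives, while the lenient threshold automates four at the price of
one false positive. Under these assumptions, the quality of the confidence
estimates determines how much volume can be automated before false positives
appear.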
The leitmotif of this thesis will be the fundamental enablers of IA: confidence
estimation and failure prediction.
Calibrated uncertainty estimation, combined with efficient and effective DU
technology, will allow organizations to confidently automate their document
processing workflows, while keeping a human in the loop only for predictions
with a higher likelihood of being wrong. To date, however, little research has
addressed the question of how to make DU technology more reliable, as is
illustrated in a toy analysis (Table 1.1) reporting the absence of many
IA-related keywords in the
Proceedings of the 2021 International Conference on Document Analysis and