	INTRODUCTION

	Chapter 4 reflects on the current state of DU research, and proposes guidelines to
	foster document dataset construction efforts. It introduces two novel document
	classification datasets, RVL-CDIP_MP and RVL-CDIP-N_MP, as extensions
	of the RVL-CDIP dataset [165] with multipage documents. The datasets are
	accompanied by a comprehensive experimental analysis, which shows the promise
	of advancing multipage document representations and inference.
	Chapter 5 introduces the multi-faceted DUDE benchmark for assessing generic
	DU, which was also hosted as a competition to challenge the DU
	community. It describes the complete methodology and design of the dataset,
	targeting model innovations that can handle the complexity and variety of
	real-world documents and subtasks, and generalize to any documents and any
	questions. In addition to discussing the competition results, it presents
	our own comprehensive benchmarking study of SOTA LLMs, varying the
	context length and the modalities represented.
	Chapter 6 investigates how to efficiently obtain more semantic document layout
	awareness. We explore what affects the teacher-student knowledge gap in
	KD-based model compression methods, and design a downstream task setup
	to evaluate the robustness of distilled DLA models on zero-shot layout-aware
	DocVQA.
	Finally, Chapter 7 concludes the thesis with a summary of the main contributions
	(Section 7.1) and a discussion of future research directions. As a logical
	follow-up to Chapter 5, we propose in Section 7.2.2.1 how the DUDE dataset could
	be extended to become the ‘ultimate’ DU benchmark. The thesis ends with a
	hypothetical, informed design of how the research presented would form part of
	an end-to-end, fully-fledged IA-DU solution (Section 7.2.2.2).