WCAG Guidelines 2.1 Layout Detection
The model covers only some tags, such as header, text, figure, and table.
WCAG, however, has more tags, such as section, article, and span. Is there an option, or another model, that can handle those as well?
@aishwary456 thank you for your interest in our models!
All our document layout models are part of the Docling document conversion pipeline (https://docling-project.github.io/docling/) and support 17 document elements:
classes_map = {
    0: "Caption",
    1: "Footnote",
    2: "Formula",
    3: "List-item",
    4: "Page-footer",
    5: "Page-header",
    6: "Picture",
    7: "Section-header",
    8: "Table",
    9: "Text",
    10: "Title",
    11: "Document Index",
    12: "Code",
    13: "Checkbox-Selected",
    14: "Checkbox-Unselected",
    15: "Form",
    16: "Key-Value Region",
}
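As a minimal sketch of how this map might be used, the snippet below turns raw class indices into label names. The map is restated so the snippet runs on its own; `label_for` and `predicted_ids` are hypothetical names made up for illustration and are not part of the model or of Docling.

```python
# Class-index-to-label map, copied from the reply above.
classes_map = {
    0: "Caption",
    1: "Footnote",
    2: "Formula",
    3: "List-item",
    4: "Page-footer",
    5: "Page-header",
    6: "Picture",
    7: "Section-header",
    8: "Table",
    9: "Text",
    10: "Title",
    11: "Document Index",
    12: "Code",
    13: "Checkbox-Selected",
    14: "Checkbox-Unselected",
    15: "Form",
    16: "Key-Value Region",
}


def label_for(class_id: int) -> str:
    """Hypothetical helper: return the label name, or "Unknown" for an unmapped index."""
    return classes_map.get(class_id, "Unknown")


# Hypothetical class indices, standing in for whatever the layout model predicts.
predicted_ids = [10, 7, 9, 8]
labels = [label_for(i) for i in predicted_ids]
print(labels)
```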
The result of the Docling conversion pipeline is a DoclingDocument (https://docling-project.github.io/docling/concepts/docling_document/), which describes the full content structure: not only the individual elements, but also their hierarchies and groups. The DoclingDocument representation covers cases such as elements that are "children" of other elements (e.g. list items) and elements that span multiple pages.
We also have other classes, such as h1, h2, ..., h6, Label, Table of Contents, Reference, Note, Table Row, and Artifact. Can the Heron model or Docling cover these tags as well?
I recommend using Docling (https://github.com/docling-project/docling) to get a DoclingDocument. This gives you an API to access all document elements (including tables, hierarchy levels, etc.).
Please check this documentation:
- DoclingDocument: https://docling-project.github.io/docling/concepts/docling_document/
- Training material on how to use Docling: https://github.com/docling-project/docling-workshops/blob/main/workshops/2025_09_24/docling_lab_1.ipynb
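As a rough sketch of what accessing hierarchy levels through a DoclingDocument could look like, the snippet below converts a file and derives heading tags from titles and section headers. The mapping in `heading_tag` (title to h1, section headers shifted down by one and clamped at h6) is purely an assumption for illustration, not something Docling or WCAG prescribes, and `list_headings` is a hypothetical helper name. The Docling calls (`DocumentConverter`, `convert`, `iterate_items`) are used as I understand them from the Docling documentation; please verify against the version you have installed.

```python
from typing import Iterator, Optional, Tuple


def heading_tag(label: str, level: int = 1) -> Optional[str]:
    """Hypothetical mapping from a Docling item label to an HTML-style heading tag."""
    if label == "title":
        return "h1"
    if label == "section_header":
        # Assumption: place section headers below the title and clamp at h6.
        return f"h{min(level + 1, 6)}"
    return None  # Not a heading under this illustrative mapping.


def list_headings(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (tag, text) pairs for the headings found in the document at `source`."""
    # Imported lazily so heading_tag can be used without Docling installed.
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    for item, _depth in doc.iterate_items():
        # Labels are enum members; fall back to the raw value if they are plain strings.
        label = str(getattr(item.label, "value", item.label))
        tag = heading_tag(label, getattr(item, "level", 1))
        if tag is not None:
            yield tag, item.text
```

Calling `list(list_headings("report.pdf"))` (with a hypothetical file name) would then give the headings in reading order. Classes such as Label, Reference, Note, or Artifact are not covered by this sketch.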