
WCAG Guidelines 2.1 Layout Detection

#6
by aishwary456 - opened

The model covers only some tags, such as header, text, figure, and table.
WCAG, however, has more tags, such as section, article, and span. Is there an option, or another model, that can handle those as well?

Docling org

@aishwary456 thank you for your interest in our models!

All our document layout models are part of the Docling document conversion pipeline (https://docling-project.github.io/docling/) and support 17 document elements:

```python
classes_map = {
    0: "Caption",
    1: "Footnote",
    2: "Formula",
    3: "List-item",
    4: "Page-footer",
    5: "Page-header",
    6: "Picture",
    7: "Section-header",
    8: "Table",
    9: "Text",
    10: "Title",
    11: "Document Index",
    12: "Code",
    13: "Checkbox-Selected",
    14: "Checkbox-Unselected",
    15: "Form",
    16: "Key-Value Region",
}
```
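As a rough illustration of how this map might be used when running the layout model directly, the sketch below turns predicted class ids into label names. The helper name `decode_labels` and the `threshold` of 0.5 are assumptions made for this example only; they are not part of the model or of Docling.

```python
from typing import List, Sequence, Tuple

# Class map copied from the reply above.
classes_map = {
    0: "Caption", 1: "Footnote", 2: "Formula", 3: "List-item",
    4: "Page-footer", 5: "Page-header", 6: "Picture", 7: "Section-header",
    8: "Table", 9: "Text", 10: "Title", 11: "Document Index", 12: "Code",
    13: "Checkbox-Selected", 14: "Checkbox-Unselected", 15: "Form",
    16: "Key-Value Region",
}


def decode_labels(
    label_ids: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, tune for your data
) -> List[Tuple[str, float]]:
    """Pair each sufficiently confident detection with its class name."""
    decoded = []
    for class_id, score in zip(label_ids, scores):
        if score >= threshold:
            # Ids outside the map are kept visible instead of raising.
            name = classes_map.get(class_id, f"unknown-{class_id}")
            decoded.append((name, score))
    return decoded


if __name__ == "__main__":
    print(decode_labels([7, 9, 8], [0.9, 0.3, 0.75]))
```

Anything finer-grained than these 17 classes (for example heading levels) is not produced by the detector itself and would have to come from a later stage.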

The result of the Docling conversion pipeline is a DoclingDocument (https://docling-project.github.io/docling/concepts/docling_document/) which describes all the content structure, not only at the elementary level, but also defines their hierarchies and groups. The DoclingDocument representation covers use cases like elements that are "children" of other elements (e.g. "list items"), elements that span across multiple pages, etc.

We also have some other classes, such as h1, h2, ..., h6, Label, Table of Contents, Reference, Note, Table Row, and Artifact. Can the heron model or Docling cover these tags as well?

Docling org

I recommend using Docling (https://github.com/docling-project/docling) to obtain a DoclingDocument. This gives you an API to access all document elements (including tables, hierarchy levels, etc.).
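A minimal sketch of that workflow, assuming the `docling` package is installed and that `DocumentConverter`, `convert`, `result.document`, and `iterate_items()` behave as shown in the Docling quick-start. The helper names `label_histogram` and `convert_and_summarize` and the file name `report.pdf` are hypothetical and exist only for this example.

```python
from collections import Counter
from typing import Iterable, Tuple


def label_histogram(items: Iterable[Tuple[object, int]]) -> Counter:
    """Count (label, hierarchy level) pairs from (item, level) tuples."""
    counts: Counter = Counter()
    for item, level in items:
        label = getattr(item, "label", None)
        # Labels may be enums; fall back to the raw value for plain strings.
        name = str(getattr(label, "value", label))
        counts[(name, level)] += 1
    return counts


def convert_and_summarize(source: str) -> Counter:
    """Convert a document with Docling and summarize its element labels."""
    # Imported lazily so label_histogram works without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    # iterate_items() is assumed to yield (item, level) tuples in reading order.
    return label_histogram(result.document.iterate_items())


if __name__ == "__main__":
    # "report.pdf" is a placeholder path.
    for (label, level), n in sorted(convert_and_summarize("report.pdf").items()):
        print(f"level {level}  {label}: {n}")
```

The level value is what a downstream step could use to derive heading depth (h1 to h6) for a tagging scheme; that mapping is left to the caller here.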

Please check this documentation:
