ModelOne-Vision

🚀 Overview

ModelOne is a state-of-the-art multilingual model fine-tuned from the Microsoft Phi Vision architecture and weights. It is built for extracting structured information from a wide range of documents, images, and visual data, leveraging a specialized output_format token for flexible, structured output.

Base Model: Microsoft Phi Vision
Training Data: 7M+ samples across 70+ languages.
Output Flexibility: Supports free text, CSV, JSON, YAML, and XML output.

💡 Join the Beta Program

Sign up for the Beta Program to fine-tune, evaluate, and deploy this model on your own data and infrastructure.

🌍 Capabilities

ModelOne is ideal for:

Extracting structured data from scanned and photographed documents.
Interpreting complex tables, charts, and visual data representations.
Performing multilingual OCR across a broad set of languages.
Adapting outputs based on user-defined formats for seamless integration.
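Once the model has produced text in a requested format, downstream code still has to parse it. The following is a minimal sketch of that step using only the Python standard library. The helper name `parse_model_output`, the format labels, and the sample strings are all hypothetical illustrations: they are not part of ModelOne's API and are not real model output. YAML is omitted because the standard library has no YAML parser.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_model_output(text: str, fmt: str):
    """Parse raw output text in the given format (hypothetical helper)."""
    fmt = fmt.lower()
    if fmt == "json":
        return json.loads(text)
    if fmt == "csv":
        # One dict per row, keyed by the header line.
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "xml":
        # Flatten the direct children of the root element into a dict.
        root = ET.fromstring(text)
        return {child.tag: child.text for child in root}
    if fmt == "text":
        return text
    raise ValueError(f"unsupported format: {fmt}")


# Made-up strings standing in for model output; not produced by ModelOne.
sample_json = '{"invoice_number": "INV-001", "total": "42.00"}'
sample_csv = "invoice_number,total\nINV-001,42.00\n"
sample_xml = (
    "<invoice><invoice_number>INV-001</invoice_number>"
    "<total>42.00</total></invoice>"
)

parsed_json = parse_model_output(sample_json, "json")
parsed_csv = parse_model_output(sample_csv, "csv")
parsed_xml = parse_model_output(sample_xml, "xml")
```

Keeping the parser separate from the model call means malformed output surfaces as an ordinary parsing exception that the caller can catch and retry.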

📚 Training Data and Statistics

ModelOne was trained on a proprietary, high-quality dataset featuring a diverse range of documents and real-world images. The training process included over 7 million data points, with a strong focus on multilingual coverage.

🗂️ Dataset Composition

Document Type	Percentage	Details
Real-world Images	29%	Photos, scans of receipts, forms, ID cards
Multipage Documents	49%	Contracts, reports, books (up to 123 pages)
Single-page Documents	14%	Invoices, certificates, single-page forms
Visual Representations	8%	Tables, charts, graphs, diagrams
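As a rough sanity check, the percentages above can be converted into approximate sample counts. This sketch assumes the percentages are shares of samples (rather than pages) and uses 7,000,000 as a lower bound, since the card only states "7M+"; the resulting counts are therefore estimates, not published figures.

```python
TOTAL_SAMPLES = 7_000_000  # assumed lower bound; the card says "7M+"

# Percentages copied from the composition table.
composition_pct = {
    "Real-world Images": 29,
    "Multipage Documents": 49,
    "Single-page Documents": 14,
    "Visual Representations": 8,
}

# The four categories should account for the whole dataset.
total_pct = sum(composition_pct.values())

# Integer arithmetic keeps the estimates exact for whole percentages.
approx_counts = {
    name: TOTAL_SAMPLES * pct // 100 for name, pct in composition_pct.items()
}

for name, count in approx_counts.items():
    print(f"{name}: at least ~{count:,} samples")
```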

🌍 Language Coverage

Balanced representation across six main languages, with additional support for 64 more:

Language	Percentage
English	14.27%
Spanish	14.50%
French	14.34%
German	14.06%
Italian	14.06%
Russian	14.58%
Other	14.19% (64 additional languages)
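The table also implies how thinly the "Other" share is spread. The sketch below checks that the shares sum to 100% and estimates the average share per additional language. It assumes the 14.19% is split across exactly 64 languages and uses 7,000,000 samples as a lower bound ("7M+"); the actual per-language distribution is not stated in the card and is unlikely to be uniform.

```python
from decimal import Decimal

TOTAL_SAMPLES = 7_000_000  # assumed lower bound; the card says "7M+"
OTHER_LANGUAGE_COUNT = 64

# Percentages copied from the language table; Decimal avoids float rounding.
language_pct = {
    "English": Decimal("14.27"),
    "Spanish": Decimal("14.50"),
    "French": Decimal("14.34"),
    "German": Decimal("14.06"),
    "Italian": Decimal("14.06"),
    "Russian": Decimal("14.58"),
    "Other": Decimal("14.19"),
}

total_pct = sum(language_pct.values())

# Average share and sample count per additional language (uniform split assumed).
avg_other_pct = language_pct["Other"] / OTHER_LANGUAGE_COUNT
avg_other_samples = TOTAL_SAMPLES * avg_other_pct / 100

print(f"Total: {total_pct}%")
print(f"Average per additional language: {avg_other_pct:.2f}%")
print(f"Roughly {round(avg_other_samples):,} samples each")
```

Under these assumptions, each additional language averages well under 1% of the data, compared with about 14% for each of the six main languages.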

🔑 Key Insights

Balanced Language Representation: Each of the six main languages contributes approximately 14% of the training data, which is intended to support comparable performance across them.
Document Diversity: Includes a mix of single and multi-page documents, real-world images, and visual representations for comprehensive model training.
Robust Multilingual Capability: Coverage across 70+ languages makes it suitable for global applications needing extensive linguistic support.
