---
language:
- de
- en
- fr
- es
- ru
- it
---

# ModelOne-Vision

## 🚀 Overview

**ModelOne** is a multilingual vision-language model fine-tuned from the Microsoft Phi Vision architecture and weights. It is built to extract structured information from a wide range of documents, images, and visual data, and uses a specialized `output_format` token to select the structure of its output.

- **Base Model**: Microsoft Phi Vision
- **Training Data**: 7M+ samples across 70+ languages
- **Output Flexibility**: Supports free text, CSV, JSON, YAML, and XML output

## 💡 Join the Beta Program

[**Sign up for the Beta Program**](https://manufactai.com) to fine-tune, evaluate, and deploy this model on your own data and infrastructure.

## 🌍 Capabilities

ModelOne is well suited to:

- Extracting structured data from scanned and photographed documents.
- Interpreting complex tables, charts, and other visual data representations.
- Performing multilingual OCR across a broad set of languages.
- Adapting its output to a user-defined format for easier integration with downstream systems.

## 📚 Training Data and Statistics

ModelOne was trained on a proprietary dataset covering a diverse range of documents and real-world images. Training used over 7 million data points, with a strong focus on multilingual coverage.
### 🗂️ Dataset Composition

| **Document Type**          | **Percentage** | **Details**                                    |
|----------------------------|----------------|------------------------------------------------|
| **Real-world Images**      | 29%            | Photos and scans of receipts, forms, ID cards  |
| **Multipage Documents**    | 49%            | Contracts, reports, books (up to 123 pages)    |
| **Single-page Documents**  | 14%            | Invoices, certificates, single-page forms      |
| **Visual Representations** | 8%             | Tables, charts, graphs, diagrams               |

### 🌍 Language Coverage

Balanced representation across six main languages, with additional support for 64 more:

| **Language** | **Percentage**                   |
|--------------|----------------------------------|
| **English**  | 14.27%                           |
| **Spanish**  | 14.50%                           |
| **French**   | 14.34%                           |
| **German**   | 14.06%                           |
| **Italian**  | 14.06%                           |
| **Russian**  | 14.58%                           |
| **Other**    | 14.19% (64 additional languages) |

### 🔑 Key Insights

- **Balanced Language Representation**: Each of the six main languages contributes approximately 14% of the training samples, which is intended to support comparable performance across them. The remaining 14.19% is shared among the 64 additional languages.
- **Document Diversity**: The dataset mixes single-page and multipage documents, real-world images, and visual representations, so the model sees each of these input types during training.
- **Robust Multilingual Capability**: Coverage of 70+ languages makes the model suitable for global applications that need broad linguistic support.
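
To make the percentages above more concrete, the following minimal sketch converts them into approximate sample counts. It assumes a total of exactly 7,000,000 samples, which is only a lower bound because the card states "over 7 million". The names `TOTAL_SAMPLES`, `DOCUMENT_TYPE_SHARE`, `LANGUAGE_SHARE`, and `approximate_counts` are illustrative and not part of any ModelOne tooling.

```python
# Assumption: 7,000,000 is used as a round lower bound for the dataset size;
# the model card only says "over 7 million" data points.
TOTAL_SAMPLES = 7_000_000

# Percentages copied from the Dataset Composition table.
DOCUMENT_TYPE_SHARE = {
    "Real-world Images": 29,
    "Multipage Documents": 49,
    "Single-page Documents": 14,
    "Visual Representations": 8,
}

# Percentages copied from the Language Coverage table.
LANGUAGE_SHARE = {
    "English": 14.27,
    "Spanish": 14.50,
    "French": 14.34,
    "German": 14.06,
    "Italian": 14.06,
    "Russian": 14.58,
    "Other": 14.19,  # shared among 64 additional languages
}


def approximate_counts(shares, total=TOTAL_SAMPLES):
    """Convert percentage shares into rounded sample counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


doc_counts = approximate_counts(DOCUMENT_TYPE_SHARE)
lang_counts = approximate_counts(LANGUAGE_SHARE)

# Both tables should account for the whole dataset.
assert round(sum(DOCUMENT_TYPE_SHARE.values()), 2) == 100
assert round(sum(LANGUAGE_SHARE.values()), 2) == 100

# Average per-language count inside the "Other" bucket, assuming an even
# split (the card does not say how the bucket is actually distributed).
other_avg = lang_counts["Other"] / 64

for name, count in {**doc_counts, **lang_counts}.items():
    print(f"{name:<24} ~{count:,}")
print(f"Average per 'Other' language: ~{other_avg:,.0f}")
```

Under these assumptions each main language accounts for roughly one million samples, while the 64 additional languages average about 15,500 samples each if the "Other" share were split evenly. That gap is worth keeping in mind when evaluating the model on a language outside the main six.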