On January 29, 2026, Baidu released PaddleOCR-VL-1.5, an open-source document parsing model. With a compact 0.9B-parameter architecture, the model scored 94.5% on the OmniDocBench V1.5 benchmark, placing it in the top tier worldwide for overall performance. PaddleOCR-VL-1.5 is the first OCR model to support "irregular bounding box localization", which lets it accurately recognize documents that appear skewed, bent, or distorted when photographed. This addresses a common weakness of conventional OCR techniques, which often fail when a document is deformed in real-world conditions, and makes the model well suited to financial document processing, archival digitization, and government document management. The model also performs strongly on key metrics such as table structure comprehension and reading order prediction. It adds seal recognition and multilingual support, and it can merge tables that span multiple pages and identify paragraph headings. PaddleOCR-VL-1.5 is available on GitHub and HuggingFace, and it can be tried online or integrated through an API.
