Huazhong University of Science and Technology and Kingsoft Office recently released the jointly developed MonkeyOCR model. The model can restore images embedded in tables and merge tables that span multiple pages, and it achieves an accuracy of over 90% in complex table scenarios. With only 3 billion (3B) parameters, it outperforms much larger international models. The latest version, MonkeyOCR v1.5, ranks first worldwide in overall performance on an internationally recognized document parsing benchmark.
Looking ahead, the two institutions plan to release the largest multilingual document parsing dataset and OCR visual foundation model to date, further advancing intelligent document parsing technology.
