According to the Science and Technology Innovation Board Daily, on January 27, 2026, the DeepSeek team open-sourced DeepSeek-OCR 2, its next-generation document recognition model, and released an accompanying technical paper titled 'DeepSeek-OCR 2: Visual Causal Flow'. The model uses a new DeepEncoder V2 encoder architecture that moves away from the fixed grid-order scanning of images used by conventional vision-language models. Instead, it dynamically reorders visual information according to the semantics of the image, much as a human reader selectively follows a page. On the OmniDocBench v1.5 benchmark, DeepSeek-OCR 2 achieved an overall score of 91.09% under a lower cap on visual tokens, a 3.73% improvement over its predecessor. Its reading-order edit distance (lower is better) fell from 0.085 to 0.057, indicating a markedly better grasp of complex document layouts, particularly academic papers, tables, and formulas.
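
For readers unfamiliar with the reading-order metric, the sketch below shows one common way such a score can be computed: the Levenshtein edit distance between the predicted and reference sequences of layout blocks, divided by the length of the longer sequence. This is an illustration under stated assumptions, not the benchmark's actual evaluation code; the function names and the block labels are hypothetical, and the exact normalization used by OmniDocBench is not described in the report.

```python
def edit_distance(pred, ref):
    """Levenshtein distance between two sequences of block labels."""
    # prev[j] holds the distance between the pred prefix seen so far
    # and the first j items of ref.
    prev = list(range(len(ref) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            curr.append(min(
                prev[j] + 1,         # delete p from pred
                curr[j - 1] + 1,     # insert r into pred
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred, ref):
    """Edit distance scaled to [0, 1]; 0 means identical order."""
    # Assumption: normalize by the longer sequence length.
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return edit_distance(pred, ref) / longest


# Hypothetical two-column page: the model reads the columns in the wrong order.
reference = ["title", "abstract", "col_left", "col_right", "table", "caption"]
predicted = ["title", "abstract", "col_right", "col_left", "table", "caption"]

score = normalized_edit_distance(predicted, reference)
print(f"normalized reading-order edit distance: {score:.3f}")
```

In this toy case a single swapped pair of columns costs two edits out of six blocks. A benchmark-level figure such as 0.057 would be an aggregate over many pages, so it implies that most pages are read in exactly or almost exactly the reference order.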
