Baidu Unveils Its New-Generation Text Recognition Solution: PP-OCRv5
Author: Editor

On September 10, 2025, Baidu released its new-generation text recognition AI model, PP-OCRv5, on the Hugging Face platform. The release is aimed at the limitations that general-purpose vision-language models (VLMs) face in optical character recognition (OCR).

PP-OCRv5 is designed for text recognition across a wide range of scenarios and text types. It supports five mainstream text types: Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin. Recognition has also been tuned for difficult cases such as handwritten text, vertical text, and rare characters. Compared with its predecessor, PP-OCRv4, end-to-end recognition accuracy has risen by 13 percentage points.

PP-OCRv5 uses a modular two-stage process and has only 0.07B parameters. This lightweight design lets it run efficiently on CPUs and edge devices. For instance, the mobile version can process more than 370 characters per second on an Intel Xeon Gold 6271C CPU.

The architecture of PP-OCRv5 consists of four core components: image preprocessing, text detection, text line direction classification, and text recognition. With this setup, it supports recognition of over 40 languages.
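
To illustrate how four such components might be chained, here is a minimal Python sketch. It is not Baidu's implementation and does not use the PaddleOCR API. The class `OcrPipeline`, the box format, and the `toy_*` stage functions are hypothetical placeholders that stand in for the real models, and the "image" is just a small grid of 0/1 pixels.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy stand-ins (assumptions for illustration only):
Image = List[List[int]]          # rows of 0/1 "ink" pixels
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), hypothetical format


@dataclass
class TextLine:
    box: Box
    rotated_180: bool
    text: str


@dataclass
class OcrPipeline:
    """Chains the four stages named in the article, each as a pluggable callable."""
    preprocess: Callable[[Image], Image]
    detect: Callable[[Image], List[Box]]
    classify_direction: Callable[[Image, Box], bool]
    recognize: Callable[[Image, Box, bool], str]

    def run(self, image: Image) -> List[TextLine]:
        image = self.preprocess(image)                   # 1. image preprocessing
        lines = []
        for box in self.detect(image):                   # 2. text detection
            flipped = self.classify_direction(image, box)  # 3. line direction
            text = self.recognize(image, box, flipped)     # 4. text recognition
            lines.append(TextLine(box, flipped, text))
        return lines


def toy_preprocess(image: Image) -> Image:
    # Placeholder: a real system might correct orientation or distortion here.
    return image


def toy_detect(image: Image) -> List[Box]:
    # Placeholder: treat every row containing ink as one text line.
    width = len(image[0]) if image else 0
    return [(r, 0, r + 1, width) for r, row in enumerate(image) if any(row)]


def toy_direction(image: Image, box: Box) -> bool:
    # Placeholder: assume no line is upside down.
    return False


def toy_recognize(image: Image, box: Box, flipped: bool) -> str:
    # Placeholder: report the ink count instead of decoding characters.
    top, left, bottom, right = box
    ink = sum(sum(row[left:right]) for row in image[top:bottom])
    return f"<{ink} ink pixels>"


demo_pipeline = OcrPipeline(toy_preprocess, toy_detect, toy_direction, toy_recognize)
demo_image = [[0, 0, 0], [1, 1, 0], [0, 0, 0], [0, 1, 0]]

for line in demo_pipeline.run(demo_image):
    print(line)
```

Keeping each stage behind its own callable reflects the modular design the article describes: in principle, a stage can be swapped (for example, a lighter detector on an edge device) without touching the others.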

The model is currently available on Hugging Face. Users can try it through the online testing feature, and developers can download it for local deployment.
