Tencent Hunyuan Unveils Chronicles-OCR: A Benchmark for Chinese Ancient Character Recognition
Author: Editor

Tencent Hunyuan, in partnership with the Key Laboratory of Oracle Bone Script Information Processing at Anyang Normal University and other institutions, has released Chronicles-OCR. The dataset is billed as the industry's first perception evaluation benchmark designed specifically for Chinese ancient characters, and it covers the full evolutionary span of the "Seven Script Transformations."

Chronicles-OCR combines curated data from multiple sources: manually compiled material from the Key Laboratory of Oracle Bone Script Information Processing at Anyang Normal University, contributions from a team of doctoral candidates and graduate students specializing in paleography, and the handwritten character recognition test dataset for cultural relics sourced from the Palace Museum. To ensure accuracy and reliability, all data underwent multi-level cross-annotation by domain experts. The final set contains 2,800 high-quality images, balanced at 400 images for each of the seven script styles.