Handwritten text recognition

Dataset

  • IAM comprises 1,539 pages and 13,353 lines of handwritten English text.
  • CASIA-HWDB is an offline handwritten Chinese dataset containing about 5,090 pages and 1.35 million character samples across 7,356 classes (7,185 Chinese characters and 171 symbols).

Prompt

  • For IAM
    Recognize the text in the image.
    
  • For CASIA-HWDB
    请直接告诉我,图片中的文字都是什么? (English: "Please tell me directly, what is all the text in the image?")
    

Results

  • Results of IAM

    | Method | Page-level WER↓ | Page-level CER↓ | Line-level WER↓ | Line-level CER↓ |
    | --- | --- | --- | --- | --- |
    | GPT-4V | 9.84% | 3.32% | 33.42% | 13.75% |
    | Supervised-SOTA | 8.29% | 2.89% | 21.47% | 6.52% |

  • Results of CASIA-HWDB

    | Method | Page-level AR↑ | Page-level CR↑ | Line-level AR↑ | Line-level CR↑ |
    | --- | --- | --- | --- | --- |
    | GPT-4V | 0.97% | 36.54% | -3.45% | 11.85% |
    | Supervised-SOTA | 92.86% | 93.24% | 97.70% | 97.91% |

  • Illustration of handwritten text recognition. Panels (a), (b), (c), and (d) show samples from page-level IAM, line-level IAM, page-level CASIA-HWDB, and line-level CASIA-HWDB, respectively. In GPT-4V's responses, characters that match the ground truth (GT) are highlighted in green and characters that do not match are highlighted in red. For English text, GPT-4V demonstrates excellent performance. For Chinese text, in contrast, GPT-4V generates a semantically coherent passage that is unrelated to the GT.
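
The tables report WER and CER for IAM and AR and CR for CASIA-HWDB, but this page does not define them. The sketch below assumes the commonly used edit-distance definitions: with N reference units and S substitutions, D deletions, and I insertions in a minimum-cost alignment, the error rate is (S + D + I) / N, CR is (N - D - S) / N, and AR is (N - D - S - I) / N. The function names (`edit_counts`, `wer`, `cer`, `cr_ar`) are hypothetical and are not taken from the evaluation code behind these results, which may normalize or tokenize text differently.

```python
from typing import Sequence, Tuple


def edit_counts(ref: Sequence, hyp: Sequence) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) for one
    minimum-cost Levenshtein alignment of hyp against ref."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, S, D, I) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)      # delete every reference unit
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)      # insert every hypothesis unit
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c, s, d, ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, ins)            # exact match, no cost
            else:
                diag = (c + 1, s + 1, d, ins)    # substitution
            c, s, d, ins = dp[i - 1][j]
            delete = (c + 1, s, d + 1, ins)
            c, s, d, ins = dp[i][j - 1]
            insert = (c + 1, s, d, ins + 1)
            dp[i][j] = min(diag, delete, insert, key=lambda t: t[0])
    _, s, d, ins = dp[n][m]
    return s, d, ins


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """(S + D + I) / N, where N is the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    s, d, ins = edit_counts(ref, hyp)
    return (s + d + ins) / len(ref)


def cer(ref: str, hyp: str) -> float:
    """Character error rate: units are single characters."""
    return error_rate(list(ref), list(hyp))


def wer(ref: str, hyp: str) -> float:
    """Word error rate: units are whitespace-separated tokens."""
    return error_rate(ref.split(), hyp.split())


def cr_ar(ref: str, hyp: str) -> Tuple[float, float]:
    """Correct rate (CR) and accurate rate (AR) over characters.
    AR also penalizes insertions, so it can drop below zero."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    s, d, ins = edit_counts(ref, hyp)
    n = len(ref)
    return (n - d - s) / n, (n - d - s - ins) / n


if __name__ == "__main__":
    print(wer("the cat sat", "the cat sit"))   # 1 substitution over 3 words
    print(cr_ar("你好", "你们都很好吗"))         # (1.0, -1.0): 4 insertions, N = 2
```

Under these assumed definitions, AR equals one minus the character error rate, which is why a response that inserts many characters not present in the reference can produce a negative AR such as the -3.45% line-level figure above, while CR stays non-negative.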