- IAM comprises 1,539 pages and 13,353 lines of handwritten English text.
- CASIA-HWDB is an offline handwritten Chinese dataset containing about 5,090 pages and 1.35 million character samples across 7,356 classes (7,185 Chinese characters and 171 symbols).
- For IAM: "Recognize the text in the image."
Results of IAM
| Method | Page-level WER↓ | Page-level CER↓ | Line-level WER↓ | Line-level CER↓ |
|---|---|---|---|---|
| GPT-4V | 9.84% | 3.32% | 33.42% | 13.75% |
| Supervised-SOTA | 8.29% | 2.89% | 21.47% | 6.52% |
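For readers unfamiliar with the IAM metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the Levenshtein edit distance between the recognized text and the reference, divided by the reference length. The function names `edit_distance`, `wer`, and `cer` are illustrative, and the sketch assumes whitespace tokenization with no text normalization; the exact evaluation protocol behind the reported numbers is not specified here and may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / reference words.

    Assumption: words are split on whitespace, without normalization.
    """
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    print(wer("the cat sat", "the cat sit"))
    print(cer("the cat sat", "the cat sit"))
```

Because a single wrong character makes the whole word wrong, WER is typically higher than CER on the same output, which matches the pattern in the table. Choices such as case folding or punctuation handling can shift both values noticeably.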
Results of CASIA-HWDB
| Method | Page-level AR↑ | Page-level CR↑ | Line-level AR↑ | Line-level CR↑ |
|---|---|---|---|---|
| GPT-4V | 0.97% | 36.54% | -3.45% | 11.85% |
| Supervised-SOTA | 92.86% | 93.24% | 97.70% | 97.91% |
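The two CASIA-HWDB metrics are not defined in this section. A minimal sketch follows, assuming the definitions commonly used in Chinese handwriting recognition benchmarks: with N reference characters and S substitutions, D deletions, and I insertions from an edit-distance alignment, the correct rate is CR = (N - D - S) / N and the accurate rate is AR = (N - D - S - I) / N. The helper names `count_errors` and `cr_ar` are illustrative, and the tie-breaking between equally cheap alignments (diagonal first, then deletion, then insertion) is an assumption that can change the S/D/I split.

```python
def count_errors(ref, hyp):
    """Return (substitutions, deletions, insertions) for one minimum-cost alignment."""
    # Each cell stores (total_cost, S, D, I) for aligning ref[:i] with hyp[:j].
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        curr = [(i, 0, i, 0)]
        for j, h in enumerate(hyp, start=1):
            c, s, d, n = prev[j - 1]
            diag = (c, s, d, n) if r == h else (c + 1, s + 1, d, n)
            c, s, d, n = prev[j]
            delete = (c + 1, s, d + 1, n)
            c, s, d, n = curr[j - 1]
            insert = (c + 1, s, d, n + 1)
            # Ties resolve in the order listed: diagonal, deletion, insertion.
            curr.append(min(diag, delete, insert, key=lambda t: t[0]))
        prev = curr
    _, s, d, n = prev[-1]
    return s, d, n


def cr_ar(reference, hypothesis):
    """Correct rate (CR) and accurate rate (AR) under the assumed definitions."""
    if not reference:
        raise ValueError("reference must not be empty")
    s, d, i = count_errors(reference, hypothesis)
    n = len(reference)
    cr = (n - d - s) / n
    ar = (n - d - s - i) / n
    return cr, ar


if __name__ == "__main__":
    # Three extra characters are inserted after a two-character reference.
    print(cr_ar("你好", "你好世界啊"))
```

Under these definitions AR subtracts insertions while CR does not, so AR can fall below zero when the output contains many extra characters. That would be consistent with the negative line-level AR and the large page-level gap between AR and CR reported for GPT-4V.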
Illustration of handwritten text recognition. (a), (b), (c), and (d) are samples of page-level IAM, line-level IAM, page-level CASIA-HWDB, and line-level CASIA-HWDB, respectively. In the responses of GPT-4V, characters that match the ground truth (GT) are highlighted in green and characters that do not match are highlighted in red. For English text, GPT-4V demonstrates excellent performance. For Chinese text, in contrast, GPT-4V generates a passage that is semantically coherent but unrelated to the GT.