OpenOCR aims to establish a unified training and evaluation benchmark for scene text detection and recognition algorithms. It also serves as the official code repository for the OCR team from the FVL Laboratory, Fudan University.
We are actively developing and refining it and expect to release the first version in July 2024.
We sincerely welcome researchers to recommend OCR or related algorithms and to point out any factual errors or bugs. Upon receiving suggestions, we will promptly evaluate and carefully reproduce them. We look forward to collaborating with you to advance the development of OpenOCR and to keep contributing to the OCR community!
- IGTR (Yongkun Du, Zhineng Chen*, Yuchen Su, Caiyan Jia, Yu-Gang Jiang. Instruction-Guided Scene Text Recognition, 2024. Doc, paper)
- SVTRv2 (Yongkun Du, Zhineng Chen*, Caiyan Jia, Yu-Gang Jiang. SVTRv2: Towards Arbitrary-Shaped Text Recognition with a Single Visual Model, 2024. paper coming soon)
- SMTR&FocalSVTR (Yongkun Du, Zhineng Chen*, Caiyan Jia, Xieping Gao, Yu-Gang Jiang. Out of Length Text Recognition with Sub-String Match, 2024. paper coming soon)
- CDistNet (Tianlun Zheng, Zhineng Chen*, Shancheng Fang, Hongtao Xie, Yu-Gang Jiang. CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition, IJCV 2024. paper)
- CPPD (Yongkun Du, Zhineng Chen*, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang. Context Perception Parallel Decoder for Scene Text Recognition, 2023. PaddleOCR Doc, paper)
- SVTR (Yongkun Du, Zhineng Chen*, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang. SVTR: Scene Text Recognition with a Single Visual Model, IJCAI 2022 (Long). PaddleOCR Doc, paper)
- NRTR (Fenfen Sheng, Zhineng Chen*, Bo Xu. NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition, ICDAR 2019. paper)
Reproduction schedule:
Method | Venue | Training | Evaluation | Contributor |
---|---|---|---|---|
CRNN | TPAMI2016 | ✅ | ✅ | |
ASTER | TPAMI2018 | ✅ | ✅ | pretto0 |
NRTR | ICDAR2019 | ✅ | ✅ | |
SAR | AAAI2019 | ✅ | ✅ | pretto0 |
RobustScanner | ECCV2020 | ✅ | ✅ | pretto0 |
SRN | CVPR2020 | ✅ | ✅ | pretto0 |
ABINet | CVPR2021 | ✅ | ✅ | YesianRohn |
VisionLAN | ICCV2021 | ✅ | ✅ | YesianRohn |
SVTR | IJCAI2022 | ✅ | ✅ | |
PARSeq | ECCV2022 | ✅ | ✅ | |
MATRN | ECCV2022 | TODO | | |
MGP-STR | ECCV2022 | TODO | | |
CPPD | 2023 | ✅ | ✅ | |
LPV | IJCAI2023 | ✅ | ✅ | |
MAERec(Union14m) | ICCV2023 | TODO | | |
LISTER | ICCV2023 | ✅ | ✅ | |
CDistNet | IJCV2024 | ✅ | ✅ | YesianRohn |
IGTR | 2024 | ✅ | ✅ | |
SMTR | 2024 | ✅ | ✅ | |
FocalSVTR-CTC | 2024 | ✅ | ✅ | |
SVTRv2 | 2024 | ✅ | ✅ | |
ResNet+En-CTC | | ✅ | ✅ | |
ViT-CTC | | ✅ | ✅ | |
Yiming Lei (pretto0) and Xingsong Ye (YesianRohn) from the FVL Laboratory, Fudan University, completed the majority of the algorithm reproduction work under the guidance of Professor Zhineng Chen. We are grateful for their outstanding contributions.
- Modify the dataset paths
    Train:
      dataset:
        name: LMDBDataSet
        data_dir: Path to train data
        ...
    Eval:
      dataset:
        name: LMDBDataSet
        data_dir: Path to eval data
- Start training
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 tools/train_rec.py --c configs/rec/svtrnet_ctc.yml
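The two steps above can be sketched in Python to make their relationship explicit: both `data_dir` entries are filled in first, and the value passed to `--nproc_per_node` matches the number of GPUs listed in `CUDA_VISIBLE_DEVICES`. This is only an illustrative sketch. The helpers `set_data_dirs` and `build_launch_command` are hypothetical and not part of OpenOCR, and the config is modeled as a plain dict mirroring the YAML keys shown above rather than loaded from the real file.

```python
import os


def set_data_dirs(config, train_dir, eval_dir):
    """Return a copy of `config` with the Train/Eval data_dir values replaced.

    Hypothetical helper: `config` is a nested dict that mirrors the YAML
    layout above (Train.dataset.data_dir and Eval.dataset.data_dir).
    """
    updated = {section: dict(body) for section, body in config.items()}
    for section, path in (("Train", train_dir), ("Eval", eval_dir)):
        dataset = dict(updated[section]["dataset"])  # copy before editing
        dataset["data_dir"] = path
        updated[section]["dataset"] = dataset
    return updated


def build_launch_command(gpu_ids, config_path):
    """Build the environment and argv for the multi-GPU launch shown above.

    Hypothetical helper: one process is started per visible GPU, so
    --nproc_per_node is derived from the length of `gpu_ids`.
    """
    visible = ",".join(str(g) for g in gpu_ids)
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=visible)
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",
        "tools/train_rec.py", "--c", config_path,
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_launch_command([0, 1, 2, 3], "configs/rec/svtrnet_ctc.yml")
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
```

Passing `env` and `argv` to `subprocess.run(argv, env=env)` would start the same job as the shell command; training on a different number of GPUs then only requires changing the `gpu_ids` list.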
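See the PR submission process for reference.
The process is as follows:
1. Fork the OpenOCR project to your own GitHub account.
2. git clone -b develop https://github.com/your-username/OpenOCR.git (make sure your fork is up to date before each git clone).
3. Following the code structure of svtrnet_ctc and svtr_base_cppd, add the new algorithm's preprocess, modeling.encoder, modeling.decoder, optimizer, loss, and postprocess to the code.
4. Install pre-commit and run the code style checks.
pip install pre-commit
pre-commit install
5. Once training, evaluation, and testing of the new algorithm all run successfully, submit a PR to the upstream repository following GitHub's commit workflow.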
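As an illustration of the kind of component step 3 asks for, here is a minimal sketch of a CTC-style postprocess step in plain Python: greedy decoding that takes the best class at each time step, collapses consecutive repeats, and drops the blank. The class name `GreedyCTCDecode`, its constructor argument, and the use of index 0 as the blank are assumptions made for this example. They are not taken from OpenOCR, whose actual postprocess interface should be read from the svtrnet_ctc code.

```python
class GreedyCTCDecode:
    """Hypothetical postprocess component: greedy CTC decoding.

    Not OpenOCR's API. Class index 0 is assumed to be the CTC blank,
    and class index i (i >= 1) maps to charset[i - 1].
    """

    BLANK = 0  # assumed blank index for this sketch

    def __init__(self, charset):
        self.charset = charset

    def __call__(self, step_scores):
        """Decode one sample from a list of per-time-step class scores."""
        # Pick the highest-scoring class index at every time step.
        best = [max(range(len(s)), key=s.__getitem__) for s in step_scores]
        chars = []
        prev = None
        for idx in best:
            # Emit a character only when the index changes and is not blank.
            if idx != prev and idx != self.BLANK:
                chars.append(self.charset[idx - 1])
            prev = idx
        return "".join(chars)


if __name__ == "__main__":
    decode = GreedyCTCDecode("ab")
    scores = [
        [0.1, 0.8, 0.1],    # 'a'
        [0.1, 0.8, 0.1],    # repeated 'a', collapsed
        [0.9, 0.05, 0.05],  # blank separates the two 'a's
        [0.1, 0.8, 0.1],    # 'a'
        [0.1, 0.1, 0.8],    # 'b'
    ]
    print(decode(scores))
```

A real contribution would operate on batched model outputs and use the project's character dictionary; the pure-Python version only shows the decoding rule itself.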
This codebase is built on PaddleOCR and PytorchOCR. Thanks!