Repository: yflv-yanxia/scene_text
Branch: master
Commit: bb2f559159ca
Files: 1
Total size: 38.0 KB

Directory structure:
gitextract_7o8twzf0/

└── README.md

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
# scene_text

## Text Detection
[DETRs Beat YOLOs on Real-time Object Detection](https://arxiv.org/abs/2304.08069) -baidu, arxiv2023, [code](https://github.com/lyuwenyu/RT-DETR)<br>
[Real-time Scene Text Detection Based on Global Level and Word Level Features](https://arxiv.org/abs/2203.05251) -arxiv2022<br>
[Kernel Proposal Network for Arbitrary Shape Text Detection](https://arxiv.org/abs/2203.06410) -yinxucheng, TNNLS2022, [code](https://github.com/GXYM/KPN)<br>
[Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion](https://arxiv.org/abs/2202.10304) -baixiang, PAMI2022, [code](https://github.com/MhLiao/DB)<br>
[Towards End-to-End Unified Scene Text Detection and Layout Analysis](https://arxiv.org/abs/2203.15143) -CVPR2022, google, [code](https://github.com/google-research-datasets/hiertext)<br>
[Arbitrary Shape Text Detection using Transformers](https://arxiv.org/abs/2202.11221) -arxiv2022<br>
[Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection](https://arxiv.org/abs/2203.15221) -baixiang, CVPR2022<br>
[Vision-Language Pre-Training for Boosting Scene Text Detectors](https://arxiv.org/abs/2204.13867) -baixiang, CVPR2022<br>
[UNITS: Unsupervised Intermediate Training Stage for Scene Text Detection](https://arxiv.org/abs/2205.04683) -guoyouhui, ICME2022<br>
[Arbitrary Shape Text Detection via Boundary Transformer](https://arxiv.org/abs/2205.05320) -yinxucheng, arxiv2022<br>
[Arbitrary Shape Text Detection via Segmentation with Probability Maps](https://ieeexplore.ieee.org/abstract/document/9779460) -yinxucheng, PAMI2022, [code](https://github.com/GXYM/TextPMs)<br>
[HRRegionNet: Chinese Character Segmentation in Historical Documents with Regional Awareness](https://link.springer.com/chapter/10.1007/978-3-030-86337-1_1) -ICDAR2021<br>
\[Real-Time\][Real-Time Scene Text Detection with Differentiable Binarization](https://arxiv.org/abs/1911.08947) -baixiang, AAAI2020, [code](https://github.com/MhLiao/DB)<br>
[Deep relational reasoning graph network for arbitrary shape text detection](https://openaccess.thecvf.com/content_CVPR_2020/html/Zhang_Deep_Relational_Reasoning_Graph_Network_for_Arbitrary_Shape_Text_Detection_CVPR_2020_paper.html) -yinxucheng, CVPR2020, [code](https://github.com/GXYM/DRRG)<br>
[All you need is boundary: Toward arbitrary-shaped text spotting](https://ojs.aaai.org/index.php/AAAI/article/view/6896) -baixiang, AAAI2020<br>
[All you need is a second look: Towards Tighter Arbitrary shape text detection](https://arxiv.org/abs/2004.12436) -arxiv2020<br>
[Self-Training for Domain Adaptive Scene Text Detection](https://arxiv.org/abs/2005.11487) -arxiv2020<br>
[NENET: An Edge Learnable Network for Link Prediction in Scene Text](https://arxiv.org/abs/2005.12147) -arxiv2020<br>
[Efficient Scene Text Detection with Textual Attention Tower](https://arxiv.org/abs/2002.03741) -Liang Zhang, ICASSP2020<br>
[Scale-Invariant Multi-Oriented Text Detection in Wild Scene Images](https://arxiv.org/abs/2002.06423) -Kinjal Dasgupta, arxiv2020<br>
[PuzzleNet: Scene Text Detection by Segment Context Graph Learning](https://arxiv.org/abs/2002.11371) -Hao Liu, arxiv2020<br>
[Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units](https://arxiv.org/abs/2002.11338) -Yu Qiao, arxiv2020<br>
[HRCenterNet: An Anchorless Approach to Chinese Character Segmentation in Historical Documents](https://arxiv.org/abs/2012.05739) -BigData2020, [code](https://github.com/Tverous/HRCenterNet)<br>
[Look more than once: An accurate detector for text of arbitrary shapes](https://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_Look_More_Than_Once_An_Accurate_Detector_for_Text_of_CVPR_2019_paper.html) -baidu, CVPR2019<br>
[Gliding vertex on the horizontal bounding box for multi-oriented object detection](https://arxiv.org/abs/1911.09358) -Xiang Bai, arxiv2019, [code](https://github.com/MingtaoFu/gliding_vertex)<br>
[Exploring the Capacity of Sequential-free Box Discretization Network for Omnidirectional Scene Text Detection](https://arxiv.org/abs/1912.09629) -jinlianwen, arxiv2019<br>
[Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network](https://arxiv.org/abs/1908.05900) -face++, ICCV2019, [code](https://github.com/whai362/pan_pp.pytorch)<br>
[A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning](https://arxiv.org/abs/1908.05498) -Pengfei Wang, arxiv2019<br>
[It's All About The Scale -- Efficient Text Detection Using Adaptive Scaling](https://arxiv.org/abs/1907.12122) -Elad Richardson, arxiv2019<br>
[FaSTExt: Fast and Small Text Extractor](https://arxiv.org/abs/1908.09231) -Alexander Filonenko, arxiv2019<br>
[Curved Text Detection in Natural Scene Images with Semi- and Weakly-Supervised Learning](https://arxiv.org/abs/1908.09990) -Xugong Qin, arxiv2019<br>
[Learning Shape-Aware Embedding for Scene Text Detection](http://jiaya.me/papers/textdetection_cvpr19.pdf) -CUHK, Tencent, CVPR2019<br>
[Shape Robust Text Detection with Progressive Scale Expansion Network](https://arxiv.org/abs/1903.12473) -megvii, CVPR2019<br>
[Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation](https://arxiv.org/abs/1905.05980) -Xiaobing Wang, Yingying Jiang, Zhenbo Luo, Cheng-Lin Liu, Hyunsoo Choi, Sungjin Kim, CVPR2019<br>
[Character Region Awareness for Text Detection](https://arxiv.org/abs/1904.01941) -Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee, CVPR2019<br>
[Towards Robust Curve Text Detection with Conditional Spatial Expansion](https://arxiv.org/abs/1903.08836) -Zichuan Liu, Guosheng Lin, Sheng Yang, Fayao Liu, Weisi Lin, Wang Ling Goh, CVPR2019<br>
[Pyramid Mask Text Detector](https://arxiv.org/abs/1903.11800) -sensetime, arxiv2019<br>
[Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes](https://arxiv.org/abs/1904.06535) -baidu, CVPR2019<br>
[Character Region Awareness for Text Detection](https://arxiv.org/pdf/1904.01941.pdf) -Clova, CVPR2019<br>
[Detecting Text in the Wild with Deep Character Embedding Network](https://arxiv.org/abs/1901.00363) -baidu, arxiv2019<br>
[TextField: Learning A Deep Direction Field for Irregular Scene Text Detection](https://arxiv.org/abs/1812.01393) -Yongchao Xu, Yukang Wang, Wei Zhou, Yongpan Wang, Zhibo Yang, Xiang Bai, arxiv2018<br>
[TextMountain: Accurate Scene Text Detection via Instance Segmentation](https://arxiv.org/abs/1811.12786) -Yixing Zhu, Jun Du, arxiv2018<br>
[Mask R-CNN with Pyramid Attention Network for Scene Text Detection](https://arxiv.org/abs/1811.09058) -MSRA, arxiv2018<br>
[Scene Text Detection with Supervised Pyramid Context Network](https://arxiv.org/abs/1811.08605) -face++, AAAI2019<br>
[Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks](https://arxiv.org/abs/1811.07432) -cloudwalk, arxiv2018<br>
[Improving Rotated Text Detection with Rotation Region Proposal Networks](https://arxiv.org/abs/1811.07031) -facebook, arxiv2018<br>
[IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection](https://arxiv.org/abs/1805.01167) -Alibaba, IJCAI2018<br>
[TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes](https://arxiv.org/abs/1807.01544) -peking, face++, arxiv2018<br>
[PSENET: Shape Robust Text Detection with Progressive Scale Expansion Network](https://arxiv.org/abs/1806.02559) -deepinsight, CVPR2019<br>
[Arbitrary-Oriented Scene Text Detection via Rotation Proposals](https://ieeexplore.ieee.org/abstract/document/8323240/) -J Ma, W Shao, H Ye, L Wang, H Wang, TMM2018<br>
[TextBoxes++: A Single-Shot Oriented Scene Text Detector](https://arxiv.org/abs/1801.02765) -Minghui Liao, Baoguang Shi, Xiang Bai, arxiv2018 [code](https://github.com/MhLiao/TextBoxes_plusplus)<br>
[Dense and Tight Detection of Chinese Characters in Historical Documents: Datasets and a Recognition Guided Detector](https://ieeexplore.ieee.org/document/8364534) -JinLianwen, IEEEaccess2018<br>
[R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection](https://arxiv.org/abs/1706.09579) -Samsung, arxiv2018<br>
[Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation](https://arxiv.org/abs/1802.08948) -Pengyuan Lyu, Cong Yao, Wenhao Wu, Shuicheng Yan, Xiang Bai, arxiv2018<br>
[PixelLink: Detecting Scene Text via Instance Segmentation](https://arxiv.org/abs/1801.01315) -Dan Deng, Haifeng Liu, Xuelong Li, Deng Cai, aaai2018<br>
[EAST: an efficient and accurate scene text detector](http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhou_EAST_An_Efficient_CVPR_2017_paper.pdf) -Megvii, cvpr2017, [code](https://github.com/argman/EAST)<br>
[Scene text detection and segmentation based on cascaded convolution neural networks](https://ieeexplore.ieee.org/abstract/document/7828014/) -Y Tang, X Wu, TIP2017<br>
[TextBoxes: A Fast Text Detector with a Single Deep Neural Network.](http://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14202/14295) -M Liao, B Shi, X Bai, X Wang, W Liu, AAAI2017, [code](https://github.com/MhLiao/TextBoxes)<br>
[Deep direct regression for multi-oriented scene text detection](http://openaccess.thecvf.com/content_ICCV_2017/papers/He_Deep_Direct_Regression_ICCV_2017_paper.pdf) -W He, XY Zhang, F Yin, CL Liu, ICCV2017<br>
[Detecting oriented text in natural images by linking segments](http://openaccess.thecvf.com/content_cvpr_2017/papers/Shi_Detecting_Oriented_Text_CVPR_2017_paper.pdf) -B Shi, X Bai, S Belongie, CVPR2017, [code](https://github.com/bgshih/seglink)<br>
[Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection](https://arxiv.org/pdf/1703.01425.pdf) -Yuliang Liu, Lianwen Jin, CVPR2017<br>
[Feature Enhancement Network: A Refined Scene Text Detector](https://arxiv.org/abs/1711.04249) -Sheng Zhang, Yuliang Liu, Lianwen Jin, Canjie Luo, arxiv2017<br>
[Single Shot Text Detector with Regional Attention](http://openaccess.thecvf.com/content_ICCV_2017/papers/He_Single_Shot_Text_ICCV_2017_paper.pdf) -Pan He, Weilin Huang, Tong He, Qile Zhu, Yu Qiao, and Xiaolin Li, ICCV2017<br>
[A Convolutional Neural Network-Based Chinese Text Detection Algorithm via Text Structure Modeling](https://ieeexplore.ieee.org/abstract/document/7733055/) -Xiaohang Ren, Yi Zhou, Jianhua He, Kai Chen, Xiaokang Yang, Jun Sun, TMM2017<br>
[Fused Text Segmentation Networks for Multi-oriented Scene Text Detection](https://arxiv.org/abs/1709.03272) -Yuchen Dai, et al, arxiv2017<br>
[Scene Text Detection with Novel Superpixel Based Character Candidate Extraction](https://ieeexplore.ieee.org/abstract/document/8270087/) -Cong Wang, Fei Yin, Cheng-Lin Liu, ICDAR2017<br>
[WeText: Scene Text Detection under Weak Supervision](https://arxiv.org/abs/1710.04826) -Shangxuan Tian, Shijian Lu, Chongshou Li, ICCV2017<br>
[WordSup: Exploiting Word Annotations for Character based Text Detection](https://arxiv.org/pdf/1708.06720.pdf) -MSRA, IDL, ICCV2017<br>
[Deep Residual Text Detection Network for Scene Text](https://arxiv.org/abs/1711.04147) -Xiangyu Zhu, et al, arxiv2017<br>
[Cascaded Segmentation-Detection Networks for Word-Level Text Spotting](https://arxiv.org/abs/1704.00834) -Siyang Qin, Roberto Manduchi, arxiv2017<br>
[Arbitrary-Oriented Scene Text Detection via Rotation Proposals](https://arxiv.org/pdf/1703.01086.pdf) -Jianqi Ma, et al, TMM2017<br>
[Multi-oriented text detection with fully convolutional networks](http://openaccess.thecvf.com/content_cvpr_2016/papers/Zhang_Multi-Oriented_Text_Detection_CVPR_2016_paper.pdf) -Z Zhang, C Zhang, W Shen, C Yao, CVPR2016<br>
[Scene text detection via holistic, multi-channel prediction](https://arxiv.org/abs/1606.09002) -C Yao, X Bai, N Sang, X Zhou, S Zhou, arxiv2016<br>

## Text Recognition
[Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer](https://arxiv.org/abs/2311.13120) -bytedance, CVPR2024, [code](https://github.com/bytedance/E2STR)<br>
[Revisiting Scene Text Recognition: A Data Perspective](https://openaccess.thecvf.com/content/ICCV2023/html/Jiang_Revisiting_Scene_Text_Recognition_A_Data_Perspective_ICCV_2023_paper.html) -jinlianwen, ICCV2023, [code](https://github.com/Mountchicken/Union14M)<br>
[Context Perception Parallel Decoder for Scene Text Recognition](https://arxiv.org/abs/2307.12270) -baidu, arxiv2023<br>
[Cdistnet: Perceiving multi-domain character distance for robust text recognition](https://arxiv.org/pdf/2111.11011) -fudan, IJCV2023, [code](https://github.com/simplify23/CDistNet)<br>
[Trocr: Transformer-based optical character recognition with pre-trained models](https://ojs.aaai.org/index.php/AAAI/article/view/26538) -Microsoft, AAAI2023, [code](https://aka.ms/trocr)<br>
[Context-Based Contrastive Learning for Scene Text Recognition](https://aaai-2022.virtualchair.net/poster_aaai10147) -AAAI2022<br>
[SVTR: Scene Text Recognition with a Single Visual Model](https://arxiv.org/abs/2205.00159) -baidu, IJCAI2022, [code]()<br>
[Multi-modal Text Recognition Networks: Interactive enhancements between visual and semantic features](https://arxiv.org/abs/2111.15263) -ECCV2022<br>
[Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition](https://arxiv.org/abs/2105.06229v2) -hikvision, ICDAR2021, [code](https://github.com/hikopensource/DAVAR-Lab-OCR/tree/main/demo/text_recognition/rflearning)<br>
[Dictionary-Guided Scene Text Recognition](https://openaccess.thecvf.com/content/CVPR2021/html/Nguyen_Dictionary-Guided_Scene_Text_Recognition_CVPR_2021_paper.html) -CVPR2021, [code](https://github.com/VinAIResearch/dict-guided)<br>
[TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) -beihang, arxiv2021, [code](https://aka.ms/TrOCR)<br>
[RecycleNet: An Overlapped Text Instance Recovery Approach](https://dl.acm.org/doi/abs/10.1145/3474085.3481536) -tencent, ACM MM 2021<br>
[Vision Transformer for Fast and Efficient Scene Text Recognition](https://arxiv.org/abs/2105.08582) -Rowel, ICDAR2021<br>
[Visual-semantic transformer for scene text recognition](https://arxiv.org/abs/2112.00948) -pingan, arxiv2021<br>
[PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network](https://arxiv.org/abs/2104.05458) -baidu, AAAI2021, [code](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/doc/doc_ch/algorithm_e2e_pgnet.md)<br>
[What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels](https://arxiv.org/abs/2103.04400) -tokyo, CVPR2021, [code](https://github.com/ku21fan/STR-Fewer-Labels)<br>
[Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition](https://arxiv.org/abs/2103.06495) -ShanchengFang, CVPR2021, [code](https://github.com/FangShancheng/ABINet)<br>
\[light\][Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition](https://arxiv.org/abs/2009.10874) -pingan, arxiv2020<br>
[Gaussian Constrained Attention Network for Scene Text Recognition](https://arxiv.org/abs/2010.09169) -qiaozhi, ICPR2020, [code](https://github.com/Pay20Y/GCAN)<br>
[Adaptive Text Recognition through Visual Matching](https://arxiv.org/abs/2009.06610) -zisserman, ECCV2020<br>
[On Vocabulary Reliance in Scene Text Recognition](https://arxiv.org/abs/2005.03959) -megvii, CVPR2020<br>
[Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization](https://arxiv.org/abs/2007.06890) -JinLianwen, ICFHR2020<br>
[Text Recognition in Real Scenarios with a Few Labeled Samples](https://arxiv.org/abs/2006.12209) -arxiv2020<br> 
[RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition](https://arxiv.org/abs/2007.07542) -ECCV2020<br>
[On Recognizing Texts of Arbitrary Shapes With 2D Self-Attention](https://openaccess.thecvf.com/content_CVPRW_2020/html/w34/Lee_On_Recognizing_Texts_of_Arbitrary_Shapes_With_2D_Self-Attention_CVPRW_2020_paper.html) -CVPRW2020<br>
[SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition](https://arxiv.org/abs/2005.10977) -zhi qiao, CVPR2020<br>
[Text Recognition in the Wild: A Survey](https://arxiv.org/abs/2005.03492) -jinlianwen, arxiv2020<br>
[GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition](https://arxiv.org/abs/2002.01276) -Wenyang Hu, AAAI2020<br>
[A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling](https://arxiv.org/abs/2002.03509) -yao cong, ICASSP2020<br>
[SCATTER: Selective Context Attentional Scene Text Recognizer](https://arxiv.org/abs/2003.11288) -Ron Litman, CVPR2020<br>
[Scene Text Recognition via Transformer](https://arxiv.org/abs/2003.08077) -Xinjie Feng, arxiv2020<br>
[Efficient Backbone Search for Scene Text Recognition](https://arxiv.org/abs/2003.06567) -baixiang, arxiv2020<br>
[Towards Accurate Scene Text Recognition with Semantic Reasoning Networks](https://arxiv.org/abs/2003.12294v1) -Baidu, CVPR2020<br>
[Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition](https://arxiv.org/abs/2003.06606) -jinlianwen, CVPR2020, [code](https://github.com/Canjie-Luo/Text-Image-Augmentation)<br>
[Decoupled Attention Network for Text Recognition](https://arxiv.org/abs/1912.10205) -jinlianwen, AAAI2020<br>
[Fast Dense Residual Network: Enhancing Global Dense Feature Flow for Text Recognition](https://arxiv.org/abs/2001.09021) -Zhao Zhang, arxiv2020<br>
[Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild](https://arxiv.org/abs/2001.04189) -jin lianwen, arxiv2020<br>
[TextScanner: Reading Characters in Order for Robust Scene Text Recognition](https://arxiv.org/abs/1912.12422) -yao cong, AAAI2020<br>
[What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis](https://arxiv.org/abs/1904.01906) -clova, ICCV2019, [code](https://github.com/clovaai/deep-text-recognition-benchmark)<br>
[A Feasible Framework for Arbitrary-Shaped Scene Text Recognition](https://arxiv.org/abs/1912.04561) -champion in ICDAR2019, arxiv2019, [code](https://github.com/zhang0jhon/AttentionOCR)<br>
[Deep Neural Network for Semantic-based Text Recognition in Images](https://arxiv.org/abs/1908.01403) -Yi Zheng, arxiv2019<br>
[Symmetry-constrained Rectification Network for Scene Text Recognition](https://arxiv.org/abs/1908.01957) -baixiang, ICCV2019<br>
[Adaptive Embedding Gate for Attention-Based Scene Text Recognition](https://arxiv.org/abs/1908.09475) -Lianwen Jin, arxiv2019<br>
[Focus-Enhanced Scene Text Recognition with Deformable Convolutions](https://arxiv.org/abs/1908.10998) -Yanxiang Gong, arxiv2019, [code](https://github.com/Alpaca07/dtr)<br>
[Rethinking Irregular Scene Text Recognition](https://arxiv.org/abs/1908.11834) -yao cong, ICDAR19 art champion, [code](https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy)<br>
[Aggregation Cross-Entropy for Sequence Recognition](https://arxiv.org/abs/1904.08364) -Zecheng Xie, Yaoxiong Huang, Yuanzhi Zhu, Lianwen Jin, Yuliang Liu, Lele Xie, CVPR2019, [code](https://github.com/summerlvsong/Aggregation-Cross-Entropy)<br>
[Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition](http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Sequence-To-Sequence_Domain_Adaptation_Network_for_Robust_Text_Image_Recognition_CVPR_2019_paper.pdf) -CASIA, CVPR2019<br>
[Towards End-to-End Text Spotting in Natural Scenes](https://arxiv.org/abs/1906.06013) -LiHui, et al, arxiv2019<br>
[2D Attentional Irregular Scene Text Recognizer](https://arxiv.org/abs/1906.05708) -Tencent, arxiv2019<br>
[ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification](https://arxiv.org/abs/1812.05824) -Fangneng Zhan, Shijian Lu, CVPR2019<br>
[FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition](https://arxiv.org/abs/1904.09405) -Qingqing Wang, et al, arxiv2019<br>
[A Multi-Object Rectified Attention Network for Scene Text Recognition](https://arxiv.org/abs/1901.03003) -Canjie Luo, Lianwen Jin, Zenghui Sun, PR2019<br>
[Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition](https://arxiv.org/abs/1811.00751) -Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang, AAAI2019, [code](https://tinyurl.com/ShowAttendRead)<br>
[Scene Text Recognition from Two-Dimensional Perspective](https://arxiv.org/abs/1809.06508) -Minghui Liao, Cong Yao, Xiang Bai, et al, AAAI2019<br>
[Recurrent Calibration Network for Irregular Text Recognition](https://arxiv.org/abs/1812.07145) -Hanqing Lu, arxiv2018<br>
[ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification](https://arxiv.org/abs/1812.05824) -Fangneng Zhan, Shijian Lu, arxiv2018<br>
[Synthetically Supervised Feature Learning for Scene Text Recognition](http://openaccess.thecvf.com/content_ECCV_2018/papers/Yang_Liu_Synthetically_Supervised_Feature_ECCV_2018_paper.pdf) -Adobe, ECCV2018<br>
[Connectionist Temporal Classification with Maximum Entropy Regularization](https://proceedings.neurips.cc/paper/2018/hash/e44fea3bec53bcea3b7513ccef5857ac-Abstract.html) -Tsinghua, NeurIPS2018, [code](https://github.com/liuhu-bigeye/enctc.crnn)<br>
[ASTER: An Attentional Scene Text Recognizer with Flexible Rectification](https://ieeexplore.ieee.org/document/8395027) -Baixiang, PAMI2018, [code](https://github.com/ayumiymk/aster.pytorch)<br>
[Edit Probability for Scene Text Recognition](http://openaccess.thecvf.com/content_cvpr_2018/papers/Bai_Edit_Probability_for_CVPR_2018_paper.pdf) -Fudan, Hikvision, cvpr2018<br>
[SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network](https://pdfs.semanticscholar.org/0e59/f7d7e9c9380b425a94038c7a2500b2f6063a.pdf) -Zichuan Liu, et al, AAAI2018<br>
[State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines](https://arxiv.org/abs/1810.03436) -arxiv2018<br>
[SCAN: Sliding Convolutional Attention Network for Scene Text Recognition](https://arxiv.org/abs/1806.00578) -Yichao Wu, et al, arxiv2018<br>
[NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition](https://arxiv.org/abs/1806.00926) -Fenfen Sheng, et al, arxiv2018<br>
[AON: Towards Arbitrarily-Oriented Text Recognition](https://arxiv.org/abs/1711.04226) -Hikvision, et al, CVPR2018<br>
[An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://ieeexplore.ieee.org/abstract/document/7801919/) -B Shi, X Bai, C Yao, TPAMI2017, [code](https://github.com/bgshih/crnn)<br>
[Scene Text Recognition with Sliding Convolutional Character Models](https://arxiv.org/abs/1709.01727) -fei yin, et al, arxiv2017<br>
[Focusing Attention: Towards Accurate Text Recognition in Natural Images](http://openaccess.thecvf.com/content_ICCV_2017/papers/Cheng_Focusing_Attention_Towards_ICCV_2017_paper.pdf) -Hikvision, et al, ICCV2017<br>
[AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition](https://pdfs.semanticscholar.org/2111/d546ac1cbf170302e44a17c88d26b1c55999.pdf) -Chun Yang, Xu-Cheng Yin, arxiv2017<br>
[Strokelets: A learned multi-scale mid-level representation for scene text recognition](https://ieeexplore.ieee.org/abstract/document/7453176/) -X Bai, C Yao, W Liu, TIP2016<br>
[Reading Scene Text in Deep Convolutional Sequences](http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12256/12121) -P He, W Huang, Y Qiao, CC Loy, X Tang, AAAI2016<br>
[Text-Attentional Convolutional Neural Network for Scene Text Detection](http://ieeexplore.ieee.org/abstract/document/7442550/) -Tong He, Weilin Huang, Yu Qiao, Jian Yao, TIP2016<br>
[Robust Scene Text Recognition with Automatic Rectification](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shi_Robust_Scene_Text_CVPR_2016_paper.pdf) -Baoguang Shi, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai, CVPR2016<br>
[DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images](https://arxiv.org/abs/1605.07314) -Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Ziyong Feng, arxiv2016<br>
[Recursive Recurrent Nets with Attention Modeling for OCR in the Wild](https://arxiv.org/pdf/1603.03101v1.pdf) -Yahoo, CVPR2016<br>

## End-to-End & Text Spotting
[ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting](https://arxiv.org/abs/2211.10578) -ustc, PAMI2023, [code](https://github.com/FangShancheng/ABINet-PP)<br>
[Language Matters: A Weakly Supervised Pre-training Approach for Scene Text Detection and Spotting](https://arxiv.org/abs/2203.03911) -bytedance, arxiv2022<br>
[DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting](https://arxiv.org/abs/2203.05122) -naver, arxiv2022, [code]()<br>
[SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition](https://arxiv.org/abs/2203.10209) -jinlianwen, CVPR2022, [code](https://github.com/mxin262/SwinTextSpotter)<br>
[End-to-End Video Text Spotting with Transformer](https://arxiv.org/abs/2203.10539) -shenchunhua, arxiv2022, [code](https://github.com/weijiawu/TransDETR)<br>
[Text Spotting Transformers](https://arxiv.org/abs/2204.01918) -intel, CVPR2022<br>
[PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System](https://arxiv.org/abs/2206.03001) -baidu, arxiv2022<br>
\[light\][PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System](https://arxiv.org/abs/2109.03144) -baidu, arxiv2021, [code](https://github.com/PaddlePaddle/PaddleOCR)<br>
\[icdar competition\][1st Place Solution to ICDAR 2021 RRC-ICTEXT End-to-end Text Spotting and Aesthetic Assessment on Integrated Circuit](https://arxiv.org/abs/2104.03544) -hikvision, arxiv2021<br>
[ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting](https://arxiv.org/abs/2105.03620) -jinlianwen, arxiv2021, [code](https://github.com/aim-uofa/AdelaiDet/)<br>
\[light\][PP-OCR: A Practical Ultra Lightweight OCR System](https://arxiv.org/abs/2009.09941) -baidu, arxiv2020, [code](https://github.com/PaddlePaddle/PaddleOCR)<br>
[Character Region Attention For Text Spotting](https://arxiv.org/abs/2007.09629) -ECCV2020<br>
[Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting](https://arxiv.org/abs/2007.09482) -baixiang, ECCV2020, [code]()<br>
[Text Detection and Recognition in the Wild: A Review](https://arxiv.org/abs/2006.04305) -arxiv2020<br>
[Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting](https://arxiv.org/abs/2002.06820) -Liang Qiao, AAAI2020<br>
[All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting](https://arxiv.org/abs/1911.09550) -baixiang, AAAI2020<br>
[ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network](https://arxiv.org/abs/2002.10200) -jin lianwen, CVPR2020<br>
[Convolutional Character Networks](https://arxiv.org/abs/1910.07954) -Linjie Xing, Zhi Tian, Weilin Huang, Matthew R. Scott, ICCV2019<br>
[TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting](http://openaccess.thecvf.com/content_ICCV_2019/papers/Feng_TextDragon_An_End-to-End_Framework_for_Arbitrary_Shaped_Text_Spotting_ICCV_2019_paper.pdf) -Chenglin Liu, ICCV2019<br>
[Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes](https://arxiv.org/abs/1908.08207) -baixiang, TPAMI2019<br>
[Towards Unconstrained End-to-End Text Spotting](https://arxiv.org/abs/1908.09231) -google ai, arxiv2019<br>
[Towards End-to-End Text Spotting in Natural Scenes](https://arxiv.org/abs/1906.06013) -Hui Li, Peng Wang, Chunhua Shen, arxiv2019<br>
[Weakly supervised precise segmentation for historical document images](https://www.sciencedirect.com/science/article/pii/S0925231219304989) -Jin Lianwen, Neurocomputing2019<br>
[A Novel Integrated Framework for Learning both Text Detection and Recognition](https://arxiv.org/abs/1811.08611) -alibaba, arxiv2018<br>
[TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network](https://arxiv.org/abs/1812.09900) -baidu, arxiv2018<br>
[Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes](https://arxiv.org/abs/1807.02242) -Pengyuan Lyu, Minghui Liao, Cong Yao, Wenhao Wu, Xiang Bai, arxiv2018<br>
[FOTS: Fast Oriented Text Spotting with a Unified Network](https://arxiv.org/abs/1801.01671) -Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan, CVPR2018<br>
[E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text](https://arxiv.org/abs/1801.09919) -Yash Patel, et al, arxiv2018<br>
[SEE: Towards Semi-Supervised End-to-End Scene Text Recognition](http://arxiv.org/abs/1712.05404) -Christian Bartz, Haojin Yang, Christoph Meinel, AAAI2018<br>
[An end-to-end TextSpotter with Explicit Alignment and Attention](https://arxiv.org/abs/1803.03474) -Tong He, Zhi Tian, Weilin Huang, Chunhua Shen, Yu Qiao, Changming Sun, CVPR2018<br>
[Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks](http://openaccess.thecvf.com/content_ICCV_2017/papers/Li_Towards_End-To-End_Text_ICCV_2017_paper.pdf) -Hui Li, et al, ICCV2017<br>
[Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework](http://openaccess.thecvf.com/content_ICCV_2017/papers/Busta_Deep_TextSpotter_An_ICCV_2017_paper.pdf) -Michal Busta, et al, ICCV2017, [code](https://github.com/MichalBusta/DeepTextSpotter)<br>
[Reading Text in the Wild with Convolutional Neural Networks](https://link.springer.com/article/10.1007%2Fs11263-015-0823-z) -Max Jaderberg, et al, IJCV2016<br>

## Text Retrieval
[Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers](https://arxiv.org/abs/2103.16553) -zisserman, CVPR2021<br>
[Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval](https://arxiv.org/abs/2104.00650) -zisserman, arxiv2021<br>
[Scene Text Retrieval via Joint Text Detection and Similarity Learning](https://arxiv.org/abs/2104.01552) -baixiang, CVPR2021, [code/CSVTR database](https://github.com/lanfeng4659/STR-TDSL)<br>

### Synthesis
[synthtiger](https://github.com/clovaai/synthtiger)<br>
[Editing Text in the Wild](https://arxiv.org/abs/1908.03047) -baidu, ACM MM 2019<br>
[Data Augmentation for Scene Text Recognition](https://arxiv.org/abs/2108.06949) -ICCV2021 workshop, [code](https://github.com/roatienza/straug)<br>
[text_renderer](https://github.com/oh-my-ocr/text_renderer)<br>
[SynthText](https://github.com/ankush-me/SynthText)<br>
[TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator)<br>
[UnrealText](https://github.com/Jyouhou/UnrealText/)<br>
[ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation](https://arxiv.org/abs/2003.10557) -Sharon Fogel, CVPR2020<br>
[SynthText3D: Synthesizing Scene Text Images from 3D Virtual Worlds](https://arxiv.org/abs/1907.06007) -Minghui Liao, Boyu Song, Minghang He, Shangbang Long, Cong Yao, Xiang Bai, arxiv2019, [code](https://github.com/MhLiao/SynthText3D)<br>
[Spatial Fusion GAN for Image Synthesis](https://arxiv.org/abs/1812.05840) -Fangneng Zhan, Hongyuan Zhu, Shijian Lu, CVPR2019, [code](https://github.com/Sunshine352/SF-GAN)<br>
[Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes](https://arxiv.org/abs/1807.03021) -Fangneng Zhan, Shijian Lu, Chuhui Xue, ECCV2018<br>

### Evaluation
[CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks](https://arxiv.org/abs/2006.06244) -arxiv2020<br>
[End-To-End Measure for Text Recognition](https://arxiv.org/abs/1908.09584) -ICDAR2019<br>
[Tightness-aware Evaluation Protocol for Scene Text Detection](https://arxiv.org/abs/1904.00813) -jinlianwen, CVPR2019<br>

### Script identification
[Patch Aggregator for Scene Text Script Identification](https://arxiv.org/abs/1912.03818) -baixiang, arxiv2019<br>

### Super Resolution
[Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution](https://ojs.aaai.org/index.php/AAAI/article/view/19904) -fudan, AAAI2022, [code](https://github.com/FudanVI/FudanOCR/tree/main/text-gestalt)<br>
[Restormer: Efficient Transformer for High-Resolution Image Restoration](https://arxiv.org/abs/2111.09881) -google, CVPR2022, [code](https://github.com/swz30/Restormer)<br>
[Scene Text Telescope: Text-Focused Scene Image Super-Resolution](https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Scene_Text_Telescope_Text-Focused_Scene_Image_Super-Resolution_CVPR_2021_paper.html?ref=https://githubhelp.com) -fudan, CVPR2021<br>
[Text Prior Guided Scene Text Image Super-resolution](https://arxiv.org/abs/2106.15368) -arxiv2021, [code](https://github.com/mjq11302010044/TPGSR)<br>
[Scene Text Image Super-Resolution in the Wild](https://arxiv.org/abs/2005.03341) -baixiang, ECCV2020<br>

## Other
[AnyText: Multilingual Visual Text Generation And Editing](https://arxiv.org/abs/2311.03054) -alibaba, arxiv2023, [code](https://github.com/tyxsspa/AnyText)<br> 
[Stroke-Based Scene Text Erasing Using Synthetic Data for Training](https://ieeexplore.ieee.org/abstract/document/9609970) -TIP2021<br>
[Page Layout Analysis System for Unconstrained Historic Documents](https://arxiv.org/abs/2102.11838) -ICDAR2021<br>
[EraseNet: End-to-End Text Removal in the Wild](https://ieeexplore.ieee.org/document/9180003) -Jinlianwen, TIP2020, [code](https://github.com/HCIILAB/SCUT-EnsText)<br>
[SwapText: Image Based Texts Transfer in Scenes](https://arxiv.org/abs/2003.08152) -Qiangpeng Yang, CVPR2020<br>
[UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World](https://arxiv.org/abs/2003.10608) -Cong Yao, CVPR2020<br>
[EnsNet: Ensconce Text in the Wild](https://ojs.aaai.org/index.php/AAAI/article/view/3859) -JinLianwen, AAAI2019, [code](https://github.com/HCIILAB/SCUT-EnsText)<br>
[TextSR: Content-Aware Text Super-Resolution Guided by Recognition](https://arxiv.org/abs/1909.07113) -forevision, arxiv2019<br>
[Editing Text in the Wild](https://arxiv.org/abs/1908.03047) -baixiang, ACM MM2019<br>
[MTRNet: A Generic Scene Text Eraser](https://ieeexplore.ieee.org/abstract/document/8978083) -ICDAR2019<br>
[Scene Text Detection and Recognition: The Deep Learning Era](https://arxiv.org/abs/1811.04256) -face++, arxiv2018<br>
[Text/non-text image classification in the wild with convolutional neural networks](https://www.sciencedirect.com/science/article/pii/S0031320316303922) -X Bai, B Shi, C Zhang, X Cai, L Qi, PR2017<br>
[Scene text script identification with convolutional recurrent neural networks](http://ieeexplore.ieee.org/abstract/document/7900268/) -J Mei, L Dai, B Shi, X Bai, ICPR2016<br>

## Seq2Seq
[Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122) -FAIR, ICML2017<br>
[Sequence Level Training with Recurrent Neural Networks](https://arxiv.org/abs/1511.06732) -FAIR, ICLR2016<br>
[A Convolutional Encoder Model for Neural Machine Translation](https://arxiv.org/abs/1611.02344) -FAIR, arxiv2016<br>

## Reading Order
[LayoutReader: Pre-training of Text and Layout for Reading Order Detection](https://arxiv.org/abs/2108.11591) -MSRA, EMNLP2021, [code/database](https://github.com/microsoft/unilm/tree/master/layoutreader)<br>

## Database & Generation
### chinese
[TRW15: ICDAR 2015 Text Reading in the Wild Competition](https://arxiv.org/abs/1506.03184)<br>
RCTW-17: ICDAR2017-Reading Chinese Text in the Wild<br>
STV2k: A New Benchmark for Scene Text Detection and Recognition<br>
[CTW: Chinese Text in the Wild](https://ctwdataset.github.io/)<br>
PAL10K<br>
[COCO TS Dataset](https://arxiv.org/abs/1904.00818)<br>
[ICPR MTWI 2018 Challenge 1: Text Recognition in Web Images](https://tianchi.aliyun.com/competition/information.htm?raceId=231650)<br>
### other
[Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition](https://arxiv.org/abs/2203.12165)<br>
[Textual Visual Semantic Dataset for Text Spotting](https://github.com/HCIILAB/dataset)<br>
[RoadText-1K: Text Detection & Recognition Dataset for Driving Videos](https://arxiv.org/abs/2005.09496)<br>
[DDI-100: Dataset for Text Detection and Recognition](https://arxiv.org/abs/1912.11658)<br>
[Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes](https://arxiv.org/pdf/1807.03021.pdf) -Fangneng Zhan, Shijian Lu, and Chuhui Xue, arxiv2018<br>
[Total-Text](https://github.com/cs-chan/Total-Text-Dataset) -1555 images<br>
[SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector) -Curved text in the wild<br>
[MLT: Multi-lingual scene text detection and script identification](http://rrc.cvc.uab.es/?ch=8) -Multi-lingual text: 18,000 images, 9 different languages representing 6 different scripts<br>
[Synthetic Word Dataset](http://www.robots.ox.ac.uk/~vgg/data/text/), [Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition](https://arxiv.org/abs/1406.2227)<br>
[Total-text: A comprehensive dataset for scene text detection and recognition](https://ieeexplore.ieee.org/abstract/document/8270088/) -Chee Kheng Ch'ng, Chee Seng Chan<br>
Street View Text(SVT)<br>
IIIT 5k-words<br>
MSRA-TD500<br>
KAIST Scene_Text Database<br>
ICDAR2011, ICDAR2013, ICDAR2015, ICDAR2017, robust reading-Focused Scene Text<br>
ICDAR2017-ICDAR 2017 Robust Reading Challenge on Omnidirectional Video(DOST)<br>
COCO-Text<br>
Google French Street Name Signs (FSNS) dataset<br>
ICDAR2017-ICDAR2017 Competition on Multi-lingual scene text detection and script identification(MLT)<br>
ICDAR2017-Born-Digital Images (Web and Email)<br>
[Detecting Curve Text in the Wild: New Dataset and New Solution](https://arxiv.org/abs/1712.02170)<br>
[Synthetic Word](http://www.robots.ox.ac.uk/~vgg/data/text/)<br>
[Synthetic Data for Text Localisation in Natural Images](https://www.cv-foundation.org/openaccess/content_cvpr_2016/app/S10-06.pdf) -Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR2016<br>
### vietnamese
[VinText](https://github.com/VinAIResearch/dict-guided)<br>

## Competition
[ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)](https://arxiv.org/abs/1708.09585) -B Shi, C Yao, M Liao, M Yang, P Xu, L Cui, arxiv2017<br>
[ICDAR 2015 competition on robust reading](https://ieeexplore.ieee.org/abstract/document/7333942/)<br>
[Incidental Scene Text Understanding: Recent Progresses on ICDAR 2015 Robust Reading Competition Challenge 4](https://arxiv.org/abs/1511.09207) -Cong Yao, Jianan Wu, Xinyu Zhou, Chi Zhang, Shuchang Zhou, Zhimin Cao, Qi Yin<br>

## Link
[awesome-deep-text-detection-recognition](https://github.com/hwalsuklee/awesome-deep-text-detection-recognition)<br>
[Awesome-Scene-Text-Recognition](https://github.com/chongyangtao/Awesome-Scene-Text-Recognition)<br>
[Scene Text Detection](https://paperswithcode.com/task/scene-text-detection/codeless)<br>