SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

Scene text recognition is a hot research topic in computer vision. Recently, many recognition methods based on the encoder-decoder framework have been proposed, and they can handle scene texts of perspective distortion and curve shape. Nevertheless, they still face lots of challenges like image blur...

Bibliographic details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 13525-13534
Main authors: Qiao, Zhi; Zhou, Yu; Yang, Dongbao; Zhou, Yucan; Wang, Weiping
Format: Conference proceeding
Language: English
Published: IEEE, 1 June 2020
Subjects:
ISSN: 1063-6919
Online access: Full text
Abstract Scene text recognition is a hot research topic in computer vision. Recently, many recognition methods based on the encoder-decoder framework have been proposed, and they can handle scene text with perspective distortion and curved shapes. Nevertheless, they still face many challenges, such as image blur, uneven illumination, and incomplete characters. We argue that most encoder-decoder methods are based on local visual features without explicit global semantic information. In this work, we propose a semantics enhanced encoder-decoder framework to robustly recognize low-quality scene text. The semantic information is used both in the encoder module for supervision and in the decoder module for initialization. In particular, the state-of-the-art ASTER method is integrated into the proposed framework as an exemplar. Extensive experiments demonstrate that the proposed framework is more robust for low-quality text images and achieves state-of-the-art results on several benchmark datasets. The source code will be available.
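The abstract states only that semantic information supervises the encoder and initializes the decoder. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: mean pooling of encoder features, linear projections, a cosine-distance loss against a word embedding, and a tanh-squashed initial decoder state are all illustrative choices, and every function and parameter name (`semantic_step`, `w_sem`, `w_init`, and so on) is hypothetical.

```python
import math


def cosine_loss(pred, target):
    """Return 1 - cos(pred, target): 0 when aligned, 2 when opposite."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def mean_pool(features):
    """Average a sequence of feature vectors into one global vector."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def semantic_step(encoder_features, w_sem, w_init, word_embedding):
    """Predict a global semantic vector and use it in two places.

    Assumed design (not confirmed by the record):
    1. supervision: compare the predicted vector with a word embedding;
    2. initialization: project the predicted vector to the decoder's
       initial hidden state.
    """
    pooled = mean_pool(encoder_features)
    semantic = linear(w_sem, pooled)                        # predicted semantics
    loss = cosine_loss(semantic, word_embedding)            # encoder-side supervision
    h0 = [math.tanh(x) for x in linear(w_init, semantic)]   # decoder initial state
    return semantic, loss, h0


# Toy usage with hand-picked numbers (all values are made up).
features = [[1.0, 0.0], [0.0, 1.0]]
w_sem = [[1.0, 0.0], [0.0, 1.0]]
w_init = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
semantic, loss, h0 = semantic_step(features, w_sem, w_init, [1.0, 1.0])
```

In a full model, the supervision loss would presumably be added to the recognition loss during training, while at inference time only the decoder initialization would be used; the abstract does not spell out either detail.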
Author Yang, Dongbao
Zhou, Yucan
Wang, Weiping
Qiao, Zhi
Zhou, Yu
Author_xml – sequence: 1
  givenname: Zhi
  surname: Qiao
  fullname: Qiao, Zhi
  organization: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
– sequence: 2
  givenname: Yu
  surname: Zhou
  fullname: Zhou, Yu
  organization: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
– sequence: 3
  givenname: Dongbao
  surname: Yang
  fullname: Yang, Dongbao
  organization: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
– sequence: 4
  givenname: Yucan
  surname: Zhou
  fullname: Zhou, Yucan
  organization: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
– sequence: 5
  givenname: Weiping
  surname: Wang
  fullname: Wang, Weiping
  organization: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR42600.2020.01354
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9781728171685
1728171687
EISSN 1063-6919
EndPage 13534
ExternalDocumentID 9157362
Genre orig-research
IEDL.DBID RIE
ISICitedReferencesCount 219
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001309199906042&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:30:35 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 10
ParticipantIDs ieee_primary_9157362
PublicationCentury 2000
PublicationDate 2020-Jun
PublicationDateYYYYMMDD 2020-06-01
PublicationDate_xml – month: 06
  year: 2020
  text: 2020-Jun
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2020
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211698
Score 2.600729
SourceID ieee
SourceType Publisher
StartPage 13525
SubjectTerms Decoding
Feature extraction
Image recognition
Semantics
Task analysis
Text recognition
Visualization
Title SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
URI https://ieeexplore.ieee.org/document/9157362
WOSCitedRecordID wos001309199906042&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D