TransRegex: Multi-modal Regular Expression Synthesis by Generate-and-Repair

Since regular expressions (abbrev. regexes) are difficult to understand and compose, automatically generating regexes has been an important research problem. This paper introduces TransRegex, for automatically constructing regexes from both natural language descriptions and examples. To the best of...

Bibliographic details
Published in: Proceedings / International Conference on Software Engineering, pp. 1210-1222
Main authors: Li, Yeting; Li, Shuaimin; Xu, Zhiwu; Cao, Jialun; Chen, Zixuan; Hu, Yun; Chen, Haiming; Cheung, Shing-Chi
Format: Conference paper
Language: English
Published: IEEE, 1 May 2021
Subjects: Maintenance engineering; Natural languages; programming by example; programming by natural languages; regex repair; regex synthesis; Software engineering
ISBN: 1665402962, 9781665402965
ISSN: 1558-1225
Abstract Since regular expressions (abbrev. regexes) are difficult to understand and compose, automatically generating regexes has been an important research problem. This paper introduces TransRegex, which automatically constructs regexes from both natural language descriptions and examples. To the best of our knowledge, TransRegex is the first to treat the NLP-and-example-based regex synthesis problem as the problem of NLP-based synthesis with regex repair. For this purpose, we present novel algorithms for both NLP-based synthesis and regex repair. We evaluate TransRegex against ten relevant state-of-the-art tools on three publicly available datasets. The evaluation results demonstrate that the accuracy of TransRegex is 17.4%, 35.8% and 38.9% higher than that of NLP-based approaches on the three datasets, respectively. Furthermore, TransRegex achieves 10% to 30% higher accuracy than the state-of-the-art multi-modal techniques on all three datasets. The evaluation results also indicate that TransRegex utilizes natural language and examples in a more effective way.
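The generate-and-repair idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the generator below is a hypothetical stand-in for an NLP-based synthesizer, and the repair step is a naive search over single quantifier swaps, chosen only to show how examples can correct a candidate regex. All function names are assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, Iterator, List, Optional


def is_consistent(pattern: str, positives: Iterable[str], negatives: Iterable[str]) -> bool:
    """Return True if the pattern fully matches every positive and no negative example."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return False  # a syntactically invalid candidate is never consistent
    return (all(compiled.fullmatch(s) for s in positives)
            and not any(compiled.fullmatch(s) for s in negatives))


def candidate_repairs(pattern: str) -> Iterator[str]:
    """Yield naive single-edit variants by swapping one quantifier character.

    This ignores escaping and character classes; it is only a toy repair space.
    """
    swaps = {"*": ["+", "?"], "+": ["*", "?"], "?": ["*", "+"]}
    for i, ch in enumerate(pattern):
        for alt in swaps.get(ch, []):
            yield pattern[:i] + alt + pattern[i + 1:]


def generate_and_repair(description: str,
                        positives: List[str],
                        negatives: List[str],
                        generate: Callable[[str], str]) -> Optional[str]:
    """Generate a regex from the description, then repair it against the examples."""
    candidate = generate(description)
    if is_consistent(candidate, positives, negatives):
        return candidate
    for repaired in candidate_repairs(candidate):
        if is_consistent(repaired, positives, negatives):
            return repaired
    return None  # no single-edit repair satisfies the examples


def toy_generator(description: str) -> str:
    """Hypothetical stand-in for an NLP model; returns a slightly wrong regex."""
    return "[0-9]*"


result = generate_and_repair("one or more digits", ["1", "42"], ["", "a1"], toy_generator)
print(result)
```

In this toy run the generated candidate wrongly accepts the empty string, and the negative example steers the repair step to a stricter quantifier. A real system would replace both the generator and the repair search with far more capable components.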
Author Hu, Yun
Li, Yeting
Cheung, Shing-Chi
Cao, Jialun
Xu, Zhiwu
Chen, Haiming
Li, Shuaimin
Chen, Zixuan
Author_xml – sequence: 1
  givenname: Yeting
  surname: Li
  fullname: Li, Yeting
  email: liyt@ios.ac.cn
  organization: State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China
– sequence: 2
  givenname: Shuaimin
  surname: Li
  fullname: Li, Shuaimin
  email: lishuaimin17@mails.ucas.ac.cn
  organization: University of Chinese Academy of Sciences, China
– sequence: 3
  givenname: Zhiwu
  surname: Xu
  fullname: Xu, Zhiwu
  email: xuzhiwu@szu.edu.cn
  organization: Shenzhen University, China
– sequence: 4
  givenname: Jialun
  surname: Cao
  fullname: Cao, Jialun
  email: fjcaoap@cse.ust.hk
  organization: The Hong Kong University of Science and Technology, China
– sequence: 5
  givenname: Zixuan
  surname: Chen
  fullname: Chen, Zixuan
  email: chenzx@ios.ac.cn
  organization: State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China
– sequence: 6
  givenname: Yun
  surname: Hu
  fullname: Hu, Yun
  email: huyun2016@iscas.ac.cn
  organization: Science University of Chinese Academy of Sciences, China
– sequence: 7
  givenname: Haiming
  surname: Chen
  fullname: Chen, Haiming
  email: chm@ios.ac.cn
  organization: State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China
– sequence: 8
  givenname: Shing-Chi
  surname: Cheung
  fullname: Cheung, Shing-Chi
  email: scc@cse.ust.hk
  organization: The Hong Kong University of Science and Technology, China
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ICSE43902.2021.00111
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EndPage 1222
ExternalDocumentID 9401951
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
– fundername: Research and Development
  funderid: 10.13039/100006190
IEDL.DBID RIE
ISBN 1665402962
9781665402965
ISICitedReferencesCount 12
ISSN 1558-1225
IngestDate Wed Aug 27 02:50:26 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
OpenAccessLink https://repository.hkust.edu.hk/ir/bitstream/1783.1-110215/1/037557_1.pdf
PageCount 13
ParticipantIDs ieee_primary_9401951
PublicationCentury 2000
PublicationDate 2021-May
PublicationDateYYYYMMDD 2021-05-01
PublicationDate_xml – month: 05
  year: 2021
  text: 2021-May
PublicationDecade 2020
PublicationTitle Proceedings / International Conference on Software Engineering
PublicationTitleAbbrev ICSE
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib044791064
ssj0006499
SourceID ieee
SourceType Publisher
StartPage 1210
SubjectTerms Maintenance engineering
Natural languages
programming by example
programming by natural languages
regex repair
regex synthesis
Software engineering
Title TransRegex: Multi-modal Regular Expression Synthesis by Generate-and-Repair
URI https://ieeexplore.ieee.org/document/9401951