Applying Fellegi-Sunter (FS) Model for Traceability Link Recovery between Bug Databases and Version Archives

Bibliographic Details
Published in: 2011 18th Asia-Pacific Software Engineering Conference, pp. 146-153
Main Authors: Sureka, A., Lal, S., Agarwal, L.
Format: Conference Proceeding
Language: English
Published: IEEE 01.12.2011
ISBN: 9781457721991, 1457721996
ISSN: 1530-1362
Abstract: Defect tracking systems such as Bugzilla and JIRA and version control systems such as CVS and SVN are widely used to support software development and maintenance activities. Previous studies show that bug databases and version databases are often used as standalone, separate repositories without explicit links between issue reports and the corresponding commit transactions, because developers often do not mention or tag commit transactions with the relevant bug report IDs. The lack of explicit links between the two databases has been identified as a serious process data quality issue (incomplete and biased data), with implications for predictive model building (such as defect density and error proneness computation) and for hypothesis testing based on the dataset. Researchers have proposed solutions to link the two databases and have performed experiments on open-source projects such as Mozilla Firefox. We review previous approaches and propose a novel technique, based on the Fellegi-Sunter (FS) model for record linkage, that automatically integrates the two databases and overcomes some of the drawbacks of traditional methods. We validate the proposed approach through experiments on publicly available bug and version datasets obtained from two open-source projects (Apache HTTP Server and WikiMedia). The results demonstrate that the proposed solution is effective in recovering traceability links (missing links) between bug-fixing commits and the corresponding bug reports.
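To illustrate the general idea behind Fellegi-Sunter record linkage mentioned in the abstract, the following is a minimal sketch of how a candidate (bug report, commit) pair could be scored. It is not the authors' implementation: the comparison features (`id_mentioned`, `committer_is_assignee`, `commit_near_fix_time`), the m/u probabilities, and the two thresholds are all hypothetical values chosen for illustration. Only the scoring rule itself is the standard FS formulation: each feature contributes log2(m/u) when the pair agrees on it and log2((1-m)/(1-u)) when it disagrees.

```python
import math

# Hypothetical comparison features for a (bug report, commit) pair.
# m = P(feature agrees | pair is a true link)
# u = P(feature agrees | pair is not a link)
# These numbers are illustrative assumptions, not values from the paper.
M_U = {
    "id_mentioned": (0.60, 0.01),
    "committer_is_assignee": (0.80, 0.10),
    "commit_near_fix_time": (0.90, 0.20),
}


def match_weight(agreement):
    """Sum the per-feature Fellegi-Sunter log-likelihood-ratio weights."""
    total = 0.0
    for feature, (m, u) in M_U.items():
        if agreement[feature]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(weight, lower=0.0, upper=5.0):
    """Three-way FS decision; the thresholds here are assumed, not tuned."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"


if __name__ == "__main__":
    pair = {
        "id_mentioned": False,
        "committer_is_assignee": True,
        "commit_near_fix_time": True,
    }
    w = match_weight(pair)
    print(round(w, 2), classify(w))
```

In practice the m and u probabilities would be estimated from data (for example, from pairs that are already explicitly linked), and pairs falling between the two thresholds would be left for manual review.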
Author Affiliations:
Sureka, A. (ashish@iiitd.ac.in), IIIT-D, New Delhi, India
Lal, S. (sangeeta@iiitd.ac.in), IIIT-D, New Delhi, India
Agarwal, L. (luckyagarwal3247@gmail.com), LNMIIT, Jaipur, India
DOI 10.1109/APSEC.2011.12
Discipline Computer Science
EISBN 9780769546094
0769546099
SubjectTerms Automated Software Engineering
Couplings
Data Integration
Defect Tracking Systems
Fellegi-Sunter (FS) Model
Joining processes
Mining Software Repositories
Record Linkages
Software
Software engineering
Software Engineering Process Data Analysis
Traceability Link Recovery
Training
Vectors
Version Archives
URI https://ieeexplore.ieee.org/document/6130681