Effectively Incorporating Expert Knowledge in Automated Software Remodularisation

Remodularising the components of a software system is challenging: sound design principles (e.g., coupling and cohesion) need to be balanced against developer intuition of which entities conceptually belong together. Despite this, automated approaches to remodularisation tend to ignore domain knowle...

Bibliographic details
Published in: IEEE Transactions on Software Engineering, Vol. 44, No. 7, pp. 613-630
Main authors: Hall, Mathew; Walkinshaw, Neil; McMinn, Phil
Format: Journal Article
Language: English
Published: New York: IEEE, 1 July 2018
IEEE Computer Society
Subjects:
ISSN: 0098-5589, 1939-3520
Online access: Full text
Abstract Remodularising the components of a software system is challenging: sound design principles (e.g., coupling and cohesion) need to be balanced against developer intuition of which entities conceptually belong together. Despite this, automated approaches to remodularisation tend to ignore domain knowledge, leading to results that can be nonsensical to developers. Nevertheless, supplying such knowledge is a potentially burdensome task to perform manually. A lot of information may need to be specified, particularly for large systems. Addressing these concerns, we propose the SUpervised reMOdularisation (SUMO) approach. SUMO is a technique that aims to leverage a small subset of domain knowledge about a system to produce a remodularisation that will be acceptable to a developer. With SUMO, developers refine a modularisation by iteratively supplying corrections. These corrections constrain the type of remodularisation eventually required, enabling SUMO to dramatically reduce the solution space. This in turn reduces the amount of feedback the developer needs to supply. We perform a comprehensive systematic evaluation using 100 real-world subject systems. Our results show that SUMO guarantees convergence on a target remodularisation with a tractable amount of user interaction.
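Illustration (not part of the catalogued record): the abstract's claim that developer corrections shrink the solution space can be shown on a toy example. The sketch below is not the authors' SUMO implementation. It assumes, purely for illustration, that a correction is either a "together" pair (two entities must share a module) or an "apart" pair (they must not), and it brute-forces every set partition of four hypothetical entity names to count how many candidate modularisations remain.

```python
def partitions(items):
    """Yield every set partition of `items` as a list of modules (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Option 1: `first` forms a new module on its own.
        yield [[first]] + smaller
        # Option 2: `first` joins one of the existing modules.
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]


def satisfies(partition, together, apart):
    """Check a candidate modularisation against the supplied corrections."""
    module_of = {e: idx for idx, module in enumerate(partition) for e in module}
    return (all(module_of[a] == module_of[b] for a, b in together)
            and all(module_of[a] != module_of[b] for a, b in apart))


def candidates(items, together=(), apart=()):
    """Return all modularisations consistent with the corrections so far."""
    return [p for p in partitions(items) if satisfies(p, together, apart)]


# Hypothetical entity names, chosen only for this example.
entities = ["Parser", "Lexer", "Renderer", "Logger"]

no_feedback = candidates(entities)
one_correction = candidates(entities, together=[("Parser", "Lexer")])
two_corrections = candidates(entities,
                             together=[("Parser", "Lexer")],
                             apart=[("Parser", "Renderer")])

print(len(no_feedback), len(one_correction), len(two_corrections))  # 15 5 3
```

With four entities there are 15 possible modularisations; one "together" correction leaves 5 and an additional "apart" correction leaves 3. Brute-force enumeration only works for tiny inputs, so a practical tool would need a constraint solver or a search heuristic instead.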
Author McMinn, Phil
Walkinshaw, Neil
Hall, Mathew
Author_xml – sequence: 1
  givenname: Mathew
  orcidid: 0000-0002-9408-2996
  surname: Hall
  fullname: Hall, Mathew
  email: mathew.hall@sheffield.ac.uk
  organization: Department of Computer Science, University of Sheffield, Sheffield, United Kingdom
– sequence: 2
  givenname: Neil
  orcidid: 0000-0003-2134-6548
  surname: Walkinshaw
  fullname: Walkinshaw, Neil
  email: nw91@leicester.ac.uk
  organization: Department of Informatics, University of Leicester, Leicester, United Kingdom
– sequence: 3
  givenname: Phil
  orcidid: 0000-0001-9137-7433
  surname: McMinn
  fullname: McMinn, Phil
  email: p.mcminn@sheffield.ac.uk
  organization: Department of Computer Science, University of Sheffield, Sheffield, United Kingdom
CODEN IESEDJ
CitedBy_id crossref_primary_10_1016_j_infsof_2024_107567
crossref_primary_10_1145_3676960
crossref_primary_10_1109_TEM_2022_3160069
crossref_primary_10_1109_TSE_2020_3042553
crossref_primary_10_1007_s10664_021_10049_7
crossref_primary_10_1109_TSE_2024_3523487
crossref_primary_10_1016_j_jss_2021_111162
ContentType Journal Article
Copyright Copyright IEEE Computer Society 2018
Copyright_xml – notice: Copyright IEEE Computer Society 2018
DBID 97E
ESBDL
RIA
RIE
AAYXX
CITATION
JQ2
K9.
DOI 10.1109/TSE.2017.2786222
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
ProQuest Computer Science Collection
ProQuest Health & Medical Complete (Alumni)
DatabaseTitle CrossRef
ProQuest Health & Medical Complete (Alumni)
ProQuest Computer Science Collection
DatabaseTitleList
ProQuest Health & Medical Complete (Alumni)
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1939-3520
EndPage 630
ExternalDocumentID 10_1109_TSE_2017_2786222
8259332
Genre orig-research
GrantInformation_xml – fundername: Engineering and Physical Sciences Research Council; EPSRC
  grantid: EP/F065825/1
  funderid: 10.13039/501100000266
ID FETCH-LOGICAL-c333t-128cab456839a2ce03c0936b8e40085fd4ce762e4851f3701f512c46cf4fd26f3
IEDL.DBID RIE
ISICitedReferencesCount 7
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000438906900001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0098-5589
IngestDate Sun Nov 30 04:45:22 EST 2025
Sat Nov 29 03:10:24 EST 2025
Tue Nov 18 22:37:43 EST 2025
Wed Aug 27 02:32:40 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 7
Language English
License https://creativecommons.org/licenses/by/3.0/legalcode
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c333t-128cab456839a2ce03c0936b8e40085fd4ce762e4851f3701f512c46cf4fd26f3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0003-2134-6548
0000-0002-9408-2996
0000-0001-9137-7433
OpenAccessLink https://ieeexplore.ieee.org/document/8259332
PQID 2117050733
PQPubID 21418
PageCount 18
ParticipantIDs proquest_journals_2117050733
crossref_citationtrail_10_1109_TSE_2017_2786222
crossref_primary_10_1109_TSE_2017_2786222
ieee_primary_8259332
PublicationCentury 2000
PublicationDate 2018-07-01
PublicationDateYYYYMMDD 2018-07-01
PublicationDate_xml – month: 07
  year: 2018
  text: 2018-07-01
  day: 01
PublicationDecade 2010
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on software engineering
PublicationTitleAbbrev TSE
PublicationYear 2018
Publisher IEEE
IEEE Computer Society
Publisher_xml – name: IEEE
– name: IEEE Computer Society
SSID ssj0005775
ssib053395008
Score 2.3162827
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 613
SubjectTerms Acceptable noise levels
Algorithm design and analysis
Automation
Clustering algorithms
domain knowledge
set partitioning
Software
Software algorithms
Software engineering
Software remodularisation
Software systems
Solution space
Title Effectively Incorporating Expert Knowledge in Automated Software Remodularisation
URI https://ieeexplore.ieee.org/document/8259332
https://www.proquest.com/docview/2117050733
Volume 44
WOSCitedRecordID wos000438906900001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1939-3520
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0005775
  issn: 0098-5589
  databaseCode: RIE
  dateStart: 19750101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE