S-SIRUS: an explainability algorithm for spatial regression Random Forest

Published in: Statistics and Computing, Volume 35, Issue 5
Main authors: Patelli, Luca; Golini, Natalia; Ignaccolo, Rosaria; Cameletti, Michela
Format: Journal Article
Language: English
Publication details: Dordrecht: Springer Nature B.V., 1 October 2025
ISSN:0960-3174, 1573-1375
Abstract: Random Forest (RF) is a widely used machine learning algorithm known for its flexibility, user-friendliness, and high predictive performance across various domains. However, it is not interpretable, which can limit its usefulness in applied sciences, where understanding the relationships between the predictors and the response variable is crucial for decision-making. Several methods have been proposed in the literature to explain RF, but none of them addresses the challenge of explaining RF in the context of spatially dependent data. This work therefore aims to explain regression RF for spatially dependent data by extracting a compact and simple list of rules from RF-GLS, an RF that explicitly takes the spatial correlation into account. To this end, we propose S-SIRUS, a spatial extension of SIRUS, a well-established regression rule algorithm that extracts a stable and short list of rules from the classical regression RF algorithm. To our knowledge, S-SIRUS is the only explainability tool proposed to open an RF-GLS, which, in turn, is the only random forest algorithm in the literature that accounts for spatial correlation internally. A simulation study was conducted to evaluate the explainability capability of S-SIRUS under different levels of spatial dependence among the data. The results suggest that S-SIRUS achieves higher test predictive accuracy than SIRUS when spatial correlation is present. We encourage the use of SIRUS in the absence of spatial correlation and recommend adopting S-SIRUS when such correlation is present.
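The abstract describes extracting "a stable and short list of rules" from a forest. As a rough illustration only, the sketch below shows one common way such a selection can work: count how often each rule (a path of splits) occurs across the trees and keep the rules whose frequency exceeds a threshold. Everything here is an assumption made for illustration: the toy `forest`, the rule encoding, the function name `select_frequent_rules`, and the threshold name `p0` are hypothetical and are not taken from the paper or from any package. The forest is hand-written rather than fitted, so nothing in the sketch reflects how RF-GLS grows its trees.

```python
from collections import Counter

# A rule is a tuple of (feature, operator, threshold) splits along one tree path.
# Hypothetical toy forest: each tree is represented by the set of rules it contains.
forest = [
    {(("x1", "<", 0.5),), (("x1", ">=", 0.5),),
     (("x1", "<", 0.5), ("x2", "<", 0.3))},
    {(("x1", "<", 0.5),), (("x1", ">=", 0.5),),
     (("x1", ">=", 0.5), ("x2", ">=", 0.7))},
    {(("x2", "<", 0.3),), (("x2", ">=", 0.3),),
     (("x2", "<", 0.3), ("x1", "<", 0.5))},
    {(("x1", "<", 0.5),), (("x1", ">=", 0.5),)},
]


def select_frequent_rules(forest, p0):
    """Keep rules whose share of trees containing them is strictly above p0."""
    # Each tree is a set, so a rule is counted at most once per tree.
    counts = Counter(rule for tree in forest for rule in tree)
    n_trees = len(forest)
    selected = [(rule, c / n_trees) for rule, c in counts.items()
                if c / n_trees > p0]
    # Most frequent rules first; ties are broken by the rule itself.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


rules = select_frequent_rules(forest, p0=0.5)
for rule, frequency in rules:
    print(frequency, rule)
```

With this toy input, only the two single-split rules on `x1` occur in more than half of the trees, so the returned list is short. Raising `p0` shortens the list further, and lowering it admits rarer rules.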
Article number: 142
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
DOI: 10.1007/s11222-025-10656-0
Discipline: Statistics; Mathematics; Computer Science
IsPeerReviewed true
IsScholarly true
Subject terms: Algorithms; Correlation; Decision trees; Machine learning; Regression; Spatial dependencies