S-SIRUS: an explainability algorithm for spatial regression Random Forest
Random Forest (RF) is a widely used machine learning algorithm known for its flexibility, user-friendliness, and high predictive performance across various domains. However, it is non-interpretable. This can limit its usefulness in applied sciences, where understanding the relationships between pred...
| Published in: | Statistics and computing, Volume 35, Issue 5 |
|---|---|
| Main authors: | Luca Patelli, Natalia Golini, Rosaria Ignaccolo, Michela Cameletti |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Dordrecht: Springer Nature B.V, 2025-10-01 |
| Subjects: | Algorithms; Correlation; Decision trees; Machine learning; Regression; Spatial dependencies |
| ISSN: | 0960-3174, 1573-1375 |
| Abstract | Random Forest (RF) is a widely used machine learning algorithm known for its flexibility, user-friendliness, and high predictive performance across various domains. However, it is not interpretable, which can limit its usefulness in applied sciences, where understanding the relationships between the predictors and the response variable is crucial for decision-making. Several methods have been proposed in the literature to explain RF, but none of them addresses the challenge of explaining RF in the context of spatially dependent data. This work therefore aims to explain regression RF for spatially dependent data by extracting a compact and simple list of rules from an RF that explicitly takes the spatial correlation into account, namely RF-GLS. To this end, we propose S-SIRUS, a spatial extension of SIRUS, a well-established regression rule algorithm that extracts a stable and short list of rules from the classical regression RF algorithm. To our knowledge, S-SIRUS is the only explainability tool proposed to open an RF-GLS, which, in turn, is the only random forest algorithm in the literature that accounts for spatial correlation internally. A simulation study was conducted to evaluate the explainability of the proposed S-SIRUS under different levels of spatial dependence in the data. The results suggest that S-SIRUS achieves higher test predictive accuracy than SIRUS when spatial correlation is present. We encourage the use of SIRUS in the absence of spatial correlation and recommend adopting S-SIRUS when such correlation is present. |
|---|---|
| ArticleNumber | 142 |
| Author | Patelli, Luca; Golini, Natalia; Ignaccolo, Rosaria; Cameletti, Michela |
| ContentType | Journal Article |
| Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025. |
| DOI | 10.1007/s11222-025-10656-0 |
| Discipline | Statistics; Mathematics; Computer Science |
| EISSN | 1573-1375 |
| ISSN | 0960-3174 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Language | English |
| PublicationDate | 2025-10-01 |
| PublicationPlace | Dordrecht |
| PublicationTitle | Statistics and computing |
| PublicationYear | 2025 |
| Publisher | Springer Nature B.V |
| SubjectTerms | Algorithms; Correlation; Decision trees; Machine learning; Regression; Spatial dependencies |
| Title | S-SIRUS: an explainability algorithm for spatial regression Random Forest |
| URI | https://www.proquest.com/docview/3227303455 |
| Volume | 35 |