BOOMER — An algorithm for learning gradient boosted multi-label classification rules
Multi-label classification is concerned with the assignment of sets of labels to individual data points. Due to its diverse real-world applications, e.g., the annotation of text documents with topics, it has become a well-established field of machine learning research. Compared to traditional classi...
| Published in: | Software Impacts, Volume 10, p. 100137 |
|---|---|
| Main author: | Rapp, Michael |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 1 November 2021 |
| Subjects: | Gradient boosting; Machine learning; Multi-label classification; Rule learning |
| ISSN: | 2665-9638 |
| Online access: | Get full text |
| Abstract | Multi-label classification is concerned with the assignment of sets of labels to individual data points. Due to its diverse real-world applications, e.g., the annotation of text documents with topics, it has become a well-established field of machine learning research. Compared to traditional classification, where classes are mutually exclusive, multi-label classification comes with interesting challenges, most prominently the requirement to take dependencies between labels into account. In this work, we present a modular and customizable implementation of BOOMER – an algorithm for learning gradient boosted multi-label classification rules – that can flexibly be adjusted to different use cases and requirements.
Highlights: • BOOMER is an algorithm for learning gradient boosted multi-label classification rules. • The goal of multi-label classification is the automatic assignment of sets of labels to individual data points. • BOOMER can optimize both decomposable and non-decomposable loss functions. • The implementation incorporates several optimizations and approximation techniques so that it can handle large datasets. • Gradient-based label binning can be used to form groups of similar labels. |
|---|---|
| ArticleNumber | 100137 |
| Author | Rapp, Michael |
| Author_xml | – sequence: 1 givenname: Michael orcidid: 0000-0001-8570-8240 surname: Rapp fullname: Rapp, Michael email: mrapp@ke.tu-darmstadt.de organization: Knowledge Engineering Group, TU Darmstadt, Hochschulstraße 10, 64289 Darmstadt, Germany |
| CitedBy_id | crossref_primary_10_3390_jimaging9020033 crossref_primary_10_1007_s10489_022_04370_x crossref_primary_10_1016_j_asoc_2025_112740 |
| Cites_doi | 10.1109/MCSE.2010.118 10.1007/978-3-030-57977-7_1 10.1145/2939672.2939785 10.1007/978-3-030-67664-3_8 10.1007/s10994-012-5285-8 10.1007/978-3-030-86523-8_28 10.1002/widm.1139 10.1145/567806.567807 10.1007/978-3-642-23808-6_10 |
| ContentType | Journal Article |
| Copyright | 2021 The Author(s) |
| Copyright_xml | – notice: 2021 The Author(s) |
| DOI | 10.1016/j.simpa.2021.100137 |
| DatabaseName | ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef |
| DatabaseTitle | CrossRef |
| EISSN | 2665-9638 |
| ExternalDocumentID | 10_1016_j_simpa_2021_100137 S2665963821000567 |
| ISICitedReferencesCount | 7 |
| ISSN | 2665-9638 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Rule learning Multi-label classification Machine learning Gradient boosting |
| Language | English |
| License | This is an open access article under the CC BY license. |
| ORCID | 0000-0001-8570-8240 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.simpa.2021.100137 |
| PublicationCentury | 2000 |
| PublicationDate | November 2021 2021-11-00 |
| PublicationDateYYYYMMDD | 2021-11-01 |
| PublicationDate_xml | – month: 11 year: 2021 text: November 2021 |
| PublicationDecade | 2020 |
| PublicationTitle | Software impacts |
| PublicationYear | 2021 |
| Publisher | Elsevier B.V |
| Publisher_xml | – name: Elsevier B.V |
| SourceID | crossref elsevier |
| SourceType | Enrichment Source Index Database Publisher |
| StartPage | 100137 |
| SubjectTerms | Gradient boosting Machine learning Multi-label classification Rule learning |
| Title | BOOMER — An algorithm for learning gradient boosted multi-label classification rules |
| URI | https://dx.doi.org/10.1016/j.simpa.2021.100137 |
| Volume | 10 |
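
The abstract above states that BOOMER can optimize both decomposable and non-decomposable loss functions. The record itself does not name specific losses, so the following is only a minimal sketch of the distinction, using two measures that are commonly cited as examples: Hamming loss (decomposable, because it can be computed label by label) and subset 0/1 loss (non-decomposable, because it depends on whether the whole label set is correct). The function names and the toy data are hypothetical illustrations and are not part of the BOOMER implementation or its API.

```python
# Illustrative only: pure-Python loss functions on 0/1 label matrices.
# Each row is one example; each column is one label.

def hamming_loss(truth, pred):
    """Fraction of individual label decisions that are wrong."""
    total = sum(len(row) for row in truth)
    wrong = sum(
        t != p
        for t_row, p_row in zip(truth, pred)
        for t, p in zip(t_row, p_row)
    )
    return wrong / total


def subset_zero_one_loss(truth, pred):
    """Fraction of examples whose predicted label set is not exactly correct."""
    wrong = sum(t_row != p_row for t_row, p_row in zip(truth, pred))
    return wrong / len(truth)


def per_label_error(truth, pred):
    """Error rate of each label, computed independently of the other labels."""
    n_labels = len(truth[0])
    return [
        sum(t[k] != p[k] for t, p in zip(truth, pred)) / len(truth)
        for k in range(n_labels)
    ]


truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
# Three mistakes spread over three different examples.
pred_spread = [[0, 0, 1], [0, 0, 0], [1, 1, 1], [0, 0, 1]]
# Three mistakes concentrated in the first example.
pred_packed = [[0, 1, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

for name, pred in [("spread", pred_spread), ("packed", pred_packed)]:
    print(name, per_label_error(truth, pred),
          hamming_loss(truth, pred), subset_zero_one_loss(truth, pred))
```

Both prediction matrices have identical per-label error rates and therefore identical Hamming loss, yet their subset 0/1 losses differ. This is why a non-decomposable loss cannot be minimized by treating each label in isolation, and why dependencies between labels matter.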