HMATC: Hierarchical multi-label Arabic text classification model using machine learning
Multi-label classification assigns multiple labels to each document concurrently. Many real-world classification problems tend to employ high-dimensional label spaces, which can be naturally structured in a hierarchy. In this type of problem, each instance may belong to multiple labels and labels ar...
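The setting described above, and the flat evaluation measures named in this record's abstract (Hamming loss, subset accuracy, micro-averaged precision, recall and F-measure), can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy label hierarchy, the label names, the two example documents and all function names are hypothetical and are not taken from the paper or its dataset. The measures follow their standard textbook definitions, and hierarchical loss is not covered.

```python
# Toy label hierarchy (hypothetical, not the paper's dataset): child -> parent.
PARENT = {"prayer": "worship", "fasting": "worship", "trade": "transactions"}
LABELS = sorted(set(PARENT) | set(PARENT.values()))


def with_ancestors(labels):
    """Close a label set under the hierarchy: a child label implies its ancestors."""
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = PARENT.get(label)
    return closed


def hamming_loss(true_sets, pred_sets):
    """Fraction of (document, label) pairs on which truth and prediction disagree."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(LABELS))


def subset_accuracy(true_sets, pred_sets):
    """Fraction of documents whose predicted label set matches the truth exactly."""
    return sum(t == p for t, p in zip(true_sets, pred_sets)) / len(true_sets)


def micro_prf(true_sets, pred_sets):
    """Micro-averaged precision, recall and F-measure, pooled over all labels."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Two hypothetical documents: the second prediction misses the "trade" branch.
y_true = [with_ancestors({"prayer"}), with_ancestors({"fasting", "trade"})]
y_pred = [with_ancestors({"prayer"}), with_ancestors({"fasting"})]

print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Subset accuracy:", subset_accuracy(y_true, y_pred))
print("Micro P/R/F:", micro_prf(y_true, y_pred))
```

Closing each label set under the hierarchy before scoring is one simple way to respect parent-child relationships; it is a design choice of this sketch, not a description of the HMATC model.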
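| Published in: | Egyptian informatics journal, Volume 22, Issue 3, pp. 225-237 |
|---|---|
| Main authors: | Aljedani, Nawal; Alotaibi, Reem; Taileb, Mounira |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V, 1 September 2021 |
| Subjects: | Arabic natural language processing; Multi-label classification; Hierarchical classification; Machine learning; Text classification |
| ISSN: | 1110-8665, 2090-4754 |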
| Abstract | Multi-label classification assigns multiple labels to each document concurrently. Many real-world classification problems involve high-dimensional label spaces that can be naturally structured as a hierarchy. In this type of problem, each instance may belong to multiple labels, and the labels are organized in a hierarchical structure. This is a more complex problem than flat classification, because the classification algorithm has to take hierarchical relationships between labels into account and be able to predict multiple labels for the same instance. Few studies have investigated multi-label text classification for the Arabic language, and most of them have focused on flat classification and neglected the hierarchical structure. This paper therefore explores hierarchical multi-label classification in the context of the Arabic language. It proposes a hierarchical multi-label Arabic text classification (HMATC) model based on a machine learning approach. The impact of feature selection methods and feature set dimensions on classification performance is also investigated. In addition, the Hierarchy Of Multilabel ClassifiER (HOMER) algorithm is optimized by examining different sets of multi-label classifiers, clustering algorithms and numbers of clusters to improve the hierarchical classification. Moreover, this study contributes to existing research by introducing a hierarchical multi-label Arabic dataset in a format appropriate for hierarchical classification and making it publicly available. The results reveal that the proposed model outperforms all models considered in the experiments in terms of computational cost, requiring less time (2 h) than the other evaluated models. In addition, it shows a significant improvement over the state-of-the-art model (Fatwa model) in terms of Hamming loss (0.004), hierarchical loss (1.723), multi-label accuracy (0.758), subset accuracy (0.292), micro-averaged precision (0.879), micro-averaged recall (0.828), and micro-averaged F-measure (0.853). |
|---|---|
| Author | Aljedani, Nawal; Alotaibi, Reem; Taileb, Mounira |
| Author_xml | 1. Aljedani, Nawal (naljedani0026@stu.kau.edu.sa); 2. Alotaibi, Reem (ralotibi@kau.edu.sa); 3. Taileb, Mounira (mtaileb@kau.edu.sa) |
| ContentType | Journal Article |
| Copyright | 2020 |
| DOI | 10.1016/j.eij.2020.08.004 |
| Discipline | Engineering |
| EISSN | 2090-4754 |
| EndPage | 237 |
| ISICitedReferencesCount | 24 |
| ISSN | 1110-8665 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Arabic natural language processing; Multi-label classification; Hierarchical classification; Machine learning; Text classification |
| Language | English |
| License | This is an open access article under the CC BY-NC-ND license. |
| OpenAccessLink | https://doaj.org/article/22581a0b41b04e6fac8b10cc401b6901 |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | September 2021 2021-09-00 2021-09-01 |
| PublicationDateYYYYMMDD | 2021-09-01 |
| PublicationDecade | 2020 |
| PublicationTitle | Egyptian informatics journal |
| PublicationYear | 2021 |
| Publisher | Elsevier B.V Elsevier |
article-title: Word stemming for arabic information retrieval: the case for simple light stemming publication-title: Abhath Al-Yarmouk Sci Eng Ser – volume: 172 start-page: 1897 issue: 16–17 year: 2008 ident: 10.1016/j.eij.2020.08.004_b0080 article-title: Label ranking by learning pairwise preferences publication-title: Artif Intell doi: 10.1016/j.artint.2008.08.002 – ident: 10.1016/j.eij.2020.08.004_b0160 – ident: 10.1016/j.eij.2020.08.004_b0150 doi: 10.1023/A:1007614523901 – volume: 53 start-page: 547 issue: 2 year: 2017 ident: 10.1016/j.eij.2020.08.004_b0205 article-title: Balancing between over-weighting and under-weighting in supervised term weighting publication-title: Inf Process Manage doi: 10.1016/j.ipm.2016.10.003 – ident: 10.1016/j.eij.2020.08.004_b0260 doi: 10.1007/3-540-45164-1_8 – volume: 42 start-page: 155 issue: 1 year: 2006 ident: 10.1016/j.eij.2020.08.004_b0235 article-title: Information gain and divergence-based feature selection for machine learning-based text categorization publication-title: Inf Process Manage doi: 10.1016/j.ipm.2004.08.006 – volume: 56 start-page: 212 issue: 1 year: 2019 ident: 10.1016/j.eij.2020.08.004_b0005 article-title: Multi-label arabic text categorization: a benchmark and baseline comparison of multi-label learning algorithms publication-title: Inf Process Manage doi: 10.1016/j.ipm.2018.09.008 – volume: 103 start-page: 104 year: 2016 ident: 10.1016/j.eij.2020.08.004_b0010 article-title: Rfboost: an improved multi-label boosting algorithm and its application to text categorisation publication-title: Knowl-Based Syst doi: 10.1016/j.knosys.2016.03.029 – volume: 8 start-page: 251 issue: 3 year: 2011 ident: 10.1016/j.eij.2020.08.004_b0030 article-title: A hierarchical k-NN classifier for textual data publication-title: Int Arab J Inf Technol – volume: 50 start-page: 104 issue: 1 year: 2014 ident: 10.1016/j.eij.2020.08.004_b0210 article-title: The impact of preprocessing on text classification publication-title: Inf Process 
Manage doi: 10.1016/j.ipm.2013.08.006 – ident: 10.1016/j.eij.2020.08.004_b0250 doi: 10.1145/1076034.1076082 – volume: 292 start-page: 135 year: 2013 ident: 10.1016/j.eij.2020.08.004_b0240 article-title: A comparison of multi-label feature selection methods using the problem transformation approach publication-title: Electron Notes Theor Comput Sci doi: 10.1016/j.entcs.2013.02.010 – ident: 10.1016/j.eij.2020.08.004_b0065 doi: 10.1007/3-540-44794-6_4 – volume: 70 start-page: 89 year: 2017 ident: 10.1016/j.eij.2020.08.004_b0115 article-title: Hierarchical multi-label classification using fully associative ensemble learning publication-title: Pattern Recogn doi: 10.1016/j.patcog.2017.05.007 – volume: 3 start-page: 1 issue: 3 year: 2007 ident: 10.1016/j.eij.2020.08.004_b0020 article-title: Multi-label classification: an overview publication-title: Int J Data Warehousing Min (IJDWM) doi: 10.4018/jdwm.2007070101 – ident: 10.1016/j.eij.2020.08.004_b0130 doi: 10.1109/IACS.2015.7103229 – volume: 12 start-page: 504 issue: 4 year: 2016 ident: 10.1016/j.eij.2020.08.004_b0140 article-title: A lexicon based approach for classifying arabic multi-labeled text publication-title: Int J Web Inf Syst – start-page: 1 year: 2014 ident: 10.1016/j.eij.2020.08.004_b0045 article-title: Using twitter to collect a multi-dialectal corpus of arabic – volume: 40 start-page: 2038 issue: 7 year: 2007 ident: 10.1016/j.eij.2020.08.004_b0060 article-title: Ml-knn: a lazy learning approach to multi-label learning publication-title: Pattern Recogn doi: 10.1016/j.patcog.2006.12.019 – ident: 10.1016/j.eij.2020.08.004_b0110 doi: 10.1109/IGARSS.2004.1368565 – ident: 10.1016/j.eij.2020.08.004_b0105 – volume: 85 start-page: 333 issue: 3 year: 2011 ident: 10.1016/j.eij.2020.08.004_b0075 article-title: Classifier chains for multi-label classification publication-title: Mach Learn doi: 10.1007/s10994-011-5256-5 – volume: 22 start-page: 31 issue: 1–2 year: 2011 ident: 10.1016/j.eij.2020.08.004_b0040 
article-title: A survey of hierarchical classification across different application domains publication-title: Data Min Knowl Discovery doi: 10.1007/s10618-010-0175-9 – ident: 10.1016/j.eij.2020.08.004_b0155 doi: 10.1007/978-3-540-87881-0_40 – volume: 7 start-page: 219 issue: 4 year: 2014 ident: 10.1016/j.eij.2020.08.004_b0175 article-title: Vector space models to classify arabic text publication-title: Int J Comput Trends Technol (IJCTT) doi: 10.14445/22312803/IJCTT-V7P109 – ident: 10.1016/j.eij.2020.08.004_b0195 – ident: 10.1016/j.eij.2020.08.004_b0090 doi: 10.1109/ICDM.2008.74 |
| StartPage | 225 |
| SubjectTerms | Arabic natural language processing; Hierarchical classification; Machine learning; Multi-label classification; Text classification |
| Title | HMATC: Hierarchical multi-label Arabic text classification model using machine learning |
| URI | https://dx.doi.org/10.1016/j.eij.2020.08.004 https://doaj.org/article/22581a0b41b04e6fac8b10cc401b6901 |
| Volume | 22 |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=HMATC%3A+Hierarchical+multi-label+Arabic+text+classification+model+using+machine+learning&rft.jtitle=Egyptian+informatics+journal&rft.au=Aljedani%2C+Nawal&rft.au=Alotaibi%2C+Reem&rft.au=Taileb%2C+Mounira&rft.date=2021-09-01&rft.issn=1110-8665&rft.volume=22&rft.issue=3&rft.spage=225&rft.epage=237&rft_id=info:doi/10.1016%2Fj.eij.2020.08.004&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_eij_2020_08_004 |