HMATC: Hierarchical multi-label Arabic text classification model using machine learning


Published in: Egyptian Informatics Journal, Vol. 22, No. 3, pp. 225-237
Main authors: Aljedani, Nawal; Alotaibi, Reem; Taileb, Mounira
Format: Journal Article
Language: English
Publication details: Elsevier B.V. (Elsevier), 1 September 2021
ISSN:1110-8665, 2090-4754
Abstract Multi-label classification assigns multiple labels to each document concurrently. Many real-world classification problems involve high-dimensional label spaces that can be naturally structured as a hierarchy. In this type of problem, each instance may belong to multiple labels, and the labels are organized in a hierarchical structure. This is a more complex problem than flat classification, because the classification algorithm has to take into account the hierarchical relationships between labels and be able to predict multiple labels for the same instance. Few studies have investigated multi-label text classification for the Arabic language, and most of them have focused on flat classification and neglected the hierarchical structure. This paper therefore explores hierarchical multi-label classification in the context of the Arabic language. It proposes a hierarchical multi-label Arabic text classification (HMATC) model based on a machine learning approach. The impact of feature selection methods and feature set dimensions on classification performance is also investigated. In addition, the Hierarchy Of Multilabel ClassifiER (HOMER) algorithm is optimized by examining different sets of multi-label classifiers, clustering algorithms, and numbers of clusters to improve the hierarchical classification. The study also contributes a hierarchical multi-label Arabic dataset in a format suitable for hierarchical classification and makes it publicly available. The results show that the proposed model has the lowest computational cost (2 h) of all models considered in the experiments. It also shows a significant improvement over the state-of-the-art model (the Fatwa model) in terms of Hamming loss (0.004), hierarchical loss (1.723), multi-label accuracy (0.758), subset accuracy (0.292), micro-averaged precision (0.879), micro-averaged recall (0.828), and micro-averaged F-measure (0.853).
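The flat evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the toy label sets and function names are hypothetical, and "multi-label accuracy" is assumed here to be the example-based Jaccard score, which the record itself does not confirm. Hierarchical loss is left out because it depends on the label hierarchy, which this record does not include. The last lines check that the reported micro-averaged F-measure is consistent with the reported precision and recall.

```python
from typing import List, Set, Tuple

LabelSets = List[Set[str]]


def hamming_loss(true: LabelSets, pred: LabelSets, n_labels: int) -> float:
    """Fraction of (document, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(true, pred))  # symmetric difference
    return wrong / (len(true) * n_labels)


def subset_accuracy(true: LabelSets, pred: LabelSets) -> float:
    """Fraction of documents whose predicted label set matches exactly."""
    return sum(1 for t, p in zip(true, pred) if t == p) / len(true)


def multilabel_accuracy(true: LabelSets, pred: LabelSets) -> float:
    """Mean Jaccard overlap per document (assumed definition, see lead-in)."""
    scores = [len(t & p) / len(t | p) if (t | p) else 1.0
              for t, p in zip(true, pred)]
    return sum(scores) / len(scores)


def micro_prf(true: LabelSets, pred: LabelSets) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F-measure over all labels."""
    tp = sum(len(t & p) for t, p in zip(true, pred))
    fp = sum(len(p - t) for t, p in zip(true, pred))
    fn = sum(len(t - p) for t, p in zip(true, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f


# Hypothetical toy data: 3 documents, 4 possible labels.
y_true = [{"worship", "prayer"}, {"trade"}, {"worship", "fasting"}]
y_pred = [{"worship", "prayer"}, {"trade", "worship"}, {"worship"}]

print(hamming_loss(y_true, y_pred, n_labels=4))
print(subset_accuracy(y_true, y_pred))
print(multilabel_accuracy(y_true, y_pred))
print(micro_prf(y_true, y_pred))

# Consistency check on the figures reported in the abstract:
# F = 2PR / (P + R) with P = 0.879 and R = 0.828.
f_from_abstract = 2 * 0.879 * 0.828 / (0.879 + 0.828)
print(round(f_from_abstract, 3))
```

Sets of label names are used instead of binary indicator matrices so that each measure reads as a direct set operation. A matrix-based version would give the same values.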
Author Aljedani, Nawal
Alotaibi, Reem
Taileb, Mounira
Author_xml – sequence: 1
  givenname: Nawal
  surname: Aljedani
  fullname: Aljedani, Nawal
  email: naljedani0026@stu.kau.edu.sa
– sequence: 2
  givenname: Reem
  surname: Alotaibi
  fullname: Alotaibi, Reem
  email: ralotibi@kau.edu.sa
– sequence: 3
  givenname: Mounira
  surname: Taileb
  fullname: Taileb, Mounira
  email: mtaileb@kau.edu.sa
ContentType Journal Article
Copyright 2020
Copyright_xml – notice: 2020
DOI 10.1016/j.eij.2020.08.004
Discipline Engineering
EISSN 2090-4754
EndPage 237
ISICitedReferencesCount 24
ISSN 1110-8665
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Arabic natural language processing
Multi-label classification
Hierarchical classification
Machine learning
Text classification
Language English
License This is an open access article under the CC BY-NC-ND license.
OpenAccessLink https://doaj.org/article/22581a0b41b04e6fac8b10cc401b6901
PageCount 13
PublicationCentury 2000
PublicationDate September 2021
2021-09-00
2021-09-01
PublicationDateYYYYMMDD 2021-09-01
PublicationDate_xml – month: 09
  year: 2021
  text: September 2021
PublicationDecade 2020
PublicationTitle Egyptian informatics journal
PublicationYear 2021
Publisher Elsevier B.V
Elsevier
Publisher_xml – name: Elsevier B.V
– name: Elsevier
References_xml – volume: 13
  start-page: 4
  year: 2014
  ident: b0125
  article-title: Arabic text categorization based on arabic wikipedia
  publication-title: ACM Trans Asian Lang Inf Process (TALIP)
– volume: 12
  start-page: 504
  year: 2016
  end-page: 532
  ident: b0140
  article-title: A lexicon based approach for classifying arabic multi-labeled text
  publication-title: Int J Web Inf Syst
– reference: Cheng, W, Hüllermeier, E. Combining instance-based learning and logistic regression for multilabel classification. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (Eds.), Machine learning and knowledge discovery in databases. Berlin, Heidelberg: Springer, Berlin Heidelberg; 2009. p. 6–6.
– reference: Brazdil PB, Soares C. A comparison of ranking methods for classification algorithm selection. In: European conference on machine learning. Springer; 2000. p. 63–75.
– reference: Chen Y, Crawford MM, Ghosh J. Integrating support vector machines in a hierarchical output space decomposition framework. In: Geoscience and remote sensing symposium, 2004. IGARSS’04. Proceedings. 2004 IEEE International, vol. 2. IEEE; 2004. p. 949–952.
– reference: Joachims T. A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. Carnegie-mellon univ pittsburgh pa dept of computer science, Tech. Rep.; 1996.
– reference: Tsoumakas G, Katakis I, Vlahavas I. Effective and efficient multilabel classification in domains with large number of labels. In: Proc. ECML/PKDD 2008 workshop on mining multidimensional data (MMD’08), vol. 21. sn, 2008. pp. 53–59.
– volume: 12
  start-page: 2411
  year: 2011
  end-page: 2414
  ident: b0255
  article-title: Mulan: a java library for multi-label learning
  publication-title: J Mach Learn Res
– volume: 85
  start-page: 333
  year: 2011
  ident: b0075
  article-title: Classifier chains for multi-label classification
  publication-title: Mach Learn
– start-page: 275
  year: 2002
  end-page: 282
  ident: b0225
  article-title: Improving stemming for arabic information retrieval: light stemming and co-occurrence analysis
  publication-title: Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval
– volume: 23
  start-page: 1079
  year: 2011
  end-page: 1089
  ident: b0095
  article-title: Random k-labelsets for multilabel classification
  publication-title: IEEE Trans Knowl Data Eng
– volume: 292
  start-page: 135
  year: 2013
  end-page: 151
  ident: b0240
  article-title: A comparison of multi-label feature selection methods using the problem transformation approach
  publication-title: Electron Notes Theor Comput Sci
– volume: 3
  start-page: 1
  year: 2007
  end-page: 13
  ident: b0020
  article-title: Multi-label classification: an overview
  publication-title: Int J Data Warehousing Min (IJDWM)
– volume: 7
  start-page: 219
  year: 2014
  end-page: 223
  ident: b0175
  article-title: Vector space models to classify arabic text
  publication-title: Int J Comput Trends Technol (IJCTT)
– volume: 6
  start-page: 1
  year: 2006
  end-page: 19
  ident: b0215
  article-title: An intelligent system for arabic text categorization
  publication-title: Int J Intell Comput Inf Sci
– volume: 172
  start-page: 1897
  year: 2008
  end-page: 1916
  ident: b0080
  article-title: Label ranking by learning pairwise preferences
  publication-title: Artif Intell
– reference: Gibaja E, Ventura S. A tutorial on multi tutorial on multilabel learningilabel learning, ACM Comput Surv 47(3):2015; 52:1–52:38. [Online]. Available: http://doi.acm.org/10.1145/2716262.
– volume: 42
  start-page: 155
  year: 2006
  end-page: 165
  ident: b0235
  article-title: Information gain and divergence-based feature selection for machine learning-based text categorization
  publication-title: Inf Process Manage
– volume: 44
  start-page: 724
  year: 2011
  end-page: 738
  ident: b0035
  article-title: Multi-label classification and extracting predicted class hierarchies
  publication-title: Pattern Recogn
– volume: 23
  start-page: 158
  year: 2003
  end-page: 166
  ident: b0050
  article-title: Arabic text data mining: a root-based hierarchical indexing model
  publication-title: Int J Model Simul
– reference: Read J, Pfahringer B, Holmes G. Multi-label classification using ensembles of pruned sets. In: Data mining, 2008. ICDM’08. Eighth IEEE international conference on. IEEE; 2008. p. 995–1000.
– reference: Ahmed NA, Shehab MA, Al-Ayyoub M, Hmeidi I. Scalable multi-label arabic text classification. In: Information and communication systems (ICICS), 2015 6th international conference on. IEEE; 2015. p. 212–217.
– reference: Tsoumakas G, Katakis I, Vlahavas I. Mining multi-label data. In: Data mining and knowledge discovery handbook. Springer; 2009. p. 667–685.
– volume: 37
  start-page: 1757
  year: 2004
  end-page: 1771
  ident: b0070
  article-title: Learning multi-label scene classification
  publication-title: Pattern Recogn
– reference: Shehab MA, Badarneh O, Al-Ayyoub M, Jararweh Y. A supervised approach for multi-label classification of arabic news articles. In: Computer science and information technology (CSIT), 2016 7th international conference on. IEEE; 2016. p. 1–6.
– volume: 38
  start-page: 3085
  year: 2011
  end-page: 3090
  ident: b0230
  article-title: Using chi-square statistics to measure similarities for text categorization
  publication-title: Expert Syst Appl
– start-page: 1
  year: 2014
  end-page: 7
  ident: b0045
  article-title: Using twitter to collect a multi-dialectal corpus of arabic
  publication-title: Proceedings of the EMNLP 2014 workshop on arabic natural language processing (ANLP)
– volume: 40
  start-page: 2038
  year: 2007
  end-page: 2048
  ident: b0060
  article-title: Ml-knn: a lazy learning approach to multi-label learning
  publication-title: Pattern Recogn
– volume: 8
  start-page: 251
  year: 2011
  end-page: 259
  ident: b0030
  article-title: A hierarchical k-NN classifier for textual data
  publication-title: Int Arab J Inf Technol
– reference: Clare A, King RD. Knowledge discovery in multi-label phenotype data. In: European conference on principles of data mining and knowledge discovery. Springer; 2001. p. 42–53.
– volume: 56
  start-page: 212
  year: 2019
  end-page: 227
  ident: b0005
  article-title: Multi-label arabic text categorization: a benchmark and baseline comparison of multi-label learning algorithms
  publication-title: Inf Process Manage
– reference: Schapire RE, Singer Y. Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3);1999:297–336. [Online]. Available: doi: 10.1023/A:1007614523901.
– volume: 53
  start-page: 547
  year: 2017
  end-page: 557
  ident: b0205
  article-title: Balancing between over-weighting and under-weighting in supervised term weighting
  publication-title: Inf Process Manage
– volume: 73
  start-page: 133
  year: 2008
  end-page: 153
  ident: b0145
  article-title: Multilabel classification via calibrated label ranking
  publication-title: Mach Learn
– reference: Taha AY, Tiun S. Binary relevance (br) method classifier of multi-label classification for arabic text. J Theor Appl Inf Technol 84(3):2016.
– reference: Demšar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(Jan);2006:1–30
– volume: 73
  start-page: 185
  year: 2008
  ident: b0100
  article-title: Decision trees for hierarchical multi-label classification
  publication-title: Mach Learn
– reference: Froud H, Lachkar A, Ouatik SA. A comparative study of root-based and stem-based approaches for measuring the similarity between arabic words for arabic text mining applications, arXiv preprint arXiv:1212.3634, 2012.
– volume: 22
  start-page: 31
  year: 2011
  end-page: 72
  ident: b0040
  article-title: A survey of hierarchical classification across different application domains
  publication-title: Data Min Knowl Discovery
– reference: Cesa-Bianchi N, Gentile C, Zaniboni L. Incremental algorithms for hierarchical classification. J Mach Learn Res 7(Jan);2006:31–54.
– volume: 70
  start-page: 89
  year: 2017
  end-page: 103
  ident: b0115
  article-title: Hierarchical multi-label classification using fully associative ensemble learning
  publication-title: Pattern Recogn
– volume: 13
  start-page: 1527
  year: 2016
  end-page: 1535
  ident: b0220
  article-title: Building and benchmarking novel arabic stemmer for document classification
  publication-title: J Comput Theor Nanosci
– volume: 57
  year: 2020
  ident: b0165
  article-title: Arabic text classification using deep learning models
  publication-title: Inf Process Manage
– volume: 21
  start-page: 2012
  year: 2012
  ident: b0180
  article-title: Word stemming for arabic information retrieval: the case for simple light stemming
  publication-title: Abhath Al-Yarmouk Sci Eng Ser
– volume: 103
  start-page: 104
  year: 2016
  end-page: 117
  ident: b0010
  article-title: Rfboost: an improved multi-label boosting algorithm and its application to text categorisation
  publication-title: Knowl-Based Syst
– reference: Spyromitros E, Tsoumakas G, Vlahavas I. An empirical study of lazy multilabel classification algorithms. In: Darzentas, J., Vouros, G.A., Vosinakis, S., Arnellos, A. (Eds.), Artificial intelligence: theories, models and applications, Berlin, Heidelberg: Springer, Berlin Heidelberg; 2008. p. 401–406.
– reference: Ahmed Y, Xiang J, Zhao D, Al-qaness MAA, Elsayed abd el aziz M, Abdelghani D. A study of the effects of stemming strategies on arabic document classification. IEEE Access PP:2019;1–1.
– volume: 50
  start-page: 104
  year: 2014
  end-page: 112
  ident: b0210
  article-title: The impact of preprocessing on text classification
  publication-title: Inf Process Manage
– reference: Zayed, RA, Hady MFA, Hefny H. Islamic fatwa request routing via hierarchical multi-label arabic text categorization. In: Arabic computational linguistics (ACLing), 2015 first international conference on. IEEE; 2015. p. 145–151.
– reference: Zhu S, Ji X, Xu W, Gong Y. Multi-labelled classification using maximum entropy method. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. ACM; 2005. p. 274–281
– reference: Tsoumakas G, Vlahavas I. Random k-labelsets: an ensemble method for multilabel classification. In: European conference on machine learning. Springer; 2007. p. 406–417.
– reference: Habib MB. An intelligent system for automated arabic text categorization, Master’s thesis, University of Twente; 2008.
– volume: 52
  start-page: 478
  year: 2016
  end-page: 489
  ident: b0200
  article-title: A query term re-weighting approach using document similarity
  publication-title: Inf Process Manage
– start-page: 275
  year: 2002
  ident: 10.1016/j.eij.2020.08.004_b0225
  article-title: Improving stemming for arabic information retrieval: light stemming and co-occurrence analysis
– volume: 23
  start-page: 1079
  issue: 7
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0095
  article-title: Random k-labelsets for multilabel classification
  publication-title: IEEE Trans Knowl Data Eng
  doi: 10.1109/TKDE.2010.164
– ident: 10.1016/j.eij.2020.08.004_b0265
– volume: 44
  start-page: 724
  issue: 3
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0035
  article-title: Multi-label classification and extracting predicted class hierarchies
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2010.09.010
– volume: 73
  start-page: 133
  issue: 2
  year: 2008
  ident: 10.1016/j.eij.2020.08.004_b0145
  article-title: Multilabel classification via calibrated label ranking
  publication-title: Mach Learn
  doi: 10.1007/s10994-008-5064-8
– volume: 57
  issue: 1
  year: 2020
  ident: 10.1016/j.eij.2020.08.004_b0165
  article-title: Arabic text classification using deep learning models
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2019.102121
– ident: 10.1016/j.eij.2020.08.004_b0190
– volume: 6
  start-page: 1
  issue: 1
  year: 2006
  ident: 10.1016/j.eij.2020.08.004_b0215
  article-title: An intelligent system for arabic text categorization
  publication-title: Int J Intell Comput Inf Sci
– ident: 10.1016/j.eij.2020.08.004_b0135
  doi: 10.1109/CSIT.2016.7549465
– volume: 23
  start-page: 158
  issue: 3
  year: 2003
  ident: 10.1016/j.eij.2020.08.004_b0050
  article-title: Arabic text data mining: a root-based hierarchical indexing model
  publication-title: Int J Model Simul
  doi: 10.1080/02286203.2003.11442267
– ident: 10.1016/j.eij.2020.08.004_b0185
  doi: 10.5121/acij.2012.3607
– volume: 52
  start-page: 478
  issue: 3
  year: 2016
  ident: 10.1016/j.eij.2020.08.004_b0200
  article-title: A query term re-weighting approach using document similarity
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2015.09.002
– ident: 10.1016/j.eij.2020.08.004_b0085
  doi: 10.1007/978-3-540-74958-5_38
– volume: 12
  start-page: 2411
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0255
  article-title: Mulan: a java library for multi-label learning
  publication-title: J Mach Learn Res
– volume: 73
  start-page: 185
  issue: 2
  year: 2008
  ident: 10.1016/j.eij.2020.08.004_b0100
  article-title: Decision trees for hierarchical multi-label classification
  publication-title: Mach Learn
  doi: 10.1007/s10994-008-5077-3
– ident: 10.1016/j.eij.2020.08.004_b0245
  doi: 10.1007/978-0-387-09823-4_34
– ident: 10.1016/j.eij.2020.08.004_b0170
  doi: 10.1109/ACLing.2015.28
– volume: 38
  start-page: 3085
  issue: 4
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0230
  article-title: Using chi-square statistics to measure similarities for text categorization
  publication-title: Expert Syst Appl
  doi: 10.1016/j.eswa.2010.08.100
– volume: 37
  start-page: 1757
  issue: 9
  year: 2004
  ident: 10.1016/j.eij.2020.08.004_b0070
  article-title: Learning multi-label scene classification
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2004.03.009
– ident: 10.1016/j.eij.2020.08.004_b0025
– ident: 10.1016/j.eij.2020.08.004_b0120
– volume: 13
  start-page: 1527
  issue: 3
  year: 2016
  ident: 10.1016/j.eij.2020.08.004_b0220
  article-title: Building and benchmarking novel Arabic stemmer for document classification
  publication-title: J Comput Theor Nanosci
  doi: 10.1166/jctn.2016.5077
– volume: 13
  start-page: 4
  issue: 1
  year: 2014
  ident: 10.1016/j.eij.2020.08.004_b0125
  article-title: Arabic text categorization based on Arabic Wikipedia
  publication-title: ACM Trans Asian Lang Inf Process (TALIP)
– ident: 10.1016/j.eij.2020.08.004_b0015
  doi: 10.1145/2716262
– ident: 10.1016/j.eij.2020.08.004_b0055
– volume: 21
  start-page: 2012
  issue: 1
  year: 2012
  ident: 10.1016/j.eij.2020.08.004_b0180
  article-title: Word stemming for Arabic information retrieval: the case for simple light stemming
  publication-title: Abhath Al-Yarmouk Sci Eng Ser
– volume: 172
  start-page: 1897
  issue: 16–17
  year: 2008
  ident: 10.1016/j.eij.2020.08.004_b0080
  article-title: Label ranking by learning pairwise preferences
  publication-title: Artif Intell
  doi: 10.1016/j.artint.2008.08.002
– ident: 10.1016/j.eij.2020.08.004_b0160
– ident: 10.1016/j.eij.2020.08.004_b0150
  doi: 10.1023/A:1007614523901
– volume: 53
  start-page: 547
  issue: 2
  year: 2017
  ident: 10.1016/j.eij.2020.08.004_b0205
  article-title: Balancing between over-weighting and under-weighting in supervised term weighting
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2016.10.003
– ident: 10.1016/j.eij.2020.08.004_b0260
  doi: 10.1007/3-540-45164-1_8
– volume: 42
  start-page: 155
  issue: 1
  year: 2006
  ident: 10.1016/j.eij.2020.08.004_b0235
  article-title: Information gain and divergence-based feature selection for machine learning-based text categorization
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2004.08.006
– volume: 56
  start-page: 212
  issue: 1
  year: 2019
  ident: 10.1016/j.eij.2020.08.004_b0005
  article-title: Multi-label Arabic text categorization: a benchmark and baseline comparison of multi-label learning algorithms
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2018.09.008
– volume: 103
  start-page: 104
  year: 2016
  ident: 10.1016/j.eij.2020.08.004_b0010
  article-title: RFBoost: an improved multi-label boosting algorithm and its application to text categorisation
  publication-title: Knowl-Based Syst
  doi: 10.1016/j.knosys.2016.03.029
– volume: 8
  start-page: 251
  issue: 3
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0030
  article-title: A hierarchical k-NN classifier for textual data
  publication-title: Int Arab J Inf Technol
– volume: 50
  start-page: 104
  issue: 1
  year: 2014
  ident: 10.1016/j.eij.2020.08.004_b0210
  article-title: The impact of preprocessing on text classification
  publication-title: Inf Process Manage
  doi: 10.1016/j.ipm.2013.08.006
– ident: 10.1016/j.eij.2020.08.004_b0250
  doi: 10.1145/1076034.1076082
– volume: 292
  start-page: 135
  year: 2013
  ident: 10.1016/j.eij.2020.08.004_b0240
  article-title: A comparison of multi-label feature selection methods using the problem transformation approach
  publication-title: Electron Notes Theor Comput Sci
  doi: 10.1016/j.entcs.2013.02.010
– ident: 10.1016/j.eij.2020.08.004_b0065
  doi: 10.1007/3-540-44794-6_4
– volume: 70
  start-page: 89
  year: 2017
  ident: 10.1016/j.eij.2020.08.004_b0115
  article-title: Hierarchical multi-label classification using fully associative ensemble learning
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2017.05.007
– volume: 3
  start-page: 1
  issue: 3
  year: 2007
  ident: 10.1016/j.eij.2020.08.004_b0020
  article-title: Multi-label classification: an overview
  publication-title: Int J Data Warehousing Min (IJDWM)
  doi: 10.4018/jdwm.2007070101
– ident: 10.1016/j.eij.2020.08.004_b0130
  doi: 10.1109/IACS.2015.7103229
– volume: 12
  start-page: 504
  issue: 4
  year: 2016
  ident: 10.1016/j.eij.2020.08.004_b0140
  article-title: A lexicon based approach for classifying Arabic multi-labeled text
  publication-title: Int J Web Inf Syst
– start-page: 1
  year: 2014
  ident: 10.1016/j.eij.2020.08.004_b0045
  article-title: Using Twitter to collect a multi-dialectal corpus of Arabic
– volume: 40
  start-page: 2038
  issue: 7
  year: 2007
  ident: 10.1016/j.eij.2020.08.004_b0060
  article-title: ML-KNN: a lazy learning approach to multi-label learning
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2006.12.019
– ident: 10.1016/j.eij.2020.08.004_b0110
  doi: 10.1109/IGARSS.2004.1368565
– ident: 10.1016/j.eij.2020.08.004_b0105
– volume: 85
  start-page: 333
  issue: 3
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0075
  article-title: Classifier chains for multi-label classification
  publication-title: Mach Learn
  doi: 10.1007/s10994-011-5256-5
– volume: 22
  start-page: 31
  issue: 1–2
  year: 2011
  ident: 10.1016/j.eij.2020.08.004_b0040
  article-title: A survey of hierarchical classification across different application domains
  publication-title: Data Min Knowl Discovery
  doi: 10.1007/s10618-010-0175-9
– ident: 10.1016/j.eij.2020.08.004_b0155
  doi: 10.1007/978-3-540-87881-0_40
– volume: 7
  start-page: 219
  issue: 4
  year: 2014
  ident: 10.1016/j.eij.2020.08.004_b0175
  article-title: Vector space models to classify Arabic text
  publication-title: Int J Comput Trends Technol (IJCTT)
  doi: 10.14445/22312803/IJCTT-V7P109
– ident: 10.1016/j.eij.2020.08.004_b0195
– ident: 10.1016/j.eij.2020.08.004_b0090
  doi: 10.1109/ICDM.2008.74
StartPage 225
SubjectTerms Arabic natural language processing
Hierarchical classification
Machine learning
Multi-label classification
Text classification
Title HMATC: Hierarchical multi-label Arabic text classification model using machine learning
URI https://dx.doi.org/10.1016/j.eij.2020.08.004
https://doaj.org/article/22581a0b41b04e6fac8b10cc401b6901
Volume 22