A Survey on Text Classification Algorithms: From Text to Predictions

Bibliographic Details
Published in: Information (Basel), Vol. 13, no. 2, p. 83
Main Authors: Gasparetto, Andrea; Marcuzzo, Matteo; Zangari, Alessandro; Albarelli, Andrea
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.02.2022
ISSN: 2078-2489
Abstract: In recent years, the exponential growth of digital documents has been met by rapid progress in text classification techniques. Newly proposed machine learning algorithms leverage the latest advancements in deep learning methods, allowing for the automatic extraction of expressive features. The swift development of these methods has led to a plethora of strategies to encode natural language into machine-interpretable data. The latest language modelling algorithms are used in conjunction with ad hoc preprocessing procedures, whose description is often omitted in favour of a more detailed explanation of the classification step. This paper offers a concise review of recent text classification models, with emphasis on the flow of data from raw text to output labels. We highlight the differences between earlier methods and more recent deep learning-based methods, both in how they function and in how they transform input data. To give a better perspective on the text classification landscape, we provide an overview of datasets for the English language and supply instructions for the synthesis of two new multilabel datasets, a type of dataset we found to be particularly scarce in this setting. Finally, we provide an outline of new experimental results and discuss the open research challenges posed by deep learning-based language models.
Author ORCID iDs:
  Gasparetto, Andrea: 0000-0003-4986-0442
  Marcuzzo, Matteo: 0000-0002-0451-4899
  Zangari, Alessandro: 0000-0002-3634-6607
  Albarelli, Andrea: 0000-0002-3659-5099
Copyright 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.3390/info13020083
Discipline: Engineering
EISSN: 2078-2489
Open Access: Yes
Peer Reviewed: Yes
Scholarly: Yes
Open Access Link: https://doaj.org/article/649eb6b7224d4dd7ad8718607cf363fe
Subjects: Algorithms; Artificial intelligence; Classification; Datasets; Deep learning; Electronic documents; English language; Feature extraction; Labeling; Machine learning; Natural language; Neural networks; news classification; Sentiment analysis; shallow learning; Text categorization; text classification; tokenisation; topic labelling; transformer
URI: https://www.proquest.com/docview/2632777204
https://doaj.org/article/649eb6b7224d4dd7ad8718607cf363fe
Volume 13
WOSCitedRecordID wos000763620800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint