A Survey on Text Classification Algorithms: From Text to Predictions
| Published in: | Information (Basel) Vol. 13; no. 2; p. 83 |
|---|---|
| Main Authors: | Andrea Gasparetto, Matteo Marcuzzo, Alessandro Zangari, Andrea Albarelli |
| Format: | Journal Article |
| Language: | English |
| Published: | Basel: MDPI AG, 01.02.2022 |
| ISSN: | 2078-2489 |
| Abstract | In recent years, the exponential growth of digital documents has been met by rapid progress in text classification techniques. Newly proposed machine learning algorithms leverage the latest advancements in deep learning methods, allowing for the automatic extraction of expressive features. The swift development of these methods has led to a plethora of strategies to encode natural language into machine-interpretable data. The latest language modelling algorithms are used in conjunction with ad hoc preprocessing procedures, of which the description is often omitted in favour of a more detailed explanation of the classification step. This paper offers a concise review of recent text classification models, with emphasis on the flow of data, from raw text to output labels. We highlight the differences between earlier methods and more recent, deep learning-based methods in both their functioning and in how they transform input data. To give a better perspective on the text classification landscape, we provide an overview of datasets for the English language, as well as supplying instructions for the synthesis of two new multilabel datasets, which we found to be particularly scarce in this setting. Finally, we provide an outline of new experimental results and discuss the open research challenges posed by deep learning-based language models. |
|---|---|
| Author | Marcuzzo, Matteo; Albarelli, Andrea; Zangari, Alessandro; Gasparetto, Andrea |
| Author_xml | 1. Gasparetto, Andrea (ORCID 0000-0003-4986-0442); 2. Marcuzzo, Matteo (ORCID 0000-0002-0451-4899); 3. Zangari, Alessandro (ORCID 0000-0002-3634-6607); 4. Albarelli, Andrea (ORCID 0000-0002-3659-5099) |
| ContentType | Journal Article |
| Copyright | 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.3390/info13020083 |
| Discipline | Engineering |
| EISSN | 2078-2489 |
| ISICitedReferencesCount | 117 |
| ISSN | 2078-2489 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| ORCID | 0000-0002-3659-5099 0000-0002-3634-6607 0000-0003-4986-0442 0000-0002-0451-4899 |
| OpenAccessLink | https://doaj.org/article/649eb6b7224d4dd7ad8718607cf363fe |
| PublicationCentury | 2000 |
| PublicationDate | 2022-02-01 |
| PublicationDateYYYYMMDD | 2022-02-01 |
| PublicationDecade | 2020 |
| PublicationPlace | Basel |
| PublicationTitle | Information (Basel) |
| PublicationYear | 2022 |
| Publisher | MDPI AG |
| StartPage | 83 |
| SubjectTerms | Algorithms; Artificial intelligence; Classification; Datasets; Deep learning; Electronic documents; English language; Feature extraction; Labeling; Machine learning; Natural language; Neural networks; news classification; Sentiment analysis; shallow learning; Text categorization; text classification; tokenisation; topic labelling; transformer |
| Title | A Survey on Text Classification Algorithms: From Text to Predictions |
| URI | https://www.proquest.com/docview/2632777204 https://doaj.org/article/649eb6b7224d4dd7ad8718607cf363fe |
| Volume | 13 |
| WOSCitedRecordID | wos000763620800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVAON databaseName: DOAJ - Open Access Journals customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: DOA dateStart: 20100101 isFulltext: true titleUrlDefault: https://www.doaj.org/ providerName: Directory of Open Access Journals – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: M~E dateStart: 20100101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre – providerCode: PRVPQU databaseName: Advanced Technologies & Aerospace Database customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: P5Z dateStart: 20100301 isFulltext: true titleUrlDefault: https://search.proquest.com/hightechjournals providerName: ProQuest – providerCode: PRVPQU databaseName: Computer Science Database customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: K7- dateStart: 20100301 isFulltext: true titleUrlDefault: http://search.proquest.com/compscijour providerName: ProQuest – providerCode: PRVPQU databaseName: ProQuest Central customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: BENPR dateStart: 20100301 isFulltext: true titleUrlDefault: https://www.proquest.com/central providerName: ProQuest – providerCode: PRVPQU databaseName: Publicly Available Content Database customDbUrl: eissn: 2078-2489 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000778481 issn: 2078-2489 databaseCode: PIMPY dateStart: 20100301 isFulltext: true titleUrlDefault: http://search.proquest.com/publiccontent providerName: ProQuest |
| linkProvider | Directory of Open Access Journals |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Survey+on+Text+Classification+Algorithms%3A+From+Text+to+Predictions&rft.jtitle=Information+%28Basel%29&rft.au=Andrea+Gasparetto&rft.au=Matteo+Marcuzzo&rft.au=Alessandro+Zangari&rft.au=Andrea+Albarelli&rft.date=2022-02-01&rft.pub=MDPI+AG&rft.eissn=2078-2489&rft.volume=13&rft.issue=2&rft.spage=83&rft_id=info:doi/10.3390%2Finfo13020083&rft.externalDBID=DOA&rft.externalDocID=oai_doaj_org_article_649eb6b7224d4dd7ad8718607cf363fe |