On the classification of text documents taking into account their structural features
| Published in: | Journal of computer & systems sciences international Vol. 55; no. 3; pp. 394 - 403 |
|---|---|
| Main Authors: | Gulin, V. V., Frolov, A. B. |
| Format: | Journal Article |
| Language: | English |
| Published: | Moscow: Pleiades Publishing; Springer Nature B.V, 01.05.2016 |
| Subjects: | |
| ISSN: | 1064-2307, 1555-6530 |
| Abstract | A modification of the conventional bag-of-words model is studied that takes into account the structural features of text documents when they are classified (categorized) using machine learning techniques. It is proposed to describe these features by relations on the set of certain lexemes and to use the relation names, along with the lexeme names, as features. This distinguishes the approach from the conventional model, in which only unary relations are used. The effectiveness of the proposed machine learning techniques is analyzed in computer experiments on the class of the Reuters-21578 collection with eight known classifiers. It is shown that it is reasonable to apply the proposed models when classifying documents with simple classifiers. |
|---|---|
| Author | Frolov, A. B. Gulin, V. V. |
| Author_xml | 1. Gulin, V. V. (gulin.vladimir@gmail.com), Moscow Power Engineering Institute (National Research University); 2. Frolov, A. B., Moscow Power Engineering Institute (National Research University) |
| CitedBy_id | crossref_primary_10_1007_s00500_020_05209_8 |
| Cites_doi | 10.1145/361219.361220 10.1023/A:1015142527070 10.1134/S1064230710010089 10.1017/CBO9780511809071 10.1006/jcss.1997.1504 10.1023/A:1015190410232 10.1145/381854.381890 10.1007/978-1-4757-2440-0 10.1080/00437956.1954.11659520 10.1023/A:1010933404324 10.1145/505282.505283 |
| ContentType | Journal Article |
| Copyright | Pleiades Publishing, Ltd. 2016 |
| DOI | 10.1134/S1064230716030102 |
| Discipline | Engineering Sciences (General) Computer Science |
| EISSN | 1555-6530 |
| EndPage | 403 |
| ExternalDocumentID | 4103064091 10_1134_S1064230716030102 |
| Genre | Feature |
| ISICitedReferencesCount | 2 |
| ISSN | 1064-2307 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-05-01 |
| PublicationDecade | 2010 |
| PublicationPlace | Moscow |
| PublicationTitle | Journal of computer & systems sciences international |
| PublicationTitleAbbrev | J. Comput. Syst. Sci. Int |
| PublicationYear | 2016 |
| Publisher | Pleiades Publishing Springer Nature B.V |
| StartPage | 394 |
| SubjectTerms | Artificial intelligence Classification Classifiers Collection Computer science Computer simulation Control Dictionaries Documents Engineering Machine learning Mathematical analysis Mathematical models Mechatronics Names Pattern Recognition and Image Processing Random variables Robotics Studies Text categorization Texts |
| Title | On the classification of text documents taking into account their structural features |
| URI | https://link.springer.com/article/10.1134/S1064230716030102 https://www.proquest.com/docview/1800074408 https://www.proquest.com/docview/1825550413 |
| Volume | 55 |
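
The abstract's central idea, using relation names alongside lexeme names as features, can be illustrated with a minimal sketch. The record does not say which relations the authors use, so the binary relation `adjacent` below (two lexemes occurring consecutively within one sentence) is purely an assumption chosen for illustration, as are the function names `tokenize` and `extract_features`. Plain lexeme counts play the role of the unary relations of the conventional bag-of-words model.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric lexemes."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extract_features(text):
    """Count lexeme features plus hypothetical binary-relation features.

    Unary features are the lexemes themselves (conventional bag of words).
    Binary features are named after an assumed relation, 'adjacent',
    which holds for two lexemes that follow one another in a sentence.
    """
    features = Counter()
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        # Unary relations: one feature per lexeme occurrence.
        for token in tokens:
            features[token] += 1
        # Binary relation (assumption): consecutive lexemes in a sentence.
        for left, right in zip(tokens, tokens[1:]):
            features[f"adjacent({left},{right})"] += 1
    return features


doc = "Oil prices rose. Crude oil prices fell."
feats = extract_features(doc)
```

The resulting counter can be fed to any classifier that accepts sparse count features. Because relations are computed per sentence, no `adjacent` feature spans a sentence boundary.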