Taxonomy grooming algorithm ‐ An autodidactic domain specific dimensionality reduction approach for fast clustering of social media text data

Bibliographic details
Published in: Concurrency and computation, Vol. 34, No. 11
Main authors: Renjith, Shini; Sreekumar, A.; Jathavedan, M.
Format: Journal Article
Language: English
Published: Hoboken, USA: John Wiley & Sons, Inc, 15 May 2022
Wiley Subscription Services, Inc
ISSN: 1532-0626, 1532-0634
Online access: Full text
Abstract: Social media is the most prominent contributor to the growth of big data, so it is important for information retrieval‐based applications to improve their efficiency in proportion to the volume of data they must handle. One way to achieve better performance is to upgrade processing capacity; the alternative is to improve the processing methodology, which can be achieved using smarter processing techniques and/or better algorithms. Reducing the volume of data to be processed is a good strategy, and it can be achieved by extracting only the relevant information via user segmentation with an appropriate clustering technique. However, clustering algorithms suffer when dealing with text content because of its very high dimensionality. Since domain‐specific aspects are lost when traditional dimensionality reduction approaches are applied, it is important to devise an alternative strategy. This work proposes a taxonomy grooming algorithm (TGA), an autodidactic domain‐specific dimensionality reduction approach, for fast clustering of social media text data. The experimental results are very promising: dimensionality reduction using TGA yielded better results than the traditional dimensionality reduction approaches.
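The general idea of domain-specific dimensionality reduction mentioned in the abstract can be illustrated with a minimal sketch. This is not the TGA described in the article, whose details are not given in this record. It only assumes that a domain taxonomy maps terms to broader concepts, so that a document can be represented by concept counts instead of term counts. The taxonomy, the sample posts, and the function names term_vector and concept_vector are all hypothetical.

```python
from collections import Counter

# Hypothetical toy taxonomy (term -> parent concept); not taken from the article.
TAXONOMY = {
    "beach": "nature", "mountain": "nature", "lake": "nature",
    "museum": "culture", "temple": "culture", "gallery": "culture",
    "pizza": "food", "sushi": "food", "curry": "food",
}


def term_vector(tokens, vocabulary):
    """Bag-of-words counts: one dimension per vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def concept_vector(tokens, concepts, taxonomy=TAXONOMY):
    """Counts rolled up to taxonomy concepts: one dimension per concept.

    Tokens that are missing from the taxonomy are ignored in this sketch.
    """
    counts = Counter(taxonomy[tok] for tok in tokens if tok in taxonomy)
    return [counts[concept] for concept in concepts]


# Hypothetical, already tokenized social media posts.
posts = [
    "beach lake mountain beach".split(),
    "museum gallery temple".split(),
    "sushi pizza curry sushi".split(),
]

vocabulary = sorted(TAXONOMY)                # 9 term dimensions
concepts = sorted(set(TAXONOMY.values()))    # 3 concept dimensions

term_matrix = [term_vector(p, vocabulary) for p in posts]
concept_matrix = [concept_vector(p, concepts) for p in posts]

print(len(term_matrix[0]), "->", len(concept_matrix[0]))
print(concept_matrix)
```

In this toy setting each post collapses from nine term dimensions to three concept dimensions while keeping the domain meaning of each dimension, which is the property that generic projection methods do not preserve. A clustering algorithm would then run on the smaller matrix.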
Author details:
1. Renjith, Shini (Mar Baselios College of Engineering and Technology; ORCID: 0000-0003-1088-0825; email: shinirenjith@gmail.com)
2. Sreekumar, A. (Cochin University of Science and Technology)
3. Jathavedan, M. (Cochin University of Science and Technology)
ContentType Journal Article
Copyright 2022 John Wiley & Sons Ltd.
2022 John Wiley & Sons, Ltd.
DOI 10.1002/cpe.6837
Discipline Computer Science
IsPeerReviewed true
IsScholarly true
PageCount 19
Subject terms: Algorithms; Big Data; Clustering; Digital media; dimensionality reduction; Information retrieval; Reduction; Segmentation; social media data; Social networks; Taxonomy; taxonomy grooming
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fcpe.6837
https://www.proquest.com/docview/2651502049