Taxonomy grooming algorithm ‐ An autodidactic domain specific dimensionality reduction approach for fast clustering of social media text data
| Published in: | Concurrency and computation Vol. 34; no. 11 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Hoboken, USA: John Wiley & Sons, Inc; Wiley Subscription Services, Inc; 15.05.2022 |
| ISSN: | 1532-0626, 1532-0634 |
| Summary: | Social media is the most prominent source of big data growth, so information retrieval‐based applications must improve their efficiency in proportion to the volume of data they handle. One way to achieve better performance is to upgrade the processing capacity; the alternative is to improve the processing methodology through smarter processing techniques and/or better algorithms. Reducing the volume of data to be processed is a good strategy, and it can be achieved by extracting only the relevant information via user segmentation with an appropriate clustering technique. With text content, however, clustering algorithms suffer from the very high dimensionality of the data. Because domain‐specific aspects are lost when traditional dimensionality reduction approaches are applied, an alternative strategy must be devised. This work proposes a taxonomy grooming algorithm (TGA), an autodidactic domain‐specific dimensionality reduction approach, for fast clustering of social media text data. The experimental results are promising: dimensionality reduction using TGA produced better results than the traditional dimensionality reduction approaches. |
|---|---|
| DOI: | 10.1002/cpe.6837 |