The effect of clustering algorithms on question answering
| Published in: | Expert Systems with Applications, Vol. 243, p. 122959 |
|---|---|
| Main authors: | AlMahmoud, Rana Husni; Alian, Marwah |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier Ltd, 1 June 2024 |
| ISSN: | 0957-4174, 1873-6793 |
| Abstract | Question answering (QA) is one of the essential fields in information retrieval, where specific answers are provided instead of large documents. The relations among questions and answers are determined using natural language processing techniques, while clustering algorithms can improve the effectiveness of result retrieval by reducing the number of comparisons required for a specific question or answer. In this work, we introduce a clustering-based approach for a QA system. This approach groups related questions into clusters using different clustering algorithms, identifies the appropriate answer using similarity measures between the answers and the generated clusters, and then assigns answers to their most related questions. Different clustering algorithms are tested: k-means, spherical k-means, single-linkage hierarchical clustering (SLHA), unweighted pair group method with arithmetic mean (UPGMA), expectation–maximization (EM), and clustering Arabic documents based on bond energy (CADBE). The effectiveness of a clustering algorithm is investigated with respect to several factors: the number of clusters, the text representation, the similarity measure between answers and clusters, and the similarity measure between answers and the questions in a selected cluster. In addition, a comprehensive ranking system is introduced to evaluate the performance of the clustering algorithms. Evaluation is performed using the Dataset of Arabic Why Question Answering System (DAWQAS) and the Multilingual Question Answering (MLQA) dataset. Results show that CADBE achieves the highest accuracy and the first rank, followed by SLHA and UPGMA, while spherical k-means has the lowest rank. The performance of clustering algorithms on the MLQA dataset is affected by its characteristics, such as short questions, long and varied answers, and diverse subject domains. Unigram and bigram intersection measures perform well in most cases. The term frequency–inverse document frequency (TF-IDF) representation outperforms word embeddings on DAWQAS. Overall, the experiments provide insights into the performance of clustering algorithms in QA systems. |
|---|---|
| Highlights | • A clustering-based QA system groups related questions and selects answers via similarity. • Answers are assigned to related questions using various similarity methods. • Several factors are explored to investigate the effectiveness of a clustering algorithm. • A comprehensive ranking system evaluates the performance of clustering algorithms. • CADBE achieves the highest accuracy, followed by SLHA and UPGMA; spherical k-means ranks lowest. |
| ArticleNumber | 122959 |
| Author | AlMahmoud, Rana Husni; Alian, Marwah |
| Author_xml | 1. AlMahmoud, Rana Husni (ORCID: 0000-0003-4240-9392; email: Rana.Almahmoud@gju.edu.jo), School of Electrical Engineering and Information Technology, German Jordanian University, Amman, Jordan. 2. Alian, Marwah (ORCID: 0000-0001-6358-059X; email: marwah2001@yahoo.com), Basic Sciences Department, Faculty of Science, The Hashemite University, Zarqa, Jordan. |
| CitedBy_id | crossref_primary_10_1016_j_engappai_2024_109042 crossref_primary_10_1051_bioconf_202414601041 crossref_primary_10_1016_j_conengprac_2024_106129 crossref_primary_10_3389_fpubh_2025_1597381 crossref_primary_10_3390_computers13120327 crossref_primary_10_1038_s41598_025_96696_y crossref_primary_10_1016_j_cie_2025_110886 |
| Cites_doi | 10.1111/j.2517-6161.1977.tb01600.x 10.1109/ACCESS.2019.2918675 10.1109/2.781637 10.1093/comjnl/20.2.141 10.1016/j.patcog.2012.04.031 10.18637/jss.v050.i10 10.12733/jics20105420 10.1109/ACCESS.2021.3074950 10.1007/s41870-022-01012-w 10.1007/s00500-021-05754-w 10.1016/j.csl.2019.101023 10.1108/IDD-06-2018-0022 10.1016/j.eswa.2020.113598 10.1016/j.procs.2018.10.467 10.1016/B978-0-12-387730-7.00018-8 10.1016/j.procs.2017.10.108 10.1016/j.procs.2019.09.203 10.1007/s12046-018-1022-8 10.1007/s10772-020-09753-4 10.1145/584792.584890 10.18653/v1/2020.ecnlp-1.11 |
| ContentType | Journal Article |
| Copyright | 2023 Elsevier Ltd |
| DOI | 10.1016/j.eswa.2023.122959 |
| DatabaseName | CrossRef |
| DatabaseTitle | CrossRef |
| Discipline | Computer Science |
| EISSN | 1873-6793 |
| ExternalDocumentID | 10_1016_j_eswa_2023_122959 S0957417423034619 |
| ISICitedReferencesCount | 8 |
| ISSN | 0957-4174 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Arabic language; Question answering; Clustering; Similarity measures |
| Language | English |
| LinkModel | OpenURL |
| ORCID | 0000-0003-4240-9392 0000-0001-6358-059X |
| PublicationCentury | 2000 |
| PublicationDate | 2024-06-01 |
| PublicationDateYYYYMMDD | 2024-06-01 |
| PublicationDecade | 2020 |
| PublicationTitle | Expert Systems with Applications |
| PublicationYear | 2024 |
| Publisher | Elsevier Ltd |
| SSID | ssj0017007 |
| SourceID | crossref; elsevier |
| SourceType | Enrichment Source; Index Database; Publisher |
| StartPage | 122959 |
| SubjectTerms | Arabic language; Clustering; Question answering; Similarity measures |
| Title | The effect of clustering algorithms on question answering |
| URI | https://dx.doi.org/10.1016/j.eswa.2023.122959 |
| Volume | 243 |
| WOSCitedRecordID | wos001138974300001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| journalDatabaseRights | – providerCode: PRVESC databaseName: ScienceDirect database customDbUrl: eissn: 1873-6793 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0017007 issn: 0957-4174 databaseCode: AIEXJ dateStart: 19950101 isFulltext: true titleUrlDefault: https://www.sciencedirect.com providerName: Elsevier |
| linkProvider | Elsevier |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+effect+of+clustering+algorithms+on+question+answering&rft.jtitle=Expert+systems+with+applications&rft.au=AlMahmoud%2C+Rana+Husni&rft.au=Alian%2C+Marwah&rft.date=2024-06-01&rft.issn=0957-4174&rft.volume=243&rft.spage=122959&rft_id=info:doi/10.1016%2Fj.eswa.2023.122959&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_eswa_2023_122959 |