Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features

Bibliographic Details
Published in:	Information processing & management Vol. 60; no. 4; p. 103356
Main Authors:	Zhang, Zhihao, Liang, Xinnian, Zuo, Yuan, Lin, Chenghua
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd 01.07.2023
Subjects:	Graph-based ranking algorithm Hierarchical Multi-granularity features Unsupervised keyphrase extraction
ISSN:	0306-4573

Abstract	Existing unsupervised keyphrase extraction methods typically emphasize the importance of the candidate keyphrase itself, ignoring other important factors such as the influence of uninformative sentences. We hypothesize that the salient sentences of a document are particularly important as they are most likely to contain keyphrases, especially for long documents. To our knowledge, our work is the first attempt to exploit sentence salience for unsupervised keyphrase extraction by modeling hierarchical multi-granularity features. Specifically, we propose a novel position-aware graph-based unsupervised keyphrase extraction model, which includes two model variants. The pipeline model first extracts salient sentences from the document and then extracts keyphrases from those salient sentences. In contrast to the pipeline model, which models multi-granularity features in a two-stage paradigm, the joint model accounts for both sentence and phrase representations of the source document simultaneously via hierarchical graphs. Concretely, the sentence nodes are introduced as an inductive bias, injecting sentence-level information for determining the importance of candidate keyphrases. We compare our model against strong baselines on three benchmark datasets: Inspec, DUC 2001, and SemEval 2010. Experimental results show that the simple pipeline-based approach achieves promising results, indicating that the keyphrase extraction task benefits from the salient sentence extraction task. The joint model, which mitigates the potential accumulated error of the pipeline model, gives the best performance and achieves new state-of-the-art results while generalizing better on data from different domains and with different lengths. In particular, for the SemEval 2010 dataset consisting of long documents, our joint model outperforms the strongest baseline UKERank by 3.48%, 3.69% and 4.84% in terms of F1@5, F1@10 and F1@15, respectively. We also conduct qualitative experiments to validate the effectiveness of our model components.
• To our knowledge, our work is the first attempt to exploit the salience of sentences for unsupervised keyphrase extraction by modeling hierarchical multi-granularity features.
• Our empirical study shows positive synergy between the salient sentence extraction task and the keyphrase extraction task, suggesting that better integration of these two tasks is a promising research direction.
• Our method consistently outperforms all existing competitors across the three datasets, each with a different document length, covering two different domains.
ArticleNumber	103356
Author	Lin, Chenghua Liang, Xinnian Zhang, Zhihao Zuo, Yuan
Author_xml	– sequence: 1 givenname: Zhihao orcidid: 0000-0002-8860-0881 surname: Zhang fullname: Zhang, Zhihao organization: School of Economics and Management, Beihang University, Beijing, China – sequence: 2 givenname: Xinnian surname: Liang fullname: Liang, Xinnian organization: State Key Lab of Software Development Environment, Beihang University, Beijing, China – sequence: 3 givenname: Yuan surname: Zuo fullname: Zuo, Yuan organization: School of Economics and Management, Beihang University, Beijing, China – sequence: 4 givenname: Chenghua orcidid: 0000-0003-3454-2468 surname: Lin fullname: Lin, Chenghua email: c.lin@sheffield.ac.uk organization: Department of Computer Science, The University of Sheffield, Sheffield, UK
CitedBy_id	crossref_primary_10_1177_01655515241282003 crossref_primary_10_1016_j_eswa_2025_126748 crossref_primary_10_1186_s40537_023_00833_1 crossref_primary_10_1016_j_knosys_2024_112511 crossref_primary_10_1016_j_inffus_2025_103088
Cites_doi	10.1016/j.ipm.2018.06.004 10.1109/TCBB.2021.3079339 10.1109/ACCESS.2020.2965087 10.1007/s11042-018-5749-3 10.1016/j.ipm.2019.102063
ContentType	Journal Article
Copyright	2023
Copyright_xml	– notice: 2023
DBID	6I. AAFTH AAYXX CITATION
DOI	10.1016/j.ipm.2023.103356
DatabaseName	ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Library & Information Science
ExternalDocumentID	10_1016_j_ipm_2023_103356 S0306457323000936
ID	FETCH-LOGICAL-c340t-1f21dbff7fe469a5ca592e3c16ee765607be45a002da4c4061e5ace36a21a16c3
ISICitedReferencesCount	6
ISICitedReferencesURI	http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000970563800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN	0306-4573
IngestDate	Sat Nov 29 07:21:35 EST 2025 Tue Nov 18 22:25:53 EST 2025 Tue Dec 03 03:45:01 EST 2024
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	4
Keywords	Unsupervised keyphrase extraction Hierarchical Multi-granularity features Graph-based ranking algorithm
Language	English
License	This is an open access article under the CC BY license.
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-c340t-1f21dbff7fe469a5ca592e3c16ee765607be45a002da4c4061e5ace36a21a16c3
ORCID	0000-0003-3454-2468 0000-0002-8860-0881
OpenAccessLink	https://dx.doi.org/10.1016/j.ipm.2023.103356
ParticipantIDs	crossref_primary_10_1016_j_ipm_2023_103356 crossref_citationtrail_10_1016_j_ipm_2023_103356 elsevier_sciencedirect_doi_10_1016_j_ipm_2023_103356
PublicationCentury	2000
PublicationDate	July 2023 2023-07-00
PublicationDateYYYYMMDD	2023-07-01
PublicationDate_xml	– month: 07 year: 2023 text: July 2023
PublicationDecade	2020
PublicationTitle	Information processing & management
PublicationYear	2023
Publisher	Elsevier Ltd
Publisher_xml	– name: Elsevier Ltd
article-title: SemEval-2010 task 5 : Automatic keyphrase extraction from scientific articles – year: 2004 ident: 10.1016/j.ipm.2023.103356_b27 article-title: Graph-based ranking algorithms for sentence extraction, applied to text summarization – year: 2003 ident: 10.1016/j.ipm.2023.103356_b14 article-title: Improved automatic keyword extraction given more linguistic knowledge
StartPage	103356
SubjectTerms	Graph-based ranking algorithm Hierarchical Multi-granularity features Unsupervised keyphrase extraction
Title	Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features
URI	https://dx.doi.org/10.1016/j.ipm.2023.103356
Volume	60