Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features

Bibliographic Details
Published in: Information processing & management Vol. 60; no. 4; p. 103356
Main Authors: Zhang, Zhihao; Liang, Xinnian; Zuo, Yuan; Lin, Chenghua
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.07.2023
Subjects:
ISSN: 0306-4573
Abstract: Existing unsupervised keyphrase extraction methods typically emphasize the importance of the candidate keyphrase itself, ignoring other important factors such as the influence of uninformative sentences. We hypothesize that the salient sentences of a document are particularly important as they are most likely to contain keyphrases, especially for long documents. To our knowledge, our work is the first attempt to exploit sentence salience for unsupervised keyphrase extraction by modeling hierarchical multi-granularity features. Specifically, we propose a novel position-aware graph-based unsupervised keyphrase extraction model, which includes two model variants. The pipeline model first extracts salient sentences from the document and then extracts keyphrases from those salient sentences. In contrast to the pipeline model, which models multi-granularity features in a two-stage paradigm, the joint model accounts for both sentence and phrase representations of the source document simultaneously via hierarchical graphs. Concretely, the sentence nodes are introduced as an inductive bias, injecting sentence-level information for determining the importance of candidate keyphrases. We compare our model against strong baselines on three benchmark datasets: Inspec, DUC 2001, and SemEval 2010. Experimental results show that the simple pipeline-based approach achieves promising results, indicating that the keyphrase extraction task benefits from the salient sentence extraction task. The joint model, which mitigates the potential accumulated error of the pipeline model, gives the best performance and achieves new state-of-the-art results while generalizing better on data from different domains and with different lengths. In particular, for the SemEval 2010 dataset consisting of long documents, our joint model outperforms the strongest baseline UKERank by 3.48%, 3.69% and 4.84% in terms of F1@5, F1@10 and F1@15, respectively. We also conduct qualitative experiments to validate the effectiveness of our model components.

• To our knowledge, our work is the first attempt to exploit the salience of sentences for unsupervised keyphrase extraction by modeling hierarchical multi-granularity features.
• Our empirical study shows positive synergy between the salient sentence extraction task and the keyphrase extraction task, suggesting that better integration of these two tasks is a promising research direction.
• Our method consistently outperforms all existing competitors across the three datasets, each with a different document length, covering two different domains.
ArticleNumber 103356
Author Lin, Chenghua
Liang, Xinnian
Zhang, Zhihao
Zuo, Yuan
Author_xml – sequence: 1
  givenname: Zhihao
  orcidid: 0000-0002-8860-0881
  surname: Zhang
  fullname: Zhang, Zhihao
  organization: School of Economics and Management, Beihang University, Beijing, China
– sequence: 2
  givenname: Xinnian
  surname: Liang
  fullname: Liang, Xinnian
  organization: State Key Lab of Software Development Environment, Beihang University, Beijing, China
– sequence: 3
  givenname: Yuan
  surname: Zuo
  fullname: Zuo, Yuan
  organization: School of Economics and Management, Beihang University, Beijing, China
– sequence: 4
  givenname: Chenghua
  orcidid: 0000-0003-3454-2468
  surname: Lin
  fullname: Lin, Chenghua
  email: c.lin@sheffield.ac.uk
  organization: Department of Computer Science, The University of Sheffield, Sheffield, UK
CitedBy_id crossref_primary_10_1177_01655515241282003
crossref_primary_10_1016_j_eswa_2025_126748
crossref_primary_10_1186_s40537_023_00833_1
crossref_primary_10_1016_j_knosys_2024_112511
crossref_primary_10_1016_j_inffus_2025_103088
Cites_doi 10.1016/j.ipm.2018.06.004
10.1109/TCBB.2021.3079339
10.1109/ACCESS.2020.2965087
10.1007/s11042-018-5749-3
10.1016/j.ipm.2019.102063
ContentType Journal Article
Copyright 2023
Copyright_xml – notice: 2023
DOI 10.1016/j.ipm.2023.103356
DatabaseName ScienceDirect Open Access Titles
Elsevier:ScienceDirect:Open Access
CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Library & Information Science
ExternalDocumentID 10_1016_j_ipm_2023_103356
S0306457323000936
ISICitedReferencesCount 6
ISSN 0306-4573
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Keywords Unsupervised keyphrase extraction
Hierarchical Multi-granularity features
Graph-based ranking algorithm
Language English
License This is an open access article under the CC BY license.
ORCID 0000-0003-3454-2468
0000-0002-8860-0881
OpenAccessLink https://dx.doi.org/10.1016/j.ipm.2023.103356
PublicationCentury 2000
PublicationDate July 2023
2023-07-00
PublicationDateYYYYMMDD 2023-07-01
PublicationDate_xml – month: 07
  year: 2023
  text: July 2023
PublicationDecade 2020
PublicationTitle Information processing & management
PublicationYear 2023
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
References Kim, Medelyan, Kan, Baldwin (b17) 2010
Gu, Wang, Bi, Meng, Liu, Han (b12) 2021
Liang, Wu, Li, Li (b23) 2021
Papagiannopoulou, Tsoumakas (b32) 2018; 54
Hasan, Ng (b13) 2014
Chowdhury, Rossiello, Glass, Mihindukulasooriya, Gliozzo (b6) 2022
Mihalcea (b27) 2004
Devlin, Chang, Lee, Toutanova (b7) 2019
Meng, Yuan, Wang, Zhao, Trischler, He (b26) 2021
Pagliardini, Gupta, Jaggi (b31) 2018
Liu, Ott, Goyal, Du, Joshi, Chen (b25) 2019
Joulin, Grave, Bojanowski, Mikolov (b16) 2017
Bennani-Smires, Musat, Hossmann, Baeriswyl, Jaggi (b1) 2018
Zhao, Yan, Cao, Li (b51) 2021
Jiang, Hu, Li (b15) 2009
Lewis, Liu, Goyal, Ghazvininejad, Mohamed, Levy (b19) 2020
Peng, Yin, Rong, Lin, Zhou, Xiong (b33) 2021; 19
Peters, Neumann, Iyyer, Gardner, Clark, Lee (b35) 2018
Sun, Qiu, Zheng, Wang, Zhang (b39) 2020; 8
Florescu, Caragea (b10) 2017
Liang, Li, Wu, Li, Li (b20) 2021
Mihalcea, Tarau (b28) 2004
Wang, Jin, Zhu, Goutte (b46) 2016
Liang, Wu, Li, Li (b21) 2021; vol. ACL/IJCNLP 2021
Zheng, Lapata (b52) 2019
Florescu, Caragea (b11) 2017
Dong, Romascanu, Cheung (b9) 2021
Xiong, Hu, Xiong, Campos, Overwijk (b47) 2019
Cheng, Li, Liu, Zhao, Li, Lin (b5) 2021
Ushio, Liberatore, Camacho-Collados (b42) 2021
Page, Brin, Motwani, Winograd (b30) 1999
Sun, Wang, Li, Feng, Tian, Wu (b40) 2020
Ye, Cai, Gui, Zhang (b49) 2021
Bougouin, Boudin, Daille (b3) 2013
Le, Mikolov (b18) 2014; vol. 32
Song, Huang, Ruan (b37) 2019; 78
Sun, Xiong, Liu, Liu, Bao (b41) 2020
Vega-Oliveros, Gomes, Milios, Berton (b43) 2019; 56
Campos, Mangaravite, Pasquali, Jorge, Nunes, Jatowt (b4) 2018; vol. 10772
Wang, Fan, Rosé (b45) 2020
Mikolov, Chen, Corrado, Dean (b29) 2013
Zhang, Chen, Wang, Deng, Zhang, Li (b50) 2022
Liang, Wu, Li, Li (b22) 2021
Yang, Dai, Yang, Carbonell, Salakhutdinov, Le (b48) 2019
Liu, Li, Zheng, Sun (b24) 2009
Wan, Xiao (b44) 2008
Song, Jing, Xiao (b38) 2021
Boudin (b2) 2018
Ding, Luo (b8) 2021
Hulth (b14) 2003
Saxena, Mangal, Jain (b36) 2020
Pennington, Socher, Manning (b34) 2014
Jiang (10.1016/j.ipm.2023.103356_b15) 2009
Bougouin (10.1016/j.ipm.2023.103356_b3) 2013
Saxena (10.1016/j.ipm.2023.103356_b36) 2020
Gu (10.1016/j.ipm.2023.103356_b12) 2021
Wan (10.1016/j.ipm.2023.103356_b44) 2008
Campos (10.1016/j.ipm.2023.103356_b4) 2018; vol. 10772
Lewis (10.1016/j.ipm.2023.103356_b19) 2020
Mikolov (10.1016/j.ipm.2023.103356_b29) 2013
Xiong (10.1016/j.ipm.2023.103356_b47) 2019
Yang (10.1016/j.ipm.2023.103356_b48) 2019
Liang (10.1016/j.ipm.2023.103356_b22) 2021
Joulin (10.1016/j.ipm.2023.103356_b16) 2017
Mihalcea (10.1016/j.ipm.2023.103356_b28) 2004
Sun (10.1016/j.ipm.2023.103356_b39) 2020; 8
Florescu (10.1016/j.ipm.2023.103356_b11) 2017
Chowdhury (10.1016/j.ipm.2023.103356_b6) 2022
Vega-Oliveros (10.1016/j.ipm.2023.103356_b43) 2019; 56
Wang (10.1016/j.ipm.2023.103356_b46) 2016
Liu (10.1016/j.ipm.2023.103356_b24) 2009
Pagliardini (10.1016/j.ipm.2023.103356_b31) 2018
Song (10.1016/j.ipm.2023.103356_b38) 2021
Boudin (10.1016/j.ipm.2023.103356_b2) 2018
Wang (10.1016/j.ipm.2023.103356_b45) 2020
Mihalcea (10.1016/j.ipm.2023.103356_b27) 2004
Sun (10.1016/j.ipm.2023.103356_b40) 2020
Le (10.1016/j.ipm.2023.103356_b18) 2014; vol. 32
Liu (10.1016/j.ipm.2023.103356_b25) 2019
Ding (10.1016/j.ipm.2023.103356_b8) 2021
Devlin (10.1016/j.ipm.2023.103356_b7) 2019
Zheng (10.1016/j.ipm.2023.103356_b52) 2019
Meng (10.1016/j.ipm.2023.103356_b26) 2021
Cheng (10.1016/j.ipm.2023.103356_b5) 2021
Papagiannopoulou (10.1016/j.ipm.2023.103356_b32) 2018; 54
Hasan (10.1016/j.ipm.2023.103356_b13) 2014
Page (10.1016/j.ipm.2023.103356_b30) 1999
Song (10.1016/j.ipm.2023.103356_b37) 2019; 78
Zhao (10.1016/j.ipm.2023.103356_b51) 2021
Peters (10.1016/j.ipm.2023.103356_b35) 2018
Sun (10.1016/j.ipm.2023.103356_b41) 2020
Dong (10.1016/j.ipm.2023.103356_b9) 2021
Liang (10.1016/j.ipm.2023.103356_b20) 2021
Zhang (10.1016/j.ipm.2023.103356_b50) 2022
Bennani-Smires (10.1016/j.ipm.2023.103356_b1) 2018
Pennington (10.1016/j.ipm.2023.103356_b34) 2014
Peng (10.1016/j.ipm.2023.103356_b33) 2021; 19
Florescu (10.1016/j.ipm.2023.103356_b10) 2017
Ye (10.1016/j.ipm.2023.103356_b49) 2021
Liang (10.1016/j.ipm.2023.103356_b21) 2021; vol. ACL/IJCNLP 2021
Liang (10.1016/j.ipm.2023.103356_b23) 2021
Ushio (10.1016/j.ipm.2023.103356_b42) 2021
Hulth (10.1016/j.ipm.2023.103356_b14) 2003
Kim (10.1016/j.ipm.2023.103356_b17) 2010
References_xml – start-page: 257
  year: 2009
  end-page: 266
  ident: b24
  article-title: Clustering to find exemplar terms for keyphrase extraction
  publication-title: Proceedings of the 2009 conference on empirical methods in natural language processing, EMNLP 2009, 6–7 August 2009, Singapore, a meeting of SIGDAT, a special interest group of the ACL
– start-page: 221
  year: 2018
  end-page: 229
  ident: b1
  article-title: Simple unsupervised keyphrase extraction using sentence embeddings
  publication-title: Proceedings of the 22nd conference on computational natural language learning, CoNLL 2018, Brussels, Belgium, October 31–November 1, 2018
– volume: 56
  year: 2019
  ident: b43
  article-title: A multi-centrality index for graph-based keyword extraction
  publication-title: Information Processing and Management
– start-page: 667
  year: 2018
  end-page: 672
  ident: b2
  article-title: Unsupervised keyphrase extraction with multipartite graphs
  publication-title: Proceedings of the 2018 conference of the north american chapter of the association for computational linguistics: human language technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1–6, 2018, vol. 2
– volume: vol. 10772
  start-page: 806
  year: 2018
  end-page: 810
  ident: b4
  article-title: YAKE! collection-independent automatic keyword extractor
  publication-title: Advances in information retrieval - 40th European conference on IR research, ECIR 2018, Grenoble, France, March 26–29, 2018, proceedings
– start-page: 427
  year: 2017
  end-page: 431
  ident: b16
  article-title: Bag of tricks for efficient text classification
  publication-title: Proceedings of the 15th conference of the european chapter of the association for computational linguistics, EACL 2017, Valencia, Spain, April 3–7, 2017, vol. 2
– start-page: 855
  year: 2008
  end-page: 860
  ident: b44
  article-title: Single document keyphrase extraction using neighborhood knowledge
  publication-title: Proceedings of the twenty-third AAAI conference on artificial intelligence, AAAI 2008, Chicago, Illinois, USA, July 13–17, 2008
– start-page: 5174
  year: 2019
  end-page: 5183
  ident: b47
  article-title: Open domain web keyphrase extraction beyond language modeling
  publication-title: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019
– year: 2022
  ident: b6
  article-title: Applying a generic sequence-to-sequence model for simple and effective keyphrase generation
– year: 2003
  ident: b14
  article-title: Improved automatic keyword extraction given more linguistic knowledge
  publication-title: Proceedings of the conference on empirical methods in natural language processing, EMNLP 2003, Sapporo, Japan, July 11–12, 2003
– start-page: 1089
  year: 2021
  end-page: 1102
  ident: b9
  article-title: Discourse-aware unsupervised summarization for long scientific documents
  publication-title: Proceedings of the 16th conference of the european chapter of the association for computational linguistics: main volume, EACL 2021, Online, April 19–23, 2021
– start-page: 8968
  year: 2020
  end-page: 8975
  ident: b40
  article-title: ERNIE 2.0: A continual pre-training framework for language understanding
  publication-title: The thirty-fourth AAAI conference on artificial intelligence, AAAI 2020, the thirty-second innovative applications of artificial intelligence conference, IAAI 2020, the tenth AAAI symposium on educational advances in artificial intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020
– year: 2021
  ident: b20
  article-title: Improving unsupervised extractive summarization by jointly modeling facet and redundancy
  publication-title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
– start-page: 1105
  year: 2017
  end-page: 1115
  ident: b11
  article-title: PositionRank: An unsupervised approach to keyphrase extraction from scholarly documents
  publication-title: Proceedings of the 55th annual meeting of the association for computational linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, vol. 1
– start-page: 1262
  year: 2014
  end-page: 1273
  ident: b13
  article-title: Automatic keyphrase extraction: A survey of the state of the art
  publication-title: Proceedings of the 52nd annual meeting of the association for computational linguistics, ACL 2014, June 22–27, 2014, Baltimore, MD, USA, vol. 1
– start-page: 932
  year: 2016
  end-page: 942
  ident: b46
  article-title: Extracting discriminative keyphrases with learned semantic hierarchies
  publication-title: COLING 2016, 26th international conference on computational linguistics, proceedings of the conference: technical papers, December 11–16, 2016, Osaka, Japan
– start-page: 2705
  year: 2021
  end-page: 2715
  ident: b49
  article-title: Heterogeneous graph neural networks for keyphrase generation
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/punta cana, Dominican Republic, 7–11 November, 2021
– year: 2020
  ident: b41
  article-title: Joint keyphrase chunking and salience ranking with BERT
– start-page: 396
  year: 2022
  end-page: 409
  ident: b50
  article-title: MDERank: A masked document embedding rank approach for unsupervised keyphrase extraction
  publication-title: Findings of the association for computational linguistics: ACL 2022, Dublin, Ireland, May 22–27, 2022
– start-page: 1790
  year: 2020
  end-page: 1800
  ident: b45
  article-title: Incorporating multimodal information in open-domain web keyphrase extraction
  publication-title: Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16–20, 2020
– start-page: 478
  year: 2021
  end-page: 486
  ident: b12
  article-title: UCPhrase: Unsupervised context-aware quality phrase tagging
  publication-title: KDD ’21: The 27th ACM SIGKDD conference on knowledge discovery and data mining, virtual event, Singapore, August 14–18, 2021
– year: 2013
  ident: b29
  article-title: Efficient estimation of word representations in vector space
  publication-title: 1st international conference on learning representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, workshop track proceedings
– volume: 54
  start-page: 888
  year: 2018
  end-page: 902
  ident: b32
  article-title: Local word vectors guiding keyphrase extraction
  publication-title: Information Processing and Management
– start-page: 6236
  year: 2019
  end-page: 6247
  ident: b52
  article-title: Sentence centrality revisited for unsupervised summarization
  publication-title: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
– start-page: 4985
  year: 2021
  end-page: 5007
  ident: b26
  article-title: An empirical study on neural keyphrase generation
  publication-title: Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2021, Online, June 6–11, 2021
– volume: vol. 32
  start-page: 1188
  year: 2014
  end-page: 1196
  ident: b18
  article-title: Distributed representations of sentences and documents
  publication-title: Proceedings of the 31th international conference on machine learning, ICML 2014, Beijing, China, 21–26 June 2014
– start-page: 4923
  year: 2017
  end-page: 4924
  ident: b10
  article-title: A position-biased PageRank algorithm for keyphrase extraction
  publication-title: Proceedings of the thirty-first AAAI conference on artificial intelligence, February 4–9, 2017, San Francisco, California, USA
– volume: vol. ACL/IJCNLP 2021
  start-page: 1685
  year: 2021
  end-page: 1697
  ident: b21
  article-title: Improving unsupervised extractive summarization with facet-aware modeling
  publication-title: Findings of the association for computational linguistics: ACL/IJCNLP 2021, online event, August 1–6, 2021
– start-page: 1532
  year: 2014
  end-page: 1543
  ident: b34
  article-title: Glove: Global vectors for word representation
  publication-title: Proceedings of the 2014 conference on empirical methods in natural language processing, EMNLP 2014, October 25–29, 2014, Doha, Qatar, a meeting of SIGDAT, a special interest group of the ACL
– start-page: 2726
  year: 2021
  end-page: 2736
  ident: b38
  article-title: Importance estimation from multiple perspectives for keyphrase extraction
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/punta cana, Dominican Republic, 7–11 November, 2021
– volume: 19
  start-page: 2365
  year: 2021
  end-page: 2376
  ident: b33
  article-title: Named entity aware transfer learning for biomedical factoid question answering
  publication-title: IEEE/ACM Transactions on Computational Biology and Bioinformatics
– start-page: 404
  year: 2004
  end-page: 411
  ident: b28
  article-title: TextRank: Bringing order into text
  publication-title: Proceedings of the 2004 conference on empirical methods in natural language processing , EMNLP 2004, a meeting of SIGDAT, a special interest group of the ACL, held in conjunction with ACL 2004, 25–26 July 2004, Barcelona, Spain
– start-page: 8089
  year: 2021
  end-page: 8103
  ident: b42
  article-title: Back to the basics: A quantitative analysis of statistical and graph-based term weighting schemes for keyword extraction
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/punta cana, Dominican Republic, 7–11 November, 2021
– year: 2021
  ident: b5
  article-title: Guiding the growth: Difficulty-controllable question generation through step-by-step rewriting
  publication-title: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, vol. 1
– start-page: 155
  year: 2021
  end-page: 164
  ident: b22
  article-title: Unsupervised keyphrase extraction by jointly modeling local and global context
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/punta cana, Dominican Republic, 7–11 November, 2021
– start-page: 155
  year: 2021
  end-page: 164
  ident: b23
  article-title: Unsupervised keyphrase extraction by jointly modeling local and global context
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing
– volume: 78
  start-page: 857
  year: 2019
  end-page: 875
  ident: b37
  article-title: Abstractive text summarization using LSTM-CNN based deep learning
  publication-title: Multimedia Tools and Applications
– start-page: 7871
  year: 2020
  end-page: 7880
  ident: b19
  article-title: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension
  publication-title: Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020
– start-page: 4171
  year: 2019
  end-page: 4186
  ident: b7
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
  publication-title: Proceedings of the 2019 conference of the north american chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, vol. 1
– start-page: 2227
  year: 2018
  end-page: 2237
  ident: b35
  article-title: Deep contextualized word representations
  publication-title: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1–6, 2018, vol. 1
– start-page: 543
  year: 2013
  end-page: 551
  ident: b3
  article-title: TopicRank: Graph-based topic ranking for keyphrase extraction
  publication-title: Sixth international joint conference on natural language processing, IJCNLP 2013, Nagoya, Japan, October 14–18, 2013
– year: 2004
  ident: b27
  article-title: Graph-based ranking algorithms for sentence extraction, applied to text summarization
  publication-title: Proceedings of the 42nd annual meeting of the association for computational linguistics, Barcelona, Spain, July 21–26
– start-page: 21
  year: 2010
  end-page: 26
  ident: b17
  article-title: SemEval-2010 task 5 : Automatic keyphrase extraction from scientific articles
  publication-title: Proceedings of the 5th international workshop on semantic evaluation, SemEval@ACL 2010, Uppsala University, Uppsala, Sweden, July 15–16, 2010
– volume: 8
  start-page: 10896
  year: 2020
  end-page: 10906
  ident: b39
  article-title: SIFRank: A new baseline for unsupervised keyphrase extraction based on pre-trained language model
  publication-title: IEEE Access
– start-page: 528
  year: 2018
  end-page: 540
  ident: b31
  article-title: Unsupervised learning of sentence embeddings using compositional n-gram features
  publication-title: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1–6, 2018, vol. 1
– start-page: 5754
  year: 2019
  end-page: 5764
  ident: b48
  article-title: XLNet: Generalized autoregressive pretraining for language understanding
  publication-title: Advances in neural information processing systems 32: annual conference on neural information processing systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada
– start-page: 1919
  year: 2021
  end-page: 1928
  ident: b8
  article-title: AttentionRank: Unsupervised keyphrase extraction using self and cross attentions
  publication-title: Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/punta cana, Dominican Republic, 7–11 November, 2021
– start-page: 2037
  year: 2020
  end-page: 2048
  ident: b36
  article-title: KeyGames: A game theoretic approach to automatic keyphrase extraction
  publication-title: Proceedings of the 28th international conference on computational linguistics, COLING 2020, Barcelona, Spain (Online), December 8–13, 2020
– year: 2019
  ident: b25
  article-title: RoBERTa: A robustly optimized BERT pretraining approach
– start-page: 14524
  year: 2021
  end-page: 14531
  ident: b51
  article-title: A unified multi-task learning framework for joint extraction of entities and relations
  publication-title: Thirty-Fifth AAAI conference on artificial intelligence, AAAI 2021, thirty-third conference on innovative applications of artificial intelligence, IAAI 2021, the eleventh symposium on educational advances in artificial intelligence, EAAI 2021, Virtual Event, February 2–9, 2021
– year: 1999
  ident: b30
  article-title: The PageRank citation ranking: Bringing order to the web.
– start-page: 756
  year: 2009
  end-page: 757
  ident: b15
  article-title: A ranking approach to keyphrase extraction
  publication-title: Proceedings of the 32nd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR 2009, Boston, MA, USA, July 19–23, 2009
– start-page: 404
  year: 2004
  ident: 10.1016/j.ipm.2023.103356_b28
  article-title: TextRank: Bringing order into text
– start-page: 221
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b1
  article-title: Simple unsupervised keyphrase extraction using sentence embeddings
– year: 2013
  ident: 10.1016/j.ipm.2023.103356_b29
  article-title: Efficient estimation of word representations in vector space
– volume: 54
  start-page: 888
  issue: 6
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b32
  article-title: Local word vectors guiding keyphrase extraction
  publication-title: Information Processing and Management
  doi: 10.1016/j.ipm.2018.06.004
– start-page: 155
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b23
  article-title: Unsupervised keyphrase extraction by jointly modeling local and global context
– volume: 19
  start-page: 2365
  issue: 4
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b33
  article-title: Named entity aware transfer learning for biomedical factoid question answering
  publication-title: IEEE/ACM Transactions on Computational Biology and Bioinformatics
  doi: 10.1109/TCBB.2021.3079339
– start-page: 4985
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b26
  article-title: An empirical study on neural keyphrase generation
– year: 2020
  ident: 10.1016/j.ipm.2023.103356_b41
– start-page: 543
  year: 2013
  ident: 10.1016/j.ipm.2023.103356_b3
  article-title: TopicRank: Graph-based topic ranking for keyphrase extraction
– start-page: 7871
  year: 2020
  ident: 10.1016/j.ipm.2023.103356_b19
  article-title: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension
– year: 1999
  ident: 10.1016/j.ipm.2023.103356_b30
– volume: 8
  start-page: 10896
  year: 2020
  ident: 10.1016/j.ipm.2023.103356_b39
  article-title: SIFRank: A new baseline for unsupervised keyphrase extraction based on pre-trained language model
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2020.2965087
– volume: 10772
  start-page: 806
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b4
  article-title: YAKE! Collection-independent automatic keyword extractor
– start-page: 932
  year: 2016
  ident: 10.1016/j.ipm.2023.103356_b46
  article-title: Extracting discriminative keyphrases with learned semantic hierarchies
– start-page: 8089
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b42
  article-title: Back to the basics: A quantitative analysis of statistical and graph-based term weighting schemes for keyword extraction
– start-page: 155
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b22
  article-title: Unsupervised keyphrase extraction by jointly modeling local and global context
– volume: 78
  start-page: 857
  issue: 1
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b37
  article-title: Abstractive text summarization using LSTM-CNN based deep learning
  publication-title: Multimedia Tools and Applications
  doi: 10.1007/s11042-018-5749-3
– start-page: 8968
  year: 2020
  ident: 10.1016/j.ipm.2023.103356_b40
  article-title: ERNIE 2.0: A continual pre-training framework for language understanding
– start-page: 667
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b2
  article-title: Unsupervised keyphrase extraction with multipartite graphs
– start-page: 14524
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b51
  article-title: A unified multi-task learning framework for joint extraction of entities and relations
– start-page: 1790
  year: 2020
  ident: 10.1016/j.ipm.2023.103356_b45
  article-title: Incorporating multimodal information in open-domain web keyphrase extraction
– start-page: 257
  year: 2009
  ident: 10.1016/j.ipm.2023.103356_b24
  article-title: Clustering to find exemplar terms for keyphrase extraction
– start-page: 478
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b12
  article-title: UCPhrase: Unsupervised context-aware quality phrase tagging
– volume: 32
  start-page: 1188
  year: 2014
  ident: 10.1016/j.ipm.2023.103356_b18
  article-title: Distributed representations of sentences and documents
– start-page: 528
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b31
  article-title: Unsupervised learning of sentence embeddings using compositional n-gram features
– volume: ACL/IJCNLP 2021
  start-page: 1685
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b21
  article-title: Improving unsupervised extractive summarization with facet-aware modeling
– year: 2021
  ident: 10.1016/j.ipm.2023.103356_b5
  article-title: Guiding the growth: Difficulty-controllable question generation through step-by-step rewriting
– start-page: 4923
  year: 2017
  ident: 10.1016/j.ipm.2023.103356_b10
  article-title: A position-biased PageRank algorithm for keyphrase extraction
– start-page: 5174
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b47
  article-title: Open domain web keyphrase extraction beyond language modeling
– start-page: 427
  year: 2017
  ident: 10.1016/j.ipm.2023.103356_b16
  article-title: Bag of tricks for efficient text classification
– start-page: 2726
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b38
  article-title: Importance estimation from multiple perspectives for keyphrase extraction
– start-page: 2037
  year: 2020
  ident: 10.1016/j.ipm.2023.103356_b36
  article-title: KeyGames: A game theoretic approach to automatic keyphrase extraction
– start-page: 855
  year: 2008
  ident: 10.1016/j.ipm.2023.103356_b44
  article-title: Single document keyphrase extraction using neighborhood knowledge
– year: 2021
  ident: 10.1016/j.ipm.2023.103356_b20
  article-title: Improving unsupervised extractive summarization by jointly modeling facet and redundancy
  publication-title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
– start-page: 5754
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b48
  article-title: XLNet: Generalized autoregressive pretraining for language understanding
– start-page: 6236
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b52
  article-title: Sentence centrality revisited for unsupervised summarization
– start-page: 1262
  year: 2014
  ident: 10.1016/j.ipm.2023.103356_b13
  article-title: Automatic keyphrase extraction: A survey of the state of the art
– volume: 56
  issue: 6
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b43
  article-title: A multi-centrality index for graph-based keyword extraction
  publication-title: Information Processing and Management
  doi: 10.1016/j.ipm.2019.102063
– start-page: 2227
  year: 2018
  ident: 10.1016/j.ipm.2023.103356_b35
  article-title: Deep contextualized word representations
– year: 2019
  ident: 10.1016/j.ipm.2023.103356_b25
– start-page: 396
  year: 2022
  ident: 10.1016/j.ipm.2023.103356_b50
  article-title: MDERank: A masked document embedding rank approach for unsupervised keyphrase extraction
– start-page: 1089
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b9
  article-title: Discourse-aware unsupervised summarization for long scientific documents
– start-page: 2705
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b49
  article-title: Heterogeneous graph neural networks for keyphrase generation
– start-page: 1532
  year: 2014
  ident: 10.1016/j.ipm.2023.103356_b34
  article-title: GloVe: Global vectors for word representation
– start-page: 1105
  year: 2017
  ident: 10.1016/j.ipm.2023.103356_b11
  article-title: PositionRank: An unsupervised approach to keyphrase extraction from scholarly documents
– year: 2022
  ident: 10.1016/j.ipm.2023.103356_b6
– start-page: 4171
  year: 2019
  ident: 10.1016/j.ipm.2023.103356_b7
  article-title: BERT: Pre-training of deep bidirectional transformers for language understanding
– start-page: 1919
  year: 2021
  ident: 10.1016/j.ipm.2023.103356_b8
  article-title: AttentionRank: Unsupervised keyphrase extraction using self and cross attentions
– start-page: 21
  year: 2010
  ident: 10.1016/j.ipm.2023.103356_b17
  article-title: SemEval-2010 task 5: Automatic keyphrase extraction from scientific articles
– year: 2004
  ident: 10.1016/j.ipm.2023.103356_b27
  article-title: Graph-based ranking algorithms for sentence extraction, applied to text summarization
– year: 2003
  ident: 10.1016/j.ipm.2023.103356_b14
  article-title: Improved automatic keyword extraction given more linguistic knowledge
StartPage 103356
SubjectTerms Graph-based ranking algorithm
Hierarchical Multi-granularity features
Unsupervised keyphrase extraction
Title Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features
URI https://dx.doi.org/10.1016/j.ipm.2023.103356
Volume 60