Multi-granularity hierarchical topic-based segmentation of structured, digital library resources

Bibliographic Details
Published in: Electronic library, Vol. 35, no. 1, pp. 99-120
Main Authors: Wang, Zhongyi; Zhang, Jin; Huang, Jing
Format: Journal Article
Language: English
Published: Oxford: Emerald Publishing Limited; Emerald Group Publishing Limited, 01.01.2017
ISSN: 0264-0473, 1758-616X
Abstract
Purpose: Current segmentation systems focus almost invariably on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feeds but not coherent texts such as digital library documents, which have hierarchical structures. To move beyond linear segmentation and achieve hierarchical segmentation of a digital library's structured resources, this paper proposes a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks.
Design/methodology/approach: MHTSS adopts an up-down segmentation strategy to divide a structured digital library document into a document segmentation tree. It works in three stages: document parsing, coarse segmentation based on document access structures, and fine-grained segmentation based on lexical cohesion.
Findings: The paper analyzed the limitations of existing document segmentation methods for structured digital library resources. The authors found that document access structures and lexical cohesion techniques complement each other and, combined, allow better segmentation of such resources. Based on this finding, the paper proposed MHTSS. In an evaluation against the TT and C99 algorithms on real-world digital library corpora, MHTSS achieved the top overall performance.
Practical implications: With MHTSS, digital library users can retrieve relevant information directly as segments instead of receiving the whole document. This will improve retrieval performance and dramatically reduce information overload.
Originality/value: MHTSS combines document access structures and lexical cohesion techniques to decide section breaks for structured digital library resources. With this system, end-users can access a document by section through a document structure tree.
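The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that access structures are available as Markdown-style `#` heading lines, that each body line is one sentence, and that lexical cohesion can be approximated by cosine similarity between the word counts of adjacent sentences. The names `Node`, `parse_document`, `cohesion_split`, `segment_tree` and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the document segmentation tree (hypothetical structure)."""
    title: str
    level: int
    sentences: list = field(default_factory=list)
    children: list = field(default_factory=list)


def parse_document(lines):
    """Parsing and coarse segmentation: nest sections by heading depth.

    Assumption: headings look like '# Title' or '## Subtitle'.
    """
    root = Node("ROOT", 0)
    stack = [root]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = re.match(r"^(#+)\s+(.*)$", line)
        if match:
            node = Node(match.group(2), len(match.group(1)))
            # Pop back to the nearest ancestor with a shallower level.
            while stack[-1].level >= node.level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        else:
            stack[-1].sentences.append(line)
    return root


def bag_of_words(sentence):
    """Lower-cased word counts for one sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cohesion_split(sentences, threshold=0.1):
    """Fine-grained segmentation: break where adjacent sentences share little vocabulary."""
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(bag_of_words(prev), bag_of_words(cur)) < threshold:
            segments.append([cur])
        else:
            segments[-1].append(cur)
    return segments


def segment_tree(node, threshold=0.1):
    """Return the tree as nested dicts with fine-grained segments per section."""
    return {
        "title": node.title,
        "segments": cohesion_split(node.sentences, threshold),
        "children": [segment_tree(child, threshold) for child in node.children],
    }
```

Calling `segment_tree(parse_document(lines))` on a list of text lines yields a nested dictionary in which each section carries its own fine-grained segments, which loosely mirrors the idea of accessing a document by section through a structure tree. A real system would likely use richer access structures (tables of contents, XML markup) and a stronger cohesion measure than adjacent-sentence cosine similarity.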
Proposes a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks in coherent texts such as documents of a digital library which have hierarchical structures. Source: National Library of New Zealand Te Puna Matauranga o Aotearoa, licensed by the Department of Internal Affairs for re-use under the Creative Commons Attribution 3.0 New Zealand Licence.
DOI: 10.1108/EL-06-2015-0108
Keywords: Access structures; Digital library resources; Hierarchical segmentation; Lexical cohesion; Structured segmentation; AIC; Optimum partitioning clustering
Notes: Includes illustrations, references, tables.
Subject terms: Access; Algorithms; Cohesion; Computerized corpora; Cues; Digital libraries; Digital systems; Electronic Libraries; Hierarchies; Image segmentation; Information overload; Information retrieval; Libraries; Library resources; Library users; Linguistics; Literature Reviews; Methods; Morphemes; Parsing; Relevance; Repetition; Resources; Retrieval; Retrieval performance measures; Segmentation; Segments; Semantics; Semiotics; Sentences; Sequences; Spelling; Structural hierarchy; Studies; Syntax; Text editing; Text processing (Computer science); Topic and comment