Multi-granularity hierarchical topic-based segmentation of structured, digital library resources
| Published in: | Electronic library Vol. 35; no. 1; pp. 99 - 120 |
|---|---|
| Main Authors: | Wang, Zhongyi; Zhang, Jin; Huang, Jing |
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford; Emerald Publishing Limited; 01.01.2017; Emerald Group Publishing Limited |
| ISSN: | 0264-0473, 1758-616X |
| Abstract | Purpose
Current segmentation systems almost invariably focus on linear segmentation and can only divide text into linear sequences of segments. This suits cohesive text such as news feeds, but not coherent texts such as digital library documents, which have hierarchical structures. To move beyond linear segmentation and achieve hierarchical segmentation of a digital library’s structured resources, this paper proposes a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks.
Design/methodology/approach
MHTSS adopts a top-down segmentation strategy to divide a structured digital library document into a document segmentation tree. Specifically, it works in three stages: document parsing, coarse segmentation based on document access structures, and fine-grained segmentation based on lexical cohesion.
Findings
This paper analyzed the limitations of existing document segmentation methods for structured digital library resources. The authors found that document access structures and lexical cohesion techniques complement each other and, combined, allow better segmentation of such resources. Based on this finding, the paper proposed MHTSS. To evaluate it, MHTSS was compared with the TT and C99 algorithms on real-world digital library corpora; in this comparison, MHTSS achieved the best overall performance.
Practical implications
With MHTSS, digital library users can retrieve the relevant segments directly instead of receiving the whole document. This will improve retrieval performance as well as dramatically reduce information overload.
Originality/value
This paper proposed MHTSS for structured digital library resources; it combines document access structures and lexical cohesion techniques to decide section breaks. With this system, end-users can access a document section by section through a document structure tree. |
|---|---|
| AbstractList | Proposes a new multi-granularity hierarchical topic-based segmentation system (MHTSS) to decide section breaks in coherent texts such as documents of a digital library which have hierarchical structures. Source: National Library of New Zealand Te Puna Matauranga o Aotearoa, licensed by the Department of Internal Affairs for re-use under the Creative Commons Attribution 3.0 New Zealand Licence. |
| Author | Zhang, Jin; Huang, Jing; Wang, Zhongyi |
| Author_xml | 1. Wang, Zhongyi (wzywzy13579@163.com); 2. Zhang, Jin (jinzhanguwm@gmail.com); 3. Huang, Jing (huangjing8117@163.com) |
| BackLink | https://natlib-primo.hosted.exlibrisgroup.com/primo-explore/search?query=any,contains,998933097202837&tab=innz&search_scope=INNZ&vid=NLNZ&offset=0$$DView this record in NLNZ |
| Cites_doi | 10.3233/ICA-130446 10.1002/asi.20237 10.1016/j.patcog.2003.10.012 10.1023/A:1007506220214 10.1109/TASL.2011.2143405 10.1162/089120102317341756 10.1016/j.ipm.2010.11.008 10.1016/j.ins.2007.02.038 |
| ContentType | Journal Article |
| Copyright | Emerald Publishing Limited Emerald Publishing Limited 2017 |
| Copyright_xml | – notice: Emerald Publishing Limited – notice: Emerald Publishing Limited 2017 |
| DOI | 10.1108/EL-06-2015-0108 |
| Discipline | Library & Information Science; Computer Science |
| EISSN | 1758-616X |
| EndPage | 120 |
| ExternalDocumentID | 4313389391 998933097202837 10_1108_EL_06_2015_0108 10.1108/EL-06-2015-0108 |
| GeographicLocations | New York; United States--US; China |
| GeographicLocations_xml | – name: New York – name: China – name: United States--US |
| ID | FETCH-LOGICAL-c290t-22648fcbd27e29333baf7936c7bddc3fb84fb6b74e94a0cf3c788c713920b4d83 |
| IEDL.DBID | TMT |
| ISICitedReferencesCount | 1 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000396719100006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0264-0473 1758-616X |
| IngestDate | Sat Nov 15 03:21:33 EST 2025 Fri Nov 14 15:36:48 EST 2025 Wed Nov 05 12:12:59 EST 2025 Tue Feb 11 07:04:41 EST 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Access structures; Digital library resources; Hierarchical segmentation; Lexical cohesion; Structured segmentation; AIC; Optimum partitioning clustering |
| Language | English |
| License | Licensed re-use rights only https://www.emerald.com/insight/site-policies |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c290t-22648fcbd27e29333baf7936c7bddc3fb84fb6b74e94a0cf3c788c713920b4d83 |
| Notes | Includes illustrations, references, tables. Includes links to related electronic resources. |
| PQID | 1867933132 |
| PQPubID | 32127 |
| PageCount | 22 |
| ParticipantIDs | nlnz_indexnz_998933097202837 crossref_primary_10_1108_EL_06_2015_0108 proquest_journals_1867933132 emerald_primary_10_1108_EL-06-2015-0108 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-01-01 |
| PublicationDateYYYYMMDD | 2017-01-01 |
| PublicationDate_xml | – month: 01 year: 2017 text: 2017-01-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | Oxford |
| PublicationPlace_xml | – name: Oxford |
| PublicationTitle | Electronic library |
| PublicationYear | 2017 |
| Publisher | Emerald Publishing Limited; Emerald Group Publishing Limited |
| Publisher_xml | – name: Emerald Publishing Limited – name: Emerald Group Publishing Limited |
| SourceID | proquest nlnz crossref emerald |
| SourceType | Aggregation Database Index Database Publisher |
| StartPage | 99 |
| SubjectTerms | Access; Algorithms; Cohesion; Computerized corpora; Cues; Digital libraries; Digital systems; Electronic Libraries; Hierarchies; Image segmentation; Information overload; Information retrieval; Libraries; Library resources; Library users; Linguistics; Literature Reviews; Methods; Morphemes; Parsing; Relevance; Repetition; Resources; Retrieval; Retrieval performance measures; Segmentation; Segments; Semantics; Semiotics; Sentences; Sequences; Spelling; Structural hierarchy; Studies; Syntax; Text editing; Text processing (Computer science); Topic and comment |
| Title | Multi-granularity hierarchical topic-based segmentation of structured, digital library resources |
| URI | https://www.emerald.com/insight/content/doi/10.1108/EL-06-2015-0108/full/html https://natlib-primo.hosted.exlibrisgroup.com/primo-explore/search?query=any,contains,998933097202837&tab=innz&search_scope=INNZ&vid=NLNZ&offset=0 https://www.proquest.com/docview/1867933132 |
| Volume | 35 |
| WOSCitedRecordID | wos000396719100006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
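
The three-stage process named in the abstract (document parsing, coarse segmentation based on access structures, fine-grained segmentation based on lexical cohesion) can be illustrated with a minimal Python sketch. This is not the authors' MHTSS implementation, which the record does not describe in detail. Everything here is an assumption made for illustration: headings marked with `#` stand in for access structures, each non-heading line is treated as one sentence, and the names `Node`, `parse`, `cosine`, `fine_segment` and the `threshold` value are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a (hypothetical) document segmentation tree."""
    title: str
    level: int
    sentences: list = field(default_factory=list)
    children: list = field(default_factory=list)


def parse(lines):
    """Stages 1 and 2 (assumed form): read '#'-style headings as access
    structures and nest the sections they introduce into a tree."""
    root = Node("document", 0)
    stack = [root]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = re.match(r"(#+)\s+(.*)", line)
        if match:
            node = Node(match.group(2), len(match.group(1)))
            # Climb back up until the top of the stack is a shallower section.
            while stack[-1].level >= node.level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        else:
            # Plain text belongs to the most recently opened section.
            stack[-1].sentences.append(line)
    return root


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fine_segment(sentences, threshold=0.1):
    """Stage 3 (assumed form): open a new segment wherever two adjacent
    sentences share too little vocabulary, a crude lexical cohesion test."""
    if not sentences:
        return []
    bags = [Counter(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    segments = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(bags[i - 1], bags[i]) < threshold:
            segments.append([])
        segments[-1].append(sentences[i])
    return segments
```

A caller would run `parse` on the document lines and then apply `fine_segment` to the `sentences` of each node, so that heading-level sections are split further only where vocabulary overlap drops. A realistic system would compare windows of sentences and filter stop words rather than compare raw adjacent sentences.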