Text Mining and Subject Analysis for Fiction; or, Using Machine Learning and Information Extraction to Assign Subject Headings to Dime Novels

This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between...

Bibliographic Details
Published in: Cataloging & Classification Quarterly, Volume 57, Issue 5, pp. 315-336
Main author: Short, Matthew
Format: Journal Article
Language: English
Published: New York: Routledge, 4 July 2019
Taylor & Francis Ltd
ISSN: 0163-9374, 1544-4554
Abstract: This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.
Author affiliation: University Libraries, Northern Illinois University
Author email: mshort@niu.edu
Copyright: 2019 The Author(s). Published with license by Taylor & Francis Group, LLC
DOI: 10.1080/01639374.2019.1653413
Discipline: Library & Information Science
Peer reviewed: yes
Scholarly: yes
Page count: 22
Subject terms:
cataloging digital resources
cataloguing popular fiction
Classification
Clustering
Data mining
Digitization
dime novels
Experiments
Extraction
Fiction
Information retrieval
Literary criticism
Machine learning
Novels
Productivity
Researcher subject relations
Subject analysis
Subject heading schemes
Subject headings
text mining
URI https://www.tandfonline.com/doi/abs/10.1080/01639374.2019.1653413
https://www.proquest.com/docview/3116662405