Text Mining and Subject Analysis for Fiction; or, Using Machine Learning and Information Extraction to Assign Subject Headings to Dime Novels
This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between...
| Published in: | Cataloging & classification quarterly Vol. 57; no. 5; pp. 315 - 336 |
|---|---|
| Main author: | Short, Matthew |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Routledge, Taylor & Francis Ltd, 4 July 2019 |
| ISSN: | 0163-9374, 1544-4554 |
| Abstract | This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows. |
|---|---|
| Author | Short, Matthew |
| Author_xml | Short, Matthew (mshort@niu.edu), University Libraries, Northern Illinois University |
| CitedBy_id | crossref_primary_10_1080_19386389_2022_2030635 crossref_primary_10_1016_j_acalib_2023_102736 crossref_primary_10_1155_2022_5856069 |
| ContentType | Journal Article |
| Copyright | 2019 The Author(s). Published with license by Taylor & Francis Group, LLC |
| DOI | 10.1080/01639374.2019.1653413 |
| DatabaseName | CrossRef; Library & Information Science Abstracts (LISA) |
| Discipline | Library & Information Science |
| EISSN | 1544-4554 |
| EndPage | 336 |
| Genre | Article |
| ISICitedReferencesCount | 8 |
| ISSN | 0163-9374 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Language | English |
| PageCount | 22 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-07-04 |
| PublicationDecade | 2010 |
| PublicationPlace | New York |
| PublicationTitle | Cataloging & classification quarterly |
| PublicationYear | 2019 |
| Publisher | Routledge Taylor & Francis Ltd |
| StartPage | 315 |
| SubjectTerms | cataloging digital resources; cataloguing popular fiction; Classification; Clustering; Data mining; Digitization; dime novels; Experiments; Extraction; Fiction; Information retrieval; Literary criticism; Machine learning; Novels; Productivity; Researcher subject relations; Subject analysis; Subject heading schemes; Subject headings; text mining |
| Title | Text Mining and Subject Analysis for Fiction; or, Using Machine Learning and Information Extraction to Assign Subject Headings to Dime Novels |
| URI | https://www.tandfonline.com/doi/abs/10.1080/01639374.2019.1653413 https://www.proquest.com/docview/3116662405 |
| Volume | 57 |