Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media (such as blog articles, forum posts, product reviews, and tweets). This has led to an increasing demand for powerful softw...

Bibliographic details
Main authors: Zhai, ChengXiang; Massung, Sean
Format: eBook, Book
Language: English
Published: New York, NY, USA: Association for Computing Machinery and Morgan & Claypool, 2016
ACM Books
Morgan & Claypool
Association for Computing Machinery
Association for Computing Machinery and Morgan & C
Edition: 1
Series: ACM Books
Subjects:
ISBN: 9781970001174, 1970001178, 197000116X, 9781970001167
ISSN: 2374-6777
Online access: Get full text
Abstract Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media (such as blog articles, forum posts, product reviews, and tweets). This has led to an increasing demand for powerful software tools to help people manage and analyze vast amounts of text data effectively and efficiently. Unlike data generated by a computer system or sensors, text data are usually generated directly by humans, and capture semantically rich content. As such, text data are especially valuable for discovering knowledge about human opinions and preferences, in addition to many other kinds of knowledge that we encode in text. In contrast to structured data, which conform to well-defined schemas (thus are relatively easy for computers to handle), text has less explicit structure, requiring computer processing toward understanding of the content encoded in text. The current technology of natural language processing has not yet reached a point to enable a computer to precisely understand natural language text, but a wide range of statistical and heuristic approaches to management and analysis of text data have been developed over the past few decades. They are usually very robust and can be applied to analyze and manage text data in any natural language, and about any topic. This book provides a systematic introduction to many of these approaches, with an emphasis on covering the most useful knowledge and skills required to build a variety of practically useful text information systems. Because humans can understand natural languages far better than computers can, effective involvement of humans in a text information system is generally needed and text information systems often serve as intelligent assistants for humans. Depending on how a text information system collaborates with humans, we distinguish two kinds of text information systems. 
The first is information retrieval systems which include search engines and recommender systems; they assist users in finding from a large collection of text data the most relevant text data that are actually needed for solving a specific application problem, thus effectively turning big raw text data into much smaller relevant text data that can be more easily processed by humans. The second is text mining application systems; they can assist users in analyzing patterns in text data to extract and discover useful actionable knowledge directly useful for task completion or decision making, thus providing more direct task support for users. This book covers the major concepts, techniques, and ideas in information retrieval and text data mining from a practical viewpoint, and includes many hands-on exercises designed with a companion software toolkit (i.e., MeTA) to help readers learn how to apply techniques of information retrieval and text mining to real-world text data and how to experiment with and improve some of the algorithms for interesting application tasks. This book can be used as a textbook for computer science undergraduates and graduates, library and information scientists, or as a reference book for practitioners working on relevant problems in managing and analyzing text data.
AbstractList
This book provides a systematic introduction to a wide range of statistical and heuristic approaches to the management and analysis of text data. It emphasizes the most useful knowledge and skills required to build a variety of practically useful text information systems. Because humans can understand natural languages far better than computers can, effective involvement of humans in a text information system is generally needed and text information systems often serve as intelligent assistants for humans. Depending on how a text information system collaborates with humans, we distinguish two kinds of text information systems. The first is information retrieval systems which include search engines and recommender systems; they assist users in finding from a large collection of text data the most relevant text data that are actually needed for solving a specific application problem, thus effectively turning big raw text data into much smaller relevant text data that can be more easily processed by humans. The second is text mining application systems; they can assist users in analyzing patterns in text data to extract and discover useful actionable knowledge directly useful for task completion or decision making, thus providing more direct task support for users. The book covers the major concepts, techniques, and ideas in information retrieval and text data mining from a practical viewpoint, and includes many hands-on exercises designed with a companion software toolkit (i.e., MeTA) to help readers learn how to apply techniques of information retrieval and text mining to real-world text data and how to experiment with and improve some of the algorithms for interesting application tasks. The book can be used as a textbook for computer science undergraduates and graduates, library and information scientists, or as a reference book for practitioners working on relevant problems in managing and analyzing text data.
Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. This has led to an increasing demand for powerful software tools to help people analyze and manage vast amounts of text data effectively and efficiently. Unlike data generated by a computer system or sensors, text data are usually generated directly by humans, and are accompanied by semantically rich content. As such, text data are especially valuable for discovering knowledge about human opinions and preferences, in addition to many other kinds of knowledge that we encode in text. In contrast to structured data, which conform to well-defined schemas (thus are relatively easy for computers to handle), text has less explicit structure, requiring computer processing toward understanding of the content encoded in text. The current technology of natural language processing has not yet reached a point to enable a computer to precisely understand natural language text, but a wide range of statistical and heuristic approaches to analysis and management of text data have been developed over the past few decades. They are usually very robust and can be applied to analyze and manage text data in any natural language, and about any topic. This book provides a systematic introduction to all these approaches, with an emphasis on covering the most useful knowledge and skills required to build a variety of practically useful text information systems. The focus is on text mining applications that can help users analyze patterns in text data to extract and reveal useful knowledge. Information retrieval systems, including search engines and recommender systems, are also covered as supporting technology for text mining applications. 
The book covers the major concepts, techniques, and ideas in text data mining and information retrieval from a practical viewpoint, and includes many hands-on exercises designed with a companion software toolkit (i.e., MeTA) to help readers learn how to apply techniques of text mining and information retrieval to real-world text data and how to experiment with and improve some of the algorithms for interesting application tasks. The book can be used as a textbook for a computer science undergraduate course or a reference book for practitioners working on relevant problems in analyzing and managing text data.
Author Massung, Sean
Zhai, ChengXiang
Author_xml – sequence: 1
  givenname: ChengXiang
  surname: Zhai
  fullname: Zhai, ChengXiang
  organization: University of Illinois at Urbana-Champaign
– sequence: 2
  givenname: Sean
  surname: Massung
  fullname: Massung, Sean
  organization: University of Illinois at Urbana-Champaign
BackLink https://cir.nii.ac.jp/crid/1130282269856746624 (View record in CiNii)
ContentType eBook
Book
DBID RYH
DEWEY 006.35
DOI 10.1145/2915031
DatabaseName CiNii Complete
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
Languages & Literatures
EISBN 9781970001174
1970001178
EISSN 2374-6777
Edition 1
ExternalDocumentID 9781970001174
EBC6954877
BB22576564
2915031
GroupedDBID 38.
AABBV
ACM
ALECU
ALMA_UNASSIGNED_HOLDINGS
AMUST
ATVEE
BBABE
COGBH
CZZ
NKVPI
AAZEP
ABARN
ABMRC
ABQPQ
ADVEM
AEIUR
AERYV
AFOJC
AHWGJ
AINKX
AJFER
EBSCA
GEOUK
QD8
RYH
ID FETCH-LOGICAL-a32273-be6e50514d2b145ed47a20cf8f3549723373f3e078cdffc8dc3bfc468ba87dfd3
ISBN 9781970001174
1970001178
197000116X
9781970001167
IngestDate Fri Nov 08 03:27:56 EST 2024
Wed Dec 10 12:42:26 EST 2025
Fri Jun 27 00:53:17 EDT 2025
Wed Jan 31 06:45:39 EST 2024
IsPeerReviewed false
IsScholarly false
LCCallNum_Ident P98.5.S83 Z435 2016
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-a32273-be6e50514d2b145ed47a20cf8f3549723373f3e078cdffc8dc3bfc468ba87dfd3
Notes Includes bibliographical references (p. [477]-488) and index
OCLC 1321789959
PQID EBC6954877
PageCount 530
ParticipantIDs askewsholts_vlebooks_9781970001174
proquest_ebookcentral_EBC6954877
nii_cinii_1130282269856746624
acm_books_10_1145_2915031
PublicationCentury 2000
PublicationDate [2016]
PublicationDateYYYYMMDD 2016-01-01
PublicationDate_xml – year: 2016
  text: [2016]
PublicationDecade 2010
PublicationPlace New York, NY, USA
PublicationPlace_xml – name: New York, NY, USA
– name: New York
– name: San Rafael, Calif
– name: San Rafael
PublicationSeriesTitle ACM Books
PublicationYear 2016
Publisher Association for Computing Machinery and Morgan & Claypool
ACM Books
Morgan & Claypool
Association for Computing Machinery
Association for Computing Machinery and Morgan & C
Publisher_xml – name: Association for Computing Machinery and Morgan & Claypool
– name: ACM Books
– name: Morgan & Claypool
– name: Association for Computing Machinery
– name: Association for Computing Machinery and Morgan & C
SSID ssj0002135140
Score 2.0693314
Snippet Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise...
This book provides a systematic introduction to a wide range of statistical and heuristic approaches to the management and analysis of text data. It...
SourceID askewsholts
proquest
nii
acm
SourceType Aggregation Database
Publisher
SubjectTerms Computational linguistics-Statistical methods
Data mining
Title Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining
URI https://cir.nii.ac.jp/crid/1130282269856746624
https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=6954877
https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9781970001174
Volume 12
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider ProQuest Ebooks
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.title=Text+Data+Management+and+Analysis%3A+A+Practical+Introduction+to+Information+Retrieval+and+Text+Mining&rft.au=Zhai%2C+ChengXiang&rft.au=Massung%2C+Sean&rft.series=ACM+Books&rft.date=2016-01-01&rft.pub=Association+for+Computing+Machinery+and+Morgan+%26+C&rft.isbn=9781970001174&rft_id=info:doi/10.1145%2F2915031&rft.externalDocID=9781970001174
thumbnail_m http://cvtisr.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Fvle.dmmserver.com%2Fmedia%2F640%2F97819700%2F9781970001174.jpg