Data mining for the social sciences: an introduction
We live in a world of big data: the amount of information collected on human behavior each day is staggering, and exponentially greater than at any time in the past. Additionally, powerful algorithms are capable of churning through seas of data to uncover patterns. Providing a simple and accessible...
| Main authors: | Attewell, Paul; Monaghan, David |
|---|---|
| Format: | E-book, Book |
| Language: | English |
| Published: | Oakland, Calif.: University of California Press, 2015 |
| Edition: | 1 |
| Subjects: | |
| ISBN: | 9780520280984, 9780520960596, 0520960599, 9780520280977, 0520280989, 0520280970 |
| Online access: | Full text |
| Abstract | We live in a world of big data: the amount of information collected on human behavior each day is staggering, and exponentially greater than at any time in the past. Additionally, powerful algorithms are capable of churning through seas of data to uncover patterns. Providing a simple and accessible introduction to data mining, Paul Attewell and David B. Monaghan discuss how data mining substantially differs from conventional statistical modeling familiar to most social scientists. The authors also empower social scientists to tap into these new resources and incorporate data mining methodologies in their analytical toolkits. Data Mining for the Social Sciences demystifies the process by describing the diverse set of techniques available, discussing the strengths and weaknesses of various approaches, and giving practical demonstrations of how to carry out analyses using tools in various statistical software packages. |
|---|---|
| Author | Monaghan, David; Attewell, Paul |
| Author_xml | – sequence: 1 fullname: Attewell, Paul – sequence: 2 fullname: Monaghan, David |
| BackLink | https://cir.nii.ac.jp/crid/1130000795785968256$$DView record in CiNii |
| ContentType | eBook; Book |
| Copyright | 2015 David B. Monaghan; 2015 Paul Attewell |
| Copyright_xml | – notice: 2015 Paul Attewell – notice: 2015 David B. Monaghan |
| DBID | E05 RYH YSPEL |
| DEWEY | 006.3/12 |
| DOI | 10.1525/9780520960596 |
| DatabaseName | University of California Press eBooks; CiNii Complete; Perlego |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Social Sciences (General); Political Science; Computer Science |
| EISBN | 0520960599 9780520960596 |
| Edition | 1 |
| ExternalDocumentID | 9780520960596 EBC1882080 552698 BB18721160 10.1525/j.ctt13x1gcg UCPB0001364 |
| Genre | Electronic books |
| GroupedDBID | -VQ -VX AABBV AAZEP ABARN ABCYY ABGYK ABMRC ABQPQ ABYBY ACBVX ACBYE ACLGV ACNAM ACTWL ACYTI ADHSM ADVEM ADWAQ AERYV AFAWW AFTSL AHWGJ AILDO AIXPE AJFER ALMA_UNASSIGNED_HOLDINGS ALUEM AMYDA AQQNK AVGCG AXFPL AZZ BBABE BPBUR CZZ DHNOV DUGUG E05 EBSCA EBZNK ECOWB GHDSN HELXT J-X JJU JLPMJ KBOFU MYL NK1 NK2 PQQKQ PYZUL QD8 SUPCW XI1 YSPEL ~I6 RYH |
| ID | FETCH-LOGICAL-a58822-58b7de14cde7c988cbb5b4fc33f0a62b64b7a1742899923a67f47684f89b54693 |
| ISBN | 9780520280984 9780520960596 0520960599 9780520280977 0520280989 0520280970 |
| IngestDate | Thu Sep 11 08:26:12 EDT 2025 Fri Nov 21 19:37:54 EST 2025 Wed Dec 10 11:06:17 EST 2025 Tue Dec 02 17:24:38 EST 2025 Thu Jun 26 22:25:24 EDT 2025 Mon Sep 22 07:06:41 EDT 2025 Thu Sep 11 08:22:35 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Keywords | confusion matrix; data scholarship; software for data mining; data mining; social scientists; weka; analyzing data; chaid; data science; naive bayes; studying data; hardware for data mining; partition trees; statistical modeling; text mining; big data; bayesian networks; vif regression; data analysis; data processing; permutation tests; scholarly data; business analytics; social science; bootstrapping; heteroscedasticity; classification and regression trees; statistical methods; classification trees |
| LCCN | 2014035276 |
| LCCallNum | H61.3A88 2015 |
| LCCallNum_Ident | H61.3 -- .A88 2015eb |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-a58822-58b7de14cde7c988cbb5b4fc33f0a62b64b7a1742899923a67f47684f89b54693 |
| Notes | Bibliography: p. 239-244 Includes index |
| OCLC | 905221641 |
| PQID | EBC1882080 |
| PageCount | 265 |
| ParticipantIDs | askewsholts_vlebooks_9780520960596 walterdegruyter_marc_9780520960596 proquest_ebookcentral_EBC1882080 perlego_books_552698 nii_cinii_1130000795785968256 jstor_books_j_ctt13x1gcg igpublishing_primary_UCPB0001364 |
| ProviderPackageCode | J-X |
| PublicationCentury | 2000 |
| PublicationDate | 2015. 20150504 c2015 2015 [2015] 2015-04-30 |
| PublicationDateYYYYMMDD | 2015-01-01 2015-05-04 2015-04-30 |
| PublicationDate_xml | – year: 2015 text: 2015 |
| PublicationDecade | 2010 |
| PublicationPlace | Oakland, Calif |
| PublicationPlace_xml | – name: Oakland, Calif – name: Berkeley – name: Berkeley, CA |
| PublicationYear | 2015 |
| Publisher | University of California Press |
| Publisher_xml | – name: University of California Press |
| RestrictionsOnAccess | restricted access |
| SSID | ssj0001437791 |
| Score | 2.3206434 |
| Snippet | We live in a world of big data: the amount of information collected on human behavior each day is staggering, and exponentially greater than at any time in the... |
| SourceID | askewsholts walterdegruyter proquest perlego nii jstor igpublishing |
| SourceType | Aggregation Database; Publisher |
| SubjectTerms | analyzing data; bayesian networks; big data; bootstrapping; business analytics; chaid; classification and regression trees; classification trees; confusion matrix; data analysis; Data mining; Data processing; data scholarship; data science; Demography; hardware for data mining; heteroscedasticity; naive bayes; partition trees; permutation tests; Political Science; POLITICAL SCIENCE / General; POLITICAL SCIENCE / Political Economy; Population Studies; scholarly data; SOCIAL SCIENCE; SOCIAL SCIENCE / Demography; Social sciences; Social sciences -- Data processing; Social sciences -- Statistical methods; social scientists; software for data mining; Statistical methods; statistical modeling; studying data; text mining; vif regression; weka |
| SubjectTermsDisplay | Demography; Social Science |
| Subtitle | an introduction |
| TableOfContents | Cover -- Title -- Copyright -- Contents -- Acknowledgments -- PART 1. CONCEPTS -- 1. What Is Data Mining? -- The Goals of This Book -- Software and Hardware for Data Mining -- Basic Terminology -- 2. Contrasts with the Conventional Statistical Approach -- Predictive Power in Conventional Statistical Modeling -- Hypothesis Testing in the Conventional Approach -- Heteroscedasticity as a Threat to Validity in Conventional Modeling -- The Challenge of Complex and Nonrandom Samples -- Bootstrapping and Permutation Tests -- Nonlinearity in Conventional Predictive Models -- Statistical Interactions in Conventional Models -- Conclusion -- 3. Some General Strategies Used in Data Mining -- Cross-Validation -- Overfitting -- Boosting -- Calibrating -- Measuring Fit: The Confusion Matrix and ROC Curves -- Identifying Statistical Interactions and Effect Heterogeneity in Data Mining -- Bagging and Random Forests -- The Limits of Prediction -- Big Data Is Never Big Enough -- 4. Important Stages in a Data Mining Project -- When to Sample Big Data -- Building a Rich Array of Features -- Feature Selection -- Feature Extraction -- Constructing a Model -- PART 2. WORKED EXAMPLES -- 5. Preparing Training and Test Datasets -- The Logic of Cross-Validation -- Cross-Validation Methods: An Overview -- 6. Variable Selection Tools -- Stepwise Regression -- The LASSO -- VIF Regression -- 7. Creating New Variables Using Binning and Trees -- Discretizing a Continuous Predictor -- Continuous Outcomes and Continuous Predictors -- Binning Categorical Predictors -- Using Partition Trees to Study Interactions -- 8. Extracting Variables -- Principal Component Analysis -- Independent Component Analysis -- 9. Classifiers -- K-Nearest Neighbors -- Naive Bayes -- Support Vector Machines -- Optimizing Prediction across Multiple Classifiers -- 10. Classification Trees -- Partition Trees -- Boosted Trees and Random Forests -- 11. Neural Networks -- 12. Clustering -- Hierarchical Clustering -- K-Means Clustering -- Normal Mixtures -- Self-Organized Maps -- 13. Latent Class Analysis and Mixture Models -- Latent Class Analysis -- Latent Class Regression -- Mixture Models -- 14. Association Rules -- Conclusion -- Bibliography -- Notes -- Index |
| Title | Data mining for the social sciences |
| URI | http://portal.igpublish.com/iglibrary/search/UCPB0001364.html https://www.jstor.org/stable/10.1525/j.ctt13x1gcg https://cir.nii.ac.jp/crid/1130000795785968256 https://www.perlego.com/book/552698/data-mining-for-the-social-sciences-an-introduction-pdf https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=1882080 https://www.degruyterbrill.com/isbn/9780520960596 https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9780520960596&uid=none |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | ProQuest Ebooks |

