Data mining for the social sciences: an introduction

We live in a world of big data: the amount of information collected on human behavior each day is staggering, and exponentially greater than at any time in the past. Additionally, powerful algorithms are capable of churning through seas of data to uncover patterns. Providing a simple and accessible...

Bibliographic details
Main authors:	Attewell, Paul; Monaghan, David
Format:	E-book, Book
Language:	English
Published:	Oakland, Calif.: University of California Press, 2015
Edition:	1
Subjects:	analyzing data; bayesian networks; big data; bootstrapping; business analytics; chaid; classification and regression trees; classification trees; confusion matrix; data analysis; Data mining; Data processing; data scholarship; data science; Demography; hardware for data mining; heteroscedasticity; naive bayes; partition trees; permutation tests; Political Science; POLITICAL SCIENCE / General; POLITICAL SCIENCE / Political Economy; Population Studies; scholarly data; SOCIAL SCIENCE; SOCIAL SCIENCE / Demography; Social sciences; Social sciences > Data processing; Social sciences > Statistical methods; social scientists; software for data mining; Statistical methods; statistical modeling; studying data; text mining; vif regression; weka
ISBN:	9780520280984, 9780520960596, 0520960599, 9780520280977, 0520280989, 0520280970
Online access:	Full text

Table of contents:

Data mining for the social sciences : an introduction -- Contents -- Acknowledgments -- Part 1: Concepts -- 1. What Is Data Mining? -- 2. Contrasts with the Conventional Statistical Approach -- 3. Some General Strategies Used in Data Mining -- 4. Important Stages in a Data Mining Project -- Part 2: Worked Examples -- 5. Preparing Training and Test Datasets -- 6. Variable Selection Tools -- 7. Creating New Variables Using Binning and Trees -- 8. Extracting Variables -- 9. Classifiers -- 10. Classification Trees -- 11. Neural Networks -- 12. Clustering -- 13. Latent Class Analysis and Mixture Models -- 14. Association Rules -- Conclusion -- Bibliography -- Notes -- Index.
Cover -- Title -- Copyright -- Contents -- Acknowledgments -- PART 1. CONCEPTS -- 1. What Is Data Mining? -- The Goals of This Book -- Software and Hardware for Data Mining -- Basic Terminology -- 2. Contrasts with the Conventional Statistical Approach -- Predictive Power in Conventional Statistical Modeling -- Hypothesis Testing in the Conventional Approach -- Heteroscedasticity as a Threat to Validity in Conventional Modeling -- The Challenge of Complex and Nonrandom Samples -- Bootstrapping and Permutation Tests -- Nonlinearity in Conventional Predictive Models -- Statistical Interactions in Conventional Models -- Conclusion -- 3. Some General Strategies Used in Data Mining -- Cross-Validation -- Overfitting -- Boosting -- Calibrating -- Measuring Fit: The Confusion Matrix and ROC Curves -- Identifying Statistical Interactions and Effect Heterogeneity in Data Mining -- Bagging and Random Forests -- The Limits of Prediction -- Big Data Is Never Big Enough -- 4. Important Stages in a Data Mining Project -- When to Sample Big Data -- Building a Rich Array of Features -- Feature Selection -- Feature Extraction -- Constructing a Model -- PART 2. WORKED EXAMPLES -- 5. Preparing Training and Test Datasets -- The Logic of Cross-Validation -- Cross-Validation Methods: An Overview -- 6. Variable Selection Tools -- Stepwise Regression -- The LASSO -- VIF Regression -- 7. Creating New Variables Using Binning and Trees -- Discretizing a Continuous Predictor -- Continuous Outcomes and Continuous Predictors -- Binning Categorical Predictors -- Using Partition Trees to Study Interactions -- 8. Extracting Variables -- Principal Component Analysis -- Independent Component Analysis -- 9. Classifiers -- K-Nearest Neighbors -- Naive Bayes -- Support Vector Machines -- Optimizing Prediction across Multiple Classifiers -- 10. Classification Trees -- Partition Trees -- Boosted Trees and Random Forests -- 11. Neural Networks -- 12. Clustering -- Hierarchical Clustering -- K-Means Clustering -- Normal Mixtures -- Self-Organized Maps -- 13. Latent Class Analysis and Mixture Models -- Latent Class Analysis -- Latent Class Regression -- Mixture Models -- 14. Association Rules -- Conclusion -- Bibliography -- Notes -- Index
Frontmatter --
CONTENTS --
ACKNOWLEDGMENTS --
PART 1 CONCEPTS --
1. WHAT IS DATA MINING? --
2. CONTRASTS WITH THE CONVENTIONAL STATISTICAL APPROACH --
3. SOME GENERAL STRATEGIES USED IN DATA MINING --
4. IMPORTANT STAGES IN A DATA MINING PROJECT --
PART 2 WORKED EXAMPLES --
5. PREPARING TRAINING AND TEST DATASETS --
6. VARIABLE SELECTION TOOLS --
7. CREATING NEW VARIABLES --
8. EXTRACTING VARIABLES --
9. CLASSIFIERS --
10. CLASSIFICATION TREES --
11. NEURAL NETWORKS --
12. CLUSTERING --
13. LATENT CLASS ANALYSIS AND MIXTURE MODELS --
14. ASSOCIATION RULES --
CONCLUSION. Where Next? --
BIBLIOGRAPHY --
NOTES --
INDEX