Categorical data analysis

Praise for the Second Edition "A must-have book for anyone expecting to do research and/or applications in categorical data analysis." —Statistics in Medicine "It is a total delight reading this book." —Pharmaceutical Research "If you do any analysis of categorical data, thi...

Bibliographic details
Main author: Agresti, Alan
Format: E-book, Book
Language: English
Published: Hoboken, N.J.: Wiley, 2012
John Wiley & Sons, Incorporated
Wiley-Blackwell
Edition: 3rd ed.
Series: Wiley series in probability and statistics
ISBN:0470463635, 9780470463635, 9781118710852, 1118710851
Online access: Full text
Table of contents:
  • Cover -- Title Page -- Copyright Page -- Contents -- Preface -- 1 Introduction: Distributions and Inference for Categorical Data -- 1.1 Categorical Response Data -- 1.1.1 Response-Explanatory Variable Distinction -- 1.1.2 Binary-Nominal-Ordinal Scale Distinction -- 1.1.3 Discrete-Continuous Variable Distinction -- 1.1.4 Quantitative-Qualitative Variable Distinction -- 1.1.5 Organization of Book and Online Computing Appendix -- 1.2 Distributions for Categorical Data -- 1.2.1 Binomial Distribution -- 1.2.2 Multinomial Distribution -- 1.2.3 Poisson Distribution -- 1.2.4 Overdispersion -- 1.2.5 Connection Between Poisson and Multinomial Distributions -- 1.2.6 The Chi-Squared Distribution -- 1.3 Statistical Inference for Categorical Data -- 1.3.1 Likelihood Functions and Maximum Likelihood Estimation -- 1.3.2 Likelihood Function and ML Estimate for Binomial Parameter -- 1.3.3 Wald-Likelihood Ratio-Score Test Triad -- 1.3.4 Constructing Confidence Intervals by Inverting Tests -- 1.4 Statistical Inference for Binomial Parameters -- 1.4.1 Tests About a Binomial Parameter -- 1.4.2 Confidence Intervals for a Binomial Parameter -- 1.4.3 Example: Estimating the Proportion of Vegetarians -- 1.4.4 Exact Small-Sample Inference and the Mid P-Value -- 1.5 Statistical Inference for Multinomial Parameters -- 1.5.1 Estimation of Multinomial Parameters -- 1.5.2 Pearson Chi-Squared Test of a Specified Multinomial -- 1.5.3 Likelihood-Ratio Chi-Squared Test of a Specified Multinomial -- 1.5.4 Example: Testing Mendel's Theories -- 1.5.5 Testing with Estimated Expected Frequencies -- 1.5.6 Example: Pneumonia Infections in Calves -- 1.5.7 Chi-Squared Theoretical Justification -- 1.6 Bayesian Inference for Binomial and Multinomial Parameters -- 1.6.1 The Bayesian Approach to Statistical Inference -- 1.6.2 Binomial Estimation: Beta and Logit-Normal Prior Distributions
  • 1.6.3 Multinomial Estimation: Dirichlet Prior Distributions -- 1.6.4 Example: Estimating Vegetarianism Revisited -- 1.6.5 Binomial and Multinomial Estimation: Improper Priors -- Notes -- Exercises -- 2 Describing Contingency Tables -- 2.1 Probability Structure for Contingency Tables -- 2.1.1 Contingency Tables -- 2.1.2 Joint/Marginal/Conditional Distributions for Contingency Tables -- 2.1.3 Example: Sensitivity and Specificity for Medical Diagnoses -- 2.1.4 Independence of Categorical Variables -- 2.1.5 Poisson, Binomial, and Multinomial Sampling -- 2.1.6 Example: Seat Belts and Auto Accident Injuries -- 2.1.7 Example: Case-Control Study of Cancer and Smoking -- 2.1.8 Types of Studies: Observational Versus Experimental -- 2.2 Comparing Two Proportions -- 2.2.1 Difference of Proportions -- 2.2.2 Relative Risk -- 2.2.3 Odds Ratio -- 2.2.4 Properties of the Odds Ratio -- 2.2.5 Example: Association Between Heart Attacks and Aspirin Use -- 2.2.6 Case-Control Studies and the Odds Ratio -- 2.2.7 Relationship Between Odds Ratio and Relative Risk -- 2.3 Conditional Association in Stratified 2 × 2 Tables -- 2.3.1 Partial Tables -- 2.3.2 Example: Racial Characteristics and the Death Penalty -- 2.3.3 Conditional and Marginal Odds Ratios -- 2.3.4 Marginal Independence Versus Conditional Independence -- 2.3.5 Homogeneous Association -- 2.3.6 Collapsibility: Identical Conditional and Marginal Associations -- 2.4 Measuring Association in I × J Tables -- 2.4.1 Odds Ratios in I × J Tables -- 2.4.2 Association Factors -- 2.4.3 Summary Measures of Association -- 2.4.4 Ordinal Trends: Concordant and Discordant Pairs -- 2.4.5 Ordinal Measure of Association: Gamma -- 2.4.6 Probabilistic Comparisons of Two Ordinal Distributions -- 2.4.7 Example: Comparing Pain Ratings After Surgery -- 2.4.8 Correlation for Underlying Normality -- Notes -- Exercises
  • 3 Inference for Two-Way Contingency Tables -- 3.1 Confidence Intervals for Association Parameters -- 3.1.1 Interval Estimation of the Odds Ratio -- 3.1.2 Example: Seat-Belt Use and Traffic Deaths -- 3.1.3 Interval Estimation of Difference of Proportions and Relative Risk -- 3.1.4 Example: Aspirin and Heart Attacks Revisited -- 3.1.5 Deriving Standard Errors with the Delta Method -- 3.1.6 Delta Method Applied to the Sample Logit -- 3.1.7 Delta Method for the Log Odds Ratio -- 3.1.8 Simultaneous Confidence Intervals for Multiple Comparisons -- 3.2 Testing Independence in Two-way Contingency Tables -- 3.2.1 Pearson and Likelihood-Ratio Chi-Squared Tests -- 3.2.2 Example: Education and Belief in God -- 3.2.3 Adequacy of Chi-Squared Approximations -- 3.2.4 Chi-Squared and Comparing Proportions in 2 × 2 Tables -- 3.2.5 Score Confidence Intervals Comparing Proportions -- 3.2.6 Profile Likelihood Confidence Intervals -- 3.3 Following-up Chi-Squared Tests -- 3.3.1 Pearson Residuals and Standardized Residuals -- 3.3.2 Example: Education and Belief in God Revisited -- 3.3.3 Partitioning Chi-Squared -- 3.3.4 Example: Origin of Schizophrenia -- 3.3.5 Rules for Partitioning -- 3.3.6 Summarizing the Association -- 3.3.7 Limitations of Chi-Squared Tests -- 3.3.8 Why Consider Independence If It's Unlikely to Be True? -- 3.4 Two-Way Tables with Ordered Classifications -- 3.4.1 Linear Trend Alternative to Independence -- 3.4.2 Example: Is Happiness Associated with Political Ideology? -- 3.4.3 Monotone Trend Alternatives to Independence -- 3.4.4 Extra Power with Ordinal Tests -- 3.4.5 Sensitivity to Choice of Scores -- 3.4.6 Example: Infant Birth Defects by Maternal Alcohol Consumption -- 3.4.7 Trend Tests for I × 2 and 2 × J Tables -- 3.4.8 Nominal-Ordinal Tables -- 3.5 Small-Sample Inference for Contingency Tables -- 3.5.1 Fisher's Exact Test for 2 × 2 Tables
  • 3.5.2 Example: Fisher's Tea Drinker -- 3.5.3 Two-Sided P-Values for Fisher's Exact Test -- 3.5.4 Confidence Intervals Based on Conditional Likelihood -- 3.5.5 Discreteness and Conservatism Issues -- 3.5.6 Small-Sample Unconditional Tests of Independence -- 3.5.7 Conditional Versus Unconditional Tests -- 3.6 Bayesian Inference for Two-way Contingency Tables -- 3.6.1 Prior Distributions for Comparing Proportions in 2 × 2 Tables -- 3.6.2 Posterior Probabilities Comparing Proportions -- 3.6.3 Posterior Intervals for Association Parameters -- 3.6.4 Example: Urn Sampling Gives Highly Unbalanced Treatment Allocation -- 3.6.5 Highest Posterior Density Intervals -- 3.6.6 Testing Independence -- 3.6.7 Empirical Bayes and Hierarchical Bayesian Approaches -- 3.7 Extensions for Multiway Tables and Nontabulated Responses -- 3.7.1 Categorical Data Need Not Be Contingency Tables -- Notes -- Exercises -- 4 Introduction to Generalized Linear Models -- 4.1 The Generalized Linear Model -- 4.1.1 Components of Generalized Linear Models -- 4.1.2 Binomial Logit Models for Binary Data -- 4.1.3 Poisson Loglinear Models for Count Data -- 4.1.4 Generalized Linear Models for Continuous Responses -- 4.1.5 Deviance of a GLM -- 4.1.6 Advantages of GLMs Versus Transforming the Data -- 4.2 Generalized Linear Models for Binary Data -- 4.2.1 Linear Probability Model -- 4.2.2 Example: Snoring and Heart Disease -- 4.2.3 Logistic Regression Model -- 4.2.4 Binomial GLM for 2 × 2 Contingency Tables -- 4.2.5 Probit and Inverse cdf Link Functions -- 4.2.6 Latent Tolerance Motivation for Binary Response Models -- 4.3 Generalized Linear Models for Counts and Rates -- 4.3.1 Poisson Loglinear Models -- 4.3.2 Example: Horseshoe Crab Mating -- 4.3.3 Overdispersion for Poisson GLMs -- 4.3.4 Negative Binomial GLMs -- 4.3.5 Poisson Regression for Rates Using Offsets
  • 4.3.6 Example: Modeling Death Rates for Heart Valve Operations -- 4.3.7 Poisson GLM of Independence in Two-Way Contingency Tables -- 4.4 Moments and Likelihood for Generalized Linear Models -- 4.4.1 The Exponential Dispersion Family -- 4.4.2 Mean and Variance Functions for the Random Component -- 4.4.3 Mean and Variance Functions for Poisson and Binomial GLMs -- 4.4.4 Systematic Component and Link Function of a GLM -- 4.4.5 Likelihood Equations for a GLM -- 4.4.6 The Key Role of the Mean-Variance Relationship -- 4.4.7 Likelihood Equations for Binomial GLMs -- 4.4.8 Asymptotic Covariance Matrix of Model Parameter Estimators -- 4.4.9 Likelihood Equations and cov(β) for Poisson Loglinear Model -- 4.5 Inference and Model Checking for Generalized Linear Models -- 4.5.1 Deviance and Goodness of Fit -- 4.5.2 Deviance for Poisson GLMs -- 4.5.3 Deviance for Binomial GLMs: Grouped Versus Ungrouped Data -- 4.5.4 Likelihood-Ratio Model Comparison Using the Deviances -- 4.5.5 Score Tests for Goodness of Fit and for Model Comparison -- 4.5.6 Residuals for GLMs -- 4.5.7 Covariance Matrices for Fitted Values and Residuals -- 4.5.8 The Bayesian Approach for GLMs -- 4.6 Fitting Generalized Linear Models -- 4.6.1 Newton-Raphson Method -- 4.6.2 Fisher Scoring Method -- 4.6.3 Newton-Raphson and Fisher Scoring for Binary Data -- 4.6.4 ML as Iterative Reweighted Least Squares -- 4.6.5 Simplifications for Canonical Link Functions -- 4.7 Quasi-Likelihood and Generalized Linear Models -- 4.7.1 Mean-Variance Relationship Determines Quasi-likelihood Estimates -- 4.7.2 Overdispersion for Poisson GLMs and Quasi-likelihood -- 4.7.3 Overdispersion for Binomial GLMs and Quasi-likelihood -- 4.7.4 Example: Teratology Overdispersion -- Notes -- Exercises -- 5 Logistic Regression -- 5.1 Interpreting Parameters in Logistic Regression
  • 5.1.1 Interpreting β: Odds, Probabilities, and Linear Approximations
  • Intro -- Half Title page -- Title page -- Copyright page -- Dedication -- Preface -- Chapter 1: Introduction: Distributions and Inference for Categorical Data -- 1.1 Categorical Response Data -- 1.2 Distributions for Categorical Data -- 1.3 Statistical Inference for Categorical Data -- 1.4 Statistical Inference for Binomial Parameters -- 1.5 Statistical Inference for Multinomial Parameters -- 1.6 Bayesian Inference for Binomial and Multinomial Parameters -- Notes -- Exercises -- Chapter 2: Describing Contingency Tables -- 2.1 Probability Structure for Contingency Tables -- 2.2 Comparing Two Proportions -- 2.3 Conditional Association in Stratified 2 × 2 Tables -- 2.4 Measuring Association in I × J Tables -- Notes -- Exercises -- Chapter 3: Inference for Two-Way Contingency Tables -- 3.1 Confidence Intervals for Association Parameters -- 3.2 Testing Independence in Two-way Contingency Tables -- 3.3 Following-up Chi-Squared Tests -- 3.4 Two-Way Tables with Ordered Classifications -- 3.5 Small-Sample Inference for Contingency Tables -- 3.6 Bayesian Inference for Two-way Contingency Tables -- 3.7 Extensions for Multiway Tables and Nontabulated Responses -- Notes -- Exercises -- Chapter 4: Introduction to Generalized Linear Models -- 4.1 The Generalized Linear Model -- 4.2 Generalized Linear Models for Binary Data -- 4.3 Generalized Linear Models for Counts and Rates -- 4.4 Moments and Likelihood for Generalized Linear Models -- 4.5 Inference and Model Checking for Generalized Linear Models -- 4.6 Fitting Generalized Linear Models -- 4.7 Quasi-Likelihood and Generalized Linear Models -- Notes -- Exercises -- Chapter 5: Logistic Regression -- 5.1 Interpreting Parameters in Logistic Regression -- 5.2 Inference for Logistic Regression -- 5.3 Logistic Models with Categorical Predictors -- 5.4 Multiple Logistic Regression
  • 5.5 Fitting Logistic Regression Models -- Notes -- Exercises -- Chapter 6: Building, Checking, and Applying Logistic Regression Models -- 6.1 Strategies in Model Selection -- 6.2 Logistic Regression Diagnostics -- 6.3 Summarizing the Predictive Power of a Model -- 6.4 Mantel-Haenszel and Related Methods for Multiple 2 × 2 Tables -- 6.5 Detecting and Dealing with Infinite Estimates -- 6.6 Sample Size and Power Considerations -- Notes -- Exercises -- Chapter 7: Alternative Modeling of Binary Response Data -- 7.1 Probit and Complementary Log-log Models -- 7.2 Bayesian Inference for Binary Regression -- 7.3 Conditional Logistic Regression -- 7.4 Smoothing: Kernels, Penalized Likelihood, Generalized Additive Models -- 7.5 Issues in Analyzing High-Dimensional Categorical Data -- Notes -- Exercises -- Chapter 8: Models for Multinomial Responses -- 8.1 Nominal Responses: Baseline-Category Logit Models -- 8.2 Ordinal Responses: Cumulative Logit Models -- 8.3 Ordinal Responses: Alternative Models -- 8.4 Testing Conditional Independence in I × J × K Tables -- 8.5 Discrete-Choice Models -- 8.6 Bayesian Modeling of Multinomial Responses -- Notes -- Exercises -- Chapter 9: Loglinear Models for Contingency Tables -- 9.1 Loglinear Models for Two-way Tables -- 9.2 Loglinear Models for Independence and Interaction in Three-way Tables -- 9.3 Inference for Loglinear Models -- 9.4 Loglinear Models for Higher Dimensions -- 9.5 Loglinear-Logistic Model Connection -- 9.6 Loglinear Model Fitting: Likelihood Equations and Asymptotic Distributions -- 9.7 Loglinear Model Fitting: Iterative Methods and Their Application -- Notes -- Exercises -- Chapter 10: Building and Extending Loglinear Models -- 10.1 Conditional Independence Graphs and Collapsibility -- 10.2 Model Selection and Comparison -- 10.3 Residuals for Detecting Cell-Specific Lack of Fit
  • 10.4 Modeling Ordinal Associations -- 10.5 Generalized Loglinear and Association Models, Correlation Models, and Correspondence Analysis -- 10.6 Empty Cells and Sparseness in Modeling Contingency Tables -- 10.7 Bayesian Loglinear Modeling -- Notes -- Exercises -- Chapter 11: Models for Matched Pairs -- 11.1 Comparing Dependent Proportions -- 11.2 Conditional Logistic Regression for Binary Matched Pairs -- 11.3 Marginal Models for Square Contingency Tables -- 11.4 Symmetry, Quasi-Symmetry, and Quasi-Independence -- 11.5 Measuring Agreement Between Observers -- 11.6 Bradley-Terry Model for Paired Preferences -- 11.7 Marginal Models and Quasi-Symmetry Models for Matched Sets -- Notes -- Exercises -- Chapter 12: Clustered Categorical Data: Marginal and Transitional Models -- 12.1 Marginal Modeling: Maximum Likelihood Approach -- 12.2 Marginal Modeling: Generalized Estimating Equations (GEEs) Approach -- 12.3 Quasi-Likelihood and Its GEE Multivariate Extension: Details -- 12.4 Transitional Models: Markov Chain and Time Series Models -- Notes -- Exercises -- Chapter 13: Clustered Categorical Data: Random Effects Models -- 13.1 Random Effects Modeling of Clustered Categorical Data -- 13.2 Binary Responses: Logistic-Normal Model -- 13.3 Examples of Random Effects Models for Binary Data -- 13.4 Random Effects Models for Multinomial Data -- 13.5 Multilevel Modeling -- 13.6 GLMM Fitting, Inference, and Prediction -- 13.7 Bayesian Multivariate Categorical Modeling -- Notes -- Exercises -- Chapter 14: Other Mixture Models for Discrete Data -- 14.1 Latent Class Models -- 14.2 Nonparametric Random Effects Models -- 14.3 Beta-Binomial Models -- 14.4 Negative Binomial Regression -- 14.5 Poisson Regression with Random Effects -- Notes -- Exercises -- Chapter 15: Non-Model-Based Classification and Clustering -- 15.1 Classification: Linear Discriminant Analysis
  • 15.2 Classification: Tree-Structured Prediction -- 15.3 Cluster Analysis for Categorical Data -- Notes -- Exercises -- Chapter 16: Large- and Small-Sample Theory for Multinomial Models -- 16.1 Delta Method -- 16.2 Asymptotic Distributions of Estimators of Model Parameters and Cell Probabilities -- 16.3 Asymptotic Distributions of Residuals and Goodness-of-fit Statistics -- 16.4 Asymptotic Distributions for Logit/Loglinear Models -- 16.5 Small-Sample Significance Tests for Contingency Tables -- 16.6 Small-Sample Confidence Intervals for Categorical Data -- 16.7 Alternative Estimation Theory for Parametric Models -- Notes -- Exercises -- Chapter 17: Historical Tour of Categorical Data Analysis -- 17.1 Pearson-Yule Association Controversy -- 17.2 R. A. Fisher's Contributions -- 17.3 Logistic Regression -- 17.4 Multiway Contingency Tables and Loglinear Models -- 17.5 Bayesian Methods for Categorical Data -- 17.6 A Look Forward, and Backward -- Appendix A: Statistical Software for Categorical Data Analysis -- A.1 SAS -- A.2 R And S-Plus -- A.3 Stata -- A.4 SPSS -- A.5 StatXact and LogXact -- A.6 Other Software -- Appendix B: Chi-Squared Distribution Values -- References -- Author Index -- Example Index -- Subject Index