Python Feature Engineering Cookbook: Over 70 Recipes for Creating, Engineering, and Transforming Features to Build Machine Learning Models

Leverage the power of Python to build real-world feature engineering and machine learning pipelines ready to be deployed to production. Key Features: Craft powerful features from tabular, transactional, and time-series data. Develop efficient and reproducible real-world feature engineering pip...

Bibliographic details
Main authors: Galli, Soledad; Molnar, Christoph
Format: E-book
Language: English
Published: Birmingham: Packt Publishing, Limited, 2024
Packt Publishing
Edition: 3
Subjects:
Online access: Full text
Table of contents:
  • Cover -- Title page -- Copyright and credits -- Foreword -- Contributors -- Table of Contents -- Preface
  • Chapter 1: Imputing Missing Data -- Technical requirements -- Removing observations with missing data -- How to do it... -- How it works... -- See also -- Performing mean or median imputation -- How to do it... -- How it works... -- Imputing categorical variables -- How to do it... -- How it works... -- Replacing missing values with an arbitrary number -- How to do it... -- How it works... -- Finding extreme values for imputation -- How to do it... -- How it works... -- Marking imputed values -- How to do it... -- How it works... -- There's more... -- Implementing forward and backward fill -- How to do it... -- How it works... -- Carrying out interpolation -- How to do it... -- How it works... -- See also -- Performing multivariate imputation by chained equations -- How to do it... -- How it works... -- See also -- Estimating missing data with nearest neighbors -- How to do it... -- How it works...
  • Chapter 2: Encoding Categorical Variables -- Technical requirements -- Creating binary variables through one-hot encoding -- How to do it... -- How it works... -- There's more... -- Performing one-hot encoding of frequent categories -- How to do it... -- How it works... -- There's more... -- Replacing categories with counts or the frequency of observations -- How to do it... -- How it works... -- See also -- Replacing categories with ordinal numbers -- How to do it... -- How it works... -- There's more... -- Performing ordinal encoding based on the target value -- How to do it... -- How it works... -- See also -- Implementing target mean encoding -- How to do it... -- How it works... -- There's more... -- Encoding with Weight of Evidence -- How to do it... -- How it works... -- See also -- Grouping rare or infrequent categories -- How to do it... -- How it works... -- Performing binary encoding -- How to do it... -- How it works...
  • Chapter 3: Transforming Numerical Variables -- Transforming variables with the logarithm function -- Getting ready -- How to do it... -- How it works... -- There's more... -- Transforming variables with the reciprocal function -- How to do it... -- How it works... -- Using the square root to transform variables -- How to do it... -- How it works... -- Using power transformations -- How to do it... -- How it works... -- Performing Box-Cox transformations -- How to do it... -- How it works... -- There's more... -- Performing Yeo-Johnson transformations -- How to do it... -- How it works... -- There's more...
  • Chapter 4: Performing Variable Discretization -- Technical requirements -- Performing equal-width discretization -- How to do it... -- How it works... -- See also -- Implementing equal-frequency discretization -- How to do it... -- How it works... -- Discretizing the variable into arbitrary intervals -- How to do it... -- How it works... -- Performing discretization with k-means clustering -- How to do it... -- How it works... -- See also -- Implementing feature binarization -- Getting ready -- How to do it... -- How it works... -- Using decision trees for discretization -- How to do it... -- How it works... -- There's more...
  • Chapter 5: Working with Outliers -- Technical requirements -- Visualizing outliers with boxplots and the inter-quartile proximity rule -- How to do it... -- How it works... -- Finding outliers using the mean and standard deviation -- How to do it... -- How it works... -- Using the median absolute deviation to find outliers -- How to do it... -- How it works... -- Removing outliers -- How to do it... -- How it works... -- See also -- Bringing outliers back within acceptable limits -- How to do it... -- How it works... -- See also -- Applying winsorization -- How to do it... -- How it works... -- See also
  • Chapter 6: Extracting Features from Date and Time Variables -- Technical requirements -- Extracting features from dates with pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Extracting features from time with pandas -- Getting ready -- How to do it... -- How it works... -- There's more... -- Capturing the elapsed time between datetime variables -- How to do it... -- How it works... -- There's more... -- See also -- Working with time in different time zones -- How to do it... -- How it works... -- See also -- Automating the datetime feature extraction with Feature-engine -- How to do it... -- How it works...
  • Chapter 7: Performing Feature Scaling -- Technical requirements -- Standardizing the features -- Getting ready -- How to do it... -- How it works... -- Scaling to the maximum and minimum values -- Getting ready -- How to do it... -- How it works... -- Scaling with the median and quantiles -- How to do it... -- How it works... -- Performing mean normalization -- How to do it... -- How it works... -- There's more... -- Implementing maximum absolute scaling -- Getting ready -- How to do it... -- There's more... -- Scaling to vector unit length -- How to do it... -- How it works...
  • Chapter 8: Creating New Features -- Technical requirements -- Combining features with mathematical functions -- Getting ready -- How to do it... -- How it works... -- See also -- Comparing features to reference variables -- How to do it... -- How it works... -- See also -- Performing polynomial expansion -- Getting ready -- How to do it... -- How it works... -- There's more... -- Combining features with decision trees -- How to do it... -- How it works... -- See also -- Creating periodic features from cyclical variables -- Getting ready -- How to do it... -- How it works... -- Creating spline features -- Getting ready -- How to do it... -- How it works... -- See also
  • Chapter 9: Extracting Features from Relational Data with Featuretools -- Technical requirements -- Setting up an entity set and creating features automatically -- Getting ready -- How to do it... -- How it works... -- See also -- Creating features with general and cumulative operations -- Getting ready -- How to do it... -- How it works... -- Combining numerical features -- How to do it... -- How it works... -- Extracting features from date and time -- How to do it... -- How it works... -- Extracting features from text -- Getting ready -- How to do it... -- How it works... -- Creating features with aggregation primitives -- Getting ready -- How to do it... -- How it works...
  • Chapter 10: Creating Features from a Time Series with tsfresh -- Technical requirements -- Extracting hundreds of features automatically from a time series -- Getting ready -- How to do it... -- How it works... -- See also -- Automatically creating and selecting predictive features from time-series data -- How to do it... -- How it works... -- See also -- Extracting different features from different time series -- How to do it... -- How it works... -- Creating a subset of features identified through feature selection -- How to do it... -- How it works... -- Embedding feature creation into a scikit-learn pipeline -- How to do it... -- How it works... -- See also
  • Chapter 11: Extracting Features from Text Variables -- Technical requirements -- Counting characters, words, and vocabulary -- Getting ready -- How to do it... -- How it works... -- There's more... -- See also -- Estimating text complexity by counting sentences -- Getting ready -- How to do it... -- How it works... -- There's more... -- Creating features with bag-of-words and n-grams -- Getting ready -- How to do it... -- How it works... -- See also -- Implementing term frequency-inverse document frequency -- Getting ready -- How to do it... -- How it works... -- See also -- Cleaning and stemming text variables -- Getting ready -- How to do it... -- How it works...
  • Index -- Other Books You May Enjoy