Pandas Cookbook - Practical Recipes for Scientific Computing, Time Series, and Exploratory Data Analysis Using Python (3rd Edition)

Unlock the full power of pandas 2.x with this hands-on cookbook, designed for Python developers, data analysts, and data scientists who need fast, efficient solutions for real-world data challenges. This book provides practical, ready-to-use recipes to streamline your workflow. With step-by-step gui...

Bibliographic details
Main authors: Ayd, William; Harrison, Matthew; McKinney, Wes
Format: E-book
Language: English
Published: Birmingham: Packt Publishing, 2024
Packt Publishing, Limited
Edition: 3
ISBN: 9781836205876, 1836205872
Online access: Full text
Table of contents:
  • Title Page -- Preface -- Table of Contents -- 1. Pandas Foundations -- 2. Selection and Assignment -- 3. Data Types -- 4. The Pandas I/O System -- 5. Algorithms and How to Apply Them -- 6. Visualization -- 7. Reshaping DataFrames -- 8. Group by -- 9. Temporal Data Types and Algorithms -- 10. General Usage and Performance Tips -- 11. The Pandas Ecosystem -- Index
  • Cover -- Title Page -- Copyright Page -- Foreword -- Contributors -- Table of Contents -- Preface -- Making the Most Out of This Book - Get to Know Your Free Benefits -- Chapter 1: pandas Foundations -- Importing pandas -- Series -- DataFrame -- Index -- Series attributes -- DataFrame attributes -- Chapter 2: Selection and Assignment -- Basic selection from a Series -- Basic selection from a DataFrame -- Position-based selection of a Series -- Position-based selection of a DataFrame -- Label-based selection from a Series -- Label-based selection from a DataFrame -- Mixing position-based and label-based selection -- DataFrame.filter -- Selection by data type -- Selection/filtering via Boolean arrays -- Selection with a MultiIndex - A single level -- Selection with a MultiIndex - Multiple levels -- Selection with a MultiIndex - a DataFrame -- Item assignment with .loc and .iloc -- DataFrame column assignment -- Chapter 3: Data Types -- Integral types -- Floating point types -- Boolean types -- String types -- Missing value handling -- Categorical types -- Temporal types - datetime -- Temporal types - timedelta -- Temporal PyArrow types -- PyArrow List types -- PyArrow decimal types -- NumPy type system, the object type, and pitfalls -- Chapter 4: The pandas I/O System -- CSV - basic reading/writing -- CSV - strategies for reading large files -- Microsoft Excel - basic reading/writing -- Microsoft Excel - finding tables in non-default locations -- Microsoft Excel - hierarchical data -- SQL using SQLAlchemy -- SQL using ADBC -- Apache Parquet -- JSON -- HTML -- Pickle -- Third-party I/O libraries -- Chapter 5: Algorithms and How to Apply Them -- Basic pd.Series arithmetic -- Basic pd.DataFrame arithmetic -- Aggregations -- Transformations -- Map -- Apply -- Summary statistics -- Binning algorithms -- One-hot encoding with pd.get_dummies
  • Chaining with .pipe -- Selecting the lowest-budget movies from the top 100 -- Calculating a trailing stop order price -- Finding the baseball players best at… -- Understanding which position scores the most per team -- Chapter 6: Visualization -- Creating charts from aggregated data -- Plotting distributions of non-aggregated data -- Further plot customization with Matplotlib -- Exploring scatter plots -- Exploring categorical data -- Exploring continuous data -- Using seaborn for advanced plots -- Chapter 7: Reshaping DataFrames -- Concatenating pd.DataFrame objects -- Merging DataFrames with pd.merge -- Joining DataFrames with pd.DataFrame.join -- Reshaping with pd.DataFrame.stack and pd.DataFrame.unstack -- Reshaping with pd.DataFrame.melt -- Reshaping with pd.wide_to_long -- Reshaping with pd.DataFrame.pivot and pd.pivot_table -- Reshaping with pd.DataFrame.explode -- Transposing with pd.DataFrame.T -- Join our community on Discord -- Chapter 8: Group By -- Group by basics -- Grouping and calculating multiple columns -- Group by apply -- Window operations -- Selecting the highest rated movies by year -- Comparing the best hitter in baseball across years -- Chapter 9: Temporal Data Types and Algorithms -- Timezone handling -- DateOffsets -- Datetime selection -- Resampling -- Aggregating weekly crime and traffic accidents -- Calculating year-over-year changes in crime by category -- Accurately measuring sensor-collected events with missing values -- Chapter 10: General Usage and Performance Tips -- Avoid dtype=object -- Be cognizant of data sizes -- Use vectorized functions instead of loops -- Avoid mutating data -- Dictionary-encode low cardinality data -- Test-driven development features -- Chapter 11: The pandas Ecosystem -- Foundational libraries -- NumPy -- PyArrow -- Exploratory data analysis -- YData Profiling -- Data validation
  • Great Expectations -- Visualization -- Plotly -- PyGWalker -- Data science -- scikit-learn -- XGBoost -- Databases -- DuckDB -- Other DataFrame libraries -- Ibis -- Dask -- Polars -- cuDF -- Packt Page -- Other Books You May Enjoy -- Index