Pandas Basics

This book is intended for those who plan to become data scientists as well as anyone who needs to perform data cleaning tasks using Pandas and NumPy. It covers a variety of code samples and features of NumPy and Pandas, and shows how to write regular expressions. Chapter 3 includes fundamental statistic...

Bibliographic Details
Main author: Campesato, Oswald
Format: eBook
Language: English
Published: Berlin, Mercury Learning and Information, 2023
Mercury Learning & Information
Mercury Learning
Edition: 1
Subjects:
ISBN: 9781683928263, 1683928261
Online access: Get full text
Abstract This book is intended for those who plan to become data scientists as well as anyone who needs to perform data cleaning tasks using Pandas and NumPy. It covers a variety of code samples and features of NumPy and Pandas, and shows how to write regular expressions. Chapter 3 includes fundamental statistical concepts and Chapter 7 covers data visualization with Matplotlib and Seaborn. Companion files with code are available for downloading from the publisher. FEATURES: Provides the reader with numerous code samples for Pandas and NumPy programming concepts, and an introduction to statistical concepts and data visualization. Includes an introductory chapter on Python. Companion files with code.
Author Campesato, Oswald
Author_xml – sequence: 1
  fullname: Campesato, Oswald
ContentType eBook
Copyright 2023
Copyright_xml – notice: 2023
DEWEY 005.133
DOI 10.1515/9781683928256
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781683928256
1683928253
9781683928249
1683928245
Edition 1
ExternalDocumentID bks000164087
9781683928256
EBC30286674
book_kpPB000023
Genre Electronic books
ID FETCH-LOGICAL-a12727-50303ef1ef06c84e79ae0893acac3125f78fc680595dd3cad5d3fad34ae6474d3
IEDL.DBID CMZ
ISBN 9781683928263
1683928261
IngestDate Tue Oct 28 12:00:34 EDT 2025
Fri Nov 21 20:03:13 EST 2025
Fri May 30 22:00:55 EDT 2025
Sat Nov 23 13:59:30 EST 2024
IsPeerReviewed false
IsScholarly false
Keywords Data Science
Matplotlib
Developers
data mining
Computer Science
Programming
Data
Seaborn
NumPy
Python
LCCallNum QA76.9.D343 .C36 2022
LCCallNum_Ident QA76.9.D343 .C36 2022
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a12727-50303ef1ef06c84e79ae0893acac3125f78fc680595dd3cad5d3fad34ae6474d3
OCLC 1355221863
PQID EBC30286674
PageCount 215
ParticipantIDs skillsoft_books24x7_bks000164087
walterdegruyter_marc_9781683928256
proquest_ebookcentral_EBC30286674
knovel_primary_book_kpPB000023
PublicationCentury 2000
PublicationDate 2023
2022
[2022]
2022.
PublicationDateYYYYMMDD 2023-01-01
2022-01-01
PublicationDate_xml – year: 2023
  text: 2023
PublicationDecade 2020
PublicationPlace Berlin
PublicationPlace_xml – name: Berlin
– name: Dulles, VA
– name: Place of publication not identified
PublicationYear 2023
2022
Publisher Mercury Learning and Information
Mercury Learning & Information
Mercury Learning
Publisher_xml – name: Mercury Learning and Information
– name: Mercury Learning & Information
– name: Mercury Learning
RestrictionsOnAccess restricted access
SSID ssib050312957
ssib051558295
ssib051558294
Score 2.383677
SourceID skillsoft
walterdegruyter
proquest
knovel
SourceType Aggregation Database
Publisher
SubjectTerms COM018000 COMPUTERS / Data Processing
Computer Science
COMPUTERS / Database Management / Data Mining
COMPUTERS / Programming / General
COMPUTERS / Programming Languages / Python
Data
Data mining
Data Science
Developers
Matplotlib
NumPy
Programming
Programming Languages
Python
Python (Computer program language)
Seaborn
Software Engineering
SubjectTermsDisplay Data mining.
Electronic books.
Python (Computer program language)
TableOfContents Title Page Disclaimer Preface Table of Contents 1. Introduction to Python 2. Working with Data 3. Introduction to Probability and Statistics 4. Introduction to Pandas (1) 5. Introduction to Pandas (2) 6. Introduction to Pandas (3) 7. Data Visualization Index
Cover -- Title Page -- Copyright -- Dedication -- Contents -- Preface
Chapter 1: Introduction to Python -- Tools for Python -- easy_install and pip -- virtualenv -- IPython -- Python Installation -- Setting the PATH Environment Variable (Windows Only) -- Launching Python on Your Machine -- The Python Interactive Interpreter -- Python Identifiers -- Lines, Indentation, and Multi-lines -- Quotations and Comments -- Saving Your Code in a Module -- Some Standard Modules -- The help() and dir() Functions -- Compile Time and Runtime Code Checking -- Simple Data Types -- Working with Numbers -- Working with Other Bases -- The chr() Function -- The round() Function -- Formatting Numbers -- Working with Fractions -- Unicode and UTF-8 -- Working with Unicode -- Working with Strings -- Comparing Strings -- Formatting Strings -- Uninitialized Variables and the Value None -- Slicing and Splicing Strings -- Testing for Digits and Alphabetic Characters -- Search and Replace a String in Other Strings -- Remove Leading and Trailing Characters -- Printing Text without NewLine Characters -- Text Alignment -- Working with Dates -- Converting Strings to Dates -- Exception Handling -- Handling User Input -- Command-line Arguments -- Summary
Chapter 2: Working with Data -- Dealing with Data: What Can Go Wrong? -- What is Data Drift? -- What are Datasets? -- Data Preprocessing -- Data Types -- Preparing Datasets -- Discrete Data Versus Continuous Data -- Binning Continuous Data -- Scaling Numeric Data via Normalization -- Scaling Numeric Data via Standardization -- Scaling Numeric Data via Robust Standardization -- What to Look for in Categorical Data -- Mapping Categorical Data to Numeric Values -- Working with Dates -- Working with Currency -- Working with Outliers and Anomalies -- Outlier Detection/Removal -- Finding Outliers with NumPy -- Finding Outliers with Pandas -- Calculating Z-scores to Find Outliers -- Finding Outliers with SkLearn (Optional) -- Working with Missing Data -- Imputing Values: When is Zero a Valid Value? -- Dealing with Imbalanced Datasets -- What is SMOTE? -- SMOTE extensions -- The Bias-Variance Tradeoff -- Types of Bias in Data -- Analyzing Classifiers (Optional) -- What is LIME? -- What is ANOVA? -- Summary
Chapter 3: Introduction to Probability and Statistics -- What is a Probability? -- Calculating the Expected Value -- Random Variables -- Discrete versus Continuous Random Variables -- Well-known Probability Distributions -- Fundamental Concepts in Statistics -- The Mean -- The Median -- The Mode -- The Variance and Standard Deviation -- Population, Sample, and Population Variance -- Chebyshev's Inequality -- What is a p-value? -- The Moments of a Function (Optional) -- What is Skewness? -- What is Kurtosis? -- Data and Statistics -- The Central Limit Theorem -- Correlation versus Causation -- Statistical Inferences -- Statistical Terms: RSS, TSS, R^2, and F1 Score -- What is an F1 score? -- Gini Impurity, Entropy, and Perplexity -- What is the Gini Impurity? -- What is Entropy? -- Calculating the Gini Impurity and Entropy Values -- Multi-dimensional Gini Index -- What is Perplexity? -- Cross-Entropy and KL Divergence -- What is Cross-Entropy? -- What is KL Divergence? -- What's Their Purpose? -- Covariance and Correlation Matrices -- The Covariance Matrix -- Covariance Matrix: An Example -- The Correlation Matrix -- Eigenvalues and Eigenvectors -- Calculating Eigenvectors: A Simple Example -- Gauss Jordan Elimination (Optional) -- PCA (Principal Component Analysis) -- The New Matrix of Eigenvectors -- Well-known Distance Metrics -- Pearson Correlation Coefficient -- Jaccard Index (or Similarity) -- Local Sensitivity Hashing (Optional) -- Types of Distance Metrics -- What is Bayesian Inference? -- Bayes' Theorem -- Some Bayesian Terminology -- What is MAP? -- Why Use Bayes' Theorem? -- Summary
Chapter 4: Introduction to Pandas (1) -- What is Pandas? -- Pandas Options and Settings -- Pandas Data Frames -- Data Frames and Data Cleaning Tasks -- Alternatives to Pandas -- A Pandas Data Frame with a NumPy Example -- Describing a Pandas Data Frame -- Pandas Boolean Data Frames -- Transposing a Pandas Data Frame -- Pandas Data Frames and Random Numbers -- Reading CSV Files in Pandas -- Specifying a Separator and Column Sets in Text Files -- Specifying an Index in Text Files -- The loc() and iloc() Methods in Pandas -- Converting Categorical Data to Numeric Data -- Matching and Splitting Strings in Pandas -- Converting Strings to Dates in Pandas -- Working with Date Ranges in Pandas -- Detecting Missing Dates in Pandas -- Interpolating Missing Dates in Pandas -- Other Operations with Dates in Pandas -- Merging and Splitting Columns in Pandas -- Reading HTML Web Pages in Pandas -- Saving a Pandas Data Frame as an HTML Web Page -- Summary
Chapter 5: Introduction to Pandas (2) -- Combining Pandas Data Frames -- Data Manipulation with Pandas Data Frames (1) -- Data Manipulation with Pandas Data Frames (2) -- Data Manipulation with Pandas Data Frames (3) -- Pandas Data Frames and CSV Files -- Managing Columns in Data Frames -- Switching Columns -- Appending Columns -- Deleting Columns -- Inserting Columns -- Scaling Numeric Columns -- Managing Rows in Pandas -- Selecting a Range of Rows in Pandas -- Finding Duplicate Rows in Pandas -- Inserting New Rows in Pandas -- Handling Missing Data in Pandas -- Multiple Types of Missing Values -- Test for Numeric Values in a Column -- Replacing NaN Values in Pandas -- Summary
Chapter 6: Introduction to Pandas (3) -- Threshold Values and Outliers -- The Pandas Pipe Method -- Pandas query() Method for Filtering Data -- Sorting Data Frames in Pandas -- Working with groupby() in Pandas -- Working with apply() and mapapply() in Pandas -- Handling Outliers in Pandas -- Pandas Data Frames and Scatterplots -- Pandas Data Frames and Simple Statistics -- Aggregate Operations in Pandas Data Frames -- Aggregate Operations with the titanic.csv Dataset -- Save Data Frames as CSV Files and Zip Files -- Pandas Data Frames and Excel Spreadsheets -- Working with JSON-based Data -- Python Dictionary and JSON -- Python, Pandas, and JSON -- Window Functions in Pandas -- Useful One-line Commands in Pandas -- What is pandasql? -- What is Method Chaining? -- Pandas and Method Chaining -- Pandas Profiling -- Alternatives to Pandas -- Summary
Chapter 7: Data Visualization -- What is Data Visualization? -- Types of Data Visualization -- What is Matplotlib? -- Lines in a Grid in Matplotlib -- A Colored Grid in Matplotlib -- Randomized Data Points in Matplotlib -- A Histogram in Matplotlib -- A Set of Line Segments in Matplotlib -- Plotting Multiple Lines in Matplotlib -- Trigonometric Functions in Matplotlib -- Display IQ Scores in Matplotlib -- Plot a Best-Fitting Line in Matplotlib -- The Iris Dataset in Sklearn -- Sklearn, Pandas, and the Iris Dataset -- Working with Seaborn -- Features of Seaborn -- Seaborn Built-in Datasets -- The Iris Dataset in Seaborn -- The Titanic Dataset in Seaborn -- Extracting Data from the Titanic Dataset in Seaborn (1) -- Extracting Data from the Titanic Dataset in Seaborn (2) -- Visualizing a Pandas Dataset in Seaborn -- Data Visualization in Pandas -- What is Bokeh? -- Summary
Index
Frontmatter
Preface
Contents
Chapter 1: Introduction to Python
Chapter 2: Working with Data
Chapter 3: Introduction to Probability and Statistics
Chapter 4: Introduction to Pandas (1)
Chapter 5: Introduction to Pandas (2)
Chapter 6: Introduction to Pandas (3)
Chapter 7: Data Visualization
Index
Title Pandas Basics
URI https://app.knovel.com/hotlink/toc/id:kpPB000023/pandas-basics/pandas-basics?kpromoter=Summon
https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=30286674
https://www.degruyterbrill.com/isbn/9781683928256
linkProvider Knovel