Big Data Analysis with Python
Ivan Marin, Ankit Shukla, Sarang VK
Updated: 2021-06-11 13:46:55
Latest chapter: Chapter 08: Creating a Full Analysis Report
Cover
Copyright Page
Preface
Chapter 1 The Python Data Science Stack
Introduction
Python Libraries and Packages
Using Pandas
Data Type Conversion
Aggregation and Grouping
Exporting Data from Pandas
Visualization with Pandas
Summary
Chapter 2 Statistical Visualizations
Introduction
Types of Graphs and When to Use Them
Components of a Graph
Seaborn
Which Tool Should Be Used?
Types of Graphs
Pandas DataFrames and Grouped Data
Changing Plot Design: Modifying Graph Components
Exporting Graphs
Summary
Chapter 3 Working with Big Data Frameworks
Introduction
Hadoop
Spark
Writing Parquet Files
Handling Unstructured Data
Summary
Chapter 4 Diving Deeper with Spark
Introduction
Getting Started with Spark DataFrames
Writing Output from Spark DataFrames
Exploring Spark DataFrames
Data Manipulation with Spark DataFrames
Graphs in Spark
Summary
Chapter 5 Handling Missing Values and Correlation Analysis
Introduction
Setting up the Jupyter Notebook
Missing Values
Handling Missing Values in Spark DataFrames
Correlation
Summary
Chapter 6 Exploratory Data Analysis
Introduction
Defining a Business Problem
Translating a Business Problem into Measurable Metrics and Exploratory Data Analysis (EDA)
Structured Approach to the Data Science Project Life Cycle
Summary
Chapter 7 Reproducibility in Big Data Analysis
Introduction
Reproducibility with Jupyter Notebooks
Gathering Data in a Reproducible Way
Code Practices and Standards
Avoiding Repetition
Summary
Chapter 8 Creating a Full Analysis Report
Introduction
Reading Data in Spark from Different Data Sources
SQL Operations on a Spark DataFrame
Generating Statistical Measurements
Summary
Appendix
Chapter 01: The Python Data Science Stack
Chapter 02: Statistical Visualizations Using Matplotlib and Seaborn
Chapter 03: Working with Big Data Frameworks
Chapter 04: Diving Deeper with Spark
Chapter 05: Missing Value Handling and Correlation Analysis in Spark
Chapter 06: Business Process Definition and Exploratory Data Analysis
Chapter 07: Reproducibility in Big Data Analysis
Chapter 08: Creating a Full Analysis Report