Hands-On Big Data Analytics with PySpark

Analyze large datasets and discover techniques for testing, immunizing, and parallelizing Spark jobs

Apr 2019

Table of Contents

  1. Installing PySpark and Setting Up Your Development Environment
  2. Getting Your Big Data into the Spark Environment Using RDDs
  3. Big Data Cleaning and Wrangling with Spark Notebooks
  4. Aggregating and Summarizing Data into Useful Reports
  5. Powerful Exploratory Data Analysis with MLlib
  6. Putting Structure on Your Big Data with SparkSQL
  7. Transformations and Actions
  8. Immutable Design
  9. Avoiding Shuffle and Reducing Operational Expenses
  10. Saving Data in the Correct Format
  11. Working with the Spark Key/Value API
  12. Testing Apache Spark Jobs
  13. Leveraging the Spark GraphX API
PDF ISBN: 978-1-83864-883-1
Publisher: Packt Publishing Limited
Copyright owner: © 2019 Packt Publishing Limited
Publication date: 2019
Language: English
Pages: 182