Essential PySpark for Scalable Data Analytics

A beginner's guide to harnessing the power and ease of PySpark 3

Paid access | Nov 2021

Table of Contents

  1. Distributed Computing Primer
  2. Data Ingestion
  3. Data Cleansing and Integration
  4. Real-time Data Analytics
  5. Scalable Machine Learning with PySpark
  6. Feature Engineering – Extraction, Transformation, and Selection
  7. Supervised Machine Learning
  8. Unsupervised Machine Learning
  9. Machine Learning Life Cycle Management
  10. Scaling Out Single-Node Machine Learning Using PySpark
  11. Data Visualization with PySpark
  12. Spark SQL Primer
  13. Integrating External Tools with Spark SQL
  14. The Data Lakehouse
PDF ISBN: 978-1-80056-309-4
Publisher: Packt Publishing Limited
Copyright owner: © 2021 Packt Publishing Limited
Publication date: 2021
Language: English
Pages: 322
