- Author: Pramod Singh
- ISBN: 1484249607
- Year: 2019
- Pages: 210
- Language: English
- File size: 10.3 MB
- File format: PDF, ePub
- Category: Python
Leverage machine learning and deep learning models to build applications on real-time data using PySpark. This book is ideal for readers who want to learn to use PySpark to perform exploratory data analysis and solve a range of business challenges.
You’ll start by reviewing PySpark fundamentals, such as Spark’s core architecture, and see how to use PySpark for big data processing tasks like data ingestion, cleaning, and transformation. This is followed by building workflows for analyzing streaming data using PySpark and a comparison of various streaming platforms.
You will then see how to schedule different Spark jobs using Airflow with PySpark, and examine how to tune machine learning and deep learning models for real-time predictions. The book concludes with a discussion of GraphFrames and performing network analysis using graph algorithms in PySpark. All the code presented in the book will be available as Python scripts on GitHub.
What You Will Learn
- Create pipelines for streaming data processing using PySpark
- Build machine learning and deep learning models using PySpark's latest offerings
- Perform graph analytics using PySpark
- Create sequence embeddings from text data
Who This Book is For
Data scientists, machine learning engineers, and deep learning engineers who want to learn and use PySpark for real-time analysis of streaming data.