PySpark Tutorial Cheat Sheet
Basic Topics 📘
Get started with the foundational topics of PySpark for data engineering.
Tutorial 1
✨ Introduction to PySpark
This tutorial introduces PySpark, covers the basics, and shows how to get started.
Tutorial 2
🔧 Setting Up Spark Session
This tutorial shows how to configure and initialize a Spark session in PySpark.
Tutorial 3
📂 Working with CSV Files
This tutorial covers how to read and write CSV files in PySpark, along with configuration options.
Tutorial 4
📄 Working with JSON Files
Learn how to read and write JSON files in PySpark and configure options for handling JSON data.
Tutorial 5
🔗 Referring to Columns in PySpark
This tutorial covers the different ways to refer to a column in a PySpark DataFrame, so you can pick the form that suits each expression.
Tutorial 6
📋 Selecting Columns in PySpark
This tutorial covers the different ways to select columns from a PySpark DataFrame.
Tutorial 7
🔍 Filtering Data
This tutorial covers the ways to filter rows in PySpark so you can narrow a dataset to the records you need.
Tutorial 8
📊 Grouping Data
This tutorial explains how to group data in PySpark and apply aggregations to each group.
Tutorial 9
🔗 Joining Data
This tutorial explains how to join DataFrames in PySpark, covering various join types and options.
Advanced PySpark Topics 📘
Dive into advanced topics and master PySpark with simple tutorials.
Tutorial 1
📅 Date & Time Functions
This tutorial covers PySpark's date and time functions.
Tutorial 2
➕ Math Functions
This tutorial covers PySpark's math and arithmetic functions.
Tutorial 3
🔤 String Functions
This tutorial covers PySpark's string manipulation functions.