+60162650525, +919043703606

Training Information

Azure Databricks with PySpark

We are pleased to offer a comprehensive suite of training solutions tailored to meet your needs. Our services encompass both online and offline corporate training options, ensuring flexibility and accessibility for your team's professional development.

Course Content



Module 1: Cloud Computing Concepts

What is the "Cloud"?

Why cloud services?

Types of cloud models

Deployment Models

Private cloud deployment model

Public cloud deployment model

Hybrid cloud deployment model

Microsoft Azure

Amazon Web Services

Google Cloud Platform

Characteristics of cloud computing

On-demand self-service

Broad network access

Multi-tenancy and resource pooling

Rapid elasticity and scalability

Measured service

Cloud Data Warehouse Architecture

Shared Memory architecture

Shared Disk architecture

Shared Nothing architecture


Module 2: Core Azure Services

Core Azure Architectural components

Core Azure Services and Products

Azure solutions

Azure management tools


Module 3: Security, Privacy, Compliance

Securing network connectivity

Core Azure identity services

Security tools and features

Azure Governance methodologies

Monitoring and reporting

Privacy, compliance, and data protection standards


Module 4: Azure Pricing and Support

Azure subscriptions

Planning and managing costs

Azure support options

Azure Service Level Agreements (SLAs)

Service Lifecycle in Azure


Module 5: Introduction to Azure Databricks

Introduction to Databricks

Azure Databricks Architecture

Azure Databricks Main Concepts


Module 6: Azure Databricks Account Creation

Azure Free Account

Free Subscription for Azure Databricks

Create Databricks Community Edition Account


Module 7: Databricks Cluster Types and Notebook Options

Creating and configuring clusters

Creating a notebook

Quick tour of notebook options


Module 8: Databricks Utilities and Notebook Parameters

dbutils commands for files and directories

Notebooks and libraries

Databricks Variables

Widget Types

Databricks notebook parameters


Module 9: Databricks CLI

Azure Databricks CLI Installation

Databricks CLI - DBFS, Libraries and Jobs


Module 10: Databricks Integration with Azure Blob Storage

Reading data from Blob Storage and creating a Blob mount point


Module 11: Databricks Integration with Azure Data Lake Storage Gen2

Reading files from Azure Data Lake Storage Gen2


Module 12: Databricks Integration with Azure Data Lake Storage Gen1

Reading files from Azure Data Lake Storage Gen1


Module 13: Reading and Writing CSV files in Databricks

Read CSV Files

Read TSV files and pipe-separated CSV files

Read CSV files with multiple delimiters in Spark 2 and Spark 3

Reading CSV files with multiple delimiters in different positions


Module 14: Reading and Writing Parquet files in Databricks

Read Parquet files from Data Lake Storage Gen2

Reading and creating partitioned files in Spark


Module 16: Parsing Complex JSON Files

Reading and Writing JSON Files

Reading, Transforming and Writing Complex JSON files


Module 17: Reading and Writing ORC and Avro Files

Reading and Writing ORC and Avro Files


Module 19: Databricks Integration with Azure Synapse

Reading and Writing Azure Synapse data from Azure Databricks


Module 20: Databricks Integration with Amazon Redshift

Reading and writing data from Redshift using Databricks


Module 21: Databricks Integration with Snowflake

Reading and Writing data from Snowflake


Module 22: Databricks Integration with Cosmos DB SQL API

Reading and writing data from an Azure Cosmos DB account


Module 23: Python Introduction

Python Introduction

Installation and setup

Python Data Types for Azure Databricks


Module 24: Python Data Types

Deep dive into String Data Types in Python for Azure Databricks

Deep dive into Python collections: list and tuple

Deep dive into set and dict data types in Python


Module 25: Python Functions and Arguments

Python Functions and Arguments

Lambda Functions


Module 26: Python Modules and Packages

Python Modules and Packages


Module 27: Python Flow Control

Python Flow Control




Module 28: Python File Handling

Python File Handling


Module 29: Python Logging Module

Python Logging Module


Module 30: Python Exception Handling

Python Exception Handling


Module 31: PySpark Introduction

PySpark Introduction

PySpark Components and Features


Module 32: Spark Architecture and Internals

Apache Spark internal architecture

Jobs, stages, and tasks

Spark Cluster Architecture Explained


Module 33: Spark RDD

Different Ways to create RDD in Databricks

Spark Lazy Evaluation Internals & Word Count Program

RDD Transformations in Databricks & coalesce vs repartition

RDD Transformation and Use Cases


Module 34: Spark SQL

Spark SQL Introduction

Different ways to create DataFrames


Module 35: Spark SQL Internals

Catalyst Optimizer and Spark SQL Execution Plan

Deep dive into SparkSession vs SparkContext

Spark SQL Basics Part-1

RDD Transformation and Use Cases


Module 36: Spark SQL Basics

Spark SQL Basics Part-2

Joins in Spark SQL


Module 37: Spark SQL Functions and UDFs

Spark SQL Functions Part-1

Spark SQL Functions Part-2

Spark SQL Functions Part-3

Spark SQL UDFs

Spark SQL Temp tables and Joins


Module 38: Databricks Delta and Implementing Dimensions SCD1 and SCD2

Implementing SCD Type 1 and Apache Spark Databricks Delta

Delta Lake in Azure Databricks

Implementing SCD Type with and without Databricks Delta


Module 39: Databricks Integration with Azure Data Factory

Azure Data Factory Integration with Azure Databricks


Module 40: Databricks Streaming

Delta Streaming in Azure Databricks

Data Ingestion with Auto Loader in Azure Databricks


Module 41: Azure Databricks Projects

Azure Databricks Project-1

Azure Databricks Project-2


Module 42: Databricks Integration with Azure DevOps

Azure Databricks CI/CD Pipelines