digital@thrayait.com +60162650525, +919043703606

Training Information

Azure Data Engineering Full Stack (ADB, ADF, Synapse)

We are pleased to offer a comprehensive suite of training solutions tailored to meet your needs. Our services encompass both online and offline corporate training options, ensuring flexibility and accessibility for your team's professional development.


Course Content

Syllabus:

Azure Data Engineering

Module 1: Cloud Computing Concepts

What is the "Cloud"?

Why Cloud Services

Types of Cloud Models

Deployment Models

Private Cloud

Public Cloud

Hybrid Cloud

Types of Cloud Services

Infrastructure as a Service (IaaS)

Platform as a Service (PaaS)

Software as a Service (SaaS)

Comparing Cloud Platforms

Microsoft Azure

Amazon Web Services

Google Cloud Platform

Characteristics of Cloud Computing

On-demand self-service

Broad network access

Multi-tenancy and resource pooling

Rapid elasticity and scalability

Measured service

Cloud Data Warehouse Architecture

Shared Memory architecture

Shared Disk architecture

Shared Nothing architecture

Module 2: Big Data Introduction

What is Big Data?

Big Data Sources

Data vs Information

Characteristics of Big Data

Variety

Velocity

Volume

Veracity

Value

Types of Big Data

Structured Data

Unstructured Data

Semi-Structured Data

Module 3: Dimensional Modelling

OLTP System

Relational Modelling

Characteristics of OLTP

Enterprise Data Warehouse

Dimensional Modelling

Dimensional Modelling Schemas

Star Schema

Snowflake Schema

Multi Star Schema

Dimensional Tables

Fact Tables

Types of Slowly Changing Dimensions

Type 1 Dimension

Type 2 Dimension

Type 3 Dimension

Types of Facts

Additive Facts

Semi-Additive Facts

Non-Additive Facts

Module 4: Azure SQL Database

Introduction to Azure SQL Database

Comparing Single Database and Managed Instance

Creating and Using SQL Server

Creating SQL Database Services

Azure SQL Database Tools

Migrating an On-Premises Database to Azure SQL

Purchasing Models

DTU service tiers

vCore-based Model

Serverless compute tier

Service Tiers

General purpose / Standard

Business Critical / Premium

Hyperscale

Deployment of an Azure SQL Database

Elastic Pools

What are SQL Elastic Pools?

Choosing the correct pool size

Creating a New Pool

Manage Pools

Module 5: Azure Storage Service

Azure Storage Account

Features of Azure Storage Service

Introduction to Blob Storage Service

Blob Storage Architecture

Blob Storage Features

Types of Blobs

Block Blobs

Append Blobs

Page Blobs

Creating a Storage Account

Azure Storage Performance Tiers

Standard

Premium Performance

Understanding Data Replication

LRS (Locally Redundant Storage)

ZRS (Zone-Redundant Storage)

GRS (Geo-Redundant Storage)

Azure Storage Access Tiers

Hot

Cold

Archive

Working with Containers and Blobs

Soft Delete

Azure Storage Explorer

Access blobs securely

Access Key

Account Shared Access Signature (SAS) Token

Service Shared Access Signature (SAS) Token

Azure Storage Scalability and Limits

Module 6: Azure Data Lake Storage Services

Introduction to Azure Data Lake

What is Data Lake?

What is Azure Data Lake?

Data Lake Architecture

Working with Azure Data Lake Storage Gen1

Features of Data Lake Storage Gen1

Understanding Azure Data Lake Gen2

Features of Data Lake Storage Gen2

Differences Between Gen1 & Gen2 Storage

Explore Data Lake Storages

Provisioning Data Lake Storage Gen1 Service

Provisioning Data Lake Storage Gen2 Service

Uploading Sample File

Using Azure Portal

Using Storage Explorer

Azure Data Factory:

Module 7: Azure Data Factory Introduction

What is Azure Data Factory (ADF)?

Azure Data Factory Key Components

Pipeline

Activity

Linked Service

Dataset

Integration Runtime

Triggers

Data Flows

Create Resource Group

Create Storage Account

Creation of Azure Data Factory Service

Module 8: Working with Copy Activity

Understanding Azure Data Factory UI

Copy Data from Blob Storage Service to Azure SQL Database

Copy data from file storage account to file storage account

Create Linked Services for various data stores and compute

Create Datasets that point to files and tables

Design Pipelines with various activities

Create SQL Server on Virtual Machines (On-Premises)

Define Copy Activity and its features

Copy Activity: Copy Behavior

Copy Activity: Data Integration Units

Copy Activity: User Properties

Copy Activity: Number of parallel copies

Working with Lookup Activity

Understanding of Each Activity

Filter Activity

Get Metadata Activity

Lift and Shift

Hosting Azure - SSIS Integration Runtime

Execute SSIS Packages from ADF

Monitoring Pipeline

Debug Pipeline

Trigger pipeline manually

Monitor pipeline

Trigger pipeline on schedule

Module 9: Practical Scenarios and Use Cases

ADF_PracticeSession1_Blob_To_Blob

ADF_PracticeSession2_CopyActivity_Prefix_Wildcard_FilePath_Blob_To_Blob

ADF_PracticeSession3_Blob_To_Azure_SQLDB

ADF_PracticeSession4_Blob_To_Azure_SQLDB

ADF_PracticeSession5_Dataset_Parameters_Blob_To_Azure_SQLDB

ADF_PracticeSession6_Blob_To_ADLS_Gen2

ADF_PracticeSession7_ADLS_Gen1_To_ADLS_Gen2

ADF_PracticeSession8_Pipeline_Dataset_LinkedService_Parameters

ADF_PracticeSession9_FilteringFileFormats_Getmetadata_Filter_ForEach_Copy_Activity

ADF_PracticeSession10_FilteringFileFormats_Getmetadata_Filter_ForEach_Copy_Activity

ADF_PracticeSession11_BulkCopy_Tables_Files

ADF_PracticeSession12_Container_Parameterization_Blob_To_Blob_Storage

ADF_PracticeSession13_ExecuteCopyActivity_BasedOnFileCount

ADF_PracticeSession14_StoredProcedures_Parameters

ADF_PracticeSession15_CopyActivity_CustomSQL_Queries_StoredProcedures

ADF_PracticeSession16_Pipeline_Audit_Log

ADF_PracticeSession17_Copybehaviour

ADF_PracticeSession18_CSV_To_JSON_Format

ADF_PracticeSession19_Copy_JSON_File_To_AzureSQL

ADF_PracticeSession20_Add_AdditionalColumns_WhileCopyingData

ADF_PracticeSession21_CopyDataTool

ADF_PracticeSession22_Custom_Email_Notification

ADF_PracticeSession23_AzureKeyVault_Integration

ADF_PracticeSession24_Incremental_Load

ADF_PracticeSession25_Integration_Runtime

ADF_PracticeSession26_On-Premise_SQLServer_ADLS_Gen2

ADF_PracticeSession27_On-Premise_FileSystem_ADLS_Gen2

ADF_PracticeSession28_REST_API_Integration

ADF_PracticeSession29_CosmosDB_Introduction

ADF_PracticeSession30_Eventbased_Trigger

ADF_PracticeSession31_Scheduled_Trigger

ADF_PracticeSession32_TumblingWindow_Trigger

ADF_PracticeSession33_Blob_SQLDB_Executepipeline_Activity

ADF_PracticeSession34_SQLDB_BLOB_Overwrite_Append_Mode

ADF_PracticeSession36_Dataflows_Introduction

ADF_PracticeSession37_Dataflows_Select_Filter_DerivedColumn_Transformation

ADF_PracticeSession38_Dataflows_Select_DerivedColumn_Aggregator_Sort_Transformation

ADF_PracticeSession39_Dataflows_ConditionalSplit_Transformation

ADF_PracticeSession40_Dataflows_Join_Transformation

ADF_PracticeSession41_Dataflows_Union_Transformation

ADF_PracticeSession42_Dataflows_Lookup_Transformation

ADF_PracticeSession43_Dataflows_Exists_Transformation

ADF_PracticeSession44_Dataflows_Rank_Transformation

ADF_PracticeSession45_Dataflows_Pivot_Transformation

ADF_PracticeSession46_Dataflows_UnPivot_Transformation

ADF_PracticeSession47_Dataflows_SurrogateKey_Transformation

ADF_PracticeSession48_Dataflows_AlterRow_Transformation

ADF_PracticeSession49_Remove Duplicate rows using data flows

ADF_PracticeSession50_Slowly Changing Dimension Type1 (SCD1) with Hash Key Function

Module 10: Assignments & Case Studies

ADF_Azure_HDInsight Integration

ADF_Azure_HDInsight with Spark Cluster

ADF_Azure_Databricks Integration

Azure Databricks:

Module 11: Introduction to Azure Databricks

Introduction to Databricks

Azure Databricks Architecture

Azure Databricks Main Concepts

Module 12: Databricks Integration with Azure Blob Storage

Read data from Blob Storage and Creating Blob mount point

Module 13: Databricks Integration with Azure Data Lake Storage Gen2

Reading files from Azure Data Lake Storage Gen2

Azure Synapse Analytics:

Module 14: Introduction to Azure Synapse

Technical requirements

Introducing the components of Azure Synapse

Creating a Synapse Workspace

Understanding Azure Data Lake

Exploring Synapse Studio

Module 15: Consideration for Your Compute Environment

Technical requirements

Introducing SQL Pool

Creating SQL Pool

Understanding Synapse SQL Pool

Architecture and components

Examining DWUs

Understanding distribution in Synapse SQL Pool

Understanding partitions in Synapse SQL Pool

Using temporary tables in Synapse SQL Pool

Discovering the benefits of Synapse SQL Pool

Understanding Synapse SQL on demand

SQL on-demand architecture and components

Learning about the benefits of Synapse SQL on-demand

Module 16: Bringing Your Data to Azure Synapse

Technical requirements

Using Synapse pipelines to import data

Bringing data to your Synapse SQL Pool using Copy Data tool

Using Azure Data Factory to import data

Using SQL Server Integration Services to import data

Module 17: Using Synapse Pipelines to Orchestrate Your Data

Technical requirements

Introducing Synapse pipelines

Integration runtime

Activities

Pipelines

Triggers

Creating linked services

Defining source and target

Using various activities in Synapse pipelines

Scheduling Synapse pipelines

Creating pipelines using samples