Exam DP-203: Data Engineering on
Microsoft Azure Crash Course
Tim Warner
Tim Warner
• Based in Nashville, TN, US
• Central time zone
• MCT, MVP
• Twitter: @TechTrainerTim
• Badge:
TechTrainerTim.com
Day 1 of 2 Agenda
• Introduction
• Design and implement data storage (40-45%)
• Design and implement data security (10-15%)
Day 2 of 2 Agenda
• Content catch-up
• Design and develop data processing (25-30%)
• Monitor and optimize data storage and data processing (10-15%)
• Exam DP-203 strategy
Course Materials
Course Expectations
• We'll learn by doing – at least 80 percent demo
• Case study approach
• Please review the recordings…several times!
• 10-minute break at midpoint
• I’m here to answer your questions – take advantage of this
• Use the Q&A panel
Session Recordings
Mobile Browser: learning.oreilly.com
O'Reilly Mobile App
What is an Azure Data Engineer?
• Design and implement the management, monitoring, security, and privacy of data using the full stack of data services
• “Builds and tunes data pipelines”
• “Implements, monitors, and optimizes data platforms”
• “Has solid knowledge of SQL, Python, or Scala”
• The Azure Data Scientist consumes the data the Engineer provides
Azure Data Engineer Associate
1-year validity period
Exam DP-203: Data Engineering on Microsoft Azure
Azure Data Fundamentals
DP-900
Azure Data Scientist Associate
DP-100
Azure Data Analyst Associate
DA-100
Azure Cosmos DB Developer
DP-420
Tim's Certification Study Model
Thank you!
• Course materials: timw.info/dp203
• Twitter: @TechTrainerTim
• Pluralsight: timw.info/ps
• Web: timw.info
Data Fundamentals
Data Types
• Structured (table)
• Semi-structured
• Unstructured
Data Workload Types
Online Transactional Processing (OLTP)
Customer
CustomerID CustomerName CustomerPhone
Orders
OrderID CustomerID OrderDate
Online Analytical Processing (OLAP)
Data Processing Types
Data Processing
Raw data
Data processing: Functions, Cognitive Services, Databricks, other tools
Cleaned and transformed data
ETL
Extract
Discard sensitive data
Transform
Basic filtering and
transformations
Load
Azure Data Factory
Azure Stream Analytics
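The ETL flow above can be sketched in a few lines of plain Python. This is only an illustration of the order of operations (extract, then transform, then load), not how Azure Data Factory or Azure Stream Analytics implement it. The `country` column, the filter rule, and the choice of `phone` as the sensitive column are all assumptions made up for the demo.

```python
import csv
import io

# Hypothetical source data; the country column is invented for the demo.
RAW_CSV = """customer_id,name,phone,country
100,Muisto Linna,XXX-XXX-XXXX,FI
101,Noam Maoz,XXX-XXX-XXXX,IL
102,Vanja Matkovic,XXX-XXX-XXXX,HR
"""

SENSITIVE_COLUMNS = {"phone"}  # assumption: what counts as sensitive here


def extract(text):
    """Extract: read rows from the source (an in-memory CSV here)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: basic filtering, and discard sensitive columns before loading."""
    cleaned = []
    for row in rows:
        if row["country"] == "HR":  # arbitrary filter rule for the demo
            continue
        cleaned.append({k: v for k, v in row.items() if k not in SENSITIVE_COLUMNS})
    return cleaned


def load(rows, target):
    """Load: append the transformed rows to the target store (a list here)."""
    target.extend(rows)
    return target


warehouse = load(transform(extract(RAW_CSV)), [])
print(warehouse)
```

Because the transform runs before the load, the sensitive column never reaches the target store.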
ELT
Extract
Load Transform
Complex
processing
Azure Data Factory
Azure Synapse
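A minimal sketch of the ELT order of operations, using Python's built-in sqlite3 module as a stand-in for the analytical store. The raw rows are loaded untouched first, and the target engine then does the transformation in SQL. The table names, the order dates, and the aggregation are assumptions made up for the demo; this is not Azure Synapse code.

```python
import sqlite3

# Hypothetical extracted rows; everything arrives as text.
raw_rows = [
    ("AD100", "101", "2021-01-05"),
    ("AD101", "101", "2021-01-09"),
    ("AX103", "103", "2021-02-11"),
]

conn = sqlite3.connect(":memory:")  # stand-in for the analytical store

# Load: land the extracted rows untouched in a staging table.
conn.execute(
    "CREATE TABLE staging_orders (order_id TEXT, customer_id TEXT, order_date TEXT)"
)
conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", raw_rows)

# Transform: the target engine does the processing after the load.
conn.execute(
    """
    CREATE TABLE orders_per_customer AS
    SELECT CAST(customer_id AS INTEGER) AS customer_id,
           COUNT(*) AS order_count
    FROM staging_orders
    GROUP BY customer_id
    """
)

result = conn.execute(
    "SELECT customer_id, order_count FROM orders_per_customer ORDER BY customer_id"
).fetchall()
print(result)
```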
Data Analytics
Data sources:
• On-premises data: SQL Server, Oracle, file shares, SAP
• Cloud data: Azure, AWS, GCP
• SaaS data: Salesforce, Dynamics
Stages: Data ingestion, Data storage, Data processing, Data visualization
Non-Binary Data Formats
• CSV
  • Good for bandwidth-sensitive data loads
• JSON
  • Clear, structured format with optional validation
Binary Data Formats
• Optimized for splitting across compute nodes
• Parquet, ORC: Columnar store
  • Fast read performance (compression) for analytical workloads
• Avro: Row-based store that keeps its schema as JSON
  • Schematized
  • Optimized for write performance
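To see why columnar layouts suit analytical reads and row layouts suit writes, here is a small pure-Python illustration. It does not use Parquet, ORC, or Avro libraries; the record fields and amounts are hypothetical, and the two layouts are modelled with ordinary lists and dictionaries.

```python
# Row-oriented layout: one complete record per entry (the Avro-style shape).
records = [
    {"order_id": "AD100", "customer_id": 101, "amount": 25.0},
    {"order_id": "AD101", "customer_id": 101, "amount": 40.0},
    {"order_id": "AX103", "customer_id": 103, "amount": 35.0},
]


def to_columnar(rows):
    """Pivot rows into one list per column (the Parquet/ORC-style shape)."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


columnar = to_columnar(records)

# Analytical read: only the "amount" column is touched.
total_amount = sum(columnar["amount"])

# Write: appending one whole record is a single operation in the row layout,
# while the columnar layout would need one append per column.
records.append({"order_id": "AS104", "customer_id": 103, "amount": 10.0})

print(total_amount)
print(list(columnar))
```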
Blob Storage and
Data Lake
Azure Blob Storage
Block blobs
• Have a maximum size of 4.7 TB
• Best for storing large, discrete, binary objects that change infrequently
• Each individual block can store up to 100 MB of data
• A block blob can contain up to 50,000 blocks
Page blobs
• Can hold up to 8 TB of data
• Are organized as a collection of fixed-size 512-byte pages
• Used to implement virtual disk storage for virtual machines
Append blobs
• The maximum size is just over 195 GB
• Are block blobs optimized for append operations
• Each individual block can store up to 4 MB of data
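The maximum sizes quoted above follow from block count times block size. The arithmetic below is a quick check using the slide's own figures, which may not match current service limits. Two assumptions are made: the units are binary (MiB, GiB, TiB), and an append blob has the same 50,000-block limit as a block blob.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_BLOCKS = 50_000  # stated for block blobs; assumed here for append blobs too

# Block blob: 50,000 blocks of up to 100 MB each.
block_blob_max = MAX_BLOCKS * 100 * MIB

# Append blob: 50,000 blocks of up to 4 MB each.
append_blob_max = MAX_BLOCKS * 4 * MIB

print(f"Block blob:  {block_blob_max / TIB:.2f} TiB")
print(f"Append blob: {append_blob_max / GIB:.1f} GiB")
```

Under those assumptions the results land at roughly 4.77 TiB and 195.3 GiB, in line with the "4.7 TB" and "just over 195 GB" figures on the slide.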
Blob Storage
ADLS Gen 2
• A repository of data for your Modern Data Warehouse
• Organises data into directories for improved file access
• Supports POSIX and RBAC permissions
• Compatible with the Hadoop Distributed File System (HDFS)
Store: Azure Data Lake Storage
High-performance data lake available in all 54 Azure regions
Data Lake Storage Gen 2
Azure Data Lake Storage Gen 2
Access Tiers & Lifecycle Management
Azure SQL Products
Relational Database Tables
Customers
CustomerID  CustomerName     CustomerPhone
100         Muisto Linna     XXX-XXX-XXXX
101         Noam Maoz        XXX-XXX-XXXX
102         Vanja Matkovic   XXX-XXX-XXXX
103         Qamar Mounir     XXX-XXX-XXXX
104         Zhenis Omar      XXX-XXX-XXXX
105         Claude Paulet    XXX-XXX-XXXX
106         Alex Pettersen   XXX-XXX-XXXX
107         Francis Ribeiro  XXX-XXX-XXXX
• Data is stored in a table
• A table consists of rows and columns
• All rows have the same number of columns
• Each column is defined by a datatype
ACID Principle
Normalization
Customers
CustomerID  CustomerName    CustomerPhone
100         Muisto Linna    XXX-XXX-XXXX
101         Noam Maoz       XXX-XXX-XXXX
102         Vanja Matkovic  XXX-XXX-XXXX
103         Qamar Mounir    XXX-XXX-XXXX
104         Zhenis Omar     XXX-XXX-XXXX
105         Claude Paulet   XXX-XXX-XXXX
106         Alex Pettersen  XXX-XXX-XXXX
Orders
OrderID  CustomerName   CustomerPhone
AD100    Noam Maoz      XXX-XXX-XXXX
AD101    Noam Maoz      XXX-XXX-XXXX
AD102    Noam Maoz      XXX-XXX-XXXX
AX103    Qamar Mounir   XXX-XXX-XXXX
AS104    Qamar Mounir   XXX-XXX-XXXX
AR105    Claude Paulet  XXX-XXX-XXXX
MK106    Muisto Linna   XXX-XXX-XXXX
Data is normalized to:
• Reduce storage
• Avoid data duplication
• Improve data quality
Table Relationships
Customers
CustomerID  CustomerName    CustomerPhone
100         Muisto Linna    XXX-XXX-XXXX
101         Noam Maoz       XXX-XXX-XXXX
102         Vanja Matkovic  XXX-XXX-XXXX
103         Qamar Mounir    XXX-XXX-XXXX
104         Zhenis Omar     XXX-XXX-XXXX
105         Claude Paulet   XXX-XXX-XXXX
106         Alex Pettersen  XXX-XXX-XXXX
Orders
OrderID  CustomerID  SalesPersonID
AD100    101         200
AD101    101         200
AD102    101         200
AX103    103         201
AS104    103         201
AR105    105         200
MK106    105         201
In a normalized database schema:
• Primary keys and foreign keys are used to define relationships
• No data duplication exists (other than key values) in 3rd Normal Form (3NF)
• Data is retrieved by joining tables together in a query
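The relationship between the two tables can be tried out with Python's built-in sqlite3 module. This is a minimal sketch using a subset of the rows shown above. SQLite stands in for a relational engine here; the column types and the test insert with CustomerID 999 are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Primary key on Customers, foreign key from Orders back to Customers.
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE Orders (
        OrderID TEXT PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers (CustomerID),
        SalesPersonID INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(101, "Noam Maoz"), (103, "Qamar Mounir"), (105, "Claude Paulet")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [("AD100", 101, 200), ("AX103", 103, 201), ("AR105", 105, 200)],
)

# Data is retrieved by joining the tables on the key columns.
rows = conn.execute(
    """
    SELECT o.OrderID, c.CustomerName
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
    """
).fetchall()
print(rows)

# The foreign key rejects an order for a customer that does not exist.
try:
    conn.execute("INSERT INTO Orders VALUES ('ZZ999', 999, 200)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```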
SQL Statement Categories
DML: Data Manipulation Language
• Used to query and manipulate data
• SELECT, INSERT, UPDATE, DELETE
DDL: Data Definition Language
• Used to define database objects
• CREATE, ALTER, DROP, RENAME
DCL: Data Control Language
• Used to manage security permissions
• GRANT, REVOKE, DENY
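As a study aid, the three categories can be expressed as a small lookup keyed on the statement's leading keyword. The `classify` helper below is hypothetical and deliberately naive: it looks only at the first word and covers only the keywords listed on the slide.

```python
# Keyword sets taken from the slide.
CATEGORIES = {
    "DML": {"SELECT", "INSERT", "UPDATE", "DELETE"},
    "DDL": {"CREATE", "ALTER", "DROP", "RENAME"},
    "DCL": {"GRANT", "REVOKE", "DENY"},
}


def classify(statement):
    """Return 'DML', 'DDL', or 'DCL' based on the leading keyword, else None."""
    words = statement.strip().split(None, 1)
    if not words:
        return None
    keyword = words[0].upper()
    for category, keywords in CATEGORIES.items():
        if keyword in keywords:
            return category
    return None


for sql in (
    "select * from Customers",
    "DROP TABLE Orders",
    "GRANT SELECT ON Customers TO analyst",
):
    print(classify(sql), "<-", sql)
```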
Azure Synapse
PolyBase
Data Warehouse Star Schema
Data Warehouse Snowflake Schema
Azure Synapse
Azure Synapse SQL Pool (DW) Architecture
Synapse SQL Pool Types
Azure Synapse Table Distribution Modes
https://timw.info/0jl
Slowly Changing Dimensions (SCD)
Azure Databricks
Lambda Architecture
Lambda Architecture with Databricks
Kappa Architecture
Kappa Architecture with Databricks
Data Security
Network security
Securing your network from attacks and unauthorized access is an important part of any architecture
Internet protection
Assess the resources that are internet-facing, and allow inbound and outbound communication only where necessary. Make sure you identify all resources that allow inbound network traffic of any type
Firewalls
To provide inbound protection at the perimeter, there are several choices:
• Azure Firewall
• Azure Application Gateway
• Azure Storage Firewall
DDoS protection
The Azure DDoS Protection service protects your Azure applications by scrubbing traffic at the Azure network edge before it can impact your service’s availability
Network security groups
Network Security Groups (NSGs) allow you to filter network traffic to and from Azure resources in an Azure virtual network. An NSG can contain multiple inbound and outbound security rules
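To make the idea of multiple inbound and outbound rules concrete, here is a heavily simplified model of rule evaluation. It assumes rules are checked in priority order with the lowest number first and that the first match wins. Real NSG rules also match on source, destination, protocol, and port ranges, and NSGs include default rules, none of which is modelled here. The `Rule` class and `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number is evaluated first (assumption of this sketch)
    direction: str  # "Inbound" or "Outbound"
    port: int
    action: str     # "Allow" or "Deny"


def evaluate(rules, direction, port, default="Deny"):
    """Return the action of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.direction == direction and rule.port == port:
            return rule.action
    return default


rules = [
    Rule(200, "Inbound", 1433, "Deny"),
    Rule(100, "Inbound", 1433, "Allow"),
    Rule(100, "Outbound", 443, "Allow"),
]

print(evaluate(rules, "Inbound", 1433))  # priority 100 beats priority 200
print(evaluate(rules, "Inbound", 22))    # no matching rule, falls back to default
```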
Identity and access
Authentication
This is the process of establishing the identity of a person or service looking to access a resource. Azure Active Directory is a cloud-based identity service that provides this capability
Authorization
This is the process of establishing what level of access an authenticated person or service has. It specifies what data they're allowed to access and what they can do with it. Azure Active Directory also provides this capability
Azure Active Directory features
Single sign-on
Enables users to remember only one ID and one password to access multiple applications
Apps & device management
You can manage your cloud and on-premises apps and devices, and the access to your organization's resources
Identity services
Manage Business-to-Business (B2B) identity services and Business-to-Customer (B2C) identity services
Encryption
Encryption at rest
Data at rest is the data that has been stored on a physical medium. This could be data stored on the disk of a server, data stored in a database, or data stored in a storage account
Encryption in transit
Data in transit is the data actively moving from one location to another, such as across the internet or through a private network. Secure transfer can be handled by several different layers
Encryption on Azure
Raw encryption
Enables the encryption of:
• Azure Storage
• VM disks
• Disk Encryption
Database encryption
Enables the encryption of databases using:
• Transparent Data Encryption
Encrypting secrets
Azure Key Vault is a centralized cloud service for storing your application secrets
Azure SQL Database Firewall Rules
Azure SQL Database DDM
Azure SQL Database Always Encrypted
Azure Data Factory
Power BI
What are data streams
Data streams:
In the context of analytics, data streams are event data generated by sensors or other sources that can be analyzed by another technology
Data stream processing approach:
There are two approaches. Reference data is streaming data that is collected over time and persisted in storage as static data. In contrast, streaming data has relatively low storage requirements and is processed by running computations in sliding windows
Data streams are used to:
Analyze data:
Continuously analyze data to detect issues and understand or respond to them
Understand systems:
Understand component or system behavior under various conditions to fuel further enhancements of said system
Trigger actions:
Trigger specific actions when certain thresholds are identified
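A computation over a sliding window, combined with a threshold trigger, can be sketched in plain Python. The window size, the threshold, and the heart-rate readings are hypothetical values for the demo; a streaming engine would apply the same idea to events as they arrive.

```python
from collections import deque


def sliding_window_alerts(readings, window_size, threshold):
    """Yield-free sketch: return (index, average) wherever the window average exceeds threshold."""
    window = deque(maxlen=window_size)  # oldest reading drops out automatically
    alerts = []
    for index, value in enumerate(readings):
        window.append(value)
        if len(window) == window_size:
            average = sum(window) / window_size
            if average > threshold:
                alerts.append((index, average))
    return alerts


heart_rates = [70, 72, 75, 110, 120, 125, 80]
alerts = sliding_window_alerts(heart_rates, window_size=3, threshold=100)

for index, average in alerts:
    print(f"reading {index}: window average {average:.1f} is above the threshold")
```

Only the last `window_size` readings are held in memory at any time, which reflects the low storage requirement of streaming data.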
Event processing
The process of consuming data streams, analyzing them, and deriving actionable insights from them is called event processing. It has three distinct components:
Event producer
Examples include sensors or processes that generate data continuously, such as a heart rate monitor or a highway toll lane sensor
Event processor
An engine that consumes event data streams and derives insights from them. Depending on the problem space, event processors either process one incoming event at a time (such as a heart rate monitor) or process multiple events at a time (such as a highway toll lane sensor)
Event consumer
An application that consumes the data and takes specific action based on the insights. Examples of event consumers include alert generation, dashboards, or even sending data to another event processing engine
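The three components can be wired together in a toy pipeline. All function names, the lane identifiers, the counts, and the "busy" threshold below are hypothetical; the point is only the producer, processor, consumer hand-off, with the processor working on multiple events at a time as in the toll lane example.

```python
def toll_lane_sensor():
    """Event producer: emits (lane, vehicle_count) events."""
    yield from [("lane-1", 3), ("lane-2", 5), ("lane-1", 4), ("lane-2", 1)]


def count_by_lane(events):
    """Event processor: aggregates multiple events into per-lane totals."""
    totals = {}
    for lane, count in events:
        totals[lane] = totals.get(lane, 0) + count
    return totals


def dashboard(totals, busy_threshold):
    """Event consumer: turns the insights into messages for a dashboard."""
    return [
        f"{lane} is busy ({total} vehicles)"
        for lane, total in sorted(totals.items())
        if total >= busy_threshold
    ]


totals = count_by_lane(toll_lane_sensor())
messages = dashboard(totals, busy_threshold=7)
print(totals)
print(messages)
```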
Processing events with Azure Stream Analytics
Microsoft Azure Stream Analytics is an event processing engine. It enables the consumption and analysis of high volumes of streaming data in real time
Source: Sensors, Systems, Applications
Ingestion: Event Hubs, IoT Hub, Azure Blob Storage
Analytical engine: Stream Analytics Query Language, .NET SDK
Destination: Azure Data Lake, Cosmos DB, SQL Database, Blob Storage, Power BI
Create an Event Hub
Create an event hub namespace
1. In the Azure portal, select NEW, type Event Hubs, and then select Event Hubs from the resulting search. Then select Create
2. Provide a name for the event hub namespace, and then create a resource group. Specify xx-name-eh and xx-name-rg respectively, where xx represents your initials, to ensure uniqueness of the Event Hub name and Resource Group name
3. Click the checkbox to Pin to the dashboard, then select the Create button
Create an event hub
1. After the deployment is complete, click the xx-name-eh event hub namespace on the dashboard
2. Then, under Entities, select Event Hubs
3. To create the event hub, select the + Event Hub button. Provide the name socialstudy-eh, and then select Create
4. To grant access to the event hub, we need to create a shared access policy. Select the socialstudy-eh event hub when it appears, and then, under Settings, select Shared access policies
5. Under Shared access policies, create a policy with MANAGE permissions by selecting + Add. Give the policy the name xx-name-eh-sap, check MANAGE, and then select Create
6. Select your new policy after it has been created, and then select the copy button for the CONNECTION STRING – PRIMARY KEY entity
7. Paste the CONNECTION STRING – PRIMARY KEY entity into Notepad; it is needed later in the exercise
8. Leave all windows open
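Before using the copied connection string later in the exercise, it can help to check that it has the parts you expect. The sketch below assumes the string is a semicolon-separated list of key=value pairs with keys such as Endpoint, SharedAccessKeyName, SharedAccessKey, and EntityPath; verify this against the string you actually copied. The example value is a placeholder, not a real key, and `parse_connection_string` is a hypothetical helper.

```python
def parse_connection_string(connection_string):
    """Split 'Key=value;Key=value' into a dict, keeping '=' inside values."""
    parts = {}
    for segment in connection_string.split(";"):
        if not segment:
            continue
        key, _, value = segment.partition("=")  # split on the first '=' only
        parts[key] = value
    return parts


# Placeholder string built from the names used in the steps above.
example = (
    "Endpoint=sb://xx-name-eh.servicebus.windows.net/;"
    "SharedAccessKeyName=xx-name-eh-sap;"
    "SharedAccessKey=PLACEHOLDER_KEY=;"
    "EntityPath=socialstudy-eh"
)

parsed = parse_connection_string(example)
print(parsed["SharedAccessKeyName"])
print(parsed["EntityPath"])
```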
Azure Stream Analytics workflow
Complex event processing of Stream Data in Azure
Input Adapter → Complex Event Processor → Output Adapter
Start a Stream Analytics Job
Azure Data Factory components
• Pipeline (schedule, monitor, manage): is a logical grouping of activities
• Activity (e.g. Hive, stored procedure, copy): consumes and produces datasets, and runs on a linked service
• Dataset (e.g. table, file): represents data item(s) stored in a linked service
• Linked service (e.g. SQL Server, Hadoop cluster)
• Also shown: control flow, parameters, integration runtime
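The relationships between these components can be modelled with a few plain Python dataclasses. This is a study aid only, not the Azure Data Factory SDK or its JSON schema; every class, field, and instance name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkedService:
    name: str
    kind: str  # e.g. "SQL Server", "Hadoop cluster"


@dataclass
class Dataset:
    name: str
    stored_in: LinkedService  # a dataset represents data stored in a linked service


@dataclass
class Activity:
    name: str
    runs_on: LinkedService
    consumes: List[Dataset] = field(default_factory=list)
    produces: List[Dataset] = field(default_factory=list)


@dataclass
class Pipeline:
    name: str
    activities: List[Activity] = field(default_factory=list)  # logical grouping


sql_server = LinkedService("OnPremSql", "SQL Server")
hadoop = LinkedService("Cluster", "Hadoop cluster")
source = Dataset("CustomersTable", stored_in=sql_server)
sink = Dataset("CustomersSummary", stored_in=hadoop)
hive = Activity("SummarizeCustomers", runs_on=hadoop, consumes=[source], produces=[sink])
pipeline = Pipeline("DailyLoad", activities=[hive])

for activity in pipeline.activities:
    inputs = [d.name for d in activity.consumes]
    outputs = [d.name for d in activity.produces]
    print(activity.name, inputs, "->", outputs, "on", activity.runs_on.kind)
```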
Component icons: Linked Service (Data Lake Store, Azure Databricks), Dataset, Activities, Pipeline, Triggers, Parameters (@), Integration Runtime (IR), Control Flow (CF)
Azure Monitor
Data Pipelines
Azure Diagnostics
Log Analytics
Lambda architectures from a real time mode perspective
Speed Layer:
The speed layer processes data streams in real or near real time. This works well when the aim is to minimize the latency between data ingestion and analysis:
1. New data ingested from sources
4. Real time views of the data created
Serving Layer:
The serving layer is optional in the real-time architecture and acts as the storage output of either the batch or speed layer that client applications use to access the results of the datasets
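A toy version of the layers can show how the serving layer combines batch and speed output for client applications. The sensor names, the values, and the choice to merge by summing are assumptions for the demo; real implementations differ in how views are stored and merged.

```python
from collections import Counter


def build_view(events):
    """Aggregate (key, value) events into a per-key total."""
    view = Counter()
    for key, value in events:
        view[key] += value
    return view


# Batch layer: recomputed periodically over the full history.
batch_events = [("sensor-a", 10), ("sensor-b", 4), ("sensor-a", 6)]
batch_view = build_view(batch_events)

# Speed layer: covers only new data that the batch run has not seen yet.
recent_events = [("sensor-a", 1), ("sensor-c", 2)]
speed_view = build_view(recent_events)

# Serving layer: exposes the combined result to client applications.
serving_view = dict(batch_view + speed_view)
print(serving_view)
```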
Architect a stream processing pipeline
with Azure Stream Analytics
Design a stream processing pipeline
with Azure Databricks
Automate an enterprise business
intelligence architecture
Exam DP-203
Item Types
Multiple Choice
Repeated Scenario
You need to move an Azure VM to another hardware host.
Solution: You redeploy the VM.
Does this solution meet the goal?
a. Yes
b. No
Repeated Scenario
You need to move an Azure VM to another hardware host.
Solution: You create a proximity placement group.
Does this solution meet the goal?
a. Yes
b. No
Select and Place
Build List and Reorder
Active Screen
Case Study
Performance-Based Lab
Microsoft Online Testing
Microsoft Online Testing Process
Copyright: What Creators and Users of Art Need to KnowMiriam Robeson
 
Luxury Artificial Plants Dubai | Plants in KSA, UAE | Shajara
Luxury Artificial Plants Dubai | Plants in KSA, UAE | ShajaraLuxury Artificial Plants Dubai | Plants in KSA, UAE | Shajara
Luxury Artificial Plants Dubai | Plants in KSA, UAE | ShajaraShajara Artificial Plants
 
April 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products NewsletterApril 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products NewsletterNathanBaughman3
 
Creative Ideas for Interactive Team Presentations
Creative Ideas for Interactive Team PresentationsCreative Ideas for Interactive Team Presentations
Creative Ideas for Interactive Team PresentationsSlidesAI
 
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...Björn Rohles
 
Evolution and Growth of Supply chain.pdf
Evolution and Growth of Supply chain.pdfEvolution and Growth of Supply chain.pdf
Evolution and Growth of Supply chain.pdfGutaMengesha1
 
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdfSOFTTECHHUB
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...BBPMedia1
 
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptx
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptxUnveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptx
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptxmy Pandit
 
How to Maintain Healthy Life style.pptx
How to Maintain  Healthy Life style.pptxHow to Maintain  Healthy Life style.pptx
How to Maintain Healthy Life style.pptxrdishurana
 
Team-Spandex-Northern University-CS1035.
Team-Spandex-Northern University-CS1035.Team-Spandex-Northern University-CS1035.
Team-Spandex-Northern University-CS1035.smalmahmud11
 

Recently uploaded (20)

Potato Flakes Manufacturing Plant Project Report.pdf
Potato Flakes Manufacturing Plant Project Report.pdfPotato Flakes Manufacturing Plant Project Report.pdf
Potato Flakes Manufacturing Plant Project Report.pdf
 
Falcon Invoice Discounting Setup for Small Businesses
Falcon Invoice Discounting Setup for Small BusinessesFalcon Invoice Discounting Setup for Small Businesses
Falcon Invoice Discounting Setup for Small Businesses
 
Vendors of country report usefull datass
Vendors of country report usefull datassVendors of country report usefull datass
Vendors of country report usefull datass
 
Did Paul Haggis Ever Win an Oscar for Best Filmmaker
Did Paul Haggis Ever Win an Oscar for Best FilmmakerDid Paul Haggis Ever Win an Oscar for Best Filmmaker
Did Paul Haggis Ever Win an Oscar for Best Filmmaker
 
Understanding UAE Labour Law: Key Points for Employers and Employees
Understanding UAE Labour Law: Key Points for Employers and EmployeesUnderstanding UAE Labour Law: Key Points for Employers and Employees
Understanding UAE Labour Law: Key Points for Employers and Employees
 
Unleash Data Power with EnFuse Solutions' Comprehensive Data Management Servi...
Unleash Data Power with EnFuse Solutions' Comprehensive Data Management Servi...Unleash Data Power with EnFuse Solutions' Comprehensive Data Management Servi...
Unleash Data Power with EnFuse Solutions' Comprehensive Data Management Servi...
 
falcon-invoice-discounting-a-premier-platform-for-investors-in-india
falcon-invoice-discounting-a-premier-platform-for-investors-in-indiafalcon-invoice-discounting-a-premier-platform-for-investors-in-india
falcon-invoice-discounting-a-premier-platform-for-investors-in-india
 
Special Purpose Vehicle (Purpose, Formation & examples)
Special Purpose Vehicle (Purpose, Formation & examples)Special Purpose Vehicle (Purpose, Formation & examples)
Special Purpose Vehicle (Purpose, Formation & examples)
 
IPTV Subscription UK: Your Guide to Choosing the Best Service
IPTV Subscription UK: Your Guide to Choosing the Best ServiceIPTV Subscription UK: Your Guide to Choosing the Best Service
IPTV Subscription UK: Your Guide to Choosing the Best Service
 
Copyright: What Creators and Users of Art Need to Know
Copyright: What Creators and Users of Art Need to KnowCopyright: What Creators and Users of Art Need to Know
Copyright: What Creators and Users of Art Need to Know
 
Luxury Artificial Plants Dubai | Plants in KSA, UAE | Shajara
Luxury Artificial Plants Dubai | Plants in KSA, UAE | ShajaraLuxury Artificial Plants Dubai | Plants in KSA, UAE | Shajara
Luxury Artificial Plants Dubai | Plants in KSA, UAE | Shajara
 
April 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products NewsletterApril 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products Newsletter
 
Creative Ideas for Interactive Team Presentations
Creative Ideas for Interactive Team PresentationsCreative Ideas for Interactive Team Presentations
Creative Ideas for Interactive Team Presentations
 
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...
Meaningful Technology for Humans: How Strategy Helps to Deliver Real Value fo...
 
Evolution and Growth of Supply chain.pdf
Evolution and Growth of Supply chain.pdfEvolution and Growth of Supply chain.pdf
Evolution and Growth of Supply chain.pdf
 
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf
12 Conversion Rate Optimization Strategies for Ecommerce Websites.pdf
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
 
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptx
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptxUnveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptx
Unveiling the Dynamic Gemini_ Personality Traits and Sign Dates.pptx
 
How to Maintain Healthy Life style.pptx
How to Maintain  Healthy Life style.pptxHow to Maintain  Healthy Life style.pptx
How to Maintain Healthy Life style.pptx
 
Team-Spandex-Northern University-CS1035.
Team-Spandex-Northern University-CS1035.Team-Spandex-Northern University-CS1035.
Team-Spandex-Northern University-CS1035.
 

warner-DP-203-slides.pptx

  • 1. Exam DP-203: Data Engineering on Microsoft Azure Crash Course • Tim Warner
  • 2. Tim Warner • Based in Nashville, TN, US • Central time zone • MCT, MVP • Twitter: @TechTrainerTim • Badge: TechTrainerTim.com
  • 3. Day 1 of 2 Agenda • Introduction • Design and implement data storage (40-45%) • Design and implement data security (10-15%)
  • 4. Day 2 of 2 Agenda • Content catch-up • Design and develop data processing (25-30%) • Monitor and optimize data storage and data processing (10-15%) • Exam DP-203 strategy
  • 5. Course Materials
  • 6. Course Expectations • We'll learn by doing – at least 80 percent demo • Case study approach • Please review the recordings…several times! • 10-minute break at midpoint • I’m here to answer your questions – take advantage of this • Use the Q&A panel
  • 7. Session Recordings
  • 8. Session Recordings
  • 9. Session Recordings
  • 10. Mobile Browser: learning.oreilly.com
  • 11. O'Reilly Mobile App
  • 12. What is an Azure Data Engineer? • Design and implement the management, monitoring, security, and privacy of data using the full stack of data services • “Builds and tunes data pipelines” • “Implements, monitors, and optimizes data platforms” • “Has solid knowledge of SQL, Python, or Scala” • The Azure Data Scientist consumes the data the Engineer provides
  • 13. Azure Data Engineer Associate • 1-year validity period • DP-203: Data Engineering on Microsoft Azure
  • 14. Azure Data Fundamentals DP-900
  • 15. Azure Data Scientist Associate DP-100
  • 16. Azure Data Analyst Associate DA-100
  • 17. Azure Cosmos DB Developer DP-420
  • 18. Tim's Certification Study Model
  • 19. Thank you! • Course materials: timw.info/dp203 • Twitter: @TechTrainerTim • Pluralsight: timw.info/ps • Web: timw.info
  • 20. Data Fundamentals
  • 21. Data Types • Structured • Table • Semi-structured • Unstructured
  • 22. Data Workload Types • Online Transactional Processing (OLTP): Customer (CustomerID, CustomerName, CustomerPhone); Orders (OrderID, CustomerID, OrderDate) • Online Analytical Processing (OLAP)
  • 23. Data Processing Types
  • 24. Data Processing • Raw data • Data processing: Functions, Cognitive Services, Databricks, other tools • Cleaned and transformed data
  • 25. ETL • Extract • Discard sensitive data • Transform • Basic filtering and transformations • Load • Azure Data Factory • Azure Stream Analytics
  • 26. ELT • Extract • Load • Transform • Complex processing • Azure Data Factory • Azure Synapse
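The difference between the ETL and ELT slides is mostly one of ordering, which a minimal Python sketch can make concrete. Everything here is an assumption for illustration: the sample rows, the `ssn` field standing in for sensitive data, and the plain lists standing in for a staging area and a warehouse. It does not use Azure Data Factory, Stream Analytics, or Synapse.

```python
from typing import Dict, List

Row = Dict[str, str]


def extract() -> List[Row]:
    # Hypothetical source rows; "ssn" stands in for sensitive data.
    return [
        {"customer": "Noam Maoz", "ssn": "000-00-0000", "amount": "12.50"},
        {"customer": "Qamar Mounir", "ssn": "000-00-0001", "amount": "-1"},
    ]


def drop_sensitive(rows: List[Row]) -> List[Row]:
    # Discard the sensitive column from every row.
    return [{k: v for k, v in row.items() if k != "ssn"} for row in rows]


def basic_filter(rows: List[Row]) -> List[Row]:
    # Basic filtering: keep only rows with a non-negative amount.
    return [row for row in rows if float(row["amount"]) >= 0]


def etl(target: List[Row]) -> None:
    # ETL: transform in flight, so only cleaned rows ever reach the target.
    target.extend(basic_filter(drop_sensitive(extract())))


def elt(staging: List[Row], target: List[Row]) -> None:
    # ELT: load the raw rows first, then transform inside the data store.
    staging.extend(extract())
    target.extend(basic_filter(drop_sensitive(staging)))


warehouse_etl: List[Row] = []
etl(warehouse_etl)

staging: List[Row] = []
warehouse_elt: List[Row] = []
elt(staging, warehouse_elt)
```

Both paths end with the same cleaned rows; the difference is that the ELT staging area still holds the raw data, including the sensitive column.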
  • 27. Data Analytics • On-premises data: SQL Server, Oracle, fileshares, SAP • Cloud data: Azure, AWS, GCP • SaaS data: Salesforce, Dynamics • Data ingestion • Data storage • Data processing • Data visualization
  • 28. Non-Binary Data Formats • CSV • Good for bandwidth-sensitive data loads • JSON • Clear, structured format with optional validation
  • 29. Binary Data Formats • Optimized for splitting across compute nodes • Parquet, ORC: Columnar store • Fast read performance (compression) for analytical workloads • Avro: Row-based store that includes JSON • Schematized • Optimized for write performance
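To illustrate why a columnar layout helps analytical reads and compression, here is a small sketch in plain Python. It does not read or write Parquet, ORC, or Avro; the sample rows are hypothetical, and run-length encoding is used only as a simple stand-in for the kind of compression that benefits from similar values sitting next to each other.

```python
from typing import Any, Dict, List, Tuple

# Row-based layout: one record per entry, convenient for appending whole rows.
rows: List[Dict[str, Any]] = [
    {"order_id": 1, "region": "EU", "amount": 10.0},
    {"order_id": 2, "region": "EU", "amount": 20.0},
    {"order_id": 3, "region": "US", "amount": 30.0},
]


def to_columnar(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    # Pivot rows into one list per column (assumes every row has the same keys).
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def run_length_encode(values: List[Any]) -> List[Tuple[Any, int]]:
    # Collapse runs of equal adjacent values into (value, count) pairs.
    encoded: List[List[Any]] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [(value, count) for value, count in encoded]


columns = to_columnar(rows)
total_amount = sum(columns["amount"])           # touches a single column only
region_runs = run_length_encode(columns["region"])
```

An aggregate such as `total_amount` only needs the `amount` column, and the repeated `region` values collapse into two runs, which is the intuition behind the slide's "fast read performance (compression)" bullet.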
  • 30. Blob Storage and Data Lake
  • 31. Azure Blob Storage • Block blobs: Maximum size of 4.7 TB. Best for storing large, discrete, binary objects that change infrequently. Each individual block can store up to 100 MB of data. A block blob can contain up to 50,000 blocks • Page blobs: Can hold up to 8 TB of data. Organized as a collection of fixed-size 512-byte pages. Used to implement virtual disk storage for virtual machines • Append blobs: Maximum size is just over 195 GB. A block blob that is optimized for append operations. Each individual block can store up to 4 MB of data
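The size limits on the Azure Blob Storage slide follow from block size multiplied by block count, which the short calculation below reproduces. Two assumptions are made here: the slide's MB, GB, and TB figures are treated as binary units (MiB, GiB, TiB), and the 50,000-block limit stated for block blobs is assumed to apply to append blobs as well. Limits depend on the storage service version, so treat these numbers as the slide's figures rather than current ones.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_BLOCKS = 50_000  # block count from the slide


def max_blob_bytes(block_size_mib: int, max_blocks: int = MAX_BLOCKS) -> int:
    # Largest blob that fits in max_blocks blocks of the given size.
    return block_size_mib * MIB * max_blocks


# Block blob: 50,000 blocks of 100 MiB each, roughly 4.77 TiB.
block_blob_tib = max_blob_bytes(100) / TIB

# Append blob: 50,000 blocks of 4 MiB each, just over 195 GiB.
append_blob_gib = max_blob_bytes(4) / GIB

# Page blob: number of 512-byte pages in 8 TiB.
page_blob_pages = (8 * TIB) // 512
```

The results line up with the slide: about 4.7 TB for block blobs and just over 195 GB for append blobs.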
  • 32. Blob Storage
  • 33. ADLS Gen 2 • A repository of data for your Modern Data Warehouse • Organises data into directories for improved file access • Supports POSIX and RBAC permissions • Compatible with the Hadoop Distributed File System (HDFS) • Store • Azure Data Lake Storage • High performance data lake available in all 54 Azure regions
  • 34. Data Lake Storage Gen 2
  • 35. Azure Data Lake Storage Gen 2
  • 36. Access Tiers & Lifecycle Management
  • 37. Azure SQL Products
  • 38. Relational Database Tables • Customers (CustomerID, CustomerName, CustomerPhone): 100 Muisto Linna XXX-XXX-XXXX; 101 Noam Maoz XXX-XXX-XXXX; 102 Vanja Matkovic XXX-XXX-XXXX; 103 Qamar Mounir XXX-XXX-XXXX; 104 Zhenis Omar XXX-XXX-XXXX; 105 Claude Paulet XXX-XXX-XXXX; 106 Alex Pettersen XXX-XXX-XXXX; 107 Francis Ribeiro XXX-XXX-XXXX • Data is stored in a table • A table consists of rows and columns • All rows have the same number of columns • Each column is defined by a datatype
  • 39. ACID Principle
  • 40. Normalization • Customers (CustomerID, CustomerName, CustomerPhone): 100 Muisto Linna XXX-XXX-XXXX; 101 Noam Maoz XXX-XXX-XXXX; 102 Vanja Matkovic XXX-XXX-XXXX; 103 Qamar Mounir XXX-XXX-XXXX; 104 Zhenis Omar XXX-XXX-XXXX; 105 Claude Paulet XXX-XXX-XXXX; 106 Alex Pettersen XXX-XXX-XXXX • Orders (OrderID, CustomerName, CustomerPhone): AD100 Noam Maoz XXX-XXX-XXXX; AD101 Noam Maoz XXX-XXX-XXXX; AD102 Noam Maoz XXX-XXX-XXXX; AX103 Qamar Mounir XXX-XXX-XXXX; AS104 Qamar Mounir XXX-XXX-XXXX; AR105 Claude Paulet XXX-XXX-XXXX; MK106 Muisto Linna XXX-XXX-XXXX • Data is normalized to: reduce storage, avoid data duplication, improve data quality
  • 41. Table Relationships • Customers (CustomerID, CustomerName, CustomerPhone): 100 Muisto Linna XXX-XXX-XXXX; 101 Noam Maoz XXX-XXX-XXXX; 102 Vanja Matkovic XXX-XXX-XXXX; 103 Qamar Mounir XXX-XXX-XXXX; 104 Zhenis Omar XXX-XXX-XXXX; 105 Claude Paulet XXX-XXX-XXXX; 106 Alex Pettersen XXX-XXX-XXXX • Orders (OrderID, CustomerID, SalesPersonID): AD100 101 200; AD101 101 200; AD102 101 200; AX103 103 201; AS104 103 201; AR105 105 200; MK106 105 201 • In a normalized database schema: primary keys and foreign keys are used to define relationships; no data duplication exists (other than key values) in 3rd Normal Form (3NF); data is retrieved by joining tables together in a query
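The Table Relationships slide can be tried out directly. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for an Azure SQL product (an assumption made so the example runs anywhere). Table and column names come from the slide; only a subset of the rows is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Customer details live in one place; Orders refers to them by key only.
conn.executescript(
    """
    CREATE TABLE Customers (
        CustomerID    INTEGER PRIMARY KEY,
        CustomerName  TEXT NOT NULL,
        CustomerPhone TEXT
    );
    CREATE TABLE Orders (
        OrderID       TEXT PRIMARY KEY,
        CustomerID    INTEGER NOT NULL REFERENCES Customers(CustomerID),
        SalesPersonID INTEGER
    );
    """
)

conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (101, "Noam Maoz", "XXX-XXX-XXXX"),
        (103, "Qamar Mounir", "XXX-XXX-XXXX"),
        (105, "Claude Paulet", "XXX-XXX-XXXX"),
    ],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        ("AD100", 101, 200),
        ("AD101", 101, 200),
        ("AX103", 103, 201),
        ("AR105", 105, 200),
    ],
)

# Data is retrieved by joining the tables on the key relationship.
order_rows = conn.execute(
    """
    SELECT o.OrderID, c.CustomerName
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
    """
).fetchall()
```

Because the customer name is stored once in `Customers`, changing it there is reflected in every joined order row without touching `Orders`.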
  • 42. SQL Statement Categories • DML (Data Manipulation Language): used to query and manipulate data; SELECT, INSERT, UPDATE, DELETE • DDL (Data Definition Language): used to define database objects; CREATE, ALTER, DROP, RENAME • DCL (Data Control Language): used to manage security permissions; GRANT, REVOKE, DENY
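As a quick study aid, the three categories on the slide can be written down as a lookup keyed on a statement's leading keyword. This is a minimal sketch: `classify` is a hypothetical helper, it only covers the keywords listed on the slide, and it does no real SQL parsing.

```python
CATEGORY_BY_KEYWORD = {
    # DML: query and manipulate data
    "SELECT": "DML", "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML",
    # DDL: define database objects
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL", "RENAME": "DDL",
    # DCL: manage security permissions
    "GRANT": "DCL", "REVOKE": "DCL", "DENY": "DCL",
}


def classify(statement: str) -> str:
    # Return the category of a statement based on its first keyword.
    words = statement.strip().split(None, 1)
    if not words:
        raise ValueError("empty statement")
    keyword = words[0].upper()
    if keyword not in CATEGORY_BY_KEYWORD:
        raise ValueError(f"keyword not covered by the slide: {keyword}")
    return CATEGORY_BY_KEYWORD[keyword]


examples = {
    "select * from Customers": classify("select * from Customers"),
    "DROP TABLE Orders": classify("DROP TABLE Orders"),
    "DENY SELECT ON Orders TO analyst": classify("DENY SELECT ON Orders TO analyst"),
}
```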
  • 43. Azure Synapse PolyBase
  • 44. Data Warehouse Star Schema
  • 45. Data Warehouse Snowflake Schema
  • 46. Azure Synapse
  • 47. Azure Synapse SQL Pool (DW) Architecture
  • 48. Synapse SQL Pool Types
  • 49. Azure Synapse Table Distribution Modes https://timw.info/0jl
  • 50. Azure Synapse Table Distribution Modes https://timw.info/0jl
  • 51. Slowly Changing Dimensions (SCD)
  • 52. Slowly Changing Dimensions (SCD)
  • 53. Slowly Changing Dimensions (SCD)
  • 54. Azure Databricks
  • 55. Azure Databricks
  • 56. Lambda Architecture
  • 57. Lambda Architecture with Databricks
  • 58. Kappa Architecture
  • 59. Kappa Architecture with Databricks
  • 60. Data Security
  • 61. Network security • Securing your network from attacks and unauthorized access is an important part of any architecture • Internet protection: Assess the resources that are internet-facing, and only allow inbound and outbound communication where necessary. Make sure you identify all resources that allow inbound network traffic of any type • Firewalls: To provide inbound protection at the perimeter, there are several choices: Azure Firewall, Azure Application Gateway, Azure Storage Firewall • DDoS protection: The Azure DDoS Protection service protects your Azure applications by scrubbing traffic at the Azure network edge before it can impact your service’s availability • Network security groups: Network Security Groups (NSGs) allow you to filter network traffic to and from Azure resources in an Azure virtual network. An NSG can contain multiple inbound and outbound security rules
  • 62. Identity and access • Authentication: The process of establishing the identity of a person or service looking to access a resource. Azure Active Directory is a cloud-based identity service that provides this capability • Authorization: The process of establishing what level of access an authenticated person or service has. It specifies what data they're allowed to access and what they can do with it. Azure Active Directory also provides this capability • Azure Active Directory features • Single sign-on: Enables users to remember only one ID and one password to access multiple applications • Apps & device management: You can manage your cloud and on-premises apps and devices and the access to your organization's resources • Identity services: Manage business-to-business (B2B) and business-to-customer (B2C) identity services
  • 63. Encryption • Encryption at rest: Data at rest is data that has been stored on a physical medium. This could be data stored on the disk of a server, data stored in a database, or data stored in a storage account • Encryption in transit: Data in transit is data actively moving from one location to another, such as across the internet or through a private network. Secure transfer can be handled by several different layers • Encryption on Azure • Raw encryption: Enables the encryption of Azure Storage, VM disks, Disk Encryption • Database encryption: Enables the encryption of databases using Transparent Data Encryption • Encrypting secrets: Azure Key Vault is a centralized cloud service for storing your application secrets
  • 64. Encryption
  • 65. Azure SQL Database Firewall Rules
  • 66. Azure SQL Database DDM
  • 67. Azure SQL Database Always Encrypted
  • 68. Azure Data Factory
  • 69. Power BI
  • 70. What are data streams • Data streams: In the context of analytics, data streams are event data generated by sensors or other sources that can be analyzed by another technology • Data stream processing approach: There are two approaches. Reference data is streaming data that can be collected over time and persisted in storage as static data. In contrast, streaming data has relatively low storage requirements, and computations run in sliding windows • Data streams are used to: • Analyze data: Continuously analyze data to detect issues and understand or respond to them • Understand systems: Understand component or system behavior under various conditions to fuel further enhancements of said system • Trigger actions: Trigger specific actions when certain thresholds are identified
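The "sliding windows" and "trigger actions" bullets can be illustrated with a small Python class that keeps only the events inside the current window, so storage stays low no matter how long the stream runs. This is a sketch under assumptions: `SlidingWindowAverage`, the readings, the 10-second window, and the threshold of 85 are all hypothetical, and timestamps are assumed to arrive in non-decreasing order. It is not Azure Stream Analytics code.

```python
from collections import deque
from typing import Deque, List, Tuple


class SlidingWindowAverage:
    """Average of the values seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, float]] = deque()  # (timestamp, value)
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        # Record the new event, evict expired ones, return the window average.
        self._events.append((timestamp, value))
        self._total += value
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, expired = self._events.popleft()
            self._total -= expired
        return self._total / len(self._events)


THRESHOLD = 85.0  # hypothetical alert threshold
window = SlidingWindowAverage(window_seconds=10)

readings = [(0, 60.0), (5, 70.0), (12, 110.0)]  # (seconds, value)
averages: List[float] = []
alerts: List[float] = []
for ts, value in readings:
    average = window.add(ts, value)
    averages.append(average)
    if average > THRESHOLD:
        alerts.append(ts)  # trigger an action when the threshold is crossed
```

By the third reading, the event at second 0 has fallen out of the window, so the average is computed over the two most recent values only.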
  • 71. Event processing • The process of consuming data streams, analyzing them, and deriving actionable insights from them is called event processing, and it has three distinct components: • Event producer: Examples include sensors or processes that generate data continuously, such as a heart rate monitor or a highway toll lane sensor • Event processor: An engine that consumes event data streams and derives insights from them. Depending on the problem space, event processors either process one incoming event at a time (such as a heart rate monitor) or process multiple events at a time (such as a highway toll lane sensor) • Event consumer: An application that consumes the data and takes specific action based on the insights. Examples of event consumers include alert generation, dashboards, or even sending data to another event processing engine
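The three components described on the Event processing slide map naturally onto a chain of Python generators. The sketch below is illustrative only: the heart rate values, the 120 bpm limit, and the function names are assumptions, and the processor handles one event at a time, as in the slide's heart rate monitor example.

```python
from typing import Iterable, Iterator, List


def heart_rate_producer() -> Iterator[int]:
    # Event producer: emits readings continuously (hypothetical fixed values).
    yield from [72, 75, 130, 74, 128]


def processor(events: Iterable[int], limit: int = 120) -> Iterator[str]:
    # Event processor: inspects one event at a time and derives an insight.
    for bpm in events:
        if bpm > limit:
            yield f"ALERT heart rate {bpm} bpm exceeds {limit}"


def consumer(insights: Iterable[str]) -> List[str]:
    # Event consumer: acts on the insights; here it simply collects alerts.
    return list(insights)


alerts = consumer(processor(heart_rate_producer()))
```

Because each stage only pulls the next event when it needs it, the chain processes readings as they arrive instead of waiting for the whole stream.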
  • 72. Processing events with Azure Stream Analytics • Microsoft Azure Stream Analytics is an event processing engine. It enables the consumption and analysis of high volumes of streaming data in real time • Source: Sensors, Systems, Applications • Ingestion: Event Hubs, IoT Hubs, Azure Blob Store • Analytical engine: Stream Analytics Query Language, .NET SDK • Destination: Azure Data Lake, Cosmos DB, SQL Database, Blob Store, Power BI
  • 73. Create an Event Hub • Create an event hub namespace: 1. In the Azure portal, select NEW, type Event Hubs, and then select Event Hubs from the resulting search. Then select Create. 2. Provide a name for the event hub, and then create a resource group. Specify xx-name-eh and xx-name-rg respectively, where xx represents your initials to ensure uniqueness of the Event Hub name and Resource Group name. 3. Click the checkbox to Pin to the dashboard, then select the Create button. • Create an event hub: 1. After the deployment is complete, click the xx-name-eh event hub on the dashboard. 2. Then, under Entities, select Event Hubs. 3. To create the event hub, select the + Event Hub button. Provide the name socialstudy-eh, and then select Create. 4. To grant access to the event hub, we need to create a shared access policy. Select the socialstudy-eh event hub when it appears, and then, under Settings, select Shared access policies. 5. Under Shared access policies, create a policy with MANAGE permissions by selecting + Add. Give the policy the name xx-name-eh-sap, check MANAGE, and then select Create. 6. Select your new policy after it has been created, and then select the copy button for the CONNECTION STRING – PRIMARY KEY entity. 7. Paste the CONNECTION STRING – PRIMARY KEY entity into Notepad; this is needed later in the exercise. 8. Leave all windows open
  • 74. Azure Stream Analytics workflow • Complex event processing of stream data in Azure • Input Adapter • Complex Event Processor • Output Adapter
  • 75. Start a Stream Analytics Job
  • 76. Azure Data Factory components • A pipeline (schedule, monitor, manage) is a logical grouping of activities • An activity (e.g. Hive, stored procedure, copy) consumes and produces datasets and runs on a linked service • A dataset (e.g. table, file) represents data item(s) stored in a linked service (e.g. SQL Server, Hadoop cluster) • Control flow • Parameters • Integration runtime
  • 77. Azure Data Factory components • Linked Service: Data Lake Store, Azure Databricks • Dataset • Activities • Pipeline • Triggers • Parameters (@) • Integration Runtime (IR) • Control Flow (CF)
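The relationships between the Data Factory components on the two slides above can be sketched as plain Python dataclasses. This is a hypothetical model for study purposes only: the class and field names are assumptions and do not correspond to the Azure Data Factory SDK or its JSON definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class LinkedService:
    # Connection to an external store or compute service.
    name: str
    kind: str


@dataclass(frozen=True)
class Dataset:
    # A named data item stored in a linked service.
    name: str
    stored_in: LinkedService


@dataclass
class Activity:
    # A processing step that consumes and produces datasets.
    name: str
    consumes: List[Dataset]
    produces: List[Dataset]
    runs_on: LinkedService


@dataclass
class Pipeline:
    # A logical grouping of activities, with optional parameters.
    name: str
    activities: List[Activity] = field(default_factory=list)
    parameters: Dict[str, str] = field(default_factory=dict)

    def linked_services(self) -> List[str]:
        # Names of every linked service the pipeline touches, sorted.
        names = set()
        for activity in self.activities:
            names.add(activity.runs_on.name)
            for dataset in activity.consumes + activity.produces:
                names.add(dataset.stored_in.name)
        return sorted(names)


lake = LinkedService("DataLakeStore", "Data Lake Store")
databricks = LinkedService("Databricks", "Azure Databricks")
raw = Dataset("raw_events", lake)
clean = Dataset("clean_events", lake)
transform = Activity("transform", consumes=[raw], produces=[clean], runs_on=databricks)
pipeline = Pipeline("daily_load", [transform], {"run_date": "2024-01-01"})
```

Here `pipeline.linked_services()` lists both the store holding the datasets and the compute service the activity runs on, which mirrors the "stored in" and "runs on" arrows of the component diagram.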
  • 78. Azure Monitor
  • 79. Data Pipelines
  • 80. Azure Diagnostics
  • 81. Log Analytics
  • 82. Lambda architectures from a real time mode perspective • Speed Layer: The speed layer processes data streams in real or near real time. This works well when the aim is to minimize the latency from data ingestion to analysis: 1. New data ingested from sources 4. Real time views of the data created • Serving Layer: The serving layer is optional in the real-time architecture and acts as the storage output of either the batch or speed layer, which client applications use to access the results of the data sets
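One common way to picture the layers is as two views over the same kind of event, combined when a client asks for results. The sketch below is a simplified illustration, not the architecture from the slides: the page-view events are hypothetical, and merging the batch and speed views in the serving layer is an assumption about how the outputs are combined.

```python
from collections import Counter
from typing import Dict, List

Event = Dict[str, str]


def batch_view(historical_events: List[Event]) -> Counter:
    # Batch layer: recomputed periodically over all stored events.
    return Counter(event["page"] for event in historical_events)


def speed_view(recent_events: List[Event]) -> Counter:
    # Speed layer: covers events that arrived after the last batch run.
    return Counter(event["page"] for event in recent_events)


def serving_layer(batch: Counter, speed: Counter) -> Dict[str, int]:
    # Serving layer: what client applications query (both views combined).
    return dict(batch + speed)


historical = [{"page": "home"}, {"page": "home"}, {"page": "cart"}]
recent = [{"page": "home"}]

page_views = serving_layer(batch_view(historical), speed_view(recent))
```

The speed view keeps latency low for new data, while the batch view remains the complete record; the combined result reflects both.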
  • 83. Architect a stream processing pipeline with Azure Stream Analytics
  • 84. Design a stream processing pipeline with Azure Databricks
  • 85. Automate an enterprise business intelligence architecture
  • 86. Exam DP-203 Item Types
  • 87. Multiple Choice
  • 88. Multiple Choice
  • 89. Repeated Scenario • You need to move an Azure VM to another hardware host. Solution: You redeploy the VM. Does this solution meet the goal? a. Yes b. No
  • 90. Repeated Scenario • You need to move an Azure VM to another hardware host. Solution: You create a proximity placement group. Does this solution meet the goal? a. Yes b. No
  • 91. Select and Place
  • 92. Build List and Reorder
  • 93. Active Screen
  • 94. Case Study
  • 95. Performance-Based Lab
  • 96. Microsoft Online Testing
  • 97. Microsoft Online Testing Process