October 2018
PoV – EDW Migration to Azure
Migration to Azure Cloud
Introduction
Azure SQL Data Warehouse is a cloud-based Enterprise Data Warehouse (EDW) that uses Massively Parallel Processing (MPP) to run complex queries
quickly across petabytes of data. Its columnar storage in relational tables significantly reduces storage cost and improves query performance.
Combined with Azure Databricks streaming DataFrames, it can run near-real-time analytics at massive scale.
SQL Data Warehouse uses PolyBase to query big data sources with standard T-SQL. Azure also provides migration utilities and services such as Azure
Data Factory, Data Migration Assistant, SSIS, and Azure Database Migration Service to streamline the migration.
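As a rough illustration of the PolyBase path described above, the following Python sketch generates the two T-SQL statements typically used to load staged files into Azure SQL DW: an external table over the files, then a CREATE TABLE AS SELECT (CTAS) into a hash-distributed columnstore table. The helper name, the ext schema, the data source AzureBlobStage, and the file format CsvFormat are all hypothetical and assumed to have been created beforehand; this is a sketch under those assumptions, not the deck's actual implementation.

```python
def polybase_load_statements(table, columns, distribution_column,
                             data_source="AzureBlobStage",   # hypothetical, assumed to exist
                             file_format="CsvFormat",        # hypothetical, assumed to exist
                             blob_folder=None):
    """Return [external table DDL, CTAS load] as T-SQL strings for one table."""
    column_ddl = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    folder = blob_folder or f"/{table}/"

    # Step 1: expose the staged files as an external table (no data is moved yet).
    external_table = (
        f"CREATE EXTERNAL TABLE ext.{table} (\n    {column_ddl}\n)\n"
        f"WITH (LOCATION = '{folder}', DATA_SOURCE = {data_source}, "
        f"FILE_FORMAT = {file_format});"
    )

    # Step 2: load in parallel into a distributed columnstore table via CTAS.
    load = (
        f"CREATE TABLE dbo.{table}\n"
        f"WITH (DISTRIBUTION = HASH({distribution_column}), CLUSTERED COLUMNSTORE INDEX)\n"
        f"AS SELECT * FROM ext.{table};"
    )
    return [external_table, load]


# Example: a hypothetical POS sales table distributed by store.
for statement in polybase_load_statements(
        "sales",
        [("store_id", "INT"), ("sale_date", "DATE"), ("amount", "DECIMAL(18,2)")],
        distribution_column="store_id"):
    print(statement, end="\n\n")
```

Generating the statements rather than executing them keeps the sketch independent of any particular database driver; the output can be reviewed and run through whichever client the migration team already uses.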
Enablers in Retail
Many retailers prefer Azure because they see it as an enabler of digital transformation rather than a competitor. It offers more regions than other cloud
providers, competitive TCO, an easy-to-use IoT ecosystem, strategic partnerships across the data lifecycle (such as Snowflake), strong AI / ML
capabilities, and greater control for building custom applications. TBD (e.g. advantages such as TCO and accelerators: ML frameworks, data pipeline
frameworks, data visualization)
• 10x - Increase in the number of data sets that can be effectively handled
• 1 day to 15 minutes - Time to develop granular data analytics reports
• $800K - Cost efficiencies from data analytics, resulting in significant annual savings
• 158% - Average ROI for customers who modernized with Azure SQL DW
• $533K - Annual savings from enhanced IT team productivity
• $1M+ - Less spent per year thanks to simplified DW deployment and management
• $120K - Saved in data replication costs by moving failover to Azure SQL DW
• $100K - Cost of a backup DR data warehouse avoided with Azure SQL DW
• Fewer vulnerabilities - With Always Encrypted and standard endpoints
*Customers who modernized with Azure SQL DW…
*Reference: Forrester TEI, commissioned by Microsoft, December 2017; https://pdfs.semanticscholar.org/presentation/318e/d2faf8df5c441637a3000cfa74f50cbb57cd.pdf
Azure DW Reference Architecture
• Data Producers / Source Systems: Structured Data (Merchandise Planning, Sales, Master Data, Store Profile; Supply Chain, POS, Sales, Marketing, Customer Experience); Unstructured Data (Logs, Sensor Data, Social Media, Emails, Clickstreams); External Data; Source EDW Systems
• Data Acquisition: Batch (Integration tools, Native connectors); Real time / Near Real time
• Data Processing: Batch (Join, Calculate, Aggregate); Realtime (Parse, Validate, Cleanse, Transform); Azure Data Factory, Logic Apps, Stream Analytics
• Advanced Analytics Layer: Semantic Analytics Layer, Pattern Mining, ML workflows (Prepare, Train, Classify, Analyze, Predict, Correlate)
• Data Storage (Target EDW): Azure SQL DWH, Staging, Dynamic Layer, Azure HDInsight, Aggregated Data Store
• Data Distribution: Content Delivery, Data Abstraction, API Gateway
• Data Consumption (BI Tool): Big Data Connections, Data Federation, Self Service Reports and visualizations, Customer 360, Operational Reporting, Next Best Action, Churn Propensity
• Process Orchestration: Job Scheduler, Workflow (ADF, Logic Apps, Event Hub)
• Data Governance: Azure Data Catalog, Data Profiling, Data Lifecycle, Metadata Management, Audit, User roles and Security, PII information, Compliance
• Azure services shown: Event Hub, Azure Blob Storage, Azure Databricks, Azure ML, Deep Learning, Cognitive AI, Azure Analysis Services, Azure AI services, PolyBase, Azure AD, ADF, ETL, Logic Apps
• Retail Analytics
EDW to Azure – Migration Strategies & Decision tree
• Current State: Data Models & Taxonomy; MOM maturity; Valuation; Data Characteristics (Quality, Volumes); Sharding; Workload Characteristics
• Methodology: By User Groups; By Pain Points
• Criteria: Optimized data; Needs Optimization; Flawed Data
• Migration Strategy: Lift and Shift; Review & Refine; Rearchitect
• Mode: Batch; Real Time / Near Real time
• Data Types: Structured; Semi Structured; Unstructured
• Migration Paths: Schema, Data / Tables, Metadata; Data Transfer, Data Movement, ETL Tool, CDC; Remodel, Aggregate, Join, Transform, Replicate
• Azure services on the paths: Logic Apps, Event Hub, Azure Data Factory, Azure Functions, Stream Analytics, Blob Storage, PolyBase, Azure SQL DWH
• Final steps: Migrate, Test, Validate, Operationalize
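The decision tree above can be read as a small lookup: the assessed state of the current data points to a migration strategy, and the load mode points to an ingestion service. The Python sketch below encodes one such reading. The pairing of "Optimized data" with Lift and Shift, "Needs Optimization" with Review & Refine, and "Flawed Data" with Rearchitect is an assumption inferred from the order of the labels, as is the choice of Azure Data Factory for batch and Event Hub for real-time loads; the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Assumed pairing of assessment outcome to strategy (inferred from label order).
STRATEGY_BY_DATA_STATE = {
    "optimized": "Lift and Shift",
    "needs_optimization": "Review & Refine",
    "flawed": "Rearchitect",
}

# Assumed pairing of load mode to a primary ingestion service.
INGESTION_BY_MODE = {
    "batch": "Azure Data Factory",
    "real_time": "Event Hub",
}


@dataclass
class Assessment:
    """Outcome of the current-state review for one workload (hypothetical shape)."""
    data_state: str   # "optimized", "needs_optimization", or "flawed"
    mode: str         # "batch" or "real_time"
    data_type: str    # "structured", "semi_structured", or "unstructured"


def choose_strategy(assessment: Assessment) -> str:
    """Map the assessed data state to a migration strategy."""
    try:
        return STRATEGY_BY_DATA_STATE[assessment.data_state]
    except KeyError:
        raise ValueError(f"unknown data state: {assessment.data_state!r}")


def choose_ingestion(assessment: Assessment) -> str:
    """Map the load mode to an ingestion service."""
    try:
        return INGESTION_BY_MODE[assessment.mode]
    except KeyError:
        raise ValueError(f"unknown mode: {assessment.mode!r}")


# Example: a clickstream workload with quality problems, loaded in real time.
clickstream = Assessment(data_state="flawed", mode="real_time", data_type="unstructured")
print(choose_strategy(clickstream), "via", choose_ingestion(clickstream))
```

Keeping the mappings in plain dictionaries makes the assumptions easy to inspect and to change once the real assessment criteria are agreed.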
Retail eCommerce Value Chain
Advanced retail analytics solutions spanning descriptive, predictive, and prescriptive modeling, accelerating ROI
www.ness.com
Sanjay.Bhakta@ness.com
Head of Solutions Architecture, North America

Data Migration to Azure
