Journey to Azure
WHAT WE’VE LEARNED ALONG THE WAY
September 2020
© 2020 VisiQuate, Inc. All Rights Reserved.
Table of contents
• About VisiQuate
• Client Case Study and Table Setting
• Why? and Where? The idea behind the migration
• Initial architecture: what works, what doesn’t work
• Architecture evolution: v2, v3, …
• Summary: what we’ve learned, what we would do differently
VisiQuate
VisiQuate is an advanced analytics service that helps enterprises achieve peak business health.
AI and ML Powered full stack data platform
Powered by AI and machine learning, streaming analytics from disparate data sources deliver leading insights that alert the right users to problems and opportunities.
Value focused point solutions
Our cloud-based solutions target a 3-5x ROI within 12 months, are non-invasive to internal IT, and are supported with white glove service by our SMEs, technical experts and account managers.
Velocity Consulting Services
Increase your speed to value and ROI with Velocity Data Fanatics as a Service offerings.
© 2019 VisiQuate, Inc. All Rights Reserved.
Case Study: Transformation at a multi-region, nationwide Health System
Starting point: data chaos, data silos at the regional and facility level, manual data collection,
integration and reporting, lack of data governance.
Project Goals:
• Consolidated data lake infrastructure serving
– Power users in Rev Cycle, Finance, operations
• Streamlined month end close
• Dynamic data mining and exploration
• ML and data science initiatives
– Automation serving data marts and consolidated system wide reporting
• Rev cycle
• ED
• Hospital operations
• Finance / Decision support
Client Starting Point Architecture
[Diagram labels: Data Sources (structured data, semi-structured data, RDBMS); SSIS package; ODS; EDW; MS SQL Server; SQL Server Agent; Manual data upload; Ad-hoc data requests; Reporting/Data distribution]
Issues with current architecture
• Difficult to scale - MS SQL Server can only scale up, it can’t scale out
• Overprovisioning - you buy a box that stays idle during off-peak hours
• Rigid hardware footprint - hard to provision resources for a short period of time
• Schema on write only
• No separation of storage and compute
• Read/write concurrency
• Maintenance cost is relatively high
We can solve all our problems if we
migrate everything to a cloud
Client’s Big Idea
Standard whitepaper architecture looks simple
[Diagram labels: Data Sources (structured files, semi-structured files, RDBMS, logs, events); Ingestion engine; Real-Time Processing; Batch Processing; Hadoop Cluster (Spark, Hive); BLOB Storage; Reporting Engine (in-memory cubes, reports/dashboards, dynamic query generation); Virtual private cloud]
This is how it looks in reality
• Flexible, scalable and secure
infrastructure
• Ability to scale up and scale down
• Pay as you go
• Reduce maintenance
• Data Lake - schema on read model
• Getting ready for AI/ML data processing
Project Goals
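To make the schema-on-read goal concrete, here is a minimal sketch in Python. The sample records, field names and the `read_with_schema` helper are hypothetical and not part of the client system; the point is only that raw records land in the lake unchanged and a schema is applied when they are read.

```python
import json

# Raw records land in the lake exactly as received (hypothetical sample data).
RAW_LINES = [
    '{"claim_id": "A1", "amount": "125.50", "facility": "North"}',
    '{"claim_id": "A2", "amount": "80", "payer": "Acme"}',
]

# Schema chosen by the reader, not enforced at write time: field name -> type.
CLAIM_SCHEMA = {"claim_id": str, "amount": float, "facility": str}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and project them onto the requested schema."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Missing fields become None instead of failing the load.
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


claims = read_with_schema(RAW_LINES, CLAIM_SCHEMA)
```

A different consumer could read the same raw lines with a different schema (for example one that includes `payer`) without reloading the data.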
PROS
• Existing Microsoft products footprint: Office, SQL Server, etc
• Azure AD – leverage Office 365 subscription for Azure set up
• Native MS SQL Server support – different options for migration
• Power BI visualization as a part of Office 365
• Scalability
• Pay as you go model
• HIPAA compliant (for majority of services)
CONS
• Stability Issues due to rapid go to market strategy
• Not as many serverless services
• Open Source is not Native technology
• Developer community is not as strong as alternatives
Why Azure
Migration strategies
Here is an interpretation of Microsoft’s scenarios for migration to the cloud:
1. Rehost (aka lift and shift) – a migration scenario with no or minimal code change to move your data tier to the cloud. Select an option for your SQL Server and ETL jobs. Migrate manually or using the Azure migration tool.
Advantages: fastest and cheapest. Get the most out of your existing investment.
Disadvantages: not every piece of your infrastructure can be migrated one-to-one. For example, SSIS in Azure is a part of Data Factory and will have to be rebuilt. Also, you won’t leverage all the advantages of the cloud, such as scalability.
2. Refactor – adjust your data tier to use the best of cloud native services. It will require rebuilding some parts of your database and/or ETL. For example, migrating your ETL projects to Azure Data Factory.
Advantages: only partial code change. Leveraging Azure native services brings scalability, performance and cost gains.
Disadvantages: requires some experience with Azure services and some code changes. Takes more time.
3. Rearchitect – rearchitect your data tier to use cloud native Azure services. For example, migrate your SQL Server database to Azure SQL Database.
Advantages: get the most benefit from Azure services. For example, use serverless services to reduce cost.
Disadvantages: requires architecture and code changes. Can potentially delay the migration. Can be considered as phase 2 of a migration project, after a lift and shift migration.
4. Rebuild – completely rethink your architecture, making it a greenfield project that uses the best available technology for your data tier, such as the Big Data stack. Azure provides a wide variety of development and deployment capabilities. Enrich your data with AI, IoT and streaming pipelines.
Advantages: as a result you get a modern data tier with the full advantages of a cloud native data tier.
Disadvantages: like any new greenfield project, it requires skills, time and budget.
What we did: Phase 1 & 2
Phase 1: Rehost/Lift and shift POC – during that phase we built a Proof of Concept and migrated our data layer to Azure utilizing the following services:
Azure BLOB storage – raw data storage
SQL Server Managed Instance – SQL Server instance
Azure VM – to run SSIS packages for ETL
Results: within the scope of that POC we successfully completed the migration from managed hosting to the Azure cloud without significant code change.
Phase 2: Rebuild – re-think, re-architect and re-build our data layer architecture using the full power of Azure cloud services. Making it a greenfield project allowed us to bring in modern technologies and really look at things differently. The Big Data stack and the ability to scale resources quickly for a short period of time open up a lot of opportunities to take your data pipelines to a whole new level.
All typical risks associated with a greenfield project but
specifically:
• Lack of experience with certain technologies
• Lack of real-life case studies and white papers about Big
Data Azure deployments
• Azure developer community is not big
• Time and budget
Project Risks
Architecture – v1
Initial Azure Architecture
• One Spark cluster (Spark ETL + Hive Query)
• Spark as both: processing and query engine
• Parquet is storage format
• Python 2.7.x (first released in 2010; end of life in January 2020)
• SQL style ETL code
• All Development on STG
• Code management in GIT
• MicroStrategy processing through Thrift server
Azure HDInsight ecosystem advantages
• Low cost - by creating clusters on demand
• Automated cluster creation
• Managed hardware and configuration
• Easily scalable up or down
• Global availability
• Secure and compliant - protect enterprise data assets with Virtual Network, encryption, and integration with Azure AD.
• Simplified version management – Hadoop ecosystem components are kept up to date.
• Extensibility with custom tools or third-party applications - Azure Marketplace
• Easy management, administration, and monitoring - Azure Monitor.
• Integration with other Azure services
• Azure Data Factory (ADF)
• Azure Blob Storage
• Azure Data Lake Storage Gen2
• Azure SQL Database
• Azure Analysis Services
• Self-healing processes and components
• You CAN build that architecture from scratch and be in production relatively fast.
• Azure can sometimes surprise you: network issues, lost connections, background updates, documented features that don’t work, etc.
• Performance of certain queries (e.g. table-based queries) might require additional attention.
• Plan ahead for access and security configuration.
• Make sure you understand your concurrency so you can architect appropriately.
• Sometimes documentation and community searches take longer than expected.
We’re in Production - lessons learned
Drivers for Architecture v2
• Separate clusters for ETL and run-time
• Parallel ETL architecture
• New HDP platform 3.x
• Upgraded to Python 3.5 (new libraries, end of support for Python 2)
• Code structure redesign: OOP instead of SQL style
• Internal Logging module
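The move from SQL-style scripts to an object-oriented structure with a shared logging module could look roughly like the sketch below. The class names (`EtlStep`, `DropEmptyNames`, `UppercaseNames`), the sample rows and the logger name are illustrative assumptions, not the project’s actual code, and Python’s standard `logging` module stands in for the internal logging module.

```python
import logging

logger = logging.getLogger("etl")  # stand-in for the internal logging module


class EtlStep:
    """Base class: every step logs start, row counts and failures the same way."""

    name = "step"

    def transform(self, rows):
        raise NotImplementedError

    def run(self, rows):
        logger.info("%s: started with %d rows", self.name, len(rows))
        try:
            result = self.transform(rows)
        except Exception:
            logger.exception("%s: failed", self.name)
            raise
        logger.info("%s: finished with %d rows", self.name, len(result))
        return result


class DropEmptyNames(EtlStep):
    name = "drop_empty_names"

    def transform(self, rows):
        return [r for r in rows if r.get("name")]


class UppercaseNames(EtlStep):
    name = "uppercase_names"

    def transform(self, rows):
        return [{**r, "name": r["name"].upper()} for r in rows]


def run_pipeline(steps, rows):
    """Run steps in order, passing each step's output to the next."""
    for step in steps:
        rows = step.run(rows)
    return rows


result = run_pipeline(
    [DropEmptyNames(), UppercaseNames()],
    [{"name": "north"}, {"name": ""}, {"name": "south"}],
)
```

Keeping the logging in the base class means each new step only implements `transform`, and every step reports in the same format.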
V2 in Production
Architecture v2 - Lessons learned
• Hive can be fast
• Documented features do not always work (e.g. Hive locks,
set up access to clusters, cache, etc)
• There are differences in how Spark and Hive treat
metadata.
• You need to learn where you persist your data: data frame
vs a table or a disk for optimal performance.
• Cluster resource management and balancing for parallel
jobs
Architecture v3
Architecture v3
• HDInsight Spark cluster for ETL
• Separate ETL pipelines for every data source
• ETL Orchestration – running jobs in parallel
• Increasing HDInsight cluster capacity but decreasing its uptime
• Additional data persistence to simplify Spark jobs
• Using ORC format for DWH and Data Marts (Hive)
• Azure Synapse as Analytical Database and Analytics sandbox
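The per-source pipelines and parallel orchestration can be sketched as follows. This uses Python’s standard `concurrent.futures` module rather than whatever orchestrator the project actually used; the source names and the `run_source_pipeline` function are hypothetical placeholders for real Spark job submissions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical data sources; each gets its own independent ETL pipeline.
SOURCES = ["rev_cycle", "finance", "ed", "hospital_ops"]


def run_source_pipeline(source):
    """Placeholder for submitting one source's Spark job and waiting for it."""
    return f"{source}: ok"


def run_all(sources, max_parallel=2):
    """Run pipelines in parallel, capped so the cluster is not oversubscribed."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(run_source_pipeline, s): s for s in sources}
        for future in as_completed(futures):
            source = futures[future]
            try:
                results[source] = future.result()
            except Exception as exc:  # one failed source must not stop the others
                failures[source] = exc
    return results, failures


results, failures = run_all(SOURCES)
```

The `max_parallel` cap is the knob that would be tuned against cluster capacity, and collecting failures per source lets the remaining pipelines finish even when one source breaks.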
Why Azure Synapse
• Low maintenance solution
• Lightning-fast data ingestion from Data Lake (ORC, Parquet files)
• Several data replication models (can fine tune for your load pattern)
• Great performance on analytical workloads
• Flexible performance/cost configuration through performance tiers
• Ability to reduce performance tiers during nonproduction hours
• Easy to set up a separate instance for analysts (sandbox)
• SQL Server like experience for end users (SQL Server Management Studio, etc)
• Many connectivity options. (ODBC, Power BI, Excel etc)
Production system workload
[Chart: production workload across the day; series: Data Ready, HDI, Synapse, Work hours]
Production Azure Synapse Workload pattern
ADWH daily scaling:
• 100 DWU – off business hours mode
• 1000 DWU – data refresh
• 300 DWU – run-time
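A daily scaling schedule like this can be expressed as a small lookup. The sketch below is illustrative only: the DWU levels come from the slide, but the hour windows (refresh from 04:00 to 07:00, run-time from 07:00 to 19:00) are assumptions made up for the example. The code only picks a target level and does not call any Azure API.

```python
# DWU levels from the slide; the hour windows are assumed for illustration.
SCHEDULE = [
    (4, 7, 1000),   # data refresh
    (7, 19, 300),   # run-time (business hours)
]
OFF_HOURS_DWU = 100


def target_dwu(hour):
    """Return the DWU level for an hour of the day (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    for start, end, dwu in SCHEDULE:
        if start <= hour < end:
            return dwu
    return OFF_HOURS_DWU


def daily_dwu_hours():
    """Total DWU-hours consumed per day under the schedule."""
    return sum(target_dwu(h) for h in range(24))


scaled = daily_dwu_hours()
always_peak = 1000 * 24
```

Under these assumed windows the schedule consumes 7,500 DWU-hours a day (3 hours at 1000, 12 hours at 300, 9 hours at 100), compared with 24,000 if the instance stayed at 1000 DWU around the clock.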
Synapse: lessons learned
• Documentation is not always up to date
• Performance tier scaling drops open connections
• Lack of cross database queries support
• No @@ROWCOUNT Support
• No PARTITIONED BY functionality when creating external tables
based on Hive tables
• No native Hive metastore integration
• Doesn’t integrate with SSDT
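Because scaling the performance tier drops open connections, client code that runs during a scaling window needs to reconnect and retry. A minimal sketch, assuming a hypothetical `connect` factory and using Python’s built-in `ConnectionError` as a stand-in for whatever exception the real database driver raises:

```python
import time


def run_with_reconnect(connect, query, attempts=3, delay_seconds=0.0):
    """Run query(conn); on a dropped connection, reconnect and try again."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            conn = connect()  # open a fresh connection on every attempt
            return query(conn)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failure.
                time.sleep(delay_seconds * attempt)
    raise last_error


# Demo with a fake connection that fails twice, as if a scale operation were running.
calls = {"n": 0}


def fake_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection dropped during scaling")
    return "conn"


outcome = run_with_reconnect(fake_connect, lambda conn: f"ran on {conn}")
```

In a real deployment `delay_seconds` would be set to something non-zero, since a scale operation takes time to complete.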
Summary
• There is definitely a way to get to the cloud quickly – you should consider different options
• If you decide to rebuild your architecture, be ready to go through several iterations
• HDInsight clusters are a great engine for data pipelines and parallel workloads, but it’s a very different technology and the ramp-up period is not insignificant
• Azure Synapse is a great engine for analytical workloads and data exploration and works well with the familiar toolset (SSMS)
Leonid Nekhymchuk, Chief Technology Officer,
leonid.nekhymchuk@visiquate.com
Valeriy Zinovjev, Client Engineering Manager
valeriy.zinovjev@visiquate.com
Thank you.
Any Questions?

Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 

VisiQuate: Azure cloud migration case study

  • 1. Journey to Azure WHAT WE’VE LEARNED ALONG THE WAY September 2020
  • 2. Table of contents • About VisiQuate • Client Case Study and Table Setting • Why? and Where? The idea behind the migration • Initial architecture: what works, what doesn’t work • Architecture evolution: v2, v3, … • Summary: what we’ve learned, what would we do differently
  • 3. VisiQuate
  • 4. VisiQuate is an advanced analytics service that helps enterprises achieve peak business health. AI and ML powered full-stack data platform: Powered by AI and machine learning, streaming analytics from disparate data sources deliver leading insights that alert the right users to problems and opportunities. Value-focused point solutions: Our cloud-based solutions target a 3-5x ROI within 12 months, are non-invasive to internal IT, and are supported with white-glove service by our SMEs, technical experts and account managers. Velocity Consulting Services: Increase your speed to value and ROI with Velocity Data Fanatics as a Service offerings.
  • 5. Case Study: Transformation at a multi-region, nationwide Health System. Starting point: data chaos, data silos at the regional and facility level, manual data collection, integration and reporting, lack of data governance. Project Goals: • Consolidated data lake infrastructure serving – Power users in Rev Cycle, Finance, operations • Streamlined month-end close • Dynamic data mining and exploration • ML and data science initiatives – Automation serving data marts and consolidated system-wide reporting • Rev cycle • ED • Hospital operations • Finance / Decision support
  • 6. Client Starting Point Architecture. Diagram labels: Data Sources • Structured data • Semi-Structured data • RDBMS • SSIS package • ODS • EDW • MS SQL Server • SQL Server Agent • Manual data upload • Ad-hoc data requests • Reporting/Data distribution
  • 7. Issues with current architecture • Difficult to scale - MS SQL can only scale up, not scale out • Overprovisioning - you buy a box that stays idle during off-peak hours • Rigid hardware footprint - hard to provision resources for a short period of time • Schema on write only • No separation of storage and compute • Read/write concurrency issues • Maintenance cost is relatively high
  • 8. Client’s Big Idea: We can solve all our problems if we migrate everything to a cloud
  • 9. Standard whitepaper architecture looks simple. Diagram labels: Data Sources (Structured files, Semi-Structured files, RDBMS, Logs, Events) • Ingestion engine • Real-Time Processing • Batch Processing • Hadoop Cluster • Spark • Hive • BLOB Storage • Reporting Engine • In-Memory cubes • Reports/dashboards • Dynamic query generation • Virtual private cloud
  • 10. This is how it looks in reality
  • 11. Project Goals • Flexible, scalable and secure infrastructure • Ability to scale up and scale down • Pay as you go • Reduce maintenance • Data Lake - schema on read model • Getting ready for AI/ML data processing
  • 12. Why Azure. PROS • Existing Microsoft products footprint: Office, SQL Server, etc. • Azure AD – leverage Office 365 subscription for Azure set up • Native MS SQL Server support – different options for migration • Power BI visualization as a part of Office 365 • Scalability • Pay-as-you-go model • HIPAA compliant (for the majority of services) CONS • Stability issues due to rapid go-to-market strategy • Not as many serverless services • Open source is not a native technology • Developer community is not as strong as alternatives
  • 13. Migration strategies. Here is an interpretation of Microsoft’s scenarios for migration to the cloud:
      1. Rehost (aka lift and shift) – a migration scenario with no or minimal code change to move your data tier to the cloud. Select an option for your SQL Server and ETL jobs. Migrate manually or using an Azure migration tool. Advantages: fastest and cheapest; gets the most out of your existing investment. Disadvantages: not all pieces of your infrastructure can be migrated one-to-one. For example, SSIS is a part of Data Factory and will have to be rebuilt. Also, you won’t leverage all the advantages of the cloud, such as scalability.
      2. Refactor – adjust your data tier to use the best of cloud-native services. It will require rebuilding some parts of your database and/or ETL, for example migrating your ETL projects to Azure Data Factory. Advantages: partial code change; leveraging Azure native services brings scalability, performance and cost gains. Disadvantages: requires some experience with Azure services and requires code changes. Takes more time.
      3. Rearchitect – rearchitect your data tier to use cloud-native Azure services. For example, migrate your SQL Server database to Azure SQL Database. Advantages: get the most benefit from Azure services, for example by using serverless services to reduce cost. Disadvantages: requires architecture and code changes and can potentially delay migration. Can be considered as phase 2 of a migration project, after a lift and shift migration.
      4. Rebuild – completely rethink your architecture, making it a greenfield project using the best available technology for your data tier, such as the Big Data stack. Azure provides a wide variety of development and deployment capabilities. Enrich your data with AI, IoT and streaming pipelines. Advantages: as a result you get a modern data tier with the best advantages of a cloud-native data tier. Disadvantages: like any greenfield project, it requires skills, time and budget.
  • 14. What we did: Phase 1 & 2. Phase 1: Rehost/lift and shift POC – during that phase we built a Proof of Concept and migrated our data layer to Azure utilizing the following services: Azure BLOB storage – raw data storage; SQL Server Managed Instance – SQL Server instance; Azure VM – to run SSIS packages for ETL. Results: within the scope of that POC we successfully accomplished the migration from managed hosting to the Azure cloud without significant code change. Phase 2: Rebuild – re-think, re-architect and re-build our data layer architecture using the full power of Azure cloud services. Making it a greenfield project allows you to bring in modern technologies and really look at things differently. The Big Data stack and the ability to scale resources quickly for a short period of time open a lot of opportunities to bring your data pipelines to a whole new level.
  • 15. Project Risks. All typical risks associated with a greenfield project, but specifically: • Lack of experience with certain technologies • Lack of real-life case studies and white papers about Big Data Azure deployments • Azure developer community is not big • Time and budget
  • 16. Architecture – v1
  • 17. Initial Azure Architecture • One Spark cluster (Spark ETL + Hive Query) • Spark as both processing and query engine • Parquet as storage format • Python 2.7.x (first released in 2010) • SQL-style ETL code • All development on STG • Code management in Git • MicroStrategy processing through Thrift server
  • 18. Azure HDInsight ecosystem advantages • Low cost - by creating clusters on demand • Automated cluster creation • Managed hardware and configuration • Easily scalable up or down • Global availability • Secure and compliant - protect enterprise data assets with Virtual Network, encryption, and integration with Azure AD • Simplified version management – Hadoop ecosystem components are kept up to date • Extensibility with custom tools or third-party applications - Azure Marketplace • Easy management, administration, and monitoring - Azure Monitor • Integration with other Azure services • Azure Data Factory (ADF) • Azure Blob Storage • Azure Data Lake Storage Gen2 • Azure SQL Database • Azure Analysis Services • Self-healing processes and components
  • 19. We’re in Production - lessons learned • You CAN build that architecture from scratch and be in production relatively fast. • Azure can sometimes surprise you: network issues, lost connections, background updates, documented features that don’t work, etc. • Performance of certain queries (e.g. table-based queries) might require additional attention. • Plan ahead for access and security configuration. • Make sure you understand your concurrency to architect appropriately. • Sometimes documentation and community search take longer than expected.
  • 20. Drivers for Architecture v2 • Separate clusters for ETL and run-time • Parallel ETL architecture • New HDP platform 3.x • Upgraded to Python 3.5 (new libs, end of support for Python 2) • Code structure redesign: OOP instead of SQL style • Internal logging module
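Slide 20 mentions moving from SQL-style scripts to an object-oriented code structure with an internal logging module. The deck does not show that code, so the following is only a minimal sketch of what such a structure could look like. The class names (`EtlStep`, `ClaimsStep`), the logger name, and the sample rows are all hypothetical.

```python
import logging
from abc import ABC, abstractmethod

# Hypothetical logger name; the deck only says an internal logging module exists.
logger = logging.getLogger("etl")


class EtlStep(ABC):
    """One unit of ETL work. Subclasses supply the three stages."""

    name = "unnamed"

    @abstractmethod
    def extract(self):
        """Return the raw rows for this step."""

    @abstractmethod
    def transform(self, rows):
        """Return cleaned rows."""

    @abstractmethod
    def load(self, rows):
        """Persist rows and return how many were written."""

    def run(self):
        # Shared control flow and logging live in one place, not in every script.
        logger.info("%s: start", self.name)
        rows = self.transform(self.extract())
        count = self.load(rows)
        logger.info("%s: loaded %d rows", self.name, count)
        return count


class ClaimsStep(EtlStep):
    """Illustrative step working on in-memory rows instead of a real source."""

    name = "claims"

    def __init__(self):
        self.target = []

    def extract(self):
        return [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": None}]

    def transform(self, rows):
        # Drop rows without an amount and cast the rest to float.
        return [{**r, "amount": float(r["amount"])}
                for r in rows if r["amount"] is not None]

    def load(self, rows):
        self.target.extend(rows)
        return len(rows)
```

Calling `ClaimsStep().run()` executes the three stages in order and emits two log records. A real step would read from and write to the data lake rather than a Python list.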
  • 21. V2 in Production
  • 22. Architecture v2 - Lessons learned • Hive can be fast • Documented features do not always work (e.g. Hive locks, setting up access to clusters, cache, etc.) • There are differences in how Spark and Hive treat metadata • You need to learn where to persist your data (data frame vs. table vs. disk) for optimal performance • Cluster resource management and balancing for parallel jobs
  • 23. Architecture v3
  • 24. Architecture v3 • HDInsight Spark cluster for ETL • Separate ETL pipelines for every data source • ETL Orchestration – running jobs in parallel • Increasing HDInsight cluster capacity but decreasing its uptime • Additional data persistence to simplify Spark jobs • Using ORC format for DWH and Data Marts (Hive) • Azure Synapse as Analytical Database and Analytics sandbox
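Slide 24 describes one ETL pipeline per data source, with orchestration running them in parallel. The deck does not name the orchestrator, so the sketch below only illustrates the pattern using Python's standard `concurrent.futures`. `run_pipeline` is a hypothetical placeholder for submitting a Spark job, and the source names are made up.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_pipeline(source):
    """Hypothetical stand-in for submitting one source's Spark job and waiting for it."""
    return f"{source}: ok"


def run_all(sources, max_parallel=3):
    """Run one pipeline per source, at most max_parallel at a time."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(run_pipeline, s): s for s in sources}
        for future in as_completed(futures):
            source = futures[future]
            try:
                results[source] = future.result()
            except Exception as exc:
                # One failing source should not stop the others.
                results[source] = f"{source}: failed ({exc})"
    return results
```

The `max_parallel` cap is where cluster capacity comes in: a larger cluster that runs more pipelines at once can finish sooner and be shut down earlier, which is the capacity-versus-uptime trade the slide refers to.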
  • 25. Why Azure Synapse • Low-maintenance solution • Lightning-fast data ingestion from the Data Lake (ORC, Parquet files) • Several data replication models (can be fine-tuned for your load pattern) • Great performance on analytical workloads • Flexible performance/cost configuration through performance tiers • Ability to reduce performance tiers during nonproduction hours • Easy to set up a separate instance for analysts (sandbox) • SQL Server-like experience for end users (SQL Server Management Studio, etc.) • Many connectivity options (ODBC, Power BI, Excel, etc.)
  • 26. Production system workload. [Chart: HDI and Synapse workload, axis ticks 0–12, with markers "Data Ready" and "Work hours"] ADWH daily scaling: • 100 DWU – off business hours mode • 1000 DWU – data refresh • 300 DWU – run-time
  • 27. Production Azure Synapse Workload pattern. ADWH daily scaling: • 100 DWU – off business hours mode • 1000 DWU – data refresh • 300 DWU – run-time
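The daily scaling pattern on slides 26 and 27 can be expressed as a small lookup from hour of day to target performance tier. The three DWU levels come from the slides. The hour windows below are assumptions chosen only for illustration, since the deck does not state when each mode starts or ends.

```python
# DWU levels taken from the slides.
OFF_HOURS_DWU = 100
REFRESH_DWU = 1000
RUNTIME_DWU = 300

# Hypothetical windows (24-hour clock); the deck does not give exact hours.
REFRESH_HOURS = range(4, 7)     # 04:00-06:59, data refresh
BUSINESS_HOURS = range(7, 19)   # 07:00-18:59, run-time


def target_dwu(hour):
    """Return the DWU level the warehouse should run at for a given hour."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour in REFRESH_HOURS:
        return REFRESH_DWU
    if hour in BUSINESS_HOURS:
        return RUNTIME_DWU
    return OFF_HOURS_DWU


def daily_dwu_hours():
    """Sum of DWU levels over one day, a rough proxy for compute consumed."""
    return sum(target_dwu(h) for h in range(24))
```

Under these assumed windows, `daily_dwu_hours()` gives 7500, against 24000 for a warehouse held at 1000 DWU all day. A scheduler would call `target_dwu` and issue the actual scale request. As slide 28 notes, each tier change drops open connections, so clients need to reconnect afterwards.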
  • 28. Synapse: lessons learned • Documentation is not always up to date • Performance tier scaling drops open connections • No support for cross-database queries • No @@ROWCOUNT support • No PARTITIONED BY functionality when creating external tables based on Hive tables • No native Hive metastore integration • Doesn’t integrate with SSDT
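Because performance tier scaling drops open connections (slide 28), client code that runs across a scaling event needs to reconnect and retry. The helper below is a generic sketch of that pattern, not the authors' implementation. `connect` and `work` are hypothetical callables supplied by the caller, and the code assumes a dropped connection surfaces as Python's built-in `ConnectionError`. A real database driver raises its own exception type, which would need to be caught instead.

```python
import time


def run_with_reconnect(connect, work, attempts=3, delay_seconds=0.0):
    """Open a fresh connection and run work(conn), retrying on dropped connections.

    connect: callable returning a new connection (hypothetical).
    work:    callable taking the connection and returning a result (hypothetical).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            conn = connect()
            return work(conn)
        except ConnectionError as exc:  # assumption: drops surface as ConnectionError
            last_error = exc
            if attempt < attempts:
                # Simple linear backoff before the next try.
                time.sleep(delay_seconds * attempt)
    raise last_error
```

Opening a new connection on every attempt avoids reusing a handle that the scaling operation has already invalidated. The work passed in should be safe to repeat, since a failure part-way through causes it to run again from the start.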
  • 29. Summary • There is definitely a way to get to the cloud quickly – you should consider different options • If you decide to rebuild your architecture, be ready to go through several iterations • HDInsight clusters are a great engine for data pipelines and parallel workloads, but it’s a very different technology and the ramp-up period is not insignificant • Azure Synapse is a great engine for analytical workloads and data exploration and works well with the familiar toolset (SSMS)
  • 30. Leonid Nekhymchuk, Chief Technology Officer, leonid.nekhymchuk@visiquate.com • Valeriy Zinovjev, Client Engineering Manager, valeriy.zinovjev@visiquate.com
  • 31. Thank you. Any Questions?

Editor's Notes

  1. Azure AD – leverage Office 365 subscription for Azure set up: https://docs.microsoft.com/en-us/azure/cost-management-billing/manage/office-365-account-for-azure-subscription