Telecom Bell
Cloud Migration Kickoff
Yashodhan Kale
Delivery Solutions Architect | Databricks
05/30/2023
Contents
1. Summary
2. Platform & Architecture
3. Approach
4. Operating Model
5. Additional Details
Summary | Business Challenges
1. TELECOM BELL must improve network QoS to align with consumers' changing emphasis on mobile connectivity and data usage.
2. As IoT and 5G advance, customers easily switch providers, prompting TELECOM BELL to prioritize personalized engagement, using customer data for customized messaging and services.
3. TELECOM BELL is subject to many regulations, including data privacy and security regulations, and needs effective ways to adhere to them.
4. Power of data: a data-volume explosion requires both focus and new capabilities.
5. Pressure to show growth and profits is constant, and data and AI will be a critical enabler.
Summary | Technical Challenges
Today there are increased expectations of, and pressure on, the Telecom organization to have a strong data & analytics strategy:
• The data platform is not scalable for analytics and AI/ML
• Upfront capacity planning and cost
• Governance of the data on HDFS is a challenge
• Data sits in silos and is not easy to integrate or connect
• Lack of discoverability of data (no catalog)
• Housekeeping: maintenance of the in-house cluster is difficult, spread across different portals and installations
• Advanced disaster recovery, durability, and availability are hard to achieve
• A larger IT infrastructure staff is required
Summary | Executive Plan
1. Telecom Bell wants to improve the Quality of Service (QoS) of its network and, to get there, will start by migrating its core applications to the cloud.
2. Databricks will bring industry-leading expertise, including Databricks platform expertise, to drive the transformation at speed.
3. Confluent will bring its event streaming platform built on Kafka, plus the necessary platform support.
4. Telecom Bell has a team of 10 engineers with expertise in Kafka and Spark.
5. Desired timeline: May 2024.
Platform & Architecture | Current Architecture
Limitations:
• The data platform is not scalable for analytics and AI/ML
• Upfront capacity planning and cost
• Governance of the data on HDFS is a challenge
• Data sits in silos and is not easy to integrate or connect
• Lack of discoverability of data (no catalog)
• Housekeeping: maintenance of the in-house cluster is difficult, spread across different portals and installations
• Advanced disaster recovery, durability, and availability are hard to achieve
• A larger IT infrastructure staff is required
Platform & Architecture | End State Architecture
Design the target state architecture for a scalable, secure, and well-governed data platform (AI/ML self-serve and advanced engineering capabilities, including the necessary governance-on-lake capability).

Highlights
• Warehouse plus data lake capabilities at scale, with governance
• Data product mindset: marketplace and self-service capabilities
• MLOps: the full ML lifecycle
• Domain data tiers: advanced data management capabilities and curated, democratized data layers

Designing and activating a world-class data platform: fundamental principles
• Scalability
• Performance
• Industrialized processes governing the pipeline
• Distributed, fault-tolerant architecture
• Open file format for better interoperability between systems
• Security and reliability
• Data provenance and lineage
• ACID compliance
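The ACID-compliance principle can be illustrated with a minimal sketch in plain Python (not Delta Lake itself, just the underlying commit idea): write to a temporary file, then atomically rename it over the target, so a reader sees either the old version or the new one, never a partial write. The function name and file layout here are hypothetical.

```python
import os
import tempfile

def atomic_write(path: str, data: str) -> None:
    """Write data so readers see either the old file or the new one, never a partial write."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit disk before the rename
        os.replace(tmp_path, path)  # atomic rename: the commit point
    except BaseException:
        os.unlink(tmp_path)  # a failed write leaves no half-written target behind
        raise
```

Delta Lake applies the same commit-point idea at table scope, via an append-only transaction log rather than a file rename.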
Platform & Architecture | Current vs New
What the new platform adds:
1. A more performant and optimized Spark engine
2. Governance under the same roof
Platform & Architecture | Artifacts
Key components of the data platform: a world-class data platform!
Approach | Our Tenets
Because "approach is the first step towards achieving goals":
A. Security is job zero
B. Agile methodology
C. Continuous delivery of results
D. Leverage customer assets first
E. A multiple-velocity joint delivery approach
F. Zero downtime
G. Log the journey at every step, to look back and learn
H. Principle of least privilege (PoLP)
Approach | Objectives
• Build the data strategy roadmap that empowers Telecom Bell to overcome its business challenges.
• Build strong foundations with data platform development and implementation.
• Co-create an operating model that takes TELECOM BELL where it wants to go, in a sustainable way.
• Migrate core applications to the cloud in a secure and reliable way.
Horizons: (1) Mindset, (2) Strategic roadmap, (3) Platform, (4) Industrialization.
Operating Model | Joint Delivery Approach
[Org chart: Executive Leadership (Databricks leadership and Telecom Bell leadership) over Program Management (one Databricks lead and one Telecom Bell lead), over four delivery pods: A) Application Team, B) Platform Team, C) Data Quality & Governance, D) Bringing It Together. Each pod is staffed with Telecom Bell resources (1 to 4 per pod) and Databricks Professional Services (3 to 5 per pod).]
Meeting cadence:
• Bi-weekly steering committee meetings
• Weekly PMO meetings
• Daily delivery team meetings
Operating Model | Pod Structure
[Pod chart; each pod has a leader, and the staffing mix spans 16 Databricks resources and 12 Telecom Bell resources, with some shared across pods: Scrum Master, Cloud DevOps Engineer, Enterprise Support.]
• Application Team: Scrum Master, Functional Domain Expert, Data Visualization Engineer, Customer Success Engineer, Data Engineer, Resident Solutions Architect, Specialist Solutions Architect (Security), Cloud DevOps Engineer
• Platform: Azure Platform Cloud Architect, Cloud DevOps Engineer, Resident Solutions Architect, Delivery Solutions Architect, Customer Success Engineer, Specialist Solutions Architect (Security)
• Data Quality & Governance: Test/Quality Lead, Data Quality Engineer, Data Governance Lead, Data Lineage and Profiling Engineer
• Bring It Together: Product Owner, Delivery Lead, Change Management Specialist, PMO Lead, Roadmap Officer
Operating Model | Road Map
Deliverables, from program kickoff to a celebration of completion:
1. Diagnostic of the current environment
2. End state architecture
3. Platform
4. Migration: 10%
5. Migration: 60%
6. Migration: 100%

Along the way: measure progress; consistently communicate, remove roadblocks, and eliminate friction; celebrate the completion of quick wins to strengthen morale.

Process goals:
1. Human-centered change: Focus on each individual team member's technical skills and capacity for change. Reskill team members whose roles are changing.
2. Mindset change: Adopt "data as a product", a self-service platform, federated governance, and domain-specific ownership.
3. Migration playbook: A repeatable guideline to migrate applications to the new architecture.
Operating Model | Timeline (Q2 2023 to Q2 2024)
[Gantt-style timeline with quarterly steering-committee (Steerco) meetings throughout. Agile: update the roadmap and plan per evolving priorities.]
• Application: define elements/sources/data; refactor the code; test & modify; deploy; document & KT; handover.
• Platform: current state diagnostics; Databricks workspace setup; Confluent workspace setup; security and compliance (phases 1 and 2); cost management reports; cost optimization; move towards infra-as-code; handover.
• Bring it together: assess skill and capability gaps within the organization; define pods and teams; create an upskilling curriculum and set up training sessions; establish ways of working (documentation, win celebrations); talk to the business team and incorporate changes; arrange handover of all areas.
• Data Quality + Governance: assess the current state and catalog critical data elements; assess current state data governance; prepare the governance strategy (identify roles, define the interaction model); best practices and tagging; design & deliver the governance structure; design, then implement, target state DQ monitoring; handover.
• Project management: continuously monitor, foresee and mitigate risks, and fetch leadership guidance.
Additional Details

Additional Details | Future Scope
Industrialization as a competitive differentiator:
• High throughput of innovation analytics (AI/ML)
• Predictive analytics at scale
• Data-driven operations (real-time what-if analysis)
• Harmonized MDM; ML- and AI-based DQ
• Fast, repeatable time-to-market from idea to product
Additional Details | Risk & Mitigation: Technical
• Data loss risk: Reconciliation, checkpointing, audit, and monitoring. Use fault-tolerant ingestion/migration tools such as Azure Data Factory (Copy activity).
• Data corruption and data integrity risk: Data validation. Each record is compared in a bidirectional manner: each record in the old system is compared against the target system, and each record in the target system against the old system.
• Interference risk (simultaneous use of the source application): Align with the stakeholders of each source on how the bandwidth can be shared; the "Bring it together" team comes into play to address this.
• Schema evolution (changing dimensions): Use the Delta file format's schema evolution feature (schema-on-read). Further, to make sure no incompatible schemas come in, a catalog and governance layer will be leveraged: Databricks Unity Catalog.
• Authorization risk: MFA and identity federation, plus access controls at row and column level on Delta Lake.
• Data security risk: Apply encryption where possible and appropriate; store all tokens and keys securely in Azure Key Vault and rotate them at regular intervals.
• Downtime due to migration: A replicate-and-activate approach.
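The bidirectional validation described above can be sketched in plain Python (a conceptual illustration over in-memory records keyed by primary key, not a Spark job; the record values are hypothetical):

```python
def reconcile(source: dict, target: dict) -> dict:
    """Compare two record sets keyed by primary key, in both directions.

    Returns keys missing from the target, keys present only in the target,
    and keys whose values differ - the three classes of migration defects.
    """
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(
            k for k in source.keys() & target.keys() if source[k] != target[k]
        ),
    }

# Example: record 1 was dropped, record 3 was corrupted, record 4 appeared from nowhere.
report = reconcile({1: "a", 2: "b", 3: "c"}, {2: "b", 3: "x", 4: "d"})
```

On the real migration this comparison would run distributed (for example as a Spark join on the primary key), but the three defect classes checked are the same.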
Additional Details | Risk & Mitigation: Other
• Resource availability & competing priorities: Make sure employees are fully advised about participation in workshops and/or interviews; get the right people at the right time.
• Senior leadership buy-in and delays in decision making: Strong support from the leadership group, including areas not directly involved in the initial changes (one team, one direction); establish governance to provide clarity on accountabilities for decision making.
• Potential impacts to other projects: Strong support from senior leadership if there is a need to put existing projects on hold; review the current state of ongoing projects to see how they impact the finance model; prioritize major changes and focus on the big obstacles upfront.
• Lack of people adoption (major change): An agile and inspirational change management and communication structure; leverage the "Bring it together" team, and roles like change management experts, to steward people readiness and prepare for change.
• Design in isolation (enterprise integration): Work with scalable and flexible design principles in mind to ensure proper integration and alignment with the business (a partnership approach); gather key inputs to support cross-functional process design decisions where applicable.
• Availability of key data inputs and information: Simplify data requests to collect data and information at the appropriate level of detail; assign designated Databricks and Telecom Bell contacts to ensure a smooth and timely transition of data; use the discovery phase to identify hidden environmental risks to foresee and mitigate.
Additional Details | Assumptions
1. Platform: The Telecom Bell on-premise platform is owned and managed by Telecom Bell, and Databricks will get the necessary support to extend the setup to provision the solution per the scope of this effort.
2. Data security: Telecom Bell is responsible for the design, integration, and operation of all client identity and access management, security incident and event management, vulnerability scanning, and security testing tooling and processes, as appropriate.
5. Access & setup: Telecom Bell will provide system access to all source systems or applications required by the scope, and will provide access to systems and environments (including DEV, SIT) within 5 business days of receipt of the request.
6. Access & setup: Databricks personnel will not have access to unencrypted PII data. Telecom Bell will be responsible for encrypting any PII data prior to extraction into the Databricks platform.
7. Access & setup: PII and GDPR data handling will be done by Telecom Bell as per existing delivery practices; any additional arrangement is out of scope.
9. Project management: Telecom Bell will provide relevant functional, technical, and process documentation for the data platforms and systems required by the scope.
10. Project management: Telecom Bell will nominate full-time business and technical SMEs aligned to this project as per the agreed pod structure.
11. Project management: Telecom Bell data owners/nominees will make every attempt to attend the Scrum meetings and ceremonies to present their progress on assigned issues.
12. Project management: Telecom Bell will make sure we get the required time and support from all stakeholders for the complete success of the project.
14. Data build: The Databricks team will reuse and extend the existing data ingestion tooling and framework to support ingestion into the platform. The project will carry out a data discovery exercise to assess local market data quality and readiness.
15. Data build: A source system inventory has already been identified and is in place.
16. License: The Cloudera CDH on-premise license already expired in March 2022; however, extended support is required and has been obtained.
Additional Details | Questions
• Is there an onboarding guide for the consultants to get started in your environment?
• Is there a source system inventory already identified that can be shared?
• What are the roles and skills of the existing 10 engineers on the team?
• What is the current data governance mechanism?
• Other than Cloudera, what other paid subscriptions and packages are installed on the concerned architecture?
• Is there any major business contingency on this project plan? If so, what is the impact of delayed delivery?
• What compliance requirements and regulations does Telecom Bell need to follow for the concerned data?
• Does Telecom Bell already have an Azure account? If so, what level of enterprise support plan is subscribed?
• Does Telecom Bell already have a Confluent account? If so, what level of enterprise support plan is subscribed?
• Are any licenses due to expire?
• What is Cloudera's extended support expiry date?
Thank you
Thank you so much for your time today.
Yashodhan Kale

BACKGROUND
Modern technologist | Data and ML at scale. Designs and drives clients' Data and AI journeys powered by cloud analytics expertise, offering data-product-mindset-driven solutions to deliver platforms and beyond: self-service frameworks, rapid experimentation labs, democratized data, data product marketplaces, multi-cloud solutions, data lakes, and data fabric and data mesh patterns with federated governance, domain-specific ownership, and more.

CERTIFICATIONS
• Amazon Web Services Certified Data Analytics - Specialty
• Amazon Web Services Solutions Architect - Associate
• Cloudera Certified Developer for Apache Hadoop (CCDH)

RELEVANT FUNCTIONAL AND INDUSTRY EXPERIENCE
Industry focus: HealthCare, Retail, Market Research, Finance.
Functional expertise: Digital Transformation; Analytics and CDO Strategy; Open Source; Machine Learning and IoT; Data-Driven Reinvention.

SELECTED EXPERIENCES
• Fortune 5 American healthcare company: Established and managed DevOps, Data Engineering, and ML engineering teams in close collaboration with data scientists. Set up a self-service Data and ML platform on the Azure cloud for a retail enterprise, incorporating an experimentation framework, model training pipelines, and real-time inference using Azure AKS, Kubeflow, and Snowflake. Implemented an Rx enterprise Data and ML platform on the Azure cloud, enabling ETL pipelines with Databricks and Apache Airflow. Led the development of large-scale projects, including legacy modernization and Rx and retail personalization programs that impact millions of lives daily. Collaborated with technology partners MSFT and NVIDIA to present objectives and findings and to incorporate feedback for ML solutions on specialized NVIDIA GPUs. Architected and oversaw the implementation of a refrigerator IoT project on Azure, leveraging IoT Hub, Azure analytics, and Databricks. Led the development of SAP HANA to Spark integration. Managed the data engineering enhancement team for pharmacy-related projects, ensuring critical business deliveries. Designed data-driven solutions, including self-service analytics platforms, rapid experimentation labs, democratized data, multi-cloud solutions, and data fabric and data mesh patterns with federated governance and domain-specific ownership. Developed an ingestion framework for seamless data migration across projects and cloud storage services.
• Multinational American information, data & market measurement company: Built a retail store data aggregation engine (Retail Intelligence system) for 24 countries, initially using Hadoop MapReduce and later upgraded to Spark. Migrated on-premise batch processes to the cloud using Docker, Azure Batch Services, and Azure Shipyard for cost efficiency. Performed performance tuning on Apache Spark, cloud Hadoop clusters (HDI), and Databricks on Azure and Hadoop platforms.

PREVIOUSLY
• Sr Cloud Solution Architect @ Amazon Web Services, Level 6
• Sr ML Engineering Manager @ Databricks, Level 6

WHAT HAS BROUGHT ME HERE
• Customer Obsession
• Deliver Results
• Earn Trust
• Learn and Be Curious
[Diagram labels, Current vs New]
New platform: ACID compliant; time travel; data as product; interoperability; self-service experimentation; scale and pay-as-you-go; lakehouse governance; data migration; identity management and SSO; event streaming with exactly-once semantics.
Legacy platform: upfront cost; not easy to integrate or connect; lack of discoverability; effort to keep data HA and durable; end of support; maintenance burden.
Key components of the data platform: Lakehouse, MLOps, Governance, Databricks Marketplace, and Databricks Notebooks.
1
Share
insights
Quickly discover new insights with
built-in interactive visualizations,
or leverage libraries such as
Matplotlib and ggplot. Export
results and Notebooks in HTML or
IPYNB format, or build and share
dashboards that always stay up to
date.
3 Production
at scale
Schedule Notebooks to automatically
run machine learning and data
pipelines at scale. Create multistage
pipelines using Databricks Workflows.
Set up alerts and quickly access audit
logs for easy monitoring and
troubleshooting.
2
Work
together
Share Notebooks and work with peers
across teams in multiple languages (R,
Python, SQL and Scala) and libraries of
your choice. Real-time coauthoring,
commenting and automated versioning
simplify collaboration while providing
control.
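As a sketch of the production-at-scale point: a scheduled, multistage pipeline can be expressed as a job definition with dependent tasks. The snippet below builds such a payload in the shape used by the Databricks Jobs API 2.1; the job name, notebook paths, and cron schedule are hypothetical.

```python
def pipeline_job_payload() -> dict:
    """Build a two-stage notebook pipeline definition in the Databricks Jobs API 2.1 shape."""
    return {
        "name": "telecom-bell-ingest-pipeline",  # hypothetical job name
        "schedule": {
            "quartz_cron_expression": "0 0 * * * ?",  # hypothetical: run hourly
            "timezone_id": "UTC",
        },
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": "/Pipelines/ingest"},  # hypothetical path
            },
            {
                "task_key": "transform",
                # transform runs only after ingest succeeds
                "depends_on": [{"task_key": "ingest"}],
                "notebook_task": {"notebook_path": "/Pipelines/transform"},  # hypothetical path
            },
        ],
    }
```

Such a payload would be submitted to the workspace's jobs/create endpoint (or managed via infrastructure-as-code); alerts and audit-log access are then configured on top of the same job object.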
Data Governance for the Cloud with Oracle DRMData Governance for the Cloud with Oracle DRM
Data Governance for the Cloud with Oracle DRM
 
Marlabs Capabilities Overview: DWBI, Analytics and Big Data Services
Marlabs Capabilities Overview: DWBI, Analytics and Big Data ServicesMarlabs Capabilities Overview: DWBI, Analytics and Big Data Services
Marlabs Capabilities Overview: DWBI, Analytics and Big Data Services
 
Logical Data Fabric and Industry-Focused Solutions by IQZ Systems
Logical Data Fabric and Industry-Focused Solutions by IQZ SystemsLogical Data Fabric and Industry-Focused Solutions by IQZ Systems
Logical Data Fabric and Industry-Focused Solutions by IQZ Systems
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
 
Evolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the CloudEvolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the Cloud
 

Recently uploaded

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 

Hadoop Migration to databricks cloud project plan.pptx

  • 1. Telecom Bell Cloud Migration Kickoff Yashodhan Kale Delivery Solutions Architect | Databricks 05/30/2023
  • 2. Contents: 1. Summary, 2. Platform & Architecture, 3. Approach, 4. Operating Model, 5. Additional Details
  • 3. Summary | Business Challenges: 1) Telecom Bell must improve network QoS to align with consumers' changing emphasis on mobile connectivity and data usage. 2) As IoT and 5G advance, customers switch providers easily, prompting Telecom Bell to prioritize personalized engagement, using customer data for customized messaging and services. 3) Telecom Bell is subject to many regulations, including data privacy and security rules, and needs effective ways to adhere to them. 4) The power of data: a data-volume explosion requires both focus and new capabilities. 5) Pressure to show growth and profits is constant, and data and AI will be a critical enabler.
  • 4. Summary | Technical Challenges. Today there are increased expectations and pressure on the telecom organization to have a strong data & analytics strategy:
    • The data platform is not scalable for analytics or AI/ML
    • Upfront capacity planning and cost
    • Governance of the data on HDFS is a challenge
    • Data sits in silos and is not easy to integrate or connect
    • Lack of discoverability of data (no catalog)
    • Housekeeping: maintenance of the in-house cluster is difficult across different portals and installations
    • Advanced disaster recovery, durability, and availability are hard to achieve
    • A larger IT infrastructure staff is required
  • 5. Summary | Executive Plan
    • Telecom Bell wants to improve the Quality of Service (QoS) of its network and, to get there, will start by migrating its core applications to the cloud.
    • Databricks will bring industry-leading expertise and Databricks platform expertise to drive the transformation at speed.
    • Confluent will bring its event-streaming platform built on Kafka and the necessary platform support.
    • Telecom Bell has a team of 10 engineers with expertise in Kafka and Spark.
    • Desired timeline: May 2024.
  • 6. Contents: 1. Summary, 2. Platform & Architecture, 3. Approach, 4. Operating Model, 5. Additional Details
  • 7. Platform & Architecture | Current Architecture. Limitations:
    • The data platform is not scalable for analytics or AI/ML
    • Upfront capacity planning and cost
    • Governance of the data on HDFS is a challenge
    • Data sits in silos and is not easy to integrate or connect
    • Lack of discoverability of data (no catalog)
    • Housekeeping: maintenance of the in-house cluster is difficult across different portals and installations
    • Advanced disaster recovery, durability, and availability are hard to achieve
    • A larger IT infrastructure staff is required
  • 8. Platform & Architecture | End-State Architecture. Design the target-state architecture for a scalable, secure, and well-governed data platform (AI/ML self-serve and advanced engineering capabilities, including the necessary governance-on-lake capability).
    Highlights:
    • Warehouse + data lake capabilities at scale, with governance
    • Data product mindset: marketplace, self-service capabilities
    • MLOps: full ML lifecycle
    • Domain data tiers: advanced data management capabilities, curated democratized data layers
    Designing and activating a world-class data platform: fundamental principles
    • Scalability
    • Performance
    • Industrialized processes governing the pipeline
    • Distributed, fault-tolerant architecture
    • Open file formats for better interoperability between systems
    • Security and reliability
    • Data provenance and lineage
    • ACID compliance
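The "ACID compliance" and open-file-format principles can be illustrated with a toy sketch. This is not the Delta Lake implementation, and the class and file names are hypothetical; it only shows the core idea that the lakehouse relies on: data files are published through an append-only transaction log, committed with an atomic rename, so readers always see either the old or the new table state, never a partial write.

```python
import json
import os
import tempfile

class TinyTable:
    """Toy 'ACID-style' table: state is defined solely by a JSON
    transaction log, so every append is all-or-nothing."""

    def __init__(self, root):
        self.root = root
        os.makedirs(os.path.join(root, "_log"), exist_ok=True)

    def _versions(self):
        # Committed versions are the .json entries in the log directory.
        return sorted(
            int(f.split(".")[0])
            for f in os.listdir(os.path.join(self.root, "_log"))
            if f.endswith(".json")
        )

    def append(self, rows):
        # 1) Write the data file under a temporary name first.
        fd, tmp = tempfile.mkstemp(dir=self.root, suffix=".part")
        with os.fdopen(fd, "w") as f:
            json.dump(rows, f)
        vs = self._versions()
        version = vs[-1] + 1 if vs else 0
        data_file = f"part-{version}.json"
        os.rename(tmp, os.path.join(self.root, data_file))
        # 2) Commit by publishing a log entry via atomic rename; until
        #    this rename happens, readers cannot see the new data file.
        log_tmp = os.path.join(self.root, "_log", f"{version}.tmp")
        with open(log_tmp, "w") as f:
            json.dump({"add": data_file}, f)
        os.rename(log_tmp, os.path.join(self.root, "_log", f"{version}.json"))

    def read(self):
        # Replay the log in version order to reconstruct the table.
        rows = []
        for v in self._versions():
            with open(os.path.join(self.root, "_log", f"{v}.json")) as f:
                entry = json.load(f)
            with open(os.path.join(self.root, entry["add"])) as f:
                rows.extend(json.load(f))
        return rows
```

Because the log, not the directory listing, defines the table, a crashed or half-finished write leaves only an orphaned `.part` file that readers never observe, which is the property the migration depends on.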
  • 9. Platform & Architecture | Current vs. New. What's new: 1) a more performant and optimized Spark engine; 2) governance under the same roof.
  • 10. Platform & Architecture | Artifacts 4 Key components of the data platform: A World Class Data Platform!
  • 11. Contents: 1. Summary, 2. Platform & Architecture, 3. Approach, 4. Operating Model, 5. Additional Details
  • 12. Approach | Our Tenets. Because "approach is the first step towards achieving goals":
    • Security is job zero
    • Agile methodology
    • Continuous delivery of results
    • Leverage customer assets first
    • Multiple-velocity joint delivery approach
    • Zero downtime
    • Log the journey at every step to look back and learn
    • Principle of least access privilege (PoLAP)
  • 13. Approach | Objectives. Build the data strategy roadmap that empowers Telecom Bell to overcome its business challenges, across Horizons 1, 2, and 3 (mindset, strategic roadmap, platform):
    1. Build strong foundations with data platform development and implementation
    2. Co-create an operating model that takes Telecom Bell where it wants to go, in a sustainable way
    3. Migrate core applications to the cloud in a secure and reliable way
    4. Industrialization
  • 14. Contents: 1. Summary, 2. Platform & Architecture, 3. Approach, 4. Operating Model, 5. Additional Details
  • 15. Operating Model | Joint Delivery Approach (org chart). Executive leadership: Databricks leadership (1) and Telecom Bell leadership (1). Program management: a Databricks lead (1) and a Telecom Bell lead (1). Four delivery teams, A through D — Platform; Application; Data Quality & Governance; Bringing It Together — each staffed jointly with Databricks Professional Services resources (3 to 5 per team) and Telecom Bell resources (1 to 4 per team). Meeting cadence: bi-weekly steering committee meetings, weekly PMO meetings, daily delivery team meetings.
  • 16. Operating Model | Pod Structure (each pod has a leader; the original diagram marks which roles are Databricks resources, Telecom Bell resources, and shared resources):
    • Application Team: Leadership, Scrum Master, Functional Domain Expert, Data Visualization Engineer, Customer Success Engineer, Data Engineer, Product Owner
    • Data Quality & Governance: Test/Quality Lead, Data Quality Engineer, Data Governance Lead, Data Lineage and Profiling Engineer
    • Bring It Together: Delivery Lead, Change Management Specialist, PMO Lead, Roadmap Officer
    • Platform: Azure Platform Cloud Architect, Cloud DevOps Engineers, Resident Solutions Architects, Delivery Solutions Architect, Customer Success Engineer, Specialist Solutions Architects (Security), Scrum Master, Enterprise Support
  • 17. Operating Model | Road Map. Program kickoff, then deliverables: 1) Diagnostic of the current environment, 2) End-state architecture, 3) Platform, 4) Migration: 10%, 5) Migration: 60%, 6) Migration: 100%, then celebrate completion.
    Along the way: measure progress; consistently communicate, remove roadblocks, and eliminate friction; celebrate completion of quick wins to strengthen morale.
    Process goals: 1) Human-centered change: focus on each individual team member's technical skills and capacity for change; reskill team members whose roles are changing. 2) Mindset change: adopt "data as a product", a self-service platform, federated governance, and domain-specific ownership. 3) Migration playbook: a repeatable guideline for migrating applications to the new architecture.
  • 18. Operating Model | Timeline (Q2 2023 through Q2 2024; agile: update roadmap and plan per evolving priorities; quarterly steerco meetings). Workstreams:
    • Platform: Databricks workspace setup; Confluent workspace setup; security and compliance (phases 1 and 2); cost management reports; cost optimization; move towards infra-as-code; handover.
    • Application: current-state diagnostics; design target state; define elements/sources/data; refactor the code; test and modify; deploy; document and KT; handover.
    • Data Quality + Governance: assess current state and catalog critical data elements; prepare governance strategy (identify roles, define interaction model); design and deliver governance structure; best practices and tagging; assess current-state data governance; design and implement target-state DQ monitoring; handover.
    • Bring It Together: assess skill and capability gaps within the organization; define pods and teams; create upskilling curriculum and set up training sessions; establish ways of working (documentation, win celebrations); talk to the business team; incorporate changes.
    • Project management: continuously monitor, foresee and mitigate risks, and fetch leadership guidance; arrange handover of all areas.
  • 19. Contents: 1. Summary, 2. Platform & Architecture, 3. Approach, 4. Operating Model, 5. Additional Details
  • 20. Additional Details | Future Scope. Industrialization as competitive differentiation:
    • High throughput of innovation analytics (AI/ML)
    • Predictive analytics at scale
    • Data-driven, real-time what-if analysis
    • Harmonized MDM; ML- and AI-based DQ
    • Fast, repeatable time-to-market from idea to product
  • 21. Additional Details | Risk & Mitigation (Technical). Risks and mitigating actions:
    • Data loss risk: reconciliation, checkpointing, auditing, and monitoring; use of fault-tolerant ingestion/migration tools such as Azure Data Factory (Az Copy activity).
    • Data corruption and data integrity risk: data validation in which each record is compared bidirectionally; every record in the old system is checked against the target system, and every record in the target system against the old system.
    • Interference risk (simultaneous use of the source application): align with the stakeholders of each source on how bandwidth can be shared; the "Bring It Together" team comes into play to address this.
    • Schema evolution (changing dimensions): the Delta file format's schema evolution feature, relying on schema-on-read; to further ensure no incompatible schemas come in, a catalog and governance layer (Databricks Unity Catalog) will be leveraged.
    • Authorization risk: MFA and identity federation; row- and column-level access controls via Delta Lake.
    • Data security risk: apply encryption where possible and appropriate; all tokens and keys will be securely stored in Azure Key Vault and rotated at regular intervals.
    • Downtime due to migration: a replicate-and-activate approach.
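The bidirectional data validation in the risk table can be sketched as follows. This is a minimal, stdlib-only illustration with hypothetical helper names; in the actual migration the same comparison would run at scale on Spark, per table and partition. Each record is fingerprinted by content and compared in both directions, source-to-target and target-to-source.

```python
import hashlib

def _fingerprint(row):
    # Stable, order-independent hash of a record's contents.
    payload = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode()).hexdigest()

def reconcile(source_rows, target_rows, key="id"):
    """Bidirectional comparison: every source record is checked against
    the target and every target record against the source. Returns keys
    missing on either side plus keys whose record contents differ."""
    src = {r[key]: _fingerprint(r) for r in source_rows}
    tgt = {r[key]: _fingerprint(r) for r in target_rows}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "missing_in_source": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(
            k for k in src.keys() & tgt.keys() if src[k] != tgt[k]
        ),
    }
```

A non-empty result in any of the three buckets would fail the migration gate for that table and trigger re-ingestion of the affected records.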
  • 22. Additional Details | Risk & Mitigation (Other). Risks and mitigating actions:
    • Resource availability and competing priorities: make sure employees are fully advised about participation in workshops and/or interviews; get the right people at the right time.
    • Senior leadership buy-in and delays in decision making: strong support from the leadership group, including areas not directly affected by the initial changes (one team, one direction); establish governance to provide clarity on accountabilities for decision making.
    • Potential impacts to other projects: strong support from senior leadership if existing projects need to be put on hold; review the current state of ongoing projects to see how they impact the finance model; prioritize major changes and tackle the big obstacles upfront.
    • Lack of people adoption (major change): an agile, inspirational change management and communication structure; leverage the Bring It Together team, and roles such as change management experts, to steward people readiness and prepare for change.
    • Design in isolation (enterprise integration): work with scalable and flexible design principles in mind to ensure proper integration and alignment with the business; it is a partnership approach; gather key inputs to support cross-functional process design decisions where applicable.
    • Availability of key data inputs and information: simplify data requests to collect data and information at the appropriate level of detail; assign designated Databricks and Telecom Bell contacts to ensure a smooth and timely transition of data; use the discovery phase to identify hidden environmental risks to foresee and mitigate.
  • 23. Additional Details | Assumptions:
    1. Platform: Telecom Bell's on-premises platform is owned and managed by Telecom Bell, and Databricks will get the necessary support to extend the setup to provision the solution per the scope of this effort.
    2. Data security: Telecom Bell is responsible for the design, integration, and operation of all client identity and access management, security incident and event management, vulnerability scanning, and security testing tooling and processes, as appropriate.
    5. Access & setup: Telecom Bell will provide system access to all source systems or applications required by the scope, and access to systems and environments (including DEV and SIT) within 5 business days of receipt of a request.
    6. Access & setup: Databricks personnel will not have access to unencrypted PII data; Telecom Bell is responsible for encrypting any PII data prior to extraction into the Databricks platform.
    7. Access & setup: PII and GDPR data handling will be done by Telecom Bell per existing delivery practices; any additional arrangement is out of scope.
    9. Project management: Telecom Bell will provide relevant functional, technical, and process documentation for the data platforms and systems in scope.
    10. Project management: Telecom Bell will nominate full-time business and technical SMEs aligned to this project per the agreed pod structure.
    11. Project management: Telecom Bell data owners/nominees will make every attempt to attend Scrum meetings and ceremonies to present progress on the issues assigned.
    12. Project management: Telecom Bell will make sure the team gets the required time and support from all stakeholders for the complete success of the project.
    14. Data build: the Databricks team will reuse and extend the existing data ingestion tooling and framework to support ingestion into the platform; the project will carry out a data discovery exercise to assess local-market data quality and readiness.
    15. Data build: the source system inventory has already been identified and is in place.
    16. License: the Cloudera CDH on-premises license expired in March 2022; the required extended support has been obtained.
  • 24. Additional Details | Questions:
    • Is there an onboarding guide for the consultants to get started in your environment?
    • Is there a source system inventory already identified that can be shared?
    • What are the roles and skills of the existing 10 engineers on the team?
    • What is the current data governance mechanism?
    • Other than Cloudera, what other paid subscriptions and packages are installed on the architecture concerned?
    • Is there any major business contingency on this project plan? If so, what is the impact of delayed delivery?
    • Which compliance requirements and regulations must Telecom Bell follow for the data concerned?
    • Does Telecom Bell already have an Azure account? If so, what level of enterprise support plan is subscribed?
    • Does Telecom Bell already have a Confluent account? If so, what level of enterprise support plan is subscribed?
    • Are any other licenses due to expire?
    • What is Cloudera's extended-support expiry date?
  • 25. Thank you. Thank you so much for your time today.
  • 26. Yashodhan Kale
    BACKGROUND: Modern technologist | Data and ML at scale. Design and drive clients' data and AI journeys powered by cloud analytics expertise. Offering data-product-mindset-driven solutions to deliver platforms and beyond: self-service framework, rapid experimentation lab, democratized data, data products marketplace, multi-cloud solutions, data lake, data fabric, data mesh patterns with federated governance, domain-specific ownership, and more.
    RELEVANT FUNCTIONAL AND INDUSTRY EXPERIENCE. Industry focus: Healthcare, Retail, Market Research, Finance. Functional expertise: Digital Transformation, Analytics and CDO Strategy, Open Source, Machine Learning, IoT, Data-Driven Reinvention.
    SELECTED EXPERIENCES:
    • Fortune 5 American healthcare company: established and managed DevOps, data engineering, and ML engineering teams in close collaboration with data scientists. Set up a self-service data and ML platform on Azure for a retail enterprise, incorporating an experimentation framework, model training pipelines, and real-time inference using Azure AKS, Kubeflow, and Snowflake. Implemented an Rx enterprise data and ML platform on Azure, enabling ETL pipelines with Databricks and Apache Airflow. Led large-scale projects, including legacy modernization and Rx and retail personalization programs that impact millions of lives daily. Collaborated with technology partners MSFT and NVIDIA to present objectives and findings and incorporate feedback for ML solutions on specialized NVIDIA GPUs. Architected and oversaw the implementation of a refrigerator IoT project on Azure, leveraging IoT Hub, Azure analytics, and Databricks. Led the development of SAP HANA to Spark integration. Managed the data engineering enhancement team for pharmacy-related projects, ensuring critical business deliveries. Designed data-driven solutions, including self-service analytics platforms, rapid experimentation labs, democratized data, multi-cloud solutions, data fabric, and data mesh patterns with federated governance and domain-specific ownership. Developed an ingestion framework for seamless data migration across projects and cloud storage services.
    • Multinational American information, data & market measurement company: built a retail store data aggregation engine (Retail Intelligence system) for 24 countries, initially using Hadoop MapReduce, later upgraded to Spark. Migrated on-premises batch processes to the cloud using Docker, Azure Batch Services, and Azure Shipyard for cost efficiency. Performed performance tuning on Apache Spark, cloud Hadoop clusters (HDI), and Databricks on Azure and Hadoop platforms.
    CERTIFICATIONS: Amazon Web Services Certified Data Analytics - Specialty; Amazon Web Services Solutions Architect - Associate; Cloudera Certified Developer for Apache Hadoop (CCDH).
    PREVIOUSLY: Sr. Cloud Solution Architect @ Amazon Web Services (Level 6); Sr. ML Engineering Manager @ Databricks (Level 6).
    WHAT HAS BROUGHT ME HERE: customer obsession, deliver results, earn trust, learn and be curious.
  • 27. ACID Compliant • Time Travel • Data as Product • Interoperability • Self-Service Experimentation • Scale & Pay as You Go • Lakehouse Governance • Data Migration • Identity Management, SSO • Event Streaming • Exactly-Once Semantics
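The exactly-once-semantics capability above can be illustrated in plain Python. In Kafka/Confluent, exactly-once is achieved with transactions and committed offsets; this is only a minimal sketch of the underlying idea (idempotent processing keyed by partition and offset), and the `IdempotentConsumer` class and message shape are illustrative assumptions, not Confluent APIs:

```python
# Sketch: effectively-exactly-once processing via idempotent application.
# A per-partition "last applied offset" ensures a redelivered message
# is never applied twice, even if the broker delivers it again on retry.

class IdempotentConsumer:
    def __init__(self):
        self.last_applied = {}   # partition -> highest offset already processed
        self.totals = 0          # example side effect: a running sum

    def process(self, partition, offset, value):
        """Apply a message at most once, even if it is redelivered."""
        if offset <= self.last_applied.get(partition, -1):
            return False         # duplicate delivery: skip, no double-count
        self.totals += value     # the actual side effect
        self.last_applied[partition] = offset
        return True

consumer = IdempotentConsumer()
consumer.process(0, 0, 10)
consumer.process(0, 1, 5)
consumer.process(0, 1, 5)   # redelivered after a retry: ignored
# consumer.totals is 15, not 20
```

In a real deployment the offset store and the side effect would be committed atomically (e.g. in one transaction), which is what Kafka's transactional producer provides.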
  • 28. Upfront cost • Not easy to integrate/connect • Lack of discoverability • Effort to make data HA & durable • End of support • Maintenance
  • 29. Platform & Architecture | Artifacts 1 Key components of the data platform: A World-Class Data Platform!
  • 31. MLOps
  • 34. Databricks Notebooks
1 Share insights — Quickly discover new insights with built-in interactive visualizations, or leverage libraries such as Matplotlib and ggplot. Export results and notebooks in HTML or IPYNB format, or build and share dashboards that always stay up to date.
2 Work together — Share notebooks and work with peers across teams in multiple languages (R, Python, SQL and Scala) and the libraries of your choice. Real-time coauthoring, commenting and automated versioning simplify collaboration while providing control.
3 Production at scale — Schedule notebooks to automatically run machine learning and data pipelines at scale. Create multistage pipelines using Databricks Workflows. Set up alerts and quickly access audit logs for easy monitoring and troubleshooting.
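The multistage-pipeline point above can be made concrete with a Databricks Workflows job definition. A hedged sketch of a Jobs API-style JSON payload — the job name, notebook paths, and cron schedule are placeholders, not Telecom Bell specifics:

```json
{
  "name": "network-qos-nightly",
  "schedule": {
    "quartz_cron_expression": "0 0 2 * * ?",
    "timezone_id": "UTC"
  },
  "tasks": [
    {
      "task_key": "ingest",
      "notebook_task": { "notebook_path": "/Pipelines/ingest_events" }
    },
    {
      "task_key": "transform",
      "depends_on": [ { "task_key": "ingest" } ],
      "notebook_task": { "notebook_path": "/Pipelines/transform_qos" }
    },
    {
      "task_key": "score",
      "depends_on": [ { "task_key": "transform" } ],
      "notebook_task": { "notebook_path": "/Pipelines/score_model" }
    }
  ]
}
```

The `depends_on` edges are what turn independent notebook runs into a multistage pipeline with ordering guarantees.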

Editor's Notes

  1. Ex-AWS, ex-Databricks ML Engineering Sr Manager. I have built data platforms and delivered campaign management and personalization that touch millions of lives a day. Extensively worked in the retail, healthcare, telecom and finance industries, worked in 3 different countries, and experienced start-up culture. And I know how to deliver results. Qualities that have brought me here: customer obsession, delivering results, earning trust, and never giving up on learning. FIFA, chess, salsa. Transition – that's me. With that, let's get going.
  2. 10K-ft overview of the business, the technical picture, and the plan; slightly deeper look into the platform. Transition – what, how and when.
  3. Personalization, customer engagement, regulations (data privacy and security), data-volume explosion, showing growth and profits. Top priority: improve QoS. Transition – let's look at some technical challenges.
  4. To address the business challenges above, a strong data and analytics strategy is needed: maintaining the pace of innovation = experimentation capability = pay as you go is crucial, plus easy access to data, plus a SaaS model of services. Benchmark. Red flags: the Kafka and Spark architecture which processes network data; end of support. Transition – we saw the business and technical challenges, so what's the plan? The direction: next slide.
  5. We will improve the QoS of the network and start with the migration: Databricks, Confluent, Telecom Bell's 10 engineers. We plan to complete this project in 12 months. Transition – alright, how do we achieve this? I have put together a plan that I will walk you through. Feedback, suggestions, and concerns are all welcome; we will craft the final version together.
  6. TB's on-premise architecture would look more or less like this.
  7. Transition – enough time on architecture. Let's take 2 differences and move on from here. 1. More optimized, more performant, with fewer configurations to worry about: z-ordering, vacuum, auto-optimize features. 2. Integration with UC (Unity Catalog).
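The optimization features named in this note can be sketched in Delta SQL — a hedged illustration only, with `network_events` as a placeholder table name:

```sql
-- Co-locate frequently filtered columns on disk for faster reads
OPTIMIZE network_events ZORDER BY (cell_id, event_ts);

-- Remove data files no longer referenced by the table (default retention applies)
VACUUM network_events;

-- Let Databricks compact small files automatically on write
ALTER TABLE network_events SET TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact'   = 'true'
);
```

On HDFS these housekeeping tasks were manual; on Delta they are single commands or table properties.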
  8. A leap towards a data-as-a-product mindset – federated governance, self-service platform, interoperability, sharing within and across the organization – notebooks and code (product mindset). Add Marketplace.
  9. Talk about a few in the interest of time. Security – Azure Key Vault, encryption where possible. Network setup – no data will flow through the public internet; private endpoints will be used. Principle of least access privilege (PoLAP). Zero downtime: replicate and then activate. Leverage customer assets first: you will see this in the next few slides. 10 engineers distributed across all project areas.
  10. The operating model is designed to deliver these objectives over the next 12 months, after the essential roadmap and planning piece. Platform: it not only runs existing apps, it empowers Bell to accelerate its pace of innovation and provides solutions beyond the scope of this project – personalized customer engagement and other business experiments. Also, the OM delivers a needed shift in mindset: think "data as a product", create a data-product culture – features of Marketplace, federated governance, Delta Sharing. Lastly, it delivers a pay-as-you-go, secure & low-maintenance solution that can handle the immediate need to migrate to cloud, given end of support.
  11. Sharing resources where possible; used all 10 TB engineers. Assuming enterprise support from Confluent and Azure. Total count.
  12. Diagnostic – complete picture of where we are, our pain points, scope of improvement, assets. Final end-state architecture.
  13. Timeline activity. If any party has any concerns, we can definitely relook at this and adjust to make it smoothly achievable.
  16. A few pain points – these services all run on premise; upgrades. Limitations: data platform is not scalable for analytics, AI/ML. Upfront capacity planning and cost. Governance of the data on HDFS is a challenge. Data sits in silos and is not easy to integrate/connect. Lack of discoverability of data (catalog). Housekeeping – maintenance of the in-house cluster is difficult through different portals and installations. Advanced disaster recovery, durability and availability. Bigger IT infra staff required.