APRIL, 2023
DataOps
The Future of Data Management - Embracing Agility, Collaboration, and Automation
Agenda
Introductions
DevOps to DataOps
CI/CD for Data Products
Orchestration, Testing and Monitoring
Questions
About Us
Jeewan Singh, Senior Principal, Data Analytics
Tomy Rhymond, Principal - Cloud Lead, Technology Enablement
So… what is DevOps, really?
What
DevOps is a cultural movement to:
• Improve Collaboration
• Automate operations (aka the “plumbing”)
• Increase the rate of deployment
• Improve quality and security
How
• Source Control
• CI/CD
• Infrastructure Automation (IaC)
• Automated Test and Validation
• Design for Scalability
• Use the Cloud
Why
Spend more time on valuable work… and have more fun!
Continuous Deployment Of Databases : Part 1
Data and analytics professionals face unique challenges with automation
State
• Application code is stateless
• Database contains valuable business data
• Change structure and data without loss
• Hand-crafting release scripts is error-prone
Down Time
• Application servers are easy to swap in/out
• Database servers are very difficult to swap in/out (even in a cluster)
• Can sometimes swap databases or tables in/out
Rolling back
• Applications are easy to roll back from source control
• Databases must be explicitly backed up and restored
• Very time-consuming
• Database unavailable during restore
Testing
• Application code is easy to test with unit tests
• Unit testing for databases is challenging
• Unit testing requires test data generation and management, which gets complicated quickly
Other
• Configuration changes deployed via CI/CD
• Most often only DBAs touch the database (control)
• Prod databases don’t match source control (drift)
• Database change management is difficult
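The drift problem (prod databases that no longer match source control) can be detected automatically. A minimal sketch, assuming SQLite as a stand-in database and a hypothetical `EXPECTED_SCHEMA` dictionary that would normally be generated from the files in source control; the function names are illustrative and not taken from any tool:

```python
import sqlite3

# Hypothetical schema as declared in source control: table -> column names.
EXPECTED_SCHEMA = {"customers": ["id", "name", "email"]}


def actual_schema(conn):
    """Read table and column names from a live SQLite database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


def find_drift(expected, actual):
    """Return human-readable differences between the two schemas."""
    problems = []
    for table in sorted(set(expected) | set(actual)):
        if table not in actual:
            problems.append(f"missing table: {table}")
        elif table not in expected:
            problems.append(f"unexpected table: {table}")
        elif expected[table] != actual[table]:
            problems.append(f"column mismatch in {table}")
    return problems


# Simulate a production database where someone added a column by hand.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, name TEXT, email TEXT, phone TEXT)"
)
drift = find_drift(EXPECTED_SCHEMA, actual_schema(conn))
print(drift)
```

Running a check like this on a schedule turns drift from a surprise at release time into an alert.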
These roadblocks add friction, prevent automation, and slow adoption of DataOps best practices
Fragile Column Mappings
Embedded Credentials
Hard-coded connections
Black-Box SaaS
GUI-Only Tools
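Two of these roadblocks, embedded credentials and hard-coded connections, are usually removed by reading settings from the environment at run time. A minimal sketch; the variable names in `REQUIRED_KEYS` and the function `load_connection_settings` are hypothetical:

```python
import os

# Hypothetical setting names; a CI/CD system or secret store would supply them.
REQUIRED_KEYS = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")


def load_connection_settings(env=None):
    """Build connection settings from the environment instead of source code."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return {key.lower(): env[key] for key in REQUIRED_KEYS}


# Usage with an explicit mapping so the example runs anywhere.
settings = load_connection_settings(
    {
        "DB_HOST": "localhost",
        "DB_NAME": "dw",
        "DB_USER": "etl",
        "DB_PASSWORD": "example",
    }
)
print(sorted(settings))
```

Failing fast on a missing setting keeps a misconfigured pipeline from running against the wrong target.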
5 Critical Mindset Changes
Traditional Mindset
• Business Requirements are Static
“Our job is to meet the agreed business requirements.”
• Single-Developer, Individual Ownership
“Someone will email me if it breaks.”
• UAT Testing Approach
“We will run some tests before we launch.”
• Everything Manual
“No time to build the automation yet.”
• Demos at End of Project
“Creating demos takes time.”
DevOps Mindset
• Business Requirements are Fluid
“We aren’t doing it right if we assume requirements are static.”
• Multiple Developers, Team Ownership
“Someone else may have to fix this if it breaks.”
• Continuous Testing Approach
“We wrote the tests before we started developing.”
• Mostly Automated
“No time to waste on manual stuff.”
• Demos Daily or Weekly
“Continual feedback is critical to success.”
DataOps is a collaborative and automated approach to
managing the entire lifecycle of data, from its creation to
its deletion, in a way that ensures that data is trustworthy,
accurate, and readily available to the right people at the
right time.
PEOPLE PROCESS TECHNOLOGY
DataOps Collaboration
• Product Owner/Architect
• Operations/Administration
• Chief Data Officer
• Data Analysts
• Data Scientist
• Data Engineer
DataOps is an approach to data analytics and data-driven decision
making that follows the agile methodology of continuous
improvement.
Source Data → Data Ingestion → Data Engineering → Data Analytics → Business Users
DataOps: CI/CD, Orchestration, Testing, Monitoring
DataOps practices are an investment whose dividends
increase with time and experience
• Increased speed of delivery from improved processes
• End-to-end efficient data from automated pipelines with feedback loops
• Improved productivity and collaboration from empowered developers
• Better business outcomes from happier customers
• Secure and compliant data from automated data quality checks, masking, tokenization and more
• Reduced mean time to resolution (MTTR) from a shift-left quality approach
• Increased data reliability and resiliency
• Developer empowerment with a DevOps culture that promotes collaboration, ownership and accountability
DataOps Principles
Analytics is code.
Differences can be spotted easily and are all committed to the code repo.
Orchestrate.
When everything is automated, we never have to choose between delivering new features and performing manual maintenance.
Make it reproducible.
The code runs the same way every time. There is no state to manage and there are no “two ways” to run it which might produce different results.
Disposable environments.
There’s no such thing as data loss. At any time, the production environment can be recycled, and a new environment can be spun up automatically.
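The last two principles can be illustrated together: if an environment is built only from code, it can be thrown away and rebuilt with the same result. A minimal sketch, assuming an in-memory SQLite database as the "environment" and a hypothetical `BUILD_STEPS` list standing in for scripts kept in source control:

```python
import sqlite3

# Hypothetical build steps; in practice these would live in source control.
BUILD_STEPS = [
    "CREATE TABLE sales (region TEXT, amount INTEGER)",
    "INSERT INTO sales VALUES ('east', 100), ('west', 250)",
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region",
]


def build_environment():
    """Create a brand-new database and run every build step in order."""
    conn = sqlite3.connect(":memory:")
    for statement in BUILD_STEPS:
        conn.execute(statement)
    return conn


def snapshot(conn):
    """Read the published result in a stable order for comparison."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


first = snapshot(build_environment())
second = snapshot(build_environment())  # discard the first and rebuild
print(first == second)
```

Because nothing is edited by hand after the build, the second environment is indistinguishable from the first.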
DataOps Maturity Model
CI/CD for Data Products
Taken from Stefana Muller in Dev Leaders Compare Continuous Delivery vs. Continuous Deployment vs. Continuous Integration
What do we mean when we say “CI/CD”?
CI/CD Definitions
Continuous Integration (CI)
is a software engineering practice in which developers integrate code into a shared repository several times a day in order to obtain rapid feedback on the feasibility of that code. CI enables automated build and testing so that teams can rapidly work on a single project together.
Continuous Deployment (also CD)
is the process by which qualified changes in software code or architecture are deployed to production as soon as they are ready and without human intervention.
Continuous Delivery (CD)
is a software engineering practice in which teams develop, build, test, and release software in short cycles. It depends on automation at every stage so that cycles can be both quick and reliable.
Developing with CI/CD
[Diagram: commits on the dev branch go through a pull request into the main branch, with pass (✔) or fail (❌) checks along the way. Passing changes rebuild a “Beta” copy of the DW, refreshed daily/hourly, and are then auto-published to the production DW.]
1. Continuous Integration (CI) Testing: Automatic or with every commit!
2. Continuous Delivery (CD): New changes automatically delivered in beta!
3. Continuous Deployment (also CD): New features and fixes delivered to customers automatically!
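The gating in this flow can be sketched in a few lines: each stage runs only if the one before it passed, so a failed check in beta never reaches production. This is an illustration only; the stage functions below are hypothetical placeholders, not the API of any CI/CD product:

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    log = []
    for name, action in stages:
        ok = action()
        log.append(f"{name}: {'pass' if ok else 'fail'}")
        if not ok:
            break
    return log


# Hypothetical stage functions; each returns True on success.
def ci_tests() -> bool:
    return True


def rebuild_beta() -> bool:
    return True


def beta_checks() -> bool:
    return False  # simulate a failed check against the beta copy


def publish_to_production() -> bool:
    return True


log = run_stages(
    [
        ("ci tests", ci_tests),
        ("rebuild beta", rebuild_beta),
        ("beta checks", beta_checks),
        ("publish to production", publish_to_production),
    ]
)
print(log)
```

In this run the publish stage is never attempted because the beta checks failed.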
CI/CD Getting Started Checklist
1) Store all your files in source control.
2) Create a full deployment script.
3) Create a text file pointing to your deployment script.
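Step 2 does not need to be elaborate. A minimal sketch of a full deployment script, assuming SQLite as the target and a hypothetical `MIGRATIONS` list whose applied versions are recorded in a `schema_version` table so the script can be re-run safely:

```python
import sqlite3

# Hypothetical ordered migrations; in a real repo each would be a file.
MIGRATIONS = [
    (1, "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)"),
    (2, "ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'"),
]


def deploy(conn):
    """Apply every migration not yet applied; return the versions applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version, statement in MIGRATIONS:
        if version > current:
            conn.execute(statement)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
print(deploy(conn))  # first run applies everything
print(deploy(conn))  # second run finds nothing to do
```

The text file in step 3 is then typically the CI configuration, whose job is simply to call this script on every merge.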
Orchestration, Testing and Monitoring
DataOps Compared to DevOps
DevOps: Develop → Build → Test → Deploy → Run (CI, CD)
DataOps: Sandbox → Develop → Orchestrate → Test → Deploy → Orchestrate → Monitor (CI, CD)
© 4/13/23 Slalom. All Rights Reserved. Proprietary and Confidential.
Modern Cloud Data Reference Architecture
[Diagram: layered reference architecture. The recoverable labels are listed below.]
Cross-cutting services
• Data Pipeline Orchestration and Monitoring
• Security: Authorization & Authentication
• Continuous Integration, Continuous Deployment (CI/CD)
Data Source Layer
• Categories: External, Unstructured Data, Loyalty, E-Commerce, POS Technology, RX Technology, Patient Support Program, Wholesale Distribution, Data Warehouses
• Source systems shown (grouping not recoverable from the extracted text): Vistex JDA MBA Anzio; SoloChain MSA; Maple CMSV2; PharmaClick POS; Reflex POS; Tulip MagicBox; Guardian Rewards; Uniprix Rewards; Proxim Rewards; Newsletter LMS NPS / Survey; IQVIA Nielsen Health Canada; Program Participation; First Data Bank; IQ DataSmart UniBi; Website / Facebook; Email (Dialogue); Mobile Apps; UniSante; ProxiSante; PTS (db); Proxim POS Cyberlog ICN; SIR; DLD
• RX Technology: Kroll, Reflex RX, Fillware, Compliance Cube, AssysteRx, PharmaClick RX, Applied Robotics, Ubik
• Data Warehouses: GCP E-commerce, RelayHealth Hub, SAP, BeWell, Diem
Data Ingestion
• Batch Ingestion: Cloud based ETL, Event driven f(x), REST APIs
• Streaming Ingestion: Real-time ingestion, IoT Devices
Data Lake
• Raw Zone
• Processed Zone
• Curated Zone
Machine Learning (Predictions & Recommendations)
• Feature Generation
• Model Development
• Model Deployment
• Model Monitoring
Central Data Storage
• Transformation & Business Rules
• Data Warehouse: Facts, Dimensions, Aggregates, Views
• Patient Data Hub: Merge & Match, Deduplication, Enrichment
Data Governance and Access
• Data Access Layer, Governance Layer, Management Layer
• Centralized Policies
• Data Quality Monitoring
• Data Lineage & Metadata
• Data Catalog
• Consistent Controls
• Security Policy Enforcement
• Data Tokenization & Masking
Consumption Layer
• Operational Reports: Warehouse & Specialty, Store Sales & Growth, Kiosk Reports
• External Data Portal: Nielsen Data, External Kiosk, SharePoint
• Sandbox Environment: Ad-hoc data analysis, Raw data analysis, Merging / curating data sets
• Analytical Dashboard: Manufacturer Insights, Patient Insights, Pharmacy Insights
• API Apps: LifeLabs Apps, Loyalty Program Apps, etc.
End-User (access via VPN)
• Manufacturer
• Management Team
• Internal Analytics Teams
• External Users
• General Pharmacy Operations Team
• Specialty Pharmacy Operations Team
• Patient / Customer
• Data Governance SMEs
Orchestrate, Test and Monitor
Orchestrate
• Both infrastructure as code and data pipeline code in a single pipeline
• Composer (GCP), Airflow, Azure Data Factory (Azure), DBT, DataOps.live, Informatica, Matillion, Stitch, AWS Data Pipeline
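At its core, orchestration means running tasks in dependency order, with infrastructure and data steps in the same graph. A minimal sketch using Python's standard-library `graphlib` (Python 3.9+); it does not use the API of any tool listed above, and the task names in `DEPENDENCIES` are hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "provision_infrastructure": set(),
    "ingest": {"provision_infrastructure"},
    "transform": {"ingest"},
    "test": {"transform"},
    "publish": {"test"},
}


def run_order(dependencies):
    """Return an order in which every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(DEPENDENCIES)
print(order)
```

A real orchestrator adds scheduling, retries and logging on top of this ordering.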
Monitor
• Cloud Resources
  • GCP Monitoring, CloudWatch, Azure Monitor, Datadog
• Data pipelines
  • Respective tools, native cloud monitoring dashboards
• Data Quality
  • ETL tools, manual tools on top of data platforms
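Where no dedicated tool is in place, a basic data-pipeline monitor can be a scheduled check on row counts and freshness. A minimal sketch; `check_table_health` and its thresholds are hypothetical and would be tuned per table:

```python
from datetime import datetime, timedelta, timezone


def check_table_health(
    row_count, last_loaded_at, now, min_rows=1, max_age=timedelta(hours=24)
):
    """Return a list of alerts for a table; an empty list means healthy."""
    alerts = []
    if row_count < min_rows:
        alerts.append(f"row count {row_count} is below {min_rows}")
    if now - last_loaded_at > max_age:
        alerts.append("data is stale")
    return alerts


# Fixed timestamps keep the example deterministic.
now = datetime(2023, 4, 13, 12, 0, tzinfo=timezone.utc)
alerts = check_table_health(
    row_count=0,
    last_loaded_at=now - timedelta(hours=30),
    now=now,
)
print(alerts)
```

The resulting alerts could be forwarded to whichever monitoring dashboard the team already uses.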
Test
• At the end of the pipeline run
• DBT, DataOps.live, Google Dataform, Boomi, Informatica, Matillion, Great Expectations, TSQLT
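The tools above each have their own syntax, but an end-of-run test usually amounts to a set of assertions about the output data. A tool-independent sketch in plain Python; the column names `order_id` and `amount` and the three checks are hypothetical examples:

```python
def run_quality_checks(rows):
    """Run simple checks over a list of row dicts; return failed check names."""
    failures = []
    ids = [row["order_id"] for row in rows]
    if len(ids) != len(set(ids)):
        failures.append("order_id is unique")
    if any(row["amount"] is None for row in rows):
        failures.append("amount is not null")
    if any(row["amount"] is not None and row["amount"] < 0 for row in rows):
        failures.append("amount is non-negative")
    return failures


# Sample output with a duplicate key, a null and a negative value.
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": -5.0},
]
failures = run_quality_checks(rows)
print(failures)
```

A pipeline would treat a non-empty result as a failed run and skip publishing.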
From ETL to ELTP
ETL: Extract → Transform → Load
ELTP: Extract → Load → Transform → Publish
Benefits of ELT over ETL:
• non-destructive updates
• improved stability and recoverability
The “Publish” step signals that data is available and ready for downstream subscribers. It may involve shipping a copy of the data into the data lake, replicating to multiple Redshift clusters, populating BI models, or similar actions.
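The four ELTP steps can be sketched end to end. This example assumes SQLite as the warehouse and hypothetical table names (`raw_sales`, `clean_sales`, `published_sales`); the point is that raw data is loaded untouched, so a bad transform can be rerun without re-extracting:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Extract: hypothetical records pulled from a source system.
extracted = [("2023-04-01", "east", "100"), ("2023-04-01", "west", "n/a")]

# Load: store the raw values untouched so they can be reprocessed later.
conn.execute("CREATE TABLE raw_sales (sale_date TEXT, region TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", extracted)

# Transform: build a cleaned table from raw; the raw table is never modified.
conn.execute(
    "CREATE TABLE clean_sales AS "
    "SELECT sale_date, region, CAST(amount AS INTEGER) AS amount "
    "FROM raw_sales WHERE amount GLOB '[0-9]*'"
)

# Publish: expose the cleaned data under a stable name for subscribers.
conn.execute("CREATE VIEW published_sales AS SELECT * FROM clean_sales")

published = conn.execute("SELECT region, amount FROM published_sales").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]
print(published, raw_count)
```

The unparseable row is excluded from the published view but is still present in `raw_sales`, which is the non-destructive property the slide refers to.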
At the core of DataOps is your organization’s information architecture
• How well do you know your data?
• Do you trust your data?
• Are you able to quickly detect errors?
• Can you make changes incrementally without “breaking” your entire data pipeline?
The critical areas below can transform your data pipeline:
• Data Curation services
• Metadata Management
• Data Governance
• Master Data Management
• Self-Service interaction
Thank You.
Questions?

More Related Content

What's hot

Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
Databricks
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
Adrien Blind
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
James Serra
 
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps ManifestoODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps Manifesto
DataKitchen
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
LibbySchulze
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
DataWorks Summit
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data Catalog
DATAVERSITY
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
DataKitchen
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
Databricks
 
Screw DevOps, Let's Talk DataOps
Screw DevOps, Let's Talk DataOpsScrew DevOps, Let's Talk DataOps
Screw DevOps, Let's Talk DataOps
Kellyn Pot'Vin-Gorman
 
Data Architecture Brief Overview
Data Architecture Brief OverviewData Architecture Brief Overview
Data Architecture Brief Overview
Hal Kalechofsky
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
DATAVERSITY
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture
Rajesh Kumar
 
Azure Data Factory Data Flow
Azure Data Factory Data FlowAzure Data Factory Data Flow
Azure Data Factory Data Flow
Mark Kromer
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
Matei Zaharia
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
Vivek Aanand Ganesan
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks
 
Data Modeling on Azure for Analytics
Data Modeling on Azure for AnalyticsData Modeling on Azure for Analytics
Data Modeling on Azure for Analytics
Ike Ellis
 

What's hot (20)

Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps ManifestoODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps Manifesto
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data Catalog
 
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup   -- Nov 2019Washington DC DataOps Meetup   -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
Screw DevOps, Let's Talk DataOps
Screw DevOps, Let's Talk DataOpsScrew DevOps, Let's Talk DataOps
Screw DevOps, Let's Talk DataOps
 
Data Architecture Brief Overview
Data Architecture Brief OverviewData Architecture Brief Overview
Data Architecture Brief Overview
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture
 
Azure Data Factory Data Flow
Azure Data Factory Data FlowAzure Data Factory Data Flow
Azure Data Factory Data Flow
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Data Modeling on Azure for Analytics
Data Modeling on Azure for AnalyticsData Modeling on Azure for Analytics
Data Modeling on Azure for Analytics
 

Similar to DataOps , cbuswaw April '23

What is DevOps?
What is DevOps?What is DevOps?
What is DevOps?
Mesut Güneş
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
DATAVERSITY
 
DevOps at Scale: How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
DevOps at Scale:  How Datadog is using AWS and PagerDuty to Keep Pace with Gr...DevOps at Scale:  How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
DevOps at Scale: How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
Amazon Web Services
 
DevOps 101 - IBM Impact 2014
DevOps 101 - IBM Impact 2014 DevOps 101 - IBM Impact 2014
DevOps 101 - IBM Impact 2014
Sanjeev Sharma
 
Quality 4.0 and reimagining quality
Quality 4.0 and reimagining qualityQuality 4.0 and reimagining quality
Quality 4.0 and reimagining quality
Dr. Anish Cheriyan (PhD)
 
Digital Disruption with DevOps - Reference Architecture Overview
Digital Disruption with DevOps - Reference Architecture OverviewDigital Disruption with DevOps - Reference Architecture Overview
Digital Disruption with DevOps - Reference Architecture Overview
IBM UrbanCode Products
 
IBM Collaborative Lifecycle Management Solution for DevOps v6
IBM Collaborative Lifecycle Management Solution for DevOps v6IBM Collaborative Lifecycle Management Solution for DevOps v6
IBM Collaborative Lifecycle Management Solution for DevOps v6
Strongback Consulting
 
SplunkLive! London 2016 Splunk for Devops
SplunkLive! London 2016 Splunk for DevopsSplunkLive! London 2016 Splunk for Devops
SplunkLive! London 2016 Splunk for Devops
Splunk
 
How SQL Change Automation helps you deliver value faster
How SQL Change Automation helps you deliver value fasterHow SQL Change Automation helps you deliver value faster
How SQL Change Automation helps you deliver value faster
Red Gate Software
 
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
minseok kim
 
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API CallsAWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
Amazon Web Services
 
Back To Basics
Back To BasicsBack To Basics
Back To Basics
kamalikamj
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
RTTS
 
Data summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data opsData summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data ops
Ryan Gross
 
Using Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
Using Lean Thinking to Identify and Address Delivery Pipeline BottlenecksUsing Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
Using Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
IBM UrbanCode Products
 
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
Splunk
 
Continuous Integration and Continuous Delivery on Azure
Continuous Integration and Continuous Delivery on AzureContinuous Integration and Continuous Delivery on Azure
Continuous Integration and Continuous Delivery on Azure
CitiusTech
 
A DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scaleA DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scale
Sanjeev Sharma
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
Cameron. A. Bradbury
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
Cameron. A. Bradbury
 

Similar to DataOps , cbuswaw April '23 (20)

What is DevOps?
What is DevOps?What is DevOps?
What is DevOps?
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
DevOps at Scale: How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
DevOps at Scale:  How Datadog is using AWS and PagerDuty to Keep Pace with Gr...DevOps at Scale:  How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
DevOps at Scale: How Datadog is using AWS and PagerDuty to Keep Pace with Gr...
 
DevOps 101 - IBM Impact 2014
DevOps 101 - IBM Impact 2014 DevOps 101 - IBM Impact 2014
DevOps 101 - IBM Impact 2014
 
Quality 4.0 and reimagining quality
Quality 4.0 and reimagining qualityQuality 4.0 and reimagining quality
Quality 4.0 and reimagining quality
 
Digital Disruption with DevOps - Reference Architecture Overview
Digital Disruption with DevOps - Reference Architecture OverviewDigital Disruption with DevOps - Reference Architecture Overview
Digital Disruption with DevOps - Reference Architecture Overview
 
IBM Collaborative Lifecycle Management Solution for DevOps v6
IBM Collaborative Lifecycle Management Solution for DevOps v6IBM Collaborative Lifecycle Management Solution for DevOps v6
IBM Collaborative Lifecycle Management Solution for DevOps v6
 
SplunkLive! London 2016 Splunk for Devops
SplunkLive! London 2016 Splunk for DevopsSplunkLive! London 2016 Splunk for Devops
SplunkLive! London 2016 Splunk for Devops
 
How SQL Change Automation helps you deliver value faster
How SQL Change Automation helps you deliver value fasterHow SQL Change Automation helps you deliver value faster
How SQL Change Automation helps you deliver value faster
 
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
Pivotal korea transformation_strategy_seminar_enterprise_dev_ops_20160630_v1.0
 
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API CallsAWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
AWS Partner: Grindr: Aggregate, Analyze, and Act on 900M Daily API Calls
 
Back To Basics
Back To BasicsBack To Basics
Back To Basics
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
 
Data summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data opsData summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data ops
 
Using Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
Using Lean Thinking to Identify and Address Delivery Pipeline BottlenecksUsing Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
Using Lean Thinking to Identify and Address Delivery Pipeline Bottlenecks
 
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
Data-Driven DevOps: Improve Velocity and Quality of Software Delivery with Me...
 
Continuous Integration and Continuous Delivery on Azure
Continuous Integration and Continuous Delivery on AzureContinuous Integration and Continuous Delivery on Azure
Continuous Integration and Continuous Delivery on Azure
 
A DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scaleA DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scale
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 

More from Jason Packer

Third Party Cookies: Columbus DAW March 2024
Third Party Cookies: Columbus DAW March 2024Third Party Cookies: Columbus DAW March 2024
Third Party Cookies: Columbus DAW March 2024
Jason Packer
 
Cbuswaw October '23, Marketing Mix Modeling
Cbuswaw October '23, Marketing Mix ModelingCbuswaw October '23, Marketing Mix Modeling
Cbuswaw October '23, Marketing Mix Modeling
Jason Packer
 
Generative AI and SEO
Generative AI and SEOGenerative AI and SEO
Generative AI and SEO
Jason Packer
 
Google Analytics Alternatives
Google Analytics AlternativesGoogle Analytics Alternatives
Google Analytics Alternatives
Jason Packer
 
Google Analytics Alternatives
Google Analytics AlternativesGoogle Analytics Alternatives
Google Analytics Alternatives
Jason Packer
 
Web Analytics Wednesday April 2020 - Customer Journey Mapping
Web Analytics Wednesday April 2020 - Customer Journey MappingWeb Analytics Wednesday April 2020 - Customer Journey Mapping
Web Analytics Wednesday April 2020 - Customer Journey Mapping
Jason Packer
 
Introduction to Factor Analysis
Introduction to Factor AnalysisIntroduction to Factor Analysis
Introduction to Factor Analysis
Jason Packer
 
Product Analytics at Web Analytics Wednesday
Product Analytics at Web Analytics WednesdayProduct Analytics at Web Analytics Wednesday
Product Analytics at Web Analytics Wednesday
Jason Packer
 
Columbus Web Analytics Wednesday September 2019
Columbus Web Analytics Wednesday September 2019Columbus Web Analytics Wednesday September 2019
Columbus Web Analytics Wednesday September 2019
Jason Packer
 
How to Present Test Results to Inspire Action
How to Present Test Results to Inspire ActionHow to Present Test Results to Inspire Action
How to Present Test Results to Inspire Action
Jason Packer
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Jason Packer
 
CBUSWAW - October 2017 Alain Stephan
CBUSWAW - October 2017 Alain StephanCBUSWAW - October 2017 Alain Stephan
CBUSWAW - October 2017 Alain Stephan
Jason Packer
 
Attribution 101
Attribution 101Attribution 101
Attribution 101
Jason Packer
 
CBUSWAW presentation July 2016
CBUSWAW presentation July 2016CBUSWAW presentation July 2016
CBUSWAW presentation July 2016
Jason Packer
 
CBUSWAW presentation May 2016
CBUSWAW presentation May 2016CBUSWAW presentation May 2016
CBUSWAW presentation May 2016
Jason Packer
 
Digging into Data Collection
Digging into Data CollectionDigging into Data Collection
Digging into Data Collection
Jason Packer
 
Columbus WordCamp 2015
Columbus WordCamp 2015Columbus WordCamp 2015
Columbus WordCamp 2015
Jason Packer
 

More from Jason Packer (17)

Third Party Cookies: Columbus DAW March 2024
Third Party Cookies: Columbus DAW March 2024Third Party Cookies: Columbus DAW March 2024
Third Party Cookies: Columbus DAW March 2024
 
Cbuswaw October '23, Marketing Mix Modeling
Cbuswaw October '23, Marketing Mix ModelingCbuswaw October '23, Marketing Mix Modeling
Cbuswaw October '23, Marketing Mix Modeling
 
Generative AI and SEO
Generative AI and SEOGenerative AI and SEO
Generative AI and SEO
 
Google Analytics Alternatives
Google Analytics AlternativesGoogle Analytics Alternatives
Google Analytics Alternatives
 
Google Analytics Alternatives
Google Analytics AlternativesGoogle Analytics Alternatives
Google Analytics Alternatives
 
Web Analytics Wednesday April 2020 - Customer Journey Mapping
Web Analytics Wednesday April 2020 - Customer Journey MappingWeb Analytics Wednesday April 2020 - Customer Journey Mapping
Web Analytics Wednesday April 2020 - Customer Journey Mapping
 
Introduction to Factor Analysis
Introduction to Factor AnalysisIntroduction to Factor Analysis
Introduction to Factor Analysis
 
Product Analytics at Web Analytics Wednesday
Product Analytics at Web Analytics WednesdayProduct Analytics at Web Analytics Wednesday
Product Analytics at Web Analytics Wednesday
 
Columbus Web Analytics Wednesday September 2019
Columbus Web Analytics Wednesday September 2019Columbus Web Analytics Wednesday September 2019
Columbus Web Analytics Wednesday September 2019
 
How to Present Test Results to Inspire Action
How to Present Test Results to Inspire ActionHow to Present Test Results to Inspire Action
How to Present Test Results to Inspire Action
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
 
CBUSWAW - October 2017 Alain Stephan
CBUSWAW - October 2017 Alain StephanCBUSWAW - October 2017 Alain Stephan
CBUSWAW - October 2017 Alain Stephan
 
Attribution 101
Attribution 101Attribution 101
Attribution 101
 
CBUSWAW presentation July 2016
CBUSWAW presentation July 2016CBUSWAW presentation July 2016
CBUSWAW presentation July 2016
 
CBUSWAW presentation May 2016
CBUSWAW presentation May 2016CBUSWAW presentation May 2016
CBUSWAW presentation May 2016
 
Digging into Data Collection
Digging into Data CollectionDigging into Data Collection
Digging into Data Collection
 
Columbus WordCamp 2015
Columbus WordCamp 2015Columbus WordCamp 2015
Columbus WordCamp 2015
 



DataOps, cbuswaw April '23

  • 1. APRIL, 2023 DataOps The Future of Data Management - Embracing Agility, Collaboration, and Automation
  • 2. Agenda: Introductions; DevOps to DataOps; CI/CD for Data Products; Orchestration, Testing and Monitoring; Questions
  • 3. About Us. Jeewan Singh, Senior Principal, Data Analytics. Tomy Rhymond, Principal - Cloud Lead, Technology Enablement.
  • 4. So…. what is DevOps, really???
      What: DevOps is a cultural movement to: • Improve collaboration • Automate operations (aka the “plumbing”) • Increase the rate of deployment • Improve quality and security
      How: • Source Control • CI/CD • Infrastructure Automation (IaC) • Automated Test and Validation • Design for Scalability • Use the Cloud
      Why: Spend more time on valuable work … and have more fun!
  • 5. Continuous Deployment of Databases: Part 1. Data and analytics professionals face unique challenges for automation.
      State: Application code is stateless. Database contains valuable business data. Change structure and data without loss. Hand-crafting release scripts is error-prone.
      Down Time: Application servers are easy to swap in/out. Database servers are very difficult to swap in/out (even in a cluster). Can sometimes swap databases or tables in/out.
      Rolling back: Applications are easy to roll back from source control. Databases must be explicitly backed up and restored. Very time-consuming. Database unavailable during restore.
      Testing: Application code is easy to test with unit tests. Unit testing for databases is challenging. Unit testing requires test data generation and management, which gets complicated quickly.
      Other: Configuration changes are deployed via CI/CD. Most often only DBAs touch the database (control). Prod databases don’t match source control (drift). Database change management is difficult.
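The "drift" challenge on this slide (production databases that no longer match source control) can be illustrated with a small check. This is a minimal sketch, not something from the deck: it assumes SQLite as a stand-in database, and the table names and the `detect_drift` helper are hypothetical.

```python
import sqlite3


def schema_snapshot(conn):
    """Return {table_name: whitespace-normalized CREATE statement} for user tables."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    return {name: " ".join(sql.split()) for name, sql in rows}


def detect_drift(expected_ddl, live_conn):
    """Compare DDL kept in source control against a live database."""
    # Build the expected schema in a throwaway in-memory database.
    reference = sqlite3.connect(":memory:")
    reference.executescript(expected_ddl)
    expected = schema_snapshot(reference)
    reference.close()

    actual = schema_snapshot(live_conn)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "changed": sorted(
            t for t in set(expected) & set(actual) if expected[t] != actual[t]
        ),
    }


# Hypothetical DDL as it would be stored in the repository.
EXPECTED_DDL = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
"""

# Hypothetical "production" database that was changed by hand.
live = sqlite3.connect(":memory:")
live.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT);
CREATE TABLE audit_tmp (id INTEGER);
""")

report = detect_drift(EXPECTED_DDL, live)
print(report)
```

Comparing normalized DDL text is a deliberately simple choice; a real setup would more likely compare column metadata or rely on a migration tool's own drift report.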
  • 6. These roadblocks add friction, prevent automation, and slow adoption of DataOps best practices: • Fragile Column Mappings • Embedded Credentials • Hard-coded connections • Black-Box SaaS • GUI-Only Tools
  • 7. 5 Critical Mindset Changes
      Traditional Mindset: • Business Requirements are Static: “Our job is to meet the agreed business requirements.” • Single-Developer, Individual Ownership: “Someone will email me if it breaks.” • UAT Testing Approach: “We will run some tests before we launch.” • Everything Manual: “No time to build the automation yet.” • Demos at End of Project: “Creating demos takes time.”
      DevOps Mindset: • Business Requirements are Fluid: “We aren’t doing it right if we assume requirements are static.” • Multiple Developers, Team Ownership: “Someone else may have to fix this if it breaks.” • Continuous Testing Approach: “We wrote the tests before we started developing.” • Mostly Automated: “No time to waste on manual stuff.” • Demos Daily or Weekly: “Continual feedback is critical to success.”
  • 8. DataOps is a collaborative and automated approach to managing the entire lifecycle of data, from its creation to its deletion, in a way that ensures that data is trustworthy, accurate, and readily available to the right people at the right time. People, Process, Technology.
  • 10. DataOps is an approach to data analytics and data-driven decision making that follows the agile methodology of continuous improvement.
      Data flow: Source Data → Data Ingestion → Data Engineering → Data Analytics → Business Users
      DataOps: CI/CD, Orchestration, Testing, Monitoring
  • 11. DataOps practices are an investment whose dividends increase with time and experience: • Increased speed of delivery from improved processes • End-to-end efficient data from automated pipelines with feedback loops • Improved productivity and collaboration from empowered developers • Better business outcomes from happier customers • Secure and compliant data from automated data quality checks, masking, tokenization and more • Reduced mean time to resolution (MTTR) from a shift-left quality approach • Increased data reliability and resiliency • Developer empowerment through a DevOps culture that promotes collaboration, ownership, and accountability
  • 12. DataOps Principles
      Analytics is code. Differences can be spotted easily and are all committed to the code repo.
      Orchestrate. When everything is automated, we never have to choose between delivering new features and performing manual maintenance.
      Make it reproducible. The code runs the same way every time. There is no state to manage and there are no “two ways” to run it which might produce different results.
      Disposable environments. There’s no such thing as data loss. At any time, the production environment can be recycled, and a new environment can be spun up automatically.
  • 15. What do we mean when we say “CI/CD”? CI/CD Definitions (taken from Stefana Muller in “Dev Leaders Compare Continuous Delivery vs. Continuous Deployment vs. Continuous Integration”)
      Continuous Integration (CI) is a software engineering practice in which developers integrate code into a shared repository several times a day in order to obtain rapid feedback on the feasibility of that code. CI enables automated build and testing so that teams can rapidly work on a single project together.
      Continuous Delivery (CD) is a software engineering practice in which teams develop, build, test, and release software in short cycles. It depends on automation at every stage so that cycles can be both quick and reliable.
      Continuous Deployment (also CD) is the process by which qualified changes in software code or architecture are deployed to production as soon as they are ready and without human intervention.
  • 16. Developing with CI/CD
      Diagram labels: commit (×5), dev branch, main branch, Pull Request, ✔ / ❌ check marks, Rebuild a “Beta” Copy of DW, Auto-Publish to Production DW, Refreshed daily/hourly.
      1. Continuous Integration (CI) Testing: Automatic or with every commit!
      2. Continuous Delivery (CD): New changes automatically delivered in beta!
      3. Continuous Deployment (also CD): New features and fixes delivered to customers automatically!
      CI/CD Getting Started Checklist: 1) Store all your files in source control. 2) Create a full deployment script. 3) Create a text file pointing to your deployment script.
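Step 2 of the checklist, a full deployment script, might look roughly like the sketch below. It is an illustration under assumptions rather than the presenters' script: SQLite stands in for the warehouse, and the `schema_migrations` table, the `deploy` function, and the migration file names are hypothetical. Each `.sql` file kept in source control is applied once, in name order, so the same script can run on every commit.

```python
import sqlite3
import tempfile
from pathlib import Path


def deploy(conn, migrations_dir):
    """Apply every not-yet-applied .sql file in name order; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in done:
            continue  # already deployed, so re-running the script is safe
        conn.executescript(path.read_text())
        conn.execute(
            "INSERT INTO schema_migrations (name) VALUES (?)", (path.name,)
        )
        conn.commit()
        applied.append(path.name)
    return applied


# Hypothetical migration files, written to a temp dir so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "001_create_sales.sql").write_text(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL);"
    )
    Path(tmp, "002_add_region.sql").write_text(
        "ALTER TABLE sales ADD COLUMN region TEXT;"
    )
    conn = sqlite3.connect(":memory:")
    first_run = deploy(conn, tmp)   # applies both files
    second_run = deploy(conn, tmp)  # nothing left to apply
    columns = [row[1] for row in conn.execute("PRAGMA table_info(sales)")]

print(first_run, second_run, columns)
```

The "text file pointing to your deployment script" in step 3 would then be the CI system's pipeline definition, whose exact format depends on the tool in use.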
  • 18. DataOps Compared to DevOps
      DevOps: Develop → Build → Test → Deploy → Run (CI and CD phases marked)
      DataOps: Sandbox → Develop → Orchestrate → Test → Deploy → Orchestrate → Monitor (CI and CD phases marked)
  • 19. Modern Cloud Data Reference Architecture (diagram). ©4/13/23 Slalom. All Rights Reserved. Proprietary and Confidential.
      Cross-cutting: Data Pipeline Orchestration and Monitoring; Security: Authorization & Authentication; Continuous Integration, Continuous Deployment (CI/CD); VPN.
      User labels: End-User; Manufacturer; Management Team; Internal Analytics Teams; External Users; General Pharmacy Operations Team; Specialty Pharmacy Operations Team; Patient / Customer; Data Governance SMEs.
      Data Source Layer (labels as they appear in the diagram): External Unstructured Data Loyalty E-Commerce POS Technology Patient Support Program Wholesale Distribution Vistex JDA MBA Anzio SoloChain MSA Maple CMSV2 PharmaClick POS Reflex POS Tulip MagicBox Guardian Rewards Uniprix Rewards Proxim Rewards Newsletter LMS NPS / Survey IQVIA Nielsen Health Canada Program Participation First Data Bank IQ DataSmart UniBi Website / Facebook Email (Dialogue) Mobile Apps UniSante ProxiSante PTS (db) Proxim POS Cyberlog ICN SIR DLD RX Technology Kroll Reflex RX Fillware Compliance Cube AssysteRx PharmaClick RX Applied Robotics Ubik Data Warehouses GCP E-commerce RelayHealth Hub SAP BeWell Diem
      Data Ingestion: Batch Ingestion (• Cloud based ETL • Event driven f(x) • REST APIs); Streaming Ingestion (• Real-time ingestion • IoT Devices).
      Data Lake: Raw Zone; Processed Zone; Curated Zone.
      Machine Learning (Predictions & Recommendations): Feature Generation; Model Development; Model Deployment; Model Monitoring.
      Central Data Storage: Data Warehouse; Transformation & Business Rules; Patient Data Hub; Facts; Dimensions; Aggregates; Views; Merge & Match; Deduplication; Enrichment.
      Data Governance and Access: Data Access Layer; Governance Layer; Management Layer; Centralized Policies; Data Quality Monitoring; Data Lineage & Metadata; Data Catalog; Consistent Controls; Security Policy Enforcement; Data Tokenization & Masking.
      Consumption Layer: Operational Reports (• Warehouse & Specialty • Store Sales & Growth • Kiosk Reports); External Data Portal (• Nielsen Data • External Kiosk • SharePoint); Sandbox Environment (• Ad-hoc data analysis • Raw data analysis • Merging / curating data sets); Analytical Dashboard (• Manufacturer Insights • Patient Insights • Pharmacy Insights); API Apps (• LifeLabs Apps • Loyalty Program Apps • Etc.).
  • 20. Orchestrate, Test and Monitor
      Orchestrate: • Both infrastructure as code and data pipeline code with a single pipeline • Composer (GCP), Airflow, Azure Data Factory (Azure), DBT, DataOps.live, Informatica, Matillion, Stitch, AWS Data Pipeline
      Monitor: • Cloud resources: GCP Monitoring, CloudWatch, Azure Monitor, Datadog • Data pipelines: respective tools, native cloud monitoring dashboards • Data quality: ETL tools, manual tools on top of data platforms
      Test: • At the end of the pipeline run • DBT, DataOps.live, Google Dataform, Boomi, Informatica, Matillion, Great Expectations, tSQLt
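To make "test at the end of the pipeline run" concrete, here is a minimal sketch of data quality checks in plain Python. It does not use the API of any tool named on the slide; the column names (`order_id`, `customer_id`, `amount`) and the `run_quality_checks` function are hypothetical, chosen only to show the usual kinds of checks: row count, uniqueness, not-null, and value range.

```python
def run_quality_checks(rows):
    """Return a list of human-readable failures; an empty list means all checks passed."""
    failures = []

    # Row-count check: an empty output usually means an upstream step failed silently.
    if not rows:
        failures.append("row_count: table is empty")

    # Uniqueness check on the (assumed) primary key.
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("unique: duplicate order_id values")

    # Not-null check on a required foreign key.
    nulls = sum(1 for r in rows if r["customer_id"] is None)
    if nulls:
        failures.append(f"not_null: {nulls} row(s) missing customer_id")

    # Range check on a numeric measure.
    negatives = sum(1 for r in rows if r["amount"] < 0)
    if negatives:
        failures.append(f"range: {negatives} row(s) with negative amount")

    return failures


good = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 0.0},
]
bad = good + [{"order_id": 2, "customer_id": None, "amount": -5.0}]

good_failures = run_quality_checks(good)
bad_failures = run_quality_checks(bad)
print(good_failures)
print(bad_failures)
```

In an orchestrated pipeline, a non-empty failure list would typically fail the run and feed the monitoring dashboards, so bad data is stopped before it is published.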
  • 21. From ETL to ELTP
      ETL: Extract → Transform → Load
      ELTP: Extract → Load → Transform → Publish
      Benefits of ELT over ETL: • non-destructive updates • improved stability and recoverability
      The “Publish” step signals that data is available and ready for downstream subscribers; it may involve shipping a copy of the data into the data lake, replicating to multiple Redshift clusters, populating BI models, or similar actions.
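The four ELTP steps can be sketched end to end as follows. This is an illustrative sketch under assumptions: SQLite stands in for the warehouse, the sample records and the table names `raw_sales` and `daily_sales` are invented, and "publish" is reduced to calling subscriber callbacks. Raw data is loaded untouched and transformed afterwards, which is what makes the update non-destructive.

```python
import sqlite3


def extract():
    # Hypothetical source records; a real pipeline would call an API or read files.
    return [
        ("2023-04-01", "store_a", "19.99"),
        ("2023-04-01", "store_b", "5.00"),
        ("2023-04-02", "store_a", "12.50"),
    ]


def load(conn, records):
    # Load the records as-is (amounts stay text) so the raw copy is never altered.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_sales (sale_date TEXT, store TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", records)


def transform(conn):
    # Rebuild the derived table from raw data; it can be recreated at any time.
    conn.execute("DROP TABLE IF EXISTS daily_sales")
    conn.execute(
        "CREATE TABLE daily_sales AS "
        "SELECT sale_date, ROUND(SUM(CAST(amount AS REAL)), 2) AS total "
        "FROM raw_sales GROUP BY sale_date"
    )


def publish(conn, subscribers):
    # Signal downstream consumers that the transformed data is ready.
    row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
    for notify in subscribers:
        notify({"table": "daily_sales", "rows": row_count})


conn = sqlite3.connect(":memory:")
received = []  # stands in for a downstream subscriber

load(conn, extract())
transform(conn)
publish(conn, [received.append])

totals = conn.execute(
    "SELECT sale_date, total FROM daily_sales ORDER BY sale_date"
).fetchall()
raw_rows = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]
print(totals, raw_rows, received)
```

Because the transform only reads `raw_sales`, a faulty business rule can be fixed and the derived table rebuilt without re-extracting from the source systems.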
  • 22. At the core of DataOps is your organization’s information architecture: • How well do you know your data? • Do you trust your data? • Are you able to quickly detect errors? • Can you make changes incrementally without “breaking” your entire data pipeline?
      Critical areas that can transform your data pipeline: • Data Curation services • Metadata Management • Data Governance • Master Data Management • Self-Service interaction