Virtual Data Platform:
or
Revolutionizing Database Cloning
How can the DBA make the biggest
impact on the company
6/26/2014 1
http://kylehailey.com
kyle@delphix.com
The Goal
Theory of Constraints
Improvement
not made
at the constraint
is an illusion
factory floor optimization
Factory floor :
straight forward
Factory floor :
straight forward
constraint
Not a relay race
Factory floor :
Tune before constraint
constraint
Tuning here
Stock piling
Factory floor :
Tune after constraint
constraint
Tuning here
Starvation
Factory floor :
straight forward
constraint
Goal: find the constraint and optimize it
Factory Floor
Optimization
Goal: find the constraint and optimize it
Theory of Constraints
work for IT ?
• Goals Clarify
• Metrics Define
• Constraints Identify
• Priorities Set
• Iterations Fast
• CI
• Cloud
• Agile
• Kanban
“IT is the factory floor of this century”
The Phoenix Project
What is the
constraint
in IT ?
“One of the most powerful things that organizations
can do is to enable development and testing to get
environment they need when they need it“
11
1. Dev environments
2. QA setup
3. Code Architecture
4. Development
5. Product management
What is the constraint in IT?
What is the constraint in IT
If you can’t satisfy
the business demands
then your process is broken.
Data is the constraint
60% Projects Over Schedule
85% delayed waiting for data
Data is the Constraint
CIO Magazine Survey:
only getting worse
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
In this presentation :
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
In this presentation :
– Storage & Systems
– Personnel
– Time
I. Data Constraint :
moving data is hard
Typical Architecture
Production
Instance
File system
Database
Typical Architecture
Production
Instance
Backup
File system
Database
File system
Database
Typical Architecture
Production
Instance
Reporting Backup
File system
Database
Instance
File system
Database
File system
Database
Typical Architecture
Production
Instance
File system
Database
Instance
File system
Database
File system
Database
File system
Database
Instance
Instance
Instance
File system
Database
File system
Database
Dev, QA, UAT Reporting Backup
Triple Tax
Typical Architecture
Production
Instance
File system
Database
Instance
File system
Database
File system
Database
File system
Database
Instance
Instance
Instance
File system
Database
File system
Database
I. Data constraint:
Data floods infrastructure
92% of the cost of business,
in financial services business , is “data”
www.wsta.org/resources/industry-articles
Most companies have
2-9% IT spending , ½ on “data”
http://uclue.com/?xq=1133
Gartner: Data Doomsday
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
In this presentation :
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
II. Data constraint:
price is Huge
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
II. Data constraint:
price is Huge
• Hardware
–Servers
–Storage
–Network
–Data center floor space, power, cooling
II. Data constraint price is huge :
IT Capital
II. Data constraint price is huge :
IT Capital
Never enough environments
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
II. Data constraint:
price is Huge
• People
– DBAs
– SYS Admin
– Storage Admin
– Backup Admin
– Network Admin
• Hours : 1000s just for DBAs
• $100s Millions for data center modernizations
II. Data constraint price is huge:
IT Operations
II. Data constraint price is huge:
IT Operations
Developer Asks for DB Get Access
Manager approves
DBA Request
system
Setup DB
System
Admin
Request
storage
Setup
machine
Storage
Admin
Allocate
storage
(take snapshot)
Why are hand offs so expensive?
1hour
1 day
9 days
II. Data constraint price is
huge: IT Operations
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
II. Data constraint:
price is Huge
“One of the most powerful things that IT can do is get
environments to development and QA when they need it”
- Gene Kim author of The Phoenix Project
• Inefficient QA: Higher costs of QA
• QA Delays : Greater re-work of code
• Sharing DB Environments : Bottlenecks
• Using DB Subsets: More bugs in Prod
• Slow Environment Builds: Delays
II. Data constraint price is Huge :
Application Development
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
Part II. Data constraint:
price is Huge
Ability to capture revenue
• Business Applications
– Delays cause lost revenue
• Business Intelligence
– Old data = less intelligence
II. Data constraint price is Huge :
Business
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
In this presentation :
Part III. Data Constraint
companies unaware
III. Data Constraint
companies unaware
Developer or AnalystBoss, Storage Admin, DBA
Metrics
–Time
–Old Data
–Storage
–Analysts
–Audits
III. Data Constraint
companies unaware
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
In this presentation :
Clone 1 Clone 3Clone 2
99% of blocks are
identical
Solution
Clone 1 Clone 2 Clone 3
Thin Clone
• EMC
– 16 snapshots on Symmetrix
– Write performance impact
– No snapshots of snapshots
• Netapp
– 255 snapshots
• ZFS
– Compression
– Unlimited snapshots
– Snapshots of Snapshots
• DxFS
– “”
– Storage agnostic
– Shared cache in memory
Technology Core :
file system snapshots
Also check out new SSD storage such as:
Pure Storage, EMC XtremIO
Fuel not equal car
Challenges
1. Technical
2. Bureaucracy
II. Data constraint price is huge:
IT Operations
Developer Asks for DB Get Access
Manager approves
DBA Request
system
Setup DB
System
Admin
Request
storage
Setup
machine
Storage
Admin
Allocate
storage
(take snapshot)
1. Technical Challenge
Database
Luns
Production Filer
Target A
Target B
Target C
snapshot
clones
InstanceInstance
InstanceInstance
InstanceInstance
InstanceInstance
Instance
Source
Database
LUNs
snapshot
clonesProduction Filer
Development Filer
1. Technical Challenge
Instance
Target A
Target B
Target C
InstanceInstance
InstanceInstance
InstanceInstance
Instance
1. Technical Challenge
Copy
Time Flow
Purge
Production
File System Instance
DevelopmentStorage
21 3
Clone (snapshot)
Compress
Share Cache
Provision
Mount, recover, rename
Self Service, Roles & Security
Instance
Data Virtualization
How to get a Data Virtualization?
– EMC + SRDF + scripting
– Netapp + SMO + scripting
– Oracle EM 12c DBaaS + data guard + Netapp /ZFS + scripting
– Delphix
2 31
Production DevelopmentStorage
21 3
2 31
23 1
2 31
Final Goal
6/26/2014 51
Data Supply Chain
• Security
• Masking
• Chain of custody
• Self Service
• Roles
• Restrictions
• Developer
• Data Versioning
• Refresh, Rollback
• Audit:
• Live Archive
Snap Shots
Thin Cloning
Data Virtualization
Data Supply Chain
Install Delphix on x86
hardware
Intel hardware
Application Stack Data
Allocate Any Storage to
Delphix
Allocate Storage
Any type
Pure Storage + Delphix
Better Performance for
1/10 the cost
One time backup of
source database
Database
Production
File systemFile system
InstanceInstanceInstance
DxFS (Delphix) Compress
Data
Database
Production
Data is
compressed
typically 1/3
size
File system
InstanceInstanceInstance
Incremental forever
change collection
Database
Production
File system
Changes
• Collected incrementally forever
• Old data purged
File system
Time Window
Production
InstanceInstanceInstance
Virtual DB
57 / 30Jonathan Lewis
© 2013
Snapshot 1 – full backup
once only at link time
a b c d e f g h i
We start with a full backup - analogous to a level 0 rman backup. Includes
the archived redo log files needed for recovery. Run in archivelog mode.
Virtual DB
58 / 30Jonathan Lewis
© 2013
Snapshot 2 (from SCN)
b' c'
a b c d e f g h i
The "backup from SCN" is analogous to a level 1 incremental backup (which
includes the relevant archived redo logs). Sensible to enable BCT.
Delphix executes
standard rman scripts
Virtual DB
59 / 30Jonathan Lewis
© 2013
a b c d e f g h i
Apply Snapshot 2
b' c'
The Delphix appliance unpacks the rman backup and "overwrites" the initial
backup with the changed blocks - but DxFS makes new copies of the blocks
Virtual DB
60 / 30Jonathan Lewis
© 2013
Drop Snapshot 1
b' c'a d e f g h i
The call to rman leaves us with a new level 0 backup, waiting for recovery.
But we can pick the snapshot root block. We have EVERY level 0 backup
Virtual DB
61 / 30Jonathan Lewis
© 2013
Creating a vDB
b' c'a d e f g h i
The first step in creating a vDB is to take a snapshot of the filesystem as at
the backup you want (then roll it forward)
My vDB
(filesystem)
Your vDB
(filesystem)
b' c'a d e f g h i
Virtual DB
62 / 30Jonathan Lewis
© 2013
Creating a vDB
b' c'a d e f g h i
The first step in creating a vDB is to take a snapshot of the filesystem as at
the backup you want (then roll it forward)
My vDB
(filesystem)
Your vDB
(filesystem)
i’b' c'a d e f g h ib' c'a d e f g h i
Database Virtualization
Three Physical Copies
Three Virtual Copies
Data
Virtualization
Appliance
Before Virtual Data
Production Dev, QA, UAT
Instance
Reporting Backup
File system
Database
Instance
File system
Database
File system
Database
File system
Database
Instance
Instance
Instance
File system
Database
File system
Database
“triple data
tax”
With Virtual Data
Production
Instance
Database
Dev & QA
Instance
Database
Reporting
Instance
Database
Backup
Instance Instance Instance
Database
InstanceInstance
Database
InstanceInstance
File system
Database
Data
Virtualization
Appliance
• Problem in the Industry
• Solution
• Use Cases
In this presentation :
1. Development and QA
2. Recovery
3. Business
Use Cases
1. Development and QA
2. Recovery
3. Business
Use Cases
Development : bottlenecks
Frustration Waiting
Old Unrepresentative Data
Development : subsets
Development : bugs
http://martinfowler.com/bliki/NoDBA.html
Development without Virtual Data: slow env build
times
Development: Virtual Data
• Unlimited
• Full size
• Self Service
Development:
Virtual Data
Development
Virtual Data:
Easy
Instance
Instance
Instance
Instance
Source
Data
Virtualization
Appliance
DVA
Parallel
Environments
• QA
• Dev
Development Virtual Data:
Parallelize
gif by Steve Karam
Development Virtual
Data: Full size
Development Virtual
Data: Self Service
Dev2
Dev1Merge to dev1
Trunk
DB
VC
Fork
Fork
Fork
Fork
DBmaestro
QA : Virtual Data
• Fast
• Parallel
• Rollback
• A/B testing
QA
Virtual Data
QA : Long Build times
96% of QA time was building environment
$.04/$1.00 actual testing vs. setup
QA Build QA
QA Build QA
QA before virtual : resource expensive & slow
BugX
0
10
20
30
40
50
60
70
1 2 3 4 5 6 7
Delay in Fixing the bug
Cost
To
Correct
Software Engineering Economics – Barry Boehm (1981)
QA Virtual Data : Fast
Dev
QA
Instance
Prod
DVA
Time Flow
• Low Resource
• Find bugs Fast
QA Virtual Data : Fast
QA with Virtual Data:
Rewind
Instance
Instance
Development
Prod
QA with Virtual Data: A/B
Instance
Instance
Instance
Index 1
Index 2
Data Version Control
6/26/2014 86
Dev
QA
2.1
Dev
QA
2.2
2.1 2.2
Instance
Prod
DVA
1. Development and QA
2. Recovery
3. Business
Use Cases
• Backups
• Recovery
• Forensics
Recovery
Recovery: Backups
Recovery: scenarios
Instance
Instance
Recover VDB
Drop
Source
DVA
Recovery: Forensics
Instance
Instance
DevelopmentDVA
Source
Recovery: Development
Instance
Instance
Development
DVA
Source
Instance
Development
1. Development and QA
2. Recovery
3. Business Intelligence
Use Cases
Business Intelligence
Business Intelligence:
ETL and Refresh Windows
1pm 10pm 8am
noon
Business Intelligence:
batch taking too long
1pm 10pm 8am
noon
2011
2012
2013
2014
2015
2011
2012
2013
2014
2015
1pm 10pm 8am
noon
10pm 8am noon 9pm
6am 8am 10pm
Business Intelligence:
batch taking too long
Business Intelligence:
ETL and DW Refreshes
Instance
Prod
Instance
DW & BI
Before Virtual: limited, slow ETL and DW refreshes
• Collect only Changes
• Refresh in minutes
Virtual Data:
Fast Refreshes
Instance Instance
Prod BI and DW
ETL
24x7
DVA Instance
Virtual Data: Fast Refreshes
Temporal Data
Confidence testing
1. Federated
2. Migration
3. Auditing
Modernization
Modernization:
Federated
Instance
Instance
Instance
Instance
Source1
Source2
DVA
Modernization:
Federated
“I looked like a hero”
Tony Young, CIO Informatica
Modernization:
Federated
Modernization:
Migration
Audit
6/26/2014 107
Instance
Prod
DVA
Live Archive
Consolidation
1. Development & QA
2. Recovery
3. Business
Use Case Summary
How expensive is the
Data Constraint?
DVA at Fortune 500 :
Dev throughput increase by 2x
• 10 x Faster Financial Close
• 9x Faster BI refreshes
• 8x Faster surgical recovery
• 3x Project tracks
• 2x Faster Projects
How expensive is the
Data Constraint?
• Projects “12 months to 6 months.”
– New York Life
• Insurance product “about 50 days ... to about 23 days”
– Presbyterian Health
• “Can't imagine working without it”
– State of California
Virtual Data Quotes
• Problem: Data is the constraint
• Solution: Virtual Data
• Results:
– Half the time for projects
– Higher quality
– Increase revenue
Summary
Thank you!
• Kyle Hailey| Oracle ACE and Technical
Evangelist, Delphix
– Kyle@delphix.com
– kylehailey.com
– slideshare.net/khailey
Oracle 12c
80MB buffer cache ?
200GB
Cache
5000
Tnxs/minLatency
300
ms
1 5 10 20 30 60 100 200
with
1 5 10 20 30 60 100 200
Users
8000
Tnxs/minLatency
600
ms
1 5 10 20 30 60 100 200
Users
1 5 10 20 30 60 100 200
$1,000,000
1TB cache on SAN
$6,000
200GB shared cache on Delphix
Five 200GB database copies are
cached with :
6/26/2014 122
6/26/2014 123
Business Intelligence
a) 24x7 Batches
b) Temporal queries
c) Confidence testing
Thin Cloning
Snap
Manager
SnapManager
Repository
Protection
Manager
Snap Drive
Snap
Manager
Snap Mirror
Flex Clone
RMAN
Repository
Production
Development
DBA
Storage
Admin
1
tr-3761.pdf
Netapp
NetApp Filer - DevelopmentNetApp Filer - Production
Database
Luns
Snap
mirror
Snapshot Manager
for Oracle
Flexclone
Repository
Database
Snap
Drive
Protection
Manage
Production
Development
1 Netapp
Target A
Target B
Target C
InstanceInstance
InstanceInstance
InstanceInstance
Instance
Where we want to be
Database
File system
Production
Instance
Database
Development
Instance
Database
QA
Instance
Database
UAT
Instance
Snapshots
Instance Instance Instance Instance
EM 12c: Snap Clone
Production Development
Flexclone Flexclone
Netapp
Snap Manager for Oracle
II. Data constraint price is Huge : 4. Business
II. Data constraint price is
Huge : 4. Business
0 5 10 15 20 25 30
Storage
IT Ops
Dev
Revenue
Billion $
III. Data Constraint
companies unaware
#1 Biggest Enemy :
IT departments believe
– best processes
– greatest technology
– Just the way it is
There are always new
and better ways to do
things
III. Data Constraint
companies unaware
Why do I need an iPhone ?
Don’t we already do that ?
SQL scripts
Alter database begin backup
Back up datafiles
Redo
Archive
Alter database end backup
RMAN

Kscope 14 Presentation : Virtual Data Platform