Agile Data :
Virtual Data Revolution
Kyle@delphix.com
kylehailey.com
slideshare.com/khailey
The Goal - Theory of Constraints (TOC)
Improvement
not made at the constraint
is an illusion
Factory floor : straightforward
[Diagram: production line with one station marked as the constraint]
Factory floor : optimize at the constraint
• Tuning before the constraint : stock piling
• Tuning after the constraint : starvation
Factory Floor Optimization
Drum, Rope, Buffer (DBR)
[Diagram: line of stations labeled 45, 75, 30, 50, 80 and 25, with the constraint marked]
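A minimal sketch of the Theory of Constraints point, assuming each station on the line has a fixed hourly capacity and the line can only run as fast as its slowest station. The capacities and the helper names `line_throughput` and `tune` are hypothetical and chosen only for illustration.

```python
def line_throughput(capacities):
    """A serial line delivers no more than its slowest station."""
    return min(capacities)


def tune(capacities, station, factor):
    """Return a copy of the line with one station sped up by `factor`."""
    tuned = list(capacities)
    tuned[station] = tuned[station] * factor
    return tuned


# Hypothetical capacities (units per hour); station 2 is the constraint.
stations = [45, 75, 25, 50, 80]
constraint = stations.index(min(stations))

baseline = line_throughput(stations)
# Doubling a non-constraint station changes nothing for the line.
tuned_elsewhere = line_throughput(tune(stations, 0, 2))
# Doubling the constraint raises throughput until the next constraint appears.
tuned_constraint = line_throughput(tune(stations, constraint, 2))

print(baseline, tuned_elsewhere, tuned_constraint)
```

Under these assumptions, doubling the first station leaves throughput unchanged, while doubling the constraint lifts it to the capacity of the next slowest station.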
For IT does the Theory of Constraints work ?
The Phoenix Project
• Identify constraints
• Define metrics
• Set priorities
• Clarify goals
• Fast iterations
• CI continuous integration
• Cloud
• Agile
• Kanban
The Phoenix Project
What is the constraint in IT ?
“One of the most powerful things that IT can do is get
environments to development and QA when they need
it”
2x throughput increase with agile data
In this presentation :
• Data Constraint
• Solution
• Use Cases
In this presentation :
• Data Constraint
I. strains IT
II. price is huge
III. companies unaware
• Solution
• Use Cases
Data is the constraint
60% Projects Over Schedule
85% delayed waiting for data
Data is the Constraint
CIO Magazine Survey:
Current situation: only getting worse … Data Doomsday
I. Data Constraint strains IT
If you can’t satisfy the business demands then your process is broken.
I. Data Constraint : moving data is hard
– Storage & Systems
– Personnel
– Time
Typical Architecture
• Production : Instance, File system, Database
• Backup : File system, Database
• Reporting : Instance, File system, Database
• Dev, QA, UAT : three Instances, each with its own File system and Database
Triple Tax
II. Data Constraint price is huge
II. Data constraint: Data floods company infrastructure
92% of the cost of business, in the financial services business, is “data”
www.wsta.org/resources/industry-articles
Most companies have 2-9% IT spending
http://uclue.com/?xq=1133
Data management is the largest part of IT expense
Gartner: Data Doomsday
II. Data constraint price is Huge
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
II. Data constraint price is huge : 1. IT Capital
• Hardware
–Servers
–Storage
–Network
–Data center floor space, power,
cooling
II. Data constraint price is huge : 2. IT Operations
• People
– DBAs
– SYS Admin
– Storage Admin
– Backup Admin
– Network Admin
• Hours : 1000s just for DBAs
• $100s of millions for data center modernizations
II. Data constraint price is Huge : 3. App Dev
• Inefficient QA: Higher costs of QA
• QA Delays : Greater re-work of code
• Sharing DB Environments : Bottlenecks
• Using DB Subsets: More bugs in Prod
• Slow Environment Builds: Delays
“if you can't measure it you can’t manage it”
II. Data Tax is Huge : 3. App Dev
Slow Environment Builds
Never enough environments
II. Data constraint price is Huge : 4. Business
Ability to capture revenue
• Business Intelligence
– Old data = less intelligence
• Business Applications
– Delays cause
=> Lost Revenue
II. Data constraint price is Huge : 4. Business
[Bar chart: Storage, IT Ops, Dev, Revenue in Billion $, axis 0 to 30]
II. Data constraint price is Huge
• Four Areas data tax hits
1. IT Capital resources $
2. IT Operations personnel $
3. Application Development $$$
4. Business $$$$$$$
Review
Part III. Data Constraint companies unaware
III. Data Constraint companies unaware
DBA Developer
III. Data Constraint companies unaware
#1 Biggest Enemy :
IT departments believe
– best processes
– greatest technology
– Just the way it is
There are always new and better ways to do things
III. Data Constraint companies unaware
Why do I need an iPhone ?
Don’t we already do that ?
SQL scripts
Alter database begin backup
Back up datafiles
Redo
Archive
Alter database end backup
RMAN
III. Data Constraint companies unaware
• Ask Questions
– me: we provision environments in minutes for
almost no extra storage.
– Customer: We already do that
– me: How long does it take a developer to get
an environment after they ask ?
– Customer: 2-3 weeks
– me: we do it in 2-3 minutes
III. Data Constraint companies unaware
How to enlighten? Ask for metrics
– How long does it take a developer to get a DB copy?
• QA?
– How old is data ?
• BI and DW : ETL batch windows
• QA and Dev : how often refreshed
– How much storage used for copies?
• How much DBA time?
Clone 1, Clone 2, Clone 3 : 99% of blocks are identical
Solution : Thin Clone
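A rough sketch of why thin clones save storage, assuming every clone shares the unchanged blocks of one source copy and stores only its own changed blocks. The 1000 GB source size and the function names are hypothetical; the 1% changed share follows from the "99% of blocks are identical" figure.

```python
def full_copy_storage_gb(source_gb, clones):
    """Every clone is a complete physical copy of the source."""
    return source_gb * clones


def thin_clone_storage_gb(source_gb, clones, changed_percent):
    """One shared copy plus each clone's privately changed blocks."""
    shared = source_gb
    private = source_gb * clones * changed_percent / 100
    return shared + private


source_gb = 1000   # hypothetical source database size
clones = 3         # Clone 1, Clone 2, Clone 3

full = full_copy_storage_gb(source_gb, clones)
thin = thin_clone_storage_gb(source_gb, clones, changed_percent=1)

print(f"full copies: {full} GB, thin clones: {thin} GB")
```

With these numbers the three full copies need three times the source size, while the thin clones need only slightly more than one copy.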
Technology Core : file system snapshots
• VMware Linked Clones
– Not supported for Oracle
• EMC
– 16 snapshots
– Write performance impact
• NetApp
– 255 snapshots
• ZFS
– Unlimited snapshots
Engine not equal car
Three Core Parts
[Diagram: Production (File System, Instance), Storage, Development (Instance)]
1. Copy : Sync, Snapshots, Time Flow, Purge
2. Clone (snapshot) : Compress, Share Cache, Storage
3. Mount, recover, rename : Self Service, Roles & Security, Rollback & Refresh, Branch & Tag
Database Virtualization
Three Physical Copies
Three Virtual Copies
Data
Virtualization
Appliance
Install Delphix on x86 hardware
Intel hardware
Allocate Any Storage to Delphix
Allocate Storage
Any type
Pure Storage + Delphix
Better Performance for
1/10 the cost
One time backup of source database
Database
Production
File system, File system
Upcoming
Supports
Instance, Instance, Instance
Application Stack Data
DxFS (Delphix) Compress Data
Database
Production
Data is compressed, typically to 1/3 size
File system
Instance, Instance, Instance
Incremental forever change collection
Database
Production
File system
Changes
• Collected incrementally forever
• Old data purged
File system
Time Window
Production
Instance, Instance, Instance
Virtual DB (Jonathan Lewis, © 2013)
Snapshot 1 – full backup once only at link time
a b c d e f g h i
We start with a full backup - analogous to a level 0 rman backup. Includes
the archived redo log files needed for recovery. Run in archivelog mode.
Snapshot 2 (from SCN)
b' c'
a b c d e f g h i
The "backup from SCN" is analogous to a level 1 incremental backup (which
includes the relevant archived redo logs). Sensible to enable BCT.
Delphix executes
standard rman scripts
a b c d e f g h i
Apply Snapshot 2
b' c'
The Delphix appliance unpacks the rman backup and "overwrites" the initial
backup with the changed blocks - but DxFS makes new copies of the blocks
Drop Snapshot 1
a b' c' d e f g h i
The call to rman leaves us with a new level 0 backup, waiting for recovery.
But we can pick the snapshot root block. We have EVERY level 0 backup.
Creating a vDB
a b' c' d e f g h i
The first step in creating a vDB is to take a snapshot of the filesystem as at
the backup you want (then roll it forward)
My vDB (filesystem) and Your vDB (filesystem) both point at : a b' c' d e f g h i
Creating a vDB (after a change in one vDB)
My vDB (filesystem) and Your vDB (filesystem) still share : a b' c' d e f g h i
The vDB that changed block i gets its own copy i’
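The snapshot walk-through above can be sketched as a toy copy-on-write block store. This is not how DxFS is implemented; `BlockStore`, `full_backup`, `apply_incremental` and `read` are hypothetical names, and a snapshot is modeled simply as a mapping from block name to a stored version.

```python
class BlockStore:
    """Toy copy-on-write store: blocks are only ever added, never overwritten."""

    def __init__(self):
        self.blocks = {}     # version id -> block content
        self.next_id = 0

    def put(self, content):
        version_id = self.next_id
        self.blocks[version_id] = content
        self.next_id += 1
        return version_id


def full_backup(store, data):
    """Snapshot 1: store every block once (like a level 0 backup)."""
    return {name: store.put(content) for name, content in data.items()}


def apply_incremental(store, snapshot, changes):
    """New snapshot that shares unchanged blocks and adds only changed ones."""
    new_snapshot = dict(snapshot)
    for name, content in changes.items():
        new_snapshot[name] = store.put(content)
    return new_snapshot


def read(store, snapshot):
    """Resolve a snapshot back into block contents."""
    return {name: store.blocks[vid] for name, vid in snapshot.items()}


store = BlockStore()
snap1 = full_backup(store, {name: name for name in "abcdefghi"})
snap2 = apply_incremental(store, snap1, {"b": "b'", "c": "c'"})

# Two vDBs start from the same snapshot; one of them then changes block i.
my_vdb = apply_incremental(store, snap2, {"i": "i'"})
your_vdb = dict(snap2)

shared = sum(1 for name in snap2 if my_vdb[name] == your_vdb[name])
print(len(store.blocks), "physical blocks,", shared, "shared between the vDBs")
```

In this sketch the two vDBs and both snapshots together occupy 12 physical blocks rather than four full sets of nine, and snapshot 1 stays readable after snapshot 2 is applied.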
Before Delphix
• Production : Instance, File system, Database
• Dev, QA, UAT : three Instances, each with its own File system and Database
• Reporting : Instance, File system, Database
• Backup : File system, Database
“triple data tax”
With Delphix
[Diagram: Production, Dev & QA, Reporting and Backup each appear as Instances with a Database, served from a single File system and Database]
Use Cases
1. Development
2. QA
3. Recovery
4. Business Intelligence
5. Modernization
Development
• Parallelized Environments
• Full size environments
• Self Service
Development
Development without Agile Data: bottlenecks
Frustration Waiting
Old Unrepresentative Data
Development without Agile Data: subsets of DB
Development without Agile Data: bugs
Development with Agile Data: Parallelize
Environments
gif by Steve Karam
Development with Agile Data: Full size copies
Development without Agile Data: slow env build times
1. Developer : Asks for DB
2. Manager : approves
3. DBA : Request system, Setup DB
4. System Admin : Request storage, Setup machine
5. Storage Admin : Allocate storage (take snapshot)
6. Developer : Get Access
3-6 Months to Deliver Data
Development without Agile Data: slow env build times
Why are hand offs so expensive?
1 hour, 1 day, 9 days
http://martinfowler.com/bliki/NoDBA.html
Development with Agile Data: Self Service
QA
• Fast
• Parallel
• Rollback
• A/B testing
QA without Agile Data : Long Build times
96% of QA time was building environment
$.04/$1.00 actual testing vs. setup
[Diagram: Build Time vs. QA Test]
QA without Agile Data : slow QA
[Timeline: Sprint 1, Sprint 2, Sprint 3 with alternating Build QA Env and QA phases; Bug, Code, X markers]
[Chart: Cost To Correct (0 to 70) vs. Delay in Fixing the bug (1 to 7)]
Software Engineering Economics – Barry Boehm (1981)
QA with Agile Data : Fast environments with Branching
[Diagram: Source, Dev and QA instances; QA branched from Dev]
QA with Agile Data: Fast environments with Branching
1% of QA time was building environment
$.99/$1.00 actual testing vs. setup
[Diagram: Build Time vs. QA Test]
QA with Agile Data: bugs found fast
[Timeline: Sprint 1, Sprint 2, Sprint 3 with Build QA Env and QA phases; Bug, Code, X markers]
QA with Agile Data: Parallel environments
Instance
Instance
Instance
Instance
Source
QA with Agile Data: Rewind for patch and QA
testing
Instance Instance
Development
Time Window
Prod
QA with Agile Data: A/B testing
Instance
Instance
Instance
Index 1
Index 2
Quality
• Prod & Dev Backups
• Surgical recovery
• Recovery of Production
• Recovery of Development
• Bug Forensics
Quality : 50 days of backup in the size of production
Quality : Surgical recovery
Instance Instance
Development
Time Window
Before drop, Drop
Source
Quality: recovery of development
Instance
Instance
Dev1 VDB
Time Window
Time Window
Dev1 VDB
Instance
Source
Source
Dev2 VDB Branched
Quality : recovery of production
Instance Instance
VDB, Source
Time Window
Corruption
Quality : Forensics - Investigate Production Bugs
Instance
Time Window
Instance
Development
Bug
Yesterday
Yesterday
Business Intelligence
• 24x7 Batches
• Low Bandwidth
• Temporal Data
• Confidence Testing
Business Intelligence: ETL and Refresh Windows
[Timeline: 1pm, 10pm, 8am, noon]
Business Intelligence: ETL and DW refreshes taking longer
[Timeline by year, 2011 to 2015: 1pm, 10pm, 8am, noon; 10pm, 8am, noon, 9pm; 6am, 8am, 10pm]
Business Intelligence: ETL and DW Refreshes
Instance
Prod
Instance
DW & BI
Data Guard – requires full refresh if used
Active Data Guard – read only, most reports don’t work
Business Intelligence: Fast Refreshes
• Collect only Changes
• Refresh in minutes
Instance Instance
Prod
Instance
BI and DW
ETL
24x7
Business Intelligence: Temporal Data
Business Intelligence
a) 24x7 Batches & Refreshes
b) Temporal queries
c) Confidence testing
Modernization
1. Federated
2. Consolidation
3. Migration
4. Auditing
Modernization: Federated
Instance
Instance
Instance
Instance
Source1
Source2
Source1
Modernization: Federated
“I looked like a hero”
Tony Young, CIO Informatica
Modernization: Federated
Data movement required for 1 source DB and 4 clones
• Without Delphix (C = clone) : S C C C C, 5x Source Data Copy
• With Delphix (V = virtual DB) : S V V V V, < 1x Source Data Copy
Modernization: Consolidation
[Diagram: Dev, QA, UAT environments, Without Delphix vs. With Delphix]
Modernization: Auditing & Version Control
Data Control = Source Control for the Database
[Diagram: Production Time Flow with Dev, QA, UAT environments at versions 2.6, 2.7, 2.8]
• CIO, Insurance : 600 Applications
• CIO, Investment Banking : 180 Applications
• CIO, South America : 65 Applications
Use Cases
1. Development
• Parallelized Environments
• Full size environments
• Self Service
2. QA
• Fast
• Parallel
• Rollback
• A/B testing
3. Recovery
• Prod & Dev Backups
• Surgical recovery
• Recovery of Production
• Recovery of Development
• Bug Forensics
4. Business Intelligence
• 24x7 Batches
• Low Bandwidth
• Temporal Data
• Confidence Testing
5. Modernization
• Federated
• Consolidation
• Migration
• Auditing
Use Case Summary
1. Development
2. QA
3. Quality
4. Business Intelligence
5. Modernization
How expensive is the Data Constraint?
Before and after Delphix w/ Fortune 500 :
Dev throughput increase by 2x
• 10x Faster Financial Close
– 21 days down to 2
• 9x Faster BI refreshes
– 3 weeks per refresh to 3x a week
• 2x faster Projects
• 20% fewer bugs
Agile Data Quotes
• “Allowed us to shrink our project schedule from 12
months to 6 months.”
– BA Scott, NYL VP App Dev
• "It used to take 50-some-odd days to develop an
insurance product, … Now we can get a product to the
customer in about 23 days.”
– Presbyterian Health
• “Can't imagine working without it”
– Ramesh Shrinivasan CA Department of General Services
Summary
• Problem: Data is the constraint
• Solution: Agile data is small & fast
• Results: Deliver projects
– Half the Time
– Higher Quality
– Increase Revenue
Kyle@delphix.com
kylehailey.com
slideshare.net/khailey
Oracle 12c
80MB buffer cache ?
200GB Cache
[Charts: Tnxs/min (up to 5000) and Latency (up to 300 ms) vs. Users (1, 5, 10, 20, 30, 60, 100, 200)]
[Charts: Tnxs/min (up to 8000) and Latency (up to 600 ms) vs. Users (1, 5, 10, 20, 30, 60, 100, 200)]
Five 200GB database copies are cached with :
• $1,000,000 : 1TB cache on SAN
• $6,000 : 200GB shared cache on Delphix
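The cache comparison can be checked with a little arithmetic, assuming the five copies hold identical blocks so that one shared cache can serve all of them. The function names are hypothetical; the sizes and prices are the ones quoted on the slide.

```python
def separate_cache_gb(copy_gb, copies):
    """Each physical copy needs its own cache space."""
    return copy_gb * copies


def shared_cache_gb(copy_gb):
    """Virtual copies of the same blocks can share a single cached copy."""
    return copy_gb


copy_gb, copies = 200, 5
san_cost, shared_cost = 1_000_000, 6_000   # dollar figures from the slide

separate_gb = separate_cache_gb(copy_gb, copies)
shared_gb = shared_cache_gb(copy_gb)
cost_ratio = san_cost / shared_cost

print(f"{separate_gb} GB vs {shared_gb} GB of cache, about {round(cost_ratio)}x the cost")
```

Five separate 200GB copies add up to the 1TB SAN cache on the slide, while the shared cache stays at 200GB.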
[Diagrams: Source Full Copy gives Snapshot 1; each Source backup from SCN adds Snapshot 2, then Snapshot 3; Drop Snapshot removes Snapshot 1 and leaves Snapshot 2 and Snapshot 3]