Agile Data: revolutionizing data and database cloning
Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Agile Data: revolutionizing data and database cloning Presentation Transcript

  • 1. Agile Data : Virtual Data Revolution Kyle@delphix.com kylehailey.com slideshare.com/khailey
  • 2. The Goal - Theory of Constraints (TOC) Improvement not made at the constraint is an illusion
  • 3. Factory floor: straightforward
  • 4. Factory floor: straightforward [diagram label: constraint]
  • 5. Factory floor: optimize at the constraint [diagram labels: constraint; tuning here, stockpiling]
  • 6. Factory floor: optimize at the constraint [diagram labels: constraint; tuning here, stockpiling; tuning here, starvation]
  • 7. Factory floor: straightforward [diagram label: constraint]
  • 8. Factory Floor Optimization
  • 9. Drum, Rope, Buffer (DBR) [diagram values: 45, 75, 30, 50, 80; constraint: 25]
  • 10. For IT does the Theory of Constraints work ?
  • 11. The Phoenix Project • Identify constraints • Define metrics • Set priorities • Clarify goals • Fast iterations • CI (continuous integration) • Cloud • Agile • Kanban
  • 12. The Phoenix Project What is the constraint in IT ? “One of the most powerful things that IT can do is get environments to development and QA when they need it” 2x throughput increase with agile data
  • 13. In this presentation : • Data Constraint • Solution • Use Cases
  • 14. In this presentation : • Data Constraint I. strains IT II. price is huge III. companies unaware • Solution • Use Cases
  • 15. Data is the constraint • 60% of projects over schedule • 85% delayed waiting for data • CIO Magazine survey • Current situation: only getting worse … Data Doomsday
  • 16. In this presentation : • Data Constraint I. strains IT II. price is huge III. companies unaware • Solution • Use Cases
  • 17. I. Data Constraint strains IT If you can’t satisfy the business demands then your process is broken.
  • 18. I. Data Constraint : moving data is hard – Storage & Systems – Personnel – Time
  • 19. Typical Architecture [diagram: Production instance with file system and database]
  • 20. Typical Architecture [diagram: Production instance and Backup, each with file system and database]
  • 21. Typical Architecture [diagram: Production, Reporting, and Backup, each with file system and database]
  • 22. Typical Architecture: Triple Tax [diagram: Production; Dev, QA, UAT; Reporting; Backup; each with its own instance, file system, and database]
  • 23. Typical Architecture [diagram: six instances, each with its own file system and database]
  • 24. In this presentation : • Data Constraint I. strains IT II. price is huge III. companies unaware • Solution • Use Cases
  • 25. II. Data Constraint price is huge
  • 26. II. Data constraint: data floods company infrastructure • 92% of the cost of business, in the financial services business, is “data” (www.wsta.org/resources/industry-articles) • Most companies have IT spending of 2-9% (http://uclue.com/?xq=1133) • Data management is the largest part of IT expense • Gartner: Data Doomsday
  • 27. II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources $ 2. IT Operations personnel $ 3. Application Development $$$ 4. Business $$$$$$$
  • 28. II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources 2. IT Operations personnel 3. Application Development 4. Business
  • 29. II. Data constraint price is huge : 1. IT Capital • Hardware –Servers –Storage –Network –Data center floor space, power, cooling
  • 30. II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources 2. IT Operations personnel 3. Application Development 4. Business
  • 31. II. Data constraint price is huge : 2. IT Operations • People – DBAs – Sys Admin – Storage Admin – Backup Admin – Network Admin • Hours: 1000s just for DBAs • $100s of millions for data center modernizations
  • 32. II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources 2. IT Operations personnel 3. Application Development 4. Business
  • 33. II. Data constraint price is Huge : 3. App Dev • Inefficient QA: Higher costs of QA • QA Delays : Greater re-work of code • Sharing DB Environments : Bottlenecks • Using DB Subsets: More bugs in Prod • Slow Environment Builds: Delays “if you can't measure it you can’t manage it”
  • 34. II. Data Tax is Huge : 3. App Dev Slow Environment Builds Never enough environments
  • 35. Part II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources 2. IT Operations personnel 3. Application Development 4. Business
  • 36. II. Data constraint price is Huge : 4. Business Ability to capture revenue • Business Intelligence – Old data = less intelligence • Business Applications – Delays cause => Lost Revenue
  • 37. II. Data constraint price is Huge : 4. Business
  • 38. II. Data constraint price is Huge : 4. Business [bar chart: Storage, IT Ops, Dev, Revenue; axis in billion $, 0 to 30]
  • 39. II. Data constraint price is Huge • Four Areas data tax hits 1. IT Capital resources $ 2. IT Operations personnel $ 3. Application Development $$$ 4. Business $$$$$$$ Review
  • 40. In this presentation : • Data Constraint I. strains IT II. price is huge III. companies unaware • Solution • Use Cases
  • 41. Part III. Data Constraint companies unaware
  • 42. III. Data Constraint companies unaware DBA Developer
  • 43. III. Data Constraint companies unaware #1 Biggest Enemy: IT departments believe they have – the best processes – the greatest technology – and that this is just the way it is
  • 44. There are always new and better ways to do things
  • 45. III. Data Constraint companies unaware • Why do I need an iPhone? • Don’t we already do that? – SQL scripts: alter database begin backup; back up datafiles, redo, archive; alter database end backup – RMAN
  • 46. III. Data Constraint companies unaware • Ask Questions – Me: We provision environments in minutes for almost no extra storage. – Customer: We already do that. – Me: How long does it take a developer to get an environment after they ask? – Customer: 2-3 weeks. – Me: We do it in 2-3 minutes.
  • 47. III. Data Constraint companies unaware How to enlighten? Ask for metrics – How long does it take a developer to get a DB copy? • QA? – How old is data ? • BI and DW : ETL batch windows • QA and Dev : how often refreshed – How much storage used for copies? • How much DBA time?
  • 48. In this presentation : • Data Constraint I. strains IT II. price is huge III. companies unaware • Solution • Use Cases
  • 49. Clone 1 Clone 2 Clone 3: 99% of blocks are identical
  • 50. Solution
  • 51. Clone 1 Clone 2 Clone 3 Thin Clone
  • 52. Technology Core : file system snapshots • VMware Linked Clones – Not supported for Oracle • EMC – 16 snapshots – Write performance impact • NetApp – 255 snapshots • ZFS – Unlimited snapshots
  • 53. An engine is not a car
  • 54. Three Core Parts [diagram labels: Production instance and file system; Storage; Development instance; parts numbered 1, 2, 3] • Copy, sync: snapshots, time flow, purge • Clone (snapshot): compress, share, cache, storage • Mount, recover, rename: self service, roles & security, rollback & refresh, branch & tag
  • 55. Database Virtualization
  • 56. Three Physical Copies Three Virtual Copies Data Virtualization Appliance
  • 57. Install Delphix on x86 hardware Intel hardware
  • 58. Allocate any storage to Delphix • Allocate storage of any type • Pure Storage + Delphix: better performance for 1/10 the cost
  • 59. One-time backup of source database [diagram labels: Production database, instances, file systems; Supports; Upcoming; Application Stack; Data]
  • 60. DxFS (Delphix) compresses data • Data is typically compressed to 1/3 of its size [diagram labels: Production database, instances, file system]
  • 61. Incremental-forever change collection • Changes collected incrementally forever • Old data purged [diagram labels: Production database, instances, file systems; Time Window]
  • 62. Virtual DB (Jonathan Lewis © 2013): Snapshot 1, full backup once only at link time. Blocks: a b c d e f g h i. We start with a full backup, analogous to a level 0 RMAN backup. It includes the archived redo log files needed for recovery. Run in archivelog mode.
  • 63. Virtual DB: Snapshot 2 (from SCN). Changed blocks: b' c'. Existing blocks: a b c d e f g h i. The "backup from SCN" is analogous to a level 1 incremental backup (which includes the relevant archived redo logs). It is sensible to enable BCT (block change tracking). Delphix executes standard RMAN scripts.
  • 64. Virtual DB: Apply Snapshot 2. Existing blocks: a b c d e f g h i. Changed blocks: b' c'. The Delphix appliance unpacks the RMAN backup and "overwrites" the initial backup with the changed blocks, but DxFS makes new copies of the blocks.
  • 65. Virtual DB: Drop Snapshot 1. Remaining blocks: a b' c' d e f g h i. The call to RMAN leaves us with a new level 0 backup, waiting for recovery. But we can pick the snapshot root block. We have EVERY level 0 backup.
  • 66. Virtual DB: Creating a vDB. Blocks: a b' c' d e f g h i. The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward). [diagram: My vDB (filesystem) and Your vDB (filesystem), each seeing a b' c' d e f g h i]
  • 67. Virtual DB: Creating a vDB. Blocks: a b' c' d e f g h i. The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward). [diagram: My vDB (filesystem) and Your vDB (filesystem) share a b' c' d e f g h i; one vDB has its own changed block i’]
  • 68. Before Delphix: “triple data tax” [diagram: Production; Dev, QA, UAT; Reporting; Backup; each with its own instance, file system, and database]
  • 69. With Delphix [diagram labels: Production, Dev & QA, Reporting, Backup; instances and databases; one file system]
  • 70. In this presentation : • Problem in the Industry • Solution • Use Cases
  • 71. Use Cases 1. Development 2. QA 3. Recovery 4. Business Intelligence 5. Modernization
  • 72. Use Cases 1. Development 2. QA 3. Recovery 4. Business Intelligence 5. Modernization
  • 73. Development • Parallelized Environments • Full size environments • Self Service Development
  • 74. Development without Agile Data: bottlenecks Frustration Waiting Old Unrepresentative Data
  • 75. Development without Agile Data: subsets of DB
  • 76. Development without Agile Data: bugs
  • 77. Development with Agile Data: Parallelize Environments gif by Steve Karam
  • 78. Development with Agile Data: Full size copies
  • 79. Development without Agile Data: slow env build times • Developer: asks for DB, gets access • Manager: approves • DBA: requests system, sets up DB • System Admin: requests storage, sets up machine • Storage Admin: allocates storage (takes snapshot) • 3-6 months to deliver data
  • 80. Development without Agile Data: slow env build times • Why are handoffs so expensive? [diagram values: 1 hour, 1 day, 9 days]
  • 81. Development without Agile Data: slow env build times (http://martinfowler.com/bliki/NoDBA.html)
  • 82. Development with Agile Data: Self Service
  • 83. Use Cases 1. Development 2. QA 3. Recovery 4. Business Intelligence 5. Modernization
  • 84. QA • Fast • Parallel • Rollback • A/B testing
  • 85. QA without Agile Data: long build times • 96% of QA time was building the environment • $.04/$1.00 actual testing vs. setup [diagram: Build Time vs. QA Test]
  • 86. QA without Agile Data: slow QA [timeline: Sprint 1, Sprint 2, Sprint 3, with Build QA Env and QA phases; bug in Code X] [chart: Cost to Correct vs. Delay in Fixing the Bug; source: Software Engineering Economics – Barry Boehm (1981)]
  • 87. QA with Agile Data: fast environments with branching [diagram: Source, Dev, and QA instances; QA branched from Dev]
  • 88. QA with Agile Data: fast environments with branching • 1% of QA time was building the environment • $.99/$1.00 actual testing vs. setup [diagram: Build Time vs. QA Test]
  • 89. QA with Agile Data: bugs found fast [timelines: Sprint 1, Sprint 2, Sprint 3, with Build QA Env and QA phases; bug in Code X]
  • 90. QA with Agile Data: parallel environments [diagram: Source and four instances]
  • 91. QA with Agile Data: rewind for patch and QA testing [diagram: Prod and Development instances; Time Window]
  • 92. QA with Agile Data: A/B testing [diagram: three instances; Index 1, Index 2]
  • 93. Use Cases 1. Development 2. QA 3. Quality 4. Business Intelligence 5. Modernization
  • 94. Quality • Prod & Dev Backups • Surgical recovery • Recovery of Production • Recovery of Development • Bug Forensics
  • 95. Quality : 50 days of backup in size of production
  • 96. Quality: surgical recovery [diagram: Source and Development instances; Time Window with “Before drop” and “Drop” points]
  • 97. Quality: recovery of development [diagram: Source; Dev1 VDB; Dev2 VDB (branched); Time Windows]
  • 98. Quality: recovery of production [diagram: Source and VDB instances; Time Window; Corruption]
  • 99. Quality: forensics, investigate production bugs [diagram: Development instance; Time Window; Bug; Yesterday]
  • 100. Use Cases 1. Development 2. QA 3. Quality 4. Business Intelligence 5. Modernization
  • 101. Business Intelligence • 24x7 Batches • Low Bandwidth • Temporal Data • Confidence Testing
  • 102. Business Intelligence: ETL and Refresh Windows [timeline: 1pm, 10pm, 8am, noon]
  • 103. Business Intelligence: ETL and DW refreshes taking longer [timeline: 1pm, 10pm, 8am, noon; years 2011 to 2015]
  • 104. Business Intelligence: ETL and Refresh Windows [timeline by year, 2011 to 2015: 1pm, 10pm, 8am, noon, 10pm, 8am, noon, 9pm, 6am, 8am, 10pm]
  • 105. Business Intelligence: ETL and DW Refreshes [diagram: Prod instance; DW & BI instance] • Data Guard – requires full refresh if used • Active Data Guard – read only, most reports don’t work
  • 106. Business Intelligence: Fast Refreshes • Collect only changes • Refresh in minutes [diagram: Prod instance; BI and DW instances; ETL 24x7]
  • 107. Business Intelligence: Temporal Data
  • 108. Business Intelligence a) 24x7 Batches & Refreshes b) Temporal queries c) Confidence testing
  • 109. Use Cases 1. Development 2. QA 3. Quality 4. Business Intelligence 5. Modernization
  • 110. Modernization 1. Federated 2. Consolidation 3. Migration 4. Auditing
  • 111. Modernization: Federated Instance Instance Instance Instance Source1 Source2 Source1
  • 112. Modernization: Federated
  • 113. Modernization: Federated • “I looked like a hero” (Tony Young, CIO, Informatica)
  • 114. Data movement required for 1 source DB and 4 clones • Without Delphix (C = clone): 5x source data copy • With Delphix (V = virtual DB): < 1x source data copy
  • 115. Modernization: Consolidation Without Delphix With Delphix
  • 116. Modernization: Auditing & Version Control • Data Control = Source Control for the Database [diagram: Production Time Flow with Dev, QA, UAT environments for versions 2.6, 2.7, 2.8] • CIO, Insurance: 600 applications • CIO, Investment Banking: 180 applications • CIO, South America: 65 applications
  • 117. Use Cases 1. Development • Parallelized Environments • Full size environments • Self Service 2. QA • Fast • Parallel • Rollback • A/B testing 3. Recovery • Prod & Dev Backups • Surgical recovery • Recovery of Production • Recovery of Development • Bug Forensics 4. Business Intelligence • 24x7 Batches • Low Bandwidth • Temporal Data • Confidence Testing 5. Modernization • Federated • Consolidation • Migration • Auditing
  • 118. Use Case Summary 1. Development 2. QA 3. Quality 4. Business Intelligence 5. Modernization
  • 119. How expensive is the Data Constraint? Before and after Delphix with a Fortune 500 company: dev throughput increased by 2x
  • 120. How expensive is the Data Constraint? • 10x faster financial close – 21 days down to 2 • 9x faster BI refreshes – 3 weeks per refresh to 3x a week • 2x faster projects • 20% fewer bugs
  • 121. Agile Data Quotes • “Allowed us to shrink our project schedule from 12 months to 6 months.” – BA Scott, NYL VP App Dev • “It used to take 50-some-odd days to develop an insurance product, … Now we can get a product to the customer in about 23 days.” – Presbyterian Health • “Can't imagine working without it” – Ramesh Shrinivasan, CA Department of General Services
  • 122. Summary • Problem: Data is the constraint • Solution: Agile data is small & fast • Results: Deliver projects – Half the Time – Higher Quality – Increase Revenue Kyle@delphix.com kylehailey.com slideshare.net/khailey
  • 123. Oracle 12c
  • 124. 80MB buffer cache ?
  • 125. 200GB Cache
  • 126. [charts: Txns/min (axis to 5000) and latency (axis to 300 ms) vs. users: 1, 5, 10, 20, 30, 60, 100, 200]
  • 127. [charts: Txns/min (axis to 8000) and latency (axis to 600 ms) vs. users: 1, 5, 10, 20, 30, 60, 100, 200]
  • 128. Five 200GB database copies are cached with: • $1,000,000: 1TB cache on SAN • $6,000: 200GB shared cache on Delphix
  • 129. Use Cases 1. Development • Parallelized Environments • Full size environments • Self Service 2. QA • Fast • Parallel • Rollback • A/B testing 3. Recovery • Prod & Dev Backups • Surgical recovery • Recovery of Production • Recovery of Development • Bug Forensics 4. Business Intelligence • 24x7 Batches • Low Bandwidth • Temporal Data • Confidence Testing 5. Modernization • Federated • Consolidation • Migration • Auditing
  • 130. END
  • 131. Source Full Copy Source backup from SCN 1
  • 132. Snapshot 1 Snapshot 2
  • 133. Snapshot 1 Snapshot 2 Backup from SCN
  • 134. Snapshot 1 Snapshot 2 Snapshot 3
  • 135. Drop Snapshot Snapshot 1 Snapshot 2 Snapshot 3 Snapshot 2 Snapshot 3 Drop Snapshot 1
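
The snapshot sequence in the Virtual DB slides (full copy, backup from SCN, drop snapshot, create vDB) can be illustrated with a toy copy-on-write block store. This is a minimal Python sketch under stated assumptions, not a description of how DxFS or Delphix is implemented: the BlockStore class, its method names, and the reference-counting scheme are all hypothetical, and each "block" is just a string.

```python
class BlockStore:
    """Toy copy-on-write store (hypothetical; not the DxFS implementation).

    A snapshot or vDB is only a map from block name to a shared block id,
    so unchanged blocks are stored once no matter how many copies exist.
    """

    def __init__(self):
        self.blocks = {}     # block id -> contents
        self.refcount = {}   # block id -> number of snapshots/vDBs using it
        self.snapshots = {}  # snapshot or vDB name -> {block name: block id}
        self._next_id = 0

    def _put(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.refcount[block_id] = 0
        return block_id

    def _release(self, block_id):
        self.refcount[block_id] -= 1
        if self.refcount[block_id] == 0:  # nobody references it: free it
            del self.blocks[block_id]
            del self.refcount[block_id]

    def full_backup(self, name, data):
        """Snapshot 1: every block is stored (like a level 0 backup)."""
        self.incremental(name, None, data)

    def incremental(self, name, parent, changes):
        """New snapshot: share the parent's blocks, store only the changes."""
        mapping = dict(self.snapshots[parent]) if parent else {}
        for key, value in changes.items():
            mapping[key] = self._put(value)
        for block_id in mapping.values():
            self.refcount[block_id] += 1
        self.snapshots[name] = mapping

    def clone(self, name, parent):
        """Thin clone (vDB): a new map over the same blocks, no data copied."""
        self.incremental(name, parent, {})

    def write(self, name, key, value):
        """A write inside a vDB allocates a private block for that vDB only."""
        mapping = self.snapshots[name]
        old = mapping.get(key)
        new = self._put(value)
        self.refcount[new] += 1
        mapping[key] = new
        if old is not None:
            self._release(old)

    def drop(self, name):
        """Dropping a snapshot frees only blocks no other snapshot uses."""
        for block_id in self.snapshots.pop(name).values():
            self._release(block_id)

    def read(self, name):
        return {k: self.blocks[b] for k, b in self.snapshots[name].items()}


store = BlockStore()
store.full_backup("snap1", {k: k for k in "abcdefghi"})      # 9 blocks stored
store.incremental("snap2", "snap1", {"b": "b'", "c": "c'"})  # 11 blocks stored
store.drop("snap1")                  # original b and c are freed: 9 blocks
store.clone("my_vdb", "snap2")       # still 9 blocks
store.clone("your_vdb", "snap2")     # still 9 blocks
store.write("your_vdb", "i", "i'")   # one private block: 10 blocks
print(len(store.blocks), store.read("my_vdb")["i"], store.read("your_vdb")["i"])
```

In this sketch, two full-size vDBs plus the retained snapshot occupy 10 blocks rather than 27, which is the thin-clone effect the slides describe; real systems track blocks at the file system level rather than in Python dictionaries.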