
BGOUG "Agile Data: revolutionizing database cloning"


Agile development with data virtualization to support DevOps and continuous deployment

Published in: Software


  1. 1. Agile Data: Revolutionizing database cloning. Kyle Hailey (http://kylehailey.com, kyle@delphix.com), Tim Gorman (tim@delphix.com)
  2. 2. Are you too busy to Innovate? Inertia A new way : Welcome Agile & DevOps!
  3. 3. Waterfall, Agile, DevOps • Waterfall Design Code test Deploy • Agile Design Code test Code test Code test Code test Deploy • Agile with Continuous Deploy Design Continuous Deploy requires DevOps
  4. 4. What is DevOps = tools + culture • Culture : – Bridging silos between Dev & Ops – Empathy avoid blame – Collaboration • Tools : – Automation VMs, Puppet, Jenkins – Self-service – Measurement
  5. 5. Note: DevOps > Tools + Culture DevOps Goal = optimizing flow from Dev to Ops to Pro Don’t copy steps. Copy the goal. Goal = company’s bottom line
  6. 6. Missed ! Goal Agile & CI vs Waterfall Agile & CI Achieved !
  7. 7. bugs time Missed ! Goal Agile & CI Achieved ! Bugs
  8. 8. profit time Missed ! Goal Agile & CI Achieved ! Profit
  9. 9. Missed ! Goal Cost per Deployment Agile & CI Achieved ! Cost Per Deployment time
  10. 10. DevOps and Data : Impossible? Waterfall Agile & DevOps Big Software Release Small Continuous Releases DevOps Goal= optimizing flow from Dev to Ops to Pro
  11. 11. The Goal : Theory of Constraints Improvement not made at the constraint is an illusion factory floor optimization
  12. 12. Factory floor
  13. 13. Factory floor constraint Not a relay race
  14. 14. Tune before constraint constraint Tuning here Stock piling
  15. 15. Tune after constraint constraint Tuning here Starvation
  16. 16. Factory floor : straight forward constraint Goal: find constraint optimize it
  17. 17. Theory of Constraints work for IT ? • Goals Clarify • Metrics Define • Constraints Identify • Priorities Set • Iterations Fast • CI • Cloud • Agile • Kanban • Kata “IT is the factory floor of this century”
  18. 18. The Phoenix Project What is the constraint in IT ?
  19. 19. What are the top 5 constraints in IT? 1. Dev environments setup 2. QA setup 3. Code Architecture 4. Development 5. Product management “One of the most powerful things that organizations can do is to enable development and testing to get environment they need when they need it” - Gene Kim
  20. 20. Data is the constraint CIO Magazine Survey: 60% Projects Over Schedule 85% delayed waiting for data Data is the Constraint only getting worse Gartner: Data Doomsday, by 2017 1/3rd IT in crisis
  21. 21. In this presentation : • Data Constraint • Solution • Use Cases
  22. 22. • Data Constraint • Solution • Use Cases
  23. 23. Typical Architecture Production Instance Database File system
  24. 24. Typical Architecture Production Instance Backup Database File system Database File system
  25. 25. Typical Architecture Production Instance Reporting Backup Database File system Instance Database File system Database File system
  26. 26. Typical Architecture Production Instance Database File system Triple Tax Dev, QA, UAT Reporting Backup Instance Instance Instance Instance Database Database File system Database File system File system Database File system Database File system
  27. 27. Typical Architecture Production Instance Database File system Instance Instance Instance Instance Database Database File system Database File system File system Database File system Database File system
  28. 28. moving data is hard – Storage & Systems – Personnel – Time
  29. 29. copies take up space –Servers –Storage –Network –Data center floor space, power, cooling
  30. 30. Never enough environments
  31. 31. Your Project Available Resources
  32. 32. Copies require People & Time • People 1000s hours per year just for DBAs – DBAs – SYS Admin – Storage Admin – Backup Admin – Network Admin • $100s Millions for data center modernizations
  33. 33. Data floods infrastructure 92% of the cost of business, in the financial services business, is “data” (www.wsta.org/resources/industry-articles) Most companies average 5% IT spending, ½ on “data” (http://uclue.com/?xq=1133)
  34. 34. companies unaware
  35. 35. companies unaware Boss, Storage Admin, DBA Developer or Analyst
  36. 36. companies unaware Metrics – Time – Old Data – Storage Other – Analysts –Audits
  37. 37. What Problems does Data Constraint Cause 1. Bottlenecks 2. Waiting for environments 3. Waiting to check in code 4. Production Bugs 5. Expensive Slow QA
  38. 38. Development : waiting
  39. 39. Development : bottlenecks Frustration Waiting
  40. 40. Development : Bugs Old Unrepresentative Data
  41. 41. Development : subsets False Negatives False Positives Bugs in Production
  42. 42. Production Wall
  43. 43. Development : silos
  44. 44. QA : Long Build times [Chart: cost to correct a bug versus delay in fixing the bug] Software Engineering Economics – Barry Boehm (1981)
  45. 45. DevOps : Impossible with databases? • Need lots of copies Design • Each copy is like
  46. 46. In this presentation : • Data Constraint • Solution • Use Cases
  47. 47. 99% of blocks are identical Development QA UAT
  48. 48. Solution
  49. 49. Thin Clone Development QA UAT
  50. 50. Technology Core : file system snapshots • EMC – 16 snapshots on Symmetrix – Write performance impact – No snapshots of snapshots • Netapp – 255 snapshots • ZFS – Compression – Unlimited snapshots – Snapshots of Snapshots • DxFS – “” – Storage agnostic – Shared cache in memory Also check out new SSD storage such as: Pure Storage, EMC XtremIO
  51. 51. Fuel not equal car Challenges 1. Technical 2. Bureaucracy
  52. 52. Bureaucracy Developer Asks for DB Get Access Manager approves DBA Request system Setup DB System Admin Request storage Setup machine Storage Admin Allocate storage (take snapshot)
  53. 53. 1hour 9 days 1 day Why are hand offs so expensive? Bureaucracy
  54. 54. Technical Challenge Production Filer Database LUNs Target A Target B Target C snapshot clones Instance Instance Instance Instance Instance Source
  55. 55. Technical Challenge Development Filer Production Filer clones Database LUNs snapshot Instance Target A Instance Target B Instance Target C Instance Instance
  56. 56. Technical Challenge 1 2 3 Production Copy Time Flow Purge Storage Development File System Instance Clone (snapshot) Compress Share Cache Provision Mount, recover, rename Self Service, Roles & Security Instance
  57. 57. Technical Challenge Production Storage Development 1 2 3
  58. 58. How to get a Data Virtualization? – ZFS –EMC 2 + SRDF 1 – Netapp 2 + SMO 1 – Oracle EM 12c + data guard + Netapp /ZFS – Actifio - hardware – Delphix - software 3 1 2 Source sync Deploy automation Storage snapshots 1 2 3 2 1 2 3 1 2 3
  59. 59. Goal : virtualize, govern, deliver • Masking: Masking • Security: Chain of custody • Self Service: Logins • Developer: Versioning , branching • Audit: Live Archive Data Supply Chain Data Virtualization Thin Cloning Snap Shots 1 2 3 2 3 2
  60. 60. Intel hardware DB2 Data File Systems Binaries Install Delphix on x86 hardware
  61. 61. Allocate Any Storage to Delphix Allocate Storage Any type Pure Storage + Delphix Better Performance for 1/10 the cost
  62. 62. One time backup of source database Production Instance Database File system
  63. 63. DxFS (Delphix) Compress Data Production Instance Database Data is compressed typically 1/3 size File system
  64. 64. Incremental forever change collection Production Instance Database File system Changes • Collected incrementally forever • Old data purged Time Flow
  65. 65. Snapshot 1 – full backup once only at link time a b c d e f g h i We start with a full backup - analogous to a level 0 rman backup. Includes the archived redo log files needed for recovery. Run in archivelog mode. Jonathan Lewis © 2013 Virtual DB
  66. 66. Snapshot 2 (from SCN) a b c d e f g h i b' c' The "backup from SCN" is analogous to a level 1 incremental backup (which includes the relevant archived redo logs). Sensible to enable BCT. Jonathan Lewis © 2013 Delphix executes standard rman scripts
  67. 67. Apply Snapshot 2 a b b' c c' d e f g h i The Delphix appliance unpacks the rman backup and "overwrites" the initial backup with the changed blocks - but DxFS makes new copies of the blocks Jonathan Lewis © 2013
  68. 68. Drop Snapshot 1 a b' c' d e f g h i The call to rman leaves us with a new level 0 backup, waiting for recovery. But we can pick the snapshot root block. We have EVERY level 0 backup Jonathan Lewis © 2013
  69. 69. Creating a vDB a b' c' d e f g h i The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward) Jonathan Lewis © 2013 My vDB (filesystem) Your vDB (filesystem)
  70. 70. Creating a vDB a b' c' d e f g h i The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward) Jonathan Lewis © 2013 My vDB (filesystem) Your vDB (filesystem) a b' c' d e f g h i i'
  71. 71. Database Virtualization
  72. 72. Three Physical Copies Three Virtual Copies Data Virtualization Appliance
  73. 73. Before Virtual Data Production Dev, QA, UAT Instance Reporting Backup Database File system Instance Instance Instance Instance Database Database File system Database File system File system Database File system Database File system “triple data tax”
  74. 74. With Virtual Data Production Instance Dev & QA Instance Instance Instance Database Reporting Instance Database Backup Database Instance Instance Instance Database Database Database File system Data Virtualization Appliance
  75. 75. • Problem in the Industry • Solution • Use Cases
  76. 76. Use Cases 1. Development and QA 2. Production Support 3. Business
  77. 77. Use Cases 1. Development and QA 2. Production Support 3. Business
  78. 78. Development: Virtual Data • Unlimited • Full size • Self Service Development
  79. 79. Virtual Data: Easy Instance Instance Instance Instance Source DVA
  80. 80. Development Virtual Data: Parallelize gif by Steve Karam
  81. 81. Development Virtual Data: Full size
  82. 82. Development Virtual Data: Self Service
  83. 83. QA : Virtual Data • Fast • Parallel • Rollback • A/B testing
  84. 84. Dev QA QA Virtual Data : Fast Prod Instance DVA • Low Resource • Find bugs Fast Production Time Flow
  85. 85. QA with Virtual Data: Rewind Instance QA Prod Production Time Flow
  86. 86. QA with Virtual Data: A/B Instance Instance Instance Index 1 Index 2 Production Time Flow
  87. 87. Data Version Control Dev QA 2.1 Dev QA 2.2 DVA Production Time Flow 2.1 2.2 Prod Instance 12/3/2014
  88. 88. Use Cases 1. Development and QA 2. Production Support 3. Business
  89. 89. • Backups • Recovery • Forensics • Migration • Consolidation Recovery
  90. 90. 9TB database, 1TB change per day, 30 days of backups: storage requirements [Chart: storage required over weeks 1 to 4 for original, Oracle, and Delphix]
  91. 91. Recovery Source Instance Recover VDB Instance Drop DVA Production Time Flow
  92. 92. Forensics Instance Development DVA Source Production Time Flow
  93. 93. Development (the new production) Instance Development DVA Source Development Prod & VDB Time Flow X
  94. 94. Migration
  95. 95. Consolidation
  96. 96. Use Cases 1. Development and QA 2. Production Support 3. Business Intelligence
  97. 97. Business Intelligence • ETL • Temporal • Confidence Testing • Federated Databases • Audits
  98. 98. Business Intelligence: ETL and DW Refreshes Prod Instance DW & BI Instance
  99. 99. Virtual Data: Fast Refreshes • Collect only Changes • Refresh in minutes Prod Instance BI and DW ETL 24x7 DVA Production Time Flow
  100. 100. Temporal Data
  101. 101. Confidence testing
  102. 102. Modernization: Federated Source1 Instance Source2 Instance DVA Production Time Flow 1 Production Time Flow 2
  103. 103. Modernization: Federated
  104. 104. Modernization: Federated “I looked like a hero” Tony Young, CIO Informatica
  105. 105. Live Archive Production Time Flow Audit Prod Instance DVA
  106. 106. Use Case Summary 1. Development & QA 2. Production Support 3. Business
  107. 107. How expensive is the Data Constraint? DVA at Fortune 500 : Dev throughput increase by 2x
  108. 108. How expensive is the Data Constraint? Faster • Financial Close • BI refreshes • Surgical recovery • Projects
  109. 109. Virtual Data Quotes • Projects “12 months to 6 months.” – New York Life • Insurance product “about 50 days ... to about 23 days” – Presbyterian Health • “Can't imagine working without it” – State of California
  110. 110. Summary • Problem: Data is the constraint • Solution: Virtualize Data • Results: • Half the time for projects • Higher quality • Increase revenue
  111. 111. Thank you! • Kyle Hailey | Oracle ACE and Technical Evangelist, Delphix – Kyle@delphix.com – kylehailey.com – slideshare.net/khailey
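
Slides 65 to 70 describe how a single full backup, later incremental changes, and virtual databases (vDBs) can all share unchanged blocks. The following is a minimal Python sketch of that copy-on-write idea, not the actual DxFS or Delphix implementation. The class `ThinCloneStore` and its method names are hypothetical, introduced only for illustration; the block labels (a to i, b', c', i') follow the slides.

```python
class ThinCloneStore:
    """Toy copy-on-write block store (hypothetical, for illustration only).

    Physical blocks are written once and never overwritten. A snapshot or
    vDB is just a list of block ids, so unchanged blocks are shared.
    """

    def __init__(self):
        self.blocks = {}   # block id -> content (the physical storage)
        self.next_id = 0

    def _put(self, content):
        # Always allocate a new physical block; existing blocks stay intact.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = content
        return block_id

    def full_backup(self, contents):
        """Store every block once (like a level 0 backup); return a block map."""
        return [self._put(c) for c in contents]

    def apply_changes(self, block_map, changes):
        """Return a new block map; only changed positions get new blocks."""
        new_map = list(block_map)
        for position, content in changes.items():
            new_map[position] = self._put(content)
        return new_map

    def read(self, block_map):
        """Resolve a block map to the contents it points at."""
        return [self.blocks[b] for b in block_map]


store = ThinCloneStore()

# Snapshot 1: the one-time full backup of blocks a..i.
snap1 = store.full_backup(list("abcdefghi"))

# Snapshot 2: an incremental that changes b and c; snapshot 1 is untouched.
snap2 = store.apply_changes(snap1, {1: "b'", 2: "c'"})

# Two vDBs provisioned from snapshot 2. Cloning copies only the block map.
my_vdb = list(snap2)
your_vdb = store.apply_changes(snap2, {8: "i'"})   # one private write

print(store.read(snap1))
print(store.read(your_vdb))
print(len(store.blocks))   # 9 original + 2 incremental + 1 private = 12
```

In this sketch, two snapshots and two vDBs present four full-size views of the data while storing 12 physical blocks rather than 36, which is the sharing that the "99% of blocks are identical" slide relies on.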
