Data is the Constraint
 

  • Interview: Delphix blew me away. As a DBA I had to spend 50% of my time making copies. After joining Delphix, I banged on Delphix for 2 years. It works. Blog entry: reinforce ideas we've already seen from a different perspective. I work for a company called Delphix. We write software that enables Oracle and SQL Server customers to copy their databases in 2 minutes with almost no storage overhead. We accomplish that by taking one initial copy and sharing the duplicate blocks across all the clones.
  • BLOG
  • Businesses want data now and don't understand DBAs. Databases get bigger and harder to copy. Developers want more copies. Reporting wants more copies. Everyone has storage constraints. If you can't satisfy the business demands, your process is broken.
  • Moving the data IS the big gorilla. This gorilla of a data tax is hitting your bottom line hard.
  • There is probably nothing more onerous for a DBA than to hear "can you get me a copy of the production database for my project." RMAN vs Delphix: I was running out of space for the RMAN live demo! When moving data is too hard, the data in non-production systems such as reporting, development, or QA becomes older, and the older the data, the less actionable intelligence your BI or analytics can give you.
  • You are paying a data tax, moving and copying data over and over again: business intelligence and analytics; project data for development, testing, QA, integration, UAT, and training; data protection backups for recovery.
  • We know from our experience that there are some $1B+ data center consolidation price tags. Taking even 30% of the cost out of that, and cutting the timeline, is a strong and powerful way to improve margin. What about really big problems like consolidating data center real estate, or moving to the cloud? If you can non-disruptively collect the data, and easily and repeatedly present it in the target data center, you take huge chunks out of these migration timelines. Moreover, with data being so easy to move on demand, you neutralize the hordes of users who insist that there isn't enough time to do this, or it's too hard, or too risky. Annual time spent copying databases can measure in the 1000s of hours just for DBAs, not including all the other personnel required to supply the necessary infrastructure.
  • Data gets old because it is not refreshed. Instead of running 5 tests in two weeks (because it takes me 2 days to roll back after each of my 1-hour tests) and paying the cost of bugs slipping into production, what if I could run 15 tests in that same two weeks and have no bugs at all in production?
  • And they told us that they spend 96% of their QA cycle time building the QA environment and only 4% actually running the QA suite. This happens for every QA suite, meaning that for every dollar spent on QA there was only 4 cents of actual QA value; 96% of the cost is infrastructure time and overhead.
  • Because of the time required to set up QA environments, the actual QA test suites lag behind the end of a sprint or code freeze, meaning that the amount of time between the introduction of a bug in the code and the discovery of that bug increases. And the more time that goes by after the introduction of a bug into the code, the more dependent code is written on top of the bug, increasing the amount of code rework required after the bug is finally found. In his seminal book "Software Engineering Economics," which some of you may be familiar with, Barry Boehm introduced the computer world to the idea that the longer one delays fixing a bug in the application design lifecycle, the more expensive it is to fix that bug, and these costs rise exponentially the later the bug is addressed in the cycle.
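Boehm's cost curve can be illustrated with a toy model. The doubling-per-stage growth rate and the stage names below are illustrative assumptions, not Boehm's measured figures:

```python
# Toy model of Boehm's observation: the cost to fix a bug grows roughly
# exponentially with how late in the lifecycle it is found. The doubling
# rate and the stage names are illustrative assumptions, not measured data.
def cost_to_fix(stages_late, base_cost=1.0, growth=2.0):
    return base_cost * growth ** stages_late

for stage, name in enumerate(["coding", "unit test", "QA", "UAT", "production"]):
    print(f"{name:12s} {cost_to_fix(stage):5.1f}x")
# Under this model a bug caught in production costs 16x what it would
# have cost to fix during coding.
```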
  • Not sure if you've run into this, but I have personally experienced the following. When I was talking to one group at Ebay, that development group shared a single copy of the production database between the developers on the team. What this sharing of a single copy of production meant is that whenever a developer wanted to modify that database, they had to submit their changes to code review, and that code review took 1 to 2 weeks. I don't know about you, but that kind of delay would stifle my motivation, and I have direct experience with the kind of disgruntlement it can cause. When I was last a DBA, all schema changes went through me. It took me about half a day to process schema changes. That delay was too much, so the developers unilaterally decided to go to an EAV, or entity-attribute-value, schema, which meant that developers could add new fields without consulting me and without stepping on each other's feet. It also meant that the SQL code was unreadable and performance was atrocious. Besides creating developer frustration, sharing a database also makes refreshing the data difficult, as it takes a while to refresh the full copy, and it takes even longer to coordinate a time when everyone stops using the copy to make the refresh. All this means the copy rarely gets refreshed and the data gets old and unreliable.
  • To circumvent the problems of sharing a single copy of production, many shops we talk to create subsets. One company we talked to spends 50% of their time copying databases; they have to subset because there is not enough storage, and the subsetting process constantly needs fixing and modification. Now what happens when developers use subsets?
  • Stubhub (ebay) estimates that 20% of their production bugs arise from testing on subsets instead of full database copies.
  • Due to the constraints of building clone copy database environments, one ends up in the "culture of no," where developers stop asking for a copy of a production database because the answer is "no." If the developers need to debug an anomaly seen on production, or if they need to write a custom module which requires a copy of production, they know not to even ask and just give up.
  • If Walmart in New York sold Lego Batman like hotcakes the morning it came out, wouldn't it be good to know that at Walmart California? Week-old data happens when refreshes are too disruptive and limited to weekends.
  • You might be familiar with this cycle that we've seen in the industry: IT department budgets are being constrained. When IT budgets are constrained, one of the first targets is reducing storage. As storage budgets are reduced, the ability to provision database copies and development environments goes down. As development environments become constrained, projects start to hit delays. As projects are delayed, the applications that the business depends on to generate the revenue that pays for IT budgets are delayed, which reduces revenue as the business cannot access new applications, which in turn puts more pressure on the IT budget. It becomes a vicious circle.
  • From our experience before and after with Fortune 500 companies
  • How big is the data tax? One way we can measure it is by looking at the improvements in project timelines at companies that have eliminated this data tax through implementing a data virtualization appliance (DVA) and creating an agile data platform (ADP). Agile data is data that is delivered to the exact spot it's needed, just in time, and with much less time, cost, and effort. By looking at productivity rates after implementing an ADP compared to before the ADP, we can get an idea of the price of the data tax without an ADP. IT experts building mission-critical systems for Fortune 500 companies have seen real project returns averaging 20-50% productivity increases after having implemented an ADP. That's a big data tax to pay without an ADP. The data tax is real, and once you understand how real it is, you realize how many of your key business decisions and strategies are affected by the agility of the data in your applications. "It took us 50 days to develop an insurance product … now we can get a product to the customer in 23 days with Delphix."
  • http://www.computerworld.com/s/article/9242959/The_Grill_Gino_Pokluda_gains_control_of_an_unwieldy_database_system?taxonomyId=19
  • Internet vs browser. Automate or die – the revolution will be automated: The worst enemy of companies today is thinking that they have the best processes that exist, that their IT organizations are using the latest and greatest technology, and that nothing better exists in the field. This mentality will be the undermining of many companies. http://www.kylehailey.com/automate-or-die-the-revolution-will-be-automated/ Data IS the constraint: Business skeptics are saying to themselves that data processes are just a rounding error in most of their project timelines, and that they are sure their IT has developed processes to fix that. That's the fundamental mistake. The very large and often hidden data tax lies in all the ways that we've optimized our software, data protection, and decision systems around the expectation that data is simply not agile. The belief that there is no agility problem is part of the problem. http://www.kylehailey.com/data-is-the-constraint/
  • Internet vs browser. Engine vs car.
  • Source Syncing*: initial backup once only; continual forever change collection; purging of old data. Storage (DxFS): shared blocks; snapshots, unlimited and storage agnostic; compression, typically to 1/3 size, compressed on block boundaries with basically undetectable overhead; shared data in memory (super caching)*. Self Service Automation: virtual database provisioning, rollback, refresh*, branching*, tagging*; mount files over NFS; init.ora, SID, database name, database unique name; security on who can see which source databases, how many clones they can make, and how much storage they can use.
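The "initial backup once, incremental forever, purge old data" cycle can be sketched as a toy timeline. The class and method names below are made up for illustration, not Delphix's actual API; a real system would merge purged changes forward into the baseline rather than simply discarding snapshots:

```python
from datetime import datetime, timedelta

# Toy sketch of "incremental forever" syncing with purging of old data.
# Names are illustrative assumptions, not Delphix's actual API.
class TimeFlow:
    def __init__(self, retention_days=14):
        self.retention = timedelta(days=retention_days)
        self.snapshots = []                           # (timestamp, changed_blocks)

    def initial_copy(self, now, all_blocks):
        self.snapshots.append((now, all_blocks))      # full copy happens once only

    def sync(self, now, changed_blocks):
        self.snapshots.append((now, changed_blocks))  # collect only changed blocks
        cutoff = now - self.retention                 # purge aged-out snapshots
        self.snapshots = [s for s in self.snapshots if s[0] >= cutoff]

t0 = datetime(2014, 1, 1)
tf = TimeFlow(retention_days=14)
tf.initial_copy(t0, set(range(100_000)))              # one initial full copy
for day in range(1, 30):
    tf.sync(t0 + timedelta(days=day), {day})          # small daily change sets

print(len(tf.snapshots))   # 15: only snapshots inside the 14-day window remain
```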
  • Don’t know anyone successfully using this , yet
  • If you look at what's really impeding flow from development to operations to the customer, it's typically IT operations. Operations can never deliver environments on demand; you have to wait months or quarters to get a test environment. When that happens, terrible things happen: people actually hoard environments. They invite people to their teams because they have a reputation for having a cluster of test environments, so people end up testing on environments that are years old, which doesn't actually achieve the goal. "One of the best predictors of DevOps performance is that IT Operations can make environments available on demand to Development and Test, so that they can build and test the application in an environment that is synchronized with Production." One of the most powerful things that organizations can do is to enable development and testing to get the environments they need when they need them. Eliyahu Goldratt
  • How long does it take a developer to get a copy of a database
  • Moving the data IS the big gorilla. Eliminating the data tax is crucial to the success of your company. And, if huge databases can be ready at target data centers in minutes, the rest of the excuses are flimsy. Agile data – virtualized data – uses a small footprint. A truly agile data platform can deliver full size datasets cheaper than subsets. A truly agile data platform can move the time or the location pointer on its data very rapidly, and can store any version that’s needed in a library at an unbelievably low cost. And, a truly agile data platform can massively improve app quality by making it reliable and dead simple to return to a common baseline for one or many databases in a very short amount of time. Applications delivered with agile data can afford a lot more full size virtual copies, eliminating wait time and extra work caused by sharing, as well as side effects. With the cost of data falling so dramatically, business can radically increase their utilization of existing hardware and storage, delivering much more rapidly without any additional cost. An agile data platform presents data so rapidly and reliably that the data becomes commoditized – and servers that sit idle because it would just take too long to rebuild can now switch roles on demand.
  • In the physical database world, 3 clones take up 3x the storage. In the virtual world, 3 clones take up 1/3 the storage, thanks to block sharing and compression.
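The block-sharing idea can be sketched with a toy content-addressed store: each unique block is stored once, and every clone just references blocks by hash. This is only the general copy-on-write concept, not Delphix's actual on-disk format:

```python
import hashlib

# Toy content-addressed block store: identical blocks are stored once and
# shared by every clone (the general idea behind virtual copies, not
# Delphix's actual implementation).
class BlockStore:
    def __init__(self):
        self.pool = {}                     # hash -> block bytes, stored once

    def write(self, block):
        h = hashlib.sha256(block).hexdigest()
        self.pool.setdefault(h, block)     # store only if the block is new
        return h

store = BlockStore()
source = [b"block-%d" % i for i in range(1000)]

# Three full-size "clones": each is just a list of block references.
clones = [[store.write(b) for b in source] for _ in range(3)]

# One clone diverges on 10 blocks (copy-on-write): only the changed
# blocks consume new space.
for i in range(10):
    clones[0][i] = store.write(b"changed-%d" % i)

# A physical model would store 3 * 1000 = 3000 blocks; here we store:
print(len(store.pool))   # 1010
```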
  • Software installs on any x86 hardware, uses any storage, supports Oracle 9.2-12c (Standard Edition, Enterprise Edition, single instance and RAC) on AIX, Sparc, HPUX, and LINUX, and supports SQL Server.
  • EMC, Netapp, Fujitsu, or newer flash storage like Violin, Pure Storage, Fusion IO, etc.
  • Delphix does a one time only copy of the source database onto Delphix
  • Physically independent but logically correlated. Cloning multiple source databases at the same time can be a daunting task.
  • One example with our customers is Informatica, who had a project to integrate 6 databases into one central database. The time of the project was estimated at 12 months, with much of that coming from trying to orchestrate getting copies of the 6 databases at the same point in time. Like herding cats.
  • Informatica had a 12-month project to integrate 6 databases. After installing Delphix they did it in 6 months. "I delivered this early. I generated more revenue. I freed up money and put it into innovation." They won an award with Ventana Research for this project.
  • Developers each get a copy: fast, fresh, full, frequent; self service. QA branches from Development; federated cloning made easy; forensics; A/B testing; recovery, logical and physical. Development: provision and refresh; full; fresh; frequent (many); source control for code, data control for the database; a data version per release version; federated cloning. QA: fork copies off to QA; fork copies back to Dev; instant replay (set up and run destructive tests); performance A/B; upgrade patching. Recovery: backup of 50 days in the size of 1 copy; continuous data protection (full, incr, incr, incr, full); restore; logical recovery on prod; logical recovery on Dev. Debugging: debug on a clone instead of prod; debug on data from the time of a problem; validate physical integrity (test for physical corruption).
  • Presbyterian went from 10-hour builds to 10-minute builds. Total investment in the test environment: $2M/year; 10 QA engineers; DBA and storage team dedicated to supporting testing; app, Oracle server, storage, backups. Restore load competes with backup jobs. Requirements: fast data refresh, rollback. Data delivery takes 480 out of a 500-minute test cycle (4% value): $0.04/$1.00 actual testing vs. setup.
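The 4% figure follows directly from the quoted cycle times; a quick check of the arithmetic:

```python
# Checking the slide's arithmetic: data delivery takes 480 of a 500-minute
# test cycle, leaving 20 minutes of actual testing.
cycle_minutes = 500
data_delivery_minutes = 480
testing_minutes = cycle_minutes - data_delivery_minutes
value_fraction = testing_minutes / cycle_minutes
print(testing_minutes, value_fraction)   # 20 minutes, 0.04 -> $0.04 of testing per $1.00
```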
  • For example Stubhub went from 5 copies of production in development to 120Giving each developer their own copy
  • Stubhub estimated a 20% reduction in bugs that made it to production
  • Multiple scripted dumps or RMAN backups are used to move data today. With application awareness, we only request changed blocks, dramatically reducing production loads by as much as 80%. We also eliminate the need for DBAs to manage custom scripts, which are expensive to maintain and support over time.
  • One Last Thing: http://www.dadbm.com/wp-content/uploads/2013/01/12c_pluggable_database_vs_separate_database.png
  • 250 PDBs x 200 GB = 50 TB. EMC sells 1 GB for ~$1,000; Dell sells 32 GB for ~$1,000. A terabyte of RAM on a Dell costs around $32,000; a terabyte of RAM on a VMAX 40k costs around $1,000,000.
  • http://www.emc.com/collateral/emcwsca/master-price-list.pdf. These prices appear on pages 897/898: a storage engine for the VMAX 40k with 256 GB RAM is around $393,000; a storage engine for the VMAX 40k with 48 GB RAM is around $200,000. So the cost of RAM here is $193,000 / 208 GB = $927 a gigabyte. That seems like a good deal for EMC, as Dell sells 32 GB RAM DIMMs for just over $1,000. So a terabyte of RAM on a Dell costs around $32,000, and a terabyte of RAM on a VMAX 40k costs around $1,000,000. 2) Most DBs have a buffer cache that is less than 0.5% (not 5%, 0.5%) of the datafile size.
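The per-gigabyte figure follows from the two quoted engine prices; a quick check of the arithmetic, using the slide's quoted list prices:

```python
# Reproducing the slide's RAM price arithmetic from the quoted EMC
# VMAX 40k list prices and the quoted Dell DIMM price.
engine_256gb = 393_000        # $ for a storage engine with 256 GB RAM
engine_48gb = 200_000         # $ for a storage engine with 48 GB RAM
per_gb = (engine_256gb - engine_48gb) / (256 - 48)
print(round(per_gb))          # 928: the slide truncates this to $927/GB
print(round(per_gb * 1024))   # ~$950,000 per TB (slide rounds to $1,000,000)
dell_per_tb = 1_000 / 32 * 1024
print(round(dell_per_tb))     # 32000: ~$32,000 per TB of Dell server RAM
```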

Data is the Constraint: Presentation Transcript

  • Data is the Constraint. Blog: kylehailey.com. 20 years working w/ Oracle; Oracle Ace; Oaktable; conferences; training days; tech journal articles; sales meetings
  • Main blogs • Personal blog : http://kylehailey.com • Linkedin www.linkedin.com/in/kylehailey • Twitter @kylehhailey • Facebook facebook.com/kyle.hailey.73
  • Top Blog Posts • http://www.kylehailey.com/delphix/ – What is Delphix – Benefits in Brief – High Performance Delphix – Jonathan Lewis Explains Delphix – Delphix vs Netapp and EM 12c
  • Data is the constraint Three points I. Data Tax strains infrastructure II. Data Tax price is huge III. Companies unaware of Data Tax
  • I. Data Taxes your business If you can’t satisfy the business demands then your process is broken.
  • II. Data Tax hits the bottom line hard
  • III. Companies don’t think there is a Data Tax problem
  • I. Data Tax – Moving data is hard – Triple tax – Data Floods infrastructure
  • I. Data Tax : moving data is hard – Storage & Systems : capital resources – Personnel : operation expenditure – Time : delayed projects
  • I. Data Tax : Typical Architecture (diagram: Production, Dev/QA/UAT, Reporting, and Backup environments, each with their own instances, databases, and file systems)
  • I. Data Tax: triple data tax Sandbox Development QA UAT Production Business Intelligence Analytics Reporting Tape Backup Recovery Forensics
  • I. Data Tax : Copying data floods infrastructure
  • I. Data Tax : Data Tax floods company infrastructure 92% of the cost of business — the financial services business — is “data” www.wsta.org/resources/industry-articles Most companies have 2-9% IT spending http://uclue.com/?xq=1133 Data management is the largest portion of IT expense
  • Part II. Data Tax is Huge
  • Part II. Data Tax is Huge • Four areas the data tax hits: 1. IT Capital resources 2. IT Operations personnel 3. Application Development *** 4. Business • How big is the data tax? – Measure before and after installing Delphix
  • II. Data Tax is huge : 1. IT Capital • Hardware – Servers – Storage – Network – Data center floor space, power, cooling • Example – Some customers have over 1 Petabyte of duplicate data (1000 TB, i.e. 1,000,000 GB)
  • II. Data Tax is huge : 2. IT Operations • Involves many people – DBAs – Sys Admins – Storage Admins – Backup Admins – Network Admins • 1000s of hours annually just for DBAs – not including all the other personnel required to supply the necessary infrastructure • Data center efforts costly and difficult – Consolidation – Migrations – Move to cloud
  • II. Data Tax is Huge : 3. App Dev quality and speed. Five examples of application Data Tax impact: • Inefficient QA: higher costs of QA • QA delays: greater re-work of code • Sharing DB environments: bottlenecks • Using DB subsets: more bugs in Prod • Slow environment builds: delays. "If you can't measure it you can't manage it"
  • II. Data Tax is Huge : 3. App Dev. Inefficient QA: long build times (chart: build time dominates the QA cycle). 96% of QA time was building the environment: $0.04/$1.00 actual testing vs. setup
  • II. Data Tax is Huge : 3. App Dev. QA delays: bugs found late = more code re-work (chart: cost to correct a bug rises steeply with the delay in fixing it; Software Engineering Economics, Barry Boehm, 1981)
  • II. Data Tax is Huge : 3. App Dev. Sharing full copies causes bottlenecks (diagram: old unrepresentative data, frustration, waiting)
  • II. Data Tax is Huge : 3. App Dev subsets cause bugs
  • II. Data Tax is Huge : 3. App Dev subsets cause bugs The Production ‘Wall’ Classic problem is that queries that run fast on subsets hit the wall in production. Developers are unable to test against all data
  • II. Data Tax is Huge : 3. App Dev. Slow environment builds: 3-6 months to deliver data (diagram: Developer asks for DB → Manager approves → DBA requests system, sets up DB → System Admin requests storage, sets up machine → Storage Admin allocates storage, takes snapshot → Developer gets access)
  • II. Data Tax is Huge : 3. App Dev Slow Environment Builds Why are hand offs so expensive?
  • II. Data Tax is Huge : 3. App Dev Slow Environment Builds: culture of no DBA Developer
  • II. Data Tax is Huge : 3. App Dev Slow Environment Builds Never enough environments
  • II. Data Tax is Huge : 3. App Dev. What we've seen, five examples of application Data Tax impact: • Inefficient QA: higher costs • QA delays: increased re-work • Sharing DB: bottlenecks • Subset DB: bugs • Slow environment builds: delays
  • II. Data Tax is Huge : 4. Business Ability to capture revenue • Business Intelligence – ETLs or data warehouse refreshes • Old data = less intelligence • Less Intelligence = Missed Revenue • Business Applications – Delays on getting applications that generate the revenue
  • II. Data Tax is Huge : 4. Business Intelligence (chart: BI refresh windows at 8am, noon, 1pm, and 10pm across 2011-2015)
  • II. Data Tax is Huge : 4. Business
  • II. Data Tax is Huge : 4. Business. Magnitude of business impact: • $27 Billion revenue • $1 Billion IT spend • $850M development staff • $110M IT Ops • $40M storage (bar chart: Revenue, Dev, IT Ops, Storage)
  • II. Data Tax is Huge : 4. Business (bar chart: Revenue and Dev are more important; IT Ops and Storage are more obvious)
  • II. Data Tax is Huge : How Big is the Data tax? Measure before and after Delphix w/ Fortune 500: median App Dev throughput increased by 2x
  • II. Data Tax is Huge : How big is the data tax? • 10x faster financial close – Facebook: 21 days down to 2 • 9x faster BI refreshes – Bank of the West: from 3 weeks per refresh to 3x a week – weekly refreshes down to 3 times a day • 2x faster projects – Informatica – KLA-Tencor (5x increase) – Presbyterian Health – NY Life • 20% fewer bugs – Stubhub
  • II. Data Tax is Huge : Public Customer Quotes • "Delphix allowed us to shrink our project schedule from 12 months to 6 months.” – BA Scott, NYL VP App Dev • "It used to take 50-some-odd days to develop an insurance product, … Now we can get a product to the customer in about 23 days.” – Presbyterian Health • “Can't imagine working without Delphix” – Ramesh Shrinivasan CA Department of General Services
  • Part III. Companies unaware of the Data Tax
  • III. Companies unaware of the Data Tax Isn’t there technology that does that? Why do I need Delphix? Why do I need an iPhone ?
  • III. Companies unaware of the Data Tax. Nobody does what we do: • NetBackup • Flashback • DataGuard • GoldenGate • SnapClone • FlexClone =
  • Three Core Parts: 1. Copy: sync, snapshots, TimeFlow, purge (source storage, file system). 2. Clone (snapshot): compress, share, cache (storage). 3. Mount, recover, rename: self service, roles & security, rollback & refresh, branch & tag (development instance).
  • Netapp (diagram: SnapManager repository, Snap Manager, Protection Manager, RMAN repository, FlexClone, SnapDrive, SnapMirror, tr-3761.pdf; DBA and Storage Admin coordinating from Production to Development). A Netapp Frankenstein
  • Oracle EM 12c Snap Clone setup: • Register Netapp or ZFS with storage credentials • Install agents on a LINUX machine to manage the Netapp or ZFS storage • Register the test master database • Enable Snap Clone for the test master database • Set up a zone: max CPU and memory and the roles that can see these zones • Set up a pool: a set of machines where databases can be provisioned • Set up a profile: a source database that can be used for thin cloning • Set up a service template: init.ora values. An Oracle Frankenstein
  • EM 12c: Snap Clone (diagram: Production and Development on FlexClone, Netapp, Snap Manager for Oracle). Other technology? • Prove it • Bake off • Customer refs
  • III. Companies unaware of the Data Tax. #1 Biggest Enemy: IT departments believe – they have the best processes that exist – they have the latest and greatest technology – nothing better exists – it's just the way it is – data management is a drop in the bucket of overall issues. "The status quo is pre-ordained failure"
  • The Phoenix Project • IT bottlenecks • Setting priorities • Company goals • Defining metrics • Fast iterations. The IT version of "The Goal" by E. Goldratt. • "Any improvement not made at the constraint is an illusion." • What is the constraint? The DBAs and environments. • "One of the most powerful things that IT can do is get environments to development and QA when they need it"
  • III. Companies unaware of the Data Tax • Ask questions – Delphix: we provision environments in minutes for almost no extra storage. – Customer: We already do that. – Delphix: How long does it take a developer to get an environment after they ask? – Customer: 2-3 weeks. – Delphix: we do it in 2-3 minutes. No other product does what we do
  • III. Companies unaware of the Data Tax • How to enlighten companies? Ask for metrics – Batch window size for ETL – How new (old) is their BI data? – Number of app projects per year – How long does it take a developer to get a DB copy? – How long does it take QA to set up an environment • How long to rollback • How long to refresh • How many times do they run a QA cycle – How old is the data in QA and DEV
  • Summary I. Companies pay a Data Tax II. Data Tax is huge III. Companies unaware of the Data Tax
  • Life with Delphix
  • Typical Architecture (diagram: Production, Dev/QA/UAT, Reporting, and Backup environments, each with their own instances, databases, and file systems)
  • With Delphix (diagram: Production keeps its database and file system; Dev & QA, Reporting, and Backup instances run against virtual databases)
  • DevOps With Delphix 1. 2. 3. 4. 5. Efficient QA: Low cost, high utilization Quick QA : Fast Bug Fix Every Dev gets DB: Parallelized Dev Full DB : Less Bugs Fast Builds: Culture of Yes
  • Impact on bottom line • IT Capital expense – 90 % less storage – 30 % less licenses, floor space, servers , power • IT Operational & Personnel – 1000s of hours down to 10 • Application Development – 2x project output at higher quality • Business – 9x more Fresh BI data – 2x Faster time to market and higher quality – Increase revenue
  • Summary • Data tax IS the big gorilla. • Delphix Agile data is small & fast • With Delphix , Deliver projects – Half the Time – Higher Quality – Increase Revenue
  • kylehailey.com/delphix Use Cases What is Delphix Competition
  • What is Delphix
  • Three Physical Copies Three Virtual Copies Delphix
  • Install Delphix on any x86 (Intel) hardware
  • Allocate any storage to Delphix: any type. Pure Storage + Delphix: better performance for 1/10 the cost
  • One time backup of the source database (diagram: Production instances, database, and file system; supported and upcoming platforms)
  • DxFS (Delphix) compresses data (diagram: Production instances, database, file system). Data is compressed typically to 1/3 size
  • Incremental forever change collection (diagram: Production changes flow into a time window) • Collected incrementally forever • Old data purged
  • Source Full Copy Source backup from SCN 1
  • Snapshot 1 Snapshot 2
  • Snapshot 1 Snapshot 2 Backup from SCN
  • Snapshot 1 Snapshot 3 Snapshot 2
  • Snapshot 3 Snapshot 2 Drop Snapshot 1
  • Cloning (diagram: Production database and its time window feeding multiple virtual database instances)
  • Typical Architecture (diagram: Production, QA, UAT, and Development, each with full physical copies on their own file systems)
  • With Delphix (diagram: Production keeps its database and file system; Development, QA, and UAT instances run against virtual databases)
  • Delphix Use Cases 1. 2. 3. 4. 5. Fast, Fresh, Full Free Branching Federated Self Serve
  • Fast, Fresh, Full Source Development VDB Instance Instance Time Window
  • Free Instance Source Instance Instance Instance gif by Steve Karam
  • Branching (diagram: a Dev VDB branched from Source; QA branched from Dev; checkout and bookmark points)
  • Federated Cloning
  • Federated (diagram: VDBs provisioned from Source1 and Source2 to the same point in time)
  • “I looked like a hero” Tony Young, CIO Informatica
  • Self Service
  • Use Cases 1. Development Acceleration 2. Quality 3. BI
  • DevOps
  • DevOps With Delphix 1. 2. 3. 4. 5. Efficient QA: Low cost, high utilization Quick QA : Fast Bug Fix Every Dev gets DB: Parallelized Dev Full DB : Less Bugs Fast Builds: Culture of Yes
  • 1. Efficient QA: lower cost (chart: build time shrinks and QA test time dominates the cycle). 1% of QA time was building the environment: $0.99/$1.00 actual testing vs. setup
  • Rapid QA via Branching
  • 2. QA immediate: bugs found fast and fixed (diagram: with fast QA environment builds, QA runs within each sprint and bugs in the code are caught in the same sprint)
  • 3. Private Copies: Parallelize gif by Steve Karam
  • 4. Full Size DB : Eliminate bugs
  • 5. Self Service: Fast, Efficient. Culture of Yes!
  • Quality 1. Forensics 2. Testing 3. Recovery
  • 1. Forensics: investigate production bugs (diagram): an anomaly on Prod, possibly a code bug, at noon yesterday; spin up a VDB of Prod as it was during the anomaly
  • 2. Testing : Rewind for patch and QA testing Prod Development Instance Instance Time Window Time Window
  • 2. Testing: A/B (diagram: test A with index 1 vs test B with index 2 on separate VDBs) • Keep tests for comparison • Production vs virtual: invisible index on Prod, creating the index on the virtual copy • Flashback vs virtual
  • 3. Recovery: surgical recovery of Production (diagram): a table was accidentally dropped on Prod; spin a VDB up from before the drop
  • 3. Recovery: surgical or full recovery on a VDB (diagram: source time window with Dev1 and Dev2 VDBs, including a branched VDB)
  • 3. Recovery: virtual to physical (diagram): corruption on the source; spin a VDB up from before the corruption
  • 3. Recovery
  • Business Intelligence
  • ETL and Refresh Windows (chart: refresh windows at 8am, noon, 1pm, 10pm)
  • ETL and DW refreshes taking longer (chart: the same windows growing from 2011 to 2015)
  • ETL and Refresh Windows (chart: refresh windows spreading across the day, 6am to 10pm)
  • ETL and DW Refreshes Prod DW & BI Instance Instance Data Guard – requires full refresh if used Active Data Guard – read only, most reports don’t work
  • Fast Refreshes • Collect only Changes • Refresh in minutes Prod Instance BI DW Instance Instance ETL 24x7
  • Temporal Data
  • Oracle 12c
  • 80MB buffer cache ?
  • 200GB Cache
  • With latency (chart: transactions per minute up to 5000, latency ~300 ms, across 1 to 200 users)
  • Latency (chart: transactions per minute up to 8000, latency ~600 ms, across 1 to 200 users)
  • $1,000,000 $6,000
  • Data Center Migration : clone migration 5x Source Data Copy < 1x Source Data Copy
  • Data Center Migration : clone migration + source S S 5x Source Data Copy < 2 x Source Data Copy
  • Data Center Migration : clone migration + source S C C C 5x Source Data Copy C S V V V < 1 x Source Data Copy V
  • Consolidation (diagram: server utilization without Delphix vs with Delphix, active and idle servers)