Smarter Management for Your Data Growth

Matt Aslett (The 451 Group) and Deirdre Mahon (RainStor) examine the evolving data management landscape and how RainStor's Online Data Retention (OLDR) repository fits into the equation.

Published in: Technology, Business
  • De-dupe & Reduction; Any storage / Platform; Cloud Enabled; Limitless Data Volumes; Fast load – Ingestion Rates; SQL Query – High Performance; Immutable Compliant Store
  • So if we take a look at Matt’s earlier high-level architecture diagram, I think it’s worth pointing out the key areas where RainStor technology can be applied. At the top, we have an RS repository which can be deployed alongside the RDBMS … and can be archived / retired, saving space by compressing the data to a much smaller footprint. Our INFA partnership focuses on this area predominantly and retires a large number of applications such as Oracle E-Business Suite… On the lower part of the screen, RS can be deployed as the leading repository to store long-term historical data for EDWs, and additionally the same data sets can be stored in the cloud…
  • Security Industry: The combination of the increase in cybercrime, changing regulations, and public exposures is increasing the attention and resources dedicated to data security. Over the next three years it's expected that data security issues (and the related application security) will account for over 60% of new enterprise security spending; this includes spending on new technologies, and excludes maintenance of existing technologies such as firewalls and antivirus, which account for most current security costs. Data and business application security will drive most of the new growth of the security market over the next 3-5 years.
    Business network traffic for 2010: > 3,800 Pb / month (> 2,500 Pb internet traffic, > 1,200 Pb WAN traffic, > 58 Pb mobile traffic). Cisco forecasts 20% CAGR.
    Data breaches are common: 95% of records stolen externally; 90% involved malware; 70% were uncovered by outsiders; 50% went unnoticed for months.
    CSPs: Global mobile data traffic will increase 26-fold between 2010 and 2015. Mobile data traffic will grow at a compound annual growth rate (CAGR) of 92 percent from 2010 to 2015, reaching 6.3 exabytes per month by 2015. Last year’s mobile data traffic was three times the size of the entire global Internet in 2000.
  • Transcript

    • 1. Smarter Management for Your Data Growth
      Retain Critical Data Online At A Fraction of The Cost
      April 2011
    • 2. Agenda
      Introductions
      Changing Data Management Landscape & Trends
      From Operational to Analytical
      Cloud and Hadoop
      Where do They Fit?
      RainStor and How it Works
      Analytics Data Retention Use-case
      Economics
      Q&A
      Matt Aslett, The 451 Group
      Deirdre Mahon, VP Marketing – RainStor
      Ramon Chen, VP Product Management – RainStor
    • 3. Total Data
      The changing data management landscape
      Matthew Aslett, The 451 Group
      matthew.aslett@the451group.com
      © 2011 by The 451 Group. All rights reserved
    • 4. The 451 Group
      451 Research is focused on the business of enterprise IT innovation. The company’s analysts provide critical and timely insight into the competitive dynamics of innovation in emerging technology segments.
      Tier1 Research is a single-source research and advisory firm covering the multi-tenant datacenter, hosting, IT and cloud-computing sectors, blending the best of industry and financial research.
      The Uptime Institute is ‘The Global Data Center Authority’ and a pioneer in the creation and facilitation of end-user knowledge communities to improve reliability and uninterruptible availability in datacenter facilities.
      TheInfoPro is a leading IT advisory and research firm that provides real-world perspectives on the customer and market dynamics of the enterprise information technology landscape, harnessing the collective knowledge and insight of leading IT organizations worldwide.
      ChangeWave Research is a research firm that identifies and quantifies ‘change’ in consumer spending behavior, corporate purchasing, and industry, company and technology trends.
    • 5. Overview
      The changing data management landscape
      One overarching trend:
      Total Data
      Impacting four technology areas:
      Operational database
      Analytic database
      Data archiving
      Machine-generated data
      The trends driving data management
    • 6. Trends driving data management
      The volume, variety and velocity of data have never been greater and are still growing
      The value of data has never been better understood
      The capabilities for processing data have never been better
      Higher processor performance and density are enabling advanced processing on commodity hardware
      Software enhancements designed to make best use of processing performance and scalable architecture
      Advanced and in-database analytics bring processing to the data, reducing latency and improving efficiency
      The data deluge problem is also a big data opportunity
    • 7. Introducing Total Data
      A concept defined by The 451 Group to describe new approaches to data management – beyond restrictive silos
      Reflects the changing data management landscape as pragmatic choices are being made about data storage and analysis techniques
      Processing any data that might be applicable to analytics
      in the operational database, data warehouse, or Hadoop, or archive
      Structured, semi-structured or unstructured
      Relational or non-relational, on-premise or in the cloud
      Inspired by ‘Total Football’
    • 8. Total Football meets Total Data
      “You make space, you come into space. And if the ball doesn’t come, you leave this space and another player will come into it.”
      Bernardus Hulshoff, Ajax 1966-77
      Abandonment of restrictive (self-imposed) rules about individual roles and responsibility
      Enabled and relied on fluidity and flexibility to respond to changing requirements
      Reliant on, and exploited, improved performance levels
    • 9. Data management – in theory
      • The application is the primary source of data
      • The relational database is sacrosanct
      • The enterprise data warehouse is the single source of the truth (or is supposed to be)
      • Offline data archiving
      • Infrastructure primarily exists to support the data/application layer
      [Diagram: Enterprise app; Reporting/BI; Operational database; Data cleansing/sampling/MDM; EDW; Data archive; Infrastructure]
    • 14. Data management – in practice
      • The relational database is sacrosanct
      • Distributed data layer to meet the scalability and performance demands
      • New opportunities for real-time BI
      • Polyglot persistence – use the most appropriate data storage for the application
      [Diagram: Enterprise app; Reporting/BI; Distributed data; Data cleansing/sampling/MDM; Operational databases; EDW; Data archive; Infrastructure]
    • 18. Data management – in practice
      • The enterprise data warehouse is the single source of the truth
      • Data is copied into departmental or regional data marts
      • Data warehouse administrators are fighting a losing battle for control
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Operational databases; Analytic databases; EDW; Data archive; Infrastructure]
    • 21. Data management – in practice
      • Higher processor performance and density are enabling advanced processing on commodity hardware
      • Advanced in-database analytics bring processing to the data, reducing latency and improving efficiency
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Operational databases; Analytic databases; EDW; Data archive; Infrastructure]
    • 23. Data management – in practice
      • Hadoop and associated analysis tools (Hive, Pig) for large-scale batch processing of large, complex data sets
      • Taking further advantage of hardware economics
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data archive; Infrastructure]
    • 25. Data management – in practice
      • Integrating Hadoop with the data warehouse for ETL and also two-step data analysis
      • Greater acceptance that the EDW is part of a broader data analytics architecture
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data archive; Infrastructure]
    • 27. Data location, data location, data location
      Not the end of the EDW, but the EDW is one of many sources of BI, rather than the only source of BI
      The issue of data location becomes paramount
      Choose the right storage technology – software and hardware
      EDW, Hadoop or archive
      On-premise or on the cloud
      Memory, disk or SSD
      Understand the requirements:
      Value and temperature of the data
      Ensure data can be queried using existing tools/skills
      Cost
    • 28. EDW requirements/characteristics
      High performance query/analysis response
      Ability to support multiple users concurrently
      Capacity for multi-terabyte storage and scale
      Fast data load and staging for data transformation
      Ability to operate with BI/analytics tools
      Security and governance
      Cost - $20k-$50k per TB
      Alternatives
      Do nothing and suffer the consequences
      Deploy appliances and/or Hadoop for specific use-cases
      Offload to an online repository
    • 29. Data management – in practice
      • Offline data archiving
      • Traditionally, data archived for legal requirements
      • Previously little need for querying/analytics
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data archive; Infrastructure]
    • 32. Data management – in practice
      • Regulations have increased the need to query archived data
      • Focus shifts on to how to enable querying easily and cost effectively
      • Becomes an online repository for historical data
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data repository; Infrastructure]
    • 35. Data management – in practice
      • Infrastructure primarily exists to support the data/application layer
      • “Machine generated data” an untapped source of data
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data repository; Infrastructure]
    • 37. Data management – in practice
      • Infrastructure as a source of data for analysis and integration with application data: ‘datastructure’
      • Likely to transform into data-generating and data-processing infrastructure as analytics capabilities are applied directly to the data source
      [Diagram: Enterprise app; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Data repository; Datastructure]
    • 39. Data management – in practice
      • Cloud as both a source of data and data storage and processing layer
      [Diagram: Enterprise app; Hadoop/DW; Data archive; Analytic DB; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Cloud Infrastructure; Data repository; Datastructure]
    • 40. Total Data
      • More flexible approach to data management
      • Greater opportunities for business intelligence
      [Diagram: Enterprise app; Hadoop/DW; Data archive; Analytic DB; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Cloud Infrastructure; Data repository; Datastructure]
    • 42. Data location, data location, data location
      Avoid data movement and duplication – retain governance
      Virtual data marts and data clouds
      Data virtualization to provide access to multiple data sources
    • 43. Data virtualization
      [Diagram: Enterprise app; Hadoop/DW; Data archive; Analytic DB; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Cloud Infrastructure; Data repository; Datastructure]
    • 44. Data virtualization
      [Diagram: Enterprise app; Analytic DB; Hadoop/DW; Data archive; Reporting/BI; Reporting; Distributed data; Data virtualization; Data cleansing/sampling/MDM; Hadoop; Operational databases; Virtual data marts; EDW; Cloud Infrastructure; Data repository; Datastructure]
    • 45. Who is RainStor?
      Specialized database for cost effective
      reduction, retention & on-demand retrieval
      of historical structured data
      At 10x Less Cost
      OEM Partner Model
      Cloud or On-premise
    • 46. Partner Case Studies
      HP
      • Sector: Telco
      • Solution: CDR/IPDR retention and lawful intercept (HP Dragon)
      • Retaining billions of CDRs per day in immutable form and enabling cost-effective query for regulatory authorities
      • Sector: Telco
      • Solution: Message (SMS/MMS) and traffic log management
      • Retaining 1000s of messages a second while keeping them accessible for regulatory purposes
      • Sector: Horizontal
      • Solution: Teradata Data Retention Machine
      • Retain BI & analytical data long term in the RainStor-powered Data Retention Machine for low cost per TB stored, eliminating tape
      • Sector: Various/Horizontal
      • Solution: Information Lifecycle Management
      • Retaining historical data from highly complex packaged applications while keeping it accessible for business and regulatory purposes
    • Data Retention Solution Requirements
      [Diagram: Database Archiving; Application Retirement; Data Warehouse Archiving; Data Warehouse Appliance; Online Data Retention (OLDR); Analytical (OLAP); Transactional (OLTP); Compliance; Query; Static Machine-Generated Data (MGD)]
    • 55. Where RainStor Fits
      [Diagram: Enterprise app; Hadoop/DW; Data archive; Analytic DB; Application Archive / Retired; Reporting/BI; Reporting; Distributed data; Data cleansing/sampling/MDM; Hadoop; Operational databases; Analytic databases; EDW; Cloud Infrastructure; Data repository; Datastructure]
    • 56. RainStor’s Focus
      Utilities: Smart grid to generate 1 exabyte of data in the US alone over the next 2 years
      Security (network forensics, cyber-security): Data security will account for over 60% of new enterprise security spending in the next 3 years
      Communications: Global mobile data traffic will grow 26-fold between 2010 and 2015 (6.3 exabytes per month)
      Big Data Volumes: need to be online & query-able
      Found the needle – where’s the haystack?
      Volumes are rising; regulated; infrastructure needs
      Reaching Telco-scale
      Multi-billions of records
      Strict Compliance
      RDBMSs Break
      Analytics Required
      10s of Petabytes Retained
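As a quick sanity check on the mobile traffic figure, 26-fold growth over the five years from 2010 to 2015 implies a compound annual growth rate of roughly 92 percent. A minimal Python sketch follows; the function name `cagr` is ours, introduced only for illustration.

```python
def cagr(growth_multiple: float, years: int) -> float:
    """Compound annual growth rate implied by an overall growth multiple."""
    return growth_multiple ** (1.0 / years) - 1.0


# 26-fold growth between 2010 and 2015 (5 years), as quoted in the deck.
mobile_cagr = cagr(26, 5)
print(f"Implied CAGR: {mobile_cagr:.1%}")
```

The same helper can be applied to any "N-fold over Y years" claim to compare it against a quoted annual rate.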
    • 60. How Does RainStor Do It?
      Reduce
      SIZE: Massive de-dupe ~97% savings in storage
      HARDWARE: On commodity server/disk infrastructure
      RESOURCES: Without specialist DBA support
      Retain
      PRESERVED: Massive record volumes in original form
      IMMUTABLE: Tamper proofed with audit trail
      CONFIGURABLE: With retention & expiry policies
      Retrieve
      STANDARDS: SQL & BI tools via ODBC/JDBC
      PERFORMANT: Fast queries for large complex data sets
      FLEXIBLE: With schema evolution & point-in-time access
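To put the "~97% savings in storage" claim in perspective, here is a small back-of-the-envelope sketch. The 100 TB input is a hypothetical figure chosen for illustration, not a number from the presentation, and the function names are ours.

```python
def reduced_footprint_tb(raw_tb: float, savings: float) -> float:
    """Storage needed after reduction, given fractional savings (0.97 = 97%)."""
    return raw_tb * (1.0 - savings)


def reduction_ratio(savings: float) -> float:
    """Equivalent N:1 reduction ratio for a given fractional savings."""
    return 1.0 / (1.0 - savings)


raw_tb = 100.0   # hypothetical raw data volume, for illustration only
savings = 0.97   # "~97% savings in storage" as stated on the slide

stored_tb = reduced_footprint_tb(raw_tb, savings)
ratio = reduction_ratio(savings)
print(f"{raw_tb:.0f} TB raw -> about {stored_tb:.0f} TB stored (roughly {ratio:.0f}:1)")
```

Actual savings would depend on how repetitive the data is, so the result should be read as an upper-level estimate rather than a guarantee.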
    • 61. RainStor’s Disruptive Technology
      • Patented – 4 layers of compression
      • Data Reduction through value and pattern de-duplication
      • Further Algorithmic-level and byte-level compression
      • Fast Queries in stored format without re-inflation.
      [Diagram: sample field values (Peter, Paul, John; Smith, Brown; Pharma, Finance; $40,000, $35,000) shown in three successive panels containing 4, 7 and 9 values]
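The general idea of value de-duplication can be illustrated with a toy sketch: each distinct field value is stored once in a dictionary, and records hold only references to those values. This is an assumption about the general technique, not RainStor's patented implementation, and the three sample records are hypothetical combinations of the values shown on the slide.

```python
from typing import Dict, List, Tuple


def dedupe(records: List[Tuple[str, ...]]) -> Tuple[List[str], List[Tuple[int, ...]]]:
    """Store each distinct value once; encode records as tuples of value IDs."""
    value_ids: Dict[str, int] = {}
    values: List[str] = []
    encoded: List[Tuple[int, ...]] = []
    for record in records:
        ids = []
        for field in record:
            if field not in value_ids:
                # First time this value is seen: assign it the next ID.
                value_ids[field] = len(values)
                values.append(field)
            ids.append(value_ids[field])
        encoded.append(tuple(ids))
    return values, encoded


def rebuild(values: List[str], encoded: List[Tuple[int, ...]]) -> List[Tuple[str, ...]]:
    """Reconstruct the original records from the value store and ID tuples."""
    return [tuple(values[i] for i in ids) for ids in encoded]


# Hypothetical records (first name, last name, department, salary).
records = [
    ("Peter", "Smith", "Pharma", "$40,000"),
    ("Paul", "Smith", "Finance", "$35,000"),
    ("John", "Brown", "Finance", "$35,000"),
]

values, encoded = dedupe(records)
total_fields = sum(len(r) for r in records)
print(f"{len(values)} distinct values stored for {total_fields} fields")
```

In this sketch the original records can always be rebuilt from the value store, which is what makes the reduction lossless; the pattern-level and byte-level layers mentioned on the slide are not modeled here.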
    • 65. Offload Warehouse Data to Online Archive: High Performance & Lower Cost
      • Augment existing warehouse & analytics systems by providing access to years of history
      • Run query on RainStor and import results to data warehouse
      • Re-instate data from data retention repository back to warehouse for deep analytics
      Benefits:
      • Lower TCO (Admin, Storage, CPU)
      • Compliant data retention
      • Unlimited scalability
      • Add more data sources for broader analysis
      [Diagram labels: Source DB (e.g. Oracle); Analytics/DW; 5 Quarters; 50 Quarters]
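The "run query on RainStor and import results to data warehouse" pattern can be sketched as follows. Two in-memory SQLite databases stand in for the archive and the warehouse purely so the example is runnable; the table and column names are hypothetical, and a real deployment would connect through ODBC/JDBC as the presentation describes.

```python
import sqlite3

# Stand-ins: in-memory SQLite for both systems (an assumption for illustration).
archive = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# Hypothetical historical sales data held in the archive.
archive.execute("CREATE TABLE sales (quarter TEXT, region TEXT, amount REAL)")
archive.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2005Q1", "EMEA", 120.0),
        ("2005Q1", "APAC", 80.0),
        ("2005Q2", "EMEA", 150.0),
    ],
)

# Step 1: run the aggregate query where the history lives.
rows = archive.execute(
    "SELECT quarter, SUM(amount) FROM sales GROUP BY quarter ORDER BY quarter"
).fetchall()

# Step 2: import only the aggregated result set into the warehouse.
warehouse.execute("CREATE TABLE sales_history_summary (quarter TEXT, total REAL)")
warehouse.executemany("INSERT INTO sales_history_summary VALUES (?, ?)", rows)
warehouse.commit()

summary = warehouse.execute(
    "SELECT quarter, total FROM sales_history_summary ORDER BY quarter"
).fetchall()
print(summary)
```

The design point is that the detailed history never has to be reloaded into the warehouse; only the query result moves.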
    • 71. RainStor Cloud
      1. Compressed de-duplicated data sent to the cloud resulting in quicker and cheaper uploads.
      2. Encrypted data stored in private containers ensuring security and easy management.
      3. Data accessed on demand using standard SQL tools leveraging elasticity of the cloud
      [Diagram labels: VM Software Appliance; Amazon S3; Amazon EC2; Send; Store; Search; ODBC/JDBC]
    • 72. How Do the Economics Stack Up?
    • 73. Quick summary
      The growing volume, variety and velocity of data is a problem, but it is also an opportunity
      Requires a broader approach to data management
      Deploy appliances and Hadoop for specific use-cases, and online repository for historical data
      ‘Datastructure’ will become increasingly valuable, not only as a source of data but also as a source of intelligence
      Data location, and the role of data virtualization will come into greater focus
    • 74. Q&A
    • 75. FULL TIME
      Thank you
