Big Data: Movement, Warehousing, & Virtualization

This presentation was given by Barry Thompson, CTO of Tervela, to TSAM (a financial buy-side technology & operations event) in July 2011. It covers trends in big data and how to solve problems with data movement, warehousing, and virtualization solutions.

Published in: Technology, Business

  1. Big Data: Movement, Warehousing, & Virtualization
     Presented to TSAM Data Management Stream – July 14th, 2011
  2. Overview
     • Major Industry Trends
     • Data Virtualization & Distributed Storage
     • Impacts to Our Industry
     • Solution Alignment to Technology
  3. Trend #1: Cost Structure of Storage
     • Cost
       • Distributed commodity storage is 25x cheaper than Tier 1 SAN
       • High reliability (replication): it is closer to 55x
     • Performance
       • Distributed is now faster
       • Flash Exacerbates
     • Decreasing Differentiators
       • Perceived Reliability
       • Enterprise Management
       • Legacy Compatibility
     2009
       - 67 TB in 4U
       - 24.5x multiple
       - Reliability as key differentiator
       - With replication (55x)
       - Equivalent Performance
     Source: BackBlaze.com
     2011
       - 145 TB in 4U (Disk)
       - 27 TB in 4U (Flash)
       - 26x multiple
       - w/ Data Fabric / Virtualization is as reliable
       - Higher Performance
  4. Trend #2: Moving from Blocks to Data
     • Blocks are a legacy of tape storage
     • Deeply embedded in the OS / Driver fabric and most legacy DB architectures
     • Horribly inefficient for modern requirements
       • Replication / Synchronization (>100x retransmission)
       • Networks are not designed for blocks
       • Applications have to Load / Store
       • Wall Street data usage is different from standard Fortune 500 (more dynamic data and higher churn rates)
       • WAN Optimization cannot fully solve this
     • Atomic Data is an emerging model
       • DB Rows / Messages are the historical Atomic example
       • PaaS interfaces are ALL data and file driven
     • What is YOUR interface?
  5. Trend #3: End of Single Location
     • Single-Location Warehouses Are Challenged
       • Time to Query
       • User Experience & SLA
       • Data volumes and WAN bandwidth
       • Regulatory and Security
       • Integrated System Dependencies
       • Clients / customers / applications are all in motion (mobile platform & need for
     • Impact of Moving from Single Location
       • Dynamic data synchronization
         • 1 Second global SLA for data synchronization – emerging standard for risk
         • Mechanisms for distributed data sync are different
       • PUSH = the new Data Fabric
       • PULL = existing WAN Optimization
       • Need for a new model for WAN optimization (beyond zlib / dedupe)
         • Networks can't handle file copy (block); it must be data
       • Elasticity in data movement – the "fabric" must be able to buffer
       • Turns the file and database replication model on its head: 1 to many
  6. Data Virtualization & Distributed Storage
     • Data Virtualization Layers
       • Data (storage, DB, cache, streaming sources, state, etc…)
       • Data Fabric (data movement, reliability, buffering, WAN services)
       • Data transformation (EII) and coordination services (virtualization)
       • Data Access / Interface &
     • Distributed Storage Model
       • Data (storage, DB, cache, streaming sources, state, etc…)
       • Data Fabric (data movement, reliability, buffering, WAN services)
       • Legacy Interfaces
  7. Impact of the New Model
     • Database Vendor Market
       • New Architectures (column store & distributed) can have the same reliability and enterprise features, with far better performance
       • Monolithic DB solutions no longer need to rely upon storage for DR / reliability
     • Cost Structure – One size does NOT fit all
     • Platform
       • Cloud – Public / Private
       • Existing Infrastructure
       • Is there any difference?
     • Elasticity of Compute
  8. Adoption
     • Early Adopters of the Model in the Enterprise
       • Big Data and Mining:
         • Options
         • Back testing
         • Regulatory and compliance
         • Real-time risk
         • Global position & Instrument Master
         • Best Execution
       • Hot-Hot DR
       • Global Data Availability
     • Flexible Computing Utilizing Cloud Technologies
       • Complex derivative pricing
       • Grid – DR
       • Seamless integration of remote locations / venues
  9. About Tervela: Data In Motion
     The Tervela Data Fabric
     The fastest, most reliable, and cost-effective data transport system for globally distributed, mission-critical applications.
     • 10-100x performance increase over traditional solutions
     • Beyond 5x9's: built-in fault tolerance & high availability
     • 50% faster to deliver new apps: simple development tools & embedded services
     • Data-layer security: integrated data entitlements & protection
     Data Fabric Optimized for Distributed Data and Applications
     Products
     • TMX: Message Switch. Message transport through the fabric
     • TPE: Persistence Engine. Embedded storage within the fabric
     • TPM: Provisioning & Management. Central management of the fabric
     • Client APIs: C, C++, C#, Java, JMS, PaaS
     • Virtual Data Fabric Appliance: Free Download www.tervela.com/download
  10. Q&A
  11. Big Data: Movement, Warehousing, & Virtualization
      Presented to TSAM Data Management Stream – July 14th, 2011
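
Slide 4 claims that block-based replication can retransmit more than 100x the data that actually changed. The sketch below shows one way a ratio of that size could arise; it is not the presenter's calculation. The 4096-byte block size, the 40-byte row size, the update offsets, and both function names are assumptions chosen only for illustration.

```python
# Hypothetical sizes, chosen only to illustrate the arithmetic.
BLOCK_SIZE = 4096  # bytes per storage block (assumed)
ROW_SIZE = 40      # bytes per changed row or message (assumed)


def block_bytes_sent(changed_offsets, block_size=BLOCK_SIZE):
    """Bytes shipped when every block touched by a change is resent whole."""
    dirty_blocks = {offset // block_size for offset in changed_offsets}
    return len(dirty_blocks) * block_size


def record_bytes_sent(changed_offsets, row_size=ROW_SIZE):
    """Bytes shipped when only the changed rows themselves are sent."""
    return len(changed_offsets) * row_size


# Three small row updates that happen to land in three different blocks.
updates = [0, 8192, 40960]
block_cost = block_bytes_sent(updates)    # 3 blocks * 4096 bytes
record_cost = record_bytes_sent(updates)  # 3 rows * 40 bytes
amplification = block_cost / record_cost

print(block_cost, record_cost, amplification)  # 12288 120 102.4
```

With these assumed sizes, scattered small updates cost roughly 100x more to replicate as blocks than as rows. Updates that cluster in one block would shrink the ratio, so the figure depends heavily on the write pattern.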

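Slide 5 describes a PUSH model in which the fabric buffers data and replication becomes 1 to many. A minimal sketch of that idea follows, assuming an in-memory queue per subscriber. The class `FabricSketch`, its methods, and the site names are hypothetical and are not Tervela's API.

```python
from collections import deque


class FabricSketch:
    """Toy push-style fan-out: one publisher, many subscribers, one buffer each."""

    def __init__(self, buffer_limit=1000):
        self.buffer_limit = buffer_limit
        self.buffers = {}  # subscriber name -> deque of pending updates

    def subscribe(self, name):
        self.buffers[name] = deque()

    def publish(self, update):
        """Push one update to every subscriber (1 to many)."""
        full = [n for n, p in self.buffers.items() if len(p) >= self.buffer_limit]
        if full:
            # Refuse the update before touching any buffer, so no site gets ahead.
            raise OverflowError(f"buffer full for: {', '.join(full)}")
        for pending in self.buffers.values():
            pending.append(update)

    def drain(self, name, max_items):
        """Deliver up to max_items buffered updates to one subscriber, oldest first."""
        pending = self.buffers[name]
        count = min(max_items, len(pending))
        return [pending.popleft() for _ in range(count)]


fabric = FabricSketch(buffer_limit=3)
fabric.subscribe("london")
fabric.subscribe("tokyo")
for tick in ["px=100", "px=101", "px=102"]:
    fabric.publish(tick)

london = fabric.drain("london", 3)  # fast site takes everything
tokyo = fabric.drain("tokyo", 1)    # slow site takes one; the rest stay buffered
print(london, tokyo, len(fabric.buffers["tokyo"]))
```

The per-subscriber buffer is what lets a slow site lag without stalling a fast one, which is one reading of the slide's "elasticity in data movement". A production fabric would also need persistence and a policy for full buffers; rejecting the update, as here, is only one option.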