New trends in data
 

Key trends in data including NoSQL and NewSQL

  • Why is data so in vogue these days? This resurgence is driven by an unprecedented explosion of data, as organizations capture a greater quantity and diversity of data than ever before. The challenge is how to both manage and extract more value from the data assets being accumulated. We’re seeing a surge of new tools and new techniques as traditional databases are being stretched beyond their capabilities.
    Explosion of data
    New techniques and technologies to manage data
    Deriving new value from the data
  • The current lay of the land. Before we start talking about NoSQL, NewSQL, etc., where are we today?
    Benefits of virtualizing data
    Relational databases have been used by enterprises for packaged apps like SAP, BI apps such as those for analyzing consumer sales, and custom business apps like a payroll system. The same databases will continue to cater to these application types even with the emergence of all the new data types.
    The traditional deployment of these databases has been on dedicated physical hardware, overprovisioned for each application’s peak usage. With virtualization, you can also deploy them on a shared pool of virtual resources, abstracted from the applications that use them.
  • How do I architect my data tier for highly variable application usage? My app has 10,000 users on a normal day but 10,000,000 on Mother’s Day.
    How do I distribute data efficiently to my compute clouds? I have applications and users in multiple places that need access to the same data in real time.
    How do I process these large quantities of data efficiently to allow for better real-time decision-making?
    I need to build a very high-scale application with transactional consistency.
  • Modern applications are accessed by users from a variety of compute systems: traditional desktop computers and, more frequently, tablets and mobile devices.
  • RDBMSs were not designed for today's automated, data-heavy transactional environments. They have acquired decades' worth of questionable new features (bloat).
    CAP theorem: pick 2
    Consistency: all nodes see the same data at the same time.
    Availability: node failures do not prevent survivors from continuing to operate.
    Partition tolerance: the system continues to operate despite arbitrary message loss (network partitions).
    Database developers all know the ACID acronym. It says that database transactions should be:
    Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
    Consistent: A transaction cannot leave the database in an inconsistent state.
    Isolated: Transactions cannot interfere with each other. Other operations cannot access data that has been modified by a transaction that has not yet completed.
    Durable: Completed transactions persist, even when servers restart. Committed updates can be recovered after any kind of system failure (for example, from the transaction log).
    An alternative to ACID is BASE:
    Basically Available
    Soft state
    Eventual consistency
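To make the "Atomic" property above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are hypothetical, chosen only for illustration. The credit is applied first so that a failing debit shows the whole transaction being rolled back, not just the failing statement.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first: this statement succeeds on its own.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Debit second: violates the CHECK constraint on overdraft.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the debit fails, the earlier credit is undone as well, so no reader ever sees money that exists in only one account.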
  • NoSQL has emerged as a common name for databases that store key-value pairs and support simple operations. These are the de facto databases for Web 2.0 applications, like Craigslist and Foursquare, due to the simplicity, flexibility, and high productivity they enable.
    There are no major commercial players in the space, which is dominated by a number of open source communities. The most common ones you will run into are MongoDB, Cassandra, and Couchbase. These databases differ by the type of data they can store, e.g., documents, graphs, programming objects.
    The primary use cases for NoSQL are high-productivity web and mobile applications, which constitute most of the new development today. These applications typically require simple transactions on large data sets.
    The deployment of MongoDB at Craigslist is a typical example. Craigslist has archived all 5 billion user posts and can search the archive. Craigslist uses MongoDB deployed on commodity hardware to store and retrieve the post documents.
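The "key-value pairs supporting simple operations" idea can be sketched in a few lines of Python. This is a toy, in-process illustration, not the MongoDB API or any real product's interface; the class name DocumentStore and its put, get, and find methods are assumptions made up for this example.

```python
class DocumentStore:
    """Toy key-value store whose values are schema-free documents (dicts)."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc):
        # Store a copy so later changes by the caller do not leak in.
        self._docs[key] = dict(doc)

    def get(self, key, default=None):
        doc = self._docs.get(key)
        return dict(doc) if doc is not None else default

    def find(self, **criteria):
        """Return sorted keys of documents matching every field=value pair."""
        return sorted(
            key
            for key, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )


# Documents in the same store need not share a schema.
store = DocumentStore()
store.put("p1", {"city": "sf", "title": "bike", "price": 120})
store.put("p2", {"city": "sf", "title": "couch"})
store.put("p3", {"city": "nyc", "title": "desk", "tags": ["office"]})
```

The point of the sketch is the shape of the interface: there are no joins and no fixed columns, only lookups by key and simple field matches, which is what makes this style easy to partition across commodity machines.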
  • ODS: operational data store

New trends in data: Presentation Transcript

  • NoSQL, NewSQL, whose SQL? Key trends in data you need to know
  • Agenda
    What’s driving new data offerings
    Key trends in harnessing data
    Learn more this week
  • Traditional Relational Database Management System (RDBMS)
    Primary Use cases
    • Packaged enterprise applications
    • Classic Business Intelligence (BI) applications
    • Custom business apps with OLTP database needs
    [Diagram: Billing App, Expense App, SAP App (shown twice), VMware vSphere]
  • Data Challenges for the Cloud Era
    Elastic scalability/Low-latency
    Multi-Site / Multi-Cloud
    Distributed Processing
    WAN
  • Modern Apps – Indeterminate usage
    Spike in usage
  • Agenda
    What’s driving new data offerings
    Key trends in harnessing data
    Learn more this week
  • Trend #1: One size no longer fits all
    Traditional RDBMS not designed for modern distributed systems
    How to scale-out, elastically?
    This is important to achieve performance under shifting load
    How to reduce disk latency?
    How to provide a consistent view of data across geographies?
    Complex to Scale Out
    Costly to Scale Up
  • Trend #2: Modern data questions, In-memory answers
    In-memory systems
    Pooling and sharing memory (and compute, disk resources)
    Data lives in memory, asynchronous write to disk
    Low latency -> memory faster than disk
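The "data lives in memory, asynchronous write to disk" pattern can be sketched as a write-behind store. This is a minimal illustration under stated assumptions, not how GemFire or any specific product implements it; WriteBehindStore, its methods, and the JSON-lines log format are hypothetical names and choices for this example.

```python
import json
import os
import queue
import tempfile
import threading


class WriteBehindStore:
    """Serves reads and writes from memory; persists to disk in the background."""

    def __init__(self, path):
        self._data = {}
        self._pending = queue.Queue()
        self._path = path
        # Daemon thread so an idle writer never blocks interpreter exit.
        threading.Thread(target=self._drain, daemon=True).start()

    def put(self, key, value):
        self._data[key] = value          # visible to readers immediately
        self._pending.put((key, value))  # disk write happens later

    def get(self, key):
        return self._data[key]

    def _drain(self):
        while True:
            key, value = self._pending.get()
            with open(self._path, "a", encoding="utf-8") as log:
                log.write(json.dumps({"key": key, "value": value}) + "\n")
            self._pending.task_done()

    def flush(self):
        """Block until every queued write has reached the log file."""
        self._pending.join()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "store.log")
    store = WriteBehindStore(path)
    store.put("user:1", {"name": "Ann"})
    in_memory = store.get("user:1")  # does not wait for the disk write
    store.flush()
    with open(path, encoding="utf-8") as log:
        on_disk = [json.loads(line) for line in log]
    return in_memory, on_disk
```

The trade-off is visible in the code: callers get memory-speed latency, but a write that is still queued is lost if the process dies before it reaches disk, which is why real systems pair this with replication or a synchronous option.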
  • Elastic Scalability / Low Latency
    Before
    [Diagram: three Web Servers, three App Servers]
    As load increases, virtualization allows stateless web and app tier to be rapidly scaled
    But stateful database tier must be over-provisioned in advance, and sit idle
    After
    [Diagram: three Web Servers, three App Servers, Data in Memory Pool]
    As load increases, virtualization allows stateless web and app tier to be rapidly scaled
    Data in-memory
    • Reduces databases required
    • Allows for linear application scalability
  • Trend #3: New ways to work with data
    NoSQL
    In-memory
    Key/value pairs, simplicity, high productivity
    Different offerings, different data models: document, graph, big table, column
    NewSQL
    In-memory + SQL
    Scalability benefits of in-memory systems with standardized SQL
  • VMware solutions
    vFabric GemFire
    Memory-oriented distributed data grid
    Cloud scale with database-like reliability
    Object interface
    vFabric SQLFire (public beta)
    In-memory SQL database
    Leverage SQL knowledge
    Horizontal scale, speed and high availability
    Cloud Foundry
    MongoDB, Redis
  • Multi-Site / Multi-Cloud
    Before
    [Diagram: System of record, batch load to ODS, nightly replication over WAN]
    After
    [Diagram: Real-time, GemFire Nodes connected over WAN]
    Object interface: GemFire
    SQL interface: SQLFire
  • Distributed processing
    Memory-oriented database with elastic scalability, lightning-fast performance & HA
    [Diagram: three Clients connected over WAN to three GemFire Nodes]
    Co-locate compute with data
    Object interface: GemFire
    SQL interface: SQLFire
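"Co-locate compute with data" means shipping a function to the nodes that hold the data instead of pulling the data to the client. The sketch below simulates that idea in a single process; it is not the GemFire or SQLFire API. Node, Grid, execute, and map_reduce are hypothetical names, and hashing keys with CRC32 is just one simple partitioning choice assumed for the example.

```python
from zlib import crc32


class Node:
    """Hypothetical stand-in for one grid member holding a data partition."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def execute(self, fn):
        # Runs where the data lives; only fn's result would cross the network.
        return fn(self.data)


class Grid:
    """Partitions keys across nodes and fans functions out to them."""

    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def _owner(self, key):
        # Stable hash so the same key always maps to the same node.
        return self.nodes[crc32(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key, value):
        self._owner(key).data[key] = value

    def get(self, key):
        return self._owner(key).data[key]

    def map_reduce(self, map_fn, reduce_fn):
        """Run map_fn on every partition, then combine the small results."""
        return reduce_fn([node.execute(map_fn) for node in self.nodes])


grid = Grid(["node-a", "node-b", "node-c"])
amounts = {f"order:{i}": i * 10 for i in range(1, 21)}
for key, value in amounts.items():
    grid.put(key, value)

# Each node sums its own partition; the client adds three numbers.
total = grid.map_reduce(lambda part: sum(part.values()), sum)
```

With 20 orders the saving is trivial, but the same shape means a client over a WAN receives one number per node rather than every record.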
  • Agenda
    What’s driving new data offerings
    Key trends in harnessing data
    Learn more this week
  • Learn more
    Demos in the VMware booth
    vFabric SQLFire, vFabric GemFire, Cloud Foundry
    Hands-on Lab
    “Optimizing Data Access for Your Cloud Infrastructure” (SQLFire)
    Ongoing, HOL12, Twitter hashtag #HOL12
    Expert One-on-One
    Schedule 15 minutes with Jags Ramnarayan, Chief Architect
    Tuesday 4pm, Experts-06, Twitter hashtag #Experts-06
    Group Discussion
    Hosted by Jags
    Wednesday, 3:30 pm GD31, Twitter hashtag #GD31
  • Learn more this week and beyond
    Sessions / Panel
    Managing High Performance Data with vFabric SQLFire
    Tuesday, 1pm, CAP1942, Twitter hashtag: #CAP1942
    A Customer Scenario for Next-Generation Data Management with vFabric
    Wednesday, 4pm, CAP2471, Twitter hashtag #CAP2471
    Building Resilient, High Performance, Distributed Applications That Are Data Intensive (GemFire)
    Replay ~2 weeks from now on the VMworld 2011 website, CAP1992, Twitter hashtag #CAP1992
    Big Compute and Big (NoSQL) Data Panel
    Replay ~2 weeks from now on the VMworld 2011 website, CAP3362, Twitter hashtag #CAP3362