Thursday, November 3, 11
Data Grids vs Databases
                               Galder Zamarreño
                            Senior Software Engineer
                                  Red Hat, Inc
                             3rd October 2011, Soft Shake

Galder Zamarreño

                           •   R&D Engineer, Red Hat Inc.
                           •   Infinispan developer
                           •   5+ years exp. with distributed data systems
                           •   Twitter: @galderz
                           •   Blog: zamarreno.com



Agenda

                           •   Why do we need Data Grids?
                           •   What exactly are In-memory Data Grids?
                           •   Data Grids + Databases
                           •   Data Grids without a Database
                           •   Can Data Grids replace Databases?



Traditionally...


                           Store everything in a DB!



Modern requirements


                           DBs not particularly good at
                           horizontal scaling...


One size doesn’t fit all!

                           DBs are not bad, but they’re
                           not the solution to every
                           problem either


Data Grids


Data Grids are not new

                           Mainstream traction is only
                           recent: horizontal scaling needs,
                           cheaper memory... and cloud!


Who’s offering Data Grids?

The Players

                           • Open Source:
                            • Infinispan, EhCache, Hazelcast...
                           • Commercial:
                            • Oracle Coherence, Gigaspaces, Gemfire,
                              IBM eXtreme Scale



But what are In-memory DGs?

Definition

                           In-memory data structures that
                           offer extremely fast access to
                           data


Maps are popular!

                           Normally expose a Map-like
                           API, but often offer
                           alternatives too
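
As a rough illustration of what a Map-like API looks like, the sketch below uses the JDK's own `ConcurrentMap` as a stand-in for a grid cache. Infinispan's `Cache` interface extends `ConcurrentMap`, so the same calls apply there, but the class name `MapApiSketch` and the keys and values are made up for this example, and nothing here is distributed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in: a plain in-JVM map playing the role of a grid cache.
final class MapApiSketch {

    static ConcurrentMap<String, String> newCache() {
        return new ConcurrentHashMap<>();
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = newCache();

        cache.put("user:1", "Alice");                // store a value under a key
        String name = cache.get("user:1");           // fetch by key
        cache.putIfAbsent("user:1", "Bob");          // no effect: the key is already present
        cache.replace("user:1", "Alice", "Alicia");  // conditional update, only if the old value matches
        cache.remove("user:1");                      // delete

        System.out.println(name + " / still present: " + cache.containsKey("user:1"));
    }
}
```

The conditional operations (`putIfAbsent`, `replace`) matter more in a grid than in a local map, because several clients may touch the same key concurrently.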


Data distribution

                           Store each entry on a subset of
                           the grid's nodes to provide failover
                           while being able to scale out!
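
One way to picture this is a function that maps each key to a small set of owner nodes. The sketch below is a deliberately simplified assumption: it picks a primary node by hash modulo and uses the following nodes as backups. Real grids generally use consistent hashing so that adding or removing a node moves little data. `OwnerSketch`, `ownersOf`, and the node names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OwnerSketch {

    // Picks up to `numOwners` distinct nodes for a key: a primary chosen by hash,
    // plus the nodes that follow it in list order as backups.
    static List<String> ownersOf(String key, List<String> nodes, int numOwners) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        int copies = Math.min(numOwners, nodes.size());
        int start = Math.floorMod(key.hashCode(), nodes.size());
        List<String> owners = new ArrayList<>(copies);
        for (int i = 0; i < copies; i++) {
            owners.add(nodes.get((start + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-A", "node-B", "node-C", "node-D");
        // Two copies of each entry: losing one node loses no data,
        // while adding nodes adds capacity.
        System.out.println(ownersOf("user:1", nodes, 2));
    }
}
```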


With failure in mind

                           Suitable for commodity
                           hardware because they can
                           handle failure


Elastic


                           Remain available during
                           topology changes


Durability


                           More durability achieved by
                           flushing to a persistent store


Access patterns

                           Embedded (client and DG in
                           same VM)
                           or Remote (just like DBs)


ACID or BASE


                           Transactions or Eventual
                           Consistency?


DGs + DBs?


Caching!

                           Use Data Grids as caches to
                           enhance Database access
                           performance!
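
A common way to do this is the cache-aside pattern: look in the cache first, fall back to the database on a miss, and remember the result. The sketch below assumes a hypothetical `databaseLookup` function in place of a real query, and uses a local `ConcurrentHashMap` where a real deployment would use a grid cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CacheAsideSketch<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> databaseLookup; // hypothetical stand-in for a slow DB query

    CacheAsideSketch(Function<K, V> databaseLookup) {
        this.databaseLookup = databaseLookup;
    }

    // Serve from the cache when possible; on a miss, load from the DB and keep the result.
    V get(K key) {
        return cache.computeIfAbsent(key, databaseLookup);
    }

    // Call after the underlying row changes so that the next read reloads it.
    void invalidate(K key) {
        cache.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger dbHits = new AtomicInteger();
        CacheAsideSketch<Integer, String> users = new CacheAsideSketch<>(id -> {
            dbHits.incrementAndGet(); // count how often the "database" is consulted
            return "user-" + id;
        });
        users.get(42);
        users.get(42); // second read is answered by the cache
        System.out.println("DB hits: " + dbHits.get());
    }
}
```

The hard part in practice is invalidation: the cache only helps if stale entries are removed or updated when the database changes.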


Can a Data Grid replace a DB?

Reiterating benefits


                           Speed, scalability,
                           cloud-friendliness, etc.


What are the Data Grid challenges?

Access patterns


                           Migrating from SQL to Map or
                           alternative APIs is not easy
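
To make the difficulty concrete, here is a hedged sketch with a made-up `Person` type. A lookup by primary key maps directly onto `get`, but a query on a non-key attribute, which in SQL is a one-line `WHERE` clause, becomes a scan over all values unless the grid offers its own query or indexing API. The table and column names in the comments are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

final class QuerySketch {

    // Hypothetical value type; in a DB this would be a row of a `person` table.
    static final class Person {
        final String name;
        final String city;

        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    // SQL: SELECT * FROM person WHERE id = ?
    static Person byId(Map<Integer, Person> people, int id) {
        return people.get(id);
    }

    // SQL: SELECT name FROM person WHERE city = ? ORDER BY name
    // With a plain Map API this turns into a full scan over the values.
    static List<String> namesInCity(Map<Integer, Person> people, String city) {
        return people.values().stream()
                .filter(p -> p.city.equals(city))
                .map(p -> p.name)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Integer, Person> people = new ConcurrentHashMap<>();
        people.put(1, new Person("Ana", "Geneva"));
        people.put(2, new Person("Ben", "Bilbao"));
        people.put(3, new Person("Cleo", "Geneva"));
        System.out.println(namesInCity(people, "Geneva"));
    }
}
```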


Skill set


                           Different skill set:
                           OO programmer vs SQL


Application data layer

                           The data layer has to take data
                           collocation into account and
                           do more validation (the schema
                           is less strict)

E.g. with a DB...




Same with Infinispan




Map/Reduce in detail
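
As a minimal sketch of the map/reduce idea in a grid setting: each node runs a map step over the entries it holds locally, and the partial results are then reduced into one answer. The partitions below are plain in-JVM maps standing in for nodes, and `MapReduceSketch` and its methods are hypothetical names for illustration, not Infinispan's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapReduceSketch {

    // Map phase: count the words in the values held by ONE node's partition.
    static Map<String, Integer> mapPartition(Map<String, String> partition) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : partition.values()) {
            for (String word : text.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Reduce phase: merge the per-node partial counts into a global result.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> nodeA = new HashMap<>();
        nodeA.put("doc1", "data grids scale");
        Map<String, String> nodeB = new HashMap<>();
        nodeB.put("doc2", "databases and data grids");

        Map<String, Integer> result =
                reduce(Arrays.asList(mapPartition(nodeA), mapPartition(nodeB)));
        System.out.println(result);
    }
}
```

The point of the split is that the map step runs where the data lives, so only the small partial results travel over the network.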




Technology to bridge the gap?

What about JPA?

                           Hibernate OGM (Object/Grid
                           Mapper) uses JPA to store in
                           DGs as opposed to DBs


Most frequent use cases for DGs?

Use cases

                           •   Analytic systems, e.g. financial/trading apps
                           •   XTP
                           •   Event-driven apps, e.g. CEP
                           •   Clustering toolkit



Do I see DGs as DB replacements?

DBs are here to stay!

                           No. DBs are proven, mature, and
                           well understood; plus, there are
                           millions of systems out there!


One size doesn’t fit all!


                           DBs are not a universal data
                           storage system any more


Consider Data Grids

                           For their speed, capabilities as
                           data store, and cloud
                           friendliness


Still some way to go

                           More deployments and
                           standardization (JSR-107,
                           JSR-347)


Questions

                           infinispan.org - @infinispan

                           speakerrate.com/galder

