SlideShare a Scribd company logo
1 of 38
To Scale or Not to ScaleWhich API is Right for You? Uri Cohen, GigaSpaces @uri1803
> SELECT * FROM devoxx2010.speakers WHERE name=‘Uri Cohen’ +-----------------------------------------------------+ | Name      | Company    | Role            | Twitter  | +-----------------------------------------------------+ | Uri Cohen | GigaSpaces | Product Manager | @uri1803 | +-----------------------------------------------------+ > db.devoxx_speakers.find({name:”Uri Cohen”}) {   “name”:”Uri Cohen”,    “company”: {                 name:”GigaSpaces”,                  products:[“XAP”, “IMDG”]                 domain: “In memory data grids”              }   “role”:”product manager”,    “twitter”:”@uri1803” }
Agenda SQL What it is and isn’t good for  NoSQL Motivation & Main Concepts  Common interaction models Key/Value, Column, Document NOT consistency and distribution algorithms  One Data Store, Multiple APIs Brief intro to GigaSpaces  Key/Value challenges  SQL challenges: Add-hoc querying, Relationships (JPA)
A Few (more) Words About SQL
SQL (Usually) Centralized  Transactional, consistent Hard to Scale
SQL Static, normalized data schema Don’t duplicate, use FKs
SQL Add hoc query support   Model first, query later
SQL Standard   Well known   Rich ecosystem
(Brief) NOSql Recap
NoSql (or a Naive Attempt to Define It) A loosely coupled collection of non-relational data stores
NoSql (or a Naive Attempt to Define It) (Mostly)  d   i   s   t   r   i   b   u   t   e   d
NoSql (or a Naive Attempt to Define It) scalable (Up & Out)
NoSql (or a Naive Attempt to Define It) Not (always) ACID  BASE anyone?
Why Now? Timing is everything… Exponential Increase in data & throughput  Non or semi structured data that changes frequently
A Universe of Data Models  15 Key / Value Column Document {   “name”:”uri”,   “ssn”:”213445”,   “hobbies”:[”…”,“…”],   “…”: {       “…”:”…”       “…”:”…”    }  } {   { ... } } {   { ... } }
Key/Value Have the key? Get the value That’s about it when it comes to querying  Map/Reduce (sometimes) Good for cache aside (e.g. Hibernate 2nd level cache) Simple, id based interactions (e.g. user profiles)  In most cases, values are Opaque
Key/Value Scaling out is relatively easy  (just hash the keys) Some will do that automatically for you  Fixed vs. consistent hashing
Key/Value Implementations:  Memcached, Redis, Riak  In memory data grids (mostly Java-based) started this way  GigaSpaces, Oracle Coherence, WebSphere XS, JBoss Infinispan, etc.
Column Based
Column Based  Mostly derived from Google’s BigTable / Amazon Dynamo papers  One giant table of rows and columns Column == pair (name and a value, sometimes timestamp) Each row can have a different number of columns Table is sparse:  (#rows) × (#columns) ≥ (#values)
Column Based  Query on row key  Or column value (aka secondary index) Good for a constantly changing, (albeit flat) domain model
Document Think JSON (or BSON, or XML) {   “name”:”Lady Gaga”,   “ssn”:”213445”,   “hobbies”:[”Dressing up”,“Singing”],   “albums”:      [{“name”:”The fame”        “release_year”:”2008”},       {“name”:”Born this way”        “release_year”:”2011”}]  } {   { ... } } {   { ... } }
Document Model is not flat, data store is aware of it  Arrays, nested documents  Better support for ad hoc queries MongoDB  excels at this  Very intuitive model  Flexible schema
What if you didn’t have to choose? JPA {   “name”:”uri”,   “ssn”:”213445”,   “hobbies”:[”…”,“…”],   “…”: {       “…”:”…”       “…”:”…”    }  } {   { ... } } {   { ... } } JDBC
A Brief Intro to GigaSpaces  In Memory Data Grid  ,[object Object],[object Object],[object Object],[object Object]
Distributed (multiple partitions),[object Object]
Memcached (the Daemon is in the Details) 30
Memcached (the Daemon is in the Details) 31
SQL/JDBC – Query Them All Query may involve Map/Reduce Reduce phase includes merging and sorting 32
SQL/JDBC – Things to Consider  Unique and FK constraints are not practically enforceable  Sorting and aggregation may be expensive  Distributed transactions are evil  Stay local… 33
JPA  It’s all about relationships… 34
JPA Relationships  To embed or not to embed, that is the question…. 35 ,[object Object]
Easy to query: user.accounts.type= ‘checking’ ,[object Object],[object Object]
Partitioning is hard
Querying involves joining,[object Object]
Thank You!@uri1803http://www.gigaspaces.com 38
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app?
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app?

More Related Content

What's hot

Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Lucidworks
 
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
Lightbend
 

What's hot (20)

Spark Summit EU talk by Ted Malaska
Spark Summit EU talk by Ted MalaskaSpark Summit EU talk by Ted Malaska
Spark Summit EU talk by Ted Malaska
 
Buzzwords 2014 / Overview / part1
Buzzwords 2014 / Overview / part1Buzzwords 2014 / Overview / part1
Buzzwords 2014 / Overview / part1
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra Perfect
 
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
 
High Performance Solr
High Performance SolrHigh Performance Solr
High Performance Solr
 
Ali Asad Lotia (DevOps at Beamly) - Riemann Stream Processing at #DOXLON
Ali Asad Lotia (DevOps at Beamly) - Riemann Stream Processing at #DOXLONAli Asad Lotia (DevOps at Beamly) - Riemann Stream Processing at #DOXLON
Ali Asad Lotia (DevOps at Beamly) - Riemann Stream Processing at #DOXLON
 
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
 
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
 
Zeppelin and spark sql demystified
Zeppelin and spark sql demystifiedZeppelin and spark sql demystified
Zeppelin and spark sql demystified
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentation
 
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaBack-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
 
使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡使用ZooKeeper打造軟體式負載平衡
使用ZooKeeper打造軟體式負載平衡
 
Monitoring the Dynamic Resource Usage of Scala and Python Spark Jobs in Yarn:...
Monitoring the Dynamic Resource Usage of Scala and Python Spark Jobs in Yarn:...Monitoring the Dynamic Resource Usage of Scala and Python Spark Jobs in Yarn:...
Monitoring the Dynamic Resource Usage of Scala and Python Spark Jobs in Yarn:...
 
Automated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative InfrastructureAutomated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative Infrastructure
 
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, AolMail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
 
Scaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customersScaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customers
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for Presto
 
Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules Restructured
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalability
 
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
Modernizing Infrastructures for Fast Data with Spark, Kafka, Cassandra, React...
 

Similar to To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app?

NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
BigBlueHat
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Justin Smestad
 

Similar to To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app? (20)

Yes, Sql!
Yes, Sql!Yes, Sql!
Yes, Sql!
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction
 
Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015
 
Data Modeling on Azure for Analytics
Data Modeling on Azure for AnalyticsData Modeling on Azure for Analytics
Data Modeling on Azure for Analytics
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
Introduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataIntroduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big Data
 
Compass Framework
Compass FrameworkCompass Framework
Compass Framework
 
SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle Professional
 
Big data technology unit 3
Big data technology unit 3Big data technology unit 3
Big data technology unit 3
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring Data
 
Big Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaBig Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon Athena
 
Not only SQL
Not only SQL Not only SQL
Not only SQL
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTP
 
Polyalgebra
PolyalgebraPolyalgebra
Polyalgebra
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
 
Database Systems - A Historical Perspective
Database Systems - A Historical PerspectiveDatabase Systems - A Historical Perspective
Database Systems - A Historical Perspective
 
7 Databases in 70 minutes
7 Databases in 70 minutes7 Databases in 70 minutes
7 Databases in 70 minutes
 

More from Uri Cohen

Alef event - going open source
Alef event - going open source Alef event - going open source
Alef event - going open source
Uri Cohen
 
Oscon 2013 - Lessons from building an open source community
Oscon 2013 - Lessons from building an open source community Oscon 2013 - Lessons from building an open source community
Oscon 2013 - Lessons from building an open source community
Uri Cohen
 
Cassandra summit - Big Data Apps on the cloud
Cassandra summit - Big Data Apps on the cloud Cassandra summit - Big Data Apps on the cloud
Cassandra summit - Big Data Apps on the cloud
Uri Cohen
 

More from Uri Cohen (20)

Orchestration tool roundup - OpenStack Israel summit - kubernetes vs. docker...
Orchestration tool roundup  - OpenStack Israel summit - kubernetes vs. docker...Orchestration tool roundup  - OpenStack Israel summit - kubernetes vs. docker...
Orchestration tool roundup - OpenStack Israel summit - kubernetes vs. docker...
 
Cloudify workshop at CCCEU 2014
Cloudify workshop at CCCEU 2014 Cloudify workshop at CCCEU 2014
Cloudify workshop at CCCEU 2014
 
SSDs, IMDGs and All the Rest - Jax London
SSDs, IMDGs and All the Rest - Jax LondonSSDs, IMDGs and All the Rest - Jax London
SSDs, IMDGs and All the Rest - Jax London
 
Alef event - going open source
Alef event - going open source Alef event - going open source
Alef event - going open source
 
GigaSpaces XAP for Financial Services
GigaSpaces XAP for Financial Services GigaSpaces XAP for Financial Services
GigaSpaces XAP for Financial Services
 
In Memory Data Grids, Demystified!
In Memory Data Grids, Demystified! In Memory Data Grids, Demystified!
In Memory Data Grids, Demystified!
 
App Centric Devops - CloudStack 2014 Collaboration Conference #CCNA14
App Centric Devops - CloudStack 2014 Collaboration Conference #CCNA14App Centric Devops - CloudStack 2014 Collaboration Conference #CCNA14
App Centric Devops - CloudStack 2014 Collaboration Conference #CCNA14
 
Its the app stupid - CloudStack 2014 Collaboration Conference #CCNA14
Its the app stupid - CloudStack 2014 Collaboration Conference #CCNA14 Its the app stupid - CloudStack 2014 Collaboration Conference #CCNA14
Its the app stupid - CloudStack 2014 Collaboration Conference #CCNA14
 
Deployment Automation on OpenStack with TOSCA and Cloudify
Deployment Automation on OpenStack with TOSCA and CloudifyDeployment Automation on OpenStack with TOSCA and Cloudify
Deployment Automation on OpenStack with TOSCA and Cloudify
 
Cloud stack collabiration conference - It's the app, stupid!
Cloud stack collabiration conference - It's the app, stupid!Cloud stack collabiration conference - It's the app, stupid!
Cloud stack collabiration conference - It's the app, stupid!
 
Changing organizational culture - a sweaty usecase
Changing organizational culture - a sweaty usecaseChanging organizational culture - a sweaty usecase
Changing organizational culture - a sweaty usecase
 
GigaSpaces XAP - Don't Call Me Cache!
GigaSpaces XAP - Don't Call Me Cache!GigaSpaces XAP - Don't Call Me Cache!
GigaSpaces XAP - Don't Call Me Cache!
 
Oscon 2013 - Lessons from building an open source community
Oscon 2013 - Lessons from building an open source community Oscon 2013 - Lessons from building an open source community
Oscon 2013 - Lessons from building an open source community
 
Oscon 2013 -Your OSS Project Is now served
Oscon 2013 -Your OSS Project Is now servedOscon 2013 -Your OSS Project Is now served
Oscon 2013 -Your OSS Project Is now served
 
OpenStack Israel Summit 2013 - It’s the App, Stupid!
OpenStack Israel Summit 2013 - It’s the App, Stupid! OpenStack Israel Summit 2013 - It’s the App, Stupid!
OpenStack Israel Summit 2013 - It’s the App, Stupid!
 
One Does Not Simply Walk Into Devops
One Does Not Simply Walk Into Devops One Does Not Simply Walk Into Devops
One Does Not Simply Walk Into Devops
 
MongoDB in the Clouds
MongoDB in the CloudsMongoDB in the Clouds
MongoDB in the Clouds
 
Carrier Paas - CloudStack Collaboration Event 2012
Carrier Paas - CloudStack Collaboration Event 2012Carrier Paas - CloudStack Collaboration Event 2012
Carrier Paas - CloudStack Collaboration Event 2012
 
Your Apps on the Cloud - What it really takes
Your Apps on the Cloud - What it really takes Your Apps on the Cloud - What it really takes
Your Apps on the Cloud - What it really takes
 
Cassandra summit - Big Data Apps on the cloud
Cassandra summit - Big Data Apps on the cloud Cassandra summit - Big Data Apps on the cloud
Cassandra summit - Big Data Apps on the cloud
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my app?

  • 1. To Scale or Not to ScaleWhich API is Right for You? Uri Cohen, GigaSpaces @uri1803
  • 2. > SELECT * FROM devoxx2010.speakers WHERE name=‘Uri Cohen’ +-----------------------------------------------------+ | Name | Company | Role | Twitter | +-----------------------------------------------------+ | Uri Cohen | GigaSpaces | Product Manager | @uri1803 | +-----------------------------------------------------+ > db.devoxx_speakers.find({name:”Uri Cohen”}) { “name”:”Uri Cohen”, “company”: { name:”GigaSpaces”, products:[“XAP”, “IMDG”] domain: “In memory data grids” } “role”:”product manager”, “twitter”:”@uri1803” }
  • 3. Agenda SQL What it is and isn’t good for NoSQL Motivation & Main Concepts Common interaction models Key/Value, Column, Document NOT consistency and distribution algorithms One Data Store, Multiple APIs Brief intro to GigaSpaces Key/Value challenges SQL challenges: Add-hoc querying, Relationships (JPA)
  • 4. A Few (more) Words About SQL
  • 5. SQL (Usually) Centralized  Transactional, consistent Hard to Scale
  • 6. SQL Static, normalized data schema Don’t duplicate, use FKs
  • 7. SQL Add hoc query support  Model first, query later
  • 8. SQL Standard  Well known  Rich ecosystem
  • 10. NoSql (or a Naive Attempt to Define It) A loosely coupled collection of non-relational data stores
  • 11. NoSql (or a Naive Attempt to Define It) (Mostly) d i s t r i b u t e d
  • 12. NoSql (or a Naive Attempt to Define It) scalable (Up & Out)
  • 13. NoSql (or a Naive Attempt to Define It) Not (always) ACID BASE anyone?
  • 14. Why Now? Timing is everything… Exponential Increase in data & throughput Non or semi structured data that changes frequently
  • 15. A Universe of Data Models 15 Key / Value Column Document { “name”:”uri”, “ssn”:”213445”, “hobbies”:[”…”,“…”], “…”: { “…”:”…” “…”:”…” } } { { ... } } { { ... } }
  • 16. Key/Value Have the key? Get the value That’s about it when it comes to querying Map/Reduce (sometimes) Good for cache aside (e.g. Hibernate 2nd level cache) Simple, id based interactions (e.g. user profiles) In most cases, values are Opaque
  • 17. Key/Value Scaling out is relatively easy (just hash the keys) Some will do that automatically for you Fixed vs. consistent hashing
  • 18. Key/Value Implementations: Memcached, Redis, Riak In memory data grids (mostly Java-based) started this way GigaSpaces, Oracle Coherence, WebSphere XS, JBoss Infinispan, etc.
  • 20. Column Based Mostly derived from Google’s BigTable / Amazon Dynamo papers One giant table of rows and columns Column == pair (name and a value, sometimes timestamp) Each row can have a different number of columns Table is sparse: (#rows) × (#columns) ≥ (#values)
  • 21. Column Based Query on row key Or column value (aka secondary index) Good for a constantly changing, (albeit flat) domain model
  • 22. Document Think JSON (or BSON, or XML) { “name”:”Lady Gaga”, “ssn”:”213445”, “hobbies”:[”Dressing up”,“Singing”], “albums”: [{“name”:”The fame” “release_year”:”2008”}, {“name”:”Born this way” “release_year”:”2011”}] } { { ... } } { { ... } }
  • 23. Document Model is not flat, data store is aware of it Arrays, nested documents Better support for ad hoc queries MongoDB excels at this Very intuitive model Flexible schema
  • 24. What if you didn’t have to choose? JPA { “name”:”uri”, “ssn”:”213445”, “hobbies”:[”…”,“…”], “…”: { “…”:”…” “…”:”…” } } { { ... } } { { ... } } JDBC
  • 25.
  • 26.
  • 27. Memcached (the Daemon is in the Details) 30
  • 28. Memcached (the Daemon is in the Details) 31
  • 29. SQL/JDBC – Query Them All Query may involve Map/Reduce Reduce phase includes merging and sorting 32
  • 30. SQL/JDBC – Things to Consider Unique and FK constraints are not practically enforceable Sorting and aggregation may be expensive Distributed transactions are evil Stay local… 33
  • 31. JPA It’s all about relationships… 34
  • 32.
  • 33.
  • 35.