Considerations for using {"no":"SQL"} technology on your next IT project, Akmal B. Chaudhri (艾克摩 曹理)
Abstract: Over the past few years, we have seen the emergence and growth of NoSQL technology. This has attracted interest from...
Agenda
In a packed program ... • Introduction • Market analysis • NoSQL • Security and vulnerability • Polyglot persistence • Benchmark...
Introduction
My background • ~25 years' experience in IT – Developer (Reuters) – Academic (City University) – Consultant (Logica) – Technical...
History: Have you run into limitations with traditional relational databases? Don’t mind trading a query language for scalabili...
Your road leads to NoSQL? (figure; every label reads NoSQL)
Gartner hype curve (figure label: NoSQL)
Innovation adoption lifecycle. Source: http://en.wikipedia.org/wiki/Technology_adoption_lifecycle
Crossing the chasm (figure label: Chasm)
20 years ago
10 years ago
Today
The way developers really think (figure labels: OO, XML, NoSQL)
NoSQL is developer-friendly (figure labels: Other Stakeholders, Developers)
But ... Riak ... We’re talking about nearly a year of learning.[1] Things I wish I knew about MongoDB a year ago[2] I am learni...
NoSQL hoopla and hype
Extra! extra! ...
Extra! extra! ... Source: Inspired by “The Next Big Thing 2012”, The Wall Street Journal, 27 September 2012
Extra! extra! ...
Extra! extra! ...
Extra! extra! Source: Inspired by the movie “Airplane!” (1980)
Past proclamations of the imminent demise of relational technology • Object databases vs. relational – GemStone, ObjectStore,...
Market analysis
NoSQL market size ... • Private companies do not publish results • Venture Capital (VC) funding 10s/100s of millions of US $[1]...
NoSQL market size. Source: http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2012-2017
NoSQL job trends. Source: http://regulargeek.com/2012/08/30/nosql-job-trends-august-2012/ (August 2012) Example: 100,000 x 0.08...
NoSQL jobs in the UK • Database and Business Intelligence – MongoDB (696) – Cassandra (291) – Redis (242) – CouchDB (122) – Hive ...
DB-Engines ranking ... Source: http://db-engines.com/en/ranking/ (November 2012)
DB-Engines ranking. Source: http://db-engines.com/en/ranking/ (November 2012)
PaaS example ... Source: Jelastic, used with permission
PaaS example. Source: Jelastic, used with permission
NoSQL
NoSQL The Movie! (figure label: Sequel)
Why did NoSQL datastores arise? • Some applications need very few database features, but need high scale • Desire to avoid da...
NoSQL drivers. Source: Couchbase NoSQL Survey (December 2011)
Source: 451 Research, used with permission
How many systems? Source: http://nosql-database.org/ (October 2012)
Major categories of NoSQL ... (table columns: Type, Examples; types: Document store, Column store, Key-value store, Graph store)
Major categories of NoSQL: Document store, Column store, Key-value store, Graph store. Key, Document (collection of key-values); Key, CF...
Source: Ilya Katsov, used with permission
Commercialization examples
Document store • Represent rich, hierarchical data structures, reducing the need for multi-table joins • Structure of the doc...
Connectionprivate static final String DBNAME = "demodb";private static final String COLLNAME = "people";...MongoClient mon...
CreateBasicDBObject document = new BasicDBObject();List<String> likes = new ArrayList<String>();likes.add("satay");likes.a...
ReadBasicDBObject document = new BasicDBObject();document.put("name", "akmal");DBCursor cursor = collection.find(document)...
UpdateBasicDBObject document = new BasicDBObject();document.put("name", "akmal");BasicDBObject newDocument = new BasicDBOb...
DeleteBasicDBObject document = new BasicDBObject();document.put("name", "akmal");collection.remove(document);
Connectionvar async = require(async);var MongoClient = require(mongodb).MongoClient;MongoClient.connect("mongodb://localho...
Createfunction (callback) {collection.insert(document, {w:1}, function(err, result) {if (err) {return callback(err);}callb...
Readfunction (callback) {collection.findOne({name:akmal}, function(err, item) {if (err) {return callback(err);}console.log...
Updatefunction (callback) {collection.update({name:akmal}, {$set:{age:29}}, {w:1},function(err, result) {if (err) {return ...
Deletefunction (callback) {collection.remove({name:akmal}, function(err, result) {if (err) {return callback(err);}callback...
Column store ... • Manage structured data, with multiple-attribute access • Columns are grouped together in “column-families/...
Column store • Scale using replication, multi-node distribution for high availability and easy failover • Optimized for write...
ConnectionClass.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");connection = DriverManager.getConnection("jdbc:ca...
CreateString query ="BEGIN BATCHn" +"INSERT INTO people (name, age, date, likes) VALUES (akmal, 40, "+ new Date() +", {sat...
ReadString query = "SELECT * FROM people";Statement statement = connection.createStatement();ResultSet cursor = statement....
UpdateString query ="UPDATE people SET age = 29 WHERE name = akmal";Statement statement = connection.createStatement();sta...
DeleteString query ="BEGIN BATCHn" +"DELETE FROM people WHERE name = akmaln" +"APPLY BATCH;";Statement statement = connect...
Key-value store • Simplest NoSQL stores, provide low-latency writes but single key/value access • Store data as a hash table ...
ConnectionJedis j = new Jedis("localhost", 6379);j.connect();System.out.println("Connected to Redis");
CreateString id = Long.toString(j.incr("global:nextUserId"));j.set("uid:" + id + ":name", "akmal");j.set("uid:" + id + ":a...
ReadString id = j.hget("uid:lookup:name", "akmal");print("name ", j.get("uid:" + id + ":name"));print("age ", j.get("uid:"...
UpdateString id = j.hget("uid:lookup:name", "akmal");j.set("uid:" + id + ":age", "29");
DeleteString id = j.hget("uid:lookup:name", "akmal");j.del("uid:" + id + ":name");j.del("uid:" + id + ":age");j.del("uid:"...
Graph store • Use nodes, relationships between nodes, and key-value properties • Access data using graph traversal, navigatin...
Connectionprivate static final String DB_PATH ="C:/neo4j-community-1.8.2/data/graph.db";private static enum RelTypes imple...
CreateTransaction tx = graphDb.beginTx();try {firstNode = graphDb.createNode();firstNode.setProperty("name", "akmal");firs...
ReadTransaction tx = graphDb.beginTx();try {print("name", firstNode.getProperty("name"));print("age", firstNode.getPropert...
UpdateTransaction tx = graphDb.beginTx();try {firstNode.setProperty("age", 29);tx.success();} finally { tx.finish(); }
DeleteTransaction tx = graphDb.beginTx();try {firstNode.getSingleRelationship(RelTypes.LIKES,Direction.OUTGOING).delete();...
NoSQL use cases ... • Online/mobile gaming – Leaderboard (high score table) management – Dynamic placement of visual elements...
NoSQL use cases • Dynamic content management and publishing (news and media) – Store content from distributed authors, with f...
Use case requirements ... • Schema flexibility and development agility – Application not constrained by fixed pre-defined sch...
Use case requirements ... • Consistent low latency, even under high load – Typically milliseconds or sub-milliseconds, for r...
Use case requirements • High availability – 24 x 7 x 365 availability – (Today) Requires data distribution and replication – A...
Security and vulnerability
Security
NoSQL injection attacks ... • NoSQL systems are vulnerable • Various types of attacks • Understand the vulnerabilities and conseq...
Polyglot persistence. Source: Heroku, used with permission
Polyglot persistence (figure labels: User Sessions, Financial Data, Shopping Cart, Recommendations, Product Catalog, Reporting, Analytics, User Act...)
Polyglot persistence examples • Disney – Cassandra, Hadoop, MongoDB • Interactive Mediums – CouchDB, MySQL • Mendeley – HBase, M...
Polyglot persistence • NoSQL product specialization requires developer knowledge and skills for each platform • Different APIs...
Public API for NoSQL store: In some cases, the team decided to hide the platform’s complexity from users; not to facilitate it...
Benchmarks and performance
Yahoo Cloud Serving BM ... • Originally Tested Systems – Cassandra, HBase, Yahoo!’s PNUTS, sharded MySQL • Tier 1 (performance...
Yahoo Cloud Serving BM • Yahoo Cloud Serving Benchmark (YCSB) – Research paper – Slide deck • Altoros Report – 50+ pages • Thumbt...
Linked Data Benchmark Council • EU-funded project • Develop Graph and RDF benchmarks
BI/Analytics
Architectures • NoSQL reports • NoSQL thru and thru • NoSQL + MySQL • NoSQL as ETL source • NoSQL programs in BI tools • NoSQL v...
NoSQL alternatives
Source: 451 Research, used with permission
What about Oracle?
Summary
Structured vs. unstructured (figure label: Structured)
Relational vs. NoSQL toolbox
Relational vs. NoSQL ... It is specious to compare NoSQL databases to relational databases; as you’ll see, none of the so-cal...
Limitations of NoSQL • Lack of standardized or well-defined semantics – Transactions? Isolation levels? • Reduced consistency...
Choices, choices
Traditional RDBMS (figure labels: Simple, Slow, Small, Fast, Complex, Large, Application Complexity, Value of Individual Data Item, Aggregate Data Value, Dat...)
Understand your use case. Source: http://www.techvalidate.com/tvid/F66-11B-178/
Understand vendor-speak (What vendor says → What vendor means) • The biggest in the world → The biggest one we’ve got • The biggest in ...
Contact details
Find me on ... – http://www.linkedin.com/in/akmalchaudhri – http://twitter.com/akmalchaudhri – http://www.quora.com/Akmal-Cha...
Akmal B. Chaudhri, firstname.lastname@live.com
{"thank":"You"}
Resources
History • First NoSQL meetup – http://nosql.eventbrite.com/ – http://blog.oskarsson.nu/post/22996139456/nosql-meetup • First N...
NoSQL Search roadshow • Multi-city tour 2013 – Munich – Berlin – San Francisco – Copenhagen – Zurich – Amsterdam – London Source: h...
Web sites • NoSQL Databases and Polyglot Persistence: A Curated Guide – http://nosql.mypopescu.com/ • NoSQL: Your Ultimate Gui...
Free books ... • The Little MongoDB Book – http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/ • The Little Redis Book – ...
Free books • CouchDB: The Definitive Guide – http://guide.couchdb.org/ • A Little Riak Book – https://github.com/coderoshi/lit...
Free training • Free courses on MongoDB – https://education.10gen.com/
Articles • Saying Yes to NoSQL – http://www.nofluffjuststuff.com/s/magazine/NFJS_theMagazine_Vol3_Issue3_May2011.pdf • The St...
White papers • The CIO’s Guide to NoSQL – http://documents.dataversity.net/whitepapers/the-cios-guide-to-nosql.html
Product selection ... • 101 Questions to Ask When Considering a NoSQL Database – http://highscalability.com/blog/2011/6/15/10...
Product selection • NoSQL Options Compared: Different Horses for Different Courses – http://www.slideshare.net/tazija/nosql-o...
Short product overviews ... • Picking the Right NoSQL Database Tool – http://blog.monitis.com/index.php/2011/05/22/picking-t...
Short product overviews ... • A Look at Some NoSQL Databases -- MongoDB, Redis and Basho Riak – http://blog.monitis.com/index...
Short product overviews • Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs El...
Case studies ... • Real World NoSQL: HBase at Trend Micro – http://gigaom.com/cloud/real-world-nosql-hbase-at-trend-micro/ • ...
Case studies • Real World NoSQL: Amazon SimpleDB at Netflix – http://gigaom.com/cloud/real-world-nosql-amazon-simpledb-at-ne...
Security ... • NoSQL, no security? – http://www.slideshare.net/wurbanski/nosql-no-security/ • NoSQL, No Injection!? – http://w...
Security • NoSQL Database Security – http://conference.auscert.org.au/conf2011/presentations/Louis Nyffenegger V1.pdf • Does ...
Polyglot persistence ... • Polyglot Persistence – http://www.slideshare.net/jwoodslideshare/polyglot-persistence-two-great-t...
Polyglot persistence • Polyglot Persistence: EclipseLink with MongoDB and Derby – http://java.dzone.com/articles/polyglot-per...
Performance benchmarks ... • Yahoo Cloud Serving Benchmark – http://research.yahoo.com/node/3202/ – http://altoros.com/nosql-...
Performance benchmarks ... • Ultra-High Performance NoSQL Benchmarking – http://thumbtack.net/solutions/ThumbtackWhitePaper....
Performance benchmarks ... • MongoDB Performance Pitfalls -- Behind The Scenes – http://blog.trackerbird.com/content/mongodb-...
Performance benchmarks • Can the Elephants Handle the NoSQL Onslaught? – http://vldb.org/pvldb/vol5/p1712_avriliafloratou_vld...
BI/Analytics • BI/Analytics on NoSQL: Review of Architectures Part 1 – http://www.dataversity.net/bianalytics-on-nosql-review...
Various graphics ... • Database Landscape Map -- December 2012 – http://blogs.the451group.com/information_management/2012/12...
Various graphics • The NoSQL vs. SQL hoopla, another turn of the screw! – http://www.parelastic.com/blog/nosql-vs-sql-hoopla-...
Discussion fora • NoSQL – http://www.linkedin.com/groups?gid=2085042 • NewSQL – http://www.linkedin.com/groups/NewSQL-4135938 •...
London meetup groups ... • Cassandra – http://www.meetup.com/Cassandra-London/ • MongoDB – http://www.meetup.com/London-MongoD...
London meetup groups • Riak – http://www.meetup.com/riak-london/
NoSQL jokes/humour ... • LinkedIn discussion thread – http://www.linkedin.com/groups/NoSQL-Jokes-Humour-2085042.S.177321213 •...
NoSQL jokes/humour • C.R.U.D. – http://crudcomic.com/ – Scroll-back for some great humour on NoSQL!
Miscellaneous ... • Powerpoint template – http://www.articulate.com/rapid-elearning/heres-a-free-powerpoint-template-how-i-m...
Miscellaneous ... • Bar and Column charts – http://www.diychart.com/ • Newspaper headlines – http://www.imagechef.com/ic/make....
Miscellaneous • Icons and images – http://www.geekpedia.com/icons.php – http://cemagraphics.deviantart.com/ – http://www.frees...
Considerations for using NoSQL technology on your next IT project - Akmal Chaudhri
Over the past few years, we have seen the emergence and growth of NoSQL technology. This has attracted interest from organizations looking to solve new business problems. There are also examples of how this technology has been used to bring practical and commercial benefits to some organizations. However, since it is still an emerging technology, careful consideration is required in finding the relevant developer skills and choosing the right product. This presentation will discuss these issues in greater detail. In particular, it will focus on some of the leading NoSQL products, such as MongoDB, Cassandra, Redis, and Neo4j and will discuss their architectures and suitability for different problems. Short demonstrations, using Java, are planned to give the audience a feel for the practical aspects of such products.

Published in: Technology, Business
  1. 1. Considerations for using {"no":"SQL"} technology on your next IT project, Akmal B. Chaudhri (艾克摩 曹理)
  2. 2. Abstract
      Over the past few years, we have seen the emergence and growth of NoSQL technology. This has attracted interest from organizations looking to solve new business problems. There are also examples of how this technology has been used to bring practical and commercial benefits to some organizations. However, since it is still an emerging technology, careful consideration is required in finding the relevant developer skills and choosing the right product. This presentation will discuss these issues in greater detail. In particular, it will focus on some of the leading NoSQL products and discuss their architectures and suitability for different problems.
  3. 3. Agenda
  4. 4. In a packed program ...
      • Introduction
      • Market analysis
      • NoSQL
      • Security and vulnerability
      • Polyglot persistence
      • Benchmarks and performance
      • BI/Analytics
      • NoSQL alternatives
      • Summary
      • Resources
  9. 9. Introduction
  10. 10. My background
      • ~25 years' experience in IT
        – Developer (Reuters)
        – Academic (City University)
        – Consultant (Logica)
        – Technical Architect (CA)
        – Senior Architect (Informix)
        – Senior IT Specialist (IBM)
      • Broad industry experience
      • Worked with various technologies
        – Programming languages
        – IDE
        – Database Systems
      • Client-facing roles
        – Developers
        – Senior executives
        – Journalists
      • Community outreach
      • 10 books, many presentations
  11. 11. History
      Have you run into limitations with traditional relational databases? Don’t mind trading a query language for scalability? Or perhaps you just like shiny new things to try out? Either way this meetup is for you.
      Join us in figuring out why these newfangled Dynamo clones and BigTables have become so popular lately.
      Source: http://nosql.eventbrite.com/
  12. 12. Your road leads to NoSQL? (figure; every label reads NoSQL)
  13. 13. Gartner hype curve (figure label: NoSQL)
  14. 14. Innovation adoption lifecycle
      Source: http://en.wikipedia.org/wiki/Technology_adoption_lifecycle
  15. 15. Crossing the chasm (figure label: Chasm)
  16. 16. 20 years ago
  17. 17. 10 years ago
  18. 18. Today
  19. 19. The way developers really think (figure labels: OO, XML, NoSQL)
  20. 20. NoSQL is developer-friendly (figure labels: Other Stakeholders, Developers)
  21. 21. But ...
      • Riak ... We’re talking about nearly a year of learning.[1]
      • Things I wish I knew about MongoDB a year ago[2]
      • I am learning Cassandra. It is not easy.[3]
      [1] http://productionscale.com/home/2011/11/20/building-an-application-upon-riak-part-1.html
      [2] http://snmaynard.com/2012/10/17/things-i-wish-i-knew-about-mongodb-a-year-ago/
      [3] http://planetcassandra.org/blog/post/datastax-java-driver-for-apache-cassandra
  22. 22. NoSQL hoopla and hype
  23. 23. Extra! extra! ...
  24. 24. Extra! extra! ...
      Source: Inspired by “The Next Big Thing 2012”, The Wall Street Journal, 27 September 2012
  25. 25. Extra! extra! ...
  26. 26. Extra! extra! ...
  27. 27. Extra! extra!
      Source: Inspired by the movie “Airplane!” (1980)
  28. 28. Past proclamations of the imminent demise of relational technology
      • Object databases vs. relational
        – GemStone, ObjectStore, Objectivity, etc.
      • In-memory databases vs. relational
        – TimesTen, SolidDB, etc.
      • Persistence frameworks vs. relational
        – Hibernate, OpenJPA, etc.
      • XML databases vs. relational
        – Tamino, BaseX, etc.
      • Column-store databases vs. relational
        – Sybase IQ, Vertica, etc.
  29. 29. Market analysis
  30. 30. NoSQL market size ...
      • Private companies do not publish results
      • Venture Capital (VC) funding 10s/100s of millions of US $[1]
      • NoSQL software revenue was US $20 million in 2011[2]
      [1] http://blogs.the451group.com/information_management/2011/11/15/
      [2] http://blogs.the451group.com/information_management/2012/05/
  31. 31. NoSQL market size
      Source: http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2012-2017
  32. 32. NoSQL job trends
      Source: http://regulargeek.com/2012/08/30/nosql-job-trends-august-2012/ (August 2012)
      Example: 100,000 x 0.08% = 80 jobs!
  33. 33. NoSQL jobs in the UK
      • Database and Business Intelligence
        – MongoDB (696)
        – Cassandra (291)
        – Redis (242)
        – CouchDB (122)
        – Hive (70)
        – HBase (67)
        – Neo4j (52)
        – Couchbase (38)
      Source: http://www.itjobswatch.co.uk/jobs/uk/nosql.do (May 2013)
  34. 34. DB-Engines ranking ...
      Source: http://db-engines.com/en/ranking/ (November 2012)
  35. 35. DB-Engines ranking
      Source: http://db-engines.com/en/ranking/ (November 2012)
  36. 36. PaaS example ...
      Source: Jelastic, used with permission
  37. 37. PaaS example
      Source: Jelastic, used with permission
  38. 38. NoSQL
  39. 39. NoSQL The Movie! (figure label: Sequel)
  40. 40. Why did NoSQL datastores arise?
      • Some applications need very few database features, but need high scale
      • Desire to avoid data/schema pre-design altogether for simple applications
      • Need for a low-latency, low-overhead API to access data
      • Simplicity -- do not need fancy indexing -- just fast lookup by primary key
  41. 41. NoSQL drivers
      Source: Couchbase NoSQL Survey (December 2011)
  42. 42. Source: 451 Research, used with permission
  43. 43. How many systems?
      Source: http://nosql-database.org/ (October 2012)
  44. 44. Major categories of NoSQL ...
      Table columns: Type, Examples. Types listed:
      • Document store
      • Column store
      • Key-value store
      • Graph store
  45. 45. Major categories of NoSQL
      • Document store: Key → Document (collection of key-values)
      • Column store: Key → CF1:C1, CF1:C2, CF2:C1, CF3:C1
      • Key-value store: Key → Binary Data
      • Graph store: Node 1 (Key, Properties), Node 2 (Key, Properties), Relationship 1 (Key, Properties)
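The four shapes on this slide can be approximated with plain Java collections. The following is only a sketch for comparing the models side by side: the class `DataModelSketch`, the key `person:1`, and the sample values are all hypothetical, and no product stores its data this way internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain-collection stand-ins for the four NoSQL data models.
// Every name here is hypothetical and not taken from any product's API.
final class DataModelSketch {

    // Graph store: nodes and relationships, each carrying key-value properties.
    record Node(long id, Map<String, Object> properties) {}
    record Relationship(Node from, Node to, String type, Map<String, Object> properties) {}

    // Document store: key -> document (a nested collection of key-values).
    static Map<String, Map<String, Object>> documentStore() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "akmal");
        doc.put("age", 40);
        doc.put("likes", List.of("satay", "kebabs", "fish-n-chips"));
        Map<String, Map<String, Object>> store = new HashMap<>();
        store.put("person:1", doc);
        return store;
    }

    // Column store: row key -> sorted "family:column" names -> value.
    static Map<String, TreeMap<String, String>> columnStore() {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("CF1:C1", "akmal");
        row.put("CF1:C2", "40");
        row.put("CF2:C1", "satay");
        Map<String, TreeMap<String, String>> store = new HashMap<>();
        store.put("person:1", row);
        return store;
    }

    // Key-value store: key -> opaque binary data the store does not interpret.
    static Map<String, byte[]> keyValueStore() {
        Map<String, byte[]> store = new HashMap<>();
        store.put("person:1", "akmal".getBytes(StandardCharsets.UTF_8));
        return store;
    }

    // Graph store: two nodes joined by one typed relationship.
    static Relationship graphStore() {
        Node person = new Node(1, Map.of("name", "akmal"));
        Node food = new Node(2, Map.of("food", "satay"));
        return new Relationship(person, food, "LIKES", Map.of());
    }

    public static void main(String[] args) {
        System.out.println(documentStore().get("person:1").get("name"));
        System.out.println(columnStore().get("person:1").firstKey());
        System.out.println(keyValueStore().get("person:1").length);
        System.out.println(graphStore().type());
    }
}
```

The practical difference shows up in what a query can see: the document and column models expose field names to the store, while the key-value model only ever sees the key.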
  46. 46. Source: Ilya Katsov, used with permission
  47. 47. Commercialization examples
  48. 48. Document store
      • Represent rich, hierarchical data structures, reducing the need for multi-table joins
      • Structure of the documents need not be known a priori, can be variable, and evolve instantly, but a query can understand the contents of a document
      • Use cases: rapid ingest and delivery for evolving schemas and web-based objects
  49. 49. Connection (MongoDB, Java)
      import com.mongodb.DB;
      import com.mongodb.DBCollection;
      import com.mongodb.MongoClient;
      ...
      private static final String DBNAME = "demodb";
      private static final String COLLNAME = "people";
      ...
      MongoClient mongoClient = new MongoClient("localhost", 27017);
      DB db = mongoClient.getDB(DBNAME);
      DBCollection collection = db.getCollection(COLLNAME);
      System.out.println("Connected to MongoDB");
  50. 50. Create
      // Build one document with scalar fields and an embedded list.
      BasicDBObject document = new BasicDBObject();
      List<String> likes = new ArrayList<String>();
      likes.add("satay");
      likes.add("kebabs");
      likes.add("fish-n-chips");
      document.put("name", "akmal");
      document.put("age", 40);
      document.put("date", new Date());
      document.put("likes", likes);
      collection.insert(document);
  51. 51. Read
      // Query by example: match documents whose name is "akmal".
      BasicDBObject document = new BasicDBObject();
      document.put("name", "akmal");
      DBCursor cursor = collection.find(document);
      while (cursor.hasNext())
          System.out.println(cursor.next());
      cursor.close();
  52. 52. Update
      // $set changes only the age field and leaves the rest of the document intact.
      BasicDBObject document = new BasicDBObject();
      document.put("name", "akmal");
      BasicDBObject newDocument = new BasicDBObject();
      newDocument.put("age", 29);
      BasicDBObject updateObj = new BasicDBObject();
      updateObj.put("$set", newDocument);
      collection.update(document, updateObj);
  53. 53. Delete
      BasicDBObject document = new BasicDBObject();
      document.put("name", "akmal");
      collection.remove(document);
  54. 54. Connection (MongoDB, Node.js)
      var async = require('async');
      var MongoClient = require('mongodb').MongoClient;

      MongoClient.connect("mongodb://localhost:27017/demodb", function(err, db) {
        if (err) {
          return console.log(err);
        }
        console.log("Connected to MongoDB");
        var collection = db.collection('people');
        var document = {
          name: 'akmal',
          age: 40,
          date: new Date(),
          likes: ['satay', 'kebabs', 'fish-n-chips']
        };
  55. 55. Create
      function (callback) {
        // {w:1} asks the server to acknowledge the write
        collection.insert(document, {w:1}, function(err, result) {
          if (err) {
            return callback(err);
          }
          callback();
        });
      },
  56. 56. Read
      function (callback) {
        collection.findOne({name: 'akmal'}, function(err, item) {
          if (err) {
            return callback(err);
          }
          console.log(item);
          callback();
        });
      },
  57. 57. Update
      function (callback) {
        collection.update({name: 'akmal'}, {$set: {age: 29}}, {w:1},
          function(err, result) {
            if (err) {
              return callback(err);
            }
            callback();
          });
      },
  58. 58. Delete
      function (callback) {
        collection.remove({name: 'akmal'}, function(err, result) {
          if (err) {
            return callback(err);
          }
          callback();
        });
      },
  59. 59. Column store ...
      • Manage structured data, with multiple-attribute access
      • Columns are grouped together in “column-families/groups”; each storage block contains data from only one column/column set to provide data locality for “hot” columns
      • Column groups defined a priori, but support variable schemas within a column group
  60. 60. Column store
      • Scale using replication, multi-node distribution for high availability and easy failover
      • Optimized for writes
      • Use cases: high throughput verticals (activity feeds, message queues), caching, web operations
  61. 61. Connection (Cassandra, Java via JDBC)
      Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
      connection = DriverManager.getConnection(
          "jdbc:cassandra://localhost:9160/demodb");
      System.out.println("Connected to Cassandra");
  62. 62. Create
      // CQL string literals need single quotes; the batch lines are separated by \n.
      String query =
          "BEGIN BATCH\n" +
          "INSERT INTO people (name, age, date, likes) VALUES ('akmal', 40, '" +
          new Date() + "', {'satay', 'kebabs', 'fish-n-chips'})\n" +
          "APPLY BATCH;";
      Statement statement = connection.createStatement();
      statement.executeUpdate(query);
      statement.close();
  63. 63. Read
      String query = "SELECT * FROM people";
      Statement statement = connection.createStatement();
      ResultSet cursor = statement.executeQuery(query);
      while (cursor.next())
          // JDBC column indexes start at 1
          for (int j = 1; j < cursor.getMetaData().getColumnCount() + 1; j++)
              System.out.printf("%-10s: %s%n",
                  cursor.getMetaData().getColumnName(j),
                  cursor.getString(cursor.getMetaData().getColumnName(j)));
      cursor.close();
      statement.close();
  64. 64. Update
      String query =
          "UPDATE people SET age = 29 WHERE name = 'akmal'";
      Statement statement = connection.createStatement();
      statement.executeUpdate(query);
      statement.close();
  65. 65. Delete
      String query =
          "BEGIN BATCH\n" +
          "DELETE FROM people WHERE name = 'akmal'\n" +
          "APPLY BATCH;";
      Statement statement = connection.createStatement();
      statement.executeUpdate(query);
      statement.close();
  66. 66. Key-value store
      • Simplest NoSQL stores, provide low-latency writes but single key/value access
      • Store data as a hash table of keys where every key maps to an opaque binary object
      • Easily scale across many machines
      • Use cases: applications that require massive amounts of simple data (sensor, web operations), applications that require rapidly changing data (stock quotes), caching
  67. 67. Connection (Redis, Java via Jedis)
      Jedis j = new Jedis("localhost", 6379);
      j.connect();
      System.out.println("Connected to Redis");
  68. 68. Create
      // One key per attribute, plus a hash that maps a name back to its id.
      String id = Long.toString(j.incr("global:nextUserId"));
      j.set("uid:" + id + ":name", "akmal");
      j.set("uid:" + id + ":age", "40");
      j.set("uid:" + id + ":date", new Date().toString());
      j.sadd("uid:" + id + ":likes", "satay");
      j.sadd("uid:" + id + ":likes", "kebabs");
      j.sadd("uid:" + id + ":likes", "fish-n-chips");
      j.hset("uid:lookup:name", "akmal", id);
  69. 69. Read
      // print(...) is a helper that is not shown on the slides.
      String id = j.hget("uid:lookup:name", "akmal");
      print("name ", j.get("uid:" + id + ":name"));
      print("age ", j.get("uid:" + id + ":age"));
      print("date ", j.get("uid:" + id + ":date"));
      print("likes ", j.smembers("uid:" + id + ":likes"));
  70. 70. Update
      String id = j.hget("uid:lookup:name", "akmal");
      j.set("uid:" + id + ":age", "29");
  71. 71. Delete
      String id = j.hget("uid:lookup:name", "akmal");
      j.del("uid:" + id + ":name");
      j.del("uid:" + id + ":age");
      j.del("uid:" + id + ":date");
      j.del("uid:" + id + ":likes");
      // Not on the original slide: also drop the lookup entry so later reads
      // do not resolve a name to a deleted id.
      j.hdel("uid:lookup:name", "akmal");
  72. 72. Graph store
      • Use nodes, relationships between nodes, and key-value properties
      • Access data using graph traversal, navigating from start nodes to related nodes according to graph algorithms
      • Faster for associative data sets
      • Use cases: storing and reasoning on complex and connected data, such as inferencing applications in healthcare, government, telecom, oil, performing closure on social networking graphs
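To make "navigating from start nodes to related nodes" concrete, here is a minimal breadth-first traversal over an in-memory adjacency map. It is a sketch of the general idea only, not Neo4j's traversal API; the class `TraversalSketch`, the method `reachable`, and the sample "follows" graph are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: breadth-first traversal over a plain adjacency map.
final class TraversalSketch {

    // Returns every node reachable from start (including start), in visit order.
    static List<String> reachable(Map<String, List<String>> edges, String start) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String next : edges.getOrDefault(node, List.of())) {
                // add() returns false for nodes already seen, which also handles cycles
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return new ArrayList<>(visited);
    }

    public static void main(String[] args) {
        Map<String, List<String>> follows = Map.of(
            "akmal", List.of("bob", "carol"),
            "bob", List.of("dave"),
            "carol", List.of("dave"),
            "dave", List.of("akmal"));
        System.out.println(reachable(follows, "akmal")); // [akmal, bob, carol, dave]
    }
}
```

Computing the reachable set from every node in turn gives the kind of closure the use-case bullet mentions; a graph store's appeal is that each hop follows a stored relationship rather than a join.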
  73. 73. Connection (Neo4j, embedded Java API)
      private static final String DB_PATH =
          "C:/neo4j-community-1.8.2/data/graph.db";
      private static enum RelTypes implements RelationshipType {
          LIKES
      }
      ...
      graphDb =
          new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);
      registerShutdownHook(graphDb);
      System.out.println("Connected to Neo4j");
  74. 74. Create
      Transaction tx = graphDb.beginTx();
      try {
          firstNode = graphDb.createNode();
          firstNode.setProperty("name", "akmal");
          firstNode.setProperty("age", 40);
          firstNode.setProperty("date", new Date().toString());
          secondNode = graphDb.createNode();
          secondNode.setProperty("food", "satay, kebabs, fish-n-chips");
          relationship = firstNode.createRelationshipTo(secondNode,
              RelTypes.LIKES);
          relationship.setProperty("likes", "likes");
          tx.success();
      } finally { tx.finish(); }
  75. 75. Read
      Transaction tx = graphDb.beginTx();
      try {
          print("name", firstNode.getProperty("name"));
          print("age", firstNode.getProperty("age"));
          print("date", firstNode.getProperty("date"));
          print("likes", secondNode.getProperty("food"));
          tx.success();
      } finally { tx.finish(); }
  76. 76. Update
      Transaction tx = graphDb.beginTx();
      try {
          firstNode.setProperty("age", 29);
          tx.success();
      } finally { tx.finish(); }
  77. 77. Delete
      Transaction tx = graphDb.beginTx();
      try {
          // Remove the relationship first, then the two nodes it connected.
          firstNode.getSingleRelationship(RelTypes.LIKES,
              Direction.OUTGOING).delete();
          firstNode.delete();
          secondNode.delete();
          tx.success();
      } finally { tx.finish(); }
  78. 78. NoSQL use cases ...
      • Online/mobile gaming
        – Leaderboard (high score table) management
        – Dynamic placement of visual elements
        – Game object management
        – Persisting game/user state information
        – Persisting user generated data (e.g. drawings)
      • Display advertising on web sites
        – Ad Serving: match content with profile and present
        – Real-time bidding: match cookie profile with advert inventory, obtain bids, and present advert
  79. 79. NoSQL use cases
      • Dynamic content management and publishing (news and media)
        – Store content from distributed authors, with fast retrieval and placement
        – Manage changing layouts and user generated content
      • E-commerce/social commerce
        – Storing frequently changing product catalogs
      • Social networking/online communities
      • Communications
        – Device provisioning
  80. 80. Use case requirements ...
      • Schema flexibility and development agility
        – Application not constrained by fixed pre-defined schema
        – Application drives the schema
        – Ability to develop a minimal application rapidly, and iterate quickly in response to customer feedback
        – Ability to quickly add, change or delete “fields” or data-elements
        – Ability to handle a mix of structured and unstructured data
        – Easier, faster programming, so faster time to market and quick to adapt
  81. 81. Use case requirements ...
      • Consistent low latency, even under high load
        – Typically milliseconds or sub-milliseconds, for reads and writes
        – Even with millions of users
      • Dynamic elasticity
        – Rapid horizontal scalability
        – Ability to add or delete nodes dynamically
        – Application transparent elasticity, such as automatic (re)distribution of data, if needed
        – Cloud compatibility
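One common way to get "automatic (re)distribution of data" when nodes come and go is consistent hashing. The sketch below is an assumption about how such a scheme could look, not the partitioner of any product named in this deck; `HashRingSketch`, the node names, and the choice of CRC32 as the hash are all illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical consistent-hash ring; not the partitioner of any specific product.
final class HashRingSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRingSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Place several virtual points per node so keys spread more evenly.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            // two-argument remove: only drop the point if this node still owns it
            ring.remove(hash(node + "#" + i), node);
        }
    }

    // A key belongs to the first node point at or after its hash, wrapping around.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch(64);
        ring.addNode("node-a");
        ring.addNode("node-b");
        String before = ring.nodeFor("uid:1:name");
        ring.addNode("node-c");
        String after = ring.nodeFor("uid:1:name");
        System.out.println(before + " -> " + after);
    }
}
```

The property that matters for elasticity is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, so the application does not need to know the cluster changed.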
  82. 82. Use case requirements
      • High availability
        – 24 x 7 x 365 availability
        – (Today) Requires data distribution and replication
        – Ability to upgrade hardware or software without any down time
      • Low cost
        – Commonly available hardware
        – Lower cost software, such as open source or pay-per-use in cloud
        – Reduced need for database administration, and maintenance
  83. 83. Security and vulnerability
  84. 84. Security
  85. 85. NoSQL injection attacks ...• NoSQL systems are vulnerable• Various types of attacks• Understand the vulnerabilities and consequences
  86. 86. Polyglot persistence Source: Heroku, used with permission
  87. 87. Polyglot persistence: User Sessions, Financial Data, Shopping Cart, Recommendations, Product Catalog, Reporting, Analytics, User Activity Logs. Source: Adapted from http://martinfowler.com/bliki/PolyglotPersistence.html
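One well-known class of NoSQL injection is operator injection: the application expects a plain value but receives a structured object containing a query operator. The sketch below imitates this with a toy matcher that understands one MongoDB-style operator (`$ne`). It is not a real driver API; `InjectionSketch`, `matches`, and `safeLoginFilter` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class InjectionSketch {
    // Toy matcher: equality on plain values, plus a "$ne" (not equal) operator object.
    static boolean matches(Map<String, Object> doc, Map<String, Object> filter) {
        for (Map.Entry<String, Object> f : filter.entrySet()) {
            Object actual = doc.get(f.getKey());
            Object cond = f.getValue();
            if (cond instanceof Map) {
                Map<?, ?> ops = (Map<?, ?>) cond;
                if (!ops.containsKey("$ne")) return false; // unknown operator
                Object ne = ops.get("$ne");
                if (ne == null ? actual == null : ne.equals(actual)) return false;
            } else if (cond == null ? actual != null : !cond.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    // Defensive helper: accept only plain strings, never operator objects.
    static Map<String, Object> safeLoginFilter(Object user, Object password) {
        if (!(user instanceof String) || !(password instanceof String)) {
            throw new IllegalArgumentException("credentials must be plain strings");
        }
        Map<String, Object> filter = new HashMap<>();
        filter.put("user", user);
        filter.put("password", password);
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> account = new HashMap<>();
        account.put("user", "admin");
        account.put("password", "s3cret");

        // Attacker-supplied "password" is an operator object, not a string.
        Map<String, Object> notNull = new HashMap<>();
        notNull.put("$ne", null);
        Map<String, Object> attack = new HashMap<>();
        attack.put("user", "admin");
        attack.put("password", notNull);

        System.out.println("unvalidated filter matches: " + matches(account, attack));
        try {
            safeLoginFilter("admin", notNull);
        } catch (IllegalArgumentException e) {
            System.out.println("validated input rejected: " + e.getMessage());
        }
    }
}
```

Type-checking input before it reaches the query layer is one mitigation among several; the security resources listed later in the deck cover others.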
  88. 88. Polyglot persistence examples• Disney– Cassandra, Hadoop, MongoDB• Interactive Mediums– CouchDB, MySQL• Mendeley– HBase, MongoDB, Solr, Voldemort• Netflix– Cassandra, Hadoop/HBase, RDBMS, SimpleDB• Twitter– Cassandra, FlockDB, Hadoop/HBase, MySQL
  89. 89. Polyglot persistence• NoSQL product specialization requires developer knowledge and skills for each platform• Different APIs– Develop public API for each NoSQL store (Disney)
  90. 90. Public API for NoSQL store: In some cases, the team decided to hide the platform’s complexity from users; not to facilitate its use, but to keep loose-cannon developers from doing something crazy that could take down the whole cluster. It could show them all the controls and knobs in a NoSQL database, but “they tend to shoot each other,” Jacob said. “First they shoot themselves, then they shoot each other.” Source: “How Disney built a big data platform on a startup budget” Derrick Harris (2012)
  91. 91. Benchmarks and performance
  92. 92. Yahoo Cloud Serving BM ...• Originally Tested Systems– Cassandra, HBase, Yahoo!’s PNUTS, sharded MySQL• Tier 1 (performance)– Latency by increasing the server load• Tier 2 (scalability)– Scalability by increasing the number of servers
  93. 93. Yahoo Cloud Serving BM• Yahoo Cloud Serving Benchmark (YCSB)– Research paper– Slide deck• Altoros Report– 50+ pages• Thumbtack Report– 40+ pages
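The idea of a public API in front of each NoSQL store can be sketched as a narrow interface that exposes only the operations application developers need, keeping the store's "controls and knobs" out of reach. This is a hypothetical illustration, not Disney's actual API; `ProfileStore` and `InMemoryProfileStore` are invented names, and the in-memory map stands in for a real store client.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class StoreFacadeSketch {
    /** Narrow, store-agnostic API exposed to application developers (hypothetical). */
    public interface ProfileStore {
        void save(String userId, String profileJson);
        Optional<String> find(String userId);
    }

    /** In-memory stand-in; a real implementation would wrap a specific NoSQL client. */
    public static class InMemoryProfileStore implements ProfileStore {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public void save(String userId, String profileJson) {
            data.put(userId, profileJson);
        }

        @Override
        public Optional<String> find(String userId) {
            return Optional.ofNullable(data.get(userId));
        }
    }

    public static void main(String[] args) {
        ProfileStore store = new InMemoryProfileStore();
        store.save("u1", "{\"name\":\"alice\"}");
        System.out.println(store.find("u1").orElse("(not found)"));
        System.out.println(store.find("u2").orElse("(not found)"));
    }
}
```

Callers depend only on the interface, so tuning options, consistency settings, and cluster administration stay behind it and the backing store can be swapped without touching application code.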
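Latency at each load level is usually summarized with percentiles rather than a single average. The sketch below is not YCSB itself; it times reads against an in-memory map as a stand-in for a database and summarizes the samples with a nearest-rank percentile. The class `LatencySketch`, its method names, and the operation counts are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class LatencySketch {
    // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) throw new IllegalArgumentException("no samples");
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.min(Math.max(rank, 1), sorted.length) - 1];
    }

    // Time each read individually; the map is only a placeholder for a real store.
    static long[] timeReads(Map<Integer, Integer> store, int ops) {
        long[] nanos = new long[ops];
        for (int i = 0; i < ops; i++) {
            long start = System.nanoTime();
            store.get(i % 1000);
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> store = new HashMap<>();
        for (int i = 0; i < 1000; i++) store.put(i, i);
        // Repeat the measurement with a growing number of operations.
        for (int ops : new int[] {1_000, 10_000, 100_000}) {
            long[] samples = timeReads(store, ops);
            System.out.println(ops + " ops: p50=" + percentile(samples, 50)
                    + " ns, p95=" + percentile(samples, 95) + " ns");
        }
    }
}
```

A real benchmark run would drive a networked store from many client threads at a target throughput; the timings printed here depend on the machine and are only illustrative.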
  94. 94. Linked Data Benchmark Council• EU-funded project• Develop Graph and RDF benchmarks
  95. 95. BI/Analytics
  96. 96. Architectures• NoSQL reports• NoSQL thru and thru• NoSQL + MySQL• NoSQL as ETL source• NoSQL programs in BI tools• NoSQL via BI database (SQL) Source: Nicholas Goodman
  97. 97. NoSQL alternatives
  98. 98. Source: 451 Research, used with permission
  99. 99. What about Oracle?
  100. 100. Summary
  101. 101. Structured vs. unstructured [figure label: Structured]
  102. 102. Relational vs. NoSQL toolbox
  103. 103. Relational vs. NoSQL ... It is specious to compare NoSQL databases to relational databases; as you’ll see, none of the so-called “NoSQL” databases have the same implementation, goals, features, advantages, and disadvantages. So comparing “NoSQL” to “relational” is really a shell game. -- Eben Hewitt. Source: “Cassandra: The Definitive Guide” Eben Hewitt (2010)
  104. 104. Limitations of NoSQL• Lack of standardized or well-defined semantics– Transactions? Isolation levels?• Reduced consistency for performance and scalability– “Eventual consistency”– “Soft commit”• Limited forms of access, e.g. often no joins, etc.• Proprietary interfaces• Large clusters, failover, etc.?• Security?
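When a store offers no joins, the join typically moves into application code. A minimal sketch, assuming two key-value "collections" held as maps; the class `AppSideJoinSketch`, the method `join`, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AppSideJoinSketch {
    // Join orders to customer names in the application: one lookup per order.
    static List<String> join(Map<String, String> customerNames,
                             Map<String, String> orderToCustomer) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> order : orderToCustomer.entrySet()) {
            String name = customerNames.get(order.getValue());
            rows.add(order.getKey() + " -> " + (name != null ? name : "(unknown customer)"));
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> customers = new LinkedHashMap<>();
        customers.put("c1", "Alice");
        customers.put("c2", "Bob");

        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("o100", "c1");
        orders.put("o101", "c2");
        orders.put("o102", "c9"); // dangling reference: nothing enforces integrity

        for (String row : join(customers, orders)) System.out.println(row);
    }
}
```

Against a real store each lookup may be a network round trip, and nothing prevents dangling references, which is part of the cost of trading away joins and constraints.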
  105. 105. Choices, choices
  106. 106. [Figure: database positioning chart. Labels: Traditional RDBMS; Simple/Complex; Slow/Fast; Small/Large; Application Complexity; Value of Individual Data Item; Aggregate Data Value; Data Value; Velocity; Interactive; Real-time Analytics; Record Lookup; Historical Analytics; Exploratory Analytics; Transactional; Analytic.] Source: VoltDB, used with permission
  107. 107. Understand your use case Source: http://www.techvalidate.com/tvid/F66-11B-178/
  108. 108. Understand vendor-speak (what vendor says: what vendor means)• The biggest in the world: The biggest one we’ve got• The biggest in the universe: The biggest one we’ve got• There is no limit to ...: It’s untested, but we don’t mind if you try it• A new and unique feature: Something the competition has had for ages• Currently available feature: We are about to start Beta testing• Planned feature: Something the competition has, that we wish we had too, that we might have one day• Highly distributed: International offices• Engineered for robustness: Comes in a tough box. Source: “Object Databases: An Evaluation and Comparison” Bloor Research (1994)
  109. 109. Contact details
  110. 110. Find me on ...– http://www.linkedin.com/in/akmalchaudhri– http://twitter.com/akmalchaudhri– http://www.quora.com/Akmal-Chaudhri– http://www.facebook.com/akmal.chaudhri– http://plus.google.com/105126255575427189842/– http://www.slideshare.net/VeryFatBoy/– http://www.youtube.com/VeryFatBoyVideos/
  111. 111. Akmal B. Chaudhri firstname.lastname@live.com
  112. 112. {"thank":"You"}
  113. 113. Resources
  114. 114. History• First NoSQL meetup– http://nosql.eventbrite.com/– http://blog.oskarsson.nu/post/22996139456/nosql-meetup• First NoSQL meetup debrief– http://blog.oskarsson.nu/post/22996140866/nosql-debrief• First NoSQL meetup photographs– http://www.flickr.com/photos/russss/sets/72157619711038897/
  115. 115. NoSQL Search roadshow• Multi-city tour 2013– Munich– Berlin– San Francisco– Copenhagen– Zurich– Amsterdam– London Source: http://nosqlroadshow.com/
  116. 116. Web sites• NoSQL Databases and Polyglot Persistence: A Curated Guide– http://nosql.mypopescu.com/• NoSQL: Your Ultimate Guide to the Non-Relational Universe!– http://nosql-database.org/
  117. 117. Free books ...• The Little MongoDB Book– http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/• The Little Redis Book– http://openmymind.net/2012/1/23/The-Little-Redis-Book/
  118. 118. Free books• CouchDB: The Definitive Guide– http://guide.couchdb.org/• A Little Riak Book– https://github.com/coderoshi/little_riak_book/
  119. 119. Free training• Free courses on MongoDB– https://education.10gen.com/
  120. 120. Articles• Saying Yes to NoSQL– http://www.nofluffjuststuff.com/s/magazine/NFJS_theMagazine_Vol3_Issue3_May2011.pdf• The State of NoSQL– http://www.infoq.com/articles/State-of-NoSQL/
  121. 121. White papers• The CIO’s Guide to NoSQL– http://documents.dataversity.net/whitepapers/the-cios-guide-to-nosql.html
  122. 122. Product selection ...• 101 Questions to Ask When Considering a NoSQL Database– http://highscalability.com/blog/2011/6/15/101-questions-to-ask-when-considering-a-nosql-database.html• 35+ Use Cases for Choosing Your Next NoSQL Database– http://highscalability.com/blog/2011/6/20/35-use-cases-for-choosing-your-next-nosql-database.html
  123. 123. Product selection• NoSQL Options Compared: Different Horses for Different Courses– http://www.slideshare.net/tazija/nosql-options-compared/• NoSQL Data Modeling Techniques– http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/• Choosing a NoSQL data store according to your data set– http://00f.net/2010/05/15/choosing-a-nosql-data-store-according-to-your-data-set/
  124. 124. Short product overviews ...• Picking the Right NoSQL Database Tool– http://blog.monitis.com/index.php/2011/05/22/picking-the-right-nosql-database-tool/• NoSQL Databases -- A Look at Apache Cassandra– http://blog.monitis.com/index.php/2011/05/24/nosql-databases-a-look-at-apache-cassandra/• The NoSQL Databases -- A Look at HBase– http://blog.monitis.com/index.php/2011/05/31/the-nosql-databases-a-look-at-hbase/
  125. 125. Short product overviews ...• A Look at Some NoSQL Databases -- MongoDB, Redis and Basho Riak– http://blog.monitis.com/index.php/2011/06/06/a-look-at-some-nosql-databases-mongodb-redis-and-basho-riak/• Picking the Right NoSQL Database, Part 4 -- CouchDB and Membase– http://blog.monitis.com/index.php/2011/06/17/picking-the-right-nosql-database-part-4-couchdb-and-membase/
  126. 126. Short product overviews• Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison– http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis/• vsChart.com– http://vschart.com/list/database/
  127. 127. Case studies ...• Real World NoSQL: HBase at Trend Micro– http://gigaom.com/cloud/real-world-nosql-hbase-at-trend-micro/• Real World NoSQL: MongoDB at Shutterfly– http://gigaom.com/cloud/real-world-nosql-mongodb-at-shutterfly/• Real World NoSQL: Cassandra at Openwave– http://gigaom.com/cloud/realworld-nosql-cassandra-at-openwave/
  128. 128. Case studies• Real World NoSQL: Amazon SimpleDB at Netflix– http://gigaom.com/cloud/real-world-nosql-amazon-simpledb-at-netflix/• Real World NoSQL: Membase at Tribal Crossing– http://gigaom.com/cloud/real-world-nosql-membase-at-tribal-crossing/• How Disney built a big data platform on a startup budget– http://gigaom.com/data/how-disney-built-a-big-data-platform-on-a-startup-budget/
  129. 129. Security ...• NoSQL, no security?– http://www.slideshare.net/wurbanski/nosql-no-security/• NoSQL, No Injection!?– http://www.slideshare.net/wayne_armorize/nosql-no-sql-injections-4880169/• Attacking MongoDB– http://www.slideshare.net/cyber-punk/mongo-db-eng/• NoSQL, But Even Less Security– http://blogs.adobe.com/asset/files/2011/04/NoSQL-But-Even-Less-Security.pdf
  130. 130. Security• NoSQL Database Security– http://conference.auscert.org.au/conf2011/presentations/Louis Nyffenegger V1.pdf• Does NoSQL Mean No Security?– http://www.darkreading.com/database-security/167901020/security/news/232400214/does-nosql-mean-no-security.html• A Response To NoSQL Security Concerns– http://www.darkreading.com/blog/232600288/a-response-to-nosql-security-concerns.html
  131. 131. Polyglot persistence ...• Polyglot Persistence– http://www.slideshare.net/jwoodslideshare/polyglot-persistence-two-great-tastes-that-taste-great-together-4625004/• HBase at Mendeley– http://www.slideshare.net/danharvey/hbase-at-mendeley/• Polyglot Persistence Patterns– http://abhishek-tiwari.com/post/polyglot-persistence-patterns/
  132. 132. Polyglot persistence• Polyglot Persistence: EclipseLink with MongoDB and Derby– http://java.dzone.com/articles/polyglot-persistence-0/• D. Ghosh (2010) Multiparadigm data storage for enterprise applications. IEEE Software. Vol. 27, No. 5, pp. 57-60
  133. 133. Performance benchmarks ...• Yahoo Cloud Serving Benchmark– http://research.yahoo.com/node/3202/– http://altoros.com/nosql-research– http://www.slideshare.net/tazija/evaluating-nosql-performance-time-for-benchmarking/• Benchmarking Couchbase Server– http://www.slideshare.net/Couchbase/t1-s4-couchbase-performancebenchmarkingv34/
  134. 134. Performance benchmarks ...• Ultra-High Performance NoSQL Benchmarking– http://thumbtack.net/solutions/ThumbtackWhitePaper.html• Benchmarking Top NoSQL Databases– http://www.datastax.com/resources/whitepapers/benchmarking-top-nosql-databases
  135. 135. Performance benchmarks ...• MongoDB Performance Pitfalls -- Behind The Scenes– http://blog.trackerbird.com/content/mongodb-performance-pitfalls-behind-the-scenes/• MySQL vs. MongoDB Disk Space Usage– http://blog.trackerbird.com/content/mysql-vs-mongodb-disk-space-usage/• MongoDB: Scaling write performance– http://www.slideshare.net/daumdna/mongodb-scaling-write-performance/
  136. 136. Performance benchmarks• Can the Elephants Handle the NoSQL Onslaught?– http://vldb.org/pvldb/vol5/p1712_avriliafloratou_vldb2012.pdf• Solving Big Data Challenges for Enterprise Application Performance Management– http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf• Linked Data Benchmark Council– http://ldbc.eu/
  137. 137. BI/Analytics• BI/Analytics on NoSQL: Review of Architectures Part 1– http://www.dataversity.net/bianalytics-on-nosql-review-of-architectures-part-1/• BI/Analytics on NoSQL: Review of Architectures Part 2– http://www.dataversity.net/bianalytics-on-nosql-review-of-architectures-part-2/
  138. 138. Various graphics ...• Database Landscape Map -- December 2012– http://blogs.the451group.com/information_management/2012/12/20/database-landscape-map-december-2012/• Necessity is the mother of NoSQL– http://blogs.the451group.com/information_management/2011/04/20/necessity-is-the-mother-of-nosql/• NoSQL, Heroku, and You– https://blog.heroku.com/archives/2010/7/20/nosql/
  139. 139. Various graphics• The NoSQL vs. SQL hoopla, another turn of the screw!– http://www.parelastic.com/blog/nosql-vs-sql-hoopla-another-turn-screw/• Navigating the Database Universe– http://www.slideshare.net/lisapaglia/navigating-the-database-universe/
  140. 140. Discussion fora• NoSQL– http://www.linkedin.com/groups?gid=2085042• NewSQL– http://www.linkedin.com/groups/NewSQL-4135938• Google groups– http://groups.google.com/group/nosql-discussion
  141. 141. London meetup groups ...• Cassandra– http://www.meetup.com/Cassandra-London/• MongoDB– http://www.meetup.com/London-MongoDB-User-Group/• Neo4j– http://www.meetup.com/graphdb-london/• Redis– http://www.meetup.com/Redis-London/
  142. 142. London meetup groups• Riak– http://www.meetup.com/riak-london/
  143. 143. NoSQL jokes/humour ...• LinkedIn discussion thread– http://www.linkedin.com/groups/NoSQL-Jokes-Humour-2085042.S.177321213• NoSQL Better Than MySQL?– http://www.youtube.com/watch?v=QU34ZVD2ylY– Shorter version of “Episode 1 - MongoDB is Web Scale”• say No! No! and No! (=NoSQL Parody)– http://www.youtube.com/watch?v=fXc-QDJBXpw
  144. 144. NoSQL jokes/humour• C.R.U.D.– http://crudcomic.com/– Scroll-back for some great humour on NoSQL!
  145. 145. Miscellaneous ...• Powerpoint template– http://www.articulate.com/rapid-elearning/heres-a-free-powerpoint-template-how-i-made-it/• Autostereogram– http://www.all-freeware.com/images/full/46590-free_stereogram_screensaver_audio___multimedia_other.jpeg• Theatre Curtain Animations– http://www.slideshare.net/chinateacher1/theater-curtain-animations/
  146. 146. Miscellaneous ...• Bar and Column charts– http://www.diychart.com/• Newspaper headlines– http://www.imagechef.com/ic/make.jsp?tid=Newspaper+Headline• Pie charts– http://www.onlinecharttool.com/
  147. 147. Miscellaneous• Icons and images– http://www.geekpedia.com/icons.php– http://cemagraphics.deviantart.com/– http://www.freestockphotos.biz/– http://www.graphicsfuel.com/2011/09/comments-speech-bubble-icon-psd/– http://icondock.com/
