NoSQL Databases Introduction - UTN 2013

This was one of the workshops that we gave at UTN University for Computer Science students.

Published in: Technology
Transcript

  • 1. NoSQL Databases Introduction facundo.farias@intel.com October, 2013
  • 2. Agenda
      • Introduction
      • SQL overview
      • Why NoSQL?
      • Characteristics of NoSQL databases
      • Use cases
      • A NoSQL database in action!
      • Summary
  • 3. Introduction
      • A database is an organized collection of data. The data are typically organized to model relevant aspects of reality in a way that supports processes requiring this information.
      • Database management systems (DBMSs) are specially designed applications that interact with the user, other applications, and the database itself to capture and analyze data.
      • Formally, the term database refers to the data itself and the supporting data structures. Databases are created to operate on large quantities of information by inputting, storing, retrieving, and managing that information.
  • 4. SQL Databases
  • 5. Characteristics
      • SQL is an ANSI and ISO standard computer language for creating and manipulating databases.
      • SQL allows the user to create, update, delete, and retrieve data from a database.
      • SQL is simple and easy to learn.
      • High speed: SQL queries can retrieve large numbers of records from a database quickly and efficiently.
      • Well-defined standards exist: SQL databases use a long-established standard adopted by ANSI and ISO. Non-SQL databases do not adhere to any clear standard.
      • No coding required: with standard SQL it is easier to manage database systems without writing a substantial amount of code.
      • Transactions: ACID properties (Atomicity, Consistency, Isolation, Durability).
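The create, update, delete, and retrieve operations and the atomicity part of ACID can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `accounts` table, its rows, and the `transfer` helper are assumptions made for the example, not part of the original workshop.

```python
import sqlite3

# In-memory database; the "accounts" table and its rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)

# Create, update, and delete rows, then commit.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("alice", 100), ("bob", 50), ("carol", 0)],
)
conn.execute("UPDATE accounts SET balance = balance + 10 WHERE name = 'bob'")
conn.execute("DELETE FROM accounts WHERE name = 'carol'")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK, rolled back

# Retrieve the final state.
balances = dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
print(ok, failed, balances)
```

The second transfer credits one account before the debit fails, so the unchanged balances afterwards show that the partial work was undone.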
  • 6. What has happened?
      • Relational databases were introduced in the 1970s to allow applications to store data through a standard data modeling and query language (SQL). Since the rise of the web, the volume of data stored about users, objects, products and events has exploded. Data is also accessed more frequently and processed more intensively. For example, social networks create hundreds of millions of customized, real-time activity feeds for users based on their connections' activities.
      • In response to this demand, computing infrastructure and deployment strategies have also changed dramatically. Low-cost, commodity cloud hardware has emerged to replace vertical scaling on highly complex and expensive single-server deployments. Engineers now use agile development methods, which aim for continuous deployment and short development cycles, to respond quickly to user demand for features.
  • 7. NoSQL Databases
  • 8. But... what's NoSQL?
      • A NoSQL database provides a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases.
      • NoSQL systems are also referred to as "Not only SQL" to emphasize that they may also allow SQL-like query languages to be used.
  • 9. Characteristics
      • Large data volumes (such as Google's "big data")
      • Scalable replication and distribution
          • Potentially thousands of machines
          • Potentially distributed around the world
      • Queries need to return answers quickly
      • Mostly queries, few updates
      • Asynchronous inserts and updates
      • Schema-less
      • ACID transaction properties are not needed; BASE (Basically Available, Soft state, Eventually consistent) is used instead
      • CAP theorem
      • Open source development
  • 10. CAP Theorem
      • According to the theorem, a distributed system cannot satisfy all three of these guarantees at the same time: consistency, availability, and partition tolerance.
      • Eventual consistency guarantees that if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.
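Eventual consistency as described above can be illustrated with a toy simulation. The `Replica` class, the integer version numbers, and the last-write-wins rule are assumptions chosen for this sketch; real systems use more elaborate conflict resolution.

```python
class Replica:
    """A toy replica storing key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Last-write-wins: keep the entry with the highest version number.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Anti-entropy: pull every entry the other replica holds.
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


replicas = [Replica("a"), Replica("b"), Replica("c")]

# Two writes land on different replicas; the third replica has seen neither.
replicas[0].write("profile:42", "v1", version=1)
replicas[1].write("profile:42", "v2", version=2)
stale_reads = [r.read("profile:42") for r in replicas]

# With no new updates, pairwise synchronization makes every replica converge.
for target in replicas:
    for source in replicas:
        if source is not target:
            target.sync_from(source)

converged_reads = [r.read("profile:42") for r in replicas]
print(stale_reads, converged_reads)
```

Before synchronization the three replicas give three different answers; afterwards every read returns the last updated value.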
  • 11. Taxonomy
      • The basic classification that most would agree on is based on data model. A few of these and their prototypes are:
          • Column: HBase, Accumulo
          • Document: MongoDB, Couchbase
          • Key-value: Dynamo, Riak, Redis, Cache, Project Voldemort
          • Graph: Neo4J, Allegro, Virtuoso
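To make the four data models more concrete, here is one hypothetical user record expressed in each shape using plain Python structures. The keys, field names, and the `followed_names` helper are invented for illustration and do not reflect any particular product's API.

```python
import json

# Key-value: the store sees an opaque value behind a single key.
key_value = {"user:42": '{"name": "Ana", "city": "Cordoba"}'}

# Document: the store understands the nested structure and can query fields.
document = {"_id": 42, "name": "Ana", "address": {"city": "Cordoba"}, "follows": [7]}

# Column family: row key -> column family -> column -> value.
column_family = {"42": {"info": {"name": "Ana"}, "address": {"city": "Cordoba"}}}

# Graph: nodes plus labeled edges between them.
nodes = {42: {"name": "Ana"}, 7: {"name": "Luis"}}
edges = [(42, "FOLLOWS", 7)]


def followed_names(user_id):
    """Traverse FOLLOWS edges starting at user_id and return the target names."""
    return [
        nodes[dst]["name"]
        for src, label, dst in edges
        if src == user_id and label == "FOLLOWS"
    ]


# The same fact (the user's city) is reached differently in each model.
city_from_kv = json.loads(key_value["user:42"])["city"]  # the client must parse the value
city_from_doc = document["address"]["city"]
city_from_columns = column_family["42"]["address"]["city"]
print(city_from_kv, city_from_doc, city_from_columns, followed_names(42))
```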
  • 12. MapReduce A MapReduce program is composed of a Map() procedure that performs filtering and sorting (such as sorting students by first name into queues, one queue for each name) and a Reduce() procedure that performs a summary operation (such as counting the number of students in each queue, yielding name frequencies).
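The student example from this slide can be sketched in a single process. The function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample student list are assumptions for illustration; a real MapReduce framework distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(students):
    # Emit one (key, value) pair per student, keyed by first name.
    for full_name in students:
        first_name = full_name.split()[0]
        yield first_name, 1


def shuffle(pairs):
    # Group values by key: one "queue" per first name.
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return queues


def reduce_phase(queues):
    # Summarize each queue by counting its entries, in sorted key order.
    return {key: sum(values) for key, values in sorted(queues.items())}


students = ["Ana Lopez", "Juan Perez", "Ana Gomez", "Luis Diaz", "Juan Ruiz", "Ana Sosa"]
name_frequencies = reduce_phase(shuffle(map_phase(students)))
print(name_frequencies)  # {'Ana': 3, 'Juan': 2, 'Luis': 1}
```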
  • 13. NoSQL is not a magic solution
      • Inconsistent APIs between NoSQL providers.
      • Denormalized data requires you to maintain your own data relationships in code.
      • Limited operational tooling for DevOps / IT.
      • Without support for complex queries, joins, aggregations, and filters must be done in code (except where MapReduce is available).
      • You need the whole value behind a key to read or write any part of it.
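Two of these limitations, joins done in application code and whole-value reads and writes, can be illustrated with a plain dictionary standing in for a key-value store. The key naming scheme, the fields, and both helper functions are hypothetical.

```python
import json

# A plain dict stands in for a key-value store; values are opaque JSON strings.
store = {
    "user:1": json.dumps({"name": "Ana", "city": "Cordoba"}),
    "user:2": json.dumps({"name": "Luis", "city": "Rosario"}),
    "order:10": json.dumps({"user_id": 1, "total": 250}),
    "order:11": json.dumps({"user_id": 2, "total": 90}),
    "order:12": json.dumps({"user_id": 1, "total": 40}),
}


def update_city(store, user_key, city):
    # Changing one field means reading and rewriting the whole value.
    user = json.loads(store[user_key])
    user["city"] = city
    store[user_key] = json.dumps(user)


def totals_by_user_name(store):
    # The join (order -> user) and the aggregation both happen in application code.
    totals = {}
    for key, raw in store.items():
        if not key.startswith("order:"):
            continue
        order = json.loads(raw)
        user = json.loads(store[f"user:{order['user_id']}"])
        totals[user["name"]] = totals.get(user["name"], 0) + order["total"]
    return totals


update_city(store, "user:1", "Mendoza")
totals = totals_by_user_name(store)
print(totals)
```

In a relational database the same result would come from a single `JOIN` with `GROUP BY`; here the application has to scan, look up, and sum by itself.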
  • 14. NoSQL Use Cases:
      • SAP uses MongoDB as a core component of SAP's platform-as-a-service (PaaS) offering.
      • Foursquare uses MongoDB to store venues and user "check-ins" into venues, sharding the data over more than 25 machines on Amazon EC2.
      • MongoDB is used for back-end storage on the SourceForge front pages, project pages, and download pages for all projects.
      • Codecademy is the easiest way to learn to code online.
      • Guardian.co.uk is a leading UK-based news website.
      • EA Sports: MongoDB is being used for the game feeds component.
  • 15. NoSQL Use Cases:
      • AOL: "We selected Couchbase after evaluating several open source products to power our next-generation backend ad serving platform."
      • Zynga's FarmVille, Café World, Mafia Wars and other games have over 235 million active users per month. We rely on technology from Couchbase to make that possible.
      • In the PayPal Media Network Advertising Pipeline, Couchbase is used to build scalable cross-channel audience profiling, segmentation, identity mapping, and frequency capping.
      • LinkedIn built a durable and scalable index for its metrics visualization engine using Couchbase.
      • Skyscanner scaled one of its flight search APIs from 100,000 searches a day to over 3 million by introducing Couchbase into its tech stack.
  • 16. Other use cases
      • Netflix uses Amazon SimpleDB.
      • Twitter uses Cassandra, Hadoop, and HBase, among others.
      • Facebook and Instagram both use Cassandra.
      • Google uses Bigtable (the design on which Apache HBase is modeled).
      • LinkedIn uses Voldemort.
      • Etc.
  • 17. Summary
      • This is just the tip of the iceberg. From here on, the rest is up to you!
      • SQL works great, but can't scale for large data.
      • NoSQL works great, but doesn't fit every case.
      • Use SQL + NoSQL.
  • 19. Thanks!
  • 20. Backup
  • 21. JSON
      • JSON, or JavaScript Object Notation, is a text-based open standard designed for human-readable data interchange. Derived from the JavaScript scripting language, JSON is a format for representing simple data structures and associative arrays, called objects. Despite its relationship to JavaScript, JSON is language-independent, with parsers available for many languages.
      • Sample:
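As an illustration of the format (a stand-in, not the slide's original sample), the following sketch serializes a hypothetical record to JSON text and parses it back with Python's standard `json` module. The field names are invented for the example.

```python
import json

# Hypothetical record; the field names are illustrative, not from the slides.
student = {
    "name": "Ana",
    "courses": ["Databases", "Algorithms"],
    "address": {"city": "Cordoba", "country": "Argentina"},
    "graduated": False,
}

# Serialize to human-readable JSON text, then parse it back.
text = json.dumps(student, indent=2, sort_keys=True)
parsed = json.loads(text)

print(text)
```

Objects map to Python dictionaries, arrays to lists, and `false` to `False`, which is the kind of per-language mapping that makes JSON language-independent.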
