This document discusses NoSQL databases and provides an overview of different data models including flat file, hierarchical, network, relational, and object models. It defines key terms related to databases and NoSQL. The document outlines some advantages of the relational model but also challenges it faces. It reviews characteristics of popular NoSQL databases like Redis, Cassandra, MongoDB and Neo4j and discusses research topics in NoSQL databases.
OrientDB: Unlock the Value of Document Data Relationships - Fabrizio Fortino
a) A general introduction of graph databases and OrientDB,
b) Why connected data has more value than just data,
c) How to "have fun" with OrientDB combining documents with graphs via SQL,
d) A use case on how OrientDB has helped to raise standards in Irish Public Office.
On OrientDB: NoSQL document databases provide an elegant way to deal with data in different shapes, enabling developers to create better and faster products quickly. The main goal of these systems is to find the most efficient way to manage the data itself. With the Big Data explosion we need to deal with a myriad of highly interconnected information. The challenge now is not only how to store data but how to manage, analyse, traverse and use it within the context of relationships. Graph databases shine at maintaining highly connected data and are the fastest-growing category of database management systems: 2014 registered a 250% increase in adoption, and Forrester Research predicts that more than a quarter of enterprises will be using graphs by 2017. OrientDB combines more than one NoSQL model, offering the flexibility of modelling data as either documents or graphs, while incorporating object-oriented programming as a way of encapsulating relationships.
Polyglot Persistence vs Multi-Model Databases - Luca Garulli
Many complex applications scale up by using several different databases, i.e. selecting the best DBMS for each use case. This tends to complicate modern architectures with many products from different vendors, no standards, and a lot of ETL, which ultimately causes unpredictable results and a lot of headaches. Multi-model DBMSs were created to make your life easier, giving you the option of using one NoSQL product with powerful multi-purpose engines capable of handling complex domains. Could one DBMS handle all your needs, including speed and scalability, in the times of Big Data? Luca will walk you through the benefits and trade-offs of multi-model DBMSs and will show you how easy it is to set up one open source database to handle many different use cases, saving you time and money.
Presented at Data Day Texas - Austin (TX) - USA
Towards a rebirth of data science (by Data Fellas) - Andy Petrella
Nowadays, Data Science is buzzing all over the place.
But what is a so-called Data Scientist?
Some will argue that a Data Scientist is a person able to report and present insights in a data set. Others will say that a Data Scientist can handle a high throughput of values and expose them in services. Yet another definition includes the capacity to create meaningful visualizations on the data.
However, we are entering an age where velocity is key. Not only is the velocity of your data high, but the time to market is also shorter. Hence, the time separating the moment you receive a set of data from the moment you’ll be able to deliver added value is crucial.
In this talk, we’ll review the legacy Data Science methodologies and what they meant in terms of delivered work and results.
Afterwards, we’ll gradually move towards different concepts, techniques and tools that Data Scientists will have to learn and adopt in order to accomplish their tasks in the age of Big Data.
The talk closes by presenting the Data Fellas view of a solution to these challenges, notably the Spark Notebook and the Shar3 product we develop.
Demi Ben Ari - Apache Spark 101 - First Steps into distributed computing:
The world has changed: having one huge server won’t do the job, and the ability to scale out will be your savior. Apache Spark is a fast and general engine for big data processing, with streaming, SQL, machine learning and graph processing. This talk shows the basics of Apache Spark and distributed computing.
Demi is a software engineer, entrepreneur and international tech speaker.
Demi has over 10 years of experience in building various systems, in both near-real-time applications and Big Data distributed systems.
Co-Founder of the “Big Things” Big Data community and Google Developer Group Cloud.
Big Data Expert, but interested in all kinds of technologies, from front-end to backend, whatever moves data around.
Here is my seminar presentation on NoSQL databases. It includes all the types of NoSQL databases, their merits and demerits, examples of NoSQL databases, etc.
For seminar report of NoSQL Databases please contact me: ndc@live.in
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis - Trieu Nguyen
Growth of big datasets
Introduction to Apache Hadoop and Spark for developing applications
Components of Hadoop, HDFS, MapReduce and HBase
Capabilities of Spark and the differences from a typical MapReduce solution
Some Spark use cases for data analysis
Information technology has led us into an era in which producing, sharing and using information are part of everyday life, often without our being aware of it: it is now almost impossible not to leave a digital trail of the actions we perform every day, for example through digital content such as photos, videos, blog posts and everything that revolves around social networks (Facebook and Twitter in particular). In addition, with the "internet of things" we see a growing number of devices such as watches, bracelets, thermostats and many other items that can connect to the network and therefore generate large data streams. This explosion of data explains the birth of the term Big Data: it denotes data produced in large quantities, at remarkable speed and in different formats, whose processing requires technologies and resources that go far beyond conventional data management and storage systems. It is immediately clear that 1) data storage based on the relational model and 2) processing systems based on stored procedures and grid computations are not applicable in these contexts. Regarding point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and implementation cost are only part of the disadvantages: very often, when managing big data, variability (the lack of a fixed structure) is also a significant problem. This has given a boost to the development of NoSQL databases. The NoSQL Databases website defines them as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable."
These databases are distributed, open source, horizontally scalable, without a predetermined schema (key-value, column-oriented, document-based and graph-based), easily replicated, without ACID guarantees, and able to handle large amounts of data. They are often integrated with processing tools based on the MapReduce paradigm, which Google published in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model, taught in basic database design courses, has many limitations compared with the demands of new Big Data applications, which use NoSQL databases to store data and MapReduce to process it.
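To make the MapReduce paradigm mentioned above concrete, here is a minimal single-process sketch of a word count in plain Python. It only imitates the three phases (map, shuffle, reduce) in memory; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are illustrative assumptions, not part of Hadoop or any other framework, and nothing here is distributed.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    A real framework does this across machines; here it is a local dict.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of text records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data big streams", "data in different formats"]
    print(word_count(docs))
```

Because each map call looks at one record and each reduce call looks at one key, both phases can in principle be spread over many machines, which is the property that makes the model attractive for large data sets.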
Course Website http://pbdmng.datatoknowledge.it/
Contact me for further information and to download the slides
In this lecture we analyze document-oriented databases. In particular, we consider why they are the first approach to NoSQL and what their main features are. Then we analyze MongoDB as an example. We consider the data model, CRUD operations, write concerns, and scaling (replication and sharding).
Finally, we present other document-oriented databases and discuss when to use, or not to use, document-oriented databases.
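As a rough illustration of the document data model and CRUD operations named in the lecture summary above, the sketch below keeps schemaless documents (plain dictionaries) in memory. The class `DocumentCollection` and its methods are hypothetical names invented for this example; this is not the MongoDB API and it involves no server, persistence, replication or sharding.

```python
import copy
import itertools


class DocumentCollection:
    """Toy in-memory collection of schemaless documents (illustrative only)."""

    def __init__(self):
        self._docs = {}                 # maps generated id -> document
        self._ids = itertools.count(1)  # simple increasing id generator

    def insert(self, doc):
        """Create: store a copy of the document and return its id."""
        doc_id = next(self._ids)
        self._docs[doc_id] = copy.deepcopy(doc)
        return doc_id

    def find(self, **criteria):
        """Read: return documents whose fields equal all given criteria."""
        return [
            dict(doc, _id=doc_id)
            for doc_id, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]

    def update(self, doc_id, **changes):
        """Update: set or add fields on one document."""
        self._docs[doc_id].update(changes)

    def delete(self, doc_id):
        """Delete: remove one document if it exists."""
        self._docs.pop(doc_id, None)


if __name__ == "__main__":
    users = DocumentCollection()
    ada = users.insert({"name": "Ada", "langs": ["en"]})
    users.insert({"name": "Lin", "city": "Oslo"})  # different fields, same collection
    users.update(ada, city="Oslo")
    print(users.find(city="Oslo"))
```

The point of the sketch is that two documents in the same collection need not share the same fields, which is the flexibility a fixed relational schema does not offer.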
An Approach for RDF-based Semantic Access to NoSQL Repositories, presented as a partial requirement for the discipline "Metodologia da Pesquisa em Ciência da Computação" at UFSC/2015
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database - Edureka!
NoSQL covers a wide range of database technologies that were developed in response to the surging volume of stored data. Relational databases struggle to cope with this volume and face agility challenges. This is where NoSQL databases have come into play, and they are popular because of their features. The session covers the following topics to help you choose the right NoSQL database:
Traditional databases
Challenges with traditional databases
CAP Theorem
NoSQL to the rescue
A BASE system
Choose the right NoSQL database
Agenda
- What is NOSQL?
- Motivations for NOSQL?
- Brewer’s CAP Theorem
- Taxonomy of NOSQL databases
- Apache Cassandra
- Features
- Data Model
- Consistency
- Operations
- Cluster Membership
- What does NOSQL mean for RDBMS?
A practical introduction to Oracle NoSQL Database - OOW2014 - Anuj Sahni
Not familiar with Oracle NoSQL Database yet? This great product introduction session discusses the primary functionality included with the product as well as integration with other Oracle products. It includes a live demo that illustrates installation and configuration as well as data modeling and sample NoSQL application development.
NoSQL Databases for Implementing Data Services – Should I Care? - Guido Schmutz
Traditionally, the data services in a service-oriented solution have been implemented using relational data technologies. For a lot of scenarios, this might be the best choice. On the other hand, there are use cases where an alternative storage mechanism, such as a NoSQL database, might help to solve the problem more easily or in a more scalable way, i.e. by using a different storage model.
An Intro to NoSQL Databases -- NoSQL databases will not become the new dominators. Relational will still be popular, and used in the majority of situations. They, however, will no longer be the automatic choice. (source : http://martinfowler.com/)
An unprecedented amount of data is being created and is accessible. This presentation will instruct on using the new NoSQL technologies to make sense of all this data.
What is NoSQL? How does it come to the picture? What are the types of NoSQL? Some basics of different NoSQL types? Differences between RDBMS and NoSQL. Pros and Cons of NoSQL.
What is MongoDB? What are the features of MongoDB? Nexus architecture of MongoDB. Data model and query model of MongoDB? Various MongoDB data management techniques. Indexing in MongoDB. A working example using MongoDB Java driver on Mac OSX.
Cloud Databases in Research and Practice - Felix Gessert
The combination of database systems and cloud computing is extremely attractive: unlimited storage capacities, elastic scalability and as-a-Service models seem to be within reach. This talk will give an in-depth survey of existing solutions for cloud databases that have evolved in recent years and provide a classification and comparison. This includes real-world systems (e.g. Azure Tables, DynamoDB and Parse) as well as research approaches (e.g. RelationalCloud and ElasTras). In practice, however, there are some unsolved problems. Network latency, scalable transactions, SLAs, multi-tenancy, abstract data modelling, elastic scalability and polyglot persistence pose daunting tasks for many scenarios. Therefore, we conclude with "Orestes", a research approach based on well-known techniques such as web caching, Bloom filters and optimistic concurrency control that demonstrates how existing cloud databases can be enhanced to suit specific applications.
Better Together: The New Data Management Orchestra - Cloudera, Inc.
To ingest, store, process and leverage big data for maximum business impact requires integrating systems, processing frameworks, and analytic deployment options. Learn how Cloudera’s enterprise data hub framework, MongoDB, and Teradata Data Warehouse working in concert can enable companies to explore data in new ways and solve problems that not long ago might have seemed impossible.
Gone are the days of NoSQL and SQL competing for center stage. Visionary companies are driving data subsystems to operate in harmony. So what’s changed?
In this webinar, you will hear from executives at Cloudera, Teradata and MongoDB about the following:
How to deploy the right mix of tools and technology to become a data-driven organization
Examples of three major data management systems working together
Real world examples of how business and IT are benefiting from the sum of the parts
Join industry leaders Charles Zedlewski, Chris Twogood and Kelly Stirman for this unique panel discussion, moderated by BI Research analyst, Colin White.
This is a presentation by Peter Coppola, VP of Product and Marketing at Basho Technologies and Matthew Aslett, Research Director at 451 Research. Join them as they discuss whether multi-model databases and polyglot persistence have increased operational complexity. They'll discuss the benefits and importance of NoSQL databases and how the Basho Data Platform helps enterprises leverage Big Data applications.
When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture - Irfan Elahi
Slides of my online lecture that I delivered to the grad students of University of Tromsø (Norway) about
"When Databases Meet Big Data - Expectations, Challenges and Opportunities"
on 13/09/2018.
The lecture provided an overview of what databases have been used for traditionally and with the rise of big data paradigms, what expectations do enterprises and organizations have now from them. With the shift from vertical scaling to horizontal scaling, what challenges germinate in the context of functional capabilities of databases and how does it all align with the expectations from big data platforms which are increasingly being considered for use-cases like ETL offloading and scalable data warehousing. Lastly, what opportunities lie in this niche and what lies beyond.
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise - Anita Luthra
Overview of SQL vs NoSQL. When to use NoSQL vs structured databases. Shows roadmap and considerations for defining success of implementation of Big Data in the enterprise. This presentation also provides a quick overview of the different types of Big-Data databases
My presentation about the EMC Academic Alliance Program at Mansoura University. In this presentation, I tried to give an introduction to the four most famous EMC courses and an overview of the most famous EMC products.
NoSQL Databases, Not just a Buzzword
1. Let's Agree on..
Data Models
NoSQL Matters
Closer Look on Famous Products
Research in NoSQL
Finally
References
NoSQL Databases
Not just a Buzzword
Haitham A. El-Ghareeb
Faculty of Computers and Information Sciences
Mansoura University
Egypt
helghareeb@mans.edu.eg
December 8, 2014
Haitham A. El-Ghareeb NoSQL Databases
Contacts
Twitter: @helghareeb
Linkedin: http://eg.linkedin.com/in/helghareeb
Youtube: http://video.helghareeb.me
Web site: http://www.helghareeb.me
Blog: http://blog.helghareeb.me
email: helghareeb@mans.edu.eg
email: helghareeb@gmail.com
1 Let's Agree on..
Some definitions
Open Source
Scalability
Theory vs. Product
2 Data Models
Flat File
Hierarchical
Network
Relational
Object
3 NoSQL Matters
Key Value Stores
Column Oriented
Document-based
Graph
4 Closer Look on Famous Products
Redis
Cassandra
MongoDB
Neo4j
5 Research in NoSQL
Disadvantages of NoSQL
Which DB When
6 Finally
7 References
Objectives
NoSQL is a hot topic
Meet some famous Data Models
Compare NoSQL and SQL
Review common features of NoSQL databases
Identify some SQL Challenges
Address CAP Theorem
Meet some well known NoSQL Products
Database
Collection of persistent data that is used by the application
systems of some given enterprise [3].
Database System
Basically just a computerized record-keeping system. The
database itself can be regarded as a kind of electronic filing
cabinet; that is, it is a repository or container for a collection
of computerized data
Open Source
FOSS: Free and Open Source Software
Flat Model of Management
Everyone can contribute
Variety
Scalability
It is the ability of a system, network, or process to handle
a growing amount of work in a capable manner or its
ability to be enlarged to accommodate that growth [2].
For example, it can refer to the capability of a system to
increase its total output under an increased load when
resources (typically hardware) are added.
Scalability Options
Traditional Scaling (Vertical Scaling) - Cloud databases
scaling up by adding new expensive big servers.
is not always possible
is not reliable in many cases
Architectural principle - scaling out (Horizontal Scaling)
based on data partitioning, i.e. dividing the database
across many (inexpensive) machines.
Sharding
Technique: data sharding, i.e. horizontal partitioning of
data
Consequences:
manage parallel access in the application
scales well for both reads and writes
not transparent, application needs to be partition-aware
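The consequence that the application must be partition-aware can be illustrated with a minimal sketch. It assumes hash-based routing over in-memory dictionaries standing in for database machines; the class name `ShardedStore` and its methods are hypothetical and do not come from any product discussed in these slides.

```python
import hashlib


class ShardedStore:
    """Application-side sharding: the client picks the shard for every key."""

    def __init__(self, shard_count):
        # Each dict stands in for one (inexpensive) database machine.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash keeps a key on the same shard across runs.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)


store = ShardedStore(shard_count=4)
for user_id in ("u1", "u2", "u3", "u4", "u5", "u6"):
    store.put(user_id, {"id": user_id})
```

Reads and writes for different keys touch different shards, which is why this layout scales for both, but every caller has to go through the routing function.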
Theory vs. Product
Who leads: Academia vs. Market?
Who makes the standards?
Who sticks with standards?
Flat File
Widely Used
Example - Con
Hierarchical Data Model
Represents data as hierarchical tree structures.
Each hierarchy represents a number of related records.
There is no standard language for the hierarchical model.
A popular hierarchical DML is DL/1 of the IMS system.
It dominated the DBMS market for over 20 years between
1965 and 1985 and is still a widely used DBMS worldwide.
Hierarchical Data Model Example
Hierarchical Data Model Products
Examples of Products [4]:
Microsoft Windows Registry
IBM - IMS
SAS - System 2K
TDMS
Network Data Model [4]
Represents data as record types and also represents a
limited type of 1:N relationship, called a set type.
A 1:N, or one-to-many, relationship relates one instance
of a record to many record instances using some pointer
linking mechanism in these models.
Also known as the CODASYL DBTG model (CODASYL DBTG stands
for Conference on Data Systems Languages Database Task Group).
Its DML was proposed in 1971 by the Database Task Group
(DBTG).
Network Model Example [4]
Network Data Model Products
Example of Products [4]
Computer Associates - IDMS
Unisys - DMS 1100
HP - IMAGE
HP - VAX DBMS
Cincom - SUPRA
Relational Model
In Relational Model [3]:
Data is represented by means of rows in tables, and such
rows can be directly interpreted as true propositions.
Operators are provided for operating on rows in tables,
and those operators directly support the process of
inferring additional true propositions from the given ones.
As a simple example, the relational project operator.
First relational products began to appear in the late
1970s and early 1980s.
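The project operator mentioned above can be sketched in a few lines. This is an illustration under assumptions: rows are modelled as Python dictionaries, and the `employees` data and the `project` function are made up for the example.

```python
# Each row is read as a true proposition, e.g.
# "Amal works in Sales and is based in Cairo".
employees = [
    {"name": "Amal", "dept": "Sales", "city": "Cairo"},
    {"name": "Hany", "dept": "Sales", "city": "Mansoura"},
    {"name": "Nour", "dept": "IT", "city": "Cairo"},
]


def project(rows, attributes):
    """Keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Inferred propositions: "some employee works in Sales", "some employee works in IT".
departments = project(employees, ["dept"])
```

Duplicates are removed because a relation is a set of rows: the inferred proposition is either true or not, however many source rows support it.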
ACID Properties
Atomic - All of the work in a transaction completes
(commit) or none of it completes.
Consistent - A transaction transforms the database from
one consistent state to another consistent state.
Consistency is defined in terms of constraints.
Isolated - The results of any changes made during a
transaction are not visible until the transaction has
committed.
Durable - The results of a committed transaction survive
failures.
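Atomicity and constraint-based consistency can be seen in a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `account` table, the names in it, and the `transfer` function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # the consistency constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK, rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the failed transfer the credit to `bob` has already been applied when the debit violates the constraint; the rollback undoes it, so no partial result is ever visible.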
Relational Model is Great I
World Economy is Relational
SQL = Rich, declarative query language
Database enforces referential integrity
ACID semantics
Well understood by developers
Well supported by frameworks and tools, e.g. Spring
JDBC, Hibernate, JPA
Relational Model is Great II
Well understood by operations
Configuration
Care and feeding
Backups
Tuning
Failure and recovery
Performance characteristics
But...
Relational Model Challenges
Object/relational impedance mismatch
Complicated to map a rich domain model to a relational
schema
Relational schema is rigid
Difficult to handle semi-structured data, e.g. varying
attributes
Schema changes = downtime or money loss
Extremely difficult/impossible to scale writes:
Vertical scaling is limited/requires money
Horizontal scaling is limited or requires money
Performance can be suboptimal for some use cases
Brewer's CAP Theorem
A distributed system can support only two of the following
characteristics:
Consistency
Availability
Partition tolerance
Consistency
all nodes see the same data at the same time
client perceives that a set of operations has occurred all
at once
More like Atomic in ACID transaction properties
Availability
Node(s) failures do not prevent survivors from continuing
to operate
Every operation must terminate in an intended response
Partition Tolerance
Operations will complete, even if individual components
are unavailable
System continues to operate despite arbitrary message
loss
CAP Theorem Challenge
Brewer's CAP Theorem: for any system sharing data, it
is impossible to guarantee simultaneously all of these
three properties.
Very large systems will partition at some point:
It is necessary to decide between C and A
Traditional DBMS prefer C over A and P
Most Web applications choose A (except in specific
applications such as order processing)
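The choice between C and A during a partition can be made concrete with a toy model. This is a sketch under strong assumptions: two in-memory replicas, a flag that simulates the network partition, and the labels "CP" and "AP" as shorthand for the two choices; none of these names come from the slides.

```python
class Replica:
    def __init__(self):
        self.value = None


class TwoNodeRegister:
    """A single replicated value that either refuses or accepts writes when partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False

    def write(self, node_index, value):
        if self.partitioned:
            if self.mode == "CP":
                return False  # peer unreachable: give up availability
            self.nodes[node_index].value = value  # accept locally: replicas diverge
            return True
        for node in self.nodes:  # healthy network: update every replica
            node.value = value
        return True

    def read(self, node_index):
        return self.nodes[node_index].value


cp = TwoNodeRegister("CP")
cp.write(0, "v1")
cp.partitioned = True
cp_accepted = cp.write(0, "v2")

ap = TwoNodeRegister("AP")
ap.write(0, "v1")
ap.partitioned = True
ap_accepted = ap.write(0, "v2")
```

The CP register stays consistent but rejects the write; the AP register accepts it and the two replicas then return different values until they are reconciled.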
CAP Theorem and ACID
Drop A or C of ACID
Relaxing C makes replication easy and facilitates fault
tolerance.
Relaxing A reduces (or eliminates) the need for distributed
concurrency control.
Object Data Model [4] I
The object data model defines a database in terms of
objects, their properties, and their operations. Objects
with the same structure and behavior belong to a class,
and classes are organized into hierarchies (or acyclic
graphs).
The operations of each class are speci
Object Data Model [4] II
Relational DBMSs have been extending their models to
incorporate object database concepts and other
capabilities; these systems are referred to as
object-relational or extended relational systems.
Google Trends - 2009
Job Trends - 2006 - 2011
Definition
From www.nosql-database.org
Next Generation Databases mostly addressing some of the
points: being non-relational, distributed, open-source and
horizontal scalable. The original intention has been modern
web-scale databases. The movement began early 2009 and is
growing rapidly. Often more characteristics apply as:
schema-free, easy replication support, simple API,
eventually consistent / BASE (not ACID), a huge data
amount, and more.
NoSQL Distinguishing Characteristics
Large data volumes
Google's big data
Scalable replication and distribution
Potentially thousands of machines
Potentially distributed around the world
Queries need to return answers quickly
Mostly query, few updates
Asynchronous Inserts and Updates
Schema-less
ACID transaction properties are not needed - BASE
Open source development
List of NoSQL Databases I
From www.nosql-database.org: Currently [ 150 ]
Grouped into:
Wide Column Store / Column Families
Document Store
Key Value / Tuple Store
Graph Databases
Multimodel Databases
Object Databases
Grid and Cloud Database Solutions
List of NoSQL Databases II
XML Databases
Multidimensional Databases
Multivalue Databases
Event Sourcing
Network Model
Other NoSQL related databases
Unresolved and uncategorized
BASE Transactions
Acronym contrived to be the opposite of ACID
Basically Available - possibilities of faults but not a fault
of the whole system
Soft state - copies of a data item may be inconsistent
Eventually Consistent - copies become consistent at
some later time if there are no more updates to that
data item
Characteristics
Weak consistency
Availability first
Best effort
Approximate answers OK
Simpler and faster
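Soft state and eventual consistency can be sketched with two replicas that exchange state from time to time. This is a simplified illustration: the per-key version counter and the `anti_entropy` function are assumptions, and ties between concurrent writes are resolved arbitrarily here, which real systems handle with more care.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value):
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))[1]


def anti_entropy(a, b):
    """Exchange state so both replicas keep the highest version of every key."""
    for key in set(a.data) | set(b.data):
        newest = max(
            a.data.get(key, (0, None)),
            b.data.get(key, (0, None)),
            key=lambda entry: entry[0],
        )
        a.data[key] = newest
        b.data[key] = newest


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"])
stale = r2.read("cart")  # soft state: r2 has not seen the write yet
anti_entropy(r1, r2)
fresh = r2.read("cart")  # copies converge once updates stop
```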
Today's focus Inchallah
Key-Value Store - Hash table of keys
Column Oriented - Each storage block contains data from
only one column
Document Store - stores documents made up of tagged
elements
Graph Databases - focus on relations not only entities
Key-Value Store I
A two-column table consisting of a key and a value
associated with the key.
The key acts as the index, and the value can be
referenced as a look up.
Example - Project-Voldemort
http://www.project-voldemort.com/
LinkedIn
Eventually consistent key-value store, auto scaling.
Key-Value Store II
Example - MemcacheDB
http://memcachedb.org/
Backend storage is Berkeley DB
Membase - Memcached with persistence and improved
consistent hashing.
AppFabric Cache - Multi-region cache.
Redis - Data structure server.
Riak - Based on Amazon's Dynamo.
Key-Value Store
Consistent Hashing
Solves Partitioning Problem.
Consistent Hashing.
servers = [s1, s2, s3, s4]
serverToSendData = servers[hash(data) % servers.length]
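The modulo expression above assigns data to servers, but with that scheme most keys change servers whenever the server list changes. A hash ring is the usual way to limit that movement. The following is a minimal sketch, assuming MD5 as the hash and 100 virtual points per server; the `HashRing` class and its parameters are illustrative, not taken from any product named in these slides.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # virtual points per server
        self._ring = []           # sorted list of (point, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping around the ring.
        index = bisect.bisect(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[index][1]


keys = [f"key-{n}" for n in range(1000)]
ring = HashRing(["s1", "s2", "s3", "s4"])
before = {k: ring.lookup(k) for k in keys}
ring.add("s5")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to the new server `s5`; keys never shuffle between the servers that were already present.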
Optimistic Concurrency
Column Oriented
Store data in column order
Allow key-value pairs to be stored (and retrieved on key)
in a massively parallel system
Data model: families of attributes defined in a schema,
new attributes can be added
Storing principle: big hashed distributed tables
Properties: partitioning (horizontally and/or vertically),
high availability etc.
Completely transparent to application
Enables compression over column
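Storing data in column order, and why that enables compression over a column, can be shown with a short sketch. The sample rows and the helper names `to_columns` and `run_length_encode` are assumptions made for the illustration; run-length encoding is only one possible compression scheme.

```python
rows = [
    {"id": 1, "country": "EG", "amount": 10},
    {"id": 2, "country": "EG", "amount": 25},
    {"id": 3, "country": "EG", "amount": 5},
    {"id": 4, "country": "DE", "amount": 40},
]


def to_columns(rows):
    """Turn a list of rows into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)
total = sum(columns["amount"])                       # reads one column only
compressed = run_length_encode(columns["country"])   # similar values sit together
```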
Document-based
Collections of Documents
Schema-less
Based on JSON format: a data model which supports
lists, maps, dates, Boolean with nesting
Indexed semi-structured documents
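A schema-less collection of JSON documents with a simple index might look like the sketch below. The field names and the `build_index` helper are assumptions for the example and do not reflect the API of any particular document store.

```python
import json

# Two documents in the same collection with different shapes.
collection = [
    {
        "id": 1,
        "title": "NoSQL intro",
        "tags": ["nosql", "db"],       # list
        "author": {"name": "Sara"},    # nested map
        "published": True,             # Boolean
    },
    {"id": 2, "title": "Graph basics", "tags": ["graph"], "pages": 12},
]


def build_index(docs, field):
    """Map every value found in a list-valued field to the ids of matching documents."""
    index = {}
    for doc in docs:
        for value in doc.get(field, []):
            index.setdefault(value, []).append(doc["id"])
    return index


tag_index = build_index(collection, "tags")
round_trip = json.loads(json.dumps(collection))  # documents serialize as JSON
```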
Graph Data Model
NoSQL Concept Tree [1]
Complexity
Redis I
Advanced key-value store
Values can be binary strings, Lists, Sets, Ordered Sets,
Hash maps, ..
Operations for each data type, e.g. appending to a list,
adding to a set, retrieving a slice of a list, . . .
Provides pub/sub-based messaging
Very fast:
In-memory operations
100K operations/second on entry-level hardware
Redis II
Persistent
Periodic snapshots of memory OR append commands to a
log file
Limit is the size of the keys retained in memory.
Has transactions
Commands can be batched and executed atomically
Scaling Redis
Master/slave replication
Tree of Redis servers
Non-persistent master can replicate to a persistent slave
Use slaves for read-only queries
Run multiple servers per physical host
Server is single threaded, so this leverages multiple CPUs
Optional virtual memory
Ideally data should fit in RAM
Values (not keys) written to disk
Redis use cases
Use in conjunction with another database
Drop-in replacement for Memcached
Session state
Cache of data retrieved from System Of Records (SOR)
Denormalized datastore for high-performance queries
Randomly selecting an item - SRANDMEMBER
Queuing - Lists with LPOP, RPUSH, ...
High score tables - Sorted sets
Notable users: github, guardian.co.uk, ...
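The data-structure use cases above can be emulated with Python's standard library to show what each Redis structure provides. This is not Redis client code: no server is contacted, and the comments only name the Redis command whose behaviour each line imitates.

```python
import random
from collections import deque

# Queuing: a list used as a FIFO queue.
queue = deque()
queue.append("job-1")        # like RPUSH queue job-1
queue.append("job-2")        # like RPUSH queue job-2
next_job = queue.popleft()   # like LPOP queue

# High score table: members ordered by score.
scores = {}
scores["amr"] = 120          # like ZADD highscores 120 amr
scores["mona"] = 300
scores["ziad"] = 75
top_two = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:2]

# Randomly selecting an item from a set.
members = {"a", "b", "c"}
picked = random.choice(sorted(members))  # like SRANDMEMBER
```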
Cassandra Data Model I
Columns - The column is the smallest increment of data
in Cassandra. It is a tuple containing a name, a value and
a timestamp.
A column must have a name, and the name can be a
static label (such as name or email) or it can be
dynamically set when the column is created by the
application.
It is not required for a column to have a value.
Cassandra Data Model II
Column Family
Column Family is similar to a table in that it is a
container for columns and rows.
In Cassandra, we define metadata about the columns, but
the actual columns that make up a row are determined by
the client application.
Each row can have a different set of columns.
Column Family is not entirely schema-less. Each column
family should be designed to contain a single type of data.
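A minimal way to picture a column family is a map from row key to a map of columns. The sketch below uses plain dictionaries; the `users` data, the `metadata` table, and the `validate` helper are hypothetical and only illustrate that column metadata is declared while each row chooses its own columns.

```python
# A column family sketched as: row key -> {column name -> value}.
users = {
    "row1": {"name": "Ada", "email": "ada@example.com"},
    "row2": {"name": "Linus", "phone": "555-0100", "city": "Helsinki"},
}

# Hypothetical column metadata: expected value type per known column name.
metadata = {"name": str, "email": str, "phone": str, "city": str}


def validate(row: dict) -> bool:
    """Check that every column the row actually holds matches its declared type."""
    return all(isinstance(value, metadata.get(name, object))
               for name, value in row.items())


# Each row carries its own set of columns.
column_sets = {key: sorted(row) for key, row in users.items()}
print(column_sets)
```

Both rows hold "user" data, which matches the advice that one column family should contain a single type of data even though the column sets differ.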
Static vs. Dynamic Column Family
Super Column
A column family can contain either regular columns or
super columns.
A super column adds another level of nesting to the
regular column family structure.
A super column consists of a (super) column name
and an ordered map of sub-columns.
A super column can specify a comparator on both the
super column name and the sub-column names.
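The extra nesting level can be sketched with nested dictionaries: a row maps super column names to maps of sub-columns. In this illustration a "comparator" is modelled as a Python sort-key function. The `ordered` helper and the sample data are assumptions, not part of any Cassandra API.

```python
# Row -> super column name -> {sub-column name -> value}
row = {
    "work": {"street": "2 Side St", "city": "Mansoura"},
    "home": {"street": "1 Main St", "city": "Cairo"},
}


def ordered(mapping, comparator=str):
    """Return (name, value) pairs sorted by the given key function."""
    return sorted(mapping.items(), key=lambda pair: comparator(pair[0]))


# One comparator orders the super column names, another orders the sub-column names.
view = [
    (super_name, ordered(sub_columns, comparator=str.lower))
    for super_name, sub_columns in ordered(row, comparator=str.lower)
]
print(view)
```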
Super Column
Cassandra Advantages
Tunable consistency.
Decentralized.
Writes are faster than reads.
No single point of failure.
Incremental scalability.
Uses consistent hashing (logical partitioning) when
clustered.
Peer-to-peer routing (ring).
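Consistent hashing on a ring is what makes incremental scaling cheap: adding a node moves only the keys that fall into the new node's arc. The sketch below is a simplified illustration with one token per node and MD5 as the hash. The `Ring` class, `token` function, and node names are assumptions; production systems use different partitioners and typically many tokens per node.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Each node owns the arc that ends at its own token.
        self._tokens = sorted((token(node), node) for node in nodes)

    def owner(self, key: str) -> str:
        """Walk clockwise from the key's token to the first node token."""
        positions = [position for position, _ in self._tokens]
        index = bisect.bisect_left(positions, token(key)) % len(self._tokens)
        return self._tokens[index][1]


keys = [f"user:{i}" for i in range(1000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

# Keys whose owner changed when node-d joined the ring.
moved = [key for key in keys if before.owner(key) != after.owner(key)]
print(len(moved), "of", len(keys), "keys moved")
```

Every key that changes owner lands on the new node; keys owned by the other nodes stay where they were, unlike a plain `hash(key) % node_count` scheme, where most keys would be reassigned.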
BigTable Representation
MongoDB
Document-oriented database
JSON-style documents: Lists, Maps, primitives
Documents organized into collections (analogous to tables)
Full or partial document updates
Transactional update in place on one document
Atomic modifiers
Rich query language for dynamic queries
Index support: secondary and compound
GridFS for efficiently storing large files
BSON
MongoDB Query By Example
One Document
Stored as one sequence of bytes on disk, so I/O is fast
No joins/seeks
In-place updates when possible, so no index updates
Transaction = update of a single document
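The idea that the unit of atomicity is one document can be sketched in plain Python. The function below is a stand-in, not the MongoDB driver API: it applies two modifier-style operations, named after MongoDB's `$set` and `$inc` update operators, to a copy of a document and returns the new version only if every operation succeeds. The `apply_update` name and the sample document are assumptions.

```python
import copy


def apply_update(document: dict, update: dict) -> dict:
    """Apply $set and $inc modifiers to a copy; the original is never half-updated."""
    draft = copy.deepcopy(document)
    for name, value in update.get("$set", {}).items():
        draft[name] = value
    for name, amount in update.get("$inc", {}).items():
        draft[name] = draft.get(name, 0) + amount  # raises TypeError on non-numbers
    return draft


post = {"_id": 1, "title": "NoSQL", "views": 10, "tags": ["db"]}

# Partial update: only the named fields change, the rest of the document is kept.
post = apply_update(post, {"$set": {"title": "NoSQL Databases"}, "$inc": {"views": 1}})

# A failing update leaves the stored document untouched.
try:
    post = apply_update(post, {"$set": {"views": 0}, "$inc": {"title": 1}})
except TypeError:
    pass

print(post)
```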
MongoDB use cases
Use cases
Real-time analytics
Content management systems
Single document partial update
Caching
High volume writes
Who is using it?
Shutterfly, Foursquare
Bit.ly, Intuit
SourceForge, NY Times
GILT Groupe, Evite
SugarCRM
Neo4j I
Graph data model
Collection of graph nodes
Typed relationships between nodes
Nodes and relationships have properties
High performance traversal API from roots
Breadth-first ...
Find root nodes
Indexes on node/relationship properties
Pluggable - Lucene is the default
Graph algorithms: shortest path, ...
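The property-graph model and a breadth-first traversal from a root node can be sketched in a few lines of plain Python. This is an in-memory illustration, not the Neo4j API; the node data, relationship types, and the `shortest_path` helper are all assumptions.

```python
from collections import deque

# Nodes and relationships both carry properties; relationships are typed and directed.
nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
    4: {"name": "Dave"},
}
relationships = [
    (1, "KNOWS", 2, {"since": 2010}),
    (2, "KNOWS", 3, {"since": 2012}),
    (1, "WORKS_WITH", 4, {}),
    (4, "KNOWS", 3, {"since": 2011}),
]


def shortest_path(start, goal, rel_type):
    """Breadth-first traversal from a root node along one relationship type."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for source, kind, target, _props in relationships:
            if source == path[-1] and kind == rel_type and target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None  # goal not reachable along this relationship type


path = shortest_path(1, 3, "KNOWS")
names = [nodes[node_id]["name"] for node_id in path]
print(names)
```

Because breadth-first search explores paths in order of length, the first path that reaches the goal has the fewest hops.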
Neo4j II
Transactional (ACID) including 2PC
Deployment modes
Embedded: written in Java
Server with REST API
Neo4j Use Cases I
Use Cases
Anything social
Cloud/Network management, i.e. tracking/managing
physical/virtual resources
Any kind of geospatial data
Master data management
Bioinformatics
Fraud detection
Metadata management
Neo4j Use Cases II
Who is using it?
StudiVZ (the largest social network in Europe)
Fanbox
The Swedish military
And big organizations in datacom, intelligence, and
finance that wish to remain anonymous
Cypher
Disadvantages of NoSQL
New and sometimes buggy
Data is generally duplicated, potential for inconsistency
No standardized schema
No standard format for queries
No standard language
Difficult to impose complicated structures
Depends on the application layer to enforce data integrity
No guarantee of support
Too many options: which one, or ones, to pick?
Which DB Engine When
Another Attempt
NoSQL Summary
NoSQL databases reject:
Overhead of ACID transactions
Complexity of SQL
Burden of up-front schema design
Declarative query expression
Yesterday's technology
Programmer responsible for
Step-by-step procedural language
Navigating access path
SQL vs. NoSQL
SQL Databases
Predefined ...
... definition and interface language
NoSQL Databases
Getting an answer quickly is more important than
getting a correct answer
How to Succeed?
Know your application
Don't forget the past lessons
Consider a hybrid approach
Fight the desire to Roll-Your-Own-DB
Start small but significant
Contacts
Twitter: @helghareeb
LinkedIn: http://eg.linkedin.com/in/helghareeb
YouTube: http://video.helghareeb.me
Web site: http://www.helghareeb.me
Blog: http://blog.helghareeb.me
email: helghareeb@mans.edu.eg
email: helghareeb@gmail.com