Business Intelligence & NoSQL Databases
1. BI and NoSQL Databases
National Engineering School of Tunis
Prepared by:
Radhouene ROUACHED Zied ENNACEUR
18/11/2016
Master Information System Techniques
University of Tunis El Manar
2. Agenda
01 What is NoSQL
02 Business Intelligence needs NoSQL
03 Business Intelligence: Hadoop vs NoSQL
04 NoSQL Categories
05 Implementation: MongoDB, the most popular NoSQL solution
06 FAQ
3. Introduction
Problem statement
• Historically, the data that drives business intelligence has been
stored in structured formats in a data warehouse, for example
customer information on how much is spent. However, this
approach misses out on the value of semi-structured and
unstructured data, such as the details of a customer call or a
customer tweet.
4. Introduction
Solution
• With such information missing, the view of the customer or
business remains incomplete. Businesses that cannot capture
and measure this customer information risk falling behind,
especially in a competitive market.
5. What is NoSQL ?
• NoSQL stands for "Not Only SQL". It refers to next-generation
databases that are mostly non-relational, distributed and
horizontally scalable. Typical traits:
o Not using the relational model,
o Running well on clusters,
o Mostly open-source,
o Built for 21st-century web estates,
o Schema-less.
6. Business Intelligence needs NoSQL(1/2)
• With NoSQL, BI and data warehousing can become quicker and
much more efficient. NoSQL allows organizations to react to events
more quickly, increase customer attention, streamline the supply
chain, predict customer behavior at the point it matters and
predict future service calls. With the rise of big, unstructured data,
NoSQL presents an enormous opportunity for the future of business
intelligence.
• Having access to all types of relevant customer information –
structured, semi-structured and unstructured – is an essential
requirement for business intelligence (BI) to help enterprises get
ahead of the competition.
7. Business Intelligence needs NoSQL(2/2)
[Figure: data volume in zettabytes by year, 1970 to 2010, for two series: OLTP Data and Web App Data]
8. Hadoop vs NoSQL(1/2)
• Apache Hadoop is an open-source software framework that supports
data-intensive distributed applications, licensed under the Apache
v2 license. It enables applications to work with thousands of
computationally independent computers and petabytes of data.
• Hadoop components: MapReduce, HDFS.
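The MapReduce programming model can be illustrated without a cluster. The following is a minimal single-process sketch in Python of the map, shuffle and reduce phases for a word count. It does not use Hadoop's API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # In a real cluster the map and reduce calls run on different machines;
    # here they simply run one after another in the same process.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["NoSQL and Hadoop", "Hadoop and HDFS"])
print(counts)
```

In Hadoop the input lines would be read from HDFS blocks and the grouped pairs would be moved between nodes, but the three phases keep the same roles.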
10. NoSQL Categories(1/14)
• NoSQL databases can broadly be categorized into four types:
o Key-Value Databases,
o Document Databases,
o Column family stores,
o Graph Databases.
11. NoSQL Categories(2/14)
Key-value Databases
Characteristics
• One of the simplest types: a kind of distributed hash map designed to
store data without defining a schema,
• All data is stored as key/value pairs,
• The data is indexed by key, so a value cannot be accessed without
its key,
• Values are stored as opaque blobs, which removes the need to index
their content and improves performance,
• Communication relies on the CRUD operations, here GET, POST, PUT and
DELETE.
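The key-value characteristics can be sketched in a few lines of Python. This is a toy in-memory store, not the API of any real product: the class name `KeyValueStore` and its `put`/`get`/`delete` methods are assumptions chosen for illustration. Values are opaque bytes and can only be reached through their key.

```python
class KeyValueStore:
    """Toy in-memory key-value store: opaque values, lookup by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        # Create or overwrite; the store never inspects the value.
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is unknown.
        return self._data.get(key)

    def delete(self, key):
        # Returns True only if something was actually removed.
        if key in self._data:
            del self._data[key]
            return True
        return False

store = KeyValueStore()
store.put("cart:42", b'{"items": ["book", "pen"]}')
cart = store.get("cart:42")
removed = store.delete("cart:42")
```

Because the store does not look inside the value, finding "all carts that contain a pen" would require reading every key, which is the trade-off behind the simple model.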
12. NoSQL Categories(3/14)
Key-value Databases
Use cases and implementations
• Tracking transient attributes in a Web application, such as a shopping cart,
• Caching data from relational databases to improve performance,
• Storing configuration and user data information for mobile applications,
• Storing large objects, such as images and audio files.
14. NoSQL Categories(5/14)
Document Databases
Characteristics
• Extend the key/value paradigm: each unique key maps to a "document",
a more complex structure than a simple value,
• Documents are of JSON or XML type,
• Each document is an object containing one or more fields,
• Each field contains a typed value (string, date, or binary array).
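A small Python sketch can make the document model concrete. It assumes a hypothetical `products` collection held in a plain dictionary and an illustrative `find` helper; neither corresponds to a specific database's API. The two documents share a collection but have different fields, and one embeds a nested structure.

```python
import json
from datetime import date

# Two documents in the same (hypothetical) "products" collection.
# Each has a unique key and its own set of typed fields.
products = {
    "p1": {
        "name": "Laptop",
        "price": 1200.0,
        "released": date(2016, 3, 1).isoformat(),
        "specs": {"ram_gb": 8, "cpu": "i5"},  # embedded sub-document
    },
    "p2": {
        "name": "T-shirt",
        "price": 15.0,
        "sizes": ["S", "M", "L"],  # field that "p1" does not have
    },
}

def find(collection, field, value):
    """Return the ids of documents whose top-level field equals value."""
    return sorted(
        doc_id for doc_id, doc in collection.items() if doc.get(field) == value
    )

as_json = json.dumps(products["p1"], sort_keys=True)
matches = find(products, "name", "T-shirt")
```

Unlike a pure key-value store, the database can see inside each document, which is what makes field-level queries such as `find` possible.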
15. NoSQL Categories(6/14)
Document Databases
Use cases and implementations
• Back-end support for websites with high volumes of reads and writes,
• Managing data types with variable attributes, such as products,
• Tracking variable types of metadata,
• Applications that use JSON data structures,
• Applications benefiting from denormalization by embedding structures
within structures.
17. NoSQL Categories(8/14)
Column family stores
Characteristics
• Resemble an RDBMS table, but with a dynamic number of columns that can
differ from one record to another (so no columns filled with nulls),
• Offer very high performance and a highly scalable architecture.
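The idea of rows with a dynamic set of columns can be sketched with nested dictionaries in Python. This is only an illustration of the data layout (row key, then column family, then column), not how any particular column family store is implemented; the helper names `put` and `get_row` are assumptions.

```python
from collections import defaultdict

# Layout: row key -> column family -> {column name: value}
table = defaultdict(lambda: defaultdict(dict))

def put(row_key, family, column, value):
    """Write one column; rows never need to declare their columns up front."""
    table[row_key][family][column] = value

def get_row(row_key, family):
    """Return only the columns that were actually written for this row."""
    return dict(table[row_key][family])

put("user:1", "profile", "name", "Amira")
put("user:1", "profile", "email", "amira@example.com")
put("user:2", "profile", "name", "Sami")  # no email column at all, not a null

row1 = get_row("user:1", "profile")
row2 = get_row("user:2", "profile")
```

The second row simply lacks the `email` column, whereas a relational table would store a NULL in that position.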
18. NoSQL Categories(9/14)
Column family stores
Use cases and implementations
• Applications that require the ability to always write to the database,
• Applications that are geographically distributed over multiple data
centers,
• Applications with dynamic fields,
• Applications with the potential for truly large volumes of data, such as
hundreds of terabytes,
• Applications that can tolerate some short-term inconsistency in replicas.
20. NoSQL Categories(11/14)
Graph Databases
Characteristics
• Based on graph theory,
• Rely on nodes, relationships and the properties attached to
them,
• Designed for data whose relations are represented as graphs, with
interconnected elements and an unknown number of relationships
between them,
• Suitable for processing social network data.
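A minimal Python sketch of the graph model, assuming a tiny made-up social network: nodes carry properties, edges are typed relationships, and a breadth-first traversal follows one relationship type across any number of hops. The names `nodes`, `edges`, `neighbours` and `reachable` are illustrative, not a real graph database API.

```python
from collections import deque

# Nodes with properties, and typed relationships between them.
nodes = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
    "post1": {"type": "post"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "LIKES", "post1"),
]

def neighbours(node, relation):
    """Direct targets of `node` through one relationship type."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

def reachable(start, relation):
    """Breadth-first traversal: everything reachable over `relation` edges."""
    seen, queue = set(), deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, relation):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

network = reachable("alice", "FOLLOWS")
```

The traversal does not need to know in advance how many hops separate two people, which is the kind of query that becomes awkward with fixed relational joins.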
21. NoSQL Categories(12/14)
Graph Databases
Use cases and implementations
• Recommending products and services,
• Business process management,
• Network and IT infrastructure management,
• Identity and access management,
• Social networking.
23. NoSQL Categories(14/14)
[Table: comparison of the four models (Key-value store, Column store, Document DB, Graph DB) on Performance, Scalability, Flexibility and Complexity; rating scale: Good / Not Bad / Bad]
25. FAQ
NoSQL databases
• Performance on large data volumes,
• Performance on unstructured data,
• Scalability is very important, even at low volumes.
However
• Fairly young technology, with a lack of supporting tools,
• Still evolving, no standard,
• No common query language like SQL, so:
o More work is needed to define queries,
o Database-specific query languages (e.g. Cassandra Query Language),
o APIs based on MapReduce or object graphs.
26. Scientific research news(1/2)
[Jie Xu et al., 2016] ZQL: A Unified Middleware Bridging Both
Relational and NoSQL Databases,
[María Teresa González-Aparicio et al., 2016] A New Model for
Testing CRUD Operations in a NoSQL Database,
[Francesca Bugiotti et al., 2015] How I Learned to Stop
Worrying and Love NoSQL Databases.
27. Scientific research news(2/2)
[Saumaya Goyal et al., 2016] An overview of hybrid databases,
[Suna Yin et al., 2016] STNoSQL: Creating NoSQL database on
the Sensible Things platform.