1) The document discusses the differences between SQL and NoSQL databases in terms of scalability, data modeling, and indexing. SQL databases are less scalable but ensure consistency and transactions, while NoSQL databases are more scalable through replication and sharding.
2) Complex applications may require a hybrid approach using both SQL and NoSQL databases. For example, storing product data in a NoSQL database and customer relationship management data in a SQL database.
3) There is no single best approach - the optimal solution depends on the specific business needs and data usage patterns. SQL and NoSQL databases each have their own advantages, and either can be suitable depending on the context.
SQL VS NOSQL: DEEP DIVE
As discussed in the previous topic, I assume you have now built your team and that it
includes someone versatile in both SQL and NoSQL. That person can get real value out of
this article!
Caution: This article is not intended for beginners or amateurs.
Warm Up:
I will start with a warm-up example before we dive into the comparison.
The following figure is a simple demonstration comparing an RDBMS table with a simple
graph-structured database for friendship relations, as introduced by Healey. On the left
is the schemaless structure, and on the right is how it can be extended to a conventional
structured schema.
Before diving deep, we can also review the basics with a simple example of each
representation.
Schemaless simply means that two documents (a NoSQL data structure we will discuss) can
have different fields, or common fields that store different types of data, as in this
example:
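// "Type" exists only in the second document, and "Manufactured" is a
// number in the first document but a string in the second.
var cars = [
  { Model: "BMW", Color: "Red", Manufactured: 2016 },
  { Model: "Mercedes", Type: "Coupe", Color: "Black", Manufactured: "1-1-2017" }
];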
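To make the consequence of a schemaless model concrete, here is a minimal sketch that reports which value types each field takes across a set of documents. The helper name describeFields and the variable sampleCars are hypothetical, introduced only for illustration; they are not part of any database API.

```javascript
// Hypothetical helper: for every field seen in the documents, collect the
// JavaScript types of the values stored under that field.
function describeFields(docs) {
  const types = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      if (!types[field]) types[field] = new Set();
      types[field].add(typeof value);
    }
  }
  // Convert each Set to a sorted array so the report is stable.
  const report = {};
  for (const field of Object.keys(types)) {
    report[field] = [...types[field]].sort();
  }
  return report;
}

// Same shape as the cars example above.
const sampleCars = [
  { Model: "BMW", Color: "Red", Manufactured: 2016 },
  { Model: "Mercedes", Type: "Coupe", Color: "Black", Manufactured: "1-1-2017" }
];

const fieldTypes = describeFields(sampleCars);
console.log(fieldTypes);
```

A report like this shows that application code reading Manufactured has to handle both numbers and strings, which is exactly the flexibility (and the burden) that a schemaless store brings.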
The basic difference in representation between common SQL and NoSQL technologies is the
support for JOIN in SQL. It is as simple as joining two tables, like this:
SELECT Orders.OrderID, Customers.Name, Orders.Date
FROM Orders
INNER JOIN Customers
ON Orders.CustID = Customers.CustID
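Where a document store offers no join, the same result is often assembled in application code. The sketch below reproduces the INNER JOIN above over two in-memory arrays; the sample records and the function name innerJoinOrders are assumptions made for illustration only.

```javascript
// Illustrative data standing in for two collections.
const customers = [
  { CustID: 1, Name: "Alice" },
  { CustID: 2, Name: "Bob" }
];
const orders = [
  { OrderID: 100, CustID: 1, Date: "2017-01-05" },
  { OrderID: 101, CustID: 2, Date: "2017-01-06" },
  { OrderID: 102, CustID: 3, Date: "2017-01-07" } // no matching customer
];

// Hypothetical helper: hash join on CustID, keeping only matching pairs
// (the semantics of INNER JOIN).
function innerJoinOrders(orderDocs, customerDocs) {
  const byId = new Map(customerDocs.map(c => [c.CustID, c]));
  const rows = [];
  for (const o of orderDocs) {
    const c = byId.get(o.CustID);
    if (c) rows.push({ OrderID: o.OrderID, Name: c.Name, Date: o.Date });
  }
  return rows;
}

const joined = innerJoinOrders(orders, customers);
console.log(joined);
```

The order with CustID 3 is dropped because it has no matching customer, just as the SQL INNER JOIN would drop it.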
One major difference between NoSQL and SQL is that NoSQL databases scale out more
readily than SQL databases. MongoDB, for example, has built-in support for replication
and sharding (horizontal partitioning of data) to support this scalability. SQL
databases, on the other hand, are less scalable, but they ensure very high consistency
and they support fault tolerance, built-in journaling and transaction management.
That sounds simple so far, but I assume you have refreshed your skill set and we can
speak more technically. Let’s look at this from the point of view of compaction. The
compaction work recently proposed for NoSQL by Ghosh and Gupta shows that it is very
challenging to handle the continuous generation of sstables (sorted string tables); an
sstable is a file of key-value string pairs sorted by key. The continuous generation of
sstables at a server over time causes a read operation to contact multiple sstables,
creating a disk I/O bottleneck for reads. As a result, reads are slower than writes in
these NoSQL databases, and for that reason NoSQL systems run compaction protocols in the
background. Compaction merges multiple sstables into a single sstable by merge-sorting
their keys, and finding the optimal order in which to perform those merges is NP-hard!
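The merge step itself can be sketched as follows. This is a minimal illustration and not the algorithm of Ghosh and Gupta: it assumes each sstable is an in-memory array of [key, value] pairs sorted by key, that the tables are listed oldest first, and that the newest value wins when a key appears more than once. The function names mergeTwo and compact are hypothetical.

```javascript
// Merge two key-sorted tables; on equal keys keep the entry from `newer`.
function mergeTwo(older, newer) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < older.length && j < newer.length) {
    const oldKey = older[i][0];
    const newKey = newer[j][0];
    if (oldKey < newKey) {
      out.push(older[i++]);
    } else if (oldKey > newKey) {
      out.push(newer[j++]);
    } else {
      out.push(newer[j++]); // same key: the newer value overrides
      i++;
    }
  }
  while (i < older.length) out.push(older[i++]);
  while (j < newer.length) out.push(newer[j++]);
  return out;
}

// Naive compaction: fold the tables together from oldest to newest.
function compact(sstables) {
  return sstables.reduce((merged, table) => mergeTwo(merged, table), []);
}

const compacted = compact([
  [["a", 1], ["c", 3]],
  [["b", 2], ["c", 30]],
  [["a", 100], ["d", 4]]
]);
console.log(compacted);
```

Each individual merge is cheap and linear in the size of its inputs. The hard part mentioned above is choosing which tables to merge in which order so that the total I/O across all merges is minimal; this sketch simply merges left to right.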
SQL, apparently, does not have this complexity deep inside; it has a simpler
implementation, but that comes at the cost of scalability: SQL databases are less
scalable.
Scalability:
Now take a deep breath! We are still near the surface, but at least we can now see the
corals, and we have an overview of scalability for both. To discuss this further, I
will turn to Cattell’s proposed classification of the different data stores.
Data store type | Use case | Example | Hints/Recommendations
--- | --- | --- | ---
Key-value Store | Simple applications with only one kind of object, where you only need to look up objects based on one attribute. | Facebook’s user home page’s live updates. | Developer familiarity with memcached is recommended. Move to a document store if you intend to make lookups based on multiple attributes.
Document Store | Applications with multiple different kinds of objects. | A Department of Motor Vehicles application (with vehicles and drivers), where you need to look up objects based on multiple fields (say, a driver’s name, license number, owned vehicle, or birth date). | Use it when you can tolerate an “eventually consistent” model with limited atomicity and isolation. A “quorum-read” mechanism is recommended for up-to-date, atomically consistent data.
Extensible Record Store | Higher throughput and stronger concurrency, at the cost of slightly more complexity than a document store. | eBay-style applications: partitioning data both vertically and horizontally for storing customer information on an ... | HBase or Hypertable is used for this partitioning and for making it easily extensible.
Scalable RDBMS | Using ACID semantics to free developers from dealing with locks, out-of-date data, update collisions, and consistency. | Applications which do not demand updates or joins that span many nodes. | Use MySQL Cluster, VoltDB or Clustrix, as they were benchmarked for improved scalability.
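The difference between the first two rows of the table can be illustrated with a short sketch. It uses plain in-memory structures rather than any real database client; the sample records and the helper name find are assumptions made for illustration.

```javascript
// Key-value style: one kind of object, looked up by exactly one key.
const sessions = new Map();
sessions.set("user:42", { name: "Dana", lastSeen: "2017-01-01" });
const session = sessions.get("user:42");

// Document style: different kinds of objects in one collection,
// looked up by any combination of fields.
const dmv = [
  { kind: "driver", name: "Dana", license: "D-1001", birthDate: "1980-02-03" },
  { kind: "driver", name: "Eli", license: "D-1002", birthDate: "1975-07-09" },
  { kind: "vehicle", plate: "ABC-123", owner: "D-1001" }
];

// Hypothetical helper: return documents whose fields match all criteria.
function find(docs, criteria) {
  return docs.filter(doc =>
    Object.entries(criteria).every(([field, value]) => doc[field] === value));
}

const byLicense = find(dmv, { kind: "driver", license: "D-1002" });
const ownedVehicles = find(dmv, { kind: "vehicle", owner: "D-1001" });
console.log(session, byLicense, ownedVehicles);
```

With the Map, any lookup other than by key means scanning every value yourself, which is the point at which the table recommends moving to a document store.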
Now it is important to know the benchmarking KPIs for this scalability assessment.
The benchmarking pivots on three main axes: concurrency, data storage and replication.
Concurrency hinges on the locking mechanism, multi-version concurrency control (MVCC)
and ACID. Data storage is either in-memory (RAM-based) or disk-based. Replication is
either synchronous or asynchronous.
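The two replication types can be contrasted with a small, deterministic model. This is only a sketch under simplifying assumptions: there is no network, the classes Primary and Replica are hypothetical, and the asynchronous shipping is modelled by an explicit flush() call instead of a background process.

```javascript
class Replica {
  constructor() {
    this.data = new Map();
  }
  apply(key, value) {
    this.data.set(key, value);
  }
}

class Primary {
  constructor(replicas, mode) {
    this.data = new Map();
    this.replicas = replicas;
    this.mode = mode;  // "sync" or "async"
    this.pending = []; // writes not yet shipped (async mode only)
  }
  write(key, value) {
    this.data.set(key, value);
    if (this.mode === "sync") {
      // Synchronous: every replica applies the write before the acknowledgement.
      for (const replica of this.replicas) replica.apply(key, value);
    } else {
      // Asynchronous: acknowledge immediately and ship the write later.
      this.pending.push([key, value]);
    }
    return "ack";
  }
  flush() {
    for (const [key, value] of this.pending) {
      for (const replica of this.replicas) replica.apply(key, value);
    }
    this.pending = [];
  }
}

const syncReplica = new Replica();
new Primary([syncReplica], "sync").write("x", 1);
const syncSeesWrite = syncReplica.data.has("x");

const asyncReplica = new Replica();
const asyncPrimary = new Primary([asyncReplica], "async");
asyncPrimary.write("x", 1);
const staleBeforeFlush = !asyncReplica.data.has("x");
asyncPrimary.flush();
const caughtUp = asyncReplica.data.get("x") === 1;
console.log(syncSeesWrite, staleBeforeFlush, caughtUp);
```

The synchronous replica is never behind an acknowledged write, at the price of a slower write path. The asynchronous replica can serve stale data until the pending writes arrive.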
The conclusion from this kind of benchmarking and testing was that clustered SQL
databases showed promising performance per node as well as the capacity to perform at
scale. Hence scalable SQL RDBMSs still have some competitive advantage over NoSQL data
stores because of the convenience of higher-level SQL and the ACID properties, but you
are about to lose this advantage if your operations span many nodes.
NoSQL Characteristics:
NoSQL databases are distinguished from SQL databases by three main characteristics,
which follow from the CAP theorem developed by computer scientist Eric Brewer. It states
that “it is impossible for a distributed computer system to simultaneously provide more
than two out of three of the following guarantees: Consistency. Availability. Partition
tolerance.”
Indexing:
“I'm trying to create indexes on a table with 308 million rows. It took ~20 minutes to load the
table but 10 days to build indexes on it.” (MySQL bug #9544)
“Select queries were slow until I added an index onto the timestamp field... Adding the index
really helped our reporting, BUT now the inserts are taking forever.” (A comment on
mysqlperformanceblog.com)
SQL databases typically use the B-Tree index structure, which is well known across almost all
DBMSs. NoSQL databases, on the other hand, often use key/value pair index structures or
T-trees. To get a better grasp of the T-tree, the next diagram shows its structure, with the
pointers in each T-node.
In some cases, as in MongoDB, B-Tree indexing is used together with memory-mapped files.
Otoo, Nimako and Kwofie contributed more advanced indexing algorithms, such as the O2-Tree
for in-memory databases.
In general, column store indexes are not the same as traditional indexes; they are more like
pre-aggregated statistics. The column store index structure was introduced in SQL Server 2012,
and it requires you to specify which fields you want to index.
However, it is fair to say that NoSQL systems absolutely need indexing, as their structure is
far more fragmented than that of SQL databases. Cassandra, for its part, has introduced
secondary indexing on top of its single clustering index. In the next diagram, Victoria Malaya
summarizes the correspondence between SQL Server and MongoDB indexing: SQL Server indexes at
the column level, while MongoDB indexes at the collection level and supports indexing on any
field or subfield of the documents in a MongoDB collection.
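Indexing on a subfield can be pictured with the following sketch. It is not how MongoDB implements its indexes (those are B-Trees, as noted above); it uses a plain hash map purely to show the idea of mapping a nested field's value to the documents that contain it. The names getPath and buildIndex and the sample documents are hypothetical.

```javascript
// Resolve a dotted path such as "address.city" inside a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]), doc);
}

// Build a map from field value to the positions of matching documents.
function buildIndex(collection, path) {
  const index = new Map();
  collection.forEach((doc, position) => {
    const key = getPath(doc, path);
    if (key === undefined) return; // documents without the field are skipped
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(position);
  });
  return index;
}

const people = [
  { name: "Dana", address: { city: "Cairo" } },
  { name: "Eli", address: { city: "Berlin" } },
  { name: "Farid", address: { city: "Cairo" } },
  { name: "Gil" } // no address at all: allowed in a schemaless collection
];

const cityIndex = buildIndex(people, "address.city");
const inCairo = cityIndex.get("Cairo").map(position => people[position].name);
console.log(inCairo);
```

Once the index exists, a lookup by city no longer scans the collection, but every insert or update now has to maintain the index as well, which is the trade-off the quotes at the start of this section complain about.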
Business state of the art (Hybrid):
The current state of the art and the business need may require a hybrid structure that includes
NoSQL alongside SQL. A nice representation of this was introduced by Moniruzzaman.
However, there is a big trend toward moving to NoSQL, driven by the rigid schemas and lack of
flexibility of SQL, its higher latency and lower performance compared to NoSQL, and its
inability to scale out data with the same power.
The authors of SQLite and CouchDB have proposed UnQL as an attempt to create a hybrid
SQL-NoSQL query language. UnQL is based on SQL, with an extension to query NoSQL data.
Complex use cases of SQL vs NoSQL:
Now I will cover a couple of examples to show the difference in implementation for both.
1- The SQL (relational database) vs the NoSQL (document and graph database)
The designs below are for an app built for testing purposes that acts as a social networking
portal. It provides the common social media features (grouping friends, private messaging,
microblogging, rating and writing comments, adding tags to topics...). To give a sense of how
this can look, here is a high-level snapshot of the three proposed designs.
1- Relational database (SQL, PostgreSQL)
2- Document database (NoSQL, MongoDB)
3- Graph database (NoSQL, Neo4j)
Many performance test cases were later applied, using execution time as the benchmark.
That is out of our scope, though; this is only to demonstrate the idea of the design.
2- Image processing
With NoSQL databases like MongoDB, you can do image processing using Node.js modules such as
sharp, Jimp and many more, and this is the mainstream approach nowadays. What may be more
interesting is using SQL for image processing; you can use something like PixQL, an
SQL-inspired command-line image processing tool.
You can also use pure SQL for image processing, but this is more complicated and you will need
a more advanced design like this one:
These diagrams represent the OLTP transactional layer and the ROLAP cube used to store
information about objects.
An example query to calculate image histogram information from the cube:
SELECT
    FK_OBJECTS_ID,
    1 AS CHANNEL,
    RED AS BRIGHTNESS,
    1 AS IMG_AREA,
    COUNT(*) AS VALUE
FROM
    FACT_IMAGES
GROUP BY
    FK_OBJECTS_ID,
    RED
UNION ALL
SELECT
    FK_OBJECTS_ID,
    2 AS CHANNEL,
    GREEN AS BRIGHTNESS,
    1 AS IMG_AREA,
    COUNT(*) AS VALUE
FROM
    FACT_IMAGES
GROUP BY
    FK_OBJECTS_ID,
    GREEN
UNION ALL
SELECT
    FK_OBJECTS_ID,
    3 AS CHANNEL,
    BLUE AS BRIGHTNESS,
    1 AS IMG_AREA,
    COUNT(*) AS VALUE
FROM
    FACT_IMAGES
GROUP BY
    FK_OBJECTS_ID,
    BLUE
UNION ALL
SELECT
    FK_OBJECTS_ID,
    4 AS CHANNEL,
    ALPHA AS BRIGHTNESS,
    1 AS IMG_AREA,
    COUNT(*) AS VALUE
FROM
    FACT_IMAGES
GROUP BY
    FK_OBJECTS_ID,
    ALPHA
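For comparison, the same per-channel histogram can be sketched in application code. The sketch assumes a single image given as an array of pixel objects with red, green, blue and alpha fields, so the grouping by FK_OBJECTS_ID is left out; the function name channelHistogram and the pixel data are hypothetical.

```javascript
// Count, for each channel (1 = red, 2 = green, 3 = blue, 4 = alpha),
// how many pixels have each brightness value.
function channelHistogram(pixels) {
  const channels = ["red", "green", "blue", "alpha"];
  const counts = new Map(); // key format: "channelNumber:brightness"
  for (const pixel of pixels) {
    channels.forEach((name, i) => {
      const key = `${i + 1}:${pixel[name]}`;
      counts.set(key, (counts.get(key) || 0) + 1);
    });
  }
  return counts;
}

const pixels = [
  { red: 255, green: 0, blue: 0, alpha: 255 },
  { red: 255, green: 128, blue: 0, alpha: 255 },
  { red: 10, green: 128, blue: 0, alpha: 255 }
];

const histogram = channelHistogram(pixels);
console.log(histogram);
```

Each map entry corresponds to one row of the SQL result: the channel number and brightness form the key, and the count is the VALUE column.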
CRM Application:
Another application that uses both is a company that splits its data across the two database
types: the product information and related data in a NoSQL database, and the CRM data in an
SQL database.
This topic opens up a lot of discussion, as there is a big debate about this data bridging:
dividing the same application’s data over SQL and NoSQL creates a gap in terms of data access.
Conclusion:
Now that we have explored the corals, you should have clearly noticed that favoring
SQL over NoSQL, or using both alongside each other, depends on your business context
and the utilization of the resources you have. Every application and every business
demand has its own feasibility study and criteria. So, there is no battle between
SQL and NoSQL; it is a harmony!