A Beginners Guide to noSQL

Mike Crabb
Mike CrabbLecturer at University of Dundee
THE BEGINNERS GUIDE TO
noSQL
THE
WHY WE ARE STORING MORE DATA
NOW THAN WE EVER HAVE
BEFORE
THE
WHY WE ARE STORING MORE DATA
NOW THAN WE EVER HAVE
BEFORE
CONNECTIONS BETWEEN OUR
DATA ARE GROWING ALL THE
TIME
THE
WHY WE ARE STORING MORE DATA
NOW THAN WE EVER HAVE
BEFORE
CONNECTIONS BETWEEN OUR
DATA ARE GROWING ALL THE
TIME
WE DON’T MAKE THINGS
KNOWING THE STRUCTURE
FROM DAY 1
THE
WHY WE ARE STORING MORE DATA
NOW THAN WE EVER HAVE
BEFORE
CONNECTIONS BETWEEN OUR
DATA ARE GROWING ALL THE
TIME
WE DON’T MAKE THINGS
KNOWING THE STRUCTURE
FROM DAY 1
SERVER ARCHITECTURE IS NOW
AT A STAGE WHERE WE CAN
TAKE ADVANTAGE OF IT
salary lists
most web applications
social networks
semantic trading
SiZE
Complexity
relational databases
NOSQL
USE CASES
LARGE DATA VOLUMES
MASSIVELY DISTRIBUTED ARCHITECTURE
REQUIRED TO STORE THE DATA
GOOGLE, AMAZON, FACEBOOK, 100K SERVERS
NOSQL
USE CASES
LARGE DATA VOLUMES
MASSIVELY DISTRIBUTED ARCHITECTURE
REQUIRED TO STORE THE DATA
GOOGLE, AMAZON, FACEBOOK, 100K SERVERS
EXTREME QUERY WORKLOAD
IMPOSSIBLE TO EFFICIENTLY DO JOINS AT THAT
SCALE WITH AN RDBMS
NOSQL
USE CASES
LARGE DATA VOLUMES
MASSIVELY DISTRIBUTED ARCHITECTURE
REQUIRED TO STORE THE DATA
GOOGLE, AMAZON, FACEBOOK, 100K SERVERS
EXTREME QUERY WORKLOAD
IMPOSSIBLE TO EFFICIENTLY DO JOINS AT THAT
SCALE WITH AN RDBMS
SCHEMA EVOLUTION
SCEMA FLEXIBILITY IS NOT TRIVIAL AT A LARGE
SCALE BUT IT CAN BE WITH NO SQL
NOSQL
PROS AND CONS
PROS
MASSIVE SCALABILITY
HIGH AVAILABILITY
LOWER COST
SCHEMA FLEXIBILITY
SPARCE AND SEMI STRUCTURED DATA
NOSQL
PROS AND CONS
PROS
MASSIVE SCALABILITY
HIGH AVAILABILITY
LOWER COST
SCHEMA FLEXIBILITY
SPARCE AND SEMI STRUCTURED DATA
CONS
LIMITED QUERY CAPABILITIES
NOT STANDARDISED (PORTABILITY MAY BE AN ISSUE)
STILL A DEVELOPING TECHNOLOGY
OSQL NOSQL NOSQL NOSQL
QL BIGTABLE NOSQL NOSQL
QL NOSQL NOSQL NOSQL N
OSQL NOSQL KEY VALUE NO
SQL NOSQL NOSQL NOSQL N
NOSQL NOSQL NOSQL NOS
NOSQL NOSQL NOSQL NOSQ
QL NOSQL NOSQL NOSQL NO
GRAPHDB NOSQL NOSQL N
NOSQL NOSQL NOSQL NOS
OSQL NOSQL NOSQL NOSQL
SQL NOSQL DOCUMENT NOS
FOUREMERGING TRENDS IN
NOSQL DATABASES
BUT FIRST…
IMAGINE A LIBRARY
LOTS OF DIFFERENT FLOORS
DIFFERENT SECTIONS ON EACH FLOOR
DIFFERENT BOOKSHELVES IN EACH SECTION
LOTS OF BOOKS ON EACH SHELF
LOTS OF PAGES IN EACH BOOK
LOTS OF WORDS ON EACH PAGE
EVERYTHING IS WELL ORGANISED
AND EVERYTHING HAS A SPACE
BUT FIRST…
IMAGINE A LIBRARY
WHAT HAPPENS IF WE
BUY TOO MANY BOOKS!?
(THE WORLD EXPLODES AND THE KITTENS WIN)
BUT FIRST…
IMAGINE A LIBRARY
WHAT HAPPENS IF WE WANT TO
STORE CDS ALL OF A SUDDEN!?
(THE WORLD EXPLODES AND THE KITTENS WIN)
BUT FIRST…
IMAGINE A LIBRARY
WHAT HAPPENS IF WE WANT
TO GET RID OF ALL BOOKS
THAT MENTION KITTENS
(KITTENS STILL WIN)
BIG
BEHAVES LIKE A STANDARD RELATIONAL
DATABASE BUT WITH A SLIGHT CHANGE
http://research.google.com/archive/bigtable.html
http://research.google.com/archive/spanner.html
DESIGNED TO WORK WITH A LOT OF
DATA…A REALLY BIG CRAP TON
CREATED BY GOOGLE AND NOW USED
BY LOTS OF OTHERS
TABLE
THIS IS A STANDARD
RELATIONAL
DATABASE
BIG
TABLE
THIS IS A BIG
TABLE DATABASE
(AND NOW THE NAME MAKES SENCE!)
BIG
TABLE
“A Bigtable is a sparse, distributed, persistent
multidimensional sorted map. The map is indexed by a
row key, column key, and a timestamp; each value in
the map is an uninterpreted array of bytes.”
BIG
TABLE
“A Bigtable is a sparse, distributed, persistent
multidimensional sorted map. The map is indexed by a
row key, column key, and a timestamp; each value in
the map is an uninterpreted array of bytes.”
BIG
TABLE
“A Bigtable is a sparse, distributed, persistent
multidimensional sorted map. The map is indexed by a
row key, column key, and a timestamp; each value in
the map is an uninterpreted array of bytes.”
KEY
VALUE
AGAIN, DESIGNED TO WORK WITH A LOT
OF DATA
EACH BIT OF DATA IS STORED IN A
SINGLE COLLECTION
EACH COLLECTION CAN HAVE DIFFERENT
TYPES OF DATA
KEY
VALUE
A CB D E
KEY
VALUE
A C D E
OUR VALUES ARE HIDDEN INSIDE THE KEYS
TO FIND OUT WHAT THEY ARE WE NEED TO
QUERY THEM
What is in Key B?
The Triangle
B
KEY
VALUE
(VOLDERMORT)
DOCUMENT
STORE
DESIGNED TO WORK WITH A LOT OF
DATA (BEGINNING TO NOTICE A THEME?)
VERY SIMILAR TO A KEY VALUE DATABASE
MAIN DIFFERENCE IS THAT YOU CAN
ACTUALLY SEE THE VALUES
DOCUMENT
STORE
A CB D E
DOCUMENT
STORE
A CB D E
Bring me the triangles
Yes m’lord.
SIDENOTE
REMEMBER HOW SQL
DATABASES ARE LIBRARIES?
NO SQL IS MORE LIKE A BAG
OF CATS!
SIDENOTE
colour: tabby
name: Gunther
colour: ginger
name: Mylo
colour: grey
name: Ruffus
age: kitten
colour: ginger(ish)
name: Fred
age: kitten
colour: ginger(ish)
name: Quentin
legs: 3
WE CAN ADD IN
FIELDS AS AND
WHEN WE
NEED THEM
DOCUMENT
STORE
A CB D E
Bring me the KITTENS!
Of course m’lord.
DOCUMENT
STORE
GRAPH
DATABASE
FOCUS HERE IS ON MODELLING THE
STRUCTURE OF THE DATA
INSPIRED BY GRAPH THEORY (GO MATHS!)
SCALES REALLY WELL TO THE
STRUCTURE OF THE DATA
GRAPH
DATABASE
GRAPH
DATABASE
GRAPH
DATABASE
WORKS_WITH
WORKS_WITH
OWNS
OWNS
CARSHARES IN
GRAPH
DATABASE
name: “Michael”
twitter: “@mrmike
name: “John”
twitter:”@mrjohn”
brand: “Toyota”
currentState: “Broken”
brand: “Vauxhall”
currentState: “Working”
WORKS_WITH
WORKS_WITH
OWNS
OWNS
CARSHARES IN
GRAPH
DATABASE
name: “Michael”
twitter: “@mrmike
name: “John”
twitter:”@mrjohn”
brand: “Toyota”
currentState: “Broken”
brand: “Vauxhall”
currentState: “Working”
WORKS_WITH
WORKS_WITH
OWNS
propertyType: “car”
OWNS
propertyType: “car”
CARSHARES IN
GRAPH
DATABASE
key/value store
bigtable clone
document database
graph database
SiZE
Complexity
key/value store
bigtable clone
document database
graph database
SiZE
Complexity
>90% of use cases
WHEN TO USE
NOSQL
AND WHEN TO USE
SQL
THE BASICS
High availability and disaster recovery are a must
Understand the pros and cons of each design model
Don’t pick something just because it is new
Do you remember the zune?
Don’t pick something based JUST on performance
SQL
High performance for transactions. Think ACID
Highly structured, very portable
Small amounts of data
SMALL IS LESS THAN 500GB
Supports many tables with different types of data
Can fetch ordered data
Compatible with lots of tools
THE GOOD
ATOMICITY
CONSISTENCY
ISOLATION
DURABILITY
SQL
SQL
High performance for transactions. Think ACID
Highly structured, very portable
Small amounts of data
SMALL IS LESS THAN 500GB
Supports many tables with different types of data
Can fetch ordered data
Compatible with lots of tools
THE GOOD
SQL
Complex queries take a long time
The relational model takes a long time to learn
Not really scalable
Not suited for rapid development
THE BAD
noSQL
Fits well for volatile data
High read and write throughput
Scales really well
Rapid development is possible
In general it’s faster than SQL
THE GOOD
BASICALLY
AVAILABLE
SOFT STATE
EVENTUALLY CONSISTENT
noSQL
noSQL
Fits well for volatile data
High read and write throughput
Scales really well
Rapid development is possible
In general it’s faster than SQL
THE GOOD
noSQL
Key/Value pairs need to be packed/unpacked all the time
Still working on getting security for these working as well as SQL
Lack of relations from one key to another
THE GOOD
tl;dr
so use both, but think about when you want to use them!
works great, can’t scale for large data
works great, doesn't fit all situations
SQL
noSQL
A lot of this content is loving ripped from
lots of other (more impressive)
presentations that are already on
SlideShare - you should check them out!
FINALLY
1 of 53

Recommended

3D: DBT using Databricks and Delta by
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and DeltaDatabricks
2.1K views39 slides
Change Data Feed in Delta by
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in DeltaDatabricks
1.6K views21 slides
Building a Streaming Microservice Architecture: with Apache Spark Structured ... by
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Databricks
1.6K views28 slides
Ray: Enterprise-Grade, Distributed Python by
Ray: Enterprise-Grade, Distributed PythonRay: Enterprise-Grade, Distributed Python
Ray: Enterprise-Grade, Distributed PythonDatabricks
797 views26 slides
Apache doris (incubating) introduction by
Apache doris (incubating) introductionApache doris (incubating) introduction
Apache doris (incubating) introductionleanderlee2
831 views33 slides
Scalability, Availability & Stability Patterns by
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
515.7K views196 slides

More Related Content

What's hot

Understanding Query Plans and Spark UIs by
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsDatabricks
4.7K views50 slides
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... by
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...Databricks
35.3K views54 slides
Data Lineage with Apache Airflow using Marquez by
Data Lineage with Apache Airflow using Marquez Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez Willy Lulciuc
3.1K views58 slides
Building Data-Centric Businesses by
Building Data-Centric BusinessesBuilding Data-Centric Businesses
Building Data-Centric BusinessesThoughtworks
33.8K views109 slides
SQL Analytics Powering Telemetry Analysis at Comcast by
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastDatabricks
341 views35 slides
Best Practices for Building and Deploying Data Pipelines in Apache Spark by
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkDatabricks
1.9K views27 slides

What's hot(20)

Understanding Query Plans and Spark UIs by Databricks
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
Databricks4.7K views
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... by Databricks
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Databricks35.3K views
Data Lineage with Apache Airflow using Marquez by Willy Lulciuc
Data Lineage with Apache Airflow using Marquez Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez
Willy Lulciuc3.1K views
Building Data-Centric Businesses by Thoughtworks
Building Data-Centric BusinessesBuilding Data-Centric Businesses
Building Data-Centric Businesses
Thoughtworks33.8K views
SQL Analytics Powering Telemetry Analysis at Comcast by Databricks
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at Comcast
Databricks341 views
Best Practices for Building and Deploying Data Pipelines in Apache Spark by Databricks
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Databricks1.9K views
DevOps for Databricks by Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for Databricks
Databricks1.1K views
Relational to Big Graph by Neo4j
Relational to Big GraphRelational to Big Graph
Relational to Big Graph
Neo4j85.4K views
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P... by Databricks
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Databricks1K views
개발자를 위한 (블로그) 글쓰기 intro by Seongyun Byeon
개발자를 위한 (블로그) 글쓰기 intro개발자를 위한 (블로그) 글쓰기 intro
개발자를 위한 (블로그) 글쓰기 intro
Seongyun Byeon15.6K views
Building flexible ETL pipelines with Apache Camel on Quarkus by Ivelin Yanev
Building flexible ETL pipelines with Apache Camel on QuarkusBuilding flexible ETL pipelines with Apache Camel on Quarkus
Building flexible ETL pipelines with Apache Camel on Quarkus
Ivelin Yanev395 views
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa... by DataStax
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
DataStax6.6K views
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud by Noritaka Sekiyama
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama33.3K views
[KAIST 채용설명회] 데이터 엔지니어는 무슨 일을 하나요? by Juhong Park
[KAIST 채용설명회] 데이터 엔지니어는 무슨 일을 하나요?[KAIST 채용설명회] 데이터 엔지니어는 무슨 일을 하나요?
[KAIST 채용설명회] 데이터 엔지니어는 무슨 일을 하나요?
Juhong Park2.6K views
Massive Data Processing in Adobe Using Delta Lake by Databricks
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks719 views
Operating and Supporting Delta Lake in Production by Databricks
Operating and Supporting Delta Lake in ProductionOperating and Supporting Delta Lake in Production
Operating and Supporting Delta Lake in Production
Databricks333 views
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn... by Simplilearn
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Simplilearn1.1K views
MLflow Model Serving by Databricks
MLflow Model ServingMLflow Model Serving
MLflow Model Serving
Databricks1.4K views

Viewers also liked

Cloudera Impala: A modern SQL Query Engine for Hadoop by
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
6.6K views17 slides
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup by
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetupiwrigley
3.9K views59 slides
Big Data Standards - Workshop, ExpBio, Boston, 2015 by
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Susanna-Assunta Sansone
2.3K views36 slides
mini MAXI art exhibition by
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibitionAnna Casey
2.1K views67 slides
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017 by
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
1.7K views17 slides
Introduction to NoSQL Databases by
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
47.7K views36 slides

Viewers also liked(7)

Cloudera Impala: A modern SQL Query Engine for Hadoop by Cloudera, Inc.
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera, Inc.6.6K views
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup by iwrigley
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
iwrigley3.9K views
mini MAXI art exhibition by Anna Casey
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibition
Anna Casey2.1K views
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017 by Stefan Lipp
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Stefan Lipp1.7K views
Introduction to NoSQL Databases by Derek Stainer
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
Derek Stainer47.7K views
Enabling the Industry 4.0 vision: Hype? Real Opportunity! by Boris Otto
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Boris Otto2.5K views

Similar to A Beginners Guide to noSQL

PostgreSQL, your NoSQL database by
PostgreSQL, your NoSQL databasePostgreSQL, your NoSQL database
PostgreSQL, your NoSQL databaseReuven Lerner
1.4K views47 slides
Oracle's Take On NoSQL by
Oracle's Take On NoSQLOracle's Take On NoSQL
Oracle's Take On NoSQLAlexander Shopov
3.4K views130 slides
Introduction to Linked Data 1/5 by
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Juan Sequeda
1.9K views79 slides
Super resize me by
Super resize meSuper resize me
Super resize meAxel Valdez
521 views39 slides
Wither OWL by
Wither OWLWither OWL
Wither OWLJames Hendler
7.3K views46 slides
Following Google: Don’t Follow the Followers, Follow the Leaders by
Following Google: Don’t Follow the Followers, Follow the LeadersFollowing Google: Don’t Follow the Followers, Follow the Leaders
Following Google: Don’t Follow the Followers, Follow the LeadersC4Media
825 views97 slides

Similar to A Beginners Guide to noSQL(20)

PostgreSQL, your NoSQL database by Reuven Lerner
PostgreSQL, your NoSQL databasePostgreSQL, your NoSQL database
PostgreSQL, your NoSQL database
Reuven Lerner1.4K views
Introduction to Linked Data 1/5 by Juan Sequeda
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5
Juan Sequeda1.9K views
Following Google: Don’t Follow the Followers, Follow the Leaders by C4Media
Following Google: Don’t Follow the Followers, Follow the LeadersFollowing Google: Don’t Follow the Followers, Follow the Leaders
Following Google: Don’t Follow the Followers, Follow the Leaders
C4Media825 views
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015 by NoSQLmatters
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
NoSQLmatters1K views
Linked Data: The Real Web 2.0 (from 2008) by Uche Ogbuji
Linked Data: The Real Web 2.0 (from 2008)Linked Data: The Real Web 2.0 (from 2008)
Linked Data: The Real Web 2.0 (from 2008)
Uche Ogbuji789 views
WTF is the Semantic Web by Juan Sequeda
WTF is the Semantic WebWTF is the Semantic Web
WTF is the Semantic Web
Juan Sequeda902 views
No sql distilled-distilled by rICh morrow
No sql distilled-distilledNo sql distilled-distilled
No sql distilled-distilled
rICh morrow1.5K views
To SQL or NoSQL, that is the question by Krishnakumar S
To SQL or NoSQL, that is the questionTo SQL or NoSQL, that is the question
To SQL or NoSQL, that is the question
Krishnakumar S1.5K views
Introducción a NoSQL by MongoDB
Introducción a NoSQLIntroducción a NoSQL
Introducción a NoSQL
MongoDB7.1K views
Enterprise NoSQL: Silver Bullet or Poison Pill by Billy Newport
Enterprise NoSQL: Silver Bullet or Poison PillEnterprise NoSQL: Silver Bullet or Poison Pill
Enterprise NoSQL: Silver Bullet or Poison Pill
Billy Newport936 views
NoSQL and MapReduce by J Singh
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
J Singh12.5K views
Lessons learnt coverting from SQL to NoSQL by Enda Farrell
Lessons learnt coverting from SQL to NoSQLLessons learnt coverting from SQL to NoSQL
Lessons learnt coverting from SQL to NoSQL
Enda Farrell4.5K views
NoSQL, SQL, NewSQL - methods of structuring data. by Tony Rogerson
NoSQL, SQL, NewSQL - methods of structuring data.NoSQL, SQL, NewSQL - methods of structuring data.
NoSQL, SQL, NewSQL - methods of structuring data.
Tony Rogerson2.5K views

More from Mike Crabb

Hard to Reach Users in Easy to Reach Places by
Hard to Reach Users in Easy to Reach PlacesHard to Reach Users in Easy to Reach Places
Hard to Reach Users in Easy to Reach PlacesMike Crabb
1.1K views47 slides
Accessible and Assistive Interfaces by
Accessible and Assistive InterfacesAccessible and Assistive Interfaces
Accessible and Assistive InterfacesMike Crabb
692 views18 slides
Accessible Everyone by
Accessible EveryoneAccessible Everyone
Accessible EveryoneMike Crabb
1.4K views89 slides
The Peer Review Process by
The Peer Review ProcessThe Peer Review Process
The Peer Review ProcessMike Crabb
1.8K views36 slides
Managing Quality In Qualitative Research by
Managing Quality In Qualitative ResearchManaging Quality In Qualitative Research
Managing Quality In Qualitative ResearchMike Crabb
1.6K views62 slides
Analysing Qualitative Data by
Analysing Qualitative DataAnalysing Qualitative Data
Analysing Qualitative DataMike Crabb
817 views59 slides

More from Mike Crabb(20)

Hard to Reach Users in Easy to Reach Places by Mike Crabb
Hard to Reach Users in Easy to Reach PlacesHard to Reach Users in Easy to Reach Places
Hard to Reach Users in Easy to Reach Places
Mike Crabb1.1K views
Accessible and Assistive Interfaces by Mike Crabb
Accessible and Assistive InterfacesAccessible and Assistive Interfaces
Accessible and Assistive Interfaces
Mike Crabb692 views
Accessible Everyone by Mike Crabb
Accessible EveryoneAccessible Everyone
Accessible Everyone
Mike Crabb1.4K views
The Peer Review Process by Mike Crabb
The Peer Review ProcessThe Peer Review Process
The Peer Review Process
Mike Crabb1.8K views
Managing Quality In Qualitative Research by Mike Crabb
Managing Quality In Qualitative ResearchManaging Quality In Qualitative Research
Managing Quality In Qualitative Research
Mike Crabb1.6K views
Analysing Qualitative Data by Mike Crabb
Analysing Qualitative DataAnalysing Qualitative Data
Analysing Qualitative Data
Mike Crabb817 views
Conversation Discourse and Document Analysis by Mike Crabb
Conversation Discourse and Document AnalysisConversation Discourse and Document Analysis
Conversation Discourse and Document Analysis
Mike Crabb479 views
Ethnographic and Observational Research by Mike Crabb
Ethnographic and Observational ResearchEthnographic and Observational Research
Ethnographic and Observational Research
Mike Crabb621 views
Doing Focus Groups by Mike Crabb
Doing Focus GroupsDoing Focus Groups
Doing Focus Groups
Mike Crabb344 views
Doing Interviews by Mike Crabb
Doing InterviewsDoing Interviews
Doing Interviews
Mike Crabb280 views
Designing Qualitative Research by Mike Crabb
Designing Qualitative ResearchDesigning Qualitative Research
Designing Qualitative Research
Mike Crabb418 views
Introduction to Accessible Design by Mike Crabb
Introduction to Accessible DesignIntroduction to Accessible Design
Introduction to Accessible Design
Mike Crabb177 views
Accessible Everyone by Mike Crabb
Accessible EveryoneAccessible Everyone
Accessible Everyone
Mike Crabb1.9K views
Texture and Glyph Design by Mike Crabb
Texture and Glyph DesignTexture and Glyph Design
Texture and Glyph Design
Mike Crabb5.4K views
Pattern Perception and Map Design by Mike Crabb
Pattern Perception and Map DesignPattern Perception and Map Design
Pattern Perception and Map Design
Mike Crabb404 views
Dealing with Enterprise Level Data by Mike Crabb
Dealing with Enterprise Level DataDealing with Enterprise Level Data
Dealing with Enterprise Level Data
Mike Crabb1.1K views
Using Cloud in an Enterprise Environment by Mike Crabb
Using Cloud in an Enterprise EnvironmentUsing Cloud in an Enterprise Environment
Using Cloud in an Enterprise Environment
Mike Crabb2.3K views
Teaching Cloud to the Programmers of Tomorrow by Mike Crabb
Teaching Cloud to the Programmers of TomorrowTeaching Cloud to the Programmers of Tomorrow
Teaching Cloud to the Programmers of Tomorrow
Mike Crabb5.6K views
Sql Injection and XSS by Mike Crabb
Sql Injection and XSSSql Injection and XSS
Sql Injection and XSS
Mike Crabb3.3K views
Forms and Databases in PHP by Mike Crabb
Forms and Databases in PHPForms and Databases in PHP
Forms and Databases in PHP
Mike Crabb1.8K views

Recently uploaded

Roadmap y Novedades de producto by
Roadmap y Novedades de productoRoadmap y Novedades de producto
Roadmap y Novedades de productoNeo4j
43 views33 slides
HarshithAkkapelli_Presentation.pdf by
HarshithAkkapelli_Presentation.pdfHarshithAkkapelli_Presentation.pdf
HarshithAkkapelli_Presentation.pdfharshithakkapelli
11 views16 slides
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - Afternoon by
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - AfternoonDSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - Afternoon
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - AfternoonDeltares
11 views43 slides
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023 by
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Icinga
36 views17 slides
ict act 1.pptx by
ict act 1.pptxict act 1.pptx
ict act 1.pptxsanjaniarun08
12 views17 slides
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ... by
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...marksimpsongw
74 views34 slides

Recently uploaded(20)

Roadmap y Novedades de producto by Neo4j
Roadmap y Novedades de productoRoadmap y Novedades de producto
Roadmap y Novedades de producto
Neo4j43 views
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - Afternoon by Deltares
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - AfternoonDSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - Afternoon
DSD-INT 2023 - Delft3D User Days - Welcome - Day 3 - Afternoon
Deltares11 views
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023 by Icinga
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Icinga36 views
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ... by marksimpsongw
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...
Mark Simpson - UKOUG23 - Refactoring Monolithic Oracle Database Applications ...
marksimpsongw74 views
What Can Employee Monitoring Software Do?​ by wAnywhere
What Can Employee Monitoring Software Do?​What Can Employee Monitoring Software Do?​
What Can Employee Monitoring Software Do?​
wAnywhere18 views
Les nouveautés produit Neo4j by Neo4j
 Les nouveautés produit Neo4j Les nouveautés produit Neo4j
Les nouveautés produit Neo4j
Neo4j27 views
DSD-INT 2023 Simulation of Coastal Hydrodynamics and Water Quality in Hong Ko... by Deltares
DSD-INT 2023 Simulation of Coastal Hydrodynamics and Water Quality in Hong Ko...DSD-INT 2023 Simulation of Coastal Hydrodynamics and Water Quality in Hong Ko...
DSD-INT 2023 Simulation of Coastal Hydrodynamics and Water Quality in Hong Ko...
Deltares10 views
Applying Platform Engineering Thinking to Observability.pdf by Natan Yellin
Applying Platform Engineering Thinking to Observability.pdfApplying Platform Engineering Thinking to Observability.pdf
Applying Platform Engineering Thinking to Observability.pdf
Natan Yellin12 views
Unmasking the Dark Art of Vectored Exception Handling: Bypassing XDR and EDR ... by Donato Onofri
Unmasking the Dark Art of Vectored Exception Handling: Bypassing XDR and EDR ...Unmasking the Dark Art of Vectored Exception Handling: Bypassing XDR and EDR ...
Unmasking the Dark Art of Vectored Exception Handling: Bypassing XDR and EDR ...
Donato Onofri643 views
Software evolution understanding: Automatic extraction of software identifier... by Ra'Fat Al-Msie'deen
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge... by Deltares
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...
Deltares16 views
Citi TechTalk Session 2: Kafka Deep Dive by confluent
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
confluent17 views
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -... by Deltares
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...
Deltares6 views
Tridens DevOps by Tridens
Tridens DevOpsTridens DevOps
Tridens DevOps
Tridens9 views
Advanced API Mocking Techniques by Dimpy Adhikary
Advanced API Mocking TechniquesAdvanced API Mocking Techniques
Advanced API Mocking Techniques
Dimpy Adhikary18 views

A Beginners Guide to noSQL

  • 2. THE WHY WE ARE STORING MORE DATA NOW THAN WE EVER HAVE BEFORE
  • 3. THE WHY WE ARE STORING MORE DATA NOW THAN WE EVER HAVE BEFORE CONNECTIONS BETWEEN OUR DATA ARE GROWING ALL THE TIME
  • 4. THE WHY WE ARE STORING MORE DATA NOW THAN WE EVER HAVE BEFORE CONNECTIONS BETWEEN OUR DATA ARE GROWING ALL THE TIME WE DON’T MAKE THINGS KNOWING THE STRUCTURE FROM DAY 1
  • 5. THE WHY WE ARE STORING MORE DATA NOW THAN WE EVER HAVE BEFORE CONNECTIONS BETWEEN OUR DATA ARE GROWING ALL THE TIME WE DON’T MAKE THINGS KNOWING THE STRUCTURE FROM DAY 1 SERVER ARCHITECTURE IS NOW AT A STAGE WHERE WE CAN TAKE ADVANTAGE OF IT
  • 6. salary lists most web applications social networks semantic trading SiZE Complexity relational databases
  • 7. NOSQL USE CASES LARGE DATA VOLUMES MASSIVELY DISTRIBUTED ARCHITECTURE REQUIRED TO STORE THE DATA GOOGLE, AMAZON, FACEBOOK, 100K SERVERS
  • 8. NOSQL USE CASES LARGE DATA VOLUMES MASSIVELY DISTRIBUTED ARCHITECTURE REQUIRED TO STORE THE DATA GOOGLE, AMAZON, FACEBOOK, 100K SERVERS EXTREME QUERY WORKLOAD IMPOSSIBLE TO EFFICIENTLY DO JOINS AT THAT SCALE WITH AN RDBMS
  • 9. NOSQL USE CASES LARGE DATA VOLUMES MASSIVELY DISTRIBUTED ARCHITECTURE REQUIRED TO STORE THE DATA GOOGLE, AMAZON, FACEBOOK, 100K SERVERS EXTREME QUERY WORKLOAD IMPOSSIBLE TO EFFICIENTLY DO JOINS AT THAT SCALE WITH AN RDBMS SCHEMA EVOLUTION SCEMA FLEXIBILITY IS NOT TRIVIAL AT A LARGE SCALE BUT IT CAN BE WITH NO SQL
  • 10. NOSQL PROS AND CONS PROS MASSIVE SCALABILITY HIGH AVAILABILITY LOWER COST SCHEMA FLEXIBILITY SPARCE AND SEMI STRUCTURED DATA
  • 11. NOSQL PROS AND CONS PROS MASSIVE SCALABILITY HIGH AVAILABILITY LOWER COST SCHEMA FLEXIBILITY SPARCE AND SEMI STRUCTURED DATA CONS LIMITED QUERY CAPABILITIES NOT STANDARDISED (PORTABILITY MAY BE AN ISSUE) STILL A DEVELOPING TECHNOLOGY
  • 12. OSQL NOSQL NOSQL NOSQL QL BIGTABLE NOSQL NOSQL QL NOSQL NOSQL NOSQL N OSQL NOSQL KEY VALUE NO SQL NOSQL NOSQL NOSQL N NOSQL NOSQL NOSQL NOS NOSQL NOSQL NOSQL NOSQ QL NOSQL NOSQL NOSQL NO GRAPHDB NOSQL NOSQL N NOSQL NOSQL NOSQL NOS OSQL NOSQL NOSQL NOSQL SQL NOSQL DOCUMENT NOS FOUREMERGING TRENDS IN NOSQL DATABASES
  • 13. BUT FIRST… IMAGINE A LIBRARY LOTS OF DIFFERENT FLOORS DIFFERENT SECTIONS ON EACH FLOOR DIFFERENT BOOKSHELVES IN EACH SECTION LOTS OF BOOKS ON EACH SHELF LOTS OF PAGES IN EACH BOOK LOTS OF WORDS ON EACH PAGE EVERYTHING IS WELL ORGANISED AND EVERYTHING HAS A SPACE
  • 14. BUT FIRST… IMAGINE A LIBRARY WHAT HAPPENS IF WE BUY TOO MANY BOOKS!? (THE WORLD EXPLODES AND THE KITTENS WIN)
  • 15. BUT FIRST… IMAGINE A LIBRARY WHAT HAPPENS IF WE WANT TO STORE CDS ALL OF A SUDDEN!? (THE WORLD EXPLODES AND THE KITTENS WIN)
  • 16. BUT FIRST… IMAGINE A LIBRARY WHAT HAPPENS IF WE WANT TO GET RID OF ALL BOOKS THAT MENTION KITTENS (KITTENS STILL WIN)
  • 17. BIG BEHAVES LIKE A STANDARD RELATIONAL DATABASE BUT WITH A SLIGHT CHANGE http://research.google.com/archive/bigtable.html http://research.google.com/archive/spanner.html DESIGNED TO WORK WITH A LOT OF DATA…A REALLY BIG CRAP TON CREATED BY GOOGLE AND NOW USED BY LOTS OF OTHERS TABLE
  • 18. THIS IS A STANDARD RELATIONAL DATABASE BIG TABLE THIS IS A BIG TABLE DATABASE (AND NOW THE NAME MAKES SENCE!)
  • 19. BIG TABLE “A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
  • 20. BIG TABLE “A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
  • 21. BIG TABLE “A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
  • 22. KEY VALUE AGAIN, DESIGNED TO WORK WITH A LOT OF DATA EACH BIT OF DATA IS STORED IN A SINGLE COLLECTION EACH COLLECTION CAN HAVE DIFFERENT TYPES OF DATA
  • 24. KEY VALUE A C D E OUR VALUES ARE HIDDEN INSIDE THE KEYS TO FIND OUT WHAT THEY ARE WE NEED TO QUERY THEM What is in Key B? The Triangle B
  • 26. DOCUMENT STORE DESIGNED TO WORK WITH A LOT OF DATA (BEGINNING TO NOTICE A THEME?) VERY SIMILAR TO A KEY VALUE DATABASE MAIN DIFFERENCE IS THAT YOU CAN ACTUALLY SEE THE VALUES
  • 28. DOCUMENT STORE A CB D E Bring me the triangles Yes m’lord.
  • 29. SIDENOTE REMEMBER HOW SQL DATABASES ARE LIBRARIES? NO SQL IS MORE LIKE A BAG OF CATS!
  • 30. SIDENOTE colour: tabby name: Gunther colour: ginger name: Mylo colour: grey name: Ruffus age: kitten colour: ginger(ish) name: Fred age: kitten colour: ginger(ish) name: Quentin legs: 3 WE CAN ADD IN FIELDS AS AND WHEN WE NEED THEM
  • 31. DOCUMENT STORE A CB D E Bring me the KITTENS! Of course m’lord.
  • 33. GRAPH DATABASE FOCUS HERE IS ON MODELLING THE STRUCTURE OF THE DATA INSPIRED BY GRAPH THEORY (GO MATHS!) SCALES REALLY WELL TO THE STRUCTURE OF THE DATA
  • 37. GRAPH DATABASE name: “Michael” twitter: “@mrmike name: “John” twitter:”@mrjohn” brand: “Toyota” currentState: “Broken” brand: “Vauxhall” currentState: “Working” WORKS_WITH WORKS_WITH OWNS OWNS CARSHARES IN
  • 38. GRAPH DATABASE name: “Michael” twitter: “@mrmike name: “John” twitter:”@mrjohn” brand: “Toyota” currentState: “Broken” brand: “Vauxhall” currentState: “Working” WORKS_WITH WORKS_WITH OWNS propertyType: “car” OWNS propertyType: “car” CARSHARES IN
  • 40. key/value store bigtable clone document database graph database SiZE Complexity
  • 41. key/value store bigtable clone document database graph database SiZE Complexity >90% of use cases
  • 42. WHEN TO USE NOSQL AND WHEN TO USE SQL
  • 43. THE BASICS High availability and disaster recovery are a must Understand the pros and cons of each design model Don’t pick something just because it is new Do you remember the zune? Don’t pick something based JUST on performance
  • 44. SQL High performance for transactions. Think ACID Highly structured, very portable Small amounts of data SMALL IS LESS THAN 500GB Supports many tables with different types of data Can fetch ordered data Compatible with lots of tools THE GOOD
  • 46. SQL High performance for transactions. Think ACID Highly structured, very portable Small amounts of data SMALL IS LESS THAN 500GB Supports many tables with different types of data Can fetch ordered data Compatible with lots of tools THE GOOD
  • 47. SQL Complex queries take a long time The relational model takes a long time to learn Not really scalable Not suited for rapid development THE BAD
  • 48. noSQL Fits well for volatile data High read and write throughput Scales really well Rapid development is possible In general it’s faster than SQL THE GOOD
  • 50. noSQL Fits well for volatile data High read and write throughput Scales really well Rapid development is possible In general it’s faster than SQL THE GOOD
  • 51. noSQL Key/Value pairs need to be packed/unpacked all the time Still working on getting security for these working as well as SQL Lack of relations from one key to another THE GOOD
  • 52. tl;dr so use both, but think about when you want to use them! works great, can’t scale for large data works great, doesn't fit all situations SQL noSQL
  • 53. A lot of this content is loving ripped from lots of other (more impressive) presentations that are already on SlideShare - you should check them out! FINALLY