November, 2017
Andy Ellicott, Crate.io
Is the Future of NoSQL…
SQL?
Logistics…
• Submit questions at any time via the questions panel

• Slides & recording will be shared via email after the event
Agenda
–
• The NoSQL Era, the good, the bad, the ugly

• Post-NoSQL…Distributed SQL renaissance

• “Things Data” is coming

• Other DBMS futures predictions

• Additional reading

• Questions & answers
I like databases
25 years in DBMS & software development companies

IMHO…the coolest ways software is changing what’s
possible in life and business…is usually due to some
database changing what’s possible with software.
Good, Bad, & Ugly
The NoSQL Era
DBMS Timeline
–
(Timeline axis marked at 2005 and 2010)
SQL - One Size Fits All era
• Oracle
• DB2
• SQL Server
• MySQL
• PostgreSQL
Distributed SQL 1.0 era
• Vertica
• Greenplum
• Netezza
• Paraccel
• VoltDB
• …
NoSQL era
• MongoDB
• Hadoop
• DynamoDB
• Cassandra
• Redis
• …
NoSQL Family Tree
–
NoSQL, The Good…
-
• Many, many choices for most any use case

- JSON document stores

- Key-value stores

- Caching

- Time series

- Text search

• Easy, Economical, Developer friendly:

- Scalability

- Fault-tolerance 

- Dynamic, flexible schemas (JSON)

- Open source
Knowing the CAP Theorem Helped…
-
• A partitioned database…

- where data is duplicated across multiple machines

- Access to that data can be EITHER

• Highly Consistent (e.g., MongoDB)

• or Highly available (e.g., DynamoDB)

• We learned the sky doesn’t fall if you forfeit ACID
- “Eventual Consistency”
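A toy sketch of what “eventual consistency” means in practice. It assumes a last-write-wins rule and a manual sync step, which are illustrative choices and not a description of how MongoDB, DynamoDB, or any other product works. The Replica class, the sync function, and the bottle_count key are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; each value is stored with the timestamp of its write."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins (an assumption): keep the incoming value only if newer.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Anti-entropy pass: exchange entries so both replicas converge."""
    for key, (ts, value) in list(a.store.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.store.items()):
        a.write(key, value, ts)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("bottle_count", 100, timestamp=1)
sync(r1, r2)

# During a partition, a highly available system keeps accepting writes on r1...
r1.write("bottle_count", 101, timestamp=2)
stale = r2.read("bottle_count")  # r2 still answers, but with the old value

# ...and once the partition heals, the replicas converge.
sync(r1, r2)
fresh = r2.read("bottle_count")
print(stale, fresh)
```

A highly consistent system would instead refuse or delay the read on r2 until it could confirm the latest value, trading availability for correctness.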
NoSQL, The Bad…
-
• No standards (i.e., literally, no SQL)

- Harder to learn

- Hard to integrate

• Too many choices, hard to differentiate

- MongoDB vs. Rethink?

- CouchDB vs. Couchbase?

• DBA expertise

- Resizing & rebalancing database clusters

• Brute force query optimization, via code

• Polyglot persistence gone wild

- Use multiple specialized databases in a single system

- Over time, duplicate data storage and sync costs can grow out of control
NoSQL, The Ugly…
-
• Market “consolidation”

- RethinkDB 

- Riak

- FoundationDB
Ten Years Ago …
-
Was NoSQL a step backwards
in DBMS technology…or a step
forwards?
• Greatly expanded researchers & contributors

• Debunked assumptions about requirements

- SQL access

- ACID / Eventual consistency

• Created open source code and thought leadership on
which next generation of SQL is being built
IMHO, NoSQL has been a step forwards
-
Distributed SQL II
The Newest Generation of SQL
–
SQL NOSQL
Crate Components (Crate, Elasticsearch, other Open Source)
The CrateDB Open Source Stack
–
1 file to download & install
CrateDB - the key inventions

–
• Distributed SQL with search, time series, geospatial, aggregations
• Cloud-native architecture, easy scaling via containers
• NoSQL storage & clustering for horizontal scaling & dynamic schema
• Columnar caches for real-time, in-memory SQL query performance
• Shared-nothing architecture
Using CrateDB
–
• Simple install: zero-configuration, auto-join
• Compatible: ANSI SQL via Postgres wire protocol, JDBC, REST
• Real-time performance: distributed SQL query engine
• Dynamic schema: all data (structured + JSON), time series, geospatial
• Distributed SQL query versatility: aggregations, time series, search, geospatial…
• Simpler scalability: shared nothing, horizontal scale-out
• Always on: high availability, replication, self-healing
• Flexible: no lock-in, runs in any cloud and on-premises
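A minimal sketch of what SQL over REST can look like from a client. It only builds the HTTP request and does not send it, so it runs without a database. The /_sql path, port 4200, and the "stmt"/"args" payload keys are my recollection of CrateDB's HTTP endpoint and should be checked against the current documentation. The sensor_readings table and its columns are hypothetical.

```python
import json
import urllib.request


def build_sql_request(base_url, stmt, args=None):
    """Build (but do not send) a POST request carrying one SQL statement."""
    payload = {"stmt": stmt}
    if args is not None:
        payload["args"] = list(args)  # values bound to the '?' placeholders
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_sql",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sql_request(
    "http://localhost:4200",  # assumed address of a local node
    "SELECT sensor_id, avg(reading) FROM sensor_readings "
    "WHERE line_id = ? GROUP BY sensor_id",
    args=[42],
)
body = json.loads(request.data)
print(request.get_method(), request.full_url)
print(body)

# To run the query against a live node:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

The same statement could be sent unchanged over the Postgres wire protocol or JDBC, which is the compatibility point the slide is making.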
The next wave of big data
will come from machines
“Things Data”
The Next Wave of Big Data
–
“IoT is creating unparalleled information
management and analytics challenges.”
- Jim Hare, Gartner
Every step, every lightbulb, every message, every bottle
• Firehose of data: millions of data points per second
• Complex data: joins, time series, geospatial, JSON, text search, AI, blobs
• Real-time: instantly actionable, current & large historic data sets
• Edge + Cloud: run anywhere. Cloud, on-premises, containers. Small footprint or large clusters with 100+ nodes.
Customer: ALPLA USA
$4B global plastic packaging manufacturer

–
• Plastic bottles, caps, lids, etc.

- e.g., produces plastic Coca-Cola bottles in the USA

• CrateDB enables “real-time factory”

- 1500+ production lines, 900 sensor types

- Improve overall equipment efficiency (OEE)

- Reduce labor costs / Reduce waste

• Millions of data points per minute

• Dozens of real-time charts

- Chart refresh (query speed) reduced from 5 minutes to 0.3 seconds with CrateDB

• 160M bottles/day, 180 factories worldwide
Customer: Skyhigh
Cloud security, Campbell 

–
• Cloud Access Security Broker (CASB)

• Billions of events/day into CrateDB - internet traffic for
40% of Fortune 500

• Realtime dashboards - flag suspicious, risky internet
usage

• CrateDB replaced Elasticsearch & MySQL

- queries were 20x faster with only 25% of hardware
“CrateDB’s real-time SQL performance, simple scaling, and high availability make it a key element of our stack”
- Sekhar Sarukkai, Co-founder & SVP Engineering
• Billions of events / day
• 40% of Fortune 500
• 30M+ users
• Interactive real-time dashboards
Customer:
Automotive, Singapore

–
• IoT-Enabled Vehicle Tracking

• Predictive maintenance

• Data in Crate allows full 3D reconstruction of accidents

• 2,000 sensor readings per second per car
“It’s a lot of data, which you need to ingest very fast and auto-query very fast. That’s why we brought in CrateDB.”
- Mark Sutheran, CEO
New DBMS Required for “Things Data” Era?

–
                                        CrateDB   Traditional SQL   Distributed SQL I   NoSQL
SQL (ease of adoption & integration)      ✅           ✅                 ✅             ❌
Complex, dynamic data                     ✅           ❌                 ❌             ✅
Firehose & real-time queries              ✅           ❌                 ✅             ✴
Scale-out architecture                    ✅           ❌                 ✅             ✅
Open source / economical                  ✅           ✴                 ❌             ✅
What’s Next?
If You’re Doing Distributed…
–
[Figure: edge-to-cloud architecture. Labels: Devices (servers, sensors, actuators, machines, wearables, cars, etc.); Gateway; Gateway & DB; Applications & Platforms; Edge; Public/Hybrid/Private; shared-nothing architecture]
CrateDB enables use-cases at the “edge” and in the cloud, with SQL, horizontal scaling, high availability, and multi-model data
structures. With CrateDB, customers can extract value from realtime data, enabling applications & services not possible before.
Will Scale-out Databases Start Eliminating the Need for Middleware?
–
• Message queues were invented to compensate for
DBMS weaknesses

- Downtime

- Slow ingestion

• New databases like CrateDB don’t have those
pitfalls

• Embedding MQTT broker in CrateDB 

- Define “Ingestion rules” in CrateDB

• MQTT topic -> Target table for storage

- Stores messages in tables

- Eliminates the need for extra middleware

• Lowers hosting costs, complexity, development time
[Figure: two ingestion pipelines compared. Traditional: Devices, MQTT messages, Message Queue (MQTT Broker), MQTT Consumer/Writer, DBMS with slow ingest & DB downtime. Versus CrateDB: Devices, MQTT messages, Embedded MQTT Broker with fast ingest and always-on architecture.]
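The “MQTT topic -> target table” idea can be sketched in a few lines. This is not CrateDB's ingestion-rule syntax, which the slides do not show. IngestRouter, the topic filters, and the table names are hypothetical, and the in-memory lists stand in for real tables. The wildcard matching follows the usual MQTT conventions ('+' for one level, '#' for all remaining levels) without validating the filter.

```python
from collections import defaultdict


def topic_matches(topic_filter, topic):
    """MQTT-style match: '+' matches one level, '#' matches all remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class IngestRouter:
    """Hypothetical stand-in for ingestion rules: topic filter -> target table."""

    def __init__(self):
        self.rules = []                  # ordered (topic_filter, table_name)
        self.tables = defaultdict(list)  # table_name -> stored rows

    def add_rule(self, topic_filter, table_name):
        self.rules.append((topic_filter, table_name))

    def ingest(self, topic, payload):
        # First matching rule wins; unmatched messages are dropped.
        for topic_filter, table_name in self.rules:
            if topic_matches(topic_filter, topic):
                self.tables[table_name].append({"topic": topic, "payload": payload})
                return table_name
        return None


router = IngestRouter()
router.add_rule("factory/+/temperature", "temperature_readings")
router.add_rule("factory/#", "factory_events")

a = router.ingest("factory/line7/temperature", {"celsius": 71.5})
b = router.ingest("factory/line7/status", {"state": "running"})
c = router.ingest("office/door", {"open": True})
print(a, b, c)
```

With the rule living next to the storage, there is no separate consumer process to write and operate, which is the cost saving the slide describes.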
You might be done with NoSQL when…
–
• You find yourself saying “I wish I could just use a join”

• The competition for hiring new experienced users is
slowing your team growth

• You’re using too many different DBs together… “too specialized”
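To make the “I wish I could just use a join” point concrete, here is a small comparison. It uses Python's built-in sqlite3 purely as a stand-in SQL engine, not CrateDB, and the sensors/readings data is made up for illustration. The first function is the kind of hand-written join an application needs when the store has no JOIN. The second gets the same answer from one SQL statement.

```python
import sqlite3

# Hypothetical data: sensors and their readings.
sensors = [(1, "line-A"), (2, "line-B")]
readings = [(1, 20.5), (1, 21.0), (2, 35.0)]


def join_in_application(sensors, readings):
    """Average reading per line, joined and aggregated by hand."""
    line_by_sensor = {sensor_id: line for sensor_id, line in sensors}
    totals = {}
    for sensor_id, value in readings:
        line = line_by_sensor[sensor_id]
        count, total = totals.get(line, (0, 0.0))
        totals[line] = (count + 1, total + value)
    return sorted((line, total / count) for line, (count, total) in totals.items())


def join_in_sql(sensors, readings):
    """The same result expressed as a single SQL statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sensors (id INTEGER PRIMARY KEY, line TEXT)")
    conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
    conn.executemany("INSERT INTO sensors VALUES (?, ?)", sensors)
    conn.executemany("INSERT INTO readings VALUES (?, ?)", readings)
    rows = conn.execute(
        "SELECT s.line, AVG(r.value) FROM readings r "
        "JOIN sensors s ON s.id = r.sensor_id "
        "GROUP BY s.line ORDER BY s.line"
    ).fetchall()
    conn.close()
    return rows


app_rows = join_in_application(sensors, readings)
sql_rows = join_in_sql(sensors, readings)
print(app_rows)
print(sql_rows)
```

The application-side version also has to be rewritten for every new question, while the SQL version only needs a different query.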
Thank You!
-
• CrateDB

- https://crate.io

• Slides & recording of this will be sent to you shortly, via email

• Ping me any time

- Andy Ellicott

- andy@crate.io
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 

Webinar: The Future of SQL

  • 1. November, 2017 Andy Ellicott, Crate.io Is the Future of NoSQL… SQL?
  • 2. Logistics… • Submit questions at any time via the questions panel • Slides & recording will be shared via email after the event
  • 3. Agenda – • The NoSQL Era, the good, the bad, the ugly • Post-NoSQL…Distributed SQL renaissance • “Things Data” is coming • Other DBMS futures predictions • Additional reading • Questions & answers
  • 4. I like databases 25 years in DBMS & software development companies IMHO…the coolest ways software is changing what’s possible in life and business…is usually due to some database changing what’s possible with software.
  • 5. Good, Bad, & Ugly The NoSQL Era
  • 6. DBMS Timeline (markers at 2005 and 2010) – SQL “One Size Fits All” era: Oracle, DB2, SQL Server, MySQL, PostgreSQL • Distributed SQL 1.0 era: Vertica, Greenplum, Netezza, Paraccel, VoltDB, … • NoSQL era: MongoDB, Hadoop, DynamoDB, Cassandra, Redis, …
  • 8. NoSQL, The Good… - • Many, many choices for almost any use case - JSON document stores - Key-value stores - Caching - Time series - Text search • Easy, Economical, Developer friendly: - Scalability - Fault-tolerance - Dynamic, flexible schemas (JSON) - Open source
  • 9. Knowing the CAP Theorem Helped… - • A partitioned database… - where data is duplicated across multiple machines - Access to that data can be EITHER • Highly Consistent (e.g., MongoDB) • or Highly available (e.g., DynamoDB) • We learned the sky doesn’t fall if you forfeit ACID - “Eventual Consistency”
  • 10. NoSQL, The Bad… - • No standards (i.e., literally, no SQL) - Harder to learn - Hard to integrate • Too many choices, hard to differentiate - MongoDB vs. Rethink? - CouchDB vs. Couchbase? • DBA expertise - Resizing & rebalancing database clusters • Brute force query optimization, via code • Polyglot persistence gone wild - Use multiple specialized databases in a single system - Over time, duplicate data storage and sync costs can grow out of control
  • 11. NoSQL, The Ugly… - • Market “consolidation” - RethinkDB - Riak - FoundationDB
  • 12. Ten Years Ago … - Was NoSQL a step backwards in DBMS technology…or a step forwards?
  • 13. IMHO, NoSQL has been a step forwards - • Greatly expanded researchers & contributors • Debunked assumptions about requirements - SQL access - ACID / Eventual consistency • Created open source code and thought leadership on which next generation of SQL is being built
  • 15. The Newest Generation of SQL – SQL NOSQL
  • 17. CrateDB - the key inventions – • Distributed SQL with search, time series, geospatial, aggregations • Cloud-native architecture: easy scaling via containers • NoSQL storage & clustering for horizontal scaling & dynamic schema • Columnar caches for real-time, in-memory SQL query performance • Shared-nothing architecture
  • 18. Using CrateDB – • Simple install: zero-configuration, auto-join • Compatible: ANSI SQL via Postgres wire protocol, JDBC, REST • Real-time performance: distributed SQL query engine • Dynamic schema: all data (structured + JSON), time series, geospatial • Distributed SQL query versatility: aggregations, time series, search, geospatial… • Simpler scalability: shared nothing, horizontal scale-out • Always on: high availability, replication, self-healing • Flexible: no lock-in, runs on any cloud and on-premises
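As a rough illustration of the REST access mentioned on slide 18: CrateDB exposes an HTTP endpoint at `/_sql` (port 4200 by default) that accepts a JSON body with a `stmt` string and optional `args`. The sketch below only builds such a request; the host, table name, and column names are assumptions made up for this example, and `run_sql` would need a running node to do anything useful.

```python
import json
import urllib.request

# Assumption: a CrateDB node is reachable here (4200 is the default HTTP port).
CRATE_URL = "http://localhost:4200/_sql"


def build_sql_request(stmt, args=None, url=CRATE_URL):
    """Build (but do not send) a POST request for the /_sql HTTP endpoint."""
    payload = {"stmt": stmt}
    if args is not None:
        payload["args"] = list(args)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, url=CRATE_URL):
    """Send the statement and return the decoded JSON response."""
    with urllib.request.urlopen(build_sql_request(stmt, args, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical table and columns, for illustration only.
req = build_sql_request(
    "SELECT sensor_id, avg(reading) FROM sensor_data "
    "WHERE ts > ? GROUP BY sensor_id",
    args=[1509494400000],
)
```

The same statement could equally be sent over the Postgres wire protocol or JDBC; HTTP is used here only because it needs no third-party driver.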
  • 19. The next wave of big data will come from machines “Things Data”
  • 20. The Next Wave of Big Data – “IoT is creating unparalleled information management and analytics challenges.” - Jim Hare, Gartner • Every Step, Every Lightbulb, Every Message, Every Bottle • Firehose of data: millions of data points per second • Complex data: joins, time series, geospatial, JSON, text search, AI, blobs • Real-time: instantly actionable, across current & large historic data sets • Edge + Cloud: run anywhere (cloud, on-premises, containers), small footprint or large clusters with 100+ nodes
  • 21. Customer: ALPA USA, $4B global plastic packaging manufacturer – 160M bottles/day • 180 factories worldwide • Plastic bottles, caps, lids, etc. - e.g., produces the plastic Coca-Cola bottles in the USA • CrateDB enables “real-time factory” - 1500+ production lines, 900 sensor types - Improve overall equipment efficiency (OEE) - Reduce labor costs / Reduce waste • Millions of data points per minute • Dozens of real-time charts - Chart refresh (query speed) reduced from 5 minutes to 0.3 seconds with CrateDB
  • 22. Customer: Skyhigh Cloud security, Campbell – • Cloud Access Security Broker (CASB) • Billions of events/day into CrateDB - internet traffic for 40% of Fortune 500 • Real-time dashboards - flag suspicious, risky internet usage • CrateDB replaced Elasticsearch & MySQL - queries were 20x faster with only 25% of hardware • “CrateDB’s real-time SQL performance, simple scaling, and high availability make it a key element of our stack” - Sekhar Sarukkai, Co-founder & SVP Engineering • Key figures: billions of events/day, 40% of Fortune 500, 30M+ users, interactive real-time dashboards
  • 23. Customer: Automotive, Singapore – “It’s a lot of data, which you need to ingest very fast and auto-query very fast. That’s why we brought in CrateDB.” - Mark Sutheran, CEO • 2,000 sensor readings per second per car • IoT-enabled vehicle tracking • Predictive maintenance • Data in Crate allows full 3D reconstruction of accidents
  • 24. New DBMS Required for “Things Data” Era? – (columns: CrateDB | Traditional SQL | Distributed SQL I | NoSQL)
     SQL (ease of adoption & integration): ✅ | ✅ | ✅ | ❌
     Complex, dynamic data: ✅ | ❌ | ❌ | ✅
     Firehose & Real-Time Queries: ✅ | ❌ | ✅ | ✴
     Scale out architecture: ✅ | ❌ | ✅ | ✅
     Open source / Economical: ✅ | ✴ | ❌ | ✅
  • 26. If You’re Doing Distributed… – [Diagram labels: Devices (servers, sensors, actuators, machines, wearables, cars, etc.); Gateway; Gateway & DB; Applications & Platforms; Edge; Public/Hybrid/Private; shared-nothing architecture] CrateDB enables use cases at the “edge” and in the cloud, with SQL, horizontal scaling, high availability, and multi-model data structures. With CrateDB, customers can extract value from real-time data, enabling applications & services not possible before.
  • 27. Will Scale-out Databases Start Eliminating the Need for Middleware? – • Message queues were invented to compensate for DBMS weaknesses - Downtime - Slow ingestion • New databases like CrateDB don’t have those pitfalls • Embedding MQTT broker in CrateDB - Define “Ingestion rules” in CrateDB • MQTT topic -> Target table for storage - Stores messages in tables - Eliminates the need for extra middleware • Lowers hosting costs, complexity, development time [Diagram: embedded MQTT broker in CrateDB (fast ingest, always-on architecture; devices send MQTT messages directly) versus devices, MQTT broker, message queue, and MQTT consumer/writer in front of a DBMS (slow ingest & DB downtime)]
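The “MQTT topic -> target table” idea on slide 27 can be sketched as a small routing function. This is not CrateDB’s ingestion-rule syntax, which the slides do not show; the rule list, table names, and column names below are hypothetical, and the matcher is a simplified version of standard MQTT topic-filter wildcards (`+` for one level, `#` for the remainder).

```python
# Hypothetical ingestion rules: (MQTT topic filter, target table).
# First matching rule wins. All names are made up for illustration.
INGEST_RULES = [
    ("factory/+/temperature", "line_temperature"),
    ("factory/#", "factory_events"),
]


def topic_matches(topic_filter, topic):
    """Simplified MQTT filter matching: '+' is one level, '#' is the rest."""
    filter_parts = topic_filter.split("/")
    topic_parts = topic.split("/")
    for i, part in enumerate(filter_parts):
        if part == "#":
            return True  # multi-level wildcard matches everything below
        if i >= len(topic_parts):
            return False  # filter is longer than the topic
        if part != "+" and part != topic_parts[i]:
            return False
    return len(filter_parts) == len(topic_parts)


def target_table(topic, rules=INGEST_RULES):
    """Return the table of the first rule whose filter matches, else None."""
    for topic_filter, table in rules:
        if topic_matches(topic_filter, topic):
            return table
    return None


def insert_statement(topic, payload, rules=INGEST_RULES):
    """Build a parameterized INSERT for the routed message (or None)."""
    table = target_table(topic, rules)
    if table is None:
        return None
    # Assumed columns 'topic' and 'payload'; a real schema would differ.
    return (f"INSERT INTO {table} (topic, payload) VALUES (?, ?)",
            [topic, payload])
```

With the rules living next to the data, the separate consumer/writer process from the right-hand side of the slide’s diagram is what would no longer be needed.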
  • 28. You might be done with NoSQL when… – • You find yourself saying “I wish I could just use a join” • The competition for hiring new experienced users is slowing your team growth • You’re using too many different DBs together… “too specialized”
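To make the “I wish I could just use a join” point concrete, here is a small sketch comparing an application-side join (roughly what code against a store without joins ends up doing) with the equivalent SQL. It uses Python’s built-in SQLite purely as a stand-in SQL engine, not CrateDB, and the tables and values are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for any SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE devices (id INTEGER PRIMARY KEY, site TEXT);
    CREATE TABLE readings (device_id INTEGER, value REAL);
    INSERT INTO devices VALUES (1, 'plant-a'), (2, 'plant-b');
    INSERT INTO readings VALUES (1, 20.0), (1, 22.0), (2, 30.0);
""")


def avg_by_site_in_app(conn):
    """Join and aggregate by hand: fetch both tables, combine in code."""
    site_of = dict(conn.execute("SELECT id, site FROM devices"))
    totals = {}
    for device_id, value in conn.execute(
            "SELECT device_id, value FROM readings"):
        site = site_of[device_id]
        total, count = totals.get(site, (0.0, 0))
        totals[site] = (total + value, count + 1)
    return {site: total / count for site, (total, count) in totals.items()}


def avg_by_site_in_sql(conn):
    """Let the database do the join and the aggregation."""
    rows = conn.execute("""
        SELECT d.site, AVG(r.value)
        FROM readings r JOIN devices d ON d.id = r.device_id
        GROUP BY d.site
    """)
    return dict(rows)
```

Both functions return the same per-site averages; the difference is that the hand-written version pulls every row to the client and has to be rewritten for each new question.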
  • 29. Thank You! - • CrateDB - https://crate.io • Slides & recording of this will be sent to you shortly, via email • Ping me any time - Andy Ellicott - andy@crate.io