RELEVANCE OF TIME SERIES DATABASES FOR
REAL-TIME SOLUTIONS
BY: MUNIRAJU VENKATESHA REDDY
MUNIRAJU_V@HOTMAIL.COM
1 About Me: Quick introduction about me & my work
2 Now(): Real-time scenarios and the importance of NOW()
3 Challenges & Opportunities: Shifting business focus, opportunities and use cases
4 Solution: Technology challenges and alternatives
5 Time-Series Databases: Quick view of various time-series databases
6 Druid.io: Why druid.io is the best fit for real-time analytics, my experience, demo design patterns & reference architecture
Muniraju Venkatesha Reddy (Muni)
is a lead solutions architect with 18 years of overall experience in the
following areas:
Big Data Analytics & AI | DW & BI | Solution Strategy and Advisory |
Cloud Data Lake & Insights Solutions (IaaS/PaaS/SaaS) | Enterprise
Architecture | Consulting | Product & Process Engineering | Delivery
Planning & Execution | Customer Relations | Business Growth |
People Management.
In my current role with HCL America, I work with customers across
industry verticals on designing next-generation solutions that leverage
big data, analytics & AI.
Reach me at: Muniraju_v@hotmail.com
NOW ( )
Your flight is at 9:00 AM for an important customer meeting later in the
day in a nearby city.
There is a traffic incident at 7:30 and major disruption on the airport
roadway.
Do you want to know now()?
Or do you want to know later?
NOW ( )
You went out for a family and friends dinner event at 6 PM.
There is a burglary at your house.
Do you want to know now()?
Or do you want to know later?
NOW ( )
You are a warehouse manager for a large supply chain business serving
retailers. You are expecting a large stock of goods to be delivered
tonight for next week's festival season.
A severe weather warning has been declared. Due to transportation
disruptions, the arrival of goods will be delayed by more than 4 days.
Do you want to know now()?
Or do you want to know later?
Important things are happening NOW ( )
NOW( ) is the only time life really happens,
whether for an individual or a business
Shifting business focus,
Opportunities and Use cases
Batch() vs. NOW()…
Batch(): “The collection and storage of data, for processing at a
scheduled time when a sufficient amount of data has been accumulated.”
NOW(): “The immediate processing of data after the transaction occurs,
with the database being updated at the time of the event.”
[Chart: Week 1, Week 2, Week 3, Week 4, Month 3, Qtr. 3; values 2.5K, 2.6K, 2.4K, 2.4K, 1.1M, 2.9M]
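The contrast between the two definitions can be sketched in a few lines of Python. This is only an illustration: the function `batch_totals`, the class `RunningTotals`, and the sample tower names are made up for the example and are not part of the original deck.

```python
from collections import defaultdict


def batch_totals(events):
    """Batch(): process an accumulated list of events in one scheduled pass."""
    totals = defaultdict(int)
    for key, value in events:
        totals[key] += value
    return dict(totals)


class RunningTotals:
    """NOW(): update the stored result at the time each event occurs."""

    def __init__(self):
        self.totals = defaultdict(int)

    def on_event(self, key, value):
        self.totals[key] += value
        return self.totals[key]  # the answer is current immediately


# Hypothetical sample events: (key, value) pairs.
events = [("tower_a", 3), ("tower_b", 5), ("tower_a", 4)]

live = RunningTotals()
for key, value in events:
    live.on_event(key, value)
```

Both paths end with the same totals; the difference is when the answer becomes available. The batch result exists only after the scheduled run, while the running totals are correct after every event.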
Shift in Business Focus…
• Cutting Preventable Losses
• Routine Operations
• Missing Opportunities
• New Opportunities
Business Opportunities & Use Cases
• Closed Loop control systems (Manufacturing)
• Systems & Network Monitoring (IT)
• Field Asset Monitoring (Multiple)
o Vending Machines, Oil rigs, Fleet management, Telecom
• Transaction Processing (Finance)
o Fraud, Validations, Authentications
• Complex ICU Analytics (Healthcare)
• Disaster Warning Systems (Environment)
• Fraudulent Trades (Finance)
• Preventive Maintenance (Manufacturing)
• Customer Churn (Marketing)
• Brand Reputation (Marketing)
• Social Media (Sentiments)
• Missed Opportunities – Revenue
o Customer Services – Social, Products, Advertisements, Up-Sell / Cross-Sell
• Missed Opportunities – Efficiency
o Supply chain, Quality of Service, Insurance, Capacity Management
• Autonomous Diagnostics and Connected Vehicles
• Tractors are becoming soil sensors
• Community Wi-Fi (global projections: 94M in 2016 to 541M in 2021)
• Cyber Security, Smarter Surveillance, …
Solution
Solution…
RDBMS
• Primarily Updates
• Performance
• Memory Swapping
• Scaling
• I/O Intensive
NOSQL
• Distributed Keys
• Memory Intensive
• Complex Index Structure
• Poor Secondary Index Support
TIME SERIES
• Primarily Inserts
• Recent Time Interval
• Key association with timestamp
• Time/Space Chunking
o Fixed Duration Intervals
o Fixed Size chunks
o Adaptive Intervals
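As a rough illustration of the "Fixed Duration Intervals" style of time chunking listed above, the sketch below groups events by the start of the interval they fall into. The function names `chunk_start` and `chunk_events` and the one-hour default are assumptions made for this example; real time-series databases implement chunking inside their storage engines.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def chunk_start(ts, interval):
    """Return the start of the fixed-duration interval that contains ts."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    whole_intervals = (ts - epoch) // interval  # integer count of intervals since the epoch
    return epoch + whole_intervals * interval


def chunk_events(events, interval=timedelta(hours=1)):
    """Group (timestamp, value) pairs into chunks keyed by interval start."""
    chunks = defaultdict(list)
    for ts, value in events:
        chunks[chunk_start(ts, interval)].append(value)
    return dict(chunks)
```

Because every key is a timestamp and inserts mostly land in the most recent interval, only the newest chunk needs to stay hot in memory, which is the property the time-series column above relies on.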
Variants
GraphiteDB
Time Series Databases at a Glance…
Druid.io
Evolving Architecture Standards… Kappa (K)
[Figure: Kappa architecture. Streams -> Mediation Layer (Enrich -> Split) -> Split -> Transform -> Sink.]
Evolving Architecture Standards… Lambda (λ)
[Figure: Lambda architecture. Enterprise data feeds a Batch Layer (immutable data on HDFS, batch recompute, precomputed views: View 1, View 2, View …n) and a Speed Layer (stream processing, real-time incremental views); the Serving Layer combines batch views and real-time views into merged views. Scenario labels: Research (<= 30 to 90 secs), Offline (<= 5 secs), Online (<= 5 secs).]
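The Lambda idea of recomputing batch views from immutable data and merging them with real-time increments at serving time can be sketched as follows. This is a toy model using in-memory dictionaries; `recompute_batch`, `merged_view`, and the tower names are hypothetical and stand in for what HDFS jobs and stream processors would do in practice.

```python
def recompute_batch(master_events):
    """Batch layer: recompute the view from the full immutable event log."""
    view = {}
    for key in master_events:
        view[key] = view.get(key, 0) + 1
    return view


def merged_view(batch_view, realtime_view):
    """Serving layer: combine precomputed batch counts with real-time increments."""
    merged = dict(batch_view)  # copy, so the batch view itself is never mutated
    for key, increment in realtime_view.items():
        merged[key] = merged.get(key, 0) + increment
    return merged


# Events already covered by the last batch run (hypothetical data).
batch_view = recompute_batch(["tower_a", "tower_a", "tower_b"])
# Counts for events that arrived after that run, maintained by the speed layer.
realtime_view = {"tower_a": 1, "tower_c": 2}
```

In a Kappa-style design the batch path disappears: the same stream is simply replayed through one processing pipeline whenever views need to be rebuilt.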
Druid.io (supports both Kappa & Lambda)
Druid is an open-source data store designed for sub-second
queries on real-time and historical data. It is primarily used for
business intelligence (OLAP) queries on event data. Druid
provides low latency (real-time) data ingestion, flexible data
exploration, and fast data aggregation.
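To make "fast data aggregation on event data" concrete, here is a minimal sketch of how a per-minute event count might be requested through Druid's SQL-over-HTTP endpoint, which is available in recent Druid versions. The broker address `http://localhost:8082` and the datasource name `cell_tower_events` are assumptions for illustration; the code only builds the request and does not contact a cluster.

```python
import json
import urllib.request


def build_sql_request(broker_url, datasource, minutes=5):
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    sql = (
        "SELECT TIME_FLOOR(__time, 'PT1M') AS minute, COUNT(*) AS events "
        f'FROM "{datasource}" '
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{int(minutes)}' MINUTE "
        "GROUP BY 1 ORDER BY 1"
    )
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        broker_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical broker URL and datasource name.
request = build_sql_request("http://localhost:8082", "cell_tower_events")
# Against a live cluster, urllib.request.urlopen(request) would return JSON rows.
```

The query filters on `__time`, Druid's built-in timestamp column, so only the most recent segments are scanned.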
More Details… Druid.io
Key Features
• Sub-second OLAP Queries: Druid’s unique architecture enables rapid multi-dimensional filtering, ad-hoc attribute groupings, and extremely fast aggregations.
• Real-time Streaming Ingestion: Druid employs lock-free ingestion to allow for simultaneous ingestion and querying of high dimensional, high volume data sets. Explore events immediately after they occur.
• Power Analytic Applications: Druid has numerous features built for multi-tenancy. Power user-facing analytic applications designed to be used by thousands of concurrent users.
• Cost Effective: Druid is extremely cost effective at scale and has numerous features built in for cost reduction. Trade off cost and performance with simple configuration knobs.
• Highly Available: Druid is used to back SaaS implementations that need to be up all the time. Druid supports rolling updates so your data is still available and queryable during software updates. Scale up or down without data loss.
• Scalable: Existing Druid deployments handle trillions of events, petabytes of data, and thousands of queries every second.
My Experience
While listening to a customer describe their challenge, I mentioned Druid
(of which I had only theoretical knowledge) as the best fit for the
scenario, and this turned into a request to showcase a short demo.
It took me 5 days to build a simple end-to-end demo on the Hortonworks
Sandbox without a line of programming code: a simulator generating
real-time events, Kafka capturing and ingesting those events, an ingestion
spec (dimensions and measures) for Druid, and a dashboard developed in
Superset.
Demo - Cell Tower Capacity Monitoring in Real Time on a Geo Map
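The demo itself needed no hand-written code. Purely to illustrate the kind of events such a simulator might emit, here is a small Python sketch. The field names `timestamp`, `tower_id`, and `active_connections`, and the value ranges, are assumptions and not the schema used in the original demo.

```python
import json
import random
from datetime import datetime, timedelta, timezone


def simulate_tower_events(tower_ids, start, count, step=timedelta(seconds=1), seed=0):
    """Yield JSON strings, one per simulated cell-tower reading."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same events
    for i in range(count):
        event = {
            "timestamp": (start + i * step).isoformat(),
            "tower_id": rng.choice(tower_ids),
            "active_connections": rng.randint(0, 500),
        }
        yield json.dumps(event)


start = datetime(2018, 1, 1, tzinfo=timezone.utc)
events = list(simulate_tower_events(["tower_a", "tower_b"], start, count=3))
```

In a pipeline like the one described, each JSON string would be published to a Kafka topic by a producer client, which is not shown here.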
Design Pattern Used for Demo
[Figure: four numbered design patterns (1 to 4). Each is built on the flow Real-time Stream -> Topic -> Ingestion Spec -> Real-time Exploration / BI and Offline Analysis & Analytics; the variants add Data Enrichment and Batch Data, Tranquility, and Pipeline Development.]
Druid.io – Design Patterns
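The "Topic -> Ingestion Spec" step that recurs in these patterns is where dimensions and measures are declared. The sketch below builds such a spec as a Python dictionary and serializes it to JSON. The layout follows the Kafka supervisor spec as documented for recent Druid versions, to the best of my understanding; older releases (and the Tranquility route) use a different layout, so treat the field structure as an assumption to verify against your version. The datasource, topic, column names, and broker address are hypothetical.

```python
import json


def kafka_supervisor_spec(datasource, topic, dimensions, sum_metrics, bootstrap_servers):
    """Build a Kafka ingestion spec as a plain dict (dimensions and measures)."""
    metrics = [{"type": "count", "name": "event_count"}]
    metrics += [
        {"type": "longSum", "name": f"{field}_sum", "fieldName": field}
        for field in sum_metrics
    ]
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "metricsSpec": metrics,
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                    "rollup": True,  # pre-aggregate rows per minute and dimension set
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Hypothetical names throughout.
spec = kafka_supervisor_spec(
    datasource="cell_tower_events",
    topic="tower-readings",
    dimensions=["tower_id"],
    sum_metrics=["active_connections"],
    bootstrap_servers="localhost:9092",
)
spec_json = json.dumps(spec, indent=2)
```

With rollup enabled, Druid stores one pre-aggregated row per minute and dimension combination instead of every raw event, which is one of the cost and performance knobs mentioned under the key features.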
Druid.io Data Flow and Sample Reference Architecture
Druid.io Data Flow
Reference Architecture
Reference Architecture (Logical)
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 

Relevance of time series databases & druid.io

  • 1. RELEVANCE OF TIME-SERIES DATABASES FOR REAL-TIME SOLUTIONS. By: Muniraju Venkatesha Reddy (Muniraju_v@hotmail.com)
  • 2. Agenda: (1) About Me: quick introduction about me and my work. (2) Now(): real-time scenarios and the importance of NOW(). (3) Challenges & Opportunities: shifting business focus, opportunities and use cases. (4) Solution: technology challenges and alternatives. (5) Time-Series Databases: quick view of various time-series databases. (6) Druid.io: why druid.io is the best fit for real-time analytics, my experience, demo, design patterns and reference architecture.
  • 3. Muniraju Venkatesha Reddy (Muni) is a lead solutions architect with 18 years of overall experience in the following areas: Big Data Analytics & AI | DW & BI | Solution Strategy and Advisory | Cloud Data Lake & Insights Solutions (IaaS/PaaS/SaaS) | Enterprise Architecture | Consulting | Product & Process Engineering | Delivery Planning & Execution | Customer Relations | Business Growth | People Management. In my current role with HCL America, I work with customers across industry verticals on designing next-generation solutions by leveraging Big Data, Analytics & AI. Reach me at: Muniraju_v@hotmail.com
  • 5. NOW(): Your flight is at 9:00 AM, for an important customer meeting later in the day in a nearby city. There is a traffic incident at 7:30 and major disruption on the airport roadway. Do you want to know now()? Or do you want to know later?
  • 6. NOW(): You went out for a family and friends dinner event at 6 PM. There is a burglary at your house. Do you want to know now()? Or do you want to know later?
  • 7. NOW(): You are a warehouse manager for a large supply chain business serving retailers. You are expecting a large stock of goods to be delivered tonight for next week's festival season. A severe weather condition is about to be declared, and because of transportation disruptions the arrival of the goods will be delayed by more than 4 days. Do you want to know now()? Or do you want to know later?
  • 8. Important things are happening NOW(). NOW() is the only time life really happens, whether for an individual or a business.
  • 10. Batch() vs. NOW()… Batch: “The collection and storage of data, for processing at a scheduled time when a sufficient amount of data has been accumulated.” NOW: “The immediate processing of data after the transaction occurs, with the database being updated at the time of the event.” [Chart: values reported per period (Week 1 to Week 4, Month 3, Qtr. 3), ranging from 2.4K to 2.9M.]
  • 11. Shift in Business Focus…
  • 12. Business Opportunities & Use Cases. Four areas: Cutting Preventable Losses, Routine Operations, Missing Opportunities, New Opportunities.
      • Closed-loop control systems (Manufacturing)
      • Systems & network monitoring (IT)
      • Field asset monitoring (Multiple)
          o Vending machines, oil rigs, fleet management, telecom
      • Transaction processing (Finance)
          o Fraud, validations, authentications
      • Complex ICU analytics (Healthcare)
      • Disaster warning systems (Environment)
      • Fraudulent trades (Finance)
      • Preventive maintenance (Manufacturing)
      • Customer churn (Marketing)
      • Brand reputation (Marketing)
      • Social media (Sentiments)
      • Missed opportunities: revenue
          o Customer services: social, products, advertisements, up-sell / cross-sell
      • Missed opportunities: efficiency
          o Supply chain, quality of service, insurance, capacity management
      • Autonomous diagnostics and connected automotive
      • Tractors are becoming soil sensors
      • Community Wi-Fi (global projections: 94M in 2016 to 541M in 2021)
      • Cyber security, smarter surveillance, …
  • 14. Solution… (RDBMS, NoSQL, time series)
      • Primarily updates
      • Performance
      • Memory swapping
      • Scaling
      • I/O intensive
      • Distributed keys
      • Memory intensive
      • Complex index structure
      • Poor secondary index support
      • Primarily inserts
      • Recent time interval
      • Key association with timestamp
      • Time/space chunking
          o Fixed-duration intervals
          o Fixed-size chunks
          o Adaptive intervals
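The "fixed-duration intervals" form of time chunking listed on this slide can be illustrated with a minimal Python sketch. It is not taken from the deck: the function names `chunk_start` and `chunk_events` and the sample readings are hypothetical, and real time-series stores implement chunking in their storage engines rather than in application code.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Reference point for aligning chunk boundaries (an assumption of this sketch).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts: datetime, interval: timedelta) -> datetime:
    """Return the start of the fixed-duration interval that contains ts."""
    elapsed = ts - EPOCH
    # timedelta // timedelta yields the whole number of intervals elapsed.
    return EPOCH + (elapsed // interval) * interval


def chunk_events(events, interval: timedelta):
    """Group (timestamp, value) pairs into chunks keyed by interval start."""
    chunks = defaultdict(list)
    for ts, value in events:
        chunks[chunk_start(ts, interval)].append((ts, value))
    return dict(chunks)


# Hypothetical sample readings, all in UTC.
sample = [
    (datetime(2018, 5, 1, 10, 5, tzinfo=timezone.utc), 2.5),
    (datetime(2018, 5, 1, 10, 59, tzinfo=timezone.utc), 2.6),
    (datetime(2018, 5, 1, 11, 0, tzinfo=timezone.utc), 2.4),
]

for start, rows in sorted(chunk_events(sample, timedelta(hours=1)).items()):
    print(start.isoformat(), len(rows))
```

Because inserts are mostly for the recent time interval, new rows land in the newest chunk only, and older chunks can be treated as read-only.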
  • 19. Evolving Architecture Standards… Lambda (λ). [Diagram of the Lambda architecture. Mediation layer: streams are enriched and split, then split -> transform -> sink. Batch layer: immutable data (HDFS), batch recompute, precomputed views. Speed layer: stream processing, incremental real-time views. Serving layer: batch views, real-time views and merged views, exposed as the enterprise data view (View 1, View 2, … View n). Scenario labels: research scenarios (<= 30 to 90 sec), offline scenarios (<= 5 secs), online scenarios (<= 5 secs).]
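In a Lambda architecture the serving layer answers a query by merging a batch view (recomputed from the immutable data) with a real-time view (incremented per event). The sketch below is a toy illustration under that assumption; the function names, the `tower` field and the event data are hypothetical and do not come from the deck.

```python
from collections import Counter


def recompute_batch_view(immutable_events):
    """Batch layer: recompute the view from the full immutable data set."""
    return Counter(event["tower"] for event in immutable_events)


def increment_realtime_view(realtime_view, event):
    """Speed layer: update the real-time view one event at a time."""
    realtime_view[event["tower"]] += 1


def merged_view(batch_view, realtime_view):
    """Serving layer: combine batch and real-time counts for queries."""
    return batch_view + realtime_view


# Events already covered by the last batch run (hypothetical data).
historical = [{"tower": "A"}, {"tower": "A"}, {"tower": "B"}]
batch = recompute_batch_view(historical)

# Events that arrived after the batch run started.
realtime = Counter()
for event in [{"tower": "A"}, {"tower": "C"}]:
    increment_realtime_view(realtime, event)

print(dict(merged_view(batch, realtime)))
```

When the next batch run completes, the real-time view for the period it covers can be discarded, which is what keeps the speed layer small.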
  • 20. Druid.io (supports both Kappa & Lambda). Druid is an open-source data store designed for sub-second queries on real-time and historical data. It is primarily used for business intelligence (OLAP) queries on event data. Druid provides low-latency (real-time) data ingestion, flexible data exploration, and fast data aggregation.
      Key Features
      • Sub-second OLAP queries: Druid’s unique architecture enables rapid multi-dimensional filtering, ad-hoc attribute groupings, and extremely fast aggregations.
      • Real-time streaming ingestion: Druid employs lock-free ingestion to allow for simultaneous ingestion and querying of high-dimensional, high-volume data sets. Explore events immediately after they occur.
      • Power analytic applications: Druid has numerous features built for multi-tenancy. Power user-facing analytic applications designed to be used by thousands of concurrent users.
      • Cost effective: Druid is extremely cost effective at scale and has numerous features built in for cost reduction. Trade off cost and performance with simple configuration knobs.
      • Highly available: Druid is used to back SaaS implementations that need to be up all the time. Druid supports rolling updates, so your data is still available and queryable during software updates. Scale up or down without data loss.
      • Scalable: Existing Druid deployments handle trillions of events, petabytes of data, and thousands of queries every second.
      My Experience
      While listening to a customer's challenge, I mentioned that Druid (of which I had only theoretical knowledge) would best fit the scenario, and this turned into a request to showcase a short demo. It took me 5 days to build a simple end-to-end demo on the Hortonworks Sandbox without a line of programming code: a simulator generating real-time events, capturing and ingesting those events via Kafka, creating an ingestion spec (dimensions and measures) for Druid, and developing a dashboard using Superset.
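To give a feel for what an ingestion spec with dimensions and measures looks like, the sketch below builds one as a Python dictionary and prints it as JSON. It is not the spec used in the demo. The layout follows recent Apache Druid Kafka supervisor specs as far as I know them; older releases, such as the one bundled with the Hortonworks Sandbox at the time, use a different shape, so check the documentation for your version. The data source, topic, column names and broker address are all hypothetical.

```python
import json


def build_kafka_supervisor_spec(datasource, topic, dimensions, sum_measures,
                                bootstrap_servers="localhost:9092"):
    """Assemble a Kafka ingestion spec: dimensions are stored as-is,
    measures are rolled up with a sum, plus an event count."""
    metrics = [{"type": "count", "name": "count"}]
    for column in sum_measures:
        metrics.append(
            {"type": "doubleSum", "name": f"sum_{column}", "fieldName": column}
        )
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes each event carries an ISO-8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "metricsSpec": metrics,
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "MINUTE",
                    "rollup": True,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Hypothetical names loosely modelled on a cell tower capacity scenario.
spec = build_kafka_supervisor_spec(
    datasource="cell_tower_events",
    topic="tower-events",
    dimensions=["tower_id", "region", "latitude", "longitude"],
    sum_measures=["active_calls", "bandwidth_used"],
)
print(json.dumps(spec, indent=2))
```

A spec like this is normally submitted to Druid's supervisor endpoint or pasted into its web console, after which a tool such as Superset can chart the resulting data source.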
  • 21. Demo: cell tower capacity monitoring in real time on a geo map. [Diagram: design pattern used for the demo.]
  • 22. Druid.io: Design Patterns. [Diagram: four numbered design patterns (1 to 4), each built from the components Real-time Stream, Topic, Ingestion Spec, Real-time Exploration / BI, and Offline Analysis & Analytics. The variants add Data Enrichment, Batch Data, Tranquility, and Pipeline Development.]
  • 23. Druid.io Data Flow and Sample Reference Architecture. [Diagrams: Druid.io data flow; reference architecture (logical).]