BIG DATA ANALYTICS FOR
REAL TIME SYSTEMS
Kamalika Dutta Manasi Jayapal
Overview
 Introduction
 Big Data Analytics
 Real Time Systems
 Challenges of Real Time Analytics
 Technologies
 Tools
 Use Cases
 Future Work and Conclusion
2 Big Data Analytics for Real Time Systems
Where does Big Data come from?
Courtesy: http://goo.gl/JWswfj
What makes it Big Data?
Courtesy: Oracle
VARIABILITY
Evolution of Big Data
1960s 1967
Automatic Data
Compression
1997
Information Explosion
Our Literature Survey!
Big Data Analytics
“Big data analytics is the process of examining large data sets to
uncover hidden patterns, unknown correlations, market trends,
customer preferences and other useful business information.”
 Predictive Analysis
 Text Analysis
 Data Mining
 Statistical Analysis
Courtesy: smartdatacollective.com
Sample Systems
Analytics & 3 V's
Courtesy: watalon.com
Real Time Systems
“A real-time system is one that processes information and produces a
response within a specified time, else risk severe consequences,
sometimes including failure.”
 Telecommunication Systems
 Anti-Lock Brakes in a Car
 Air Traffic Control System
 Weather Forecasting System
Courtesy: yourdon.com
Real-Time Analytics of Big Data
What is Happening? (figure: data scale from Kilobytes/Sec and Megabytes/Sec through Gigabytes to Terabytes and Petabytes to Exabytes; response time in Milliseconds, Seconds, Minutes, and Minutes to Hours; regions labelled Big Data and Real Time)
Courtesy: infochimps.com
Challenges of Real Time Analytics
Expensive: Complex architecture, batch processing
Semi-structured and Unstructured Data: New sources are unpredictable; relational databases cannot handle them, leaving us hamstrung
Market too Dynamic to Predict: Subscribers' preferences change; competition accelerates the change
Scalability: Requires sub-second response times at loads greater than a single server can handle
Thinking Beyond Hadoop!
Manage & store huge volume of any data: Hadoop File System, MapReduce
Manage streaming data: Stream Computing
Analyze unstructured data: Text Analytics Engine
Structure and control data: Data Warehousing
Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
Understand and navigate federated big data sources: Federated Discovery and Navigation
Courtesy: IBM
Our Solution
 Do the impossible: Incorporate any kind of data
 Scale Big: Scale without any complexity
 Not Time Consuming: Seconds to Minutes
 Real Time: Try to analyze data without expensive data warehouse loads
Powerful Analytics, In Place, In Real Time.
Courtesy: slideshare.com
In-Memory Computing
In-memory computing primarily relies on keeping data in a server's RAM as a
means of processing at faster speeds. It uses a type of middleware software that
allows one to store data in RAM, across a cluster of computers, and process it in
parallel.
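As a rough illustration of the idea above, the sketch below keeps records in RAM, split into partitions, and aggregates the partitions in parallel. The sample records, the partition count and the helper names (`partition`, `partial_sum`) are assumptions made for this example; a real in-memory data grid would spread the partitions across machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_parts):
    """Split records into n_parts in-memory partitions (round-robin)."""
    parts = [[] for _ in range(n_parts)]
    for i, rec in enumerate(records):
        parts[i % n_parts].append(rec)
    return parts

def partial_sum(part):
    """Aggregate one partition; runs independently of the others."""
    return sum(amount for _, amount in part)

# Hypothetical records: (customer_id, purchase_amount), all held in RAM.
records = [("c%d" % (i % 5), i) for i in range(1, 101)]
parts = partition(records, n_parts=4)

# Process the partitions in parallel, then combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, parts))
```

Because each partition is aggregated on its own, adding partitions (and workers) is the lever for scaling out.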
Courtesy: Stratecast
Stream Processing
Courtesy: EMC
 Stream-processing systems operate on continuous data streams, e.g. click streams on web pages, user request/query streams, monitoring events, notifications, etc.
 Stream processing delivers real-time analytic processing on constantly changing data in motion.
 Analyse first, store later!
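A minimal sketch of "analyse first, store later", assuming a click stream of `(timestamp, page)` pairs that arrives in time order: each event updates a sliding-window count the moment it arrives, before anything is written to storage. The function name `windowed_counts` and the sample clicks are hypothetical.

```python
from collections import Counter, deque

def windowed_counts(events, window_seconds):
    """Yield (timestamp, {page: clicks}) over a sliding time window.

    events is an iterable of (timestamp, page) pairs in time order.
    """
    window = deque()      # events currently inside the window
    counts = Counter()
    for ts, page in events:
        window.append((ts, page))
        counts[page] += 1
        # Evict events that have fallen out of the window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_page = window.popleft()
            counts[old_page] -= 1
            if counts[old_page] == 0:
                del counts[old_page]
        yield ts, dict(counts)

clicks = [(0, "/home"), (1, "/cart"), (2, "/home"), (12, "/home")]
results = list(windowed_counts(clicks, window_seconds=10))
```

Only the events inside the window are held in memory, so the state stays bounded however long the stream runs.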
Complex Event Processing
Complex Event Processing (CEP) processes multiple event streams generated within the enterprise to construct data abstractions and identify meaningful patterns among those streams.
 Analytics across both real-time and historical data.
 Real-time event capture, filtering, pattern detection, matching, and aggregation.
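The filtering, pattern detection and aggregation steps can be sketched as follows. The rule used here (three failed logins by the same user within 60 seconds) and the event layout `(timestamp, user, kind)` are assumptions chosen for illustration, not taken from any particular CEP engine.

```python
from collections import defaultdict, deque

def detect_bursts(events, kind, threshold, within):
    """Return (key, timestamp) alerts raised when `threshold` events of
    `kind` for the same key occur within `within` seconds."""
    recent = defaultdict(deque)   # key -> timestamps of matching events
    alerts = []
    for ts, key, event_kind in events:
        if event_kind != kind:                    # filtering
            continue
        times = recent[key]
        times.append(ts)
        while times and times[0] < ts - within:   # keep only the window
            times.popleft()
        if len(times) >= threshold:               # pattern matched
            alerts.append((key, ts))
            times.clear()                         # start a fresh pattern
    return alerts

events = [
    (0, "alice", "login_failed"),
    (5, "bob", "login_ok"),
    (10, "alice", "login_failed"),
    (20, "alice", "login_failed"),
    (200, "bob", "login_failed"),
]
alerts = detect_bursts(events, "login_failed", threshold=3, within=60)
```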
Tools for Real Time Analytics
Big Data is NOT new, the Tools ARE!
IBM InfoSphere Streams
Kafka
 A high-performance distributed publish-subscribe messaging system.
 Designed for processing real-time activity stream data.
 Initially developed at LinkedIn, now an Apache project.
 Kafka works in combination with Apache Storm, Apache HBase and Apache Spark for real-time analysis and rendering of streaming data.
 Fast
 Scalable
 Durable
 Fault-tolerant
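To make the publish-subscribe idea concrete, here is a toy in-memory model of a topic as an append-only log where each subscriber tracks its own read position. It is only a sketch of the concept: the `Topic` class and its methods are hypothetical and are not the Kafka client API, and durability, partitioning and replication are left out.

```python
class Topic:
    """Toy append-only log; subscribers read at their own pace."""

    def __init__(self):
        self.log = []        # messages are retained, not removed on read
        self.offsets = {}    # subscriber name -> next offset to read

    def publish(self, message):
        self.log.append(message)
        return len(self.log) - 1      # offset of the new message

    def poll(self, subscriber, max_messages=10):
        start = self.offsets.get(subscriber, 0)
        batch = self.log[start:start + max_messages]
        self.offsets[subscriber] = start + len(batch)
        return batch

topic = Topic()
for page in ["/home", "/cart", "/checkout"]:
    topic.publish({"event": "page_view", "page": page})

dashboard = topic.poll("dashboard")           # reads all three messages
archiver_first = topic.poll("archiver", 2)    # independent read position
archiver_rest = topic.poll("archiver", 2)
```

Keeping the log and letting each subscriber hold its own offset is what allows a fast consumer and a slow consumer to share one stream.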
Storm
 A distributed real-time computation system.
 Created at BackType, which was acquired by Twitter; Twitter then open-sourced it.
 Twitter claims, “Over a million tuples processed per second per node.”
 Fast, Scalable, Reliable and Fault-tolerant.
 Stream: Unbounded sequence of tuples
 Primitives
 Spouts: Pull messages
 Bolts: Perform core functions of stream computing
Stream
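The spout and bolt primitives can be imitated with plain Python generators, as in the word-count sketch below. This is not Storm's API: the function names are hypothetical, the stream is finite so the example terminates, and everything runs in one process rather than across a cluster.

```python
from collections import Counter

def sentence_spout():
    """Spout: source of the stream, emits one tuple per sentence."""
    for sentence in ["big data", "real time data", "big streams"]:
        yield (sentence,)

def split_bolt(stream):
    """Bolt: turns each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)

def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) tuples."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])

# Wire the topology: spout -> split bolt -> count bolt.
emitted = list(count_bolt(split_bolt(sentence_spout())))
final_counts = dict(emitted)   # the last emitted count per word wins
```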
Spark Streaming
 Was developed in the AMPLab at UC Berkeley.
 In-memory computing capabilities deliver speed.
 Low latency
 High throughput
 Fault tolerant
 New programming model:
 Discretized streams (DStreams)
 Resilient Distributed Datasets
Spark Streaming uses micro-batching to support continuous stream processing. It is an extension of Spark, which is a batch-processing system.
Courtesy: Apache Spark
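Micro-batching can be sketched without Spark itself: cut the stream into small time slices and run an ordinary batch computation on each slice. The helper `micro_batches`, the one-second interval and the sample events are assumptions for illustration; unlike this sketch, a real micro-batch engine also emits a batch for intervals in which no data arrived.

```python
def micro_batches(events, batch_interval):
    """Group time-ordered (timestamp, value) events into batches that
    each cover `batch_interval` seconds. Yields (batch_start, values)."""
    batch, batch_start = [], None
    for ts, value in events:
        start = (ts // batch_interval) * batch_interval
        if batch_start is not None and start != batch_start:
            yield batch_start, batch
            batch = []
        batch_start = start
        batch.append(value)
    if batch:
        yield batch_start, batch

events = [(0.2, 3), (0.9, 4), (1.1, 10), (3.5, 1)]
# Each batch is handled by an ordinary batch computation (here: a sum).
batch_sums = [(start, sum(values)) for start, values in micro_batches(events, 1)]
```

The batch interval is the trade-off knob: shorter intervals lower latency, longer ones amortise per-batch overhead.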
Spring XD (XD=eXtreme Data)
 Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export.
 The Spring XD framework supports streams for ingesting event-driven data, which passes from a source to a sink through any number of processors.
Courtesy: Infoq
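The source, processor and sink stages described above can be pictured with a small pipeline runner. This is a plain-Python sketch of the pattern only; `run_stream`, `parse`, `to_fahrenheit` and the sample readings are hypothetical and do not reflect Spring XD's own configuration or API.

```python
def run_stream(source, processors, sink):
    """Pass every item from source through the processors, in order,
    and hand the result to sink. A processor returning None drops the item."""
    for item in source:
        for process in processors:
            item = process(item)
            if item is None:
                break
        else:
            sink(item)

readings = [" 21.5", "n/a", "19.0 "]     # hypothetical source data

def parse(text):
    try:
        return float(text.strip())
    except ValueError:
        return None                       # filter out bad input

def to_fahrenheit(celsius):
    return round(celsius * 9 / 5 + 32, 1)

collected = []                            # stands in for the sink
run_stream(readings, [parse, to_fahrenheit], collected.append)
```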
Comparison of Tools (1)
| | Spark Streaming | Apache Storm | Spring XD |
| Definition | A fast and general-purpose cluster computing system. | A distributed real-time computation system. | A unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export. |
| Implemented in | Scala | Clojure | Java |
| Programming API | Scala, Java, Python | Java API, usable with any programming language. | Java |
| Development | A full top-level Apache project. | Undergoing Apache incubation. | Spring project by Pivotal. |
| Processing Model | Batch processing framework that also does micro-batching. | Stream processing framework that processes and dispatches messages as soon as they arrive. | Unified platform for stream processing. |
| Fault Tolerance | Recovery of lost work and restart of workers via the resource manager. | Workers and Supervisors restart as if nothing had happened. | Work is reassigned to a working container. |
Comparison of Tools (2)
| | Spark Streaming | Apache Storm | Spring XD |
| Data processing | Messages are not lost and are delivered once (small-scale batching). | Keeps track of each and every record. | Unacknowledged messages are retried until the container comes back. |
| Use Cases | Combines batch and stream processing (Lambda Architecture); Machine learning: improve performance of iterative algorithms; Power real-time dashboards. | Prevention of securities fraud, compliance violations, security breaches and network outages. | Stream tweets to Hadoop for sentiment analysis; High-throughput distributed data ingestion into HDFS from a variety of input sources; Real-time analytics at ingestion time, e.g. gathering metrics and counting values. |
Which tools are right for you?
Lambda Architecture
 In 2013, Nathan Marz and James Warren proposed the Lambda Architecture, a methodology for building Big Data systems.
 Such a system balances latency, throughput, and fault-tolerance by using batch processing to provide comprehensive and accurate pre-computed views, while simultaneously using real-time stream processing to provide dynamic views.
Marz, Nathan, and James Warren. Big Data: Principles and best practices of scalable real-time data systems. O'Reilly Media, 2013.
Courtesy: Trivadis
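A minimal sketch of how the two kinds of view combine at query time, assuming page-view events and simple counts as the view. The names `batch_view` and `query` and the sample data are hypothetical; in practice the batch view would be recomputed periodically by a batch system and the real-time view maintained by a stream processor.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute the view from all data seen so far."""
    return Counter(page for page, _ in master_dataset)

def query(page, batch, realtime):
    """Serving layer: merge the pre-computed view with recent counts."""
    return batch[page] + realtime[page]

# Hypothetical page-view events: (page, timestamp).
master_dataset = [("/home", 1), ("/cart", 2), ("/home", 3)]
batch = batch_view(master_dataset)      # slow, comprehensive

# Speed layer: events that arrived after the last batch run.
realtime = Counter()
for page, ts in [("/home", 4), ("/checkout", 5)]:
    realtime[page] += 1                 # incremental update

home_views = query("/home", batch, realtime)
```

Once the next batch run has absorbed the recent events, the real-time counts for that period can be discarded, which keeps the speed layer small.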
Lambda Architecture Example
Marz, Nathan, and James Warren. Big Data: Principles and best practices of scalable real-time data systems. O'Reilly Media, 2013.
Courtesy: Trivadis
Use Cases
 Healthcare
 Capture and analyze real-time data from medical monitors, alerting hospital staff to potential health problems before patients manifest clinical signs of infection or other issues.
 Analyze privacy-protected streams of medical device data to detect early signs of disease and identify correlations among multiple patients.
 Finance
 Analyze ticks, tweets, satellite imagery, weather trends, and any other type of data to inform trading algorithms in real time.
 Apply fraud insights to take action in real time. Use analytics on streaming data to confidently distinguish legitimate actions, prevent or interrupt suspicious ones, and respond immediately to criminal patterns and activities.
 Government
 Identify social program fraud within seconds based on program history, citizen profile, and geospatial data.
 Identify items or patterns for deeper investigation in cybersecurity.
 Transport
 Traffic managers can now respond quickly and accurately to relevant insights from real-time analytics drawn from data feeds and reports.
 Telematics can provide data-in-motion such as vehicle speed, data relating to the transmission control system, braking, air bags, tire pressure and wiper speed, as well as geospatial and current environmental conditions data. Hence, automotive companies can strengthen customer relationships.
 Telecommunication
 Improve customer profitability analysis, end-to-end visibility for new product rollouts, and real-time analysis to better serve network customers.
 Perform capacity planning for mobile networks as new high-bandwidth services are introduced. Improve customer experience.
 Retail
 See a product recurring in abandoned shopping carts. Run a promotion to close more sales of that product.
 Evaluate sales performance in real time. Take measures now to achieve sales quotas.
 An electronic coupon delivery service sends e-mails to customers with recommendations matched to their interests, derived from their location information, membership information, and information on nearby stores.
Courtesy: SAP
Future Work
 Increased Level of Merging
 Application of Social and Digital Media
 New Technologies
 Further Development of Telemetric Data
 Self Learning Systems
 Complex Statistical Methods
Conclusion
Resources, Privacy, Security, Time, Cost
“Consumer Data will be the biggest differentiator in the next two to three years.
Whoever unlocks the reams of data and uses it strategically, will win”
-Angela Ahrendts, CEO, Burberry
?
41 Big Data Analytics for Real Time Systems

More Related Content

What's hot

Introduction to Data Analytics
Introduction to Data AnalyticsIntroduction to Data Analytics
Introduction to Data Analytics
Utkarsh Sharma
 
Introduction to Data Analytics
Introduction to Data AnalyticsIntroduction to Data Analytics
Introduction to Data Analytics
Dr. C.V. Suresh Babu
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Mining Data Streams
Mining Data StreamsMining Data Streams
Mining Data Streams
SujaAldrin
 
Big data ecosystem
Big data ecosystemBig data ecosystem
Big data ecosystemmagda3695
 
Hadoop data management
Hadoop data managementHadoop data management
Hadoop data management
Subhas Kumar Ghosh
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streams
hktripathy
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Science
DataWorks Summit
 
Data Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisData Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data Analysis
Eva Durall
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
Ravi Nayak
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Dr. C.V. Suresh Babu
 
Building a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - WebinarBuilding a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - Webinar
Amazon Web Services
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
James Serra
 
Lecture #01
Lecture #01Lecture #01
Lecture #01
Konpal Darakshan
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
Davis David
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
MITS Gwalior
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
Prashant Gupta
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
Dr. Abdul Ahad Abro
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
ShivanandaVSeeri
 

What's hot (20)

Introduction to Data Analytics
Introduction to Data AnalyticsIntroduction to Data Analytics
Introduction to Data Analytics
 
Introduction to Data Analytics
Introduction to Data AnalyticsIntroduction to Data Analytics
Introduction to Data Analytics
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Mining Data Streams
Mining Data StreamsMining Data Streams
Mining Data Streams
 
Web content mining
Web content miningWeb content mining
Web content mining
 
Big data ecosystem
Big data ecosystemBig data ecosystem
Big data ecosystem
 
Hadoop data management
Hadoop data managementHadoop data management
Hadoop data management
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streams
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Science
 
Data Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisData Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data Analysis
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Building a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - WebinarBuilding a Modern Data Architecture on AWS - Webinar
Building a Modern Data Architecture on AWS - Webinar
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Lecture #01
Lecture #01Lecture #01
Lecture #01
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 

Viewers also liked

2016 IBM Interconnect - medical devices transformation
2016 IBM Interconnect  - medical devices transformation2016 IBM Interconnect  - medical devices transformation
2016 IBM Interconnect - medical devices transformation
Elizabeth Koumpan
 
Augmented Reality for E-Learning
Augmented Reality for E-LearningAugmented Reality for E-Learning
Augmented Reality for E-Learning
Kamalika Dutta
 
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
Kamalika Dutta
 
Augmented Reality in Education
Augmented Reality in Education Augmented Reality in Education
Augmented Reality in Education
K3 Hamilton
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
Bernard Marr
 
Liberating medical device data for clinical research
Liberating medical device data for clinical researchLiberating medical device data for clinical research
Liberating medical device data for clinical researchJohn Zaleski
 
Web, gaming, mobile : quel développeur serez-vous demain ?
Web, gaming, mobile : quel développeur serez-vous demain ?Web, gaming, mobile : quel développeur serez-vous demain ?
Web, gaming, mobile : quel développeur serez-vous demain ?
Microsoft
 
Rd big data & analytics v1.0
Rd big data & analytics v1.0Rd big data & analytics v1.0
Rd big data & analytics v1.0
Yadu Balehosur
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAccenture
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural Perspective
Sumit Kalra
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Enrico Palumbo
 
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
NAMSA
 
Meta Analysis of Medical Device Data Applications for Designing Studies and R...
Meta Analysis of Medical Device Data Applications for Designing Studies and R...Meta Analysis of Medical Device Data Applications for Designing Studies and R...
Meta Analysis of Medical Device Data Applications for Designing Studies and R...
NAMSA
 
The concept of Datalake with Hadoop
The concept of Datalake with HadoopThe concept of Datalake with Hadoop
The concept of Datalake with Hadoop
Avkash Chauhan
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analytics
ramikaurraminder
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Hortonworks
 
Studies in application of Augmented Reality in E Learning - Design Project 3
Studies in application of Augmented Reality in E Learning - Design Project 3Studies in application of Augmented Reality in E Learning - Design Project 3
Studies in application of Augmented Reality in E Learning - Design Project 3Mannu Amrit
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service Stefan Schwarz
 
Introduction to Big Data Analytics on Apache Hadoop
Introduction to Big Data Analytics on Apache HadoopIntroduction to Big Data Analytics on Apache Hadoop
Introduction to Big Data Analytics on Apache Hadoop
Avkash Chauhan
 

Viewers also liked (20)

2016 IBM Interconnect - medical devices transformation
2016 IBM Interconnect  - medical devices transformation2016 IBM Interconnect  - medical devices transformation
2016 IBM Interconnect - medical devices transformation
 
Augmented Reality for E-Learning
Augmented Reality for E-LearningAugmented Reality for E-Learning
Augmented Reality for E-Learning
 
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
Business Procedure Modelling and Digitization Toolbox - Master Thesis - Kamal...
 
Augmented Reality in Education
Augmented Reality in Education Augmented Reality in Education
Augmented Reality in Education
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Curved Arrows Flat
Curved Arrows FlatCurved Arrows Flat
Curved Arrows Flat
 
Liberating medical device data for clinical research
Liberating medical device data for clinical researchLiberating medical device data for clinical research
Liberating medical device data for clinical research
 
Web, gaming, mobile : quel développeur serez-vous demain ?
Web, gaming, mobile : quel développeur serez-vous demain ?Web, gaming, mobile : quel développeur serez-vous demain ?
Web, gaming, mobile : quel développeur serez-vous demain ?
 
Rd big data & analytics v1.0
Rd big data & analytics v1.0Rd big data & analytics v1.0
Rd big data & analytics v1.0
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural Perspective
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
 
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
Use of Chemical Characterization to Assess the Equivalency of Medical Devices...
 
Meta Analysis of Medical Device Data Applications for Designing Studies and R...
Meta Analysis of Medical Device Data Applications for Designing Studies and R...Meta Analysis of Medical Device Data Applications for Designing Studies and R...
Meta Analysis of Medical Device Data Applications for Designing Studies and R...
 
The concept of Datalake with Hadoop
The concept of Datalake with HadoopThe concept of Datalake with Hadoop
The concept of Datalake with Hadoop
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analytics
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
 
Studies in application of Augmented Reality in E Learning - Design Project 3
Studies in application of Augmented Reality in E Learning - Design Project 3Studies in application of Augmented Reality in E Learning - Design Project 3
Studies in application of Augmented Reality in E Learning - Design Project 3
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
 
Introduction to Big Data Analytics on Apache Hadoop
Introduction to Big Data Analytics on Apache HadoopIntroduction to Big Data Analytics on Apache Hadoop
Introduction to Big Data Analytics on Apache Hadoop
 

Similar to Big Data Analytics for Real Time Systems

Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016
Stavros Kontopoulos
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
Dr. Shikha Mehta
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
confluent
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
ElsonPaul2
 
Tapping the cloud for real time data analytics
 Tapping the cloud for real time data analytics Tapping the cloud for real time data analytics
Tapping the cloud for real time data analytics
Amazon Web Services
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
trendwiseanalytics1
 
Big Data : Risks and Opportunities
Big Data : Risks and OpportunitiesBig Data : Risks and Opportunities
Big Data : Risks and Opportunities
Kenny Huang Ph.D.
 
Innovating With Data and Analytics
Innovating With Data and AnalyticsInnovating With Data and Analytics
Innovating With Data and Analytics
VMware Tanzu
 
Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016
Stavros Kontopoulos
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
SingleStore
 
Stream Meets Batch for Smarter Analytics- Impetus White Paper
Stream Meets Batch for Smarter Analytics- Impetus White PaperStream Meets Batch for Smarter Analytics- Impetus White Paper
Stream Meets Batch for Smarter Analytics- Impetus White Paper
Impetus Technologies
 
Streaming analytics
Streaming analyticsStreaming analytics
Streaming analytics
Gerard McNamee
 
Analytics&IoT
Analytics&IoTAnalytics&IoT
Analytics&IoT
Selvaraj Kesavan
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Amazon Web Services
 
A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in Action
Amazon Web Services
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
Sreedhar Chowdam
 
SQL Server 2008 R2 StreamInsight
SQL Server 2008 R2 StreamInsightSQL Server 2008 R2 StreamInsight
SQL Server 2008 R2 StreamInsight
Eduardo Castro
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantage
Amazon Web Services
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
HPCC Systems
 
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...
Grid Dynamics
 

Big Data Analytics for Real Time Systems

  • 1. Big Data Analytics for Real Time Systems. Kamalika Dutta, Manasi Jayapal
  • 2. Overview
      - Introduction
      - Big Data Analytics
      - Real Time Systems
      - Challenges of Real Time Analytics
      - Technologies
      - Tools
      - Use Cases
      - Future Work and Conclusion
  • 4. Where does Big Data come from? Courtesy: http://goo.gl/JWswfj
  • 5. What makes it Big Data? [Figure; legible label: VARIABILITY] Courtesy: Oracle
  • 6. Evolution of Big Data. [Timeline figure; labels: 1960s, 1967, Automatic Data Compression, 1997, Information Explosion, Our Literature Survey!]
  • 8. Big Data Analytics
      “Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information.”
      - Predictive Analysis
      - Text Analysis
      - Data Mining
      - Statistical Analysis
      Courtesy: smartdatacollective.com
  • 9. Sample Systems
  • 10. Analytics & 3 V's. Courtesy: watalon.com
  • 12. Real Time Systems
      “A real-time system is one that processes information and produces a response within a specified time, else risk severe consequences, sometimes including failure.”
      - Telecommunication Systems
      - Anti-Lock Brakes in a Car
      - Air Traffic Control System
      - Weather Forecasting System
      Courtesy: yourdon.com
  • 13. Real-Time Analytics of Big Data. [Figure. Labels: What is Happening?; Kilobytes/Sec, Megabytes/Sec, Gigabytes to Terabytes, Petabytes to Exabytes; Milliseconds, Seconds, Minutes, Minutes to Hours; Big Data; Real Time] Courtesy: infochimps.com
  • 15. Challenges of Real Time Analytics
      - Expensive: complex architecture and batch processing.
      - Semi-structured and unstructured data: new sources are unpredictable, and relational databases cannot handle them, leaving us hamstrung.
      - Market too dynamic to predict: subscribers' preferences change, and competition accelerates the change.
      - Scalability: sub-second response times are required, which is more than a single server can handle.
  • 16. Thinking Beyond Hadoop!
      - Manage and store huge volumes of any data: Hadoop File System, MapReduce
      - Manage streaming data: Stream Computing
      - Analyze unstructured data: Text Analytics Engine
      - Structure and control data: Data Warehousing
      - Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
      - Understand and navigate federated big data sources: Federated Discovery and Navigation
      Courtesy: IBM
  • 17. Our Solution: Powerful Analytics, In Place, In Real Time.
      - Do the impossible: incorporate any kind of data
      - Scale big: scale without any complexity
      - Not time-consuming: seconds to minutes
      - Real time: try to analyze data without expensive data warehouse loads
      Courtesy: slideshare.com
  • 19. In-Memory Computing. In-memory computing keeps data in a server's RAM rather than on disk so that it can be processed faster. Middleware software stores the data in RAM across a cluster of computers and processes it in parallel. Courtesy: Stratecast
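The idea of holding data in RAM across several machines and processing it in parallel can be illustrated with a minimal single-process sketch. This is not any particular in-memory product: the "nodes" are plain Python lists, threads stand in for cluster machines, and the names `partition` and `parallel_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_nodes):
    """Spread records round-robin over n_nodes in-memory partitions."""
    parts = [[] for _ in range(n_nodes)]
    for i, record in enumerate(records):
        parts[i % n_nodes].append(record)
    return parts


def parallel_sum(parts):
    """Sum each partition concurrently, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return sum(pool.map(sum, parts))


# Assumed sample data: 100 sensor readings kept entirely in memory.
readings = list(range(1, 101))
parts = partition(readings, n_nodes=4)
total = parallel_sum(parts)
```

Each partition is aggregated independently and only the small partial results are combined, which is the pattern that lets such systems scale out over a cluster.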
  • 20. Stream Processing
      - Stream-processing systems operate on continuous data streams, e.g. click streams on web pages, user request/query streams, monitoring events, notifications, etc.
      - Stream processing delivers real-time analytic processing on constantly changing data in motion.
      - Analyse first, store later!
      Courtesy: EMC
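A small sketch may help make "analyse first, store later" concrete. It assumes a click stream represented as dictionaries with a `page` key; the function name `analyse_stream` and the sample events are hypothetical, and a real system would read from a network source rather than a list.

```python
from collections import Counter


def analyse_stream(events):
    """Yield the running click count per page after each incoming event."""
    counts = Counter()
    for event in events:
        counts[event["page"]] += 1
        yield event["page"], counts[event["page"]]


# Assumed sample click stream.
clicks = [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}]

results = []   # analytic results, available as each event arrives
archive = []   # storage, written only after the analysis step
for page, running_count in analyse_stream(clicks):
    results.append((page, running_count))
    archive.append(page)
```

Because the generator updates its state one event at a time, the result for an event is ready before that event is archived.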
  • 21. Complex Event Processing. Complex Event Processing (CEP) processes multiple event streams generated within the enterprise to construct data abstractions and identify meaningful patterns among those streams.
      - Analytics across both real-time and historical data.
      - Real-time event capture, filtering, pattern detection, matching, and aggregation.
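As an illustration of filtering, pattern detection, and aggregation over events, the sketch below flags a user who produces several failed logins within a sliding time window. The event layout `(timestamp, user, kind)`, the threshold, and the name `detect_bursts` are assumptions made for this example, and events are assumed to arrive in time order.

```python
from collections import defaultdict, deque


def detect_bursts(events, threshold=3, window=60):
    """Return (user, timestamp) alerts when `threshold` failed logins
    by the same user fall within `window` seconds of each other."""
    recent = defaultdict(deque)   # user -> timestamps of recent failures
    alerts = []
    for timestamp, user, kind in events:
        if kind != "login_failed":          # filtering
            continue
        failures = recent[user]
        failures.append(timestamp)
        while timestamp - failures[0] > window:
            failures.popleft()              # drop events outside the window
        if len(failures) >= threshold:      # pattern matched
            alerts.append((user, timestamp))
    return alerts


# Assumed sample events, ordered by timestamp (seconds).
events = [
    (0, "ann", "login_failed"),
    (10, "bob", "login_failed"),
    (20, "ann", "login_failed"),
    (30, "ann", "login_ok"),
    (45, "ann", "login_failed"),
    (200, "bob", "login_failed"),
]
alerts = detect_bursts(events)
```

Only the third failure by `ann` inside the 60-second window raises an alert; the two failures by `bob` are too far apart.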
  • 23. Tools for Real Time Analytics. Big Data is NOT new, the Tools ARE! IBM InfoSphere Streams
  • 24. Kafka
      - A high-performance distributed publish-subscribe messaging system.
      - Designed for processing real-time activity stream data.
      - Initially developed at LinkedIn, now an Apache project.
      - Kafka works in combination with Apache Storm, Apache HBase and Apache Spark for real-time analysis and rendering of streaming data.
      - Fast, scalable, durable, fault-tolerant.
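The publish-subscribe model can be sketched without any Kafka client library. The class below is a toy in-memory stand-in, not Kafka's API: `MiniBroker`, `publish`, and `poll` are hypothetical names, and it only illustrates the idea of an append-only log per topic that independent consumer groups read at their own offsets.

```python
from collections import defaultdict


class MiniBroker:
    """Toy publish-subscribe broker: one append-only log per topic."""

    def __init__(self):
        self.logs = defaultdict(list)     # topic -> list of messages
        self.offsets = defaultdict(int)   # (group, topic) -> next offset

    def publish(self, topic, message):
        """Append a message to the topic log and return its offset."""
        self.logs[topic].append(message)
        return len(self.logs[topic]) - 1

    def poll(self, group, topic, max_messages=10):
        """Return unread messages for this group and advance its offset."""
        start = self.offsets[(group, topic)]
        batch = self.logs[topic][start:start + max_messages]
        self.offsets[(group, topic)] = start + len(batch)
        return batch


broker = MiniBroker()
broker.publish("clicks", "a")
broker.publish("clicks", "b")
first_dashboard = broker.poll("dashboard", "clicks")
first_archiver = broker.poll("archiver", "clicks")
broker.publish("clicks", "c")
second_dashboard = broker.poll("dashboard", "clicks")
```

Because messages stay in the log after being read, the two groups each receive every message, and a group that falls behind can catch up later.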
  • 25. Storm
      - A highly distributed real-time computation system.
      - Originally developed at BackType, which was acquired by Twitter.
      - Twitter claims, “Over a million tuples processed per second per node.”
      - Fast, scalable, reliable and fault-tolerant.
      - Stream: an unbounded sequence of tuples.
      - Primitives:
          - Spouts: pull messages
          - Bolts: perform core functions of stream computing
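The spout and bolt primitives can be mimicked with Python generators. This is a conceptual sketch only and does not use Storm's API: the function names are hypothetical, and a real topology would run its spouts and bolts in parallel across a cluster rather than as a chain in one process.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: pulls messages from a source and emits them as tuples."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream):
    """Bolt: splits each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word.lower(),)


def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) tuples."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


# Wire the "topology": spout -> split bolt -> count bolt.
source = ["Storm is fast", "Storm is scalable"]
topology = count_bolt(split_bolt(sentence_spout(source)))
emitted = list(topology)
```

Each stage consumes an unbounded-style sequence of tuples and emits a new one, which is the stream abstraction described on the slide.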
  • 26. Spark Streaming
      - Developed in the AMPLab at UC Berkeley.
      - In-memory computing capabilities deliver speed:
          - Low latency
          - High throughput
          - Fault tolerant
      - New programming model:
          - Discretized streams (DStreams)
          - Resilient Distributed Datasets
      Spark Streaming uses micro-batching to support continuous stream processing. It is an extension of Spark, which is a batch-processing system. Courtesy: Apache Spark
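Micro-batching can be shown in a few lines of plain Python. The sketch does not use the Spark API: `micro_batches` is a hypothetical helper that cuts a stream of `(timestamp, value)` events into fixed-length intervals, after which each batch is processed with ordinary batch logic. For brevity it skips intervals that contain no events.

```python
def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches."""
    batches = {}
    for timestamp, value in events:
        index = int(timestamp // batch_interval)   # which interval this event falls in
        batches.setdefault(index, []).append(value)
    return [batches[index] for index in sorted(batches)]


# Assumed sample stream: timestamps in seconds, one-second batch interval.
events = [(0.2, 5), (0.9, 3), (1.1, 4), (2.5, 1), (2.7, 2)]
batches = micro_batches(events, batch_interval=1.0)

# Each small batch is then handled by a normal batch computation.
sums = [sum(batch) for batch in batches]
```

The batch interval is the main tuning knob: a shorter interval lowers latency, while a longer one amortizes per-batch overhead.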
  • 27. Spring XD (XD = eXtreme Data)
      - Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export.
      - The Spring XD framework supports streams for the ingestion of event-driven data from a source to a sink, passing through any number of processors.
      Courtesy: Infoq
  • 28. Comparison of Tools (1)
      Definition
        - Spark Streaming: A fast and general-purpose cluster computing system.
        - Apache Storm: A distributed real-time computation system.
        - Spring XD: A unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export.
      Implemented in
        - Spark Streaming: Scala
        - Apache Storm: Clojure
        - Spring XD: Java
      Programming API
        - Spark Streaming: Scala, Java, Python
        - Apache Storm: Java API, and usable with any programming language.
        - Spring XD: Java
      Development
        - Spark Streaming: A full top-level Apache project.
        - Apache Storm: Undergoing Apache project.
        - Spring XD: Spring project by Pivotal.
      Processing Model
        - Spark Streaming: Batch processing framework that also does micro-batching.
        - Apache Storm: Stream processing framework that processes and dispatches messages as soon as they arrive.
        - Spring XD: Unified platform for stream processing.
      Fault Tolerance
        - Spark Streaming: Recovery of lost work and restart of workers via the resource manager.
        - Apache Storm: Restart of workers and supervisors as if nothing ever happened.
        - Spring XD: Reassignment of work to a working container.
  • 29. Comparison of Tools (2)
      Data processing
        - Spark Streaming: Messages are not lost and are delivered once (small-scale batching).
        - Apache Storm: Keeps track of each and every record.
        - Spring XD: Unacknowledged messages are retried until the container comes back.
      Use Cases
        - Spark Streaming: Combines batch and stream processing (Lambda Architecture). Machine learning: improves performance of iterative algorithms. Powers real-time dashboards.
        - Apache Storm: Prevention of securities fraud, compliance violations, security breaches, and network outages.
        - Spring XD: Stream tweets to Hadoop for sentiment analysis. High-throughput distributed data ingestion into HDFS from a variety of input sources. Real-time analytics at ingestion time, e.g. gathering metrics and counting values.
  • 30. Which tools are right for you?
  • 31. Lambda Architecture
      - In 2013, Nathan Marz and James Warren proposed the Lambda Architecture, which attempts to provide a methodology for building a Big Data system.
      - Such a system balances latency, throughput, and fault tolerance by using batch processing to provide comprehensive and accurate pre-computed views, while simultaneously using real-time stream processing to provide dynamic views.
      Marz, Nathan, and James Warren. Big Data: Principles and best practices of scalable real-time data systems. O'Reilly Media, 2013. Courtesy: Trivadis
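The split between pre-computed batch views and dynamic real-time views can be sketched as follows. This is a minimal single-process illustration under assumed names (`batch_view`, `speed_view`, `query`) and an assumed page-view data set; it is not taken from the cited book.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute a complete, accurate view from all stored data."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: incremental view over events the batch run has not seen yet."""
    return Counter(event["page"] for event in recent_events)


def query(page, batch, speed):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch[page] + speed[page]


# Assumed sample data: events already absorbed by the last batch run,
# plus events that arrived afterwards.
master = [{"page": "/home"}, {"page": "/home"}, {"page": "/cart"}]
recent = [{"page": "/home"}]

batch = batch_view(master)
speed = speed_view(recent)
home_views = query("/home", batch, speed)
cart_views = query("/cart", batch, speed)
```

After the next batch run absorbs the recent events, the speed view for that period can be discarded, so errors in the real-time path are eventually corrected by the batch path.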
  • 32. Lambda Architecture Example. Marz, Nathan, and James Warren. Big Data: Principles and best practices of scalable real-time data systems. O'Reilly Media, 2013. Courtesy: Trivadis
  • 34. Use Cases
      Healthcare
        - Capture and analyze real-time data from medical monitors, alerting hospital staff to potential health problems before patients manifest clinical signs of infection or other issues.
        - Analyze privacy-protected streams of medical device data to detect early signs of disease and identify correlations among multiple patients.
      Finance
        - Analyze ticks, tweets, satellite imagery, weather trends, and any other type of data to inform trading algorithms in real time.
        - Apply fraud insights to take action in real time. Use analytics on streaming data to confidently distinguish legitimate actions, prevent or interrupt suspicious ones, and respond immediately to criminal patterns and activities.
  • 35. Use Cases
      Government
        - Identify social program fraud within seconds based on program history, citizen profile, and geospatial data.
        - Identify items or patterns for deeper investigation in cybersecurity.
      Transport
        - Traffic managers can now respond quickly and accurately to relevant insights from real-time analytics drawn from data feeds and reports.
        - Telematics can provide data in motion such as vehicle speed, data relating to the transmission control system, braking, air bags, tire pressure and wiper speed, as well as geospatial and current environmental conditions data. Hence, automotive companies can strengthen customer relationships.
  • 36. Use Cases
      Telecommunication
        - Improve customer profitability analysis, end-to-end visibility for new product rollouts, and real-time analysis to better serve network customers.
        - Perform capacity planning for mobile networks as new high-bandwidth services are introduced. Improve customer experience.
      Retail
        - See a product recurring in abandoned shopping carts. Run a promotion to close more sales of that product.
        - Evaluate sales performance in real time. Take measures now to achieve sales quotas.
        - An electronic coupon delivery service sends e-mails to customers with recommendations matched to their interests, derived from their location information, membership information, and information on nearby stores.
  • 37. Courtesy: SAP
  • 39. Future Work
      - Increased Level of Merging
      - Application of Social and Digital Media
      - New Technologies
      - Further Development of Telemetric Data
      - Self Learning Systems
      - Complex Statistical Methods
  • 40. Conclusion: Resources, Privacy, Security, Time, Cost. "Consumer Data will be the biggest differentiator in the next two to three years. Whoever unlocks the reams of data and uses it strategically, will win" - Angela Ahrendts, CEO, Burberry