3. Traditional Analytics – Data at Rest
[Diagram: source data (MS queues, events, XML files, databases, sensor data, social) and enterprise repositories (RDBMS, EDW, NoSQL) arrive as Feed 1 through Feed n. Stages shown: Extract/Transform, an optional Staging Area, Load, Analyze, and Visualize, using business analytics, business intelligence, and visualization tools.]
4. Next Generation – Data in Motion
• Organizations need to react to changing business conditions in real time
• Faster decision making across all industries
• Few companies outside of financial markets, telecom & utilities have experience with streaming
• Newer data sources, such as sensors and social media feeds
• Higher Volume and Greater Velocity
• More unstructured and semi-structured data
• Democratization of technologies
• Open Source Projects
• Large Scale Compute & Storage – Hadoop, NoSQL
• Streaming Technologies – Apex, Spark, Storm etc.
• Real-time dashboards and alert notification systems
• Beyond niche use cases
• Broad applicability but needs more adoption
6. Stream Processing
• Continuous processing of data as it flows through a system
• Allows users to act on events instantaneously via alerts
• Processing is defined relative to time (event time vs. processing time)
• Real-time: the difference between event time and processing time is negligible
Enables your Data in Motion architecture
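The distinction between event time and processing time can be made concrete with a minimal sketch. The function names and the one-second tolerance below are assumptions for illustration only; the slides do not define what counts as "negligible".

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tolerance: how much lag we still call "real-time".
REAL_TIME_TOLERANCE = timedelta(seconds=1)


def processing_lag(event_time, processing_time):
    """Return how long after it occurred the event was processed."""
    return processing_time - event_time


def is_real_time(event_time, processing_time, tolerance=REAL_TIME_TOLERANCE):
    """True when the gap between event time and processing time is within tolerance."""
    return processing_lag(event_time, processing_time) <= tolerance


# Event time: when the event happened at the source.
occurred = datetime(2016, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
# Processing time: when the system actually handled it.
fast = occurred + timedelta(milliseconds=200)
slow = occurred + timedelta(seconds=30)

print(is_real_time(occurred, fast))  # lag of 200 ms is within tolerance
print(is_real_time(occurred, slow))  # lag of 30 s is not
```

In practice the acceptable tolerance depends on the use case; an alerting system and a continuously updated report may tolerate very different lags.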
7. Big Data Application Types
[Diagram: application types shown include IoT, fraud, CDR, CDC, reporting, SQL, operations, data discovery, SQL on streams, and streaming discovery.]
8. Sample Streaming Analytics Patterns
Preprocessing
• Filtering events
• Transforming attributes
Alerts & Thresholds
• Based on complex conditions
Computing within Windows
• Aggregations
Combining Event Streams
• Correlation
• Error detection
Enrichment
• Looking up databases and reference data
Temporal Events
• Detecting events within time windows
Tracking
• Tracking events over space & time
Trend Detection
• Rise, Fall
• Outliers
Source: https://iwringer.wordpress.com/2015/08/03/patterns-for-streaming-realtime-analytics/
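Three of the patterns above (preprocessing, computing within windows, and threshold alerts) can be combined in a small sketch. The event fields (`ts`, `temp_f`), the 60-second window, and the 30 °C threshold are all hypothetical. For simplicity the code groups a finite list of events; a real streaming engine would emit each window's result as the stream advances.

```python
from collections import defaultdict

WINDOW_SECONDS = 60        # hypothetical tumbling-window size
ALERT_THRESHOLD_C = 30.0   # hypothetical alert threshold


def preprocess(events):
    """Filter out events without a reading; transform Fahrenheit to Celsius."""
    for e in events:
        if e.get("temp_f") is None:
            continue
        yield {"ts": e["ts"], "temp_c": (e["temp_f"] - 32) * 5 / 9}


def window_averages(events, size=WINDOW_SECONDS):
    """Average temp_c per tumbling window, keyed by window start (event time)."""
    buckets = defaultdict(list)
    for e in events:
        buckets[e["ts"] // size * size].append(e["temp_c"])
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}


def alerts(averages, threshold=ALERT_THRESHOLD_C):
    """Return the start times of windows whose average exceeds the threshold."""
    return [start for start, avg in averages.items() if avg > threshold]


events = [
    {"ts": 5, "temp_f": 68.0},
    {"ts": 30, "temp_f": None},   # dropped by the filter
    {"ts": 50, "temp_f": 86.0},
    {"ts": 65, "temp_f": 95.0},
    {"ts": 110, "temp_f": 104.0},
]

averages = window_averages(preprocess(events))
hot_windows = alerts(averages)
print(averages)      # {0: 25.0, 60: 37.5}
print(hot_windows)   # [60]
```

Windows are keyed by event time rather than arrival time, so a late-arriving event still lands in the window in which it occurred.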
10. Financial Services
• Detect fraudulent activity in real-time
• Risk Analysis
• Deliver personalized products and offerings
• Make decisions in real-time for trading and transactional platforms
12. Telecom
• Real-time network monitoring and protection
• Quality of service and customer satisfaction
• Take action based on users’ location
• Automatic resource allocation and load balancing
13. Online Advertising
• Dynamic bidding
• Real-time targeting & personalization
• Maximize click-through and conversion rates
• Reporting that can be updated continuously
15. Internet of Things
• Environment monitoring
• Infrastructure management
• Manufacturing
• Energy management
• Public Building & Home automation
• Transportation
16. IoT secure ingestion and predictive analysis
High-performance, secure, multi-customer data ingestion.
Complex event processing with historical data for predictive maintenance.
[Diagram: Sensor 1 through Sensor N feed the ingestion layer; components shown are Persistent Data, Governance, Complex Event Processing, and Predictive Maintenance, serving Application 1 through Application n.]
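One simple way to picture combining live events with historical data is to compare each new sensor reading against a baseline computed from past readings. The sketch below uses a plain z-score rule as a stand-in; it is not the method used in the architecture shown, and the function names, sample values, and 3-sigma limit are assumptions.

```python
from statistics import mean, stdev


def baseline(history):
    """Summarize historical readings as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def needs_maintenance(reading, history, max_sigma=3.0):
    """Flag a reading that deviates from the historical baseline by more than max_sigma."""
    mu, sigma = baseline(history)
    if sigma == 0:
        # No historical variation: any change from the baseline is flagged.
        return reading != mu
    return abs(reading - mu) / sigma > max_sigma


# Hypothetical vibration readings stored from earlier operation.
history = [0.50, 0.52, 0.48, 0.51, 0.49]

print(needs_maintenance(0.51, history))  # close to the baseline
print(needs_maintenance(0.90, history))  # far outside the historical range
```

A production system would more likely use a trained model and per-device history; the point here is only that the decision needs both the incoming event and stored data.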
17. Stream Processing: Conclusion
• Lots of untapped potential!
• Gives your business a competitive edge!
• Open Source and Big Data technologies
• Built to address the scale and latency demands
• Broad use cases
• Across industries and verticals
20. Hadoop Ingestion Made Easy
https://www.brighttalk.com/webcast/13685/194937/hadoop-ingestion-made-easy-with-datatorrent-dtingest