Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Make Streaming Analytics
Work for you: The Devil is in
the Details
K...
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
• Building a Stream Analytics Platform:
Requirements
• Buildi...
Building a Stream
Analytics Platform
Requirements
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
5 Attributes of a Streaming Platform
 Ingest
 Process
 Analyze
 ...
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Data Flow versus Stream Analytics
 Data Flow
–Ingest and route tera...
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
6 key focus areas for building a Streaming Platform
 Common Abstrac...
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Common Abstraction Layer
–Select...
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Latency
–500 millisecond or less...
9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Lambda Architecture
–Integrate s...
10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Scale Out
–Linear Scale out or ...
11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Rapid Application Development
–...
12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Focus Areas for Streaming Platform
Data Visualization
–Time Series...
13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Real Time Stream Analytics versus Streaming Engine
Abstracts under...
Building a Stream
Analytics Platform
Capabilities
15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stream Analytics Platform - Capabilities
• Continuous, Real Time Pr...
16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Streaming Platform Capabilities - Analytics
Detect Trends
Alert on ...
17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stream Analytics Platform - Capabilities
• Filtering
–Event Selecti...
18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stream Analytics Platform - Capabilities
• Correlations
–Correlatio...
19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stream Analytics Platform - Capabilities
• Serialization/Deserializ...
20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stream Analytics - Capabilities Summary
# Capability
1. Filtering
2...
21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Use Cases
CyberSecurity - Threat Protection
22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
CyberSecurity - Threat Protection Use Case
23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
CyberSecurity - Threat Protection - Steps
Applying Streaming Analyt...
24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
CyberSecurity - Threat Protection - Steps
Applying Streaming Analyt...
25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
CyberSecurity - Threat Protection - Steps
Applying Streaming Analyt...
26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
CyberSecurity - Threat Protection - Steps
Applying Streaming Analyt...
27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Make Streaming Analytics
Work for you: The Devil is in
the Details
...
Copyright © 2015 Neustar, Inc. All Rights Reserved 28
AREAS OF FOCUS AND INNOVATION FOR NEUSTAR
 Security
 User
 Device
 Anomaly Detection
 Multi-Factor Authentication (PK...
DEVICE AND IOT PLATFORM
Devices
Data Processing Services
Local & Offline Functionality
Device and User Authentication
Devi...
CORE DATA PLATFORM REQUIREMENTS
 Provide Ingestion for all Machine Data Exhaust from all system components. (ex:
Anomaly ...
Copyright © 2015 Neustar, Inc. All Rights Reserved 32
Copyright © 2015 Neustar, Inc. All Rights Reserved 33
Copyright © 2015 Neustar, Inc. All Rights Reserved 34
Outgoing Messages
Copyright © 2015 Neustar, Inc. All Rights Reserved 35
Turn a Light on Manually
Inbound Message Flow
Copyright © 2015 Neustar, Inc. All Rights Reserved 36
Incoming Messages
INTERESTING PROBLEMS
 We need to work in offline and temporary offline mode so data replication is the hardest
problem to...
38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Q & A
Upcoming SlideShare
Loading in …5
×

Make Streaming Analytics work for you: The Devil is in the Details

1,870 views

Published on

Make Streaming Analytics work for you: The Devil is in the Details

Published in: Technology

Make Streaming Analytics work for you: The Devil is in the Details

  1. 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Make Streaming Analytics Work for you: The Devil is in the Details Kanishk Mahajan June 2016 Director Product Management, Hortonworks Ryan Medlin Director Software Engineering, Neustar
  2. 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda • Building a Stream Analytics Platform: Requirements • Building a Stream Analytics Platform: Capabilities • Use Case - • CyberSecurity - Threat Protection • Q&A
  3. 3. Building a Stream Analytics Platform Requirements
  4. 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved 5 Attributes of a Streaming Platform  Ingest  Process  Analyze  Visualize  Respond
  5. 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Data Flow versus Stream Analytics  Data Flow –Ingest and route terabytes of data into a ”unified firehose” –Actively performance manage the latency and quality of these data flows - –across high variability of data formats, size of data and speed of data –Produces Streams - ”unbounded sequence of events” Real Time Stream Analytics –Sub second event processing with linear scalability to billions of events –Predictive Analytics at Scale – Real time data aggregation across edge nodes while processing 10s of millions of events and 100s of gigabytes per second –Guaranteed no data loss and events processed in order
  6. 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved 6 key focus areas for building a Streaming Platform  Common Abstraction Layer  Latency  Lambda Architecture –“Orchestrate” over static and real time data  Scale-out  Rapid Application Development  Data Visualization
  7. 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Common Abstraction Layer –Select one or more streaming engines –Select one or more cloud providers –Select one or more event sources –…Future Proof
  8. 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Latency –500 millisecond or less- Dashboards, Security Incidents, Asset Performance –20 milliseconds or less - Ad Networks, Preventive Maintenance –…
  9. 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Lambda Architecture –Integrate static and real time data –Enrichment –Orchestration of Batch workflows –Predictive Classification and Scoring
  10. 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Scale Out –Linear Scale out or scale down –Resource Management –Handle Transient workloads
  11. 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Rapid Application Development –Aggregations –Filters –Multi Stream Correlations –Splits –Joins –Normalizations –Business Rules Editor
  12. 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Focus Areas for Streaming Platform Data Visualization –Time Series Visualizations –Metrics Dashboarding –Trends –Comparisons –Thresholds –Custom UI extensions
  13. 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Real Time Stream Analytics versus Streaming Engine Abstracts underlying Streaming Engine- Storm, Spark Streaming, Flink.. OOTB Support for multiple (cloud) event sources - Kafka, AWS Kinesis, Azure Event Hub Built-in Operators for Complex Event Processing Built-in Real Time Dashboarding- Metrics and Events Pluggable Workflow Management Business Rules Editor and Rapid Application Development Framework Cloud Deployment Scalable Architecture  Handles different latency requirements Hortonworks Inc Confidential 2016
  14. 14. Building a Stream Analytics Platform Capabilities
  15. 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stream Analytics Platform - Capabilities • Continuous, Real Time Processing –Continuous Query and near real time processing of event data –Distributed, scalable platform for processing continuous unbounded streams of data –Abstraction across multiple open source stream engine technologies –Large Scale Event Processing –No coding approach for event processing logic • Event Source –Multiple event source support –Multiple data format support –Drag and Drop, no coding approach • Event Storage –Multiple event storage support –Event Retention time period support –Drag and Drop, no coding approach Page 15
  16. 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Streaming Platform Capabilities - Analytics Detect Trends Alert on probable Failures Descriptive Proactive Predictive Prescriptive • Rules Calculations on predefined settings • High Throughput, Low Latency Event Processing Framework • Train Models Offline • Provide Query interface and storage of Models • Integrate Real Time Data with Batch Data • Connect with Enterprise Applications Reporting and Dashboarding Visual Pattern Recognition Monitor System Alert before Failure Take Action on Data Avoid Failure and Drive Operational Efficiency
  17. 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stream Analytics Platform - Capabilities • Filtering –Event Selection based on pre-defined criteria • Transformations –Aggregations –Event Prioritization –Event Deduplication –Event Time Stamping –Custom User Defined Functions • Enrichment –Pull in reference data for additional context Page 17
  18. 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stream Analytics Platform - Capabilities • Correlations –Correlation of in-flight event data from multiple event streams • Temporal Calculations –Time Based Calculations on in-flight data in an event stream for e.g. a windowing computation with a time based window • Pattern Detection –Calculating trends on in-flight data in an event stream –Outlier detection : Pattern Matching with historical data Page 18
  19. 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stream Analytics Platform - Capabilities • Serialization/Deserialization –Marshalling/Unmarshalling event data into a series of bytes –Drag and Drop, no coding support • Rule Engine –Support for business rules (business directives that impact a decision) –Support for invoking external business rule engine • Visualization –Reporting and Dashboarding –Visual Pattern Recognition Page 19
  20. 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stream Analytics - Capabilities Summary # Capability 1. Filtering 2. Transformations 3. Enrichment 4. Correlations 5. Temporal Calculations 6. Pattern Detection # Capability 1. Continuous Real Time Processing 2. Event Sources 3. Event Storage 4. Temporal Calculations 5. Rules Engine 6. Visualization # Capability 7. Filtering 8. Transformation 9. Event Storage 10. Enrichment 11. Correlations 12. Pattern Detection
  21. 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Use Cases CyberSecurity - Threat Protection
  22. 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved CyberSecurity - Threat Protection Use Case
  23. 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved CyberSecurity - Threat Protection - Steps Applying Streaming Analytics • Expose • Provide Visibility into your enterprise • Passively watch network traffic and construct events to describe the activity it sees. • “Ingest” • Transmit • Compress, Encrypt, De-identify event data • Send to the cloud for centralized log retention, real-time threat analysis and incident investigation
  24. 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved CyberSecurity - Threat Protection - Steps Applying Streaming Analytics • Prioritize • Aggregate intelligence across all control points to identify and prioritize those systems that remain compromised and require immediate remediation. • Allows analysts to “zero in” on just those events of most importance. • Significantly reduce the number of incidents that security analysts need to investigate • Combine global telemetry from cyber intelligence networks with local customer context across endpoints, networks and email, to uncover attacks that would otherwise evade detection. • “Correlate” and “Enrich”
  25. 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved CyberSecurity - Threat Protection - Steps Applying Streaming Analytics • Analyze • Create comprehensive detection rules, behavioral analytics and guided investigations to detect the latest threats. • Enable quick and nimble data exploration across billions of events to proactively hunt for hidden indicators of compromise (IOCs). • Provide agile investigation tooling/UI to pivot from one indicator to the next, reconstruct the attack storyline and plan a forceful response to disrupt the attack. • Allow Security analysts to visualize all related attack components, e.g., all files used in the attack, all email addresses, all malicious IP addresses, etc. • “Proactive”, “Proactive”, “Prescriptive” and “Descriptive” Analytics • “Real Time Analytics across large volumes of data at scale”
  26. 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved CyberSecurity - Threat Protection - Steps Applying Streaming Analytics • Remediate • Generate alerts to expedite validating and scoping the incident. • Enrich alerts with supporting data. This data could include threat intelligence, point-in-time context regarding users impacted, actions taken and hosts involved help you validate and scope the incident. • Allow a Security Investigator to hunt for Indicators-of-Compromise across all your endpoints from a single console. • Leverage your existing installations of Endpoint Protection and Email Security for enforcement without having to install any new endpoint agents. • Export rich security intelligence into third-party security incident and event management systems (SIEMs). • “Notifications”, “Event Storage”, “Workflow”
  27. 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Make Streaming Analytics Work for you: The Devil is in the Details Kanishk Mahajan June 2016 Director Product Management, Hortonworks Ryan Medlin Director Software Engineering, Neustar
  28. 28. Copyright © 2015 Neustar, Inc. All Rights Reserved 28
  29. 29. AREAS OF FOCUS AND INNOVATION FOR NEUSTAR  Security  User  Device  Anomaly Detection  Multi-Factor Authentication (PKI, etc)  Policy Creation/Decisioning/Enforcement  Local / Edge / Distributed functionality  Openness and Standardization  (OCF/OneM2M/MQTT/CoAP/GitHub)  Partnerships Copyright © 2015 Neustar, Inc. All Rights Reserved 29
  30. 30. DEVICE AND IOT PLATFORM Devices Data Processing Services Local & Offline Functionality Device and User Authentication Device Directory & Identity User and Device Policy Device Certification & Public / Private Key Distribution/MFA OIC/OCF and IoTivity Standards Based Platform for Interoperability & device definition Absolute, Group/Collection, Local & cloud Verification / Fraud / Anomaly detection Location Services Future Peer-to-Peer communication & Discovery. decentralization Secure Discovery & Communication OCF based security & communication protocols Data pipeline, enrichment and analytics
  31. 31. CORE DATA PLATFORM REQUIREMENTS  Provide Ingestion for all Machine Data Exhaust from all system components. (ex: Anomaly Detection, Monitoring/Alerting)  Deserializing avro messages using given schema(s) and by default store a message in hdfs.  Given a message attribute (for example device type and customer Id or device Id) execute a rule to determine next course of action which should comprise of one or more of these actions: – Notify a user via email or sms or 3rd party api – Notify or pass message to registry service/API – Store data in hdfs – Store data in a relational data store (hbase phoenix or other?) Copyright © 2015 Neustar, Inc. All Rights Reserved 31
  32. 32. Copyright © 2015 Neustar, Inc. All Rights Reserved 32
  33. 33. Copyright © 2015 Neustar, Inc. All Rights Reserved 33
  34. 34. Copyright © 2015 Neustar, Inc. All Rights Reserved 34 Outgoing Messages
  35. 35. Copyright © 2015 Neustar, Inc. All Rights Reserved 35 Turn a Light on Manually Inbound Message Flow
  36. 36. Copyright © 2015 Neustar, Inc. All Rights Reserved 36 Incoming Messages
  37. 37. INTERESTING PROBLEMS  We need to work in offline and temporary offline mode so data replication is the hardest problem to solve and is based on unique use cases.  Also need to run a subset of the Cloud Inbound real time service rules and processing. How to reuse this logic both in the cloud and locally on a Local Gateway is a design challenge with tradeoffs. Copyright © 2015 Neustar, Inc. All Rights Reserved 37
  38. 38. 38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Q & A

×