Real Time Event Processing and In-memory analysis of Big Data - StampedeCon 2013

At the StampedeCon 2013 Big Data conference in St. Louis, Vinod Vydier, Middleware Specialist at Oracle, discussed Real Time Event Processing and In-memory analysis of Big Data. There are multiple projects (for example Cloudera’s Impala) that do real-time or near-real-time analysis of Big Data. However, if there are events that need to be looked at and responded to in real time (for example, credit card fraud, or vehicle metrics that should alert a driver), this can have a significant impact on data collection and analysis using traditional Big Data techniques. In this session, I will introduce strategies for using an Event Processing Engine to respond to events in real time, and then filter and categorize data in memory before storing it in HDFS. This makes it easier to run Hadoop jobs on the collected data, and lets end clients respond to critical events in real time.


Usage Rights

© All Rights Reserved


    Real Time Event Processing and In-memory analysis of Big Data - StampedeCon 2013 Presentation Transcript

    • Event Processing for better (Big) Data. Vinod Vydier, Middleware Specialist @ Oracle
    • Agenda
        • Why use event processing?
        • Event Processing Applications
        • Technical Architecture
        • Use of In-Memory data-grid
        • Use cases
    • Challenges Working with Big Data
        • Storing data has become cheap; however, storage is not infinite and has to be managed to make effective use of the data.
        • Hadoop has inherent latency when responding to real-time events, which can produce high-volume data at high velocity and typically call for real-time responses.
        • Event Processing helps get clean data, with context and less redundancy, into HDFS, so Hadoop jobs can be more effective.
        • Event Processing helps with responding in real time while still storing the data in HDFS for better historical analysis.
    • Why use Event Processing Infrastructure
      The application meets one or more of the following conditions:
        • Requires high-throughput, low-latency processing.
        • Has continuously streaming data.
        • Needs real-time correlation between multiple incoming data sources.
        • Needs time-sensitive alerts, aggregations and calculations.
        • Needs to look for patterns in the data stream.
        • Data does not need to be stored if there is nothing of interest in it.
        • Problem is more easily solved by analyzing before storing in HDFS.
    • Filtering, Real-time Intelligence for Big Data
      [Figure labels: Volume, Velocity, Variety, Value; Social, Blog, Smart Meter; Fast Data; Event Processing; Intelligence; Greater]
    • Stay ahead of Big Data: filter out, correlate, and move time-critical analysis to the front of the process
        • Filter out noise (example: data ticks with no changes), add context (by correlating multiple sources), increase relevance.
        • Identify certain critical conditions as you insert data into the warehouse.
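
The noise-filtering idea on this slide (dropping data ticks with no changes) might be sketched as follows. This is a minimal illustration, not the speaker's implementation; the `(symbol, price)` event shape and the function name `drop_unchanged` are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (symbol, price).
Tick = Tuple[str, float]


def drop_unchanged(ticks: Iterable[Tick]) -> Iterator[Tick]:
    """Yield a tick only when its price differs from the last price seen for that symbol."""
    last_price: Dict[str, float] = {}
    for symbol, price in ticks:
        if last_price.get(symbol) != price:
            last_price[symbol] = price
            yield symbol, price
```

Only the changed ticks would then be written to storage, so downstream Hadoop jobs see less redundant data.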
    • Getting ahead of the curve: Fast Data
        • Big Data: latency in minutes; historical depth: deep. Example: analysis of traffic patterns and congestion times for urban planning.
        • Fast Data: latency in ms; historical depth: shallow. Example: monitoring of traffic cameras to ensure given license plates are not in use on multiple vehicles.
        • Add “depth” to your fast data by merging the output of MapReduce into stream processing.
    • What is an Event Processing application
      [Diagram: an Event Processing Network (EPN) inside an Event Processing Application. Data sources feed adapters (sources); channels connect the adapters to a processor that runs queries; a further channel leads to an adapter (sink) that writes to HDFS. The application imports and exports services (Service1, Service2).]
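
The adapter, channel, processor and sink roles in the diagram can be sketched in a few lines. This is only a toy model under stated assumptions: Python generators stand in for channels, a plain predicate stands in for a continuous query, and the names `csv_adapter`, `processor`, `ListSink` and `run_epn` are hypothetical, not part of any Oracle API.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def csv_adapter(lines: Iterable[str]) -> Iterator[Event]:
    """Source adapter: turn raw 'symbol,price,volume' lines into event dicts."""
    for line in lines:
        symbol, price, volume = line.strip().split(",")
        yield {"symbol": symbol, "price": float(price), "volume": int(volume)}


def processor(events: Iterable[Event], query: Callable[[Event], bool]) -> Iterator[Event]:
    """Processor: apply a query (here a simple predicate) to every event on the channel."""
    return (event for event in events if query(event))


class ListSink:
    """Sink adapter: stands in for an HDFS writer or a downstream service."""

    def __init__(self) -> None:
        self.received: List[Event] = []

    def consume(self, events: Iterable[Event]) -> None:
        self.received.extend(events)


def run_epn(lines: Iterable[str], query: Callable[[Event], bool], sink: ListSink) -> None:
    """Wire adapter -> processor -> sink; the generators play the role of channels."""
    sink.consume(processor(csv_adapter(lines), query))
```

For example, `run_epn(["BA,77.575,800", "AA,20.125,1000"], lambda e: e["price"] > 22, sink)` would leave only the BA event in `sink.received`.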
    • Event Processing inputs
        • Streams
            • Continuous input, often in high volume
            • Time ordered
            • Does not end
            • Impossible to process / analyze in real time with traditional relational database systems
        • Example: raw sensor event streams, GPS, market data feeds
        • Event Processing provides a new data management infrastructure to support and analyze streams in real time
      [Figure: a repeating market data feed, for example:
          AA ALCOA INC D 20.125 1000 20080305 10:03:01:55
          AXP AMER EXPRESS CO D 45.875 500 20080305 10:03:02:10
          BA BOEING D 77.575 800 20080305 10:03:02:78
          C CITIGROUP D 34.125 2000 20080305 10:03:03:05
          CAT CATERPILLAR D 22.5 600 20080305 10:03:03:46
          DO DUPONT D 41.575 3000 20080305 10:03:04:12]
    • Event Processing outputs
        • Filtering
            • New stream filtered for specific criteria, e.g. stock price > $22
        • Correlation & Aggregation
            • Scrolling, time-based window metrics, e.g. average # of stock trades in the last hour
        • Pattern Matching
            • Notification of detected event patterns, e.g. price changes A, B and C occurred within a 15 minute window
        • Event Processing is done in memory (not in the database)
        • Logic is defined through Continuous Queries on the data
      [Figure: the market data feed passes through complex queries and emerges as a smaller, filtered stream.]
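
A time-based window metric of the kind listed on this slide could look roughly like the sketch below. It is an assumption-laden illustration in plain Python rather than a continuous query language: the class name `SlidingWindowAverage` is hypothetical, timestamps are plain seconds, and events are assumed to arrive in time order.

```python
from collections import deque
from typing import Deque, Tuple


class SlidingWindowAverage:
    """Average of the values seen during the last `window_seconds` (time-ordered input assumed)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, float]] = deque()  # (timestamp, value)

    def add(self, timestamp: float, value: float) -> float:
        """Record a new event and return the average over the current window."""
        self._events.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        # Evict events that have scrolled out of the window.
        while self._events[0][0] <= cutoff:
            self._events.popleft()
        return sum(v for _, v in self._events) / len(self._events)
```

Because only the events inside the window are held in memory, the state stays bounded even though the stream itself never ends.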
    • Data crunching for Event Processing is done in an in-memory data grid
        • High throughput for storing data
        • Aggregation and event querying
        • Pattern implementation flexibility, combining complementary technologies
        • Handle and correlate events in real time, including support for multiple patterns:
            • Pre-processing (buffer inputs)
            • In Event Processing (to cache reference data)
            • Post-processing (to expose processed events to consuming apps)
      [Diagram: the Data Grid and Event Processing deliver consolidated, in-context data and filtered/aggregated data to HDFS and traditional storage.]
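
Caching reference data so that events can be enriched in memory might be sketched like this. A plain dict stands in for the data grid, and the `meter_id` key, the `enrich` function and the field names are assumptions chosen for the illustration.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, object]


def enrich(
    events: Iterable[Event],
    reference_cache: Dict[object, Event],
    key: str = "meter_id",
) -> Iterator[Event]:
    """Join each event with cached reference data; pass it through unchanged if none is cached."""
    for event in events:
        context = reference_cache.get(event[key]) or {}
        yield {**event, **context}
```

The enriched events carry their context with them, which is the "consolidated and in-context data" that ends up in HDFS.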
    • In-memory events on the data stream
        • Threshold Management
            • Detecting threshold conditions across multiple event streams
            • Using cache to:
                • Allow dynamic configuration of thresholds
                • Add (via join) contextual data to support aggregation
            • Using pattern matching to find sustained conditions
        • Alert Generation
            • Using relations to represent state and state transitions
            • Using “missing event” patterns to monitor expected response(s)
        • Alarm Management
            • Using pattern matching to remove extraneous alarm events
            • e.g. power off alarm preceded by tamper alarm within (n) minutes
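
Threshold management with a cache and a sustained-condition check could be sketched as below. This is one possible reading of the slide, not a definitive design: the dict `thresholds` stands in for the cache, "sustained" is assumed to mean a number of consecutive readings above the limit, and all names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (meter_id, value).
Reading = Tuple[str, float]


def sustained_breaches(
    readings: Iterable[Reading],
    thresholds: Dict[str, float],
    min_consecutive: int = 3,
) -> Iterator[Reading]:
    """Yield the reading that completes `min_consecutive` consecutive breaches for a meter."""
    streak: Dict[str, int] = {}
    for meter_id, value in readings:
        # Looked up per event, so a threshold changed in the cache takes effect immediately.
        limit = thresholds.get(meter_id, float("inf"))
        if value > limit:
            streak[meter_id] = streak.get(meter_id, 0) + 1
            if streak[meter_id] == min_consecutive:
                yield meter_id, value
        else:
            streak[meter_id] = 0
```

A single spike resets quietly; only a run of breaches produces an alert, and the alert fires once per run rather than on every later reading.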
    • Alarm Filtering Scenario: discard a Power Off Alarm if there was a Tamper Alarm for the same meter within the previous 5 seconds.
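
The alarm filtering rule above can be expressed as a small stateful filter. This is a sketch under assumptions: the `Alarm` tuple, the `"TAMPER"` and `"POWER_OFF"` labels and the function name are invented for the example, timestamps are seconds, and alarms are assumed to arrive in time order.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Alarm(NamedTuple):
    timestamp: float  # seconds
    meter_id: str
    kind: str  # "TAMPER" or "POWER_OFF" (labels assumed for this sketch)


def filter_alarms(alarms: Iterable[Alarm], window_seconds: float = 5.0) -> Iterator[Alarm]:
    """Drop a POWER_OFF alarm preceded by a TAMPER alarm on the same meter within the window."""
    last_tamper: Dict[str, float] = {}
    for alarm in alarms:
        if alarm.kind == "TAMPER":
            last_tamper[alarm.meter_id] = alarm.timestamp
        elif alarm.kind == "POWER_OFF":
            tampered_at = last_tamper.get(alarm.meter_id)
            if tampered_at is not None and alarm.timestamp - tampered_at <= window_seconds:
                continue  # Extraneous: already explained by the tamper alarm.
        yield alarm
```

The only state kept is the most recent tamper time per meter, so the filter runs in memory at stream speed.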
    • Visualizing events on the data stream
      [Diagram: resource locations arrive over JMS and geo-fencing definitions are read via SQL into the Event Processing Application; matches and alerts are published over JMS and SQL to MapViewer and a Manager.]
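
Matching resource locations against geo-fencing definitions might be sketched as follows. The slide does not say how fences are defined, so this example assumes circular fences (a center plus a radius) and uses the haversine great-circle distance; `Fence` and `matching_fences` are hypothetical names.

```python
import math
from typing import Iterable, List, NamedTuple


class Fence(NamedTuple):
    name: str
    lat: float
    lon: float
    radius_km: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometers (mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def matching_fences(lat: float, lon: float, fences: Iterable[Fence]) -> List[str]:
    """Return the names of all fences that contain the given location."""
    return [f.name for f in fences if haversine_km(lat, lon, f.lat, f.lon) <= f.radius_km]
```

Each incoming location event would be checked against the cached fence definitions, and any match would be published as an alert.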
    • Visual/SOA integration with Event Processing
        • JMS Protocol Integration
            • Common integration touch point with Service Bus
            • Business Activity Monitoring integration
        • HTTP Publish/Subscribe
            • Supports pub/sub events between the Event server and web clients.
            • Clients don’t need to poll for updates (unlike traditional HTTP).
            • Clients subscribe to and publish to event channels.
            • Bayeux protocol
            • Lightweight, and the payload is JSON
    • Event Processing High Level Architecture
      [Diagram labels: EPN (Event Processing Network) Elements: JSON, Adapter, Cache, Processor, POJO, HTTP Pub/S]
    • Query Plan and Real Time Monitoring
    • Event Driven SOA: Simplify Business Complexity
        • Real-time business insight
        • Preempt and react instantaneously to Enterprise, Environmental and Global Business conditions
        • Gain business insight using previously untapped, raw event sources
        • Hot-pluggable integration
        • Transparent SOA infrastructure interoperability
        • Distributed, deployment ready, pre-integrated, in-memory Data Grid, and Java low latency determinism
        • Lightweight high performance Java Event Server platform
        • Real-time business friendly analyst oriented visualization layers
        • Powerful, extensible Event Processing Analysis abstraction
        • Business user dashboards
        • Business user domain specific natural language layers
        • Real-time predictive analytics
    • Event Processing use cases in different industries
        1. Customer Experience
        2. Transportation, Logistics & Fleet Management
        3. Utilities: Demand & Response, Smart Meter
        4. Public Sector: Emergency Response, Intelligence
        5. Telcos: Real Time billing & WiFi offloading, Mobile billboard
    • Customer Experience
        • Industry focus on a new buzzword: Customer Experience
        • Desire to harness the potential of social networks for better targeted marketing
      Event Processing can help with:
        • Monitoring customer activity in real time (social networks, location such as proximity to stores, etc.) and identifying opportunities in real time
        • Correlating with existing information (customer/shopping profiles, etc.)
        • Generating real-time alerts
    • Transportation, Logistics and Fleet Management
        • Constant industry pressure for greater efficiency
        • Need to differentiate through premium services and greater reliability and visibility
        • Availability of cheap wireless sensors (temperature, GPS, etc.) that can be included in packages/containers/trucks
      Event Processing can help with:
        • Real-time monitoring of the inflow of data from sensors
        • Trend detection / prediction (to rise, etc.)
        • Leveraging spatial/geo-location capabilities
    • Utilities
        • Adoption of Smart Meters: concerns about the bandwidth/processing power required to handle the information they generate, desire to offer value-add services
        • Ever increasing electricity demand
        • Demand for real-time billing & analytics
        • Greater customer expectations re: outage & response times
        • Regulations
      Event Processing can help with:
        • Alerting on consumption trends in real time, enabling “Demand/Response”
        • Real-time detection of problems (abnormal spikes in consumption indicative of leaks, etc.)
        • Filtering out redundant or nested (ex: tree fell on the line) outage errors and problems
        • Tracking of resources and personnel
    • Telco
        • Overloaded data networks and new strategies to offload traffic: real-time billing vs. unlimited, offloading to WiFi, degradation of service from 4G to 3G, etc.
        • GPS-enabled phones offer new location-based marketing opportunities: “mobile billboards”
      How can Event Processing help:
        • Event Processing infrastructure can handle massive amounts of data generated by mobile devices, and filter, correlate and aggregate it in real time to retain only valuable information
        • Event Processing can plug into all types of feeds, from devices to social networks
        • Event Processing can be integrated with spatial and geo-location technology to send location-specific data to the user
    • Public Sector
        • Heightened security requirements
        • Ever increasing population in urban areas drives optimization requirements
        • Increasing number of real-time data feeds: video feeds, GPS data, traffic data, etc.
        • Applications: Security Intelligence, geo-fencing, “Smart Cities”, traffic control, gateless tolls
      How Event Processing can help:
        • Event Processing can be integrated with spatial and geo-location technology to track location-specific data for a user
        • Event Processing can plug into any data feed, such as video / face recognition
        • Event Processing meets performance & availability requirements in this space
    • Thanks for attending!!