View, Act, and React: Shaping Business Activity with Analytics, BigData Queries, and Complex Event Processing


Sun Tzu said “if you know your enemies and know yourself, you can win a hundred battles without a single loss.” Those words have never been truer than in our time. We are faced with an avalanche of data. Many believe the ability to process and gain insights from a vast array of available data will be the primary competitive advantage for organizations in the years to come.

To make sense of data, you face several challenges: how to collect it, store it, process it, and react quickly. Although you can build such a system from the ground up, doing so is a significant undertaking. Many technologies, both open source and proprietary, can be combined into an analytics solution, which will likely save effort and yield a better result.

In this session, Srinath will discuss WSO2’s middleware offerings for Big Data and explain how to combine them into a solution that makes sense of your data. The session will cover technologies such as Thrift for collecting data, Cassandra for storing data, Hadoop for analyzing data in batch mode, and complex event processing for analyzing data in real time.

Published in: Technology, Economy & Finance

  1. View, Act, and React: Shaping Business Activity with Analytics, BigData Queries, and Complex Event Processing. Srinath Perera, Director, Research, WSO2
  2. Start (image credit: CC licence)
     • In 1942, Asimov began publishing Foundation, in which the character Hari Seldon uses mathematical models to predict the future of civilization and then to save it.
     • Paul Krugman, the Nobel laureate in economics, said his interest in economics began with Foundation.
     • We are entering an era of our history in which Mr. Asimov might have a point.
  3. Consider a Day in Your Life
     • What is the best road to take?
     • Will there be bad weather?
     • What is the best way to invest the money?
     • Should I take that loan?
     • Is there a way to do this faster?
     • What did others do in similar cases?
     • Which product should I buy?
  4. Big Data Landscape
  5. Big Data Architecture
  6. Why is it hard? (image licensed CC)
     • Systems are built from many computers (1,000 nodes to store 1 PB at 1 TB each).
     • They handle large volumes of data (10 Gb network => 83 days to copy 1 PB).
     • They run complex logic (models can be as complex as the system itself).
     • This pushes us to the frontier of distributed systems and databases.
  7. Big Data Architecture with WSO2
  8. Event Streams
     • We view the world as event streams.
     • An event stream is a series of events over time.
     • Each stream has a name.
     • Each event has attributes, and each attribute has a type.
     • We use SQL-like languages (Hive/CEP) to process event streams and create new event streams.
     Stream definition:
         { 'name':'PlayStream', 'version':'1.0.0', 'payloadData':[ 'name':'sid', 'ts':'BIGINT', 'x':'DOUBLE', ... ] }
     Query:
         select from PlayStream[x>2500 and .. ] insert into NearGoalStream
  9. Demo Use Case (DEBS 2013)
     • A football game in which the players and the ball carry sensors (DEBS Challenge 2013).
     • Event attributes: sid, ts, x, y, z, v, a.
     • Use cases: running analysis, ball possession and shots on goal, heatmap of activity.
     • Siddhi handled 100K+ events per second on each use case.
     • For this talk, we will look at player activity by region of the field.
  10. Demo High-level Architecture
  11. Data Collection
          Agent agent = new Agent(agentConfiguration);
          publisher = new AsyncDataPublisher("tcp://localhost:7612", .. );
          StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
          definition.addPayloadData("sid", STRING);
          ...
          publisher.addStreamDefinition(definition);
          ...
          Event event = new Event();
          event.setPayloadData(eventData);
          publisher.publish(STREAM_NAME, VERSION, event);
      • Can receive events via SOAP, HTTP, JMS, ..
      • WSO2 Events is a highly optimized option (400K events per second).
      • Default agents are provided, and you can write custom agents.
  12. Business Activity Monitor
  13. BAM Hive Query
      Find how much time each player spends in each cell.
          CREATE EXTERNAL TABLE IF NOT EXISTS PlayStream …
          select sid, ceiling((y+33000)*7/10000 + x/10000) as cell, count(sid)
          from PlayStream
          GROUP BY sid, ceiling((y+33000)*7/10000 + x/10000);
  14. Complex Event Processor
  15. CEP Query
      Calculate the mean location of each player every second:
          define partition sidPrt by PlayStream.sid, LocBySecStream.sid
          from PlayStream#window.timeBatch(1sec)
          select sid, avg(x) as xMean, avg(y) as yMean, avg(z) as zMean
          insert into LocBySecStream partition by sidPrt
      Detect a run of more than 10 m:
          from every e1 = LocBySecStream -> e2 = LocBySecStream [e1.yMean + 10000 < yMean or yMean + 10000 < e1.yMean] within 2sec
          select e1.sid
          insert into LongAdvStream partition by sidPrt ;
  16. Run Demo
  17. Visualization
  18. Conclusion
  19. Thank You
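
The capacity and copy-time estimates on slide 6 can be checked with a few lines of arithmetic. This is a sketch that assumes decimal units (1 PB = 10^15 bytes) and a link that sustains the stated rate; the function names are illustrative only. Under those assumptions a 10 Gb/s link moves 1 PB in a little over 9 days, while the 83-day figure corresponds to a sustained rate of roughly 1.1 Gb/s, so the slide presumably assumes an effective throughput well below line rate.

```python
PETABYTE_BYTES = 10**15   # assumption: decimal petabyte
TERABYTE_BYTES = 10**12   # assumption: decimal terabyte
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move size_bytes over a link at a sustained bit rate."""
    return size_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


def implied_rate_gbps(size_bytes: float, days: float) -> float:
    """Sustained rate in Gb/s implied by moving size_bytes in the given days."""
    return size_bytes * 8 / (days * SECONDS_PER_DAY) / 1e9


nodes = PETABYTE_BYTES // TERABYTE_BYTES                  # nodes at 1 TB each
line_rate_days = transfer_days(PETABYTE_BYTES, 10e9)      # full 10 Gb/s
implied_gbps = implied_rate_gbps(PETABYTE_BYTES, 83)      # rate behind "83 days"

print(f"nodes needed: {nodes}")
print(f"days at 10 Gb/s line rate: {line_rate_days:.2f}")
print(f"rate implied by 83 days: {implied_gbps:.2f} Gb/s")
```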
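
The event-stream model on slide 8 (typed events, and a filter that derives a new stream from an existing one) can be illustrated without any WSO2 component. The sketch below is plain Python, not Siddhi or the WSO2 API: `PlayEvent` and `filter_stream` are hypothetical names, only the `x > 2500` part of the slide's condition is implemented because the rest is elided there, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class PlayEvent:
    """One event of the PlayStream; attribute types mirror the slide."""
    sid: str    # sensor id
    ts: int     # timestamp (BIGINT in the stream definition)
    x: float    # x coordinate (DOUBLE in the stream definition)


def filter_stream(
    events: Iterable[PlayEvent],
    predicate: Callable[[PlayEvent], bool],
) -> Iterator[PlayEvent]:
    """Derive a new stream containing only the events that match."""
    for event in events:
        if predicate(event):
            yield event


# Made-up sample events standing in for PlayStream.
play_stream = [
    PlayEvent("4", 1, 1200.0),
    PlayEvent("4", 2, 2600.0),
    PlayEvent("8", 3, 3100.0),
]

# Rough equivalent of: PlayStream[x > 2500] -> NearGoalStream
near_goal_stream = list(filter_stream(play_stream, lambda e: e.x > 2500))
print([e.ts for e in near_goal_stream])
```

Using a generator keeps the derived stream lazy, which matches the idea that a query produces a new stream rather than a stored table.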
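
The cell expression in the Hive query on slide 13 is easier to follow when evaluated on a few points. The sketch below reproduces the same formula in Python and counts samples per (sid, cell), as the GROUP BY does. The sample coordinates are invented, and treating the sample count as a proxy for time spent assumes the sensors report at a constant rate, which the slide does not state.

```python
import math
from collections import Counter


def cell_of(x: float, y: float) -> int:
    """Cell index using the same expression as the Hive query."""
    return math.ceil((y + 33000) * 7 / 10000 + x / 10000)


def samples_per_cell(events):
    """Count events per (sid, cell); events are (sid, x, y) tuples."""
    counts = Counter()
    for sid, x, y in events:
        counts[(sid, cell_of(x, y))] += 1
    return counts


# Invented sample positions for two sensors.
events = [
    ("4", 5000.0, -33000.0),
    ("4", 8000.0, -33000.0),
    ("4", 15000.0, -33000.0),
    ("8", 0.0, 0.0),
]

counts = samples_per_cell(events)
for (sid, cell), n in sorted(counts.items()):
    print(sid, cell, n)
```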
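
The two CEP queries on slide 15 can be approximated offline to show what they compute. This is a simplified Python sketch, not Siddhi: it assumes the intent is to flag a change of more than 10,000 units in mean y (10 m if coordinates are in millimetres) between consecutive one-second batches of the same player, whereas the Siddhi pattern matches any later event within 2 seconds. The function names and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def mean_y_per_second(events):
    """Batch (sid, ts_seconds, y) events into one-second windows per player.

    Returns {sid: [(second, y_mean), ...]} with each list ordered by second.
    """
    buckets = defaultdict(list)
    for sid, ts, y in events:
        buckets[(sid, int(ts))].append(y)
    per_player = defaultdict(list)
    for sid, second in sorted(buckets):
        per_player[sid].append((second, fmean(buckets[(sid, second)])))
    return per_player


def long_advances(per_player, threshold=10000.0):
    """Flag (sid, second) where mean y moves more than threshold by the next second."""
    flagged = []
    for sid, means in per_player.items():
        for (s1, y1), (s2, y2) in zip(means, means[1:]):
            if s2 - s1 == 1 and abs(y2 - y1) > threshold:
                flagged.append((sid, s1))
    return flagged


# Invented sample data: player "4" advances quickly, player "8" barely moves.
events = [
    ("4", 0.1, 0.0), ("4", 0.6, 2000.0),
    ("4", 1.2, 12000.0), ("4", 1.7, 14000.0),
    ("8", 0.3, 500.0), ("8", 1.4, 900.0),
]

per_player = mean_y_per_second(events)
flagged = long_advances(per_player)
print(flagged)
```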