WSO2 Analytics Platform
Dassana Wijesekara
Associate Director / Solutions Architect
dassanaATwso2.com
WSO2 Workshop 2016 | Sydney, Australia | 22nd June 2016
Analytics is Growing Up
▪ It is no longer about doing your first analytics use case.
▪ It is about:
- How to do it every day, efficiently?
- How to recover?
- How to make decisions?
- How to do other forms, like realtime, interactive, and predictive analytics?
Analytics 2.0 Platform
▪ One platform for all four forms of analytics (batch, realtime, interactive, and predictive)
▪ Single consistent programming model
▪ One analytics archive format
▪ Support for the lifecycle of analytics apps
Integrates well with the rest of the enterprise!
Collect Data
▪ One Sensor API to publish events
- REST, Thrift, JMS, Kafka
- Java clients, JavaScript clients*
▪ First you define streams (think of a stream as an infinite table in a SQL DB)
▪ Then send events via the Sensor API
Events can be sent to the batch pipeline, the realtime pipeline, or both, via configuration!
Collecting Data: Example
▪ Java example: create and send events
▪ Events are sent asynchronously
▪ See the client given in http://goo.gl/vIJzqc for more info
// Initialize Agent
Agent agent = new Agent(agentConfiguration);
AsyncDataPublisher publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );
// Define Stream
StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...
// Send event
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);
Analysis: Batch Analytics
Complex Event Processing
Analytics Logic with SQL-like Queries
▪ Both BAM and CEP provide a SQL-like data processing language
▪ Since many people understand SQL, these languages made large-scale (Big Data) processing accessible to many
▪ Expressive, short, and sweet
▪ Define core operations that cover 90% of problems
▪ Let experts dig in when they like (via user-defined functions)
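To make the "short and sweet" point concrete, the sketch below hand-codes in plain Java the kind of logic that a SQL-like CEP query expresses in a line or two: a filter followed by a sliding-window average. The class `SlidingAverage` and its `run` method are hypothetical names for illustration only; they are not WSO2 APIs, and a real deployment would state this declaratively in the query language instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not part of any WSO2 API.
class SlidingAverage {

    // Keeps readings >= minValue and, for each kept reading, emits the
    // average of the last `windowSize` kept readings.
    static List<Double> run(double[] readings, double minValue, int windowSize) {
        Deque<Double> window = new ArrayDeque<>();
        List<Double> averages = new ArrayList<>();
        double sum = 0.0;
        for (double reading : readings) {
            if (reading < minValue) {
                continue; // filter: drop readings below the minimum
            }
            window.addLast(reading);
            sum += reading;
            if (window.size() > windowSize) {
                sum -= window.removeFirst(); // slide: evict the oldest reading
            }
            averages.add(sum / window.size());
        }
        return averages;
    }

    public static void main(String[] args) {
        // Prints [10.0, 15.0, 25.0, 35.0]
        System.out.println(run(new double[] {10, 20, 5, 30, 40}, 10, 2));
    }
}
```

The bookkeeping for the window and the running sum is exactly what a declarative query language takes off the developer's hands.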
Scaling CEP Queries on top of Storm
▪ Accepts CEP queries with hints about how to partition streams
▪ Partitions the streams, builds an Apache Storm topology running CEP nodes as Storm Spouts, and runs it (see http://goo.gl/pP3kdX)
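The partitioning idea can be sketched in plain Java as follows. This is only an illustration of key-based stream partitioning, assuming a hash of a partition key decides which worker sees an event; `StreamPartitioner`, `partitionFor`, and `assign` are hypothetical names, and neither the WSO2 CEP hint syntax nor the Storm API is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of key-based partitioning; not the Storm or CEP API.
class StreamPartitioner {

    // The same key always maps to the same partition, so all events for
    // one key are processed by the same worker.
    static int partitionFor(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Distributes event keys into one bucket per partition.
    static List<List<String>> assign(List<String> keys, int partitions) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(partitionFor(key, partitions)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("house-1", "house-2", "house-1", "house-3"), 2));
    }
}
```

Because routing depends only on the key, stateful queries (for example, a per-house window) stay correct when the stream is split across nodes.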
Predictive Analytics
▪ Predictive analytics learns a decision function (a model) from examples
- Is this fraud?
- How to drive?
- Handwritten text
▪ Build models and use them with WSO2 CEP, BAM, and ESB using the WSO2 Machine Learner product (2015 Q3)
▪ Or build models using R, export them as PMML, and use them within WSO2 CEP
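As a toy illustration of "learning a decision function from examples", the sketch below trains a one-feature fraud classifier in plain Java. It assumes, purely for illustration, that fraudulent transactions have larger amounts on average, and it places the decision threshold midway between the two class means. `ThresholdModel` is a hypothetical class; it is not how WSO2 Machine Learner or Spark MLlib builds models.

```java
// Hypothetical toy model; real products use far richer algorithms.
class ThresholdModel {
    final double threshold;

    ThresholdModel(double threshold) {
        this.threshold = threshold;
    }

    // Learns a threshold halfway between the mean fraud amount and the
    // mean normal amount (assumes fraud amounts are larger on average).
    static ThresholdModel train(double[] amounts, boolean[] isFraud) {
        if (amounts.length != isFraud.length) {
            throw new IllegalArgumentException("amounts and labels differ in length");
        }
        double fraudSum = 0.0, normalSum = 0.0;
        int fraudCount = 0, normalCount = 0;
        for (int i = 0; i < amounts.length; i++) {
            if (isFraud[i]) {
                fraudSum += amounts[i];
                fraudCount++;
            } else {
                normalSum += amounts[i];
                normalCount++;
            }
        }
        if (fraudCount == 0 || normalCount == 0) {
            throw new IllegalArgumentException("need examples of both classes");
        }
        return new ThresholdModel((fraudSum / fraudCount + normalSum / normalCount) / 2.0);
    }

    // The learned decision function: "is this fraud?"
    boolean predict(double amount) {
        return amount > threshold;
    }

    public static void main(String[] args) {
        ThresholdModel model = train(
                new double[] {10, 20, 200, 300},
                new boolean[] {false, false, true, true});
        System.out.println(model.threshold + " " + model.predict(500));
    }
}
```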
WSO2 Machine Learner
▪ A wizard to sample, explore, and understand data through visualizations
▪ A wizard to configure and train machine learning models, and select the best model
▪ Find and use those models with WSO2 CEP, BAM, and ESB
▪ Powered by Apache Spark MLlib
Communicate: Dashboards
▪ The idea is to give an “overall idea” at a glance (e.g. a car dashboard)
▪ Support for personalization: you can build your own dashboard
▪ Also the entry point for drill-down
▪ How to build?
- Dashboard via Google Gadgets and content via HTML5 + JavaScript
- Use charting libraries like Vega or D3
Communicate: Alerts
▪ Detecting conditions can be done via CEP queries
▪ The key is the “last mile”
- Email
- SMS
- Push notifications to a UI
- Pager
- Trigger a physical alarm
▪ How?
- Select the email sender “Output Adaptor” in CEP, or send from CEP to ESB, which has many connectors
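The split between detecting a condition and delivering it over the "last mile" can be sketched in plain Java as below. `AlertRouter` and its methods are hypothetical names used for illustration; in the platform itself, detection would be a CEP query and delivery would be handled by an output adaptor or an ESB connector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; not a WSO2 CEP or ESB API.
class AlertRouter {
    // Each channel stands in for one "last mile" transport (email, SMS, ...).
    private final List<Consumer<String>> channels = new ArrayList<>();

    void addChannel(Consumer<String> channel) {
        channels.add(channel);
    }

    // Detects the condition (value above limit) and, if it holds,
    // sends the same message to every registered channel.
    boolean check(String source, double value, double limit) {
        if (value <= limit) {
            return false;
        }
        String message = source + " exceeded limit " + limit + ": " + value;
        for (Consumer<String> channel : channels) {
            channel.accept(message);
        }
        return true;
    }

    public static void main(String[] args) {
        AlertRouter router = new AlertRouter();
        router.addChannel(msg -> System.out.println("EMAIL: " + msg));
        router.addChannel(msg -> System.out.println("SMS: " + msg));
        router.check("temperature", 98.5, 90.0);
    }
}
```

Keeping channels behind a single interface means a new delivery method can be added without touching the detection logic.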
Communicate: APIs
▪ With mobile apps, most data is exposed and shared with end users as APIs (REST/JSON)
▪ Need to expose analytics results as APIs
▪ Some challenges:
- Security and permissions
- API discovery
- Billing, throttling, quotas, and SLAs
▪ How?
- Write data to a database from CEP event tables
- Build services via WSO2 Data Services Server
- Expose them as APIs via API Manager
Event Stream Store
▪ One-stop place for all event stream definitions
▪ Lets users
- Publish and consume through multiple protocols like REST, JMS, Thrift, WebSockets, etc.
- Discover event streams
▪ Enforce security and authorization
▪ Per-pay subscriptions
▪ Effectively an event stream marketplace!
▪ This will automate the API creation discussed on the previous slide.
What is it good for?
▪ Batch Analytics
▪ Realtime Streaming analytics
▪ Realtime Interactive analytics
▪ Lambda Architecture
▪ Train and use a ML model
▪ Selective Detailed Analysis
http://tinybuddha.com/blog/a-simple-technique-to-solve-problems-before-they-get-bigger/
Selective Detailed Analysis
• It is too expensive to do detailed analysis on all the data
• Instead, detect the condition and dig into the related data
• Fraud toolbox
• Other use cases
– Dynamic offers at a retail site
– Weather
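A minimal plain-Java sketch of this two-step pattern follows: a cheap check flags suspicious accounts, and only the transactions of flagged accounts are collected for expensive analysis. `SelectiveAnalysis`, `Txn`, and `drillDown` are hypothetical names, and the "amount above a limit" rule is an assumed stand-in for whatever condition a real CEP query would detect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not a WSO2 API.
class SelectiveAnalysis {

    static final class Txn {
        final String account;
        final double amount;

        Txn(String account, double amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    // Step 1: cheap detection over all data (amount above limit).
    // Step 2: gather every transaction of the flagged accounts, so the
    // expensive analysis runs on this small related subset only.
    static Map<String, List<Txn>> drillDown(List<Txn> all, double limit) {
        Set<String> flagged = new LinkedHashSet<>();
        for (Txn t : all) {
            if (t.amount > limit) {
                flagged.add(t.account);
            }
        }
        Map<String, List<Txn>> related = new LinkedHashMap<>();
        for (Txn t : all) {
            if (flagged.contains(t.account)) {
                related.computeIfAbsent(t.account, k -> new ArrayList<>()).add(t);
            }
        }
        return related;
    }

    public static void main(String[] args) {
        List<Txn> all = List.of(new Txn("a", 5), new Txn("b", 900), new Txn("a", 7), new Txn("b", 3));
        System.out.println(drillDown(all, 100).keySet());
    }
}
```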
Lambda Architecture
• Same code in both the batch and realtime layers
• The idea is to fill the time between two batch runs
• The batch layer writes the data to a DB
• The realtime layer merges with the batch data via event tables
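The merge step can be sketched in plain Java as below, assuming both layers produce per-key counts: the batch view holds totals up to the last batch run, and the realtime layer holds counts for events seen since then. `LambdaView` and `merge` are hypothetical names; in the platform, the batch results would sit in a DB and be read through event tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of merging batch and realtime views.
class LambdaView {

    // Adds the realtime counts (events since the last batch run) on top
    // of the batch totals, without modifying either input map.
    static Map<String, Long> merge(Map<String, Long> batchView, Map<String, Long> realtimeDelta) {
        Map<String, Long> merged = new HashMap<>(batchView);
        realtimeDelta.forEach((key, count) -> merged.merge(key, count, Long::sum));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> batch = Map.of("page-a", 10L);
        Map<String, Long> realtime = Map.of("page-a", 2L, "page-b", 1L);
        System.out.println(merge(batch, realtime));
    }
}
```

After each batch run, the realtime counts would be reset so that events are not counted twice.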
Real Life Use Cases
▪ Health, Smart Parking solutions
▪ Financial Monitoring
▪ Smart City project, Vehicle tracking, Building monitoring
▪ Railway monitoring
▪ Throttling and Anomaly Detection
▪ API Analytics (13+ customers)
▪ Connected Car
Case Study: DEBS Grand Challenges
▪ The DEBS (Distributed Event-Based Systems) Grand Challenge is a yearly event processing challenge.
▪ 2014 Challenge:
- Smart home electricity data: 2,000 sensors, 40 houses, 4 billion events. We posted 400K events/sec, and close to one million events/sec distributed throughput with 4 nodes.
- One of the four finalists
▪ 2015 Challenge:
- Based on taxi activity collected from New York City over the year 2013: 14,144 taxis, 173 million taxi trip records. We posted 300K events/sec on a single node and were one of the finalists.
https://www.flickr.com/photos/shedboy/3681317392/
Case Study: Realtime Soccer Analysis
Watch at:
https://www.youtube.com/watch?v=nRI6buQ0NOM
Case Study: TFL Traffic Analysis
Built using TFL (Transport for London) open data feeds.
http://goo.gl/04tX6k
http://goo.gl/9xNiCm
Select the Product
Product | Features
WSO2 Data Analytics Server (DAS) | Everything: Batch, Realtime, Interactive, and Predictive Analytics
WSO2 Complex Event Processor (CEP) | Realtime Analytics only
WSO2 Machine Learner | Predictive Analytics only
Questions?
CONTACT US!
