Making the Most of
Customer Data
Srinath Perera
Director, Research, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation
About WSO2
§  Global enterprise, founded in 2005
by acknowledged leaders in XML,
web services technologies,
standards and open source
§  Provides only open source
platform-as-a-service for private,
public and hybrid cloud
deployments
§  All WSO2 products are 100%
open source and released under
the Apache License Version 2.0.
§  Is an Active Member of OASIS,
Cloud Security Alliance, OSGi
Alliance, AMQP Working Group,
OpenID Foundation and W3C.
๏  Driven by Innovation
๏  Launched first open source API
Management solution in 2012
๏  Launched App Factory in 2Q 2013
๏  Launched Enterprise Store and
first open source Mobile solution
in 4Q 2013
What WSO2 delivers
Business Model
Outline
§  Connected Business and Big data analytics
§  Why use Analytics?
§  Big Data Technologies from WSO2
§  BAM – Batch analytics
§  CEP – Real time analytics
§  Lambda Architecture to combine
§  From your business to insights
§  Understand the Customers
§  Targeted Marketing
§  Understand Competition and Market
§  Optimize Operations
§  Predict Outcomes
Adaptive Connected
Business
Connected Business
Be Adaptive
§  Capture business activity
(identified by messages,
transaction execution, and data
state changes) and store data
points for future analytics
§  Deliver automated notifications to
stakeholders and systems based
on business activity, stakeholder
accountability, and authority.
§  Automatically adapt business
process execution based on
events and current conditions
Big Picture
Why Analytics?
§  Because there is
room for improvement,
and you do not know
where or how!
§  A few areas
o Understand customers
o Understand the market and competition
o Efficient marketing
o Optimize your operations
o Predict outcomes
Understand the Customers
§ Not all customers are
equal (80/20 rule)
o They bring in different amounts of
revenue
o They need different things
o They live in different areas
o They use your service at
different times
o They respond to different
things
Marketing
§  The old broadcast model
of marketing
o People are getting
better at ignoring it
o People hate it when you
knock on their door
o Most eyeballs are on the
internet
§  Market to people who
are interested. The key is
finding who is
interested
Understand the Market and Competition
§  What if we could:
o Know what the market thinks
(follow social feeds)?
o Know what customers like and
dislike?
o Know who is unhappy (e.g.
find and react to churn)?
o Know which subset of customers likes
our products?
World is inefficient
§  About 50% of cooked food is wasted
§  About 30% of vegetables and fruits are wasted
§  On average, 5% of revenue is lost to fraud, and
22% of cases are > 1M
§  Most energy (e.g. lighting, mechanical) is
wasted
§  So much time is lost waiting for things,
cleaning up messes, finding things
Big Data Technologies
Collecting Data
§  Data is collected at sensors and sent to the big
data system via events or flat files
§  Event Streams: we name events by their
content or originator
•  Get data through
– Point to Point
– Event Bus
•  E.g. Data bridge
– a Thrift-based transport we
built that handles about 400K
events/sec
Making Sense of Data
§  Basic Analytics
o To know (what happened?)
o Statistics (min, max, average,
histogram … ) + visualizations
o Interactive drill down
§  Advanced Analytics
o To explain (why) - Data mining,
classifications, building models,
clustering
o To forecast – Regression,
Neural networks, decision
models
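To make the "basic analytics" bullet concrete, here is a minimal Java sketch of the statistics it lists (min, max, average, histogram). It is an illustration only: the class name `BasicStats`, the sample order values, and the equal-width bucketing are assumptions made for this example, not part of any WSO2 product.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

/** Hypothetical helper: summary statistics plus a fixed-width histogram. */
class BasicStats {

    /** Returns count, min, max, sum and average for the given values. */
    static DoubleSummaryStatistics summarize(double[] values) {
        return DoubleStream.of(values).summaryStatistics();
    }

    /** Counts values into `bins` equal-width buckets spanning [min, max]. */
    static int[] histogram(double[] values, double min, double max, int bins) {
        if (bins <= 0 || max <= min) {
            throw new IllegalArgumentException("need bins > 0 and max > min");
        }
        int[] counts = new int[bins];
        double width = (max - min) / bins;
        for (double v : values) {
            if (v < min || v > max) continue;      // ignore out-of-range values
            int idx = (int) ((v - min) / width);
            if (idx == bins) idx = bins - 1;       // v == max goes into the last bucket
            counts[idx]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] orderValues = {10, 20, 20, 35, 50};   // made-up sample data
        DoubleSummaryStatistics s = summarize(orderValues);
        System.out.println("min=" + s.getMin() + " max=" + s.getMax()
                + " avg=" + s.getAverage());
        System.out.println(Arrays.toString(histogram(orderValues, 0, 50, 5)));
    }
}
```

In a real deployment these aggregates would more likely be computed by a batch job over stored events; the sketch only shows what is being computed.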
Dashboards and last Mile
§  Presenting information
o  To end user
o  To decision makers
o  To scientists
§  Interactive exploration
§  Sending alerts
http://www.flickr.com/photos/stevefaeembra/3604686097/
Big Data Architecture
Data Collection
•  Can receive
events via SOAP,
HTTP, JMS, ..
•  WSO2 Events is a
highly optimized
version (400K
events/sec)
•  Default agents are
provided, and you can write
custom agents.
// Create the agent and an asynchronous publisher for the receiver endpoint
Agent agent = new Agent(agentConfiguration);
AsyncDataPublisher publisher = new AsyncDataPublisher(
    "tcp://localhost:7612", .. );
// Define the stream and its payload attributes
StreamDefinition definition =
    new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...
// Build an event and publish it to the stream
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);
Business Activity Monitor
BAM Hive Query
Find how much time each player (sid) spent in each cell.
CREATE EXTERNAL TABLE IF NOT EXISTS PlayStream …
select sid,
ceiling((y+33000)*7/10000 + x/10000) as cell,
count(sid)
from PlayStream
GROUP BY sid, ceiling((y+33000)*7/10000 + x/10000);
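The cell expression in the query can be checked by hand with a small Java sketch. It assumes that Hive's "/" produces a floating-point result (hence the 10000.0 divisors) and that the number of events per player and cell stands in for time spent, which only holds if sensors report at a fixed rate. The class name `CellCounts` and the row layout {sid, x, y} are made up for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory equivalent of the GROUP BY sid, cell query. */
class CellCounts {

    /** Mirrors ceiling((y+33000)*7/10000 + x/10000) using floating-point division. */
    static long cell(long x, long y) {
        return (long) Math.ceil((y + 33000) * 7 / 10000.0 + x / 10000.0);
    }

    /** events: rows of {sid, x, y}. Returns event counts keyed by "sid:cell". */
    static Map<String, Integer> countBySidAndCell(long[][] events) {
        Map<String, Integer> counts = new TreeMap<>();
        for (long[] e : events) {
            String key = e[0] + ":" + cell(e[1], e[2]);
            counts.merge(key, 1, Integer::sum);   // same role as count(sid) per group
        }
        return counts;
    }

    public static void main(String[] args) {
        long[][] events = { {1, 0, 0}, {1, 100, 0}, {2, 0, 0} };   // made-up positions
        System.out.println(countBySidAndCell(events));
    }
}
```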
Complex Event Processor
CEP Query
define partition sidPrt by PlayStream.sid, LocBySecStream.sid

from PlayStream#window.timeBatch(1sec)
select sid, avg(x) as xMean, avg(y) as yMean, avg(z) as zMean
insert into LocBySecStream partition by sidPrt

from every e1 = LocBySecStream ->
e2 = LocBySecStream [e1.yMean + 10000 < yMean
or yMean + 10000 < e1.yMean]
within 2sec select e1.sid
insert into LongAdvStream partition by sidPrt ;

First query: calculate the mean location of each player every second.
Second query: detect a run of more than 10 m.
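The intent stated on the slide (average each player's position per second, then flag a run of more than 10 m) can be sketched in plain Java for a single player. This is not how the CEP engine evaluates the query: it assumes a fixed sample rate so that a batch of `batchSize` samples stands in for the 1-second time-batch window, compares only consecutive means, and treats 10000 position units as 10 m. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-player sketch of the two CEP steps. */
class RunDetector {

    /** Averages samples in consecutive batches; a trailing partial batch is dropped. */
    static double[] batchMeans(double[] ySamples, int batchSize) {
        int batches = ySamples.length / batchSize;
        double[] means = new double[batches];
        for (int b = 0; b < batches; b++) {
            double sum = 0;
            for (int i = 0; i < batchSize; i++) {
                sum += ySamples[b * batchSize + i];
            }
            means[b] = sum / batchSize;
        }
        return means;
    }

    /** Returns each index i where the mean moved by more than `threshold` between i and i+1. */
    static List<Integer> longAdvances(double[] means, double threshold) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 1 < means.length; i++) {
            if (Math.abs(means[i + 1] - means[i]) > threshold) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        double[] y = {0, 0, 12000, 12000, 13000, 13000};   // made-up samples, 2 per second
        System.out.println(longAdvances(batchMeans(y, 2), 10000));
    }
}
```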
Lambda Architecture
Applying Big Data
Technologies
Understand the Customers
§  Process transaction logs using Hive
o Build a profile for each customer
o Identify the key 20% that bring in most of the revenue
o Identify which features and feature
combinations they like most
o Find how they reached you
How? Can be done via basic
analytics (Hive and basic stats)
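As a rough illustration of "identify the key 20%", the Java sketch below computes what share of total revenue comes from the top fraction of customers. The per-customer revenue totals are assumed to have been aggregated already (for example by a batch job over the transaction logs); the class name `ParetoRevenue` and the sample figures are made up.

```java
import java.util.Arrays;

/** Hypothetical helper for checking the 80/20 rule on per-customer revenue. */
class ParetoRevenue {

    /** Share of total revenue contributed by the top `fraction` of customers. */
    static double topShare(double[] revenuePerCustomer, double fraction) {
        double[] sorted = revenuePerCustomer.clone();
        Arrays.sort(sorted);                                  // ascending order
        int topN = (int) Math.ceil(sorted.length * fraction); // number of top customers
        double total = 0;
        double top = 0;
        for (int i = 0; i < sorted.length; i++) {
            total += sorted[i];
            if (i >= sorted.length - topN) {
                top += sorted[i];                             // largest topN values
            }
        }
        return total == 0 ? 0 : top / total;
    }

    public static void main(String[] args) {
        double[] revenue = {800, 50, 50, 50, 50};             // made-up totals
        System.out.println(topShare(revenue, 0.2));
    }
}
```

A share close to 0.8 for a fraction of 0.2 would confirm the 80/20 pattern for that customer base; the same sorted list also tells you who those customers are.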
Build a Profile for Customers
§  Get them to register (gets you basic
demographics)
§  Track what they like, what they view, and what
they buy
§  Track how often they buy and where they live
(from the client IP)
§  Follow their social feeds, gauge the
sentiment, find what they like
How? > 50% via basic analytics; the rest
needs some NLP, finding similar
items, classification etc.
Targeted Marketing
1.  Know your stats: know the Leads => Sales
conversion rate and the details of the pipeline.
2.  Analyze user profiles and target your activities
(e.g. based on location, interests etc.)
3.  Tag campaigns and track their effect (Google
ads, workshops, events, email campaigns,
even TV or newspaper ads)
4.  Find how activities affect Leads => Sales.
5.  Use the data for predictive modeling
How? 1-4 with basic analytics +
activity monitoring. #5 with
advanced analytics
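Steps 1, 3 and 4 come down to tagging each lead with the campaign that produced it and computing the Leads => Sales conversion rate per tag. A minimal Java sketch follows, assuming leads are available as rows of {campaignTag, converted}; the class name `CampaignConversion`, the row layout, and the sample tags are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical per-campaign Leads => Sales conversion calculator. */
class CampaignConversion {

    /** leads: rows of {campaignTag, "true" or "false" for whether the lead became a sale}. */
    static Map<String, Double> conversionRates(String[][] leads) {
        Map<String, int[]> tally = new TreeMap<>();           // tag -> {leads, sales}
        for (String[] lead : leads) {
            int[] t = tally.computeIfAbsent(lead[0], k -> new int[2]);
            t[0]++;
            if (Boolean.parseBoolean(lead[1])) {
                t[1]++;
            }
        }
        Map<String, Double> rates = new TreeMap<>();
        tally.forEach((tag, t) -> rates.put(tag, (double) t[1] / t[0]));
        return rates;
    }

    public static void main(String[] args) {
        String[][] leads = {                                   // made-up sample leads
            {"email", "true"}, {"email", "false"}, {"email", "false"},
            {"workshop", "true"}, {"workshop", "true"}
        };
        System.out.println(conversionRates(leads));
    }
}
```

Comparing these rates before and after a campaign change is one simple way to see how an activity affects Leads => Sales.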
Understand the Market and Competition
§  Know who your current customers are and where the
opportunities are. Find the risks (e.g. predict churn)
§  Find which leads are most effective at
conversion
§  Which sequences of actions do users perform often? Maybe
package them as a new product
§  Track social feeds for what users are saying.
Track sentiment. Convert complaints into praise
by acting fast.
How? 20% basic analytics and the rest
advanced analytics
Optimize Operations
§  Instrument your operations pipeline. Know
what happens and where resources are spent
o Manufacturing pipeline
o Sales pipeline
o Marketing pipeline
§  Do predictive maintenance
§  Optimize your IT infrastructure
§  Look out for fraud! (often costs > 30%)
How? 40% basic analytics and the rest
advanced analytics
Operation Dashboard
§  Real time view of your business
§  Visualizations that show the bottom line
at a glance.
§  KPIs, thresholds and alerts
§  Drill down when there are problems (see
Webinar “Gaining Operational Intelligence
with WSO2 BAM”)
§  Different views for different roles
Predict Outcomes
§  Plan the operations, look for risks.
§  Use old data to predict outcomes. Fine-tune
and improve the models.
§  Do what-if analysis and use it to drive your
decisions
§  Try to find predictions for key external
factors (e.g. oil and manufacturing
companies invest in weather forecasts)
Conclusion
§  Analytics is important to your business
o Because there is a lot of room for
improvement, but you do not know where.
§  The Big Data platform
§  Applying Big Data technologies
§  Understand the Customers
§  Targeted Marketing
§  Understand Competition and Market
§  Optimize Operations
§  Predict Outcomes
Questions?
