Making the Most of Customer Data
1. Making the Most of Customer Data
Srinath Perera
Director, Research, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation
2. About WSO2
§ Global enterprise, founded in 2005
by acknowledged leaders in XML,
web services technologies,
standards and open source
§ Provides only open source
platform-as-a-service for private,
public and hybrid cloud
deployments
§ All WSO2 products are 100%
open source and released under
the Apache License Version 2.0.
§ Is an Active Member of OASIS,
Cloud Security Alliance, OSGi
Alliance, AMQP Working Group,
OpenID Foundation and W3C.
๏ Driven by Innovation
๏ Launched first open source API Management solution in 2012
๏ Launched App Factory in 2Q 2013
๏ Launched Enterprise Store and first open source Mobile solution in 4Q 2013
5. Outline
§ Connected Business and Big data analytics
§ Why use Analytics?
§ Big Data Technologies from WSO2
§ BAM – Batch analytics
§ CEP – Real time analytics
§ Lambda Architecture to combine the two
§ From your business to insights
§ Understand the Customers
§ Targeted Marketing
§ Understand Competition and Market
§ Optimize Operations
§ Predict Outcomes
8. Be Adaptive
§ Capture business activity
(identified by messages,
transaction execution, and data
state changes) and store data
points for future analytics
§ Deliver automated notifications to
stakeholders and systems based
on business activity, stakeholder
accountability, and authority.
§ Automatically adapt business
process execution based on
events and current conditions
10. Why Analytics?
§ Because there is
room for improvement,
and you do not know
where and how!
§ A few areas
o Understand customers
o Understand the market and competition
o Efficient marketing
o Optimize your operations
o Predict outcomes
11. Understand the Customers
§ Not all customers are
equal (the 80/20 rule)
o Bring different amounts of
revenue
o Need different things
o Live in different areas
o Use your service at
different times
o Respond to different
things
12. Marketing
§ Old broadcast model
of marketing
o People are getting
better at ignoring it
o People hate it when you
knock on their door
o Most eyeballs are on the
internet
§ Market to people who
are interested. The key is
finding who is
interested
13. Understand the Market and Competition
§ What if we could:
o Know what the market thinks
(follow social feeds)?
o Know what customers like and
dislike?
o Know who is unhappy (e.g.
find and react to churn)?
o Know what subset of customers likes
our products?
14. World is inefficient
§ About 50% of cooked food is wasted
§ About 30% of vegetables and fruits are wasted
§ 5% of revenue on average is lost to fraud, and
22% of cases are > 1M
§ Most energy (e.g. lighting, mechanical) is
wasted
§ So much time is lost waiting for things,
cleaning up messes, finding things
16. Collecting Data
§ Data is collected at sensors and sent to the big
data system via events or flat files
§ Event Streams: we name the events by their
content/originator
• Get data through
– Point to Point
– Event Bus
• E.g. Data bridge
– a Thrift-based transport we
built that handles about 400k
events/sec
17. Making Sense of Data
§ Basic Analytics
o To know (what happened?)
o Statistics (min, max, average,
histogram … ) + visualizations
o Interactive drill down
§ Advanced Analytics
o To explain (why) - Data mining,
classifications, building models,
clustering
o To forecast – Regression,
Neural networks, decision
models
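The basic statistics named above (min, max, average, histogram) can be sketched in a few lines. This is only an illustration: the function name `summarize`, the bin width and the sample values are made up, not part of the original deck.

```python
from collections import Counter
from statistics import mean


def summarize(values, bin_width):
    """Return min, max, mean and a fixed-width histogram of the values."""
    # Each value falls into the bin that starts at the nearest lower multiple of bin_width.
    histogram = Counter((v // bin_width) * bin_width for v in values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "histogram": dict(sorted(histogram.items())),
    }


order_values = [12, 18, 25, 31, 33, 47]  # made-up sample data
summary = summarize(order_values, bin_width=10)
print(summary)
```

The histogram is what a dashboard would typically plot; the drill-down step would then filter the raw values behind a single bin.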
18. Dashboards and last Mile
§ Presenting information
o To end user
o To decision makers
o To scientists
§ Interactive exploration
§ Sending alerts
http://www.flickr.com/photos/stevefaeembra/3604686097/
22. BAM Hive Query
Find how much time is spent in each cell.
CREATE EXTERNAL TABLE IF NOT EXISTS PlayStream …
select sid,
ceiling((y+33000)*7/10000 + x/10000) as cell,
count(sid)
from PlayStream
GROUP BY sid, ceiling((y+33000)*7/10000 + x/10000);
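To make the grid arithmetic in the query concrete, here is a small Python sketch of the same grouping. It assumes each event is a `(sid, x, y)` tuple; the sample events and the helper names `cell_of` and `events_per_cell` are hypothetical. Like the query, it counts events per cell, which stands in for time spent only if events arrive at a steady rate.

```python
import math
from collections import Counter


def cell_of(x, y):
    # Same arithmetic as the Hive expression; "/" is floating-point division here too.
    return math.ceil((y + 33000) * 7 / 10000 + x / 10000)


def events_per_cell(events):
    """Count events per (sid, cell), mirroring GROUP BY sid, cell with count(sid)."""
    return Counter((sid, cell_of(x, y)) for sid, x, y in events)


events = [  # hypothetical (sid, x, y) samples
    (1, 0, -33000),
    (1, 5000, -33000),
    (1, 20000, 0),
    (2, 5000, -33000),
]
counts = events_per_cell(events)
for (sid, cell), n in sorted(counts.items()):
    print(sid, cell, n)
```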
24. CEP Query
define partition sidPrt by PlayStream.sid, LocBySecStream.sid
from PlayStream#window.timeBatch(1sec)
select sid, avg(x) as xMean, avg(y) as yMean, avg(z) as zMean
insert into LocBySecStream partition by sidPrt
from every e1 = LocBySecStream ->
e2 = LocBySecStream [e1.yMean + 10000 < yMean
or yMean + 10000 < e1.yMean]
within 2sec select e1.sid
insert into LongAdvStream partition by sidPrt ;
First query: calculate the mean location of each player every second
Second query: detect a run of more than 10m
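The two steps can be imitated offline in plain Python to see what they compute. This is a simplified sketch, not the CEP engine's semantics: it assumes events are `(sid, timestamp_seconds, x, y, z)` tuples, batches them by whole second, and compares only adjacent seconds, where the pattern query allows any match within 2 seconds. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def mean_location_per_second(events):
    """Group events into 1-second batches per sid and average x, y, z in each batch."""
    batches = defaultdict(list)
    for sid, ts, x, y, z in events:
        batches[(sid, int(ts))].append((x, y, z))
    means = {}
    for key, points in batches.items():
        n = len(points)
        means[key] = tuple(sum(p[i] for p in points) / n for i in range(3))
    return means


def long_advances(means, threshold=10000):
    """Return (sid, second) pairs where yMean moved more than threshold since the previous second."""
    hits = []
    for (sid, sec), (_, y_mean, _) in sorted(means.items()):
        prev = means.get((sid, sec - 1))
        if prev is not None and abs(y_mean - prev[1]) > threshold:
            hits.append((sid, sec))
    return hits


events = [  # hypothetical samples: (sid, timestamp in seconds, x, y, z)
    (7, 0.1, 0, 0, 0),
    (7, 0.6, 0, 2000, 0),
    (7, 1.2, 0, 12000, 0),
    (7, 1.8, 0, 14000, 0),
    (8, 0.5, 0, 0, 0),
    (8, 1.5, 0, 500, 0),
]
means = mean_location_per_second(events)
hits = long_advances(means)
print(hits)
```

Note that the check uses the absolute difference of the two means, so a move in either direction along y counts.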
28. Understand the Customers
§ Process transaction logs using Hive
o Build a profile for each customer
o Identify the key 20% that bring in most of the revenue
o Identify which features and feature
combinations they like most
o Find how they reached you
How? Can be done via basic
analytics (Hive and basic stats)
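As an illustration of the "key 20%" step, the sketch below totals revenue per customer from a transaction log and picks the smallest group of top customers that covers a chosen share of revenue. The `(customer, amount)` record shape, the 80% share and the sample data are assumptions for the example, not taken from the deck.

```python
from collections import defaultdict


def top_revenue_customers(transactions, share=0.8):
    """Return the smallest set of top customers covering `share` of total revenue."""
    revenue = defaultdict(float)
    for customer, amount in transactions:
        revenue[customer] += amount
    total = sum(revenue.values())
    ranked = sorted(revenue.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for customer, amount in ranked:
        if running >= share * total:
            break  # the customers already selected cover the target share
        selected.append(customer)
        running += amount
    return selected


transactions = [  # hypothetical (customer, amount) log entries
    ("a", 500), ("b", 40), ("c", 300), ("a", 100), ("d", 30), ("e", 30),
]
key_customers = top_revenue_customers(transactions)
print(key_customers)
```

On real data the same aggregation would be a GROUP BY over the transaction table; the ranking and cut-off are the part that turns totals into a customer segment.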
29. Build a Profile for Customers
§ Get them to register (gets you basic
demographics)
§ Track what they like, what they view, and what
they buy
§ Track how often they buy and where they live
(from client IP)
§ Follow their social feeds, gauge the
sentiment, find what they like
How? > 50% via basic analytics; the rest
needs some NLP, finding similar
items, classification etc.
30. Targeted Marketing
1. Know your stats: know the Leads => Sales
conversion rate, and details about the pipeline.
2. Analyze user profiles and target your activities
(e.g. based on location, interests etc.)
3. Tag campaigns and track the effect (Google
ads, workshops, events, email campaigns,
even TV or newspaper ads)
4. Find how activities affect Leads => Sales.
5. Use the data for predictive modeling
How? 1-4 with basic analytics +
activity monitoring. #5 with
advanced analytics
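Steps 1, 3 and 4 come down to counting leads and sales per campaign tag. A minimal sketch, assuming each lead is recorded as a `(campaign_tag, converted)` pair; the tags and numbers are invented for illustration.

```python
from collections import defaultdict


def conversion_by_campaign(leads):
    """leads: iterable of (campaign_tag, converted) pairs; returns tag -> conversion rate."""
    totals = defaultdict(int)
    sales = defaultdict(int)
    for tag, converted in leads:
        totals[tag] += 1
        if converted:
            sales[tag] += 1
    return {tag: sales[tag] / totals[tag] for tag in totals}


leads = [  # hypothetical tagged leads
    ("email", True), ("email", False), ("email", False), ("email", False),
    ("workshop", True), ("workshop", True), ("workshop", False), ("workshop", False),
]
rates = conversion_by_campaign(leads)
print(rates)
```

Comparing these rates before and after an activity is one simple way to see how that activity affects Leads => Sales.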
31. Understand the Market and Competition
§ Know who your current customers are and where the
opportunities are. Find the risks (e.g. predict churn)
§ Find which leads are most effective at
conversion
§ What sequences of actions do users perform often? Maybe
package them as a new product.
§ Track social feeds for what users are saying.
Track sentiment. Convert complaints into praise
by acting fast.
How? 20% basic analytics and the rest
advanced analytics
32. Optimize Operations
§ Instrument your operations pipeline. Know
what happens and where resources are spent
o Manufacturing pipeline
o Sales pipeline
o Marketing pipeline
§ Do predictive maintenance
§ Optimize your IT infrastructure
§ Look out for fraud! (often costs > 30%)
How? 40% basic analytics and the rest
advanced analytics
33. Operation Dashboard
§ Real time view of your business
§ Visualizations that show the bottom line
at a glance.
§ KPIs, thresholds and alerts
§ Drill down when there are problems (see
webinar “Gaining Operational Intelligence
with WSO2 BAM”)
§ Different views for different roles
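The "KPIs, thresholds and alerts" item can be sketched as a range check per KPI. The KPI names, ranges and values below are hypothetical; a real dashboard would read them from the analytics store rather than from literals.

```python
def check_kpis(kpis, thresholds):
    """Return alert messages for KPIs outside their (low, high) range."""
    alerts = []
    for name, value in kpis.items():
        # KPIs without a configured range are never flagged.
        low, high = thresholds.get(name, (float("-inf"), float("inf")))
        if value < low:
            alerts.append(f"{name} below threshold: {value} < {low}")
        elif value > high:
            alerts.append(f"{name} above threshold: {value} > {high}")
    return alerts


kpis = {"orders_per_hour": 40, "error_rate": 0.07, "avg_response_ms": 180}
thresholds = {
    "orders_per_hour": (50, float("inf")),
    "error_rate": (0.0, 0.05),
    "avg_response_ms": (0, 500),
}
alerts = check_kpis(kpis, thresholds)
for message in alerts:
    print(message)
```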
34. Predict Outcomes
§ Plan the operations, look for risks.
§ Use old data to predict outcomes. Fine-tune
and improve models.
§ Do what-if analysis and use it to drive your
decisions
§ Try to find predictions on key external
factors (e.g. oil and manufacturing
companies invest in weather forecasts)
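"Use old data to predict outcomes" can be as simple as fitting a straight line to past values and extending it. The sketch below uses ordinary least squares on made-up monthly sales; it is one possible model chosen for brevity, not the method the deck prescribes, and real data would rarely be this linear.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


months = [1, 2, 3, 4, 5]            # hypothetical history
sales = [110, 120, 130, 140, 150]   # hypothetical monthly sales
slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept    # predicted sales for month 6
print(slope, intercept, forecast)
```

A what-if analysis would rerun the forecast with changed inputs, for example a different growth assumption, and compare the outcomes.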
35. Conclusion
§ Analytics is important to your business
o Because there is a lot of room for
improvement, but you do not know where.
§ The Big Data platform
§ Applying Big Data technologies
§ Understand the Customers
§ Targeted Marketing
§ Understand Competition and Market
§ Optimize Operations
§ Predict Outcomes