This document discusses using business analytics to ask the right questions and make data-driven decisions. It outlines the analytical process of tracking data, storing data, merging data through ETL processes, and retrieving data for analysis and decision making. Traditional business intelligence involved siloed data and limited questioning, while modern approaches use flexible data transformation, consolidated data storage, and self-service tools to empower anyone to access and analyze data. The document stresses defining success metrics, looking for low-hanging fruit opportunities, and digging one level deeper during analysis to gain deeper insights and make better decisions.
2. BUSINESS INTELLIGENCE
Operational Control
How many sales did I make today?
Understand & Improve Experience
Are users engaging? Do they like the new features?
Make business decisions
Should we start delivering in a new city?
3. “Data indicated that the same subscribers who loved the original BBC production also gobbled down movies starring Kevin Spacey or directed by David Fincher”
- Andrew Leonard, Salon
4. The Analytical Process
1 Tracking Data
2 Storing Data
3 Merging Data (ETL)
4 Retrieving Data
5 Analysis & Decision Making
9. Storing - Transactional Data
Go with SQL: NoSQL could be a burden long-term
Store all states: Even offline processes
Keep it clean: Messy schema = complicated analytics
11. Storing - Event Data
Own it: Or, at a minimum, be able to get it
Use ecosystem too: Lots of great SaaS event platforms
Store all the IDs: Need to be able to correlate events to transactions
13. You should combine transaction and event data, +more
Use an analytical database
Redshift is current leader
Difficult - data is heavy
Diagram labels: Transactional Data, Event Data, Other Data?, Application, Raw Queries, Biz-User Tools
WITH user_order_activity AS (
  -- one row per user who has placed at least one order
  SELECT user_id
  FROM orders
  GROUP BY user_id
)
SELECT AVG(users.age) AS average_age_of_purchaser
FROM user_order_activity
LEFT JOIN users
  ON user_order_activity.user_id = users.user_id
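The query on this slide can be tried end to end against an in-memory SQLite database. This is only a sketch: the `users(user_id, age)` and `orders(order_id, user_id)` layouts and the function name are assumptions made for illustration, and the inner query selects only `user_id` because age is assumed to live on the `users` table.

```python
import sqlite3


def average_age_of_purchaser(users, orders):
    """Average age over users who placed at least one order.

    users:  list of (user_id, age) rows      (assumed layout)
    orders: list of (order_id, user_id) rows (assumed layout)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, age INTEGER)")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    row = conn.execute(
        """
        WITH user_order_activity AS (
            -- collapse orders to one row per purchasing user
            SELECT user_id FROM orders GROUP BY user_id
        )
        SELECT AVG(users.age) AS average_age_of_purchaser
        FROM user_order_activity
        LEFT JOIN users ON user_order_activity.user_id = users.user_id
        """
    ).fetchone()
    conn.close()
    return row[0]


sample_users = [(1, 20), (2, 40), (3, 60)]
sample_orders = [(1, 1), (2, 1), (3, 2)]  # user 1 ordered twice, user 3 never
result = average_age_of_purchaser(sample_users, sample_orders)
```

Grouping orders by user first means a frequent buyer counts once, so the average describes purchasers rather than purchases.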
14. SUMMARY
Traditional Approach
SILOED: OLAP / Data Summaries
LIMITED: Restricted Q&A
DIFFICULT & CUMBERSOME: ETL - Heavy Transformation
END USER, BI TEAM, ETL TEAM, EDW TEAM
WANT TO ASK NEW QUESTIONS?
EVENT DATA, TRANSACTIONAL DATA
15. Modern Approach
FLEXIBLE: Transformation at Query
ACCESSIBLE: Anywhere for Anyone
CONSOLIDATED: Simple Extract & Load
AGILE: Data Modeling Layer
3RD PARTY APP, API, ANY DEVICE
DATA TEAM, END USERS
Data Model
- name: first_purchasers
  type: single_value
  base_view: orders
  measures: [orders.first_purchase_count]
  listen:
- name: orders_by_day_and_category
  title: "Orders by Day and Category"
  type: looker_area
  base_view: order_items
INNOVATION
TRANSACTIONAL DATA, EVENT DATA
MPP | REDSHIFT | IMPALA
22. Questions from a retail buyer at an e-commerce store:
What’s selling? What colors and sizes is it selling in?
What’s getting returned? Is there a particular size/color?
Is there a product people buy first that increases their likelihood of becoming a repeat customer?
23. Self-Service is Key
Get them the tool: People with questions are running the businesses.
Decisions vs. data science: “Should we open a new market in Maine?”
Game-changing insights: Don’t only come from the analyst group
27. SUCCESS METRICS
Focus on desired outcome: What do you want users to experience?
Measure Engagement: In most cases this is first-line business analytics
Measure Retention: Are people coming back?
28. HOW TO TRACK ENGAGEMENT?
Not with page views
Usually not even with time on page
Upworthy’s attention minutes: Lots of indicators (mouse, video, etc.)
Looker’s approximate usage: Any event in a 2-minute window
29. Deriving Approximate Usage
SELECT
  DATE(event.created_at) AS created_date,
  event.user_id AS user_id,
  COUNT(*) AS count,
  -- each distinct (user, browser, 2-minute window) counts as 2 minutes of usage
  COUNT(DISTINCT CONCAT(
    event.user_id, '|', event.user_browser_id, '|',
    FLOOR(UNIX_TIMESTAMP(event.created_at) / (60 * 2))
  )) * 2 AS approximate_usage_in_minutes
FROM event
GROUP BY created_date, user_id
created_date  user_id  count  approximate_usage
1/10          1        123    100 minutes
1/10          2        228    50 minutes
1/10          3        45     80 minutes
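The same bucketing idea can be sketched outside the database. The function below follows the slide's rule (any event inside a 2-minute window counts as 2 minutes of usage), but the tuple layout, the `at` helper, and the sample events are assumptions made for illustration, not the original implementation.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 60 * 2  # the 2-minute window from the slide


def approximate_usage_minutes(events):
    """Return {(date, user_id): approximate minutes of usage}.

    `events` is an iterable of (user_id, user_browser_id, created_at) tuples,
    where created_at is a timezone-aware datetime (an assumption of this sketch).
    """
    windows_seen = defaultdict(set)
    for user_id, browser_id, created_at in events:
        # Same bucketing as FLOOR(UNIX_TIMESTAMP(created_at) / (60 * 2)).
        window = int(created_at.timestamp()) // WINDOW_SECONDS
        windows_seen[(created_at.date(), user_id)].add((browser_id, window))
    # Every distinct (browser, window) pair is credited with 2 minutes.
    return {key: len(windows) * 2 for key, windows in windows_seen.items()}


def at(hour, minute, second):
    """Hypothetical helper: a UTC timestamp on 10 January 2015."""
    return datetime(2015, 1, 10, hour, minute, second, tzinfo=timezone.utc)


sample_events = [
    (1, "browser-a", at(10, 0, 5)),
    (1, "browser-a", at(10, 1, 59)),  # same 2-minute window as the event above
    (1, "browser-a", at(10, 2, 0)),   # start of the next window
    (2, "browser-b", at(10, 0, 0)),
]
usage = approximate_usage_minutes(sample_events)
```

Counting distinct windows instead of raw events keeps a burst of clicks from inflating the metric: a hundred events in one window still add only 2 minutes.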
30. Derived Tables
SELECT
  orders.user_id AS user_id,
  COUNT(*) AS lifetime_orders,
  MIN(orders.created_at) AS first_order,
  MAX(orders.created_at) AS latest_order,
  -- number of distinct calendar months in which the user ordered
  COUNT(DISTINCT DATE_TRUNC('month', orders.created_at)) AS distinct_months_with_orders
FROM orders
GROUP BY user_id
Diagram labels: Transactional, Event, Analytical, Derived Table, Insights
31. Derived Tables
Start simple: Subselects until slow, SQL on cron works surprisingly well
Most useful at row level: Don’t roll up data, pre-compute facts
Great for cohorts and sessionization: Tiered derived dimension vs. some other metric
32. Derived Table - User Order Facts
SELECT
  orders.user_id AS user_id,
  COUNT(*) AS lifetime_orders,
  MIN(orders.created_at) AS first_order,
  MAX(orders.created_at) AS latest_order,
  -- number of distinct calendar months in which the user ordered
  COUNT(DISTINCT DATE_TRUNC('month', orders.created_at)) AS distinct_months_with_orders
FROM orders
GROUP BY user_id
user_id  lifetime_orders  first_order  latest_order  distinct_months_with_orders
1        10               1/10/15      2/14/15       2
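A derived table like this one can be materialized with plain SQL run on a schedule. The sketch below builds it in an in-memory SQLite database; SQLite has no `DATE_TRUNC`, so `strftime('%Y-%m', ...)` stands in for the month truncation, and the `orders(user_id, created_at)` layout with ISO date strings is an assumption. The sample rows are made up to reproduce the shape of the example row above.

```python
import sqlite3

# Assumption: SQLite stand-in for the slide's query; strftime replaces DATE_TRUNC.
USER_ORDER_FACTS_SQL = """
CREATE TABLE user_order_facts AS
SELECT
    user_id,
    COUNT(*) AS lifetime_orders,
    MIN(created_at) AS first_order,
    MAX(created_at) AS latest_order,
    COUNT(DISTINCT strftime('%Y-%m', created_at)) AS distinct_months_with_orders
FROM orders
GROUP BY user_id
"""


def build_user_order_facts(orders):
    """Materialize one row of order facts per user.

    orders: list of (user_id, created_at) with ISO 'YYYY-MM-DD' date strings.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id INTEGER, created_at TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    conn.execute(USER_ORDER_FACTS_SQL)
    rows = conn.execute(
        "SELECT user_id, lifetime_orders, first_order, latest_order, "
        "distinct_months_with_orders FROM user_order_facts ORDER BY user_id"
    ).fetchall()
    conn.close()
    return rows


january = ["2015-01-10", "2015-01-12", "2015-01-15", "2015-01-20", "2015-01-25", "2015-01-30"]
february = ["2015-02-01", "2015-02-05", "2015-02-10", "2015-02-14"]
sample_orders = [(1, day) for day in january + february] + [(2, "2015-03-01")]
facts = build_user_order_facts(sample_orders)
```

Because the facts stay at the user level instead of being rolled up, they can later be joined back to any row-level query, for example to split results by a lifetime-orders tier.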
35. Churn: Users that will likely never do X again
Usage: How likely to purchase if they do X
Time to transaction: How long till first X
Retention: Are users coming back
Repeat buyers: What’s different about them
???: Invent a metric
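Two of these metrics are simple enough to sketch directly. The definitions below are assumptions chosen for illustration: time to transaction is taken as days from signup to first order, and a repeat buyer is any user with at least two orders. The function names and input shapes are hypothetical.

```python
from datetime import date


def time_to_first_transaction(signups, orders):
    """Days from signup to first order per user; None if the user never ordered.

    signups: {user_id: signup_date}
    orders:  list of (user_id, order_date)
    """
    first_order = {}
    for user_id, order_date in orders:
        if user_id not in first_order or order_date < first_order[user_id]:
            first_order[user_id] = order_date
    return {
        user_id: (first_order[user_id] - signed_up).days if user_id in first_order else None
        for user_id, signed_up in signups.items()
    }


def repeat_buyer_rate(orders):
    """Share of buyers with two or more orders (0.0 when there are no buyers)."""
    order_counts = {}
    for user_id, _ in orders:
        order_counts[user_id] = order_counts.get(user_id, 0) + 1
    if not order_counts:
        return 0.0
    repeat_buyers = sum(1 for count in order_counts.values() if count >= 2)
    return repeat_buyers / len(order_counts)


sample_signups = {1: date(2015, 1, 1), 2: date(2015, 1, 5), 3: date(2015, 1, 7)}
sample_orders = [
    (1, date(2015, 1, 10)),
    (1, date(2015, 1, 4)),
    (2, date(2015, 1, 5)),
]
days_to_first = time_to_first_transaction(sample_signups, sample_orders)
repeat_rate = repeat_buyer_rate(sample_orders)
```

Users who never transact are kept with `None` instead of being dropped, so the same result can feed a churn or retention view.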
37. Inventing Metrics
It was clear some users were accidentally paying instead of charging, but it wasn't clear how widespread the problem was and whether it was worth prioritizing a fix
40. Low-hanging Fruit
This is the kind of very visual, very data-driven piece of analysis that helps us think, “Is opening the sale at noon the right decision?”
41. Low-hanging Fruit
Out of stocks are huge detractors from the customer experience - it sucks ordering something and then not getting it - as well as revenue we failed to capture
43. One Level Deeper
While this immediate insight might have led us to focus on small groups, this didn’t match our expectations of people planning an outing on a Friday night, prompting us to look further.
[Charts: Time To Book; Group Size]
44. One Level Deeper
We analyze all the platform data available: when someone attempts to sign up, completes the signup, pushes an app, has spend, etc.
45. One Level Deeper
Even though it looks like we were having nice incremental growth, looking into the details we see some things to look into further