Big Data Meetup by Chad Richeson
1. Big Data Meetup
Chad Richeson
CEO, Society Consulting
www.societyconsulting.com
linkedin.com/in/chadricheson
@chadricheson
Seattle Technical Forum
June 19, 2013
2. Bio:
• 17 Years analyzing Businesses, Products, & Customers
• 7+ Years building large analytical & technical teams
• 7+ Years above Petabyte scale
• 12 Years at MSFT
My approach:
• Business decisions drive the technical decisions
• Focus on outcomes first, processes second
• Blend technical, analytical, & business skills
About Me
June 19, 2013 www.societyconsulting.com 2
3. Big Data’s Impact on Industry
Nearly every industry is trying to figure out how to apply Big Data concepts to their
business, to uncover new opportunities, improve efficiencies, and minimize risk.
Digital Media & E-Commerce: Real-time ad targeting, Web analytics & trends
Energy and Utilities: Smart meter analytics, Asset management
Financial Services: Risk and fraud management, Portfolio management, Customer analytics
Government: Threat Management, Law Enforcement (Real-time multimodal surveillance, Cyber security detection), Macro economic analytics
Healthcare and Life Sciences: New drug development, Medical record text analytics, Genomic analytics
Retail: CRM, Targeted marketing analysis, Vendor delivery & Supply chain optimizations, Market basket analysis, Click-stream analysis
Telecommunications: CRM, Call detail record analysis, Least cost routing, Fraud management
Transportation: Logistics optimization, Traffic congestion
4. But can we harness it all?
It’s no longer a matter of whether we can collect large amounts of data; it’s whether we can
harness its power.
• Every day approximately 2.5 quintillion (2.5×10^18) bytes of data are created.
• Mobile devices, web tags, smart energy meters, remote sensing, wireless sensors, software and machine logs, cameras, RFID readers, etc. are creating massive amounts of data.
• The economic potential of big data is becoming a “C-level” conversation.
5. Smartphone Data
One driver of data explosion is the smartphone. Today’s smartphone has 14 sensors and
growing. And each phone = one person, which has profound implications on the amount of
meaning that can be derived from the data.
Accelerometer
Gyroscope
Magnetometer
Barometer
Proximity
Light Sensor
Touch Screen
GPS
WiFi
Bluetooth
GSM/CDMA Cell
NFC: Near Field Communication
Camera (front)
Camera (back)
6. Web Data
Another driver of data explosion is the amount of data that can be collected from a web
page. Data growth from the web shows no signs of slowing down.
• Consider 10 million page views a day on a popular web site:
• Capture the User ID for every page view and store it as a 4-byte integer
• 10 million x 4 bytes = 40 MB of storage/day
• 40 MB x 30 days = 1,200 MB (about 1.2 GB)/month, just for User ID
• Data grows quickly, and so do the challenges around storage, processing, and analytics.
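The storage arithmetic on this slide can be reproduced with a short script. This is a minimal sketch, assuming a fixed-width 4-byte User ID, a 30-day month, and decimal units (1 MB = 10^6 bytes); the function name and constants are illustrative and not part of the original talk.

```python
BYTES_PER_MB = 10**6  # decimal megabyte (assumption)
BYTES_PER_GB = 10**9  # decimal gigabyte (assumption)


def user_id_storage_bytes(page_views_per_day: int,
                          bytes_per_id: int = 4,
                          days: int = 1) -> int:
    """Raw bytes needed to store one fixed-width User ID per page view."""
    return page_views_per_day * bytes_per_id * days


daily = user_id_storage_bytes(10_000_000)
monthly = user_id_storage_bytes(10_000_000, days=30)

print(f"Per day:   {daily / BYTES_PER_MB:.0f} MB")
print(f"Per month: {monthly / BYTES_PER_GB:.2f} GB")
```

In decimal units the monthly total comes to 1,200 MB; dividing that by 1,024 instead of 1,000 gives roughly 1.17. Either way, this counts only a single column, before indexes, replication, or any other field captured per page view.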
7. eCommerce Experimentation
New techniques such as Experimentation are creating dramatically more data to analyze.
Consider a typical eCommerce site (AT&T’s wireless site was chosen here as an example.)
8. eCommerce Experimentation
First, the number of elements that can be varied and the number of variants per element are
nearly limitless.
Page elements that can be varied (callouts on the example page):
1 - Vary photos
2 - Vary photos
3 - Vary photos
4 - Vary text
5 - Vary offer
6 - Vary copy
7 - Vary label
8 - Vary text
9 - Vary copy
10 - Vary label
9. Experimentation Data
Then, to properly analyze the impact of an experiment, an analyst must add contextual and
confounding variables to the analysis. Even the simplest version of this analysis contains 500
variants that would need to be analyzed. Derived variables would add to this tally.
Contextual Variables:
- Customer Profile
- Time of Year
- Time of Day
- Customer Location
- Customer Device
- Referring URL
Confounding Variables:
- Page load time
- Competitive Offers
- Other Experiments
- Multiple Tabs
- User’s Other Distractions
The Simplest Example:
- one treatment group
- one control group
10 Experiments * 2 = 20 Experiment variables.
20 Experiment variables * 5 contextual variables = 100 controllable variables.
100 controllable variables * 5 confounding variables = 500 variants to be analyzed.
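The tally above can be sketched in a few lines. This is a minimal sketch that follows the slide's own multiplication (10 experiments, 2 groups each, 5 contextual and 5 confounding variables); the function and parameter names are illustrative, not part of the original talk.

```python
def variants_to_analyze(experiments: int,
                        groups_per_experiment: int,
                        contextual: int,
                        confounding: int) -> int:
    """Multiply out the slide's tally of variants an analyst must consider."""
    experiment_variables = experiments * groups_per_experiment
    controllable_variables = experiment_variables * contextual
    return controllable_variables * confounding


# The slide's "simplest example": one treatment and one control group each.
total = variants_to_analyze(experiments=10, groups_per_experiment=2,
                            contextual=5, confounding=5)
print(total)
```

Because the count is a product, each extra factor scales the whole tally: the same function with six contextual variables returns 600, and adding a second treatment group to each experiment raises the total by half again.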
10. Creating Successful Big Data Projects
Successful Big Data projects blend Analytics Skills, Business Skills, and Technical Skills in the
right proportions.
[Diagram: Business Strategy, Technical Skill, Analytics Skill]
11. The Key Steps
When moving a Big Data project from the Lab to the Mainstream, the following steps can
greatly increase the chances of success. Don’t neglect steps 1 and 2!
1. Pick Focus Areas based on Business Strategy
2. Gain Agreement From Target Users
3. Build The Solution (Start Simple)
4. Analyze & Iterate
5. Expand, Repeat
12. A Single View of the Customer
Don’t bite off too much too soon. Creating a single view of the customer is an admirable
goal, but starting with fewer touchpoints and building out the remainder over time is
usually a better choice. Use goals such as these to set overall vision & direction.
[Diagram: the Customer at the center of a Data “Fabric”, surrounded by Touchpoints]
Touchpoints (Not Exhaustive): SEM, Web Site, Email, Display Ads, Mobile, Partners, Contact Points
• Customer At the Center
• Data as the Common Language Between Systems
• Speed of Communication Between Systems Matters
• Online & Offline Analytics
13. Predictive Analytics
Predictive Analytics is another tantalizing concept that takes significant skill and capability
to achieve. Build up to an advanced concept like Predictive Analytics; don’t start with it.
1 - Mass Market: “Spray & Pray” (“1x ROI”)
• Perform market research from sample of customers to determine needs
• Create marketing messaging that addresses the most common needs
• Review performance and adjust messaging, in context of newest research
STEP
2 - Segmentation: “Cast Different Nets” (“3x ROI”)
• Analyze historical customer data to determine segments
• Create a strategy for each segment, and goals to move customers between segments
• Generate different messaging for each segment
• Review performance and re-craft messages, or re-segment customer base
LEAP
3 - Prediction: “Every Customer A Segment” (“10x ROI”)
• Gather & monitor customer context
• Evaluate explicit customer signals in the correct context
• Generate predictions of what the customer is most likely to need or do next
• Rapidly test and iterate
• Apply learning from each customer to next customer in near-realtime
Many digital marketing organizations are seeking to move beyond segmentation to develop a more personalized, predictive relationship with each customer. Getting to this advanced stage of digital marketing represents a leap forward in terms of competitiveness, but also in terms of capabilities needed.
14. Picking Focus Areas – Key Questions
Pick your early focus areas based on what is achievable. Early wins will generate
excitement, momentum, and more funding to sustain and grow your efforts.
Strategy Questions:
• Which solutions would have the most impact on the business?
• Which solutions are quickest & easiest to implement?
• What role(s) does the business need to play?
Technical Questions:
• From which systems will I need data?
• Is the data clean, accessible, and timely?
• How much will it cost (HW, SW, people, 3rd party data) to build the solution?
• Do we have the skills to build it?
• How long will it take to build?
Analytics Questions:
• What are the criteria to evaluate the success of the solution?
• Do we have the tools to manage & analyze the data?
• Do we have the skills to analyze the data correctly?
15. A Focused Business Question
Let’s take an example of trying to connect two customer touchpoints: an eCommerce web
site, and a Customer Support center. The business goal is to improve the customer
experience and save costs.
Solution Goal: Determine Which Changes to the
eCommerce Experience impact Customer Support Costs
Key Questions:
• Which changes to the purchase funnel increase the proportion of sales made
online vs. via phone?
• Do changes to the purchase funnel reduce the amount of phone time required to
make a purchase?
• Do changes to eCommerce help content reduce the number of phone calls?
16. Capabilities Needed
To create this solution, a number of important capabilities need to be in place.
• Ability to identify a customer across both channels
• Ability to collect, store, and manage the data.
• Ability to connect key data points to the Customer ID (i.e. lots
of processing.)
• Ability to analyze the data effectively (tools such as Adobe
Insight, SAS, and Comscore Digital Analytix.)
• Ability to gauge customer impact over time and in context (this
is hard, often requiring advanced statistical skill.)
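The second and third capabilities, connecting data points from both channels to one Customer ID, can be sketched as a small join. This is a minimal sketch on made-up records, assuming both channels already share a `customer_id` key; the field names (`funnel_variant`, `purchased_online`, `minutes`) and function names are hypothetical, and a real pipeline would run this on a distributed store rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical, made-up records; real field names and sources will differ.
web_sessions = [
    {"customer_id": "c1", "funnel_variant": "A", "purchased_online": True},
    {"customer_id": "c2", "funnel_variant": "B", "purchased_online": False},
    {"customer_id": "c3", "funnel_variant": "B", "purchased_online": True},
]
support_calls = [
    {"customer_id": "c2", "minutes": 12.0},
    {"customer_id": "c2", "minutes": 5.5},
    {"customer_id": "c3", "minutes": 3.0},
]


def phone_minutes_by_customer(calls):
    """Total support-call minutes per customer ID."""
    totals = defaultdict(float)
    for call in calls:
        totals[call["customer_id"]] += call["minutes"]
    return totals


def summarize_by_variant(sessions, calls):
    """Join web sessions to call minutes on customer ID, grouped by funnel variant."""
    minutes = phone_minutes_by_customer(calls)
    summary = defaultdict(
        lambda: {"customers": 0, "online_sales": 0, "phone_minutes": 0.0}
    )
    for session in sessions:
        row = summary[session["funnel_variant"]]
        row["customers"] += 1
        row["online_sales"] += int(session["purchased_online"])
        # Customers with no calls contribute zero phone minutes.
        row["phone_minutes"] += minutes.get(session["customer_id"], 0.0)
    return dict(summary)


summary = summarize_by_variant(web_sessions, support_calls)
for variant, row in sorted(summary.items()):
    print(variant, row)
```

The join itself is the easy part; the first capability on the list, reliably identifying the same customer in both channels, is what makes the shared key possible in the first place.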
17. Key Steps
Once the capabilities are in place, perform the analysis and operationalize the changes.
This is a repeatable process that can be run at any scale.
• Bring the data together (pour some Hadoop on it!)
• Find the success clusters, then the success factors.
• Be your own worst critic. Analyze competing explanations.
• Get another analyst to serve as a second set of eyes. Great
analysis survives all scrutiny.
• EXPERIMENT before rolling changes out to all customers.
Business must be bought into this, or the analysis will likely sit
on the shelf.
• With the momentum you gain, repeat for other business
questions.
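For the EXPERIMENT step, one common way to check whether a funnel change moved the online-sales rate is a two-proportion z-test between the control and treatment groups. This is a minimal sketch of that standard test, not a method named in the talk; the counts below are made up, and the function name is illustrative.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in rates, B minus A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: online purchases out of visitors in each group.
z, p_value = two_proportion_z_test(success_a=200, n_a=1000,   # control
                                   success_b=250, n_b=1000)   # treatment
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value only says the difference is unlikely to be chance. It does not rule out the confounding variables discussed earlier, which is why competing explanations and a second analyst still matter.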
18. In Summary
• Treat Big Data projects as a combination of
Business, Technical, & Analytics skills.
• Gain up-front agreement from the users of what you
are building.
• Keep focused. Build up wins, then expand.
• Don’t be afraid to fail early, but set expectations
appropriately.
• Network with your peers – keep learning. Big Data is
an exercise in learning.