CC Haas B-School Talk 2105

calculation | consulting
data science leadership (TM)
charles@calculationconsulting.com

Who Are We?
Dr. Charles H. Martin, PhD
University of Chicago, Chemical Physics
NSF Fellow in Theoretical Chemistry
Over 10 years of experience in applied Machine Learning
Developed ML algos for Demand Media, the first $1B IPO since Google
Lean Startups: Aardvark (acquired by Google), eHow, Mode
Wall Street: BlackRock, GLG
Fortune 500: Big Pharma, Telecom, eBay
www.calculationconsulting.com

BackStory: in 2011, Search Changed. Forever.
• first $1B IPO since Google
• Machine Learning based SEO algorithms
• Measure the demand for search, and fulfill it
data science algorithms created a billion-dollar company
Demand Media (eHow.com)

BackStory: in 2011, Search Changed. Forever.
• Google adapted (Panda)
• Lack of diversification
• Lack of adaptation
• Stock price never recovered
algorithmic accountability: DMD or Google?
[Chart: DMD stock price 2011-2012, with the IPO and Panda marked]

• first $1B collapse due to Panda?
• CPC revenues down
• premium online publishers died
[Chart: stock price 2011-2012, collapse marked with "?"]
$1B in ad revenue was repriced and reallocated
Problem: Cornering the market on search induced a market crash

Panda-Induced ‘Market Crash’
Google CPC dropped just after Panda

Data Science is Different
Davenport
Generating sustainable revenue requires Data Science Leadership and Execution
“Companies need a Spock in the boardroom”

http://www.theonion.com/articles/national-science-foundation-science-hard,1405/

Problem: Data Scientists are Different
not all techies are the same

theoretical physics: machine learning specialist
experimental physics: data scientist
engineer: software, browser tech, dev ops, …

Managing: Data Science Process
• Acquire Domain Knowledge
• Formulate Hypothesis
• Generate Model(s) from the Data
• Predict Revenue Gains
• Backtest Predictions on your Data
• A/B Test in Production
• Attribute Gains to Model(s)
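The "A/B Test in Production" and "Attribute Gains" steps can be sketched with a two-proportion z-test. This is a minimal illustration, not the method used in the talk: the function name, the visitor counts, and the conversion counts are all hypothetical, and it assumes a simple two-arm test with a binary conversion outcome.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and a model-driven variant (B)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via the error function).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return {"lift": p_b - p_a, "z": z, "p_value": p_value}


# Hypothetical traffic split: 10,000 visitors per arm.
result = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(result)
```

Only the lift that survives a test like this should be attributed to the model; the backtest predicts the gain, the A/B test confirms it.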
[Diagram: framing, solving, acting]

Managing: Learning from Data
• Systems Thinking: leveraging the inter-relationships between data, marketing, and the customer
• Knowledge Transfer: mentoring, not training, to develop both personal mastery and team learning
• Mental Models: create a base of small-scale models for thinking about how to use your data
• Knowledge Sharing: foster collaboration between research, engineering, and product to drive revenue

• Cross-functional: engineering, product, marketing, finance
• Autonomous: separate from the traditional engineering product lifecycle; self-organizing and self-managing
• Experimental: form hypotheses, analyze data, make predictions, run backtests, A/B test
• Self-sustaining: not a cost center; generates revenue

Solution: Collecting and Organizing Data
• Most companies are struggling to organize their data
• Data needs to be examined
• Don’t assume data is correct or useful
• More is More: simple algos work
• More is Less: noise is noise
Data not examined is not collected

Solutions: Hadoop and Big Data
• Hadoop is an internal data ecosystem
• Hadoop appears to have won the adoption wars?
• Hadoop: 90% of deployments are internal
• Hadoop is a cost center
• ROI needs cut across business divisions
Algorithms, not data, generate revenue

Solutions: Cloud
• Startups don’t need infrastructure
• Long-term data storage is virtually free
• Amazon Redshift
• Google BigQuery
• Cloud is the future?

Solutions: Spark
• Next Gen Platform for Machine Learning
• Sits on Hadoop or the Cloud
• Still very high touch
• Limited algos

Problem: Measurements
good experiments are amazing
“If you can’t measure it, you can’t fix it.”
DJ Patil, White House Chief Data Scientist

Data Science’s Measurement Problem
good experiments are hard to design
http://www.forbes.com/sites/lizryan/2014/02/10/if-you-cant-measure-it-you-cant-manage-it-is-bs/

“Data science has a measurement problem.
Simple metrics may not address complex situations.
But complex metrics present myriad problems.”
“As we strive for better algorithms,
we often fail to think critically about what it means
for predictions to be ‘good’”
http://www.kdnuggets.com/2015/03/data-science-measurement-problem-accuracy-auroc-f1.html
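A small sketch of why a simple metric can mislead, using accuracy and F1 on an imbalanced problem. The data is made up for illustration: 1,000 hypothetical cases with 10 positives, one "model" that always predicts negative, and another that finds 8 of the 10 positives at the cost of 30 false alarms.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # With no true positives and no errors on positives, define F1 as 0.
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels: 10 positives followed by 990 negatives.
y_true = [1] * 10 + [0] * 990

# Model A never predicts the positive class.
pred_a = [0] * 1000
# Model B catches 8 of 10 positives but raises 30 false alarms.
pred_b = [1] * 8 + [0] * 2 + [1] * 30 + [0] * 960

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, accuracy(y_true, pred), f1_score(y_true, pred))
```

Model A wins on accuracy while finding nothing; model B loses on accuracy while doing the useful work. Which prediction is "good" depends on what the business is actually trying to measure.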

“Buffett found it ‘extraordinary’ that academics studied such things.
They studied what was measurable, rather than what was meaningful.
‘… to a man with a hammer, everything looks like a nail.’”
― Roger Lowenstein, Buffett:
The Making of an American Capitalist

Problem: The Cult of the Algorithm
what can algos actually do?
“We have a new machine learning algo that anticipates
your needs over time and behaves accordingly”

Problem: What can Machine Learning Do?

Demand Algos: Gas Station Analogy
Problem: where to open a gas station ?
Need: good traffic, weak competition
[Diagram: fewer competitors but no traffic | sweet spot | great traffic but too many competitors]
all businesses balance supply and demand
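The gas station analogy can be sketched as a toy scoring rule: demand divided by competition. This is an illustrative assumption, not the demand algorithm from the talk; the `Site` class, the `opportunity` function, and all traffic and competitor numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    daily_traffic: int  # proxy for demand
    competitors: int    # proxy for existing supply


def opportunity(site):
    """Toy score: traffic shared between the new station and its competitors."""
    return site.daily_traffic / (1 + site.competitors)


# Hypothetical candidate locations.
sites = [
    Site("quiet back road", daily_traffic=200, competitors=0),
    Site("mid-size junction", daily_traffic=5000, competitors=2),
    Site("busy highway exit", daily_traffic=12000, competitors=9),
]

best = max(sites, key=opportunity)
print(best.name, round(opportunity(best), 1))
```

The same shape applies to search: measure query demand, measure how well existing content already serves it, and invest where the ratio is highest.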

SaaS Machine Learning Algos
machine learning contests
• Diabetic Retinopathy Detection: $100,000, 167 teams
• March Machine Learning Mania 2015: $15,000, 341 teams

SaaS Machine Learning Algos
machine learning APIs

Problem: What can Deep Learning Do?

Problem: Externalities
external factors can change

Problem: Externalities
“Zynga is our best company ever!” (2010)
John Doerr, Google Investor, Legendary VC
http://venturebeat.com/2010/11/16/google-investor-john-doerr-zynga-is-our-best-company-ever/
one marketplace | big risks

Solution: Algorithmic Accountability
An asset is an economic resource.
Anything tangible or intangible that is capable of
being owned or controlled to produce value and
that is held to have positive economic value is
considered an asset.
algorithms can be valuable assets

Algorithmic Accountability
does revenue depend on hidden algos?
• WebMD Google SEO
• Amazon Product Listing Algo
• Pinterest Relevance Algo
• Twitter Spam filter
• Apple App Store Rankings

do decisions depend on hidden factors?
A 'Crisis' in Online Ads: One-Third of Traffic Is Bogus
http://www.wsj.com/articles/SB10001424052702304026304579453253860786362
Now Algorithms Are Deciding Whom To Hire…
http://www.npr.org/blogs/alltechconsidered/2015/03/23/394827451/now-algorithms-are-deciding-whom-to-hire-based-on-voice
What you don’t know about Internet algorithms is hurting you…
http://www.washingtonpost.com/news/the-intersect/wp/2015/03/23/what-you-dont-know-about-internet-algorithms-is-hurting-you-and-you-probably-dont-know-very-much/

Solution: Algorithmic Transparency
can you be transparent and not be gamed?
How do you govern a (hidden, fluid and amoral) algorithm?
http://fortune.com/2015/03/18/how-do-you-govern-a-hidden-fluid-and-amoral-algorithm/
83% of the participants in the study changed their behavior
once they knew about the algorithm
participants mistakenly believed that their friends intentionally
chose not to show them stories

Do you depend on someone else’s marketplace?
How does your revenue depend on algos?
Do you need an internal algo?
Who will manage it? build it? maintain it?
algorithms have unforeseen liabilities
