Ben Biddle, former Data Scientist at Edmunds, turned the volume down on all the data hype in this talk and focused instead on what's most relevant and immediately actionable for a product manager.
21. Top 3 Things to Remember
1. Get to know your data flows
2. Think in terms of hypotheses, not metrics
3. Use information value to pick the right data for the job
24. Recommended Reading
● How to Measure Anything by Douglas Hubbard
● Getting to Plan B by Mullins and Komisar
● Data Driven by Thomas Redman
● Why by Samantha Kleinberg
● Data Science for Business by Provost and Fawcett
26. Part-time Product Management Courses in Silicon Valley, New York, Los Angeles, and Orange County
www.productschool.com
Editor's Notes
When you checked in tonight, you got an email inviting you to join our Slack community
In that community, we have 12k product people who have come through companies like Google, Facebook, and Uber
We share information about events, job offers from our partner companies, and valuable online content
Please check your email and join - it’s free
In our PM Course, we teach how to build products and how to get a job as a software product manager
All our classes are 2 months, part time, and compatible with full time jobs. We have two options, Tues/Thurs in the evening and Saturdays in the morning
Instructors are senior-level product managers from companies like Google, FB, Uber, etc.
In addition to our PM class, we offer our Coding for Managers class
Also two months and part time, tailored for professionals who don’t come from a traditional engineering background
The goal of this course is not to make you a software engineer, but to give you enough technical background to build a fully functional website and pass the technical interview
Similar to our coding course, we also offer our Data Analytics for Managers course
Tailored for people who don’t come from a technical background, it gives them enough knowledge of analytics to become product managers
Also two months, compatible with full time jobs
The goal of the course is not to make you a data scientist, but to make you technical enough to understand web analytics, learn SQL, and machine learning concepts
We are also live streaming our event to our online audience
If you want to share, please tweet @productschool and #prodmgmt for a free ticket to our next event
Etugo Nwokah
There are already enough books and courses out there to tell you in theory how to get the most value out of data as a product manager
My focus today will be on what it really looks like in practice, specifically the ways in which it can go wrong
I’ve learned a few hard lessons over my 15-year career, and hopefully some of you have hard-won lessons of your own that we can all learn from
Information is anything that reduces uncertainty; it can be used to describe something, or to make an inference or prediction
When defined simply as the digital equivalent of information, it’s easy to see why data is such an important strategic asset in the information age and a knowledge economy such as that of the US. As AI, robotics and automation continue to advance, non-routine knowledge work will be all that’s left. Every job function will depend on data & analytics; specifically, how well you are able to extract, refine and deploy data.
Extract - find and access valuable new data sources and novel combinations of data sets
Refine - cleanse, transform and enrich data sets
Deploy - make actionable data available to individuals and services when and where it’s needed
Formulating hypotheses - identifying market opportunities or jobs-to-be-done
Testing hypotheses - A/B testing as well as focus groups, customer interviews, user testing, other observational data
Delivering your solution - marketing, sales, support and service as feedback loops for the product manager to take advantage of; data exhaust as a source of ideas for both improvements and new products; data network effects as a competitive moat
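To make the A/B testing point concrete, here is a minimal sketch of a two-proportion z-test on conversion rates. The function name and all of the visitor/conversion numbers are hypothetical, invented for illustration; real tests should also account for sample-size planning and multiple comparisons.

```python
# Hypothetical A/B test: did variant B convert better than variant A?
# All counts below are made-up for illustration.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided
    return z, p_value

z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=155, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Note that a small p-value only fails to disprove the null for this one metric; it says nothing about confounders, which is exactly why the talk warns against stopping at A/B tests.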
In the rush to get something to market, product managers sometimes seem to forget what the V stands for in MVP
Make sure what you’re putting in market is actually viable; don’t let a good idea end up on the waste heap because what you shipped was actually half baked
Don’t limit yourself to just A/B testing either; anything that reduces uncertainty about a hypothesis should be considered a valid form of testing
Beware of confounding factors; make sure you’ve explored as many alternative explanations as possible for an observed phenomenon, but always favor the simplest one (Occam’s razor)
Manage hypotheses, not metrics; start with the hypothesis then figure out the best measurements to use for testing that hypothesis
We often inherit metrics from the top down; management decides how to measure success and your job is to move the needle
When that’s the case, make sure you can establish a causal link or causal chain between the metric you need to impact and the hypothesis that you are testing
In psychology this tendency is referred to as confirmation bias, the tendency to seek out information that’s consistent with a pre-held belief and to discount or ignore information that might refute it
Being aware of confirmation bias doesn’t make you immune to it; in fact, that’s the definition of a bias
Oftentimes business stakeholders seem to have too much reverence for data and analytics; data is as fallible as the people collecting, managing, and interpreting it
You can never prove a hypothesis; you can only fail to disprove it; everything should be subject to falsifiability
Models, whether they’re written in Python and stored in a Jupyter notebook or derived from personal experience and encoded in the wet-ware of a brain, are simplified descriptions of more complex systems
They necessarily exclude information; the important question is whether what remains is actually useful
This reinforces the importance of always maintaining a healthy amount of skepticism toward whatever the data is telling you; it also relates to my next point
More data delivered faster doesn’t always lead to better business outcomes
Data comes at a cost; there are costs incurred to find, access, and manage the data, and there are the potential costs (i.e. risks) of something going wrong, whether it’s a security breach or noise that’s mistaken for a signal
Data’s value varies along 3 dimensions - quality, accessibility and interpretability (or meaningfulness)
One measure of data quality is the error rate, which tends to go down over time through reconciliation and error checking; the precision and accuracy of measurements also influence data quality; depending on the business case, you might have a higher or lower error tolerance and require differing levels of precision in your measurements, e.g. marketing mix models vs. multi-touch digital attribution models
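The distinction between precision and accuracy mentioned above can be sketched in a few lines. The two "sensors" and all of their readings below are invented for illustration: accuracy is how close the mean is to the true value (systematic error), while precision is the spread across repeated measurements (random error).

```python
# Illustrative (made-up) readings: both sensors are accurate on average,
# but sensor B is far less precise than sensor A.
from statistics import mean, stdev

true_value = 100.0
sensor_a = [99.8, 100.3, 99.9, 100.1, 99.9]    # accurate and precise
sensor_b = [104.9, 95.2, 105.1, 94.8, 100.0]   # accurate on average, imprecise

for name, readings in [("A", sensor_a), ("B", sensor_b)]:
    bias = mean(readings) - true_value          # accuracy (systematic error)
    spread = stdev(readings)                    # precision (random error)
    print(f"sensor {name}: bias={bias:+.2f}, spread={spread:.2f}")
```

Which of the two matters more depends on the error tolerance of the business case, per the marketing-mix vs. multi-touch attribution example above.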
Accessibility means getting information to the people who need it, when they need it, with as little friction as possible; it also means keeping out people who should not be accessing the data. Oftentimes there’s an inverse relationship between the two, and both can increase your cost basis.
Interpretability or meaningfulness is determined by how easy it is to turn data into action. It starts with a well-defined data model, documented in a clear and well-maintained data dictionary containing important metadata. Data can be transformed, enriched, and visualized to make it easier to interpret, but if you don’t understand what the numbers mean and have confidence in their reliability, at best you will get stuck in analysis paralysis. At worst, you’ll make a decision based on a spurious conclusion. In cases of information overload, people tend to fall back on faulty heuristics.
What about implicit vs. explicit data? Is one better than the other?
Build your business case based on information value
(Potential gain assuming a true hypothesis)*(Probability the hypothesis is true) – (Potential loss assuming a false hypothesis)*(Probability the hypothesis is false)
The value with and without information depends on the reduction in uncertainty. A given data source may reduce uncertainty more but at what cost?
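The expected-value formula above can be worked through with a back-of-the-envelope calculation. Every figure here is hypothetical: a gain if the hypothesis is true, a loss if it is false, and a study cost that (in this invented scenario) shifts our estimate of the probability the hypothesis is true.

```python
# Expected value of acting on a hypothesis, mirroring the formula above:
# (gain * P(true)) - (loss * P(false)). All figures are hypothetical.
def expected_value(gain, loss, p_true):
    return gain * p_true - loss * (1 - p_true)

# Without extra data, we estimate a 60% chance the hypothesis is true.
ev_without = expected_value(gain=100_000, loss=40_000, p_true=0.60)

# Suppose a $5,000 study would shift our estimate to 85%.
ev_with = expected_value(gain=100_000, loss=40_000, p_true=0.85) - 5_000

print(f"EV without data: ${ev_without:,.0f}")   # $44,000
print(f"EV with data:    ${ev_with:,.0f}")      # $74,000
```

The difference between the two is the most you should pay for that data source; if the study cost more than the reduction in uncertainty is worth, skip it.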
Where does data come from, how is it managed, how is it used? What are the feedback loops available to you as a product manager? What kind of data exhaust does your business and/or product generate? How can you benefit from data network effects?
Can you come up with a reasonable causal explanation for what’s being observed and link it back in some way to your top-level business objectives? How do the metrics you are using align with your customer’s job-to-be-done? Are you looking for your keys under the lamppost because that’s where the light is best, or because that’s where you think you actually dropped them?
How much do you really need to reduce your uncertainty to confidently make a decision? What’s your tolerance for errors? How quickly and often do you need to refresh the information available to you? How precise do your measurements need to be? Does the information come with security risks?