2. COMPANY CONFIDENTIAL
How the business gets involved in the modeling
process (and the challenges involved)
• CPG (consumer packaged goods)
• One of the first things I learned in the DS biz is that the biz problem is never far from the DS work:
biz wants to be involved at all stages
– They want to pose the problem
– Give perspective on solutions
– Review what DS is finding
– Refine the process and make suggestions
– Understand and critique the results
– Porous layer between biz and DS teams
• Can be a very positive thing: ideas on what should be included, validate if the results are
meaningful, biz context needed to build good models
• Downside: biz will often lead you down paths that are not productive or defensible + anecdotes!
• Having biz involved forces you to have models that are explanatory and not just predictive, which
means they are meaningful
• If you focus only on prediction, this will lead to overfitting
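A minimal sketch of the overfitting point, on made-up data (the process y = 3x + 2 plus noise, the sample sizes, and all function names are assumptions for illustration, not anything from the talk). A "memorizer" that copies the nearest training point fits the training set perfectly but typically does worse on fresh data than a simple, explainable straight line.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def memorizer(train_x, train_y):
    """Return a predictor that copies the y of the nearest training x."""
    def predict(x):
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[nearest]
    return predict

def mse(predict, xs, ys):
    """Mean squared error of a predictor on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng):
    # Hypothetical process: y = 3x + 2 plus Gaussian noise (sd = 2).
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 2) for x in xs]
    return xs, ys

rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(100, rng)
test_x, test_y = make_data(500, rng)

slope, intercept = fit_line(train_x, train_y)

def line(x):
    return slope * x + intercept

memo = memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memo)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 2),
          "test MSE:", round(mse(model, test_x, test_y), 2))
```

The memorizer's training error is zero by construction, which is exactly why training fit alone says little; the fitted slope and intercept, by contrast, are something the biz can read and challenge.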
3.
It’s all about the data!
• Morgan Stanley: we sell AA, but many people do basic stuff with data
• Means that you don’t spend that much time on algo stuff; it is mostly about
feature generation and data prep
• In SV, with internet companies, the data science is: throw all the data at an
algorithm
• If you can be more intelligent with feature gen, you will get better
performance
• Nevertheless, the more data you can get, the better
• So acquisition of data is very important and part of the process (overlooked)
• Traditional world: what data to use, which transforms VERSUS throwing
data in an algorithm and hoping for the best
– This is overlooked
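A small sketch of why intelligent feature generation can beat throwing raw columns at an algorithm. Everything here is hypothetical: the columns `spend` and `visits`, the assumption that the target is driven by spend per visit, and the helper `pearson` are made up for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable

# Raw columns (hypothetical customer data).
spend = [rng.uniform(10, 500) for _ in range(300)]
visits = [rng.randint(1, 20) for _ in range(300)]

# Engineered feature: a ratio a domain expert might suggest.
spend_per_visit = [s / v for s, v in zip(spend, visits)]

# Assumed target: driven by the ratio, plus a little noise.
target = [r + rng.gauss(0, 1) for r in spend_per_visit]

corr_spend = pearson(spend, target)
corr_visits = pearson(visits, target)
corr_ratio = pearson(spend_per_visit, target)

print("raw spend       :", round(corr_spend, 2))
print("raw visits      :", round(corr_visits, 2))
print("spend per visit :", round(corr_ratio, 2))
```

Under these assumptions the single engineered ratio carries far more linear signal than either raw column on its own, so even a very simple model can use it.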
4.
It’s not about the algorithm!
• Evicore example
• In a very short period of time, just using the straightforward approach, we
found a way to save tens of millions of dollars
• By contrast, a company like VMware is obsessed with applying
advanced algorithms to small amounts of data, not rich data, and is not
making an impact on the biz
• What is more important than the algo is finding an important biz problem
and getting to a solution in a meaningful time period
• Also more important: operationalizing the analytics results
• You can have a perfect model, but if it is not in production it is just an insight that can die
on the vine
• A simple model can give you lift in customer acquisition and an immediate
impact on fraud
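For reference, "lift" can be computed as the conversion rate among the top-scored customers divided by the overall conversion rate. The sketch below uses ten made-up customers; the function name `lift_at`, the scores, and the outcomes are all assumptions for illustration.

```python
def lift_at(scores, outcomes, fraction):
    """Lift of the top-scored `fraction` of customers over the overall rate.

    scores:   model scores, higher means more likely to convert
    outcomes: 1 if the customer converted, else 0
    Assumes at least one conversion overall (otherwise the rate is zero
    and the ratio is undefined).
    """
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))  # size of the targeted group
    top_rate = sum(o for _, o in ranked[:k]) / k
    overall_rate = sum(outcomes) / len(outcomes)
    return top_rate / overall_rate

# Hypothetical scores from a simple model and observed conversions.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
converted = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]

print("lift in top 20%:", lift_at(scores, converted, 0.2))
```

Here both of the two top-scored customers converted (100%) against 40% overall, so targeting the top 20% gives a lift of 2.5. A lift of 1.0 would mean the model is no better than contacting customers at random.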
5.
How to become a data scientist!
• Personal experience and what you see during hiring
• Recruiting stuff
• Plug for Alpine!
• Internships are the most important! More so than courses and
stuff
• All about connections
• Meetups