Data Science vs. Jungle Cats
A Paradigm For Data Science in Fundamental Investing
vs. y = m1x1 + ... + mnxn + b
By Ashlee Bennett
Data Science is a combination of:
- Computer Science/programming
- Math & Statistics
- Domain expertise
Data Science requires domain expertise (or so a Google image search suggests)
This is the hardest to find and often the most important!
Domain expertise is crucial in nearly every step of the data science process
Qs: What data would answer what question?
Ex: Sales v. Units; B&M v. Online; Doors v. Users
Qs: What transformations or interpolations are contextually appropriate?
Ex: Quarterly v. Monthly; Outliers: drop or keep; Nulls: drop or fill with #
Qs: What performance metric is aligned with business objectives?
Ex: Precision v. Recall; Correlation v. Contrast; Ranking v. Grouping
Qs: What model is optimal or practical for the business framework?
Ex: Black v. Clear Box; Speed v. Accuracy; Descriptive v. Predictive
Qs: What assumptions can be made given fundamental knowledge?
Ex: Customer base? Management claims? Business initiatives?
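One of the trade-offs listed above, precision v. recall, can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: the function name and the counts (a hypothetical screen that flags bad deals) are assumptions, not figures from the deck.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from confusion-matrix counts."""
    predicted_pos = true_pos + false_pos   # everything the screen flagged
    actual_pos = true_pos + false_neg      # everything that really was bad
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical counts: 8 bad deals caught, 2 false alarms, 12 bad deals missed.
precision, recall = precision_recall(true_pos=8, false_pos=2, false_neg=12)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here most flags are correct (high precision) while most bad deals slip through (low recall). Which of the two matters more is a business question, and that is where domain expertise comes in.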
- Data Collection
- Cleaning & Transformation
- Performance Metric Selection
- Model Evaluation
- Analytic Interpretation
Data scientists can answer some of the questions that arise during the pipeline with
common sense or research, but the process is often faster, and the outcome better,
when the expertise of the business end user is incorporated from the get-go and used
directly to refine the process.
Business end users can be a key source of domain expertise
Without domain expertise, irrelevant data could be misleadingly transformed,
deceptively interpolated, and evaluated via an irrelevant performance metric with
inappropriate models, only to reach a meaningless conclusion.
Based upon the log-transformed, tree-rat
interpolated mean-square error rate of
sea slug population decline predicted in
this naive random support vector forest
gradient boost ensemble regressor, the
price of tea in China should rise two cents
over the next decade...
? wtf?
So what does this have to do with jungle cats?
A jungle cat, or a jungle cat attack, is a metaphor for a rare
yet critical event. We want to quickly identify and avoid jungle
cats, just like we want to identify and avoid disastrous
investment choices, especially when these choices are few
and carry a big impact.
While fundamental & PE investors have to worry about "jungle cats",
quants just worry about mosquitos. Mosquito bites suck & happen
frequently, but they don't kill you because each one is small compared to
your total surface area of skin... unless you get many over a
short time period, or they act as vectors spreading a disease.
You can handle a few mosquito bites, but probably not a few jungle cat
attacks.
Quants placing many [smaller] bets can afford to gloss over, or skip, domain
expertise in favor of speed and diversification, because a slightly less accurate or "underfit" model with
a few hiccups won't tank their portfolio; those are just metaphorical mosquito bites.
Fundamental and Private Equity investors place fewer [larger] bets, so they have more to gain
but also more to lose. Incorporating domain expertise can provide the mission-critical edge that
both identifies a good investment and avoids a disastrous one: a metaphorical jungle cat.
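A back-of-the-envelope sketch of the mosquito v. jungle cat arithmetic, assuming equal-weighted bets and a total loss on each bad one. The bet counts and the function name are hypothetical, chosen only to show the scale of the difference.

```python
def portfolio_hit(num_bets, losers, loss_fraction=1.0):
    """Fraction of an equal-weighted portfolio lost when `losers` bets
    each lose `loss_fraction` of their value."""
    if not 0 <= losers <= num_bets:
        raise ValueError("losers must be between 0 and num_bets")
    return losers * loss_fraction / num_bets


# Hypothetical books: a quant with 500 small bets, a fundamental investor with 10.
quant = portfolio_hit(num_bets=500, losers=3)
fundamental = portfolio_hit(num_bets=10, losers=3)
print(f"quant: {quant:.1%} of portfolio, fundamental: {fundamental:.1%} of portfolio")
```

The same three mistakes are a rounding error in one book and a mauling in the other.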
So where am I going with this?
Imagine a primitive villager who walks out of their
hut one day only to see a fierce jungle-cat....
Even if they've only been attacked by a jungle cat
once, or maybe only ever heard about one, they'll
probably instantly know to turn around and GTFO
But would an algorithm know to turn and run to
safety??
This is no mosquito bite, the stakes are high...
So would an algorithm know to turn and run to safety??
Eventually...
An algorithm would eventually know to turn and run to safety...
But our villager might have to be mauled to death 2000 times first.
That's a lot of dead villagers
Algorithms and models are only as good as the data that you feed them. Too little
data or poor-quality data will produce a suboptimal or even incorrect prediction.
An intrinsic dearth of data, especially data on rare events (e.g. jungle cat attacks, Mergers
& Acquisitions activity), can mask the potential of data science techniques, and even make them
seem inferior to intuition alone.
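To see why rare events starve an algorithm of data, here is a rough sketch of how many encounters it takes to collect a given number of positive examples. The 2000 comes from the mauling count above; the one-in-100 attack rate and the function name are assumptions made up for illustration.

```python
def encounters_needed(positive_examples, one_in):
    """Expected number of encounters needed to observe `positive_examples`
    rare events, if roughly one encounter in `one_in` is such an event."""
    if positive_examples < 0 or one_in < 1:
        raise ValueError("need positive_examples >= 0 and one_in >= 1")
    return positive_examples * one_in


# Hypothetical rate: 1 in 100 walks out of the hut ends in a jungle cat attack.
print(encounters_needed(positive_examples=2000, one_in=100))
```

With few investments and even fewer disasters, a fundamental investor may never accumulate anything close to that many observations.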
The problem: An algorithm sees and weights features in ways we don't
The advantage: An algorithm sees and weights features in ways we don't
For instance:
What if the jungle cat came in different colors? Our algorithm might
need to see an instance of each color, in addition to a similar size and
build, before it identifies the animal as a vicious jungle cat.
What if the jungle cats could have stripes??
What if environmental settings can play a role in triggering an attack?
What if all these things matter in combination??
"Al" the Algorithm
Turn and RUN, you fool!!!
Our algorithm might need to be fed multiple instances of every jungle cat
type and environment combo to correctly call, with great accuracy, when it's
time to GTFO
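A small sketch of how quickly "every jungle cat type and environment combo" adds up. The feature values and the number of examples per combo are invented for illustration; only the idea of combining color, stripes and environment comes from the slides.

```python
from itertools import product

# Hypothetical feature values; the deck only says color, stripes and environment may matter.
colors = ["yellow", "black", "white"]
patterns = ["plain", "striped", "spotted"]
environments = ["day", "night", "rain", "drought"]
examples_per_combo = 30  # assumed number of sightings needed per combination

# Every (color, pattern, environment) combination the algorithm might need to see.
combos = list(product(colors, patterns, environments))
print(f"{len(combos)} combos -> {len(combos) * examples_per_combo} training examples")
```

Each additional feature multiplies the count, which is why a model facing a dearth of data leans so heavily on someone who already knows which combinations matter.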
So then why try to use data science at all?
OMG, another jungle cat - RUN FOR YOUR LIVES!!!
WTF?
Hmm...wait a second....
Please grant me a quick death...
I just wanted a belly rub...
Yo! Calm down. I don't think this is a jungle cat...
What do you mean?!? It's large, yellow and has four legs & a tail. My experience & instincts are telling me a violent death is nigh if I don't high-tail it outta here. Stat!...
Who? Me?
Yeah, but it also has a waggly tail, boopable snoot, floppy ears and an adorably dumb look on its face...
See what I mean...
Based on all the jungle cat data points I've seen, it's highly improbable that it has this combination of differentiating features & is still a jungle cat. I could be wrong, but I'm here to bring these subtle quantitative differences to your attention
You're right. At first glance I thought it was a jungle cat based on my instinct and life experience, but on closer inspection there are quantitative differences between this beast and any typical jungle cat I've seen or heard of...
You were just weighting the size and color more than other features like ear shape, tail and stupidity of expression, due to experience or rumor-based bias
Algorithms like me should be used to augment decision making by raising flags when intuition-based decisions don't align with all the data available
Happily Ever After??
OH, yasss...
OMG, another jungle cat - RUN FOR YOUR LIVES!!!
Hmm...wait a second....
It's big, it has four legs and it's a color jungle cats come in
Yo! Calm down. I don't think this is a jungle cat...
Looks ...Tasty....
Based on the jungle-cat data points I've ingested, it's highly improbable that it has this combination of differentiating features and is still a jungle cat. I could be wrong, but I'm here to bring these subtle quantitative differences to your attention
Hmm...Come to think of it, that doesn't look like a jungle cat after all.
Yeah. Why don't you take a closer look?
?
??
???
????
?????
??????
???????
!
What happens After a [Metaphorical] Bear Attack??
Algorithms are only as good as the data they're trained on & they are
scoped to answer a specific question
"Not a jungle cat" is not the same as "Won't rip your arm off and eat it"
1) An invaluable "training" data point is gained & used to inform future predictions
2) The limitations or "scope" of the algorithm are revealed, emphasized, or re-considered
1) An invaluable "training" data point is gained & used to inform future predictions
Turn and RUN you fool !!!
The algorithm is now trained to avoid
bears, or animals with the
characteristics of bears, as well.
Or we can even "boot-strap" our bear
data point to avoid bears under all
environmental scenarios
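A minimal sketch of what "boot-strapping" the bear data point could look like in practice. Copying one observation into other environments is closer to data augmentation than to the statistical bootstrap (which resamples existing observations with replacement), so the copies are marked as synthetic. The field names and environments are hypothetical.

```python
def augment_across_environments(observation, environments):
    """Copy one labeled observation into every environment, flagging the
    copies that were not actually observed as synthetic."""
    return [
        {
            **observation,
            "environment": env,
            "synthetic": env != observation["environment"],
        }
        for env in environments
    ]


# The single real bear encounter (hypothetical fields).
bear = {"animal": "bear", "size": "large", "environment": "forest", "label": "run"}
rows = augment_across_environments(bear, ["forest", "river", "night", "snow"])
for row in rows:
    print(row)
```

Keeping the `synthetic` flag lets the villager, or anyone auditing Al, tell real experience from extrapolation.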
Over time, the result is an algorithm that is more accurate, more comprehensive and attuned to the
investor's personal experience and expertise, and whose utility can be inherited by new
investors, whose lack of experience makes them especially prone to naivety and
chronological bias
This is akin to how knowledge and experience might be passed
down from a villager to his child, but without any bias, loss of
memory, or reliance on untested and mutable heuristics
2) The limitations or "scope" of the algorithm are revealed, emphasized or re-considered
Al was right: the bear was not a jungle cat. But it was a fucking bear, so our villager still should
have run. Al was not intentionally trying to be a smartass; he was just doing the only classification
task he was trained to do
Oh, Shit.
A solution to this dilemma might be to train Al as a multi-class classifier, or to create and train new
algorithms that specialize in making different predictions
Run / Don't Run
Pet / Don't Pet
Jungle Cat / Not a Jungle Cat
Bear / Not a Bear
Dog / Not a Dog
These different algorithms can even be used to "sanity-check" each other's output and find
inconsistencies in the data, or algorithmic failures, when their predictions are incongruent
Collectively Reads As: "Don't Run, Pet, Jungle Cat"
Petting a jungle cat? Even our villager knows that's crazy-talk. This discrepancy is less than ideal, but it
allows our villager to weight his confidence in the pooled algorithmic suggestion vs. his own instincts,
and based on the actual outcome, decide which algorithms to trust more than others in the future
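The sanity check can be sketched as a few consistency rules applied to the pooled outputs. This is only an illustration: the dictionary keys mirror the five classifiers named above, and the rules themselves are assumptions about what a villager would consider crazy-talk.

```python
def find_conflicts(predictions):
    """Return human-readable conflicts among independent classifier outputs.

    `predictions` maps a classifier name to its True/False call,
    e.g. {"run": False, "pet": True, "jungle_cat": True}.
    """
    conflicts = []
    dangerous = predictions.get("jungle_cat") or predictions.get("bear")
    if dangerous and not predictions.get("run"):
        conflicts.append("dangerous animal detected but advice is Don't Run")
    if dangerous and predictions.get("pet"):
        conflicts.append("dangerous animal detected but advice is Pet")
    species = [s for s in ("jungle_cat", "bear", "dog") if predictions.get(s)]
    if len(species) > 1:
        conflicts.append("more than one species predicted: " + ", ".join(species))
    return conflicts


# The pooled output from the slide: "Don't Run, Pet, Jungle Cat".
slide_example = {"run": False, "pet": True, "jungle_cat": True, "bear": False, "dog": False}
for conflict in find_conflicts(slide_example):
    print("FLAG:", conflict)
```

A flag does not say which algorithm is wrong, only that they cannot all be right; deciding whom to trust is still the villager's call.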
WTF?
So what's the moral of the story??
● Algorithms are only as good as the data they're trained on, and at addressing questions within the scope for which they were designed
● While it can be a powerful tool to guide decisions, in fundamental & PE investing, data science should never be completely divorced from fundamental domain expertise, especially when there is a dearth of relevant data points for the algorithm to train on
● Also, don't pet bears.
Cast of Characters (in case you didn't get the metaphor)
Villager: A fundamental long/short or PE investor
Jungle Cat: A detrimental equity or PE investment opportunity to be avoided
"Al" the Algorithm: Your theoretical and abstract, yet friendly, data science helpmeet
Affable Canine: A promising, yet non-obvious, equity or PE investment opportunity whose value is realized after an algorithmic, unbiased assessment of its similarity to historical wins is brought to the investor's attention
Asshole Bear: A potentially promising, yet non-obvious, equity or PE investment opportunity whose undesirability is realized upon further investigation, & whose encounter should be used as an additional "training" data point, or as a prompt to re-think the scope of the algorithm
The End
Data Science versus Jungle Cats

  • 1. Data Science vs. Jungle Cats A Paradigm For Data Science in Fundamental Investing - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - vs.y = m1x + ...mnx + b By Ashlee Bennett
  • 2. Data Science is a combination of: - Computer Science/programming - Math & Statistics - Domain expertise Data Science requires domain expertise (or so a google image search suggests) This is the hardest to find and often the most important ! 2
  • 3. Domain expertise is crucial in nearly every step of the data science process What data would answer what question? What transformations or interpolations are contextually appropriate? What performance metric is aligned with business objectives? What model is optimal or practical for the business framework? What assumptions can be made given fundamental knowledge ? Qs: Ex: Sales v. Units B&M v. Online Doors v. users Quarter vs. Monthly Outliers: drop or keep Nulls: drop or fill with # Precision v. Recall Correlation v. Contrast Ranking v. Grouping Black v. Clear Box Speed v. Accuracy Descriptive v. Predictive Customer base? Management claims? Business initiatives? 3
  • 4. Data Collection Cleaning & Transformation Performance Metric Selection Model Evaluation Analytic Interpretation Data Scientists can answer some of the questions that arise during the pipeline with common sense or research, but often the process and ultimate outcome is more timely and better served when the expertise of the business end user is incorporated from the getgo and/or directly used to refine the process. Business end users can be a key source of domain expertise 4
  • 5. Without domain expertise, irrelevant data could be misleadingly transformed, deceptively interpolated, evaluated via an irrelevant performance metric with inappropriate models, only to reach an meaningless conclusion 5 . . - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Based upon the log-transformed, tree-rat interpolated mean-square error rate of sea slug population decline predicted in this naive random support vector forest gradient boost ensemble regressor, the price of tea in China should rise two cents over the next decade... . . . . -- z z z ? wtf?
  • 6. So what does this have to do with jungle cats? 6 A jungle cat, or a jungle cat attack, is a metonymy for a rare, yet critical event. We want to quickly identify and avoid jungle cats, just like we want to identify and avoid disastrous investment choices, especially when these choices are few and carry a big impact. While fundamental & PE investors have to worry about "jungle cats", quants just worry about mosquitos. Mosquito bites suck & happen frequently, but they don't kill you because they're small compared to your total surface area of skin...Unless you experience many over a short time period, or if they act as vectors spreading a disease
  • 7. 7 vs. You can handle a few mosquito bites, but probably not a few jungle cat attacks. Quants placing many [smaller] bets can afford to gloss-over, or not incorporate domain expertise in lieu of speed and diversity because a slightly less accurate or "underfit" model with a few hiccups won't tank their portfolio; they're just metaphorical mosquito bites Fundamental and Private Equity investors place fewer [larger] bets, so they have more to gain, but more to lose. Incorporation of domain expertise can provide the mission-critical edge that both identifies a good investment and avoids a disastrous one; a metaphorical jungle cat
  • 8. . . So where am I going with this? Imagine a primitive villager who walks out of their hut one day only to see a fierce jungle-cat.... ?
  • 9. . . Even if they've only been attacked by a jungle cat once, or maybe only ever heard about one, they'll probably instantly know to turn around and GTFO
  • 10. . . But would an algorithm know to turn and run to safety?? ?
  • 11. . . ` ` vs. This is no mosquito bite, the stakes are high...
  • 12. So would an algorithm know to turn and run to safety?? Eventually...
  • 13. An algorithm would eventually know to turn and run to safety... But our villager might have to be mauled to death 2000 times first. That's a lot of dead villagers
  • 14. vs. Algorithms and models are only as good as the data that you feed them. Too little data or poor quality data will produce a suboptimal or even incorrect prediction An intrinsic dearth of data, especially that pertaining to rare events (ie. jungle cat attacks, Mergers & Acquisitions activity) can disguise the potential of data science techniques, and even make them seem inferior to intuition alone
  • 15. The problem : An algorithm sees and weights features in ways we don't The advantage: An algorithm sees and weights features in ways we don't For Instance: What if the jungle cat came in different colors? Our algorithm might need to see an instance of each to identify it as a vicious jungle cat, in addition to being a similar size and build.
  • 16. What if the jungle cats could have stripes?? The problem : An algorithm sees and weights features in ways we don't The advantage: An algorithm sees and weights features in ways we don't
  • 17. What if environmental settings can play a role in triggering an attack? The problem : An algorithm sees and weights features in ways we don't The advantage: An algorithm sees and weights features in ways we don't
  • 18. What if all these things matter in combination?? The problem : An algorithm sees and weights features in ways we don't The advantage: An algorithm sees and weights features in ways we don't
  • 19. "Al" the Algorithm Turn and RUN you fool !!! Our algorithm might need to be fed multiple instances of every jungle cat type and environment combo to correctly call when it's time to GTFO with great accuracy The problem : An algorithm sees and weights features in ways we don't The advantage: An algorithm sees and weights features in ways we don't
  • 20. So then why try to use data science at all? OMG, another jungle cat - RUN FOR YOUR LIVES!!! WTF? Hmm...wait a second....
  • 21. Please grant me a quick death... I just wanted a belly rub... Yo! Calm down. I don't think this is a jungle cat...
  • 22. What do you mean?!? It's large, yellow and has four legs & a tail. My experience & instincts are telling me a violent death is nigh if I don't high-tail it outta here. Stat!... Who? Me? Yeah, but it also has a waggly tail, boopable snoot, floppy ears and an adorably dumb look on its face...
  • 24. Based on all the jungle cat data points I've seen, it's highly improbable that it has this combination of differentiating features & is still a jungle cat. I could be wrong, but I'm here to bring these subtle quantitative differences to your attention.
  • 25. You're right. At first glance I thought it was a jungle cat based on my instinct and life experience, but on closer inspection there are quantitative differences between this beast and any typical jungle cat I've seen or heard of... You were just weighting the size and color more than other features like ear shape, tail and stupidity of expression, due to experience or rumor-based bias.
  • 26. Algorithms like me should be used to augment decision making by raising flags when intuition-based decisions don't align with all the data available. Happily Ever After?? OH, yasss...
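One way to picture Al "raising a flag" is the minimal Python sketch below. It assumes a hypothetical jungle cat profile, an equal weight on every feature, and an arbitrary 0.8 threshold; all three are illustrative assumptions rather than anything specified in the slides.

```python
# Hypothetical "typical jungle cat" profile built from past sightings.
JUNGLE_CAT_PROFILE = {
    "size": "large",
    "color": "yellow",
    "legs": 4,
    "ears": "pointed",
    "tail": "still",
}

def similarity(animal, profile):
    """Fraction of profile features that the observed animal matches."""
    matches = sum(1 for key, value in profile.items() if animal.get(key) == value)
    return matches / len(profile)

def raise_flag(intuition_says_cat, animal, threshold=0.8):
    """Return True when intuition and the data-driven call disagree."""
    data_says_cat = similarity(animal, JUNGLE_CAT_PROFILE) >= threshold
    return intuition_says_cat != data_says_cat

# The villager's instinct says "jungle cat"; the features say otherwise.
dog = {"size": "large", "color": "yellow", "legs": 4, "ears": "floppy", "tail": "waggly"}
print(similarity(dog, JUNGLE_CAT_PROFILE))  # 3 of the 5 features match
print(raise_flag(True, dog))
```

Because every feature counts equally here, size and color cannot drown out ear shape and tail the way they did in the villager's instinctive call.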
  • 27. OMG, another jungle cat - RUN FOR YOUR LIVES!!! Hmm...wait a second....
  • 28. It's big, it has four legs and it's a color jungle cats come in. Yo! Calm down. I don't think this is a jungle cat... Looks ...Tasty....
  • 29. Based on the jungle cat data points I've ingested, it's highly improbable that it has this combination of differentiating features and is still a jungle cat. I could be wrong, but I'm here to bring these subtle quantitative differences to your attention.
  • 30. Hmm...Come to think of it, that doesn't look like a jungle cat after all. Yeah. Why don't you take a closer look?
  • 31. ?
  • 38. !
  • 40. What happens After a [Metaphorical] Bear Attack?? Algorithms are only as good as the data they're trained on & they are scoped to answer a specific question. "Not a Jungle cat" "Won't rip your arm off and eat it" 1) An invaluable "training" data point is gained & used to inform future predictions. 2) The limitations or "scope" of the algorithm are revealed, emphasized, or re-considered.
  • 41. What happens After a [Metaphorical] Bear Attack?? 1) An invaluable "training" data point is gained & used to inform future predictions. Turn and RUN, you fool!!! The algorithm is now trained to avoid bears, or animals with the characteristics of bears, as well.
  • 42. What happens After a [Metaphorical] Bear Attack?? 1) An invaluable "training" data point is gained & used to inform future predictions. Turn and RUN, you fool!!! Or we can even "bootstrap" our bear data point to avoid bears under all environmental scenarios.
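A loose reading of "bootstrapping" the bear data point is sketched below in Python: the single sighting is copied once per environment so the "run" label carries over to every setting. Strictly speaking this is synthetic data augmentation rather than the statistical bootstrap (resampling with replacement), and the feature names and environment list are hypothetical.

```python
# The single observed bear encounter (feature values are hypothetical).
bear_sighting = {"size": "large", "color": "brown", "environment": "forest", "label": "run"}

# Hypothetical list of environments the villager might find himself in.
environments = ["forest", "riverbank", "village", "night"]

# Copy the sighting once per environment, overriding only that one field.
augmented = [{**bear_sighting, "environment": env} for env in environments]

for row in augmented:
    print(row)
```

The copies add no new information about bears; they only stop a model from treating "forest" as a precondition for danger.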
  • 43. What happens After a [Metaphorical] Bear Attack?? 1) An invaluable "training" data point is gained & used to inform future predictions. Over time, the result is an algorithm that is more accurate, comprehensive and attuned to the investor's personal experience and expertise.
  • 44. What happens After a [Metaphorical] Bear Attack?? 1) An invaluable "training" data point is gained & used to inform future predictions. Over time, the result is an algorithm that is more accurate, comprehensive and attuned to the investor's personal experience and expertise, and whose utility is inheritable by new investors whose lack of experience makes them especially prone to naivety and chronological bias. This is akin to how knowledge and experience might be passed down from a villager to his child, but without any bias, loss of memory, or reliance on untested and mutable heuristics.
  • 45. What happens After a [Metaphorical] Bear Attack?? 2) The limitations or "scope" of the algorithm are revealed, emphasized or re-considered. "Not a Jungle cat" "Won't rip your arm off and eat it" Al was right, the bear was not a jungle cat. But it was a fucking bear, so our villager still should have run. Al was not intentionally trying to be a smartass; he was just doing the only classification task he was trained to do. Oh, Shit.
  • 46. What happens After a [Metaphorical] Bear Attack?? 2) The limitations or "scope" of the algorithm are revealed, emphasized or re-considered. "Not a Jungle cat" "Won't rip your arm off and eat it" A solution to this dilemma might be to train Al as a multi-class classifier, or create and train new algorithms that specialize in making different predictions: Run / Don't Run; Pet / Don't Pet; Jungle Cat / Not a Jungle Cat; Bear / Not a Bear; Dog / Not a Dog.
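The difference between a single-question Al and a multi-class Al can be sketched in a few lines of Python. The species profiles, feature names and the simple "most matching features wins" rule are all hypothetical simplifications, not a method described in the slides.

```python
# Hypothetical species profiles; features and values are illustrative only.
PROFILES = {
    "jungle cat": {"ears": "pointed", "tail": "still", "color": "yellow"},
    "bear": {"ears": "round", "tail": "stubby", "color": "brown"},
    "dog": {"ears": "floppy", "tail": "waggly", "color": "yellow"},
}
# Assumed mapping from predicted species to recommended action.
ACTIONS = {"jungle cat": "run", "bear": "run", "dog": "pet"}

def score(animal, profile):
    """Count how many profile features the observed animal matches."""
    return sum(animal.get(key) == value for key, value in profile.items())

def is_jungle_cat(animal):
    """Single-question specialist: answers only 'jungle cat or not?'."""
    profile = PROFILES["jungle cat"]
    return score(animal, profile) == len(profile)

def classify(animal):
    """Multi-class version: pick the species whose profile matches best."""
    return max(PROFILES, key=lambda species: score(animal, PROFILES[species]))

bear = {"ears": "round", "tail": "stubby", "color": "brown"}
print(is_jungle_cat(bear))               # correct, but useless on its own
print(classify(bear), "->", ACTIONS[classify(bear)])
```

The binary specialist is right that the bear is not a jungle cat; only the multi-class version turns that observation into the advice the villager actually needs.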
  • 47. What happens After a [Metaphorical] Bear Attack?? 2) The limitations or "scope" of the algorithm are revealed, emphasized or re-considered. These different algorithms can even be used to "sanity-check" each other's output and find inconsistencies in the data, or algorithmic failures, when their predictions are incongruent. Run / Don't Run; Pet / Don't Pet; Jungle Cat / Not a Jungle Cat; Bear / Not a Bear; Dog / Not a Dog. Collectively reads as: "Don't Run, Pet, Jungle Cat".
  • 48. What happens After a [Metaphorical] Bear Attack?? These different algorithms can even be used to "sanity-check" each other's output and find inconsistencies in the data, or algorithmic failures, when their predictions are incongruent. Petting a jungle cat? Even our villager knows that's crazy-talk. This discrepancy is less than ideal, but it allows our villager to weigh his confidence in the pooled algorithmic suggestion against his own instincts and, based on the actual outcome, decide which algorithms to trust more than others in the future. WTF?
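A minimal Python sketch of such a sanity check is shown below. It assumes three separate predictions (run, pet, species) and a hand-written set of consistency rules; the rules and the list of dangerous species are illustrative assumptions.

```python
def find_conflicts(run, pet, species):
    """Return a list of human-readable inconsistencies among the three predictions."""
    conflicts = []
    dangerous = {"jungle cat", "bear"}  # assumed set of dangerous species
    if run and pet:
        conflicts.append("told to both run and pet")
    if pet and species in dangerous:
        conflicts.append(f"told to pet a {species}")
    if not run and species in dangerous:
        conflicts.append(f"told not to run from a {species}")
    return conflicts

# The pooled output from the slide: "Don't Run, Pet, Jungle Cat".
print(find_conflicts(run=False, pet=True, species="jungle cat"))
```

An empty list means the algorithms agree with each other; a non-empty one is the villager's cue to fall back on his own judgment and to note which algorithm turned out to be wrong.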
  • 49. So what's the moral of the story?? ● Algorithms are only as good as the data they're trained on, and only good at addressing questions within the scope for which they were designed. ● While it can be a powerful tool to guide decisions, in fundamental & PE investing data science should never be completely divorced from fundamental domain expertise, especially when there is a dearth of relevant data points for the algorithm to train on. ● Also, don't pet bears.
  • 50. Data Science vs. Jungle Cats: Cast of Characters (in case you didn't get the metaphor). Villager: A fundamental long/short or PE investor. Jungle Cat: A detrimental equity or PE investment opportunity to be avoided. "Al" the Algorithm: Your theoretical and abstract, yet friendly, data science helpmeet. Affable Canine: A promising, yet non-obvious, equity or PE investment opportunity whose value is realized after an algorithmic, unbiased assessment of its similarity to historical wins is brought to the investor's attention. Asshole Bear: A potentially promising, yet non-obvious, equity or PE investment opportunity whose undesirability is realized upon further investigation, & whose encounter should be used as an additional future "training" data point, or used to recall or re-think the scope of the algorithm.