The Lottery Paradox
A New Use
MIT Computer Science & Artificial Intelligence Lab
Dec 1, 2020
Peter Cotton
Chief Data Scientist
Intech Investments
Hello. I work for Intech — a leading equity quant manager
I am asymptotically the world’s most productive data scientist … or so I tell my boss
(Returns measured water height … somewhere … from NOAA)
Creates data stream
At the conclusion of a “ten minute data science project”, a data stream is predicted
by dozens of competing time series algorithms, written by different authors using
different tools, in different languages, with access to different exogenous data.
Outline
1. On the lottery paradox:
a. Positive returns
b. Continuous lotteries
c. Indifference to the market distribution
d. Relationship between returns and distance
2. Putting it to work:
a. Real-time distributional prediction
b. Stacking lottery games
c. Implied quantiles and copulas
d. Categories of business applications
3. An existence “proof” for a prediction network (that doesn’t exist)
a. The demise of artisan “data science”
b. Why algorithms will manage the production of prediction
1. Lottery Paradoxes
Lottery paradox #1
Assume 10% rake. Buyer chooses 1 … 10,000. Most enter randomly.
Mary buys every possible ticket once.
16% return !
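A minimal sketch of where a positive return can come from, under assumptions the slide does not state: unit ticket price, every other buyer holds one uniformly random ticket, Mary's own purchases go into the pool, and the pool (net of rake) is split equally among winning tickets. The function name and the crowd sizes tried are hypothetical, and the slide's 16% figure depends on a crowd size it does not give.

```python
def mary_return(n_random, n_tickets=10_000, rake=0.10):
    """Expected return for a buyer holding exactly one of every ticket.

    Assumes unit ticket price, one uniformly random ticket per other buyer,
    and a pool (net of rake) split equally among winning tickets.
    """
    p = 1.0 / n_tickets
    pool = (1.0 - rake) * (n_random + n_tickets)
    # K = number of random buyers on the winning ticket ~ Binomial(n_random, p).
    # Closed form: E[1 / (1 + K)] = (1 - (1 - p)**(n + 1)) / ((n + 1) * p).
    mary_share = (1.0 - (1.0 - p) ** (n_random + 1)) / ((n_random + 1) * p)
    return pool * mary_share / n_tickets - 1.0


if __name__ == "__main__":
    for crowd in (0, 5_000, 10_000, 20_000, 40_000):
        print(f"{crowd:>6} random buyers: return {mary_return(crowd):+.3f}")
```

With no other buyers Mary simply loses the rake; as random buyers pile onto the same tickets by chance, the money they leave on losing tickets more than pays for it.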
Lottery paradox resolution - simpler example
Mary benefits from Alice and Bob tripping on each other’s toes
Only two outcomes
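The two-outcome case is small enough to enumerate exactly. The sketch below assumes a coin-flip lottery with no rake in which Alice and Bob each buy one ticket on a random side and Mary buys one ticket on each side; these stakes and the function name are illustrative assumptions, not details from the slide.

```python
from fractions import Fraction
from itertools import product

SIDES = ("H", "T")


def expected_payouts():
    """Enumerate Alice's and Bob's random picks and the coin flip exactly."""
    pool = Fraction(4)  # Alice 1 + Bob 1 + Mary 2 (one ticket on each side)
    totals = {"Alice": Fraction(0), "Bob": Fraction(0), "Mary": Fraction(0)}
    cases = list(product(SIDES, SIDES, SIDES))  # (alice_pick, bob_pick, coin)
    for alice, bob, coin in cases:
        winners = ["Mary"]  # Mary always holds the winning side
        if alice == coin:
            winners.append("Alice")
        if bob == coin:
            winners.append("Bob")
        for name in winners:
            totals[name] += pool / len(winners)
    return {name: total / len(cases) for name, total in totals.items()}


payouts = expected_payouts()
mary_return = payouts["Mary"] / 2 - 1  # Mary invested 2
```

Mary collects the whole pool whenever Alice and Bob both land on the losing side, and that case is what lifts her expected payout above her stake.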
Lottery paradox #2
Let W denote the average number of people sharing the prize.
Alice is a random ticket buyer.
Alice shares with approximately W other people
Lottery paradox #2 - resolution
In the case of two tickets (heads and tails), Alice shares with W-½ others
(We can count)
Lottery paradox #2 - resolution
Alice’s average is a population average, not outcome average.
When Alice, Bob and Joe share the prize, it counts three times.
This allows the population average to exceed the average over tickets by almost 1
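The counting argument can be checked by brute force. This sketch assumes a heads/tails lottery with four players who each pick a side at random, with player 0 standing in for Alice; the player count and function name are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def sharing_counts(n_players=4):
    """Exact averages for a two-ticket (heads/tails) lottery with random buyers."""
    outcome_total = Fraction(0)  # sum of winner counts over (picks, coin) cases
    alice_total = Fraction(0)    # sum of co-winner counts over cases Alice wins
    alice_wins = 0
    cases = 0
    for picks in product("HT", repeat=n_players):
        for coin in "HT":
            winners = sum(1 for pick in picks if pick == coin)
            outcome_total += winners
            cases += 1
            if picks[0] == coin:  # player 0 is Alice
                alice_total += winners - 1
                alice_wins += 1
    w = outcome_total / cases          # average number sharing the prize
    others = alice_total / alice_wins  # Alice's co-winners, given she wins
    return w, others


w, others = sharing_counts(4)
```

Averaging over outcomes gives W, while conditioning on Alice winning weights crowded outcomes more heavily, so she sees W-½ others rather than W-1.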
Lottery paradox #2 - better resolution
Alice wins with ticket 137 → Mean # of people choosing 137 goes up by almost +1.
(“Approximate Bayes”)
c.f. Mary winning conveys no information at all.
Lottery paradox #2 - even better resolution?
Consider Mary’s last ticket … lucky by +1
But all her tickets are the same
Lottery paradox #3: Indifference
Suppose:
● No rake
● Mary’s investment is small
● Mary optimizes long run wealth
● Mary can see everyone else’s ticket choices
⇒ Mary still buys one of each ticket
⇒ Mary doesn’t care what anyone else does !
Racetrack Paradox
Mary doesn’t look at the odds !
Racetrack paradox - resolution #1
Maximize
Constraint
First order Lagrange condition:
Thus must not depend on the horse index i
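One standard way to write out this optimization, as a sketch under assumed notation rather than the slide's own symbols: let p_i be the true win probability of horse i, q_i the fraction of the pool the market has placed on horse i (so a small bet pays 1/q_i per unit when there is no rake), and w_i the fraction of her stake Mary places on horse i. Maximizing long run wealth is taken to mean maximizing expected log wealth.

```latex
\begin{aligned}
\text{Maximize}\quad & \sum_i p_i \log\frac{w_i}{q_i} \\
\text{Constraint}\quad & \sum_i w_i = 1 \\
\text{First order Lagrange condition}\quad &
  \frac{\partial}{\partial w_i}\Big[\sum_j p_j \log\frac{w_j}{q_j}
  - \lambda\Big(\sum_j w_j - 1\Big)\Big]
  = \frac{p_i}{w_i} - \lambda = 0
\end{aligned}
```

Under these assumptions the ratio p_i / w_i equals λ for every horse, so w_i = p_i, and the market proportions q_i never enter the condition.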
Racetrack paradox - elementary resolution
Transfer a tiny investment from first horse to the second
Follows that so they must be equal
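The transfer argument can be checked numerically. In this sketch the notation is assumed: p holds true win probabilities, q holds market proportions (payout 1/q[i] per unit on horse i, no rake), and w holds Mary's bet fractions. To first order, moving eps from the first horse to the second changes expected log growth by eps * (p[1]/w[1] - p[0]/w[0]), so at an optimum the two ratios must be equal. The probabilities below are hypothetical.

```python
import math


def growth(w, p, q):
    """Expected log growth when fraction w[i] is bet on horse i at payout 1/q[i]."""
    return sum(pi * math.log(wi / qi) for wi, pi, qi in zip(w, p, q))


def transfer(w, eps):
    """Move a small amount eps from the first horse to the second."""
    moved = list(w)
    moved[0] -= eps
    moved[1] += eps
    return moved


p = [0.5, 0.3, 0.2]  # hypothetical true win probabilities
q = [0.6, 0.1, 0.3]  # hypothetical market proportions
eps = 1e-3

# Starting from w = p, a transfer in either direction lowers expected growth.
at_p = growth(p, p, q)
loss_forward = at_p - growth(transfer(p, eps), p, q)
loss_backward = at_p - growth(transfer(p, -eps), p, q)

# Starting from w = q (copying the market), the same transfer helps,
# because p[1] / q[1] > p[0] / q[0].
gain_from_market = growth(transfer(q, eps), p, q) - growth(q, p, q)
```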
Remark on Entropy and KL-Divergence
Entropy … also the term not involving q in Mary’s return
Kullback and Leibler cross-entropy
We can interpret distance of Q from truth P in terms of Mary’s return exploiting it
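A worked version of this remark, assuming Mary bets in proportion to the true probabilities p_i against market proportions q_i with no rake (the symbols are assumed here, not taken from the slide). Her expected log return then splits into a term without q and a term with q:

```latex
\sum_i p_i \log\frac{p_i}{q_i}
  = \underbrace{\sum_i p_i \log p_i}_{-H(P)}
  \;-\; \sum_i p_i \log q_i
  = H(P,Q) - H(P)
  = D_{\mathrm{KL}}(P \,\|\, Q) \;\ge\; 0
```

Here H(P) is the entropy of P and H(P,Q) = -\sum_i p_i \log q_i is the cross-entropy. The return is zero only when the market already matches the truth.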
Now for something completely different?
Mayor draws from “Normalish” distribution
Participants write a real number down. All those close share the prize.
Same Game?
{1,2,...,10000}
Mary’s Reward for Accuracy - Exponential
[Figure labels: Market, Mary]
Mary’s Reward for Accuracy - Normalish
Market error    Mary’s return
10%             20 pts (0.2)
1%              20 bps (0.002)
● Use the fourth root transform to relate exponential
returns to market error … measured as a percentage of
standard deviation
2. Real-time distributional prediction
Algorithms Play Continuous Lottery Games
All day, every day
Algorithms authored by anyone
Live data published by anyone
Algorithms submit 225 scenarios
Why not point estimates? Ask Roger Federer
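One simple way to turn a distributional belief into 225 scenarios is to take evenly spaced quantiles. This is only a sketch: the normal belief, the mid-point quantile spacing, the function name, and the example mean and standard deviation are all assumptions, and participating algorithms are free to generate scenarios however they like.

```python
from statistics import NormalDist

NUM_SCENARIOS = 225  # scenarios per submission, as stated on the slide


def evenly_spaced_scenarios(mean, stdev, n=NUM_SCENARIOS):
    """Return n values at the mid-point quantiles (k - 0.5) / n of a normal."""
    dist = NormalDist(mu=mean, sigma=stdev)
    return [dist.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]


scenarios = evenly_spaced_scenarios(mean=17.56, stdev=0.05)  # hypothetical belief
```

A point estimate would collapse all 225 values onto one number and say nothing about how far the truth might land from it.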
Quarantine
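[Figure: timeline of wind speed submissions (values near 17.55 to 17.57). Data arrives at 12:37:52; with a 70 second quarantine the cutoff is 12:36:42. Submissions at 12:34:27 and 12:35:41 precede the cutoff and qualify for rewards; those at 12:36:57, 12:37:27 and 12:37:46 fall inside the quarantine. The figure also marks a reward window.]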
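The quarantine rule on this slide amounts to a timestamp comparison. A minimal sketch using the slide's times follows; the helper names are hypothetical, and treating a submission made exactly at the cutoff as eligible is an assumption.

```python
from datetime import datetime, timedelta

QUARANTINE = timedelta(seconds=70)


def at(clock):
    """Parse an HH:MM:SS clock time (the date part is irrelevant here)."""
    return datetime.strptime(clock, "%H:%M:%S")


def qualifying(submission_times, arrival_time, quarantine=QUARANTINE):
    """Return the cutoff and the submissions made at or before it."""
    cutoff = arrival_time - quarantine
    return cutoff, [t for t in submission_times if t <= cutoff]


submissions = [
    at(c) for c in ("12:34:27", "12:35:41", "12:36:57", "12:37:27", "12:37:46")
]
cutoff, eligible = qualifying(submissions, arrival_time=at("12:37:52"))
eligible_clocks = [t.strftime("%H:%M:%S") for t in eligible]
```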
Implied Percentiles
Every incoming data point implies a new data point …
z = F(x)
where F is the “community” distribution function
Cumulative distribution for NY Electricity Production (Wind) 1 hr ahead
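A sketch of z = F(x), assuming the community distribution F is approximated by the empirical distribution of pooled, equally weighted scenarios. The pooling and weighting are assumptions made for illustration, and the scenario values are hypothetical.

```python
from bisect import bisect_right


def implied_percentile(scenarios, x):
    """Empirical community CDF: fraction of submitted scenarios at or below x."""
    ordered = sorted(scenarios)
    return bisect_right(ordered, x) / len(ordered)


community = [17.50, 17.53, 17.55, 17.56, 17.56, 17.57, 17.58, 17.60]  # hypothetical
z = implied_percentile(community, 17.56)
```

If the community distribution is well calibrated, the stream of z values produced this way should look roughly uniform on [0, 1].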
Example: Reactions to the presidential debate
See https://www.microprediction.com/blog/tears_of_joy_standardizing_streaming_data
Stacking Lotteries
Market implied percentiles are themselves the subject of lottery games (via
normal quantile function)
Approximately N(0,1)
Algorithms predicting small
deviations from standard normal
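A sketch of the normal quantile step, assuming implied percentiles arrive as numbers in [0, 1]. The clipping constant and the function name are illustrative choices, and the example percentiles are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def z_score(percentile, eps=1e-6):
    """Map an implied percentile in [0, 1] to a standard normal quantile."""
    clipped = min(max(percentile, eps), 1.0 - eps)  # inv_cdf needs 0 < p < 1
    return STANDARD_NORMAL.inv_cdf(clipped)


percentiles = [0.025, 0.5, 0.625, 0.975]  # hypothetical implied percentiles
scores = [z_score(u) for u in percentiles]
```

If the first-level market were perfect these scores would be exactly N(0,1), so a second-level game only has to reward predictions of the departures from it.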
Combine Percentiles
Some seemingly univariate series of games are actually copulas
Pitch and Yaw implied copulas - from MIT SciML helicopula challenge
Optics Analogy
Keep “lensing” until you get N(0,1)
Composition of monotone functions, each contributed by one or more algorithms
Pathways in the Collective Probability Brain
Scenarios “thrown” up to top level lottery
[Figure: diagram of functions U( ), V( ), R( ), W( ), S( ), T( ) and Q( ), annotated with Collaboration and Competition]
Law of Iterated Expectations
Pathways grow and shrink based on the economics
Point estimates are a special case - shift
Exogenous data is a special case - shift arbitrarily (!)
E[Y|X]
E[E[Y|X]|Z]
Y
E[Y|S]
E[E[Y|S]|Z]
E[Y|S,Z,R] = E[E[E[Y|S]|Z]|R]
Scenarios thrown “up” into top level lottery
Management fees charged down from parent to child
Wanna Play?
Wanna Predict Something?
In any language (api.microprediction.org)
Use category #1: Auxiliary market predictions
Markets predict the mean of a stock well
Everything else (pretty much) is poorly predicted, due to lack
of the discipline imposed by competition.
● Volatilities
● Correlations
● Bid-offer spreads
● Liquidity
● Trading costs
● Holding periods
● Client flow
● Response to inquiry
● Cover price
Use category #2: Prioritizing human work
e.g. reference data cleaning
Probability that a record is changed?
Which records will be changed?
Use category #3: Enhancing live data feeds
Tagging.
Converting sporadic live data to continuous.
Discovering existing relationships
Predicting delayed data and partially filled data
Discovering good embeddings
Finding new exogenous data
Discovering good proxies for truth
Use category #4: Live feature discovery
Chumming the water
Predicting quantities correlated with the quantity you truly care about
Determining which feature generation algorithms are suited to the task at hand
Use category #5: Enhancing business intelligence applications
Predicting numbers on dashboards
Highlighting unusual movements
Predicting human reaction to information, or not (false positives)
Enabling humans to track a larger amount of data in real time
Use category #6: Fairness and explanation
Discovering data that reveals hidden bias
Historical example: proxies for race, redlining
Use category #7: Surrogate models
Competing and combining surrogate models for agent based epidemic modeling
https://www.microprediction.org/stream_dashboard.html?stream=pandemic_infected
3. An Existence Proof
(for an automated Machine Learning network replacing artisan data science in large part)
1 - Motherhood statement
Quantitative business optimization will be a survival requirement for companies
(Machine Learning is set to transform all industries)
2 - Slightly more controversial...
Quantitative business optimization using ML/AI = frequently repeated prediction
Control theory ~ RL ~ microprediction of value functions
3 - Obvious to MIT folks
Strangers can do your ML for you
4 - Orthodox economics (local knowledge)
At approximately zero friction, markets >> central planning by humans
5 - The rest is busywork ...
Humans will not play a blocking role in the production of prediction
Machine Learning will be orchestrated by hierarchies of real-time generalized contests
Thanks for listening !
Thanks to Key Contributors. Join us !
• Wrote the front end
• Winning crawlers
• Clients in Java, Julia, Rust
• ZK-MUID proofs
• Monotonic NNs
Interested? Join us Fridays at noon for informal contributor chat
https://www.microprediction.com/contact-us
