The Lottery Paradox
A New Use
MIT Computer Science & Artificial Intelligence Lab
Dec 1, 2020
Peter Cotton
Chief Data Scientist
Intech Investments
Hello. I work for Intech — a leading equity quant manager
I am asymptotically the world’s most productive data scientist
… or so I tell my boss
(Returns measured water height … somewhere … from NOAA)
Creates data stream
At the conclusion of a “ten minute data science project”, a data stream is predicted
by dozens of competing time series algorithms, written by different authors using
different tools, in different languages, with access to different exogenous data.
Outline
1. On the lottery paradox:
a. Positive returns
b. Continuous lotteries
c. Indifference to the market distribution
d. Relationship between returns and distance
2. Putting it to work:
a. Real-time distributional prediction
b. Stacking lottery games
c. Implied quantiles and copulas
d. Categories of business applications
3. An existence “proof” for a prediction network (that doesn’t exist)
a. The demise of artisan “data science”
b. Why algorithms will manage the production of prediction
1. Lottery Paradoxes
Lottery paradox #1
Assume a 10% rake. Each buyer chooses a number from 1 to 10,000. Most enter randomly.
Mary buys every possible ticket once.
16% return!
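A closed-form check of Mary's edge, as a minimal sketch. The slide does not say how many other tickets are sold, so `n_others` is an assumption, as are the unit ticket price and the rule that the whole post-rake pool is split equally among holders of the winning number.

```python
def mary_return(n_numbers=10_000, n_others=20_000, rake=0.10):
    """Expected return to Mary, who buys each of n_numbers tickets once.

    Assumptions (not from the slides): tickets cost 1 unit, n_others tickets
    are bought uniformly at random, and the post-rake pool is split equally
    among all holders of the winning number.
    """
    p = 1.0 / n_numbers
    # K = number of other holders of the winning number ~ Binomial(n_others, p).
    # Closed form: E[1 / (1 + K)] = (1 - (1 - p)**(n_others + 1)) / ((n_others + 1) * p)
    expected_share = (1.0 - (1.0 - p) ** (n_others + 1)) / ((n_others + 1) * p)
    pool = (1.0 - rake) * (n_numbers + n_others)
    return pool * expected_share / n_numbers - 1.0


if __name__ == "__main__":
    for n_others in (0, 10_000, 20_000, 40_000):
        print(n_others, round(mary_return(n_others=n_others), 4))
```

Under these assumptions, 20,000 random tickets give Mary roughly a 17% expected return, in the neighbourhood of the slide's figure. With no other buyers she simply loses the rake.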
Lottery paradox resolution - simpler example
Mary benefits from Alice and Bob tripping on each other’s toes
Only two outcomes
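The two-outcome case is small enough to enumerate exactly. This sketch assumes a coin-flip lottery in which Alice and Bob each pick heads or tails uniformly at random, Mary buys both, tickets cost one unit and there is no rake. The slide does not spell out these details.

```python
from fractions import Fraction
from itertools import product


def mary_expected_payout():
    """Enumerate a two-number lottery: Alice and Bob each pick heads or tails
    uniformly at random, Mary buys both. Assumes unit tickets and no rake."""
    pool = Fraction(4)  # Alice 1 + Bob 1 + Mary 2
    total = Fraction(0)
    cases = list(product("HT", repeat=3))  # (alice, bob, winning side)
    for alice, bob, winner in cases:
        others = (alice == winner) + (bob == winner)
        total += pool / (1 + others)  # Mary always holds the winning side
    return total / len(cases)


payout = mary_expected_payout()
print(payout, payout / 2 - 1)  # Mary staked 2 units
```

Mary expects 7/3 back on a stake of 2. The gain comes from the cases where Alice and Bob pick the same side and lose, leaving Mary the whole pool.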
Lottery paradox #2
Let W denote the average number of people sharing the prize.
Alice is a random ticket buyer.
Alice shares with approximately W other people
Lottery paradox #2 - resolution
In the case of two tickets (heads and tails), Alice shares with W-½ others
(We can count)
Lottery paradox #2 - resolution
Alice’s average is a population average, not an outcome average.
When Alice, Bob and Joe share the prize, it counts three times.
This allows the population average to exceed the average over tickets by almost 1.
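The counting argument can be checked by brute force. This sketch assumes a heads/tails lottery with a handful of buyers who choose independently and uniformly. `n_players` and the labelling of Alice as player 0 are illustrative.

```python
from fractions import Fraction
from itertools import product


def sharing_averages(n_players=5):
    """Two-number lottery (heads/tails) with n_players random buyers,
    Alice being player 0. Returns (W, Alice's mean number of co-winners
    given that she wins). Uniform independent choices are an assumption."""
    winners_total = Fraction(0)
    outcomes = 0
    alice_others_total = Fraction(0)
    alice_wins = 0
    for picks in product("HT", repeat=n_players):
        for winner in "HT":
            k = sum(pick == winner for pick in picks)  # people sharing the prize
            winners_total += k
            outcomes += 1
            if picks[0] == winner:
                alice_others_total += k - 1  # co-winners other than Alice
                alice_wins += 1
    return winners_total / outcomes, alice_others_total / alice_wins


w, alice_others = sharing_averages(5)
print(w, alice_others)
```

With five buyers, W is 5/2 and Alice, when she wins, shares with 2 others on average, which is W-½. Counting Alice herself, her winning group averages W+½ people, above the per-outcome average W.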
Lottery paradox #2 - better resolution
Alice wins with ticket 137 → Mean # of people choosing 137 goes up by almost +1.
(“Approximate Bayes”)
Cf. Mary, whose winning conveys no information at all.
Lottery paradox #2 - even better resolution?
Consider Mary’s last ticket …. lucky by +1
But all her tickets are the same
Lottery paradox #3: Indifference
Suppose:
● No rake
● Mary’s investment is small
● Mary optimizes long run wealth
● Mary can see everyone else’s ticket choices
⇒ Mary still buys one of each ticket
⇒ Mary doesn’t care what anyone else does !
Racetrack Paradox
Mary doesn’t look at the odds !
Racetrack paradox - resolution #1
Maximize
Constraint
First order Lagrange condition:
Thus must not depend on the horse index i
Racetrack paradox - elementary resolution
Transfer a tiny investment from first horse to the second
Follows that so they must be equal
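One standard way to write the argument on the two racetrack slides, as a sketch under assumed notation: p_i is the true win probability of horse i, q_i the fraction of the pool the market has placed on it, and b_i the fraction of her stake that Mary places on it. None of these symbols appear in the source. Mary is taken to be fully invested, which costs no generality when there is no rake because betting in proportion to q_i is equivalent to holding cash. She is also taken to be small enough that her bets do not move the odds 1/q_i.

```latex
\begin{aligned}
&\text{Maximize } G(b) = \sum_i p_i \log\frac{b_i}{q_i}
 \quad\text{subject to}\quad \sum_i b_i = 1 \\
&\frac{\partial}{\partial b_i}\Big[\sum_j p_j \log\frac{b_j}{q_j}
 - \lambda\Big(\sum_j b_j - 1\Big)\Big]
 = \frac{p_i}{b_i} - \lambda = 0 \\
&\Rightarrow\ \frac{p_i}{b_i} = \lambda \text{ for every } i,
 \quad\text{and } \sum_i b_i = \sum_i p_i = 1 \text{ gives } \lambda = 1,\ b_i = p_i \\
&\text{Elementary version: } G(b_1 - \epsilon,\, b_2 + \epsilon,\, \dots) - G(b)
 \approx \epsilon\Big(\frac{p_2}{b_2} - \frac{p_1}{b_1}\Big)
\end{aligned}
```

The market fractions q_i drop out of the first-order condition, which is why Mary need not look at the odds. In the lottery every p_i is equal, so b_i = p_i means one of each ticket. In the elementary version the change must vanish for small ε of either sign at an optimum, so p_1/b_1 and p_2/b_2 must be equal.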
Remark on Entropy and KL-Divergence
Entropy … also the term not involving q in Mary’s return
Kullback and Leibler cross-entropy
We can interpret the distance of Q from the truth P in terms of the return Mary earns by exploiting it
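A small numerical illustration of that reading, as a sketch. It assumes Mary bets fractions equal to the true probabilities `p` against no-rake parimutuel odds `1/q`, so that her expected log return per race is the KL divergence of Q from P. The function names and example distributions are made up for illustration.

```python
import math


def expected_log_return(p, q):
    """Mary's expected log return per race when she bets fractions p against
    parimutuel odds 1/q (no rake). Equals the KL divergence D(P || Q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Shannon entropy of p in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy of q relative to p in nats: the only term involving q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.3, 0.2]  # hypothetical true probabilities
q = [0.4, 0.4, 0.2]  # hypothetical market fractions
print(expected_log_return(p, q), cross_entropy(p, q) - entropy(p))
```

The two printed numbers agree: the return splits into a cross-entropy term that depends on the market and an entropy term that does not. When the market is exactly right (q equal to p) the return is zero.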
Now for something completely different?
The mayor draws a number from a “Normalish” distribution.
Participants each write down a real number. All those close to the draw share the prize.
Same Game?
{1,2,...,10000}
Mary’s Reward for Accuracy - Exponential
(Figure legend: Market, Mary)
Mary’s Reward for Accuracy - Normalish
Market error    Mary’s return
10%             20 pts (0.2)
1%              20 bps (0.002)
● Use the fourth root transform to relate exponential returns to market error … measured as a percentage of standard deviation
2. Real-time distributional prediction
Algorithms Play Continuous Lottery Games
All day, every day
Algorithms authored by anyone
Live data published by anyone
Algorithms submit 225 scenarios
Why not point estimates? Ask Roger Federer
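To make "225 scenarios" concrete, here is a minimal sketch of one way an algorithm might produce them: evenly spaced percentiles of a fitted normal distribution. The normal shape, the midpoint percentiles and the example parameters are illustrative assumptions, not the method any particular algorithm uses.

```python
from statistics import NormalDist


def scenarios(mu, sigma, n=225):
    """Return n sorted scenarios at evenly spaced percentiles of N(mu, sigma^2).
    The normal shape and the midpoint percentiles are illustrative choices."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


values = scenarios(mu=17.56, sigma=0.05)  # hypothetical wind speed forecast
print(len(values), values[0], values[112], values[-1])
```

A point estimate would collapse all of this to the single middle value. The scenarios also say how wide and how skewed the forecaster believes the outcome to be.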
Quarantine
(Timeline figure)
Data arrives: 12:37:52
Quarantine: 70 seconds
Cutoff: 12:36:42
Times on the axis: 12:34:27, 12:35:41, 12:36:57, 12:37:27, 12:37:46
Labels: Qualify for rewards; Reward window
Wind speed values shown: 17.56, 17.57, 17.55
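The quarantine rule can be sketched as a timestamp check. This assumes the times marked on the axis are submission times and that the cutoff is inclusive. Neither is stated on the slide, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def qualifies(submitted, data_arrives, quarantine_seconds=70):
    """True if a submission was made at or before the cutoff.
    Treating the cutoff as inclusive is an assumption."""
    cutoff = data_arrives - timedelta(seconds=quarantine_seconds)
    return submitted <= cutoff


def at(hms):
    """Parse an HH:MM:SS string into a datetime (date part is arbitrary)."""
    return datetime.strptime(hms, "%H:%M:%S")


arrival = at("12:37:52")
for t in ("12:34:27", "12:35:41", "12:36:57", "12:37:27", "12:37:46"):
    print(t, qualifies(at(t), arrival))
```

Subtracting the 70 second quarantine from the 12:37:52 arrival reproduces the 12:36:42 cutoff shown on the slide. Under the assumptions above only the first two times fall inside the reward window.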
Implied Percentiles
Every incoming data point implies a new data point …
z = F(x)
where F is the “community” distribution function
Cumulative distribution for NY Electricity Production (Wind) 1 hr ahead
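A minimal sketch of z = F(x), assuming F is approximated by the empirical distribution of scenarios pooled from all participating algorithms. How the real "community" distribution is weighted is not described here, so the equal weighting below is an assumption.

```python
from bisect import bisect_right


def implied_percentile(community_scenarios, x):
    """Empirical CDF: fraction of pooled scenarios at or below x."""
    ordered = sorted(community_scenarios)
    return bisect_right(ordered, x) / len(ordered)


pooled = [17.51, 17.54, 17.55, 17.56, 17.57, 17.58, 17.60, 17.63]  # made-up values
print(implied_percentile(pooled, 17.56))
```

Each arriving observation x is thereby converted into a number between 0 and 1, which can itself be treated as a new data stream.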
Example: Reactions to the presidential debate
See https://www.microprediction.com/blog/tears_of_joy_standardizing_streaming_data
Stacking Lotteries
Market implied percentiles are themselves the subject of lottery games (via normal quantile function)
Approximately N(0,1)
Algorithms predicting small deviations from standard normal
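The step from an implied percentile to an approximately N(0,1) value can be sketched with the standard library. The clipping constant `eps` is an assumption added to keep the quantile function away from 0 and 1, where it is undefined.

```python
from statistics import NormalDist


def to_normal_score(percentile, eps=1e-6):
    """Map an implied percentile in [0, 1] to a z-score via the standard
    normal quantile function. Clipping by eps is an assumption, added to
    keep inv_cdf away from 0 and 1 where it is undefined."""
    clipped = min(max(percentile, eps), 1.0 - eps)
    return NormalDist().inv_cdf(clipped)


for pct in (0.025, 0.5, 0.975):
    print(pct, to_normal_score(pct))
```

If the community distribution were perfect, these scores would be exactly standard normal. The second-level game rewards algorithms that detect where they are not.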
Combine Percentiles
Some seemingly univariate series of games are actually copulas
Pitch and Yaw implied copulas - from MIT SciML helicopula challenge
Optics Analogy
Keep “lensing” until you get N(0,1)
Composition of monotone functions, each contributed by one or more algorithms
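The lensing idea can be sketched as plain function composition. The two "lenses" below (a log transform followed by a rescaling) are hypothetical stand-ins for transforms contributed by algorithms. The only property relied on is that each is increasing, so their composition is too.

```python
import math
from functools import reduce


def compose(*lenses):
    """Apply lenses left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, lens: lens(acc), lenses, x)


# Two hypothetical increasing "lenses": a log transform, then a rescaling.
pipeline = compose(math.log, lambda y: (y - 1.0) / 0.5)

xs = [1.0, 2.0, 4.0, 8.0]
zs = [pipeline(x) for x in xs]
print(zs)
```

Because order is preserved at every stage, a percentile computed after the last lens corresponds to the same percentile of the original data.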
Pathways in the Collective Probability Brain
Scenarios “thrown” up to top level lottery
(Diagram: nodes U( ), V( ), R( ), W( ), S( ), T( ) and Q( ), with links labelled Collaboration and Competition)
Law of Iterated Expectations
Pathways grow and shrink based on the economics
Point estimates are a special case - shift
Exogenous data is a special case - shift arbitrarily (!)
Scenarios thrown “up” into top level lottery
Management fees charged down from parent to child
(Diagram labels: Y; E[Y|X]; E[E[Y|X]|Z]; E[Y|S]; E[E[Y|S]|Z]; E[Y|S,Z,R] = E[E[E[Y|S]|Z]|R])
Wanna Play?
Wanna Predict Something?
In any language (api.microprediction.org)
Use category #1: Auxiliary market predictions
Markets predict the mean of a stock well
Everything else (pretty much) is poorly predicted, due to lack of the discipline imposed by competition.
● Volatilities
● Correlations
● Bid-offer spreads
● Liquidity
● Trading costs
● Holding periods
● Client flow
● Response to inquiry
● Cover price
Use category #2: Prioritizing human work
e.g. reference data cleaning
Probability that a record is changed?
Which records will be changed?
Use category #3: Enhancing live data feeds
Tagging.
Converting sporadic live data to continuous.
Discovering existing relationships
Predicting delayed data and partially filled data
Discovering good embeddings
Finding new exogenous data
Discovering good proxies for truth
Use category #4: Live feature discovery
Chumming the water
Predicting quantities correlated with the quantity you truly care about
Determining which feature generation algorithms are suited to the task at hand
Use category #5: Enhancing business intelligence applications
Predicting numbers on dashboards
Highlighting unusual movements
Predicting human reaction to information, or not (false positives)
Enabling humans to track a larger amount of data in real time
Use category #6: Fairness and explanation
Discovering data that reveals hidden bias
Historical example: proxies for race, redlining
Use category #7: Surrogate models
Competing and combining surrogate models for agent based epidemic modeling
https://www.microprediction.org/stream_dashboard.html?stream=pandemic_infected
3. An Existence Proof
(for an automated Machine Learning network replacing artisan data science in large part)
1 - Motherhood statement
Quantitative business optimization will be a survival requirement for companies
(Machine Learning is set to transform all industries)
2 - Slightly more controversial...
Quantitative business optimization using ML/AI = frequently repeated prediction
Control theory ~ RL ~ microprediction of value functions
3 - Obvious to MIT folks
Strangers can do your ML for you
4 - Orthodox economics (local knowledge)
At approximately zero friction, markets >> central planning by humans
5 - The rest is busywork ...
Humans will not play a blocking role in the production of prediction
Machine Learning will be orchestrated by hierarchies of real-time generalized contests
Thanks for listening !
Thanks to Key Contributors. Join us!
• Wrote the front end
• Winning crawlers
• Clients in Java, Julia, Rust
• ZK-MUID proofs
• Monotonic NNs
Interested? Join us Fridays at noon for an informal contributor chat
https://www.microprediction.com/contact-us
