LARGE SCALE FORECASTING & PRICING
Tofigh Naghibi
FORECASTING & PRICING
Forecaster: Oracle of all possible futures
Optimizer: Find the ONE optimal future action
FORECASTING & PRICING OVERVIEW
Forecasting
● Forecast sales
○ per article/object
○ per shop/market
○ for a given time horizon, per time unit
Pricing
● Optimal price for a
○ given objective function (revenue / profit, customer satisfaction)
○ given constraints (min profit, max accepted loss, max over-stock)
○ given optimization time horizon
○ given compute budget
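The split between forecaster and optimizer can be sketched in a few lines. This is only an illustration under stated assumptions, not the system from the talk: `forecast_sales` is a hypothetical stand-in for a trained forecaster (a toy linear demand curve), and the price grid, unit cost, stock level and minimum-profit constraint are made-up numbers. The size of the price grid plays the role of the compute budget.

```python
def forecast_sales(price, horizon_days=7):
    """Hypothetical forecaster: expected units sold per day at a given price."""
    daily_demand = max(0.0, 100.0 - 2.0 * price)  # toy linear demand curve (assumption)
    return [daily_demand] * horizon_days


def optimize_price(candidate_prices, unit_cost, stock, min_profit):
    """Pick the price with the highest forecast profit that satisfies the constraints."""
    best_price, best_profit = None, float("-inf")
    for price in candidate_prices:
        units = min(sum(forecast_sales(price)), stock)  # cannot sell more than the stock
        profit = (price - unit_cost) * units
        if profit < min_profit:  # constraint: minimum profit
            continue
        if profit > best_profit:
            best_price, best_profit = price, profit
    return best_price, best_profit


# Candidate prices 10.0, 10.5, ..., 50.0: one forecast call per candidate.
prices = [p / 2 for p in range(20, 101)]
best_price, best_profit = optimize_price(prices, unit_cost=10.0, stock=10_000, min_profit=0.0)
print(best_price, best_profit)
```

The forecaster is queried for every candidate future, and the optimizer keeps the single best feasible one. A grid search like this scales poorly with the number of articles and prices, which is part of what motivates the gradient-based approach later in the deck.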
PRICING STRATEGIES
Short time horizon
Reward (objective function) value becomes clear immediately ⇒
● Far future is irrelevant
● Prediction is much easier
● You can even use an online optimizer
● Can be tested quickly
Long time horizon
Reward (objective function) value is revealed very late ⇒
● Many time steps are dependent
● Prediction is difficult
● You need a proper optimizer
● Needs a long-term A/B test
Long time horizon
● Model-based reinforcement learning
○ Learn market model from historic data, D
○ Learn action policy: 𝞹(d|S)
○ Deterministic: Pure optimization
○ Stochastic: Exploration
Short time horizon
● Multi-armed bandit / Online learning
○ Learn online or from historic data
○ Select an action/price
○ Observe its effect and update
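For the short-horizon case, the select / observe / update loop can be sketched as an epsilon-greedy bandit. This is one common bandit strategy chosen for illustration; the talk does not say which one was used. The candidate prices, the purchase probabilities of the simulated customers and the function name `run_epsilon_greedy` are all assumptions.

```python
import random


def run_epsilon_greedy(prices, true_buy_prob, rounds=50_000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: each candidate price is one arm, reward is revenue per visitor."""
    rng = random.Random(seed)
    counts = [0] * len(prices)
    mean_reward = [0.0] * len(prices)
    for _ in range(rounds):
        # Select an action/price: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            arm = rng.randrange(len(prices))
        else:
            arm = max(range(len(prices)), key=lambda i: mean_reward[i])
        # Observe its effect: a simulated customer buys with a probability unknown to the learner.
        reward = prices[arm] if rng.random() < true_buy_prob[arm] else 0.0
        # Update the running mean reward of the chosen arm.
        counts[arm] += 1
        mean_reward[arm] += (reward - mean_reward[arm]) / counts[arm]
    return counts, mean_reward


prices = [10.0, 15.0, 20.0]
true_buy_prob = [0.50, 0.45, 0.20]  # expected revenue per visitor: 5.0, 6.75, 4.0
counts, mean_reward = run_epsilon_greedy(prices, true_buy_prob)
best_arm = max(range(len(prices)), key=lambda i: counts[i])
print(prices[best_arm], counts)
```

Because the reward is visible right after each action, no market model is needed: the running averages are the whole state of the learner.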
FORECASTING AT SCALE: CHALLENGES
● Many articles
○ Not sensible to have a model per article
● Frequent price updates
○ Forecast + Optimization need to be efficient/fast
● Many shops/markets
○ They might have interactions
● Various stakeholders
○ Might need forecasts at different aggregation levels
○ Training samples are not independent anymore
● Long forecasting horizons
○ Put a lot of pressure on the system, both computationally and accuracy-wise
○ Hard to evaluate
TRADITIONAL FORECASTING METHODS
● ARIMA / Prophet: One demand forecast model per article
○ Model management is an issue
○ Remember: The learned model SHOULD be a function of price: Demand = f(price)
● Bigger problem: In short time series, not many price changes exist. Thus, the approximation of demand as a function of price will be very noisy.
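The noise problem can be made concrete with a small simulation. Everything here is an assumption for illustration: a log-linear demand curve with a true price elasticity of -2, 200 articles with 20 weeks of history each, and a single 10% markdown per article. The sketch compares one least-squares slope per article against one slope shared by all articles (each article keeps its own intercept).

```python
import math
import random

rng = random.Random(0)
TRUE_ELASTICITY = -2.0
N_ARTICLES, N_WEEKS = 200, 20


def simulate_article():
    """One short series: a single 10% markdown at a random week, log-linear demand plus noise."""
    change_week = rng.randrange(5, 15)
    log_price = [0.0 if week < change_week else math.log(0.9) for week in range(N_WEEKS)]
    base = rng.gauss(3.0, 1.0)
    log_demand = [base + TRUE_ELASTICITY * lp + rng.gauss(0.0, 0.5) for lp in log_price]
    return log_price, log_demand


def centered_moments(log_price, log_demand):
    """Return sum((p - mean p) * (d - mean d)) and sum((p - mean p)^2) for one article."""
    mp = sum(log_price) / len(log_price)
    md = sum(log_demand) / len(log_demand)
    spd = sum((p - mp) * (d - md) for p, d in zip(log_price, log_demand))
    spp = sum((p - mp) ** 2 for p in log_price)
    return spd, spp


moments = [centered_moments(*simulate_article()) for _ in range(N_ARTICLES)]

# One model per article: least-squares slope from that article's own short history.
per_article = [spd / spp for spd, spp in moments]
# One shared slope for all articles, estimated from the pooled moments.
pooled = sum(spd for spd, _ in moments) / sum(spp for _, spp in moments)

mean_abs_error = sum(abs(e - TRUE_ELASTICITY) for e in per_article) / N_ARTICLES
print(f"per-article mean abs error: {mean_abs_error:.2f}")
print(f"pooled estimate error:      {abs(pooled - TRUE_ELASTICITY):.2f}")
```

With only one price change per series, each per-article slope rests on very little price variation, so its error is large; pooling the same data across articles averages that noise away.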
MODERN APPROACH
● One model shared among all articles
● Shop models can be shared or not, but they SHOULD communicate
● The model should be able to provide the gradient of its output with respect to price
DIFFERENTIABLE PROGRAMMING
[Diagram: Sales forecast, Stock forecast, Return rate, Cancellation, Damaged rate, Depreciation]
● Develop models in gradient-descent-based frameworks
○ PyTorch, Chainer, MXNet, TensorFlow, ...
○ Yes, these are neural network frameworks
○ No, not all the models need to be neural-network based
REAL-WORLD EXAMPLE
[Diagram: Sales model, Return rate model, Over-stock err. min.]
● Models can communicate via gradients
● Requirement: All models should implement gradient calculation
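A minimal sketch of what "communicating via gradients" can look like, written in plain Python so the requirement is visible: each model exposes its value and its gradient with respect to price, and the objective combines them with the product rule. The two model classes, their functional forms and all parameter values are hypothetical; in an autodiff framework the `grad` methods would not be written by hand.

```python
import math


class SalesModel:
    """Hypothetical sales model: exponential demand curve in price."""

    def __init__(self, scale=500.0, sensitivity=0.05):
        self.scale, self.sensitivity = scale, sensitivity

    def value(self, price):
        return self.scale * math.exp(-self.sensitivity * price)

    def grad(self, price):
        # d(sales)/d(price)
        return -self.sensitivity * self.value(price)


class ReturnRateModel:
    """Hypothetical return-rate model: logistic curve in price."""

    def __init__(self, weight=0.03, bias=-2.0):
        self.weight, self.bias = weight, bias

    def value(self, price):
        return 1.0 / (1.0 + math.exp(-(self.weight * price + self.bias)))

    def grad(self, price):
        # d(return rate)/d(price) = weight * r * (1 - r)
        r = self.value(price)
        return self.weight * r * (1.0 - r)


def net_revenue(price, sales, returns):
    """Revenue from items that are sold and not returned."""
    return price * sales.value(price) * (1.0 - returns.value(price))


def net_revenue_grad(price, sales, returns):
    """Product rule: the objective only needs each model's value and gradient."""
    s, ds = sales.value(price), sales.grad(price)
    r, dr = returns.value(price), returns.grad(price)
    return s * (1.0 - r) + price * ds * (1.0 - r) - price * s * dr


sales, returns = SalesModel(), ReturnRateModel()
price, eps = 25.0, 1e-6
# Central finite difference as an independent check of the hand-written gradient.
numeric = (net_revenue(price + eps, sales, returns)
           - net_revenue(price - eps, sales, returns)) / (2 * eps)
analytic = net_revenue_grad(price, sales, returns)
print(analytic, numeric)
```

Neither model needs to know how the other works internally; as long as both implement `value` and `grad`, any downstream objective can be differentiated with respect to price.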
● Solving the optimization is simple: use gradient ascent/descent. The gradients of the models come almost for free.
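As a sketch of the optimization step, here is plain gradient ascent on a toy profit function. The linear demand curve, the unit cost, the learning rate and the price bounds are assumptions made for illustration; the gradient is written by hand, where a framework would supply it automatically.

```python
def profit(price, unit_cost=10.0, intercept=100.0, slope=2.0):
    """Toy objective: (price - cost) * demand, with linear demand = intercept - slope * price."""
    return (price - unit_cost) * (intercept - slope * price)


def profit_grad(price, unit_cost=10.0, intercept=100.0, slope=2.0):
    """d(profit)/d(price), derived by hand with the product rule."""
    return (intercept - slope * price) - slope * (price - unit_cost)


def gradient_ascent(start_price, learning_rate=0.05, steps=200, min_price=0.0, max_price=50.0):
    """Follow the gradient uphill, keeping the price inside its allowed range."""
    price = start_price
    for _ in range(steps):
        price += learning_rate * profit_grad(price)
        price = min(max(price, min_price), max_price)
    return price


best_price = gradient_ascent(start_price=15.0)
print(round(best_price, 3), round(profit(best_price), 2))
```

For this concave toy objective the closed-form optimum is (intercept + slope * unit_cost) / (2 * slope) = 30, which the iteration approaches. Real objectives built from learned models are generally not concave, so gradient ascent may only find a local optimum.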
SALES FORECASTER
[Diagram: Encoder, Decoder]
TRANSFORMER
● Replacement for LSTM
○ Extremely parallelizable: O(1) sequential operations with respect to sequence length
● Built from feed-forward networks, which makes it much easier to counter overfitting
Reference: Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
● Multi-head attention mechanism
○ Attention: Every time point computes its similarity with every other time point in the sequence
○ The sequence is first projected into k different subspaces, then attention is applied in each
● No hidden state. The encoded output sequence has the same length as the input sequence
● It encodes a set (input order carries no meaning unless positions are encoded explicitly)
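The attention step can be illustrated with a stripped-down sketch: a single head of scaled dot-product self-attention with identity projections, so there are no learned weights and no positional encoding. That simplification is an assumption of this example; a multi-head layer would repeat the same computation in k learned subspaces and concatenate the results.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(seq[0])
    output = []
    for query in seq:
        # Similarity of this time point with every time point in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in seq]
        weights = softmax(scores)
        # Weighted average of all time points (values are the inputs themselves here).
        output.append([sum(w * value[j] for w, value in zip(weights, seq)) for j in range(dim)])
    return output


seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
encoded = self_attention(seq)

# Reordering the inputs only reorders the outputs: without positional
# encodings, attention treats the sequence as a set.
order = [2, 0, 3, 1]
encoded_permuted = self_attention([seq[i] for i in order])
print(len(seq), len(encoded))
```

Each output row depends on all input rows at once, with no state carried from one time step to the next, which is why the rows can be computed in parallel.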
Infrastructure
● Low-latency super duper data processing pipeline
● Multi-GPU training
● SageMaker, Kubernetes, Databricks for large-scale analysis
● One-click generation of training data with new features
?
