
Modeling at scale in systematic trading


SigOpt CEO Scott Clark provides insights for modeling at scale in systematic trading. SigOpt works with algorithmic trading firms that collectively represent more than $300 billion in assets under management (AUM). In this presentation, Scott draws on this experience to offer a few critical insights into how these companies effectively model at scale. Alongside these insights, he shares a more specific case study from working with Two Sigma, a leading systematic investment manager.



  1. Modeling at Scale in Systematic Trading (Scott Clark, CEO, SigOpt)
  3. $300B+ Assets Under Management* (*current SigOpt trading customers represent over $300B in assets under management)
  4. Lessons:
     1. Invest in a reproducible process
     2. Balance flexibility with standardization
     3. Divide labor between humans & machines
     4. Maximize resource utilization
     5. Prioritize performance (broadly defined)
  5. Lesson 1: Invest in a reproducible process
  6. Data → Modeling → Simulation → Optimization → Execution
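
A minimal sketch can make Lesson 1 concrete (all names here are hypothetical Python stand-ins, not SigOpt or any firm's code): pinning every seed and hashing the configuration means any backtest result can be traced back to, and regenerated from, the exact settings that produced it.

    import hashlib
    import json

    import numpy as np


    def run_pipeline(config, seed=42):
        """Toy Data -> Modeling -> Simulation run with a traceable config."""
        rng = np.random.default_rng(seed)  # one pinned RNG for the whole run

        # Data: stand-in for loading a versioned data snapshot.
        returns = rng.normal(0.0005, 0.01, size=2500)

        # Modeling: toy moving-average signal parameterized by the config.
        kernel = np.ones(config["window"]) / config["window"]
        signal = np.convolve(returns, kernel)[: len(returns)]

        # Simulation: trade yesterday's signal against today's return
        # (the one-bar shift avoids trivial look-ahead).
        pnl = np.sign(signal[:-1]) * returns[1:]
        sharpe = pnl.mean() / pnl.std() * np.sqrt(252)

        # Hash the config so every result is traceable to its exact settings.
        blob = json.dumps(config, sort_keys=True).encode()
        config_id = hashlib.sha256(blob).hexdigest()[:12]
        return {"config_id": config_id, "seed": seed, "sharpe": float(sharpe)}


    print(run_pipeline({"window": 20}))

Rerunning with the same config and seed reproduces the same Sharpe ratio exactly, which is the property the whole Data-through-Execution chain depends on.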
  7. Data: historical stock prices, company data, company news, social data, location data, satellite data
  8. Modeling, Simulation, Optimization
  9. Modeling, Simulation, Optimization. Backtests must avoid: overfitting bias, look-ahead bias, survivorship bias, p-hacking bias, metric bias.
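
Look-ahead bias in particular has a standard structural defense: a walk-forward split in which the model at time t only ever sees data up to t. A minimal sketch (hypothetical helper, not SigOpt code):

    import numpy as np


    def walk_forward_splits(n_obs, train, test):
        """Yield (train_idx, test_idx) index pairs that never overlap in time."""
        start = 0
        while start + train + test <= n_obs:
            train_idx = np.arange(start, start + train)
            test_idx = np.arange(start + train, start + train + test)
            yield train_idx, test_idx
            start += test  # roll the window forward by one test block


    for tr, te in walk_forward_splits(n_obs=1000, train=500, test=100):
        print(f"train {tr[0]}-{tr[-1]} -> test {te[0]}-{te[-1]}")

Evaluating on several disjoint out-of-sample windows, rather than one, also makes overfitting and p-hacking easier to detect: a configuration that only wins on a single window is suspect.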
  10. [Diagram] The optimization loop: training data and testing data feed a model, whose backtest produces an objective metric; the metric is reported over a REST API, and new configurations are returned for better results. Platform components: experiment insights, optimization ensemble, enterprise platform.
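
This loop maps naturally onto a suggest/observe API. A sketch in the style of SigOpt's classic Python client (`run_backtest` and the parameter names are hypothetical; treat the client calls as illustrative of the loop rather than a verbatim integration guide):

    from sigopt import Connection

    conn = Connection(client_token="YOUR_API_TOKEN")

    experiment = conn.experiments().create(
        name="Backtest tuning sketch",
        parameters=[
            dict(name="lookback", type="int", bounds=dict(min=5, max=250)),
            dict(name="threshold", type="double", bounds=dict(min=0.0, max=1.0)),
        ],
    )

    for _ in range(50):
        # The optimizer proposes a new configuration...
        suggestion = conn.experiments(experiment.id).suggestions().create()

        # ...the firm's own backtest turns it into an objective metric...
        value = run_backtest(**suggestion.assignments)  # hypothetical

        # ...and the reported observation shapes the next suggestion.
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id,
            value=value,
        )

Note that only parameter values and metric values cross the API; the data and the model itself stay inside the firm.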
  11. Execution: high-frequency trading, market making, statistical arbitrage; rebalancing, portfolio optimization
  12. Lesson 2: Balance flexibility with standardization
  13. The modeling stack:
      Hardware: scalable, efficient compute
      Data: pipelines, features
      Experimentation / Modeling: notebooks, libraries, frameworks
      Optimization: tuning, tracking, analysis, resource management
      Execution / Simulation: backtests, metric iteration, model evaluation, portfolio optimization
      Serving / Monitoring
  14. A framework for each layer: Solutions: standard or proprietary per firm? Innovation: incremental or existential for the firm? Status: still evolving or fully established?
  15. Applying the framework:
      Hardware (scalable, efficient compute): Solutions: Standard | Innovation: Existential | Status: Evolving | Implication: Buy
      Data (pipelines, features): Solutions: Proprietary | Innovation: Incremental | Status: Established | Implication: Mixed
      Experimentation / Modeling (notebooks, libraries, frameworks): Solutions: Standard | Innovation: Existential | Status: Evolving | Implication: Mixed
      Optimization (tuning, tracking, analysis, resource mgmt): Solutions: Standard | Innovation: Incremental | Status: Evolving | Implication: Buy
      Serving / Monitoring: Solutions: Standard | Innovation: Existential | Status: Established | Implication: Mixed
      Execution / Simulation (backtests, metric iteration, model evaluation, portfolio optimization): Solutions: Proprietary | Innovation: Existential | Status: Established | Implication: Build
  16. [Diagram] The end-to-end platform:
      Hardware environment: on-premise, hybrid, multi-cloud
      Data preparation: transformation, labeling, pre-processing, pipeline development, feature engineering, feature stores
      Experimentation, training, evaluation: notebook & model frameworks; experimentation & model optimization (insights, tracking, collaboration; model search, hyperparameter tuning; resource scheduler and management); simulation & evaluation (backtests, metrics, portfolio optimization)
      Model productionalization: validation, serving, deploying, monitoring, managing, inference, online testing
  17. Lesson 3: Divide labor between humans & machines
  19. Training & tuning covers many overlapping terms: hyperparameter optimization, model tuning, hyperparameter search, deep learning architecture search. Common methods: grid search, random search, evolutionary algorithms, Bayesian optimization.
  20. Search methods compared:
      Manual search: Pro: leverages expertise | Con: not scalable, inconsistent
      Grid search: Pro: simple to implement | Con: not scalable, often infeasible
      Random search: Pro: scalable | Con: inefficient
      Evolutionary algorithms: Pro: effective at architecture search | Con: very resource intensive
      Bayesian optimization: Pro: efficient, effective | Con: can be tough to parallelize
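
A toy experiment makes the grid-versus-random rows concrete (the `objective` here is a hypothetical stand-in for a backtest metric): with 10 points per axis, grid search costs 10^d evaluations in d dimensions, while random search spends whatever budget is available regardless of dimension.

    import itertools
    import random


    def objective(lookback, threshold):  # stand-in for a backtest metric
        return -((lookback - 60) ** 2) / 1e4 - (threshold - 0.3) ** 2


    # Grid search: 10 points per axis -> 100 evaluations for 2 parameters,
    # 10^d for d parameters, which quickly becomes infeasible.
    grid = itertools.product(range(10, 110, 10), [i / 10 for i in range(10)])
    best_grid = max(grid, key=lambda p: objective(*p))

    # Random search: any budget, in any dimension, and trivially parallel,
    # but each sample ignores everything learned from earlier samples.
    random.seed(0)
    samples = [(random.randint(10, 100), random.random()) for _ in range(100)]
    best_random = max(samples, key=lambda p: objective(*p))

    print("grid:", best_grid, "random:", best_random)

Bayesian optimization keeps random search's flexibility but uses past observations to choose the next point, which is where both its sample efficiency and its parallelization difficulty come from.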
  25. [Diagram] Data and models stay private; iterative, automated optimization; built specifically for scalable enterprise use cases. Training data and testing data feed a model, whose backtest produces an objective metric; the metric is reported over a REST API, and new configurations are returned for better results. Experiment insights: organize and introspect experiments. Optimization ensemble: explore and exploit with a variety of techniques. Enterprise platform: built to scale with your models in production.
  26. Lesson 4: Maximize resource utilization
  27. Build or buy? Is there a proprietary advantage to DIY? Does it benefit from domain expertise? Is the process the same for each model? Is the open source well maintained? Is the open source reliable? Are there more performant alternatives? Is there a low maintenance burden to buy? Can the product scale with our needs? Can the product evolve with our needs?
  28. Benefit: maximize resource utilization. Asynchronous parallelization is critical for resource utilization.
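
A minimal sketch of why (all names hypothetical; a thread pool stands in for a cluster of machines): each worker runs its own suggest/evaluate/report loop, so a machine that finishes early immediately pulls its next configuration instead of idling until the slowest run in a synchronized batch completes.

    import random
    import time
    from concurrent.futures import ThreadPoolExecutor


    def evaluate(config):  # stand-in for a long, uneven backtest
        time.sleep(random.uniform(0.1, 0.5))
        return -((config - 0.3) ** 2)


    def worker(n_runs):
        results = []
        for _ in range(n_runs):
            config = random.random()  # stand-in for an optimizer suggestion
            results.append((config, evaluate(config)))  # report immediately
        return results


    # 20 workers, each with its own asynchronous loop; none waits on the others.
    with ThreadPoolExecutor(max_workers=20) as pool:
        futures = [pool.submit(worker, 5) for _ in range(20)]
        best = max((r for f in futures for r in f.result()), key=lambda r: r[1])

    print("best config:", best)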
  29. Benefit: performance gains.
      90% fewer training runs to optimize: https://devblogs.nvidia.com/sigopt-deep-learning-hyperparameter-optimization/
      400x faster time to optimize: https://aws.amazon.com/blogs/machine-learning/fast-cnn-tuning-with-aws-gpu-instances-and-sigopt/
      20x the cost efficiency to optimize: https://devblogs.nvidia.com/optimizing-end-to-end-memory-networks-using-sigopt-gpus/
  30. Lesson 5: Prioritize performance (broadly defined)
  31. Performance (table stakes): better, faster, cheaper
  32. Benefit: performance gains. Better results, 8x faster. “We’ve integrated SigOpt’s optimization service and are now able to get better results faster and cheaper than any solution we’ve seen before.” (Matt Adereth, Managing Director, Two Sigma)
  33. Source: https://arxiv.org/abs/1603.09441
  34. Case: cars image classification. Stanford Cars dataset (https://ai.stanford.edu/~jkrause/cars/car_dataset.html): 16,185 images, 196 classes. Labels: car, make, year.
  36. Cost efficiency:
                                SigOpt    Bayesian    Random
      Hours per training        4.2       4.2         4.2
      Observations              220       646         646
      Number of runs            1         1           20
      Total compute hours       924       2,713       54,264
      Cost per GPU-hour         $0.90     $0.90       $0.90
      Total compute cost        $832      $2,442      $48,838
      Time to optimize:
                                SigOpt    Bayesian    Random
      Total compute hours       924       2,713       54,264
      Number of machines        20        20          20
      Wall-clock time (hrs)     46        136         2,713
      1.7% the cost of random search to achieve similar performance; 58x faster wall-clock time to optimize with multitask than random search.
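
Every figure in the table follows from observations × hours per training × number of runs; a quick arithmetic check:

    # Derive the table's totals from its inputs.
    hours_per_training = 4.2
    gpu_hour_cost = 0.90
    machines = 20

    for name, observations, runs in [("SigOpt", 220, 1),
                                     ("Bayesian", 646, 1),
                                     ("Random", 646, 20)]:
        compute_hours = observations * hours_per_training * runs
        print(name,
              round(compute_hours),                  # total compute hours
              round(compute_hours * gpu_hour_cost),  # total compute cost ($)
              round(compute_hours / machines))       # wall-clock hours on 20 machines

    # Prints 924 / 832 / 46 for SigOpt, 2713 / 2442 / 136 for Bayesian,
    # and 54264 / 48838 / 2713 for Random, matching the table.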
  37. Performance (broadly defined): entirely new capabilities
  38. Failed observations, constraints on the model, noise in the data, competing metrics, lengthy training cycles, distributed training
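
Of these, failed observations are the simplest to illustrate. Continuing the earlier client sketch (same hypothetical `conn`, `experiment`, and `run_backtest`; the `failed` flag follows the style of SigOpt's classic client), a crashed or diverged backtest is reported as a failure rather than silently dropped, so the optimizer learns to steer away from that region instead of retrying it:

    suggestion = conn.experiments(experiment.id).suggestions().create()
    try:
        value = run_backtest(**suggestion.assignments)  # hypothetical
    except RuntimeError:
        # No metric to report; mark the observation as failed.
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id,
            failed=True,
        )
    else:
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id,
            value=value,
        )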
  40. Thank you. https://sigopt.com/company/careers/ | https://sigopt.com/blog/ | https://sigopt.com/research/ | https://sigopt.com/try-it
