
ThinkFast: Scaling Machine Learning to Modern Demands


by Hristo Spassimirov Paskov
Founder and CEO, ThinkFast Mathematical Intelligence Corporation
Intel Software Innovator for Artificial Intelligence

Machine learning has revolutionized the technological landscape, and its success has inspired the collection of vast amounts of data aimed at answering ever deeper questions and solving increasingly hard problems. Continuing this success depends critically on machine learning paradigms that can perform sophisticated analyses at the scale of modern data sets and that shorten development cycles by improving ease of use. The evolution of machine learning paradigms shows a marked trend toward better addressing these desiderata and a convergence on paradigms that blend “smooth” modeling techniques classically attributed to statistics with “combinatorial” elements traditionally studied in computer science.

These modern learning paradigms pose a new set of challenges that, when properly addressed, open an unexpected wealth of possibilities. I will discuss how ThinkFast is solving these challenges with fundamental advances in optimization that recast machine learning as a more classical database technology. These advances allow us to scale a variety of techniques to unprecedented data sizes on commodity hardware. They also provide surprising insights into how modern techniques learn about data, including a characterization of the limits of what they can learn, and ultimately let us devise new, more powerful techniques that do not suffer from these limitations.



  1. ThinkFast: Scaling Machine Learning to Modern Demands (Hristo Paskov)
  2. The Genomic Data Deluge
     • Precision Medicine Initiative: sequence 1,000,000 genomes
       – $215 million in 2015
       – Pilot study
       – Outputs 10-50 GB per person
     How do we analyze all of this data to drive progress?
  3. Massive Data Sources: news, eCommerce, bioinformatics (100K Genomes), social media
  4. The Analysis Refinement Cycle
     Data → Model (e.g. $\frac{1}{2}\|y - Xw\|_2^2 + \frac{\lambda}{2}\|w\|_2^2$) → Solver (e.g. $x^+ = x - \alpha M \nabla f(x)$)
     Does the model capture the data's nuance? Does a solver exist, and is it fast enough?
     Yes: proceed. No: quit, or increase time, money, experience, and resources and iterate.
     (A worked gradient-descent sketch of this model/solver pairing follows the slide list.)
  5. More Than Just Training Models
     • Regularization paths
     • Model risk assessment
     • Interpretability
     [Plot: model coefficient paths vs. regularization parameter]
  6. Brief History of Statistical Learning
     Desiderata: interpretability & statistical guarantees, scalability, ease of use
     Paradigms: simple models, kernel methods, trees & ensembles, structured regularization
  7. Structured Regularization: $\min_{\beta \in \mathbb{R}^d} L(X\beta) + \lambda R(\beta)$
     Losses: regression, classification, ranking, motif finding, matrix factorization, feature embedding, data imputation, …
     Regularizers: sparsity, spatial/temporal/manifold structure, group structure, hierarchical structure, structured & unstructured multitask learning, …
  8. The Lasso’s Combinatorial Side: $\min_{\beta \in \mathbb{R}^d} L(y - X\beta) + \lambda \|\beta\|_1$
     [Plot: model coefficient paths as a function of λ]
     (A regularization-path sketch follows the slide list.)
  9. The Database Perspective: the lasso "query" $\min_{\beta \in \mathbb{R}^d} L(y - X\beta) + \lambda \|\beta\|_1$ and its subgradient $-X^T \partial_{y - X\beta} L(y - X\beta) + \lambda \partial_\beta \|\beta\|_1$
  10. The Database Perspective: $-X^T \partial_{y - X\beta} L(y - X\beta) + \lambda \partial_\beta \|\beta\|_1$
      Feature & label storage
  11. The Database Perspective: $-X^T \partial_{y - X\beta} L(y - X\beta) + \lambda \partial_\beta \|\beta\|_1$
      Feature & label storage
      Data access operations: $u = y - X\beta$, $v = \partial_u L(u)$, $w = X^T v$
  12. The Database Perspective: $-X^T \partial_{y - X\beta} L(y - X\beta) + \lambda \partial_\beta \|\beta\|_1$
      Feature & label storage
      Data access operations: $u = y - X\beta$, $v = \partial_u L(u)$, $w = X^T v$
      ML "query language": $\min_{\beta \in \mathbb{R}^d} L(y - X\beta) + \lambda \|\beta\|_1$
      (A proximal-gradient sketch built from these operations follows the slide list.)
  13. The Database Perspective: a multitask "query"
      $\min_{\beta_1,\beta_2,\beta_3 \in \mathbb{R}^d} \sum_{t=1}^{3} \big[ L_t(y_t - X_t\beta_t) + \lambda_t R_t(\beta_t) \big] + \omega \|[\beta_1\ \beta_2\ \beta_3]\|_*$
      (A nuclear-norm proximal-operator sketch follows the slide list.)
  14. The Database Perspective
      Feature, label, and model storage
      Data access operations: $u = y - X\beta$, $v = \partial_u L(u)$, $w = X^T v$
      ML "query language": $\min_{\beta \in \mathbb{R}^d} L(y - X\beta) + \lambda \|\beta\|_1$
      [Diagram: shared storage and operations serving multiple models $M_1$, $M_2$, $M_3$]
  15. The Database Perspective
      Data access operations $u = y - X\beta$, $v = \partial_u L(u)$, $w = X^T v$; the ML "query language" $\min_{\beta \in \mathbb{R}^d} L(y - X\beta) + \lambda \|\beta\|_1$; models $M_1$, $M_2$, $M_3$
      [Diagram: mapping these onto processing, memory, and mathematical structure]
  16. Efficient Feature Storage
  17. “Query Language” Optimization
      • Static analysis: given $\|y - Xw\|_2^2 + \|w\|_2^2$ and $\|y - Xw\|_2^2 + \|w\|_1$, what about $\|y - Xw\|_2^2 + \frac{1}{2}\|w\|_2^2 + \|w\|_1$?
  18. “Query Language” Optimization
      • Static analysis: given $\|y - Xw\|_2^2 + \|w\|_2^2$, $\|y - Xw\|_2^2 + \|w\|_1$, and $\|y - Xw\|_2^2 + \frac{1}{2}\|w\|_2^2 + \|w\|_1$, what about $\varepsilon(y - Xw) + \frac{1}{2}\|w\|_2^2 + \|w\|_1$?
      (A toy prox-composition sketch follows the slide list.)
  19. “Query Language” Optimization
      • Static analysis
      • Runtime analysis
  20. Some Bioinformatics Applications
      • Personalized medicine, Memorial Sloan Kettering Cancer Center
        – 35% accuracy improvement over the state of the art
      • Metagenomic binning and DNA quality assessment, Stanford School of Medicine
        – Previously unsolved problem
      • Toxicogenomic analysis, Stanford University
        – Improved on state-of-the-art results
  21. Upcoming
      • Massive-scale character-level sentiment and text analysis on Amazon data
        – Billions of features, hours to solve a model
        – Efficient multitask learning
      • Characterize the global limitations of learning word structure
        – Devise provably more efficient regularizers for uncovering structure
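
A minimal sketch of the model/solver pairing on slide 4: the ridge objective $\frac{1}{2}\|y - Xw\|_2^2 + \frac{\lambda}{2}\|w\|_2^2$ trained with the gradient step $x^+ = x - \alpha M \nabla f(x)$, taking $M = I$. The synthetic data and step-size choice are illustrative assumptions, not anything from the talk.

```python
import numpy as np

# Illustrative synthetic data (assumption, not from the talk)
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(200)

lam = 1.0
# Step size alpha <= 1/L, where L = ||X||_2^2 + lam bounds the gradient's Lipschitz constant
alpha = 1.0 / (np.linalg.norm(X, 2) ** 2 + lam)

w = np.zeros(X.shape[1])
for _ in range(500):
    grad = X.T @ (X @ w - y) + lam * w   # gradient of 1/2||y - Xw||_2^2 + (lam/2)||w||_2^2
    w -= alpha * grad                    # x+ = x - alpha * M * grad f(x), with M = I

print("ridge objective:", 0.5 * np.sum((y - X @ w) ** 2) + 0.5 * lam * np.sum(w ** 2))
```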
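A sketch of the combinatorial behavior pictured on slide 8: solving the lasso over a decreasing grid of regularization strengths and watching coefficients enter the support. The data are synthetic and the use of scikit-learn's `lasso_path` is my choice of solver for the illustration, not the talk's.

```python
import numpy as np
from sklearn.linear_model import lasso_path

# Synthetic sparse-regression data (assumption)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_w = np.zeros(20)
true_w[:4] = [3.0, -2.0, 1.5, 1.0]             # only four features are truly active
y = X @ true_w + 0.1 * rng.standard_normal(100)

# Lasso coefficients along a decreasing grid of regularization strengths
alphas, coefs, _ = lasso_path(X, y, n_alphas=10)

for a, c in zip(alphas, coefs.T):
    support = np.flatnonzero(np.abs(c) > 1e-8)
    print(f"lambda={a:8.4f}  active features: {support.tolist()}")
```

As λ shrinks, features enter the active set one group at a time, which is the "combinatorial side" the coefficient-path plot illustrates.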
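A sketch of how the three data access operations on slides 11-12 become the inner loop of a proximal-gradient (ISTA) solver for the lasso "query" $\min_\beta L(y - X\beta) + \lambda\|\beta\|_1$, assuming the squared-error loss $L(u) = \frac{1}{2}\|u\|_2^2$. The storage layer only has to answer the two matrix-vector products $X\beta$ and $X^T v$; everything else is elementwise. This is a generic textbook solver, not ThinkFast's.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: the combinatorial part of the query."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, step, n_iters=1000):
    """Solve min_beta 1/2 ||y - X beta||_2^2 + lam * ||beta||_1 by proximal gradient."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        u = y - X @ beta        # data access op: u = y - X beta
        v = u                   # v = dL/du for L(u) = 1/2 ||u||_2^2
        w = X.T @ v             # data access op: w = X^T v
        grad = -w               # gradient of L(y - X beta) with respect to beta
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# A safe step size is 1 / ||X||_2^2, e.g. step = 1.0 / np.linalg.norm(X, 2) ** 2
```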
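The multitask "query" on slide 13 adds one coupling term, $\omega\|[\beta_1\ \beta_2\ \beta_3]\|_*$, to the per-task objectives. In a proximal scheme the per-task losses reuse the same $u$, $v$, $w$ operations, and the only new primitive is the proximal operator of the nuclear norm, i.e. soft-thresholding of singular values. A generic sketch (again not ThinkFast's solver) is:

```python
import numpy as np

def prox_nuclear(B, t):
    """Proximal operator of t * ||B||_* (nuclear norm): soft-threshold the singular values.
    B stacks the per-task models as columns, e.g. B = np.column_stack([beta1, beta2, beta3])."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt
```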
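A toy illustration of the kind of static analysis slides 17-18 hint at: recognizing that the regularizers of two queries compose into one with a closed-form proximal operator (here the elastic net, whose prox is soft-thresholding followed by shrinkage). The dispatch function and its term names are hypothetical, invented purely for illustration.

```python
import numpy as np

def prox_l1(z, t):
    """Prox of t * ||.||_1 (lasso term): soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_l2_sq(z, t):
    """Prox of (t/2) * ||.||_2^2 (ridge term): multiplicative shrinkage."""
    return z / (1.0 + t)

def prox_elastic_net(z, t1, t2):
    """Prox of t1*||.||_1 + (t2/2)*||.||_2^2 composes in closed form."""
    return prox_l1(z, t1) / (1.0 + t2)

def choose_prox(terms):
    """Hypothetical 'static analysis': map the regularizer terms found in a query
    to a closed-form proximal operator."""
    if terms == {"l1"}:
        return lambda z, t: prox_l1(z, t)
    if terms == {"l2_sq"}:
        return lambda z, t: prox_l2_sq(z, t)
    if terms == {"l1", "l2_sq"}:
        return lambda z, t: prox_elastic_net(z, t, t)
    raise NotImplementedError(f"no closed-form prox registered for {terms}")
```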
