Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15

Topology as Framework for Data Science: Ayasdi has a unique approach to machine learning and data analysis using topology. This framework represents a revolutionary way to look at and understand data that is orthogonal but complementary to traditional machine learning and statistical tools. In this presentation I will show you what is meant by this statement: How does topology help with data analysis? Why would you use topology? I will illustrate with both synthetic examples and problems we’ve solved for our clients.

  1. Shape as Organizing Principle for Data. MLConf Seattle 2015. Anthony Bak, Principal Data Scientist
  2. The Data Problem: Complexity
  3. Solution: Topological Summaries
  4. Shape as Organizing Principle for Data
  5. Shape as Organizing Principle
  6. Reduce Bias, Discover Models: TDA tells you the data you have, not the data you want to have.
  7. Generating Topological Summaries
  8. Generating Topological Summaries
  9. Generating Topological Summaries
  10. Generating Topological Summaries
  11. Generating Topological Summaries
  12. Generating Topological Summaries
  13. Generating Topological Summaries
  14. Generating Topological Summaries
  15. Generating Topological Summaries
  16. Generating Topological Summaries
  17. Generating Topological Summaries
  18. Generating Topological Summaries
  19. Generating Topological Summaries
  20. Generating Topological Summaries
  21. Generating Topological Summaries
  22. Generating Topological Summaries
  23. Generating Topological Summaries
  24. Remember/Forget
      • Use multiple lenses/metrics to get the complete picture
      • Different lenses provide different summaries
  25. Generating Topological Summaries
  26. Lenses: where do they come from?
      • Statistics: Mean/Max/Min, Variance, n-Moment, Density, …
      • Machine Learning: PCA/SVD, Autoencoders, Isomap/MDS/t-SNE, …
      • Geometry: Centrality, Curvature, Harmonic Cycles, …
  27. Why Topology?
  28. Key Properties of TDA: Deformation Invariance, Compressed Representation, Coordinate Freeness
  29. Coordinate Invariance
      • Topology of shape doesn’t depend on the coordinates used to describe the shape
      • Different feature sets can describe the same phenomena
      • While processing data, we frequently alter coordinates: scaling, rotating, whitening
      You want to study properties of your data that are invariant under coordinate changes
  30. Coordinate Invariance: Gene Expression (NKI, GSE230)
  31. Coordinate Invariance: Disease State
  32. Deformation Invariance
      • Topological features don’t change when you stretch and distort the data
      Advantage: Makes problems easier
      • Noise resistance
      • Less pre-processing of data
      • Robust (stable) data
  33. Deformation Invariance
  34. Deformation Invariance
  35. Deformation Invariance
  36. Deformation Invariance
  37. Compressed Representation
      • Replace the metric space with a combinatorial summary: a simplicial complex.
      • Data becomes easier to manage, search, and query while maintaining essential features.
      • Leverages many known algorithms from graph theory, computational topology, computational geometry.
  38. Compressed Representation
  39. Baby Steps: PCA
  40. PCA
  41. PCA
  42. Data Stories
  43. Model Introspection
  44. Model Introspection
  45. Predictive Maintenance
  46. Customer Churn
  47. Customer Churn
  48. Transaction Fraud
  49. Transaction Fraud
  50. Transaction Fraud
  51. We’re Hiring! http://www.ayasdi.com/company/careers/
      Data Has Shape And Shape Has Meaning
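
The slides titled "Generating Topological Summaries" and "Compressed Representation" name the ingredients (a lens, a metric, a combinatorial summary) but the transcript does not spell out the construction. The following is a minimal sketch, assuming a Mapper-style pipeline: cover the lens values with overlapping intervals, cluster the points in each interval, and link clusters that share points. The function names, the parameters (`n_intervals`, `overlap`, `eps`) and the threshold-based single-linkage clustering are illustrative assumptions, not Ayasdi's implementation.

```python
import numpy as np
from itertools import combinations


def single_linkage_clusters(points, idx, eps):
    """Split the indices in idx into connected components of the
    graph that joins two points when they are within eps (assumed
    stand-in for whatever clustering a real pipeline would use)."""
    idx = [int(i) for i in idx]
    unvisited = set(range(len(idx)))
    clusters = []
    while unvisited:
        start = unvisited.pop()
        stack, comp = [start], {start}
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if np.linalg.norm(points[idx[i]] - points[idx[j]]) <= eps]
            for j in near:
                unvisited.remove(j)
                comp.add(j)
                stack.append(j)
        clusters.append(frozenset(idx[i] for i in comp))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.3, eps=0.3):
    """Hypothetical Mapper-style summary: nodes are clusters of points,
    edges join clusters that have at least one point in common."""
    lo, hi = lens.min(), lens.max()
    length = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Widen each interval so that neighbouring intervals overlap.
        a = lo + k * length - overlap * length
        b = lo + (k + 1) * length + overlap * length
        members = np.where((lens >= a) & (lens <= b))[0]
        nodes.extend(single_linkage_clusters(points, members, eps))
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]]
    return nodes, edges


# Synthetic example: 100 evenly spaced points on the unit circle,
# with the x coordinate as the lens.
theta = 2 * np.pi * np.arange(100) / 100
circle = np.column_stack([np.cos(theta), np.sin(theta)])
nodes, edges = mapper_graph(circle, circle[:, 0])
print(len(nodes), "nodes,", len(edges), "edges")
```

On this input the summary graph has as many edges as nodes and every node has two neighbours, so the loop in the data survives as a loop in the much smaller graph. A different lens or metric would generally give a different summary, which is the point of the Remember/Forget slide.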
