
That Conference 2017 - Killing a Fly with a Shotgun: Metacognition and the Art of Problem Solving


Nexosis Data Scientist Joe Volzer presented this deck at That Conference on August 7th, 2017.


  1. Killing a Fly with a Shotgun: Metacognition and the Art of Problem Solving
  2. Thank you, sponsors!
     o That Conference sent me an e-mail requesting that I do this.
     o Also, I’m genuinely thankful. This is an impressive event.
  3. Relevant Credentials
     o BA Mathematics, The Ohio State University, 2007
     o MAT Teaching Leadership and Curriculum Studies, Kent State University, 2008
     o PhD Applied Mathematics, Case Western Reserve University, 2014
     o Currently, Data Scientist at Nexosis
     o Previously, various flavors of nerd.
  4. Metacognition Is Totally Not Made Up
     o Metacognition is simply thinking about the way you think.
     o Why does it matter?
     o Problem solving is a creative endeavor.
  5. Themes that I have noticed
     o Knobbiness – creative ideas are the result of twisting knobs on an idea machine.
     o Local triviality – complicated ideas are just a series of simple ideas.
     o Cognitive resolution – like visual resolution, but with ideas.
       Explain things as simply as possible, but no simpler.
  6. Applying these ideas to regression
     Linear regression – also known as finding a line of “best” fit:
     $y_i \approx \alpha_1 x_i + \alpha_0$,
     where $y$ is the target value, $x$ is the input, and the $\alpha$ are the model parameters. How do we find such a line?
  7. How do we find this?
     Find the $\alpha_i$ that minimize the following sum:
     $\sum_i r_i^2$, where $r_i = y_i - \alpha_1 x_i - \alpha_0$.
     This is called minimizing the residuals. We do this by creating a system of equations called the “normal equations.” What are the normal equations? The subject of another talk entirely.
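The least-squares fit described here can be sketched in a few lines of NumPy. This is a minimal illustration (the data is invented, not from the talk); `np.linalg.lstsq` minimizes exactly this sum of squared residuals, though internally it uses a more numerically stable factorization than the normal equations:

```python
import numpy as np

# Hypothetical data: a noisy line y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)

# Design matrix [x, 1]; least squares finds alpha = (alpha_1, alpha_0)
# minimizing sum_i (y_i - alpha_1 x_i - alpha_0)^2.
A = np.column_stack([x, np.ones_like(x)])
alpha, *_ = np.linalg.lstsq(A, y, rcond=None)

print(alpha)  # approximately [2, 1]
```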
  8. What about non-linear regression?
     o Polynomial regression:
       $r_i = y_i - \alpha_2 \bigl(x_1^{(i)}\bigr)^2 - \alpha_1 x_1^{(i)} - \alpha_0$
     o Polynomial multiple regression, or multinomial:
       $r_i = y_i - \alpha_1 \bigl(x_2^{(i)}\bigr)^2 - \alpha_2 \bigl(x_1^{(i)}\bigr)^2 - \alpha_3 x_2^{(i)} - \alpha_4 x_1^{(i)} - \alpha_0$
     A subtle twist of the knob allows us to create all sorts of “new” methods. They’re all solved the same way!
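The knob twist is concrete in code: fitting a polynomial reuses the same least-squares solver, and only the design matrix changes. A minimal NumPy sketch with invented data (not from the talk):

```python
import numpy as np

# Hypothetical data: a noisy parabola y = 1.5x^2 - x + 0.5
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 60)
y = 1.5 * x**2 - x + 0.5 + rng.normal(scale=0.3, size=x.size)

# Twist the knob: add an x^2 column to the design matrix, and the
# identical linear least-squares machinery now fits a parabola.
A = np.column_stack([x**2, x, np.ones_like(x)])
alpha, *_ = np.linalg.lstsq(A, y, rcond=None)
print(alpha)  # approximately [1.5, -1, 0.5]
```

The problem is still linear in the parameters $\alpha$, which is why nothing about the solution method needs to change.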
  9. Variable selection – how much is too much?
     o How do you know which features are relevant?
     o You could manually test all combinations – not wise.
     o Physical intuition – doesn’t always apply, but is extremely useful when it does.
     o Past experience – food service employees know their regulars.
     o Is there an algorithmic approach?
     o Should you preserve the contributions of all features?
     o What happens if you decide to throw some of them out?
  10. Ridge Regression
      $\arg\min_\alpha \; \sum_i r_i^2 + \lambda \sum_k \alpha_k^2$
      Ridge regression is just regular regression with an additional term. Ridge regression forces the parameters to be small.
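The penalized problem still has a closed-form solution: the normal equations simply pick up a $\lambda I$ term. A minimal NumPy sketch (the function name and data are my own; real implementations typically leave the intercept unpenalized and standardize the features first):

```python
import numpy as np

def ridge(A, y, lam):
    """Solve min ||y - A a||^2 + lam * ||a||^2 via the
    modified normal equations (A^T A + lam I) a = A^T y."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ y)

# Hypothetical data: y = 2x + 1 exactly
A = np.column_stack([np.arange(10.0), np.ones(10)])
y = 2 * A[:, 0] + 1

print(ridge(A, y, 1e-6))  # tiny lambda: essentially ordinary least squares
print(ridge(A, y, 1e4))   # large lambda: parameters shrink toward zero
```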
  11. LASSO Regularization
      $\arg\min_\alpha \; \sum_i r_i^2 + \lambda \sum_k |\alpha_k|$
      LASSO is just ridge regression, but with a different penalty term. Unlike the previous small changes, this one leads to a significant difference in how the solution is computed: the normal equations no longer apply. LASSO will often force a “sparse” set of parameters.
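Because the $|\alpha_k|$ penalty is not differentiable at zero, there is indeed no normal-equations shortcut; a standard alternative is coordinate descent with soft-thresholding, which is what makes coefficients land exactly at zero. A rough toy implementation of my own (not from the talk; in practice you would reach for an established solver such as scikit-learn's `Lasso`):

```python
import numpy as np

def soft_threshold(rho, t):
    # Shrink toward zero and clip at zero; this is the source of sparsity.
    return np.sign(rho) * max(abs(rho) - t, 0.0)

def lasso_cd(A, y, lam, n_iter=200):
    """Coordinate descent for min ||y - A a||^2 + lam * ||a||_1."""
    n, p = A.shape
    a = np.zeros(p)
    z = (A ** 2).sum(axis=0)  # column norms A_j^T A_j
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's contribution removed.
            r = y - A @ a + A[:, j] * a[j]
            rho = A[:, j] @ r
            # One-dimensional minimizer in coordinate j.
            a[j] = soft_threshold(rho, lam / 2) / z[j]
    return a
```

On data where only one feature matters, the irrelevant coefficient is driven exactly to zero rather than merely made small, which is the practical difference from ridge.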
  12. Concluding remarks
      o Machine learning is nuanced.
      o Most methods are variations on a theme: X is just Y but with Z changed.
      o Explain things as simply as possible, but no simpler.
      o Complex problems are a series of simple problems strung together.
      o Think about how you think. It will make you a better problem solver.