Josh Wills, Director of Data Science, Cloudera at MLconf SEA - 5/01/15

Brainwashed: Building an IDE for Feature Engineering

Feature engineering, the practice of writing code to map raw input data into a set of signals that will be fed into a machine learning algorithm, is the dark art of data science. Although the process of crafting new features is tedious and failure-prone, the key to a successful model is a diverse set of high-quality features that are informed by domain experts. Recently, academic researchers have begun to focus on the problem of feature engineering and have started to publish research that addresses the relative lack of tools designed to support the feature engineering process. In this talk, I will review some of my favorite papers and present some efforts to convert these ideas into tools that leverage the principles of reactive application design in order to make feature engineering (dare I say it) fun.
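
To make the abstract's definition concrete, here is a minimal sketch of what "mapping raw input data into a set of signals" can look like in practice. It is not taken from the talk or from the Exhibit project; the event schema (`ts`, `kind`, `amount`), the function name `extract_features`, and the chosen features are all hypothetical, picked only for illustration.

```python
from collections import Counter
from datetime import datetime


def extract_features(events):
    """Map one user's raw event records to a flat dict of numeric features.

    `events` is a list of dicts with a hypothetical schema:
    {"ts": ISO-8601 string, "kind": str, "amount": float (purchases only)}.
    """
    if not events:
        # An empty history still needs a well-formed feature row.
        return {
            "n_events": 0,
            "n_purchases": 0,
            "total_spend": 0.0,
            "distinct_days": 0,
            "weekend_share": 0.0,
        }

    kinds = Counter(e["kind"] for e in events)
    timestamps = [datetime.fromisoformat(e["ts"]) for e in events]

    # Sum the amount over purchase events only.
    total_spend = sum(
        e.get("amount", 0.0) for e in events if e["kind"] == "purchase"
    )
    # weekday() returns 5 for Saturday and 6 for Sunday.
    weekend_events = sum(1 for t in timestamps if t.weekday() >= 5)

    return {
        "n_events": len(events),
        "n_purchases": kinds["purchase"],
        "total_spend": total_spend,
        "distinct_days": len({t.date() for t in timestamps}),
        "weekend_share": weekend_events / len(events),
    }


# Hypothetical raw input for a single user.
raw_events = [
    {"ts": "2015-05-01T09:00:00", "kind": "view"},
    {"ts": "2015-05-02T10:30:00", "kind": "purchase", "amount": 20.0},
    {"ts": "2015-05-02T11:00:00", "kind": "purchase", "amount": 5.5},
    {"ts": "2015-05-03T08:00:00", "kind": "view"},
]
features = extract_features(raw_events)
```

Each key in the returned dict is one signal that a downstream learning algorithm could consume. Even in this toy case, choices such as how to handle an empty history or which days count as a weekend are the kind of domain-informed decisions the abstract describes as tedious and failure-prone.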

Published in: Technology

  1. Meditations on Feature Engineering; Or, A Less Pretentious Title. Josh Wills | Senior Director of Data Science. © Cloudera, Inc. All rights reserved.
  2. About Me
  3. Data Scientists At Work
  4. Data Scientists at Home
  5. I For One…
  6. Deep Learning
  7. Deeper Deep Learning
  8. The Importance of Context
  9. Help Me Help You
  10. Brainwash!
  11. The Operational/Analytical Impedance Mismatch
  12. Supernova Schemas
  13. Exhibit: http://github.com/jwills/exhibit
  14. Demo Time! jwills@cloudera.com