0 to kaggle in 30 minutes

Published in: Data & Analytics

  1. 0 to Kaggle in 30 minutes. You-Cyuan Jhang, Sr. Data Science Engineer @Castlight Health; Ming Tsai, Sr. Data Engineer @Silicon Valley Data Science
  2. Founded 2010; 200,000 data scientists worldwide
  3. Digit Recognizer Contest. US Postal Service: 21 million pieces of mail every hour. More than $1 million could be saved each day sorting zip codes
  4. PostgreSQL + MADlib: run algorithms in place, no data movement
  5. Kaggle Machine Learning Pipeline (diagram labels): Unknown Data, Training, Prediction, Dataset, Model, Known Data
  6. What can MADlib do? Linear Regression, Logistic Regression, Support Vector Machine, Random Forest, Singular Value Decomposition, K-means Clustering
  7. K-means Demo
  8. Demo
  9. Future: K-means Visualization http://tech.nitoyon.com/en/blog/2013/11/07/k-means/ Source: https://github.com/ming-svds/kmeans-digit-on-madlib
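
The K-means demo slides carry no code in this transcript, so here is a minimal sketch of the clustering idea in plain Python. It is not the presenters' MADlib implementation, which runs inside PostgreSQL. The function names `squared_distance` and `kmeans`, the parameters, and the toy points are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative Lloyd-style k-means on a list of numeric tuples.

    Returns (centroids, assignments). Hypothetical helper, not MADlib's API.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Centroids stopped moving, so the clustering is stable.
        centroids = new_centroids
    return centroids, assignments


if __name__ == "__main__":
    # Toy 2-D data with two well-separated groups (made up for this sketch).
    toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(toy_points, k=2)
    print(centers)
    print(labels)
```

For the Digit Recognizer data, each point would be a flattened vector of pixel values rather than a 2-D tuple, and `k` would presumably be 10, one cluster per digit. The in-database version is in the source repository linked on the last slide.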