Data Science Essentials in Azure ML


Published on

In this session I cover data science principles such as regression, clustering, classification, and recommendation, and how to build programmable components in Azure Machine Learning experiments using data science programming languages. The session illustrates how to implement these concepts using Azure ML Studio.

Published in: Technology


  2. Session Objectives & Takeaways
       • Basic data science concepts: regression, classification, clustering, and recommendation.
       • Azure ML Studio platform capabilities.
  3. Data Science Involves
       • Data science is about using data to make decisions that drive actions.
       • The data science process involves:
           • Data selection
           • Preprocessing
           • Transformation
           • Data mining
           • Delivering value from data: interpretation and evaluation
  4. Machine Learning Overview
       • Machine learning is a computing technique that has its origins in artificial intelligence (AI) and statistics.
       • Machine learning solutions include:
           • Classification: predicting a Boolean true/false value for an entity with a given set of features.
           • Regression: predicting a real numeric value for an entity with a given set of features.
           • Clustering: grouping entities with similar features.
           • Recommendation: recommending an item to a user based on past behavior or the preferences of similar users (recommender systems).
  5. Classification
       • Suppose we want to teach a computer to recognize images of chairs.
       • We provide a set of images and tell the computer which ones are chairs and which are not.
       • The computer is expected to learn to tell chairs from non-chairs, even in images it has not seen before.
       • This is learning by example.
       • To build this experiment, we need a training set and a test set, with as much data as we can get.
       • Each observation is represented by a set of numbers (features).
       • The training and test sets are sets of numeric vectors, and each label is represented by a number.
       • Example: if an image is a chair, the label is +1; otherwise it is -1.
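The labeled feature vectors and the training/test split described on this slide can be sketched in plain Python. This is only an illustration under stated assumptions: the feature values are made up, and the `train_test_split` helper and its 25% test fraction are hypothetical choices, not something taken from the session or from Azure ML.

```python
import random

# Hypothetical toy data: each observation is a feature vector (a list of
# numbers) paired with a label, +1 for "chair" and -1 for "not a chair".
observations = [
    ([0.9, 0.8, 0.1], +1),
    ([0.8, 0.9, 0.2], +1),
    ([0.7, 0.7, 0.3], +1),
    ([0.9, 0.6, 0.1], +1),
    ([0.1, 0.2, 0.9], -1),
    ([0.2, 0.1, 0.8], -1),
    ([0.3, 0.3, 0.7], -1),
    ([0.1, 0.4, 0.9], -1),
]


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle the labeled observations and split them into (train, test)."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = data[:]          # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train_set, test_set = train_test_split(observations)
print(len(train_set), "training observations,", len(test_set), "test observations")
```

The model is fitted on `train_set` only; `test_set` is held back to check how well it handles images it has not seen.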
  6. Classification Example
  7. Classification Cont.
       • Yes/no questions are the most basic form of classification.
       • Examples: automatic handwriting recognition, speech recognition, biometrics, document classification, spam detection, credit card fraud detection, predicting customer churn, etc.
       • Formally, given a training set (x_i, y_i) for i = 1…n, we want to create a classification model f that can predict the label y for a new x.
       • The machine learning algorithm creates the function f(x) for you.
       • The predicted value of y for a new x is simply the sign of the function f(x).
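To make "the predicted label is the sign of f(x)" concrete, here is a minimal sketch that learns a linear f(x) = w·x + b. The slide does not name an algorithm, so the perceptron update rule used here is an assumption chosen for brevity, and the toy data and function names are hypothetical.

```python
def f(weights, bias, x):
    """Linear scoring function f(x) = w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    """The predicted label is the sign of f(x): +1 or -1."""
    return 1 if f(weights, bias, x) > 0 else -1


def train_perceptron(training_set, epochs=100):
    """Learn weights and bias with the classic perceptron update rule."""
    n_features = len(training_set[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in training_set:
            # A point is misclassified when y and f(x) disagree in sign.
            if y * f(weights, bias, x) <= 0:
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return weights, bias


# Hypothetical, linearly separable training set of (x_i, y_i) pairs.
training_set = [
    ([2.0, 2.0], +1), ([3.0, 1.0], +1), ([2.5, 3.0], +1),
    ([-1.0, -1.0], -1), ([-2.0, 0.5], -1), ([0.0, -2.0], -1),
]
weights, bias = train_perceptron(training_set)
print(predict(weights, bias, [1.5, 2.0]), predict(weights, bias, [-1.5, -0.5]))
```

The perceptron only converges when the two classes can be separated by a line (or hyperplane); other algorithms produce a different f, but prediction by sign works the same way.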
  8. Regression
       • Used for predicting real-valued outcomes:
           • How many customers will arrive at our website next week?
           • How many TVs will sell next year?
           • Can we predict someone’s income from their click-through information?
       • Formally, given a training set (x_i, y_i) for i = 1…n, we want to create a regression model f that can predict the value y for a new x.
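As a small illustration of fitting a regression model f from (x_i, y_i) pairs, the sketch below uses ordinary least squares with a single feature. The weekly visitor counts are invented for the example, and `fit_line` and `predict` are hypothetical helper names; the session itself does not specify which regression algorithm it uses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Regression model f(x) = slope * x + intercept."""
    return slope * x + intercept


# Hypothetical data: website visitors observed in weeks 1 to 5.
weeks = [1, 2, 3, 4, 5]
visitors = [112, 118, 131, 139, 150]

slope, intercept = fit_line(weeks, visitors)
print("predicted visitors in week 6:", round(predict(6, slope, intercept), 1))
```

Unlike classification, the output is a real number rather than a +1/-1 label.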
  9. Supervised Learning
       • Classification and regression are supervised learning problems.
       • “Supervised” means that the training data has ground truth labels to learn from.
       • (Supervised) classification often has +1 or -1 labels.
       • (Supervised) regression has numerical labels.
       • There are many supervised learning problems.
       • Supervised learning algorithms are much easier to evaluate than unsupervised ones.
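One reason supervised models are easier to evaluate is that predictions can be compared directly with the ground truth labels. The sketch below shows two common metrics, accuracy for classification and mean absolute error for regression. The metric choice and the sample values are assumptions made for illustration; the slides do not say which metrics the session uses.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the ground truth labels."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def mean_absolute_error(true_values, predicted_values):
    """Average absolute gap between predicted and true numeric values."""
    gaps = [abs(t - p) for t, p in zip(true_values, predicted_values)]
    return sum(gaps) / len(gaps)


# Hypothetical classification results with +1 / -1 labels.
classification_accuracy = accuracy([+1, -1, +1, +1], [+1, -1, -1, +1])

# Hypothetical regression results with numeric labels.
regression_mae = mean_absolute_error([100, 150, 200], [110, 140, 200])

print(classification_accuracy, regression_mae)
```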
  10. DEMO Regression: Demand Estimation (Bike Rental)
  11. Clustering
       • Clustering is an unsupervised learning problem.
       • “Unsupervised” means that the training set has no ground truth labels to learn from.
       • This makes clustering results much harder to evaluate.
       • Clustering groups data entities based on their feature values.
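Grouping entities by their feature values, with no labels involved, can be sketched with k-means. The slide does not name a clustering algorithm, so k-means is an assumption here, as are the toy points, the `k_means` helper, and the explicit initial centroids (passed in to keep the example deterministic).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and
    recomputing each centroid as the mean of its assigned points."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:  # nothing moved: converged
            break
        assignments = new_assignments
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical unlabeled data: two loose groups of 2-D feature vectors.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
assignments, centroids = k_means(points, initial_centroids=[points[0], points[-1]])
print(assignments)
```

The cluster indices carry no meaning beyond group membership, which is part of why unsupervised results are harder to evaluate.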
  12. Recommendation
       • Recommender systems are machine learning solutions that match individuals to items based on the preferences of other similar individuals, or on other similar items that the individual is already known to like.
       • Recommendation is one of the most commonly used forms of machine learning.
       • An example of a recommender system problem: the Netflix contest, which asked entrants to build a better recommender system from Netflix data.
       • Azure ML modules: Split Data, Train Matchbox Recommender, Score Matchbox Recommender, Evaluate Recommender.
  13. Recommendation Cont.
       • Use Score Matchbox Recommender to generate predictions.
       • You can generate the following kinds of prediction:
           • Item Recommendation: predicts recommended items based on a given user.
           • Related Items: predicts recommended items based on a given item.
           • Rating Prediction: predicts ratings for given users and items.
           • Related Users: predicts users based on a given user.
       • Use the Evaluate Recommender module to evaluate recommender performance.
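To give a feel for two of these prediction kinds, rating prediction and item recommendation, here is a small user-based collaborative filtering sketch in plain Python. It is not the Matchbox algorithm and does not call any Azure ML module; the ratings, the similarity measure, and all function names are hypothetical choices made for illustration.

```python
def similarity(ratings, user_a, user_b):
    """Similarity from the mean absolute rating gap on co-rated items
    (1.0 for identical ratings, 0.0 when there are no co-rated items)."""
    shared = set(ratings[user_a]) & set(ratings[user_b])
    if not shared:
        return 0.0
    gap = sum(abs(ratings[user_a][i] - ratings[user_b][i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict_rating(ratings, user, item):
    """Rating prediction: similarity-weighted average of other users' ratings."""
    weighted_sum, weight_total = 0.0, 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = similarity(ratings, user, other)
        weighted_sum += sim * their_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total > 0 else None


def recommend_items(ratings, user, top_n=3):
    """Item recommendation: unseen items ranked by predicted rating."""
    seen = set(ratings[user])
    candidates = {item for r in ratings.values() for item in r} - seen
    scored = [(predict_rating(ratings, user, item), item) for item in candidates]
    scored = [(score, item) for score, item in scored if score is not None]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [item for _, item in scored[:top_n]]


# Hypothetical movie ratings on a 1-5 scale.
ratings = {
    "alice": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "bob": {"Alien": 5, "Brazil": 5, "Dune": 4},
    "carol": {"Casablanca": 5, "Dune": 2, "Amelie": 5},
}
print(recommend_items(ratings, "alice"))
```

Related-item and related-user predictions could be built the same way by ranking items or users by a similarity score instead of by predicted rating.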
  14. DEMO Movie Recommender System
  15. Azure ML Platform Features
       • Notebooks for writing code in Azure ML (Jupyter)
       • Building programmable components using Python and R
       • Publishing experiments as web services
       • Saving trained models for reuse
       • Projects for creating combined assets (experiments, datasets, notebooks, etc.)
  16. References
       • Free e-book “Azure Machine Learning”
       • Azure Machine Learning documentation (…learning/)
       • Data Science and Machine Learning Essentials
  17. Thank you
       • Check out my blog for Azure ML articles:
       • Follow me on Twitter: @MostafaElzoghbi